Acta Optica Sinica, Volume. 45, Issue 14, 1420006(2025)
Recent Research Advances of On‐Chip Optical Nonlinear Activation Function Devices (Invited)
The rapid advancement of artificial intelligence (AI) technologies is driving transformative changes across multiple industries, including scientific research, automated manufacturing, healthcare, service sectors, and autonomous transportation. This AI revolution has created unprecedented demands for computational power and energy efficiency. Traditional electronic computing architectures struggle to meet these demands because of the fundamental limitations of the von Neumann architecture, particularly the “memory-wall” problem arising from the physical separation of processing and memory units, which leads to excessive energy consumption during data transfer operations. Furthermore, the approaching physical limits of semiconductor miniaturization under Moore’s Law severely constrain further improvements in processor clock speeds.
Optical computing has emerged as a promising alternative paradigm to address these critical challenges. By utilizing photons instead of electrons as information carriers, photonic computing systems offer several inherent advantages: (1) massive parallelism enabled by wavelength division multiplexing and optical interference phenomena, (2) near-zero heat dissipation during information transmission, (3) ultra-high bandwidth capabilities exceeding 100 GHz, and (4) latency set by light-speed propagation. These characteristics make photonic neural networks particularly well-suited for accelerating matrix-vector multiplications, which constitute over 90% of computations in deep learning models.
However, within integrated photonic neural networks, implementing efficient nonlinear activation functions remains a significant technical challenge. While linear operations can be effectively performed using Mach-Zehnder interferometer arrays or microring resonator weight banks, introducing the essential nonlinear transformations is problematic.
Current hybrid photonic-electronic systems typically offload nonlinear activation to electronic processors, creating substantial bandwidth bottlenecks and energy overhead at the optoelectronic interfaces. This architectural limitation undermines many potential advantages of all-optical computing systems.
Therefore, the development of high-performance and programmable optical nonlinear activation devices is crucial for realizing end-to-end optical neural networks. Successful implementation would enable: (1) orders-of-magnitude improvements in processing speed by eliminating electro-optic conversion delays, (2) dramatic reductions in power consumption through all-optical signal processing, and (3) novel computing architectures leveraging quantum optical effects. These advancements could revolutionize AI hardware for applications ranging from real-time video analysis to large language model inference, potentially reducing energy consumption by several orders of magnitude compared to conventional electronic processors.
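The need for a nonlinear stage between linear layers can be illustrated with a minimal sketch. The code below is not drawn from the article: the matrices `W1` and `W2`, the input `x`, and the helper names are hypothetical values chosen for illustration, and ReLU stands in for whatever activation a given device implements. It shows that two matrix-vector multiplications applied in sequence are equivalent to a single one, whereas inserting an activation between them gives a result that a single linear layer cannot reproduce in general.

```python
# Illustrative sketch only: small hand-picked integer matrices, not device data.

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply matrix a by matrix b (both given as lists of rows)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def relu(v):
    """Element-wise rectified linear unit, used here as a generic nonlinearity."""
    return [max(0, x) for x in v]

W1 = [[1, -2], [3, 1]]   # hypothetical first-layer weights
W2 = [[2, 1], [-1, 1]]   # hypothetical second-layer weights
x = [1, 2]               # hypothetical input vector

# Two linear layers in sequence ...
two_linear = matvec(W2, matvec(W1, x))
# ... equal one linear layer whose weight matrix is the product W2 * W1.
collapsed = matvec(matmul(W2, W1), x)

# With an activation between the layers, the output differs from the collapsed one.
with_activation = matvec(W2, relu(matvec(W1, x)))

print(two_linear, collapsed, with_activation)
```

In this toy case the purely linear stack and its collapsed form agree, while the version with the activation does not, which is why a network built only from interferometer meshes or weight banks would remain a single linear map regardless of depth.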
Recent years have witnessed significant advancements in on-chip optical activation functions, progressing primarily along two technical pathways: electro-optic and all-optical implementations.
Electro-optic approaches, currently the most mature technological solution, leverage established silicon photonics manufacturing. These systems typically employ a three-stage architecture: optical-to-electrical conversion via photodetectors, electronic nonlinear processing, and electrical-to-optical modulation. The ITO-based electro-absorption modulator platform has demonstrated notable promise, achieving 98% accuracy on MNIST classification tasks with 5 mW threshold power. Recent innovations using graphene-ITO heterostructures have further reduced operating voltages to sub-1 V levels while maintaining favorable nonlinear response characteristics. However, these devices face inherent tradeoffs between speed (typically limited to ~100 ps by carrier dynamics) and energy efficiency (usually >1 pJ/operation). Novel integration schemes are addressing these limitations: the ECU-ORS-MZI configuration incorporates non-volatile MoS2-based optoelectronic memory switches, enabling reconfigurable activation functions (Sigmoid, Softplus, Clamped ReLU) with only 2 V drive voltage. More radically, graphene-silicon heterojunction devices integrate detection and modulation functionalities within a single microring resonator, achieving 8 μW threshold power through innovative photocurrent contour mapping techniques. While these co-designed systems show promise for reducing device footprints and power consumption, challenges persist in scaling to large arrays while maintaining uniformity.
All-optical nonlinear activation represents the ultimate solution for photonic neural networks, with breakthroughs across multiple material platforms. Silicon photonic devices exploit combinations of Kerr nonlinearity, two-photon absorption, and free-carrier effects.
An MRR-MZI configuration achieves 25 mW/π thermal tuning efficiency with 2.5 ns response, while inverse-designed nanostructures reduce optical power thresholds to 2.9 mW. Emerging silicon nitride platforms enable 10 Gbit/s operation using pure Kerr nonlinearities with negligible absorption loss. Germanium-based devices leverage strong absorption characteristics and carrier plasma effects. A Ge-Si photodiode architecture operates at 20 GHz with 1.1 mW threshold power, while microring versions achieve 0.74 mW thresholds via innovative thermal feedback loops. These devices show excellent compatibility with standard CMOS processes. Lithium niobate platforms exploit large second-order nonlinear coefficients. The SHG-DOPA configuration demonstrates record 16 fJ thresholds and 75 fs response time, with periodically poled waveguides suggesting further energy reductions. Phase change materials enable non-volatile state switching: GST-based microrings achieve 500 pJ switching energy with <200 ns crystallization time, while VO2 devices show 0.5 mW threshold and broadband operation (visible to near-infrared wavelengths), supporting in-memory computing architectures. Two-dimensional materials offer exceptional versatility: graphene-plasmonic hybrids reach 35 fJ threshold and 260 fs response time using universal absorption. MXene devices operate at 50 μW across the 1310–1550 nm bands, while MoTe2-glass waveguide systems achieve 0.94 μW threshold with 2.08 THz bandwidth, highlighting multi-wavelength parallel processing potential.
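For reference, the three reconfigurable activation functions named above for the ECU-ORS-MZI configuration (Sigmoid, Softplus, Clamped ReLU) have standard mathematical forms, sketched below in idealized, unitless terms. This is only a software illustration of the target transfer curves, not a model of any device: the function names and the `cap` parameter of the clamped ReLU are assumptions made for this example, and a physical device maps optical input power to output power with its own scaling and offsets.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: a smooth saturating curve with values between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def softplus(x):
    """Softplus: a smooth approximation of ReLU, log(1 + exp(x))."""
    return math.log1p(math.exp(x))

def clamped_relu(x, cap=1.0):
    """ReLU with an upper limit: zero below 0, linear up to `cap`, flat above.

    `cap` is a hypothetical saturation level chosen for this illustration.
    """
    return min(max(0.0, x), cap)

# Sample each idealized curve at a few normalized input values.
for x in (-2.0, 0.0, 0.5, 3.0):
    print(x, sigmoid(x), softplus(x), clamped_relu(x))
```

The three curves differ mainly in how they treat small and large inputs: the sigmoid saturates at both ends, softplus grows without bound, and the clamped ReLU is piecewise linear with a hard ceiling.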
The field of on-chip optical activation functions has achieved remarkable progress through both electro-optic and all-optical approaches, each offering distinct advantages. Future research should prioritize three directions: (1) wafer-scale heterogeneous integration of novel materials (2D materials, PCMs) with standard photonic platforms, (2) development of standardized programming interfaces for optical nonlinearities, and (3) system-level solutions for maintaining signal integrity in multi-layer networks. Hybrid approaches combining complementary platforms may provide near-term pathways while fundamental material challenges are resolved. With continued advances in materials science, nanofabrication techniques, and photonic design methodologies, optical neural networks incorporating efficient nonlinear activation functions could soon make the transition from laboratory demonstrations to commercial deployment, potentially revolutionizing energy-efficient AI computing across diverse application domains.
Ruizhe Liu, Zijia Wang, Hongtao Lin. Recent Research Advances of On‐Chip Optical Nonlinear Activation Function Devices (Invited)[J]. Acta Optica Sinica, 2025, 45(14): 1420006
Category: Optics in Computing
Received: Apr. 15, 2025
Accepted: Jun. 30, 2025
Published Online: Jul. 22, 2025
The Author Email: Hongtao Lin (hometown@zju.edu.cn)
CSTR:32393.14.AOS250928