Acta Optica Sinica, Vol. 45, Issue 14, 1420011 (2025)
Principles and Applications of Nonlinearity in Optical Neural Networks (Invited)
Research on nonlinearity in optical neural networks is of critical importance because nonlinear activation functions enable neural networks to overcome the limitations of purely linear transformations and to learn complex features. As artificial intelligence applications increasingly demand high-efficiency, low-power computing platforms, implementing nonlinear activation optically can leverage intrinsic advantages of optics, including massive parallelism, low latency, and low energy consumption, and thus holds the potential to drive revolutionary advances in areas such as computer vision and natural language processing. To date, linear weighted operations in optical neural networks have been widely validated across various platforms and architectures; however, the realization of nonlinear functions still largely relies on backend electronic nonlinearities. This typically involves converting optical signals to electrical signals via photodetectors, then applying the nonlinearity in the digital domain after analog-to-digital conversion. Such a process incurs substantial energy overhead, preventing optical neural networks from simultaneously achieving strong representational power and low operational energy. To overcome this limitation, researchers have explored multiple optical nonlinear schemes, including fully optical control and optoelectronic hybrid control. In optoelectronic hybrid schemes, energy consumption arises mainly from pump light, modulators, and receivers, whereas in fully optical control schemes, the energy cost is dominated by pump light alone. When low-threshold designs such as resonance enhancement or phase-change materials are employed, fully optical nonlinear control has greater potential for low-energy operation than optoelectronic hybrid approaches. Conversely, optoelectronic nonlinear schemes offer higher reconfigurability and flexibility relative to fully optical implementations.
Against this background, this review surveys schemes for realizing nonlinearity and their applications in optical neural networks. Specifically, the review covers 1) fully optical nonlinear schemes, encompassing second-order nonlinear processes, third-order nonlinear effects, and phase-change-based modulation approaches; 2) optoelectronic hybrid control schemes, including optical-electrical-optical and optical-electrical configurations; and 3) the deployment of nonlinear activation functions and nonlinear neuron constructs within optical neural network architectures.
In the domain of fully optical nonlinearity, second-order nonlinear processes exploit materials such as periodically poled lithium niobate to achieve activation-like behavior (e.g., ReLU- or Sigmoid-like mapping) via second-harmonic generation or parametric interactions, as shown in Fig. 1. Extensions include combining polycrystalline lithium niobate scattering with frequency-doubled light to construct composite linear-nonlinear mappings. Third-order nonlinear approaches leverage saturable absorption or reverse-saturable absorption in atomic media or two-dimensional materials (e.g., graphene, MoS2, Ti3C2Tx, MoTe2, Bi2Te3) integrated into waveguides or atomic vapor cells to introduce activation behavior, as shown in Fig. 2. Additional third-order schemes use microring resonators (MRRs): free-carrier dispersion and thermo-optic effects within the resonator produce soft-threshold or ReLU/Sigmoid-like responses, as shown in Fig. 3. Phase-change material-based modulation (e.g., VO2, Ge2Sb2Te5) combined with resonant structures yields nonvolatile, multilevel activation units, affording memory-enabled nonlinear operations, as shown in Fig. 4.
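As a rough illustration of how saturable absorption can act as an activation function, the sketch below uses the textbook thin-absorber model, in which the absorption saturates as alpha0 / (1 + I / I_sat). The function name and the default values of alpha0_l and i_sat are illustrative assumptions rather than parameters taken from the review, and propagation effects inside the absorber are ignored.

```python
import math


def saturable_absorber(i_in, alpha0_l=2.0, i_sat=1.0):
    """Output intensity after a thin saturable absorber (illustrative model).

    i_in     : input intensity, in the same arbitrary units as i_sat
    alpha0_l : small-signal absorbance alpha0 * L (assumed value)
    i_sat    : saturation intensity (assumed value)
    """
    if i_in < 0:
        raise ValueError("intensity must be non-negative")
    # Absorption weakens as the input intensity approaches and exceeds i_sat.
    transmission = math.exp(-alpha0_l / (1.0 + i_in / i_sat))
    return i_in * transmission


if __name__ == "__main__":
    for i_in in (0.01, 0.1, 1.0, 10.0, 100.0):
        print(f"I_in={i_in:7.2f}  I_out={saturable_absorber(i_in):9.4f}")
```

In this model, weak inputs are attenuated by roughly exp(-alpha0_l) while strong inputs pass almost unchanged, which gives the soft-threshold shape. A reverse-saturable absorber would show the opposite trend, with transmission falling as intensity rises.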
Turning to optoelectronic hybrid control, optical-electrical-optical configurations implement programmable nonlinear functions by feeding photodetector outputs into electro-optic, thermo-optic, or free-carrier modulators and then back into the optical domain; such schemes can incorporate two-dimensional material devices (for example, MoS2 photoconductive memory driving Mach-Zehnder interferometer (MZI) or MRR phase modulation, or graphene/silicon heterojunction MRR) to realize amplitude- and phase-reconfigurable activations, as shown in Fig. 5 and Fig. 6. Optical-electrical schemes exploit the inherent square-law response of photodetectors or interferometric balanced detection to form nonlinear nodes that deliver rapid, low-power nonlinear mappings without requiring feedback into the optical domain, as shown in Fig. 7.
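The two hybrid configurations can be sketched numerically under simplifying assumptions. In the code below, square_law_detect models an optical-electrical node as a photocurrent proportional to the squared magnitude of the summed field, and oeo_activation models an optical-electrical-optical node in which a tapped fraction of the power is detected and drives the phase of an idealized MZI modulator. All names and parameter values (tap, gain, bias, responsivity) are hypothetical and chosen only for illustration; bandwidth, noise, and loss are ignored.

```python
import math


def square_law_detect(fields, weights, responsivity=1.0):
    """Optical-electrical node: photocurrent ~ |sum_i w_i * E_i|^2."""
    total = sum(w * e for w, e in zip(weights, fields))
    return responsivity * abs(total) ** 2


def oeo_activation(field, tap=0.1, gain=math.pi, bias=math.pi):
    """Optical-electrical-optical node (idealized, assumed parameters).

    A fraction `tap` of the optical power is detected; the resulting
    signal sets the MZI phase, which modulates the remaining field.
    """
    power = abs(field) ** 2
    phase = gain * tap * power + bias        # power-dependent MZI phase
    through = math.sqrt(1.0 - tap)           # field left after the tap
    return through * math.cos(phase / 2.0) * field


if __name__ == "__main__":
    print(square_law_detect([1.0, 1.0], [0.5, 0.5]))
    for amp in (0.1, 0.5, 1.0, 1.5):
        print(amp, abs(oeo_activation(amp)))
```

With bias set to pi the modulator blocks weak inputs and opens as the detected power grows, giving a ReLU-like magnitude response over this range; changing gain or bias reshapes the curve, which reflects the reconfigurability attributed to hybrid schemes.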
Finally, applications of these nonlinear implementations are surveyed: nonlinear activation functions have been integrated into feedforward optical neural network hardware to perform tasks such as handwritten digit recognition, color image classification, and speech classification with high accuracy, as shown in Fig. 8; in reservoir computing, spatially and temporally structured reservoirs employing phase modulation, optical feedback, and optoelectronic detection enable large-scale or deep reservoir networks for action recognition, time-series prediction, and cardiac rhythm detection, as shown in Fig. 9; in spiking neural network implementations, pulses triggered by saturable absorbers or phase-change materials realize thresholded integrate-and-fire dynamics, supporting both supervised and unsupervised pattern recognition, as shown in Fig. 10.
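The integrate-and-fire behavior mentioned for spiking implementations can be illustrated with a minimal discrete-time neuron. This is a generic software sketch, not a model of any specific device in the review; the function name and the threshold and leak values are assumptions.

```python
def lif_spikes(inputs, threshold=1.0, leak=0.9):
    """Discrete-time leaky integrate-and-fire neuron (illustrative).

    inputs    : sequence of input pulse energies, arbitrary units
    threshold : state level at which the neuron fires (assumed value)
    leak      : fraction of the state kept per step; 1.0 means no leak
    Returns a list of 0/1 spike flags, one per input sample.
    """
    state = 0.0
    spikes = []
    for x in inputs:
        state = leak * state + x         # integrate with decay
        if state >= threshold:
            spikes.append(1)             # fire
            state = 0.0                  # reset after the spike
        else:
            spikes.append(0)
    return spikes


if __name__ == "__main__":
    print(lif_spikes([0.4, 0.4, 0.4, 0.0, 0.4]))
```

Setting leak to 1.0 gives pure accumulation without decay, which is one simple way to picture a nonvolatile accumulating element, although real devices will differ in detail.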
Although optoelectronic hybrid schemes are relatively mature, they are limited by latency and energy consumption. Fully optical nonlinear approaches offer potential advantages in speed and energy efficiency but require breakthroughs in low-threshold, fast-response materials and devices. Different network architectures impose distinct requirements on activation functions; future research should therefore focus on providing reconfigurable, co-optimized activation functions at the device level. Scaling up network size faces challenges such as optical power attenuation and integration complexity; system-level strategies including gain compensation, topology optimization, and energy-recycling mechanisms are needed. Moreover, issues such as device variability, thermal management, and fabrication yield must be addressed to ensure reliable operation. Standardized benchmarks and calibration protocols are necessary for fair performance evaluation, and modular architectures can facilitate scalable deployment and maintenance. Demonstrations on representative artificial intelligence (AI) tasks and integration with existing electronic platforms will validate practical viability and guide iterative improvements. Cross-disciplinary integration, combining novel nonlinear photonic materials, micro/nano-device innovations, and co-design of devices and systems, promises to accelerate the realization of large-scale, efficient, low-power optical neural networks and to drive innovative applications in complex AI tasks.
Yufei Wang, Yumeng Chen, Yongzheng Yang, Kun Liao, Xiaoyong Hu, Qihuang Gong. Principles and Applications of Nonlinearity in Optical Neural Networks (Invited)[J]. Acta Optica Sinica, 2025, 45(14): 1420011
Category: Optics in Computing
Received: Apr. 15, 2025
Accepted: Jun. 23, 2025
Published Online: Jul. 18, 2025
The Author Email: Kun Liao (kunliao@pku.edu.cn), Xiaoyong Hu (xiaoyonghu@pku.edu.cn)
CSTR:32393.14.AOS250924