Acta Optica Sinica, Volume. 45, Issue 14, 1420011(2025)

Principles and Applications of Nonlinearity in Optical Neural Networks (Invited)

Yufei Wang1,2, Yumeng Chen1, Yongzheng Yang1, Kun Liao1,*, Xiaoyong Hu1,2,**, and Qihuang Gong1,2
Author Affiliations
  • 1State Key Laboratory of Artificial Microstructure and Mesoscopic Physics, School of Physics, Peking University, Beijing 100871, China
  • 2Yangtze Delta Institute of Optoelectronics, Peking University, Nantong 226010, Jiangsu, China

    Significance

    Research on nonlinearity in optical neural networks is of critical importance because nonlinear activation functions enable neural networks to overcome the limitations of purely linear transformations and to learn complex features. As artificial intelligence applications increasingly demand high-efficiency, low-power computing platforms, implementing nonlinear activation optically can leverage the intrinsic advantages of optics, including massive parallelism, low latency, and low energy consumption, and thus holds the potential to drive revolutionary advances in areas such as computer vision and natural language processing. To date, linear weighted operations in optical neural networks have been widely validated across various platforms and architectures; however, the realization of nonlinear functions still largely relies on backend electronic nonlinearities. This typically involves converting optical signals to electrical signals via photodetectors, then applying the nonlinearity in the digital domain after analog-to-digital conversion. Such a process incurs substantial energy overhead, preventing optical neural networks from simultaneously achieving strong representational power and low operational energy. To overcome this limitation, researchers have explored multiple optical nonlinear schemes, including fully optical control and optoelectronic hybrid control. In optoelectronic hybrid schemes, energy consumption arises mainly from pump light, modulators, and receivers, whereas in fully optical control schemes, the energy cost is dominated by pump light alone. When low-threshold designs such as resonance enhancement or phase-change materials are employed, fully optical nonlinear control has greater potential for low-energy operation than optoelectronic hybrid approaches. Conversely, optoelectronic nonlinear schemes offer higher reconfigurability and flexibility than fully optical implementations.
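
The point that purely linear layers cannot learn complex features can be illustrated with a small numerical sketch. It is not taken from the review: the weight matrices `W1` and `W2` and all function names are hypothetical, chosen only so that the result is easy to check by hand. Two stacked linear layers collapse into a single matrix product, whereas placing a ReLU between the same two layers yields an XOR-like mapping that no single linear layer can produce.

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matmul(A, B):
    """Multiply matrices A and B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, vi) for vi in v]


# Hypothetical weights chosen for illustration only.
W1 = [[1.0, -1.0], [-1.0, 1.0]]
W2 = [[1.0, 1.0]]


def two_linear_layers(x):
    """Two linear layers applied in sequence, with no activation."""
    return matvec(W2, matvec(W1, x))


def collapsed_layer(x):
    """The single equivalent linear layer with weights W2 @ W1."""
    return matvec(matmul(W2, W1), x)


def two_layers_with_relu(x):
    """The same two layers with a ReLU inserted between them."""
    return matvec(W2, relu(matvec(W1, x)))


for x in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
    print(x, two_linear_layers(x), collapsed_layer(x), two_layers_with_relu(x))
```

With these weights the linear stack returns zero for every input, while the ReLU version returns 1 only when exactly one input is 1.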

    Progress

    Against this background, this review surveys schemes for realizing nonlinearity and their applications in optical neural networks. Specifically, the review covers 1) fully optical nonlinear schemes, encompassing second-order nonlinear processes, third-order nonlinear effects, and phase-change-based modulation approaches; 2) optoelectronic hybrid control schemes, including optical-electrical-optical and optical-electrical configurations; and 3) the deployment of nonlinear activation functions and nonlinear neuron constructs within optical neural network architectures.

    In the domain of fully optical nonlinearity, second-order nonlinear processes exploit materials such as periodically poled lithium niobate to achieve activation-like behavior (e.g., ReLU- or Sigmoid-like mapping) via second-harmonic generation or parametric interactions, as shown in Fig. 1. Extensions include combining polycrystalline lithium niobate scattering with frequency-doubled light to construct composite linear-nonlinear mappings. Third-order nonlinear approaches leverage saturable absorption or reverse-saturable absorption in atomic media or two-dimensional materials (e.g., graphene, MoS2, Ti3C2Tx, MoTe2, Bi2Te3) integrated into waveguides or atomic vapor cells to introduce activation behavior, as shown in Fig. 2. Additional third-order schemes use microring resonators (MRRs): free-carrier dispersion and thermo-optic effects within the resonator produce soft-threshold or ReLU/Sigmoid-like responses, as shown in Fig. 3. Phase-change material-based modulation (e.g., VO2, Ge2Sb2Te5) combined with resonant structures yields nonvolatile, multilevel activation units, affording memory-enabled nonlinear operations, as shown in Fig. 4.
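
How saturable absorption acts as an activation function can be sketched with a common textbook simplification. This is an illustration, not a model taken from the review: it assumes a thin absorber whose absorption coefficient saturates as alpha0 / (1 + I / I_sat), and the parameter names and values (`alpha0_l`, `i_sat`) are hypothetical. Weak inputs are strongly attenuated and strong inputs pass almost unchanged, which gives a soft-threshold, ReLU-like response.

```python
import math


def saturable_absorber(i_in, alpha0_l=2.0, i_sat=1.0):
    """Output intensity of an idealized thin saturable absorber.

    i_in     : input intensity (arbitrary units, must be >= 0)
    alpha0_l : small-signal absorbance alpha0 * L (hypothetical value)
    i_sat    : saturation intensity (hypothetical value, same units as i_in)
    """
    if i_in < 0:
        raise ValueError("intensity must be non-negative")
    # The absorption bleaches as the input intensity approaches i_sat.
    transmission = math.exp(-alpha0_l / (1.0 + i_in / i_sat))
    return i_in * transmission


# Transmission rises with input intensity: the nonlinearity of interest.
for i_in in (0.01, 0.1, 1.0, 10.0, 100.0):
    i_out = saturable_absorber(i_in)
    print(f"I_in={i_in:8.2f}  I_out={i_out:8.3f}  T={i_out / i_in:.3f}")
```

Reverse-saturable absorption would show the opposite trend, with transmission falling as the intensity rises; it is not modeled here.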

    Turning to optoelectronic hybrid control, optical-electrical-optical configurations implement programmable nonlinear functions by feeding photodetector outputs into electro-optic, thermo-optic, or free-carrier modulators and then back into the optical domain; such schemes can incorporate two-dimensional material devices (for example, MoS2 photoconductive memory driving Mach-Zehnder interferometer (MZI) or MRR phase modulation, or graphene/silicon heterojunction MRR) to realize amplitude- and phase-reconfigurable activations, as shown in Fig. 5 and Fig. 6. Optical-electrical schemes exploit the inherent square-law response of photodetectors or interferometric balanced detection to form nonlinear nodes that deliver rapid, low-power nonlinear mappings without requiring feedback into the optical domain, as shown in Fig. 7.
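
The nonlinearity available from detection alone can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheme of any specific work in the review: the responsivity is set to 1, the coupler is an idealized lossless 50:50 element that outputs the sum and difference of its inputs, and the function names are hypothetical. A single photodetector maps a complex field E to |E|^2, and balanced detection of a signal against a reference yields 2·Re(E_s·E_ref*), the product of the two fields.

```python
import math


def photodetect(field):
    """Square-law detection: photocurrent proportional to |E|^2 (responsivity 1)."""
    return abs(field) ** 2


def balanced_detect(signal, reference):
    """Difference photocurrent after an idealized 50:50 sum/difference coupler.

    Equals 2 * Re(signal * conj(reference)) for complex field amplitudes.
    """
    plus = (signal + reference) / math.sqrt(2)
    minus = (signal - reference) / math.sqrt(2)
    return photodetect(plus) - photodetect(minus)


# Doubling the field amplitude quadruples the photocurrent.
print(photodetect(1 + 0j), photodetect(2 + 0j))

# In-phase fields give a maximal difference signal, quadrature fields give zero.
print(balanced_detect(1 + 0j, 1 + 0j), balanced_detect(1j, 1 + 0j))
```

Because the response depends on field amplitude and relative phase, no modulator or return path to the optical domain is needed to obtain the nonlinear mapping.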

    Finally, applications of these nonlinear implementations are surveyed: nonlinear activation functions have been integrated into feedforward optical neural network hardware to achieve high-accuracy tasks such as handwritten digit recognition, color image classification, and speech classification, as shown in Fig. 8; in reservoir computing, spatially and temporally structured reservoirs employing phase modulation, optical feedback, and optoelectronic detection enable large-scale or deep reservoir networks for action recognition, time-series prediction, and cardiac rhythm detection, as shown in Fig. 9; in spiking neural network implementations, pulses triggered by saturable absorbers or phase-change materials realize threshold-integrate-and-fire dynamics, supporting both supervised and unsupervised pattern recognition, as shown in Fig. 10.
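
The threshold-integrate-and-fire dynamics mentioned for spiking networks can be sketched in discrete time. This is a generic leaky integrate-and-fire model rather than a description of any particular saturable-absorber or phase-change device; the function name and the parameters `threshold` and `leak` are hypothetical. The neuron accumulates input energy, loses part of it at each step, and emits a spike and resets once the accumulated state reaches the threshold.

```python
def integrate_and_fire(inputs, threshold=1.0, leak=0.9):
    """Return a list of 0/1 spikes for a sequence of input pulse energies.

    threshold : state level at which the neuron fires (hypothetical value)
    leak      : fraction of the state retained per step; 1.0 means no leak
    """
    state = 0.0
    spikes = []
    for x in inputs:
        state = leak * state + x      # integrate the input with leakage
        if state >= threshold:        # threshold reached: fire
            spikes.append(1)
            state = 0.0               # reset after the spike
        else:
            spikes.append(0)
    return spikes


# Without leakage, two pulses of 0.5 are enough to reach the threshold.
print(integrate_and_fire([0.5, 0.5, 0.5, 0.5], threshold=1.0, leak=1.0))

# Weak pulses with strong leakage never accumulate enough to fire.
print(integrate_and_fire([0.1] * 5, threshold=1.0, leak=0.5))
```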

    Conclusions and Prospects

    Although optoelectronic hybrid schemes are relatively mature, they are limited by latency and energy consumption. Fully optical nonlinear approaches offer potential advantages in speed and energy efficiency but require breakthroughs in low-threshold, fast-response materials and devices. Different network architectures impose distinct requirements on activation functions: future research should focus on providing reconfigurable and collaboratively optimized activation functions at the device level. Scaling up network size faces challenges such as optical power attenuation and integration complexity; system-level strategies including gain compensation, topology optimization, and energy-recycling mechanisms are needed. Moreover, issues such as device variability, thermal management, and fabrication yield must be addressed to ensure reliable operation. Standardized benchmarks and calibration protocols are necessary for fair performance evaluation, and modular architectures can facilitate scalable deployment and maintenance. Demonstrations on representative artificial intelligence (AI) tasks and integration with existing electronic platforms will validate practical viability and guide iterative improvements. Cross-disciplinary integration, combining novel nonlinear photonic materials, micro/nano-device innovations, and co-design of devices and systems, promises to accelerate the realization of large-scale, efficient, low-power optical neural networks and to drive innovative applications in complex artificial intelligence tasks.

    Yufei Wang, Yumeng Chen, Yongzheng Yang, Kun Liao, Xiaoyong Hu, Qihuang Gong. Principles and Applications of Nonlinearity in Optical Neural Networks (Invited)[J]. Acta Optica Sinica, 2025, 45(14): 1420011

    Paper Information

    Category: Optics in Computing

    Received: Apr. 15, 2025

    Accepted: Jun. 23, 2025

    Published Online: Jul. 18, 2025

    The Author Email: Kun Liao (kunliao@pku.edu.cn), Xiaoyong Hu (xiaoyonghu@pku.edu.cn)

    DOI:10.3788/AOS250924

    CSTR:32393.14.AOS250924
