Multimodal feature fusion based on heterogeneous optical neural networks

Yi-zhen ZHENG; Jian DAI; Tian ZHANG; Kun XU

doi:10.37188/CO.2023-0036

Chinese Optics, Volume 16, Issue 6, 1343 (2023)

Yi-zhen ZHENG, Jian DAI, Tian ZHANG^*, and Kun XU

State Key Laboratory of Information Photonics and Optical Communications, Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract

Current research on photonic neural networks focuses mainly on improving the performance of single-modal networks, while research on multimodal information processing is lacking. Unlike single-modal networks, multimodal learning exploits the complementary information between modalities, so the representation learned by the model is more complete. In this paper, we propose a method that combines photonic neural networks with multimodal fusion techniques. First, a heterogeneous photonic neural network is constructed by combining a photonic convolutional neural network and a photonic artificial neural network, and it is used to process the multimodal data. Second, an attention mechanism is introduced in the fusion stage to enhance fusion performance and, ultimately, task classification accuracy. On the MNIST handwritten-digit classification task, the heterogeneous photonic neural network achieves a classification accuracy of 95.75% with splicing (concatenation) fusion and 98.31% with attention-based fusion, which is better than many current advanced single-modal photonic neural networks. Compared with the electronic heterogeneous neural network, the training speed of the model is improved by 1.7 times. Compared with a single-modal photonic neural network, the heterogeneous photonic neural network learns a more complete representation, which effectively improves classification accuracy on the MNIST handwritten-digit dataset.
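The abstract names two fusion strategies, splicing (concatenation) and attention-based fusion, but does not give their formulas. The sketch below is only an illustration of the general idea in plain Python, not the authors' implementation: the function names, the toy feature vectors, and the `query` vector (standing in for a learned parameter) are all assumptions, and the two input vectors stand in for the outputs of the convolutional and fully connected branches.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def concat_fusion(feat_a, feat_b):
    """Splicing fusion: place the two feature vectors end to end."""
    return list(feat_a) + list(feat_b)


def attention_fusion(feat_a, feat_b, query):
    """Attention-weighted fusion of two equal-length feature vectors.

    Each modality gets a scalar score (dot product with a hypothetical
    learned query vector). Softmax turns the two scores into weights,
    and the fused vector is the weighted sum of the two modalities.
    """
    if not (len(feat_a) == len(feat_b) == len(query)):
        raise ValueError("feature vectors and query must have the same length")
    scores = [sum(q * x for q, x in zip(query, f)) for f in (feat_a, feat_b)]
    w_a, w_b = softmax(scores)
    fused = [w_a * a + w_b * b for a, b in zip(feat_a, feat_b)]
    return fused, (w_a, w_b)


# Toy features standing in for the outputs of the two branches (assumed values).
cnn_features = [0.2, 0.9, 0.1, 0.4]
ann_features = [0.7, 0.3, 0.5, 0.6]
query = [1.0, 0.0, 0.0, 0.0]  # hypothetical stand-in for a learned parameter

spliced = concat_fusion(cnn_features, ann_features)
fused, weights = attention_fusion(cnn_features, ann_features, query)
```

In this sketch, splicing doubles the feature dimension and leaves the weighting of modalities entirely to the downstream classifier, whereas the attention variant keeps the dimension fixed and weights each modality by its score. Whether the paper's mechanism operates per modality, per channel, or per feature is not stated in the abstract.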

Keywords

attention mechanism; multimodal; photonic neural network

Yi-zhen ZHENG, Jian DAI, Tian ZHANG, Kun XU. Multimodal feature fusion based on heterogeneous optical neural networks[J]. Chinese Optics, 2023, 16(6): 1343

Paper Information

Category: Original Article

Received: Mar. 1, 2023

Accepted: --

Published Online: Nov. 29, 2023

The Author Email:

DOI:10.37188/CO.2023-0036
