Multimodal feature fusion based on heterogeneous optical neural networks

Yi-zhen ZHENG; Jian DAI; Tian ZHANG; Kun XU

doi:10.37188/CO.2023-0036

[1] WANG H Q, HOU W B, HUANG R, et al. Spatial pulse position modulation multi-classification detector based on deep learning[J]. Chinese Optics, 16, 415-424(2023).

[2] JIANG L Q, NING CH Y, YU H T, et al. Classification model based on fusion of multi-scale feature and channel feature for benign and malignant brain tumors[J]. Chinese Optics, 15, 1339-1349(2022).

[3] LI G N, SHI J K, CHEN X M, et al. Through-focus scanning optical microscopy measurement based on machine learning[J]. Chinese Optics, 15, 703-711(2022).

[4] XIAO SH L, HU CH H, GAO L Y, et al. Pixel mapping variable-resolution spectral imaging reconstruction[J]. Chinese Optics, 15, 1045-1054(2022).

[5] MARKRAM H, MULLER E, RAMASWAMY S, et al. Reconstruction and simulation of neocortical microcircuitry[J]. Cell, 163, 456-492(2015).

[6] GOODMAN J W, DIAS A R, WOODY L M. Fully parallel, high-speed incoherent optical method for performing discrete Fourier transforms[J]. Optics Letters, 2, 1-3(1978).

[7] RECK M, ZEILINGER A, BERNSTEIN H J, et al. Experimental realization of any discrete unitary operator[J]. Physical Review Letters, 73, 58-61(1994).

[8] CLEMENTS W R, HUMPHREYS P C, METCALF B J, et al. Optimal design for universal multiport interferometers[J]. Optica, 3, 1460-1465(2016).

[9] SHEN Y CH, HARRIS N C, SKIRLO S, et al. Deep learning with coherent nanophotonic circuits[J]. Nature Photonics, 11, 441-446(2017).

[10] ZHANG T, WANG J, LIU Q, et al. Efficient spectrum prediction and inverse design for plasmonic waveguide systems based on artificial neural networks[J]. Photonics Research, 7, 368-380(2019).

[11] BAGHERIAN H, SKIRLO S, SHEN Y CH, et al. On-chip optical convolutional neural networks[J]. arXiv:1808.03303(2018).

[12] QU Y R, ZHU H ZH, SHEN Y CH, et al. Inverse design of an integrated-nanophotonics optical neural network[J]. Science Bulletin, 65, 1177-1183(2020).

[13] DAN Y H, FAN Z Y, SUN X J, et al. All-type optical logic gates using plasmonic coding metamaterials and multi-objective optimization[J]. Optics Express, 30, 11633-11646(2022).

[14] ZHANG CH, YANG Z CH, HE X D, et al. Multimodal intelligence: representation learning, information fusion, and applications[J]. IEEE Journal of Selected Topics in Signal Processing, 14, 478-493(2020).

[15] HUANG Y, DU CH ZH, XUE Z H, et al. What makes multimodal learning better than single (provably)[C]. 35th Conference on Neural Information Processing Systems, NeurIPS, 2021: 10944-10956.

[16] PENG X K, WEI Y K, DENG A D, et al. Balanced multimodal learning via on-the-fly gradient modulation[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 2022: 8228-8237.

[17] RAMESH A, PAVLOV M, GOH G, et al. Zero-shot text-to-image generation[C]. Proceedings of the 38th International Conference on Machine Learning, ICML, 2021: 8821-8831.

[18] NAGRANI A, YANG SH, ARNAB A, et al. Attention bottlenecks for multimodal fusion[C]. 35th Conference on Neural Information Processing Systems, NeurIPS, 2021: 14200-14213.

[19] TROSTEN D J, LØKSE S, JENSSEN R, et al. Reconsidering representation alignment for multi-view clustering[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 2021: 1255-1265.

[20] JIA CH, YANG Y F, XIA Y, et al. Scaling up visual and vision-language representation learning with noisy text supervision[C]. Proceedings of the 38th International Conference on Machine Learning, ICML, 2021: 4904-4916.

[21] ANASTASOPOULOS A, KUMAR S, LIAO H. Neural language modeling with visual features[J]. arXiv:1903.02930(2019).

[22] VIELZEUF V, LECHERVY A, PATEUX S, et al. CentralNet: a multilayer approach for multimodal fusion[C]. Proceedings of the European Conference on Computer Vision, Munich, 2019: 575-589.

[23] ZHANG H, GU M, JIANG X D, et al. An optical neural chip for implementing complex-valued neural network[J]. Nature Communications, 12, 457(2021).

[24] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]. Proceedings of the 15th European Conference on Computer Vision (ECCV), Munich, 2018: 3-19.

[25] LIN X, RIVENSON Y, YARDIMCI N T, et al. All-optical machine learning using diffractive deep neural networks[J]. Science, 361, 1004-1008(2018).

[26] WU Q H, SUI X B, FEI Y H, et al. Multi-layer optical Fourier neural network based on the convolution theorem[J]. AIP Advances, 11, 055012(2021).

[27] FELDMANN J, YOUNGBLOOD N, KARPOV M, et al. Parallel convolutional processing using an integrated photonic tensor core[J]. Nature, 589, 52-58(2021).

[28] ZHANG D N, ZHANG Y J, ZHANG Y, et al. Training and inference of optical neural networks with noise and low-bits control[J]. Applied Sciences, 11, 3692(2021).

[29] KRIEGESKORTE N. Deep neural networks: a new framework for modeling biological vision and brain information processing[J]. Annual Review of Vision Science, 1, 417-446(2015).

[30] GENG Y, HAN Z B, ZHANG CH Q, et al. Uncertainty-aware multi-view representation learning[C]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021: 7545-7553.

[31] JIA X D, JING X Y, ZHU X K, et al. Semi-supervised multi-view deep discriminant representation learning[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43, 2496-2509(2021).

[32] HAN Z B, ZHANG CH Q, FU H ZH, et al. Trusted multi-view classification with dynamic evidential fusion[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45, 2551-2566(2023).

[33] SHAO R, ZHANG G, GONG X. Generalized robust training scheme using genetic algorithm for optical neural networks with imprecise components[J]. Photonics Research, 10, 1868-1876(2022).
