Chinese Optics, Volume 16, Issue 6, 1343 (2023)
Multimodal feature fusion based on heterogeneous optical neural networks
Yi-zhen ZHENG, Jian DAI, Tian ZHANG, Kun XU. Multimodal feature fusion based on heterogeneous optical neural networks[J]. Chinese Optics, 2023, 16(6): 1343
Category: Original Article
Received: Mar. 1, 2023
Published Online: Nov. 29, 2023