Chinese Optics, Volume 16, Issue 6, 1343 (2023)

Multimodal feature fusion based on heterogeneous optical neural networks

Yi-zhen ZHENG, Jian DAI, Tian ZHANG*, and Kun XU
Author Affiliations
  • State Key Laboratory of Information Photonics and Optical Communications, Beijing University of Posts and Telecommunications, Beijing 100876, China

    CLP Journals

    [1] Hui-bin CHEN, Kai-fei TANG, Zhen-yu YOU. Fully complex optical neural network with insertion-loss robustness[J]. Chinese Optics, 2024, 17(4): 834

    Citation

    Yi-zhen ZHENG, Jian DAI, Tian ZHANG, Kun XU. Multimodal feature fusion based on heterogeneous optical neural networks[J]. Chinese Optics, 2023, 16(6): 1343

    Paper Information

    Category: Original Article

    Received: Mar. 1, 2023

    Accepted: --

    Published Online: Nov. 29, 2023

    DOI:10.37188/CO.2023-0036
