A Transformer Frequency Domain Learnability Method for Infrared Image Recognition

LAI Guangming, ZHANG Zhuoshi, GUO Xinping, WANG Min. A Transformer Frequency Domain Learnability Method for Infrared Image Recognition[J]. Electronics Optics & Control, 2023, 30(8): 13

Paper Information

Received: Jun. 17, 2022

Published Online: Jan. 17, 2024

DOI:10.3969/j.issn.1671-637x.2023.08.003
