Photonics Research, Vol. 13, Issue 7, 1902 (2025)

Super-wide-field-of-view long-wave infrared gaze polarization imaging embedded in a multi-strategy detail feature extraction and fusion network

Dongdong Shi1,†, Jinhang Zhang1,†, Jun Zou1, Fuyu Huang1,2,*, Limin Liu1,3,*, Li Li1, Yudan Chen1, Bing Zhou1, and Gang Li1
Author Affiliations
  • 1Shijiazhuang Campus, Army Engineering University of PLA, Shijiazhuang 050000, China
  • 2e-mail: hfyoptics@163.com
  • 3e-mail: lk0256@163.com
    Figures & Tables
    Fig. 1. Concept of SWFOV LWIR gaze polarization imaging. (a) Equipment overview. The system integrates a vanadium oxide uncooled IR focal-plane detector with an independently designed SWFOV LWIR gaze polarization lens. The focal-plane resolution of the IR core is 1280×1024, the pixel size is 17 μm, and the frame rate is up to 30 Hz. (b) The polarization lens is coupled with a rotatable holographic wire-grid IR polarizer to achieve a 150°×120° wide-field acquisition capability. (c) The polarization lens is designed on the non-similar imaging principle and adopts an isometric (equidistant) projection to optimize SWFOV imaging performance. (d) Stokes vectors are obtained by parsing the polarization components of the radiation information, revealing the target's polarization properties. (e) Image fusion is used to improve image quality and enhance the value of the fused images in subsequent applications.
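The projection choice in (c) is what makes a 150°×120° field practical: a conventional perspective mapping y = f·tan θ diverges toward the field edge, while an angle-proportional f–θ mapping keeps image height bounded. A minimal Python comparison, assuming the caption's isometric (equidistant) projection denotes the f–θ mapping; the focal length is illustrative, not the paper's design value:

```python
import math

f_mm = 10.0  # illustrative focal length; NOT the paper's design value
for deg in (15, 37.5, 60, 75):  # up to the 75 deg half field of the 150 deg FOV
    theta = math.radians(deg)
    print(f"{deg:5.1f} deg:  f*theta = {f_mm * theta:6.2f} mm,"
          f"  f*tan(theta) = {f_mm * math.tan(theta):7.2f} mm")
```

At the 75° half field, the perspective mapping needs nearly three times the image height of the f–θ mapping, which is why SWFOV designs abandon similar (perspective) imaging.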
    Fig. 2. Performance evaluation of the SWFOV LWIR gaze polarization lens. (a) Spot diagram: the diffuse-spot characteristics are demonstrated by evaluating the blur spots at four FOV angles with reference to the chief ray. (b) Percentage of total enclosed energy at the four FOV angles, compared with the diffraction-limit curve (the aberration-free response). (c) Diffraction MTF computed by the FFT method for the four FOV angles over the spatial-frequency range [0, 29] lp/mm (the Nyquist frequency is 29 lp/mm). (d) Relative illuminance, calculated by integrating the effective area of the exit pupil seen from the image point (performed in cosine space). The effective F/# is inversely proportional to the square root of the solid angle subtended by the exit pupil in cosine space, weighted by the polarization-lens transmittance. (e) Field-curvature curve showing the offsets of the paraxial image plane. (f) Distortion curve showing the corresponding distortion offsets.
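The Nyquist frequency quoted in (c) follows directly from the 17 μm pixel pitch given in Fig. 1, f_N = 1/(2p); a one-line sanity check in Python:

```python
pixel_pitch_mm = 0.017  # 17 um pixel size from Fig. 1
nyquist_lp_per_mm = 1.0 / (2.0 * pixel_pitch_mm)
print(f"Nyquist frequency = {nyquist_lp_per_mm:.1f} lp/mm")  # -> 29.4 lp/mm
```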
    Fig. 3. SWFOV LWIR gaze polarization imaging test. By rotating the holographic wire-grid IR polarizer in the lens head, image data are captured at four orientations: 0°, 45°, 90°, and 135°. Stokes vectors are calculated for each pixel; the s0 parameter corresponds directly to the SWFOV LWIR image, and the SWFOV LWIR DoLP images are derived from the remaining components. The SWFOV LWIR images and SWFOV LWIR DoLP images are presented in a checkerboard layout.
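For reference, the per-pixel computation described above reduces to the standard linear-Stokes relations for a 0°/45°/90°/135° analyzer sequence. A minimal NumPy sketch (function names and the checkerboard tile size are our choices, not the paper's code):

```python
import numpy as np

def stokes_from_stack(i0, i45, i90, i135):
    """Linear Stokes parameters from intensities at analyzer angles 0/45/90/135 deg."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity -> the SWFOV LWIR image
    s1 = i0 - i90
    s2 = i45 - i135
    return s0, s1, s2

def dolp(s0, s1, s2, eps=1e-8):
    """Degree of linear polarization per pixel."""
    return np.sqrt(s1 ** 2 + s2 ** 2) / (s0 + eps)

def checkerboard(a, b, tile=64):
    """Interleave two same-shaped gray images in a checkerboard layout for display."""
    yy, xx = np.mgrid[0:a.shape[0], 0:a.shape[1]]
    mask = ((yy // tile + xx // tile) % 2).astype(bool)
    out = a.copy()
    out[mask] = b[mask]
    return out
```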
    Fig. 4. Image fusion network. (a) The fusion network contains the LLRR model, the Detail CNN, the ADFU model, and a decoder network. The LLRR model extracts sparse features SIR and SDoLP from the IR and IR DoLP images. The Detail CNN model extracts high-frequency detail features ΨIRD and ΨDoLPD. The primary detail features, obtained by adding S and Ψ, are delivered to the attention-based ADFU model, which produces new detail features FDoLPD of salient targets. Meanwhile, the LLRR model and the Base Poolformer model extract the low-rank features LIR and low-frequency features BIR of the IR image, respectively, which are combined into the background features FIRB. Finally, the detail features FDoLPD and the background features FIRB are fused by the decoder network to obtain the fusion image. (b) LLRR model structure. The input data is X. The activation function is hξ(x) = sign(x)·max(|x|−ξ, 0), and Z0 = hξ(V0*X). (c) ADFU model structure. The primary detail features are segmented into two parts, A and B. A 3×3 depthwise convolution (DWConv) is applied to A to generate non-local features, and adaptive maximum pooling (Pooling) extracts low-frequency information. For B, a 3×3 depthwise convolution is followed directly by a 1×1 convolution and an activation function to obtain enhanced high-frequency features. The features from the A and B paths are added to obtain the improved detail features. The Detail CNN model inside the ADFU model then performs a third detail-feature extraction on the improved detail features to obtain the final detail features FDoLPD. (d) Base Poolformer model structure. This model acquires background features; pooling operations enhance the perception of feature information, and PoolMLP applies two 1×1 convolutional layers with activations to the features.
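To make the dataflow in (b)–(d) concrete, here is a minimal PyTorch sketch of the soft-threshold activation hξ and the two split-path blocks. Channel widths, the even channel split, and the GELU activation are our assumptions; this sketches the described structure, not the authors' implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def h_xi(x, xi=0.01):
    # LLRR activation: h_xi(x) = sign(x) * max(|x| - xi, 0), i.e. soft thresholding
    return torch.sign(x) * torch.clamp(torch.abs(x) - xi, min=0.0)

class ADFUSketch(nn.Module):
    """Split-path detail refinement loosely following the ADFU description (Fig. 4c)."""

    def __init__(self, channels=64):  # width is an illustrative assumption
        super().__init__()
        half = channels // 2
        self.dw_a = nn.Conv2d(half, half, 3, padding=1, groups=half)  # 3x3 DWConv, path A
        self.dw_b = nn.Conv2d(half, half, 3, padding=1, groups=half)  # 3x3 DWConv, path B
        self.pw_b = nn.Conv2d(half, half, 1)                          # 1x1 conv, path B

    def forward(self, x):
        a, b = torch.chunk(x, 2, dim=1)  # segment primary detail features into A and B
        h, w = a.shape[-2:]
        # Path A: depthwise conv, then adaptive max pooling for low-frequency information
        low = F.adaptive_max_pool2d(self.dw_a(a), (h // 2, w // 2))
        a = F.interpolate(low, size=(h, w), mode="nearest")
        # Path B: depthwise conv, then 1x1 conv + activation -> enhanced high-frequency features
        b = F.gelu(self.pw_b(self.dw_b(b)))
        return a + b  # A and B features added = improved detail features

class PoolMLPSketch(nn.Module):
    """Base Poolformer mixing (Fig. 4d): pooling as token mixer + two 1x1 conv layers."""

    def __init__(self, channels=64):
        super().__init__()
        self.pool = nn.AvgPool2d(3, stride=1, padding=1)
        self.mlp = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.GELU(),
                                 nn.Conv2d(channels, channels, 1))

    def forward(self, x):
        x = x + (self.pool(x) - x)  # pooling enhances perception of feature information
        return x + self.mlp(x)      # PoolMLP: two 1x1 convolutions with activation
```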
    Fig. 5. Fusion effects in the ablation study. (a) Fusion images and close-ups from the baseline model, the LLRR model, the ADFU model, and the LLRR + ADFU model. Three scenes are rendered in pseudo-color for viewing convenience. (b) Loss curves of the image fusion network. (c) Parameter counts of the four ablation models.
    Fig. 6. Qualitative comparison of different fusion methods for three scenes. (a) IR. (b) IR DoLP. (c) RFN-Nest. (d) SwinFusion. (e) YDTR. (f) CDDFuse. (g) CMTFusion. (h) DIF-Fusion. (i) DCSFuse. (j) Ours.
    Fig. 7. Testing our dataset with different fusion methods. The YOLOX detection network is used to detect car targets in SWFOV LWIR images, SWFOV LWIR DoLP images, and SWFOV fused images; the average precision and confidence of car detection are shown.
    Fig. 8. Car detection at different distances. Cars are detected with the YOLOX detection network in SWFOV LWIR images, SWFOV LWIR DoLP images, and SWFOV fused images.
    Fig. 9. IR camouflage target recognition experiments. (a) Experimental setup. (b) Visible images of a subject wearing the IR camouflage suit in bush, building, and grass environments. (c) Recognition tests of IR targets and IR camouflaged targets in road and grass environments, respectively. The SWFOV LWIR images, SWFOV LWIR DoLP images, and SWFOV fused images are shown from left to right. The gray values of the rectangular contour-marked regions are counted and annotated with the HSI color model; in the close-ups, bar height indicates gray value. (d) Pixel distributions of the SWFOV LWIR images, SWFOV LWIR DoLP images, and SWFOV fused images for the four scenes in Fig. 3.
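The gray-value statistics in (c) and the pixel distributions in (d) amount to simple region histograms. A small NumPy sketch, with a hypothetical box convention of ours:

```python
import numpy as np

def region_gray_stats(image, box):
    """Gray-value statistics of a rectangular contour-marked region.

    box = (top, left, height, width) is a hypothetical convention;
    image is a 2-D 8-bit gray-scale array.
    """
    t, l, h, w = box
    patch = image[t:t + h, l:l + w].astype(np.float64)
    hist, _ = np.histogram(patch, bins=256, range=(0, 255))
    return {"mean": patch.mean(), "std": patch.std(),
            "max_minus_min": patch.max() - patch.min(), "hist": hist}
```

A larger mean gap between the target and background regions corresponds to the taller bars seen in the close-ups.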
    • Table 1. Quantitative Evaluation of Ablation Study on the Test Dataset

      Method      | CER (%)  | DR (%)   | PID      | PSNR    | En
      Baseline    | 74.5631  | 74.915   | 299.857  | 2.8916  | 3.7342
      LLRR        | 128.8028 | 76.1781  | 259.3154 | 4.0078  | 3.0442
      ADFU        | 17.3612  | 128.8947 | 86.6665  | 12.9008 | 6.7563
      LLRR + ADFU | 32.0598  | 147.1214 | 81.3571  | 13.9214 | 6.5471
    • Table 2. Quantitative Evaluation of Different Fusion Methods on the Test Dataset

      Method     | CER (%) | DR (%)   | PID      | PSNR    | En
      RFN-Nest   | 5.1064  | 121.9426 | 99.7914  | 11.8487 | 6.7448
      SwinFusion | 4.5682  | 109.5763 | 188.0744 | 6.177   | 7.2394
      YDTR       | 19.1432 | 81.1398  | 166.2386 | 7.3832  | 7.0926
      CDDFuse    | 6.0211  | 125.2308 | 186.945  | 6.2367  | 7.3335
      CMTFusion  | 19.2669 | 131.9363 | 114.8855 | 10.6667 | 6.9971
      DIF-Fusion | 9.5311  | 126.0948 | 196.8322 | 5.9109  | 7.1988
      DCSFuse    | 23.8086 | 113.7567 | 250.0697 | 4.1158  | 7.2151
      Ours       | 32.0598 | 147.1214 | 81.3571  | 13.9214 | 6.5471
    Citation: Dongdong Shi, Jinhang Zhang, Jun Zou, Fuyu Huang, Limin Liu, Li Li, Yudan Chen, Bing Zhou, and Gang Li, "Super-wide-field-of-view long-wave infrared gaze polarization imaging embedded in a multi-strategy detail feature extraction and fusion network," Photonics Res. 13, 1902 (2025).
    Paper Information

    Category: Image Processing and Image Analysis

    Received: Feb. 18, 2025

    Accepted: Apr. 24, 2025

    Published Online: Jul. 1, 2025

    Corresponding authors: Fuyu Huang (hfyoptics@163.com), Limin Liu (lk0256@163.com)

    DOI: 10.1364/PRJ.559833

    CSTR: 32188.14.PRJ.559833
