Opto-Electronic Engineering, Vol. 52, Issue 4, 240309 (2025)

Multi-task attention mechanism based no reference quality assessment algorithm for screen content images

Ziyi Zhou, Wu Dong*, Likun Lu, Qian Ma, Guopeng Hou, and Erqing Zhang
Author Affiliations
  • Beijing Key Laboratory of Signal and Information Processing for High-end Printing Equipment, Beijing Institute of Graphic Communication, Beijing 102600, China
    Figures & Tables(18)
    Structure of the MTA-SCI proposed in the paper
    Structure diagram of integrated local attention mechanism
    Structure of group-wise attention mechanism with spatial shifts
    Structure of asymmetric convolutional channel attention mechanism
    Structure of dual-channel feature mapping module
    Variation curves of PLCC, SRCC, and loss values obtained on the SCID dataset. (a) Variation curves of PLCC and SRCC; (b) Variation curve of the loss value
    Variation curves of PLCC, SRCC, and loss values obtained on the SIQAD dataset. (a) Variation curves of PLCC and SRCC; (b) Variation curve of the loss value
    Distorted screen content image. (a) Reference image; (b) SCI33_5_3.bmp; (c) SCI33_5_4.bmp; (d) SCI33_5_5.bmp
    • Table 1. Typical methods of screen content image quality assessment

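      Category | Method | Type | Feature
      The first category | SPQA[6] | FR | Brightness and sharpness
      The first category | ESIM[9] | FR | Edge contrast
      The first category | MSDL[19] | FR | Feature extraction using log-Gabor filters
      The first category | BLIQUP-SCI[7] | NR | Natural scene statistics features and local texture
      The first category | Yang et al.[8] | NR | The amplitude, variance, entropy, and edge structure of wavelet coefficients
      The first category | Huang et al.[20] | RR | Oriented histogram, local discrete cosine transform coefficients, and gradient of amplitude in color channels
      The second category | SR-CNN[10] | FR | Multi-level CNN features
      The second category | QODCNN[12] | FR/NR | CNN features
      The second category | Gao et al.[15] | NR | CNN features
      The second category | Zhang et al.[16] | NR | CNN features
      The second category | MIC-CNN[13] | NR | CNN features
      The second category | SIQA-DF-II[11] | NR | CNN features
      The second category | RIQA[14] | NR | CNN features
      The second category | DAMC[21] | FR | CNN features
      The second category | MTDL[17] | NR | CNN features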
    • Table 2. Group-based spatial shift operation

      Xn | Spatial shift
      n=1 | Shift1: move tensor x1 down by one pixel vertically and right by one pixel horizontally
      n=2 | Shift2: move tensor x2 down by two pixels vertically and right by two pixels horizontally
      n=3 | Shift3: move tensor x3 right by one pixel horizontally and down by one pixel vertically
      n=4 | Without any processing
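
The shifts listed in Table 2 can be sketched in plain Python as below. This is an illustrative sketch, not the authors' implementation: the function names, the use of nested lists as feature maps, the equal-size channel grouping, and the zero filling of vacated pixels are all assumptions. The offsets are transcribed from Table 2 as printed.

```python
def shift_down_right(channel, dy, dx):
    """Shift a 2-D map (list of rows) down by dy and right by dx.

    Vacated pixels are filled with 0 (an assumption; the border
    handling is not stated in the table).
    """
    h, w = len(channel), len(channel[0])
    out = [[0] * w for _ in range(h)]
    for r in range(dy, h):
        for c in range(dx, w):
            out[r][c] = channel[r - dy][c - dx]
    return out


# (dy, dx) offsets for groups n = 1..4, as listed in Table 2.
SHIFTS = [(1, 1), (2, 2), (1, 1), (0, 0)]


def group_spatial_shift(channels):
    """Split the channel list into four equal groups and shift each group."""
    if len(channels) % len(SHIFTS) != 0:
        raise ValueError("channel count must be divisible by the group count")
    size = len(channels) // len(SHIFTS)
    shifted = []
    for g, (dy, dx) in enumerate(SHIFTS):
        for channel in channels[g * size:(g + 1) * size]:
            shifted.append(shift_down_right(channel, dy, dx))
    return shifted
```

Calling `group_spatial_shift` on a list of four 3×3 maps returns four maps of the same size, with the last one unchanged.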
    • Table 3. Convolutional kernel size and padding method

      Convolution layer | Kernel size | Padding size
      Conv0 | 5×5 | 2×2
      Conv0_1 | 1×5 | 0×2
      Conv0_2 | 5×1 | 2×0
      Conv1_1 | 1×13 | 0×6
      Conv1_2 | 13×1 | 6×0
      Conv2_1 | 1×19 | 9×0
      Conv2_2 | 19×1 | 0×9
      Conv3 | 1×1 | 1×1
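
How a kernel size and padding pair determines the output size can be checked with the standard convolution arithmetic. The sketch below is illustrative only: it assumes stride 1, no dilation, and that kernel and padding are given as height × width, none of which is stated in the table. The helper names are hypothetical, and the first five rows of Table 3 serve as sample inputs.

```python
def conv_out_size(in_size, kernel, padding, stride=1):
    """Output length along one axis of a convolution (no dilation)."""
    return (in_size + 2 * padding - kernel) // stride + 1


def output_shape(height, width, kernel, padding):
    """Output (height, width) for a kernel and padding given as (h, w) pairs."""
    return (
        conv_out_size(height, kernel[0], padding[0]),
        conv_out_size(width, kernel[1], padding[1]),
    )


# Sample rows from Table 3: name -> (kernel (h, w), padding (h, w)).
SAMPLE_LAYERS = {
    "Conv0": ((5, 5), (2, 2)),
    "Conv0_1": ((1, 5), (0, 2)),
    "Conv0_2": ((5, 1), (2, 0)),
    "Conv1_1": ((1, 13), (0, 6)),
    "Conv1_2": ((13, 1), (6, 0)),
}

if __name__ == "__main__":
    for name, (kernel, padding) in SAMPLE_LAYERS.items():
        print(name, output_shape(32, 32, kernel, padding))
```

Under these assumptions, a padding of (k - 1) / 2 along an axis with kernel length k leaves that axis unchanged, which is the case for each sampled row.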
    • Table 4. Commonly used screen content image datasets

      Dataset | Number of reference images | Number of distorted images | Number of distortion types | Number of distortion levels | Subjective score type
      SCID | 40 | 1800 | 9 | 5 | MOS
      SIQAD | 20 | 980 | 7 | 7 | DMOS
    • Table 5. Environmental configuration and parameters of the experiment

      Parameter | Value
      Parameter count | 307.94577 M
      Compilation environment | Python 3.7.0, PyTorch-GPU 1.13.1, and CUDA 11.3
      CPU model | Intel Core i7-13700
      GPU model | NVIDIA RTX 4090
      Average time/epoch | 90 s
    • Table 6. Performance comparison of various screen content image quality assessment algorithms

      Type | Method | SCID SRCC | SCID PLCC | SIQAD SRCC | SIQAD PLCC
      FR | MIC-CNN[13] | - | - | 0.9636 | 0.9669
      FR | ESIM[9] | 0.8478 | 0.8630 | 0.8632 | 0.8788
      FR | DAMC[21] | 0.9617 | 0.9617 | 0.9304 | 0.9373
      FR | SR-CNN[10] | 0.9400 | 0.9390 | 0.8943 | 0.9042
      NR | Yang et al.[8] | 0.7562 | 0.7867 | 0.8543 | 0.8738
      NR | QODCNN[12] | 0.8760 | 0.8820 | 0.8890 | 0.9010
      NR | RIQA[14] | - | - | 0.9000 | 0.9110
      NR | Zhang et al.[16] | 0.9050 | 0.9133 | 0.9242 | 0.9260
      NR | BLIQUP-SCI[7] | - | - | 0.7990 | 0.7705
      NR | SIQA-DF-II[11] | - | - | 0.8880 | 0.9000
      NR | Gao et al.[15] | 0.8569 | 0.8613 | 0.8962 | 0.9000
      NR | MTDL[17] | - | - | 0.9233 | 0.9248
      NR | DFSS-IQA[29] | 0.8146 | 0.8138 | 0.8820 | 0.8818
      NR | Zhang[30] | 0.9445 | 0.9433 | 0.8640 | 0.8889
      NR | MTA-SCI | 0.9602 | 0.9609 | 0.9233 | 0.9294
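
PLCC (Pearson linear correlation coefficient) and SRCC (Spearman rank-order correlation coefficient), the two columns reported per dataset in Table 6, can be computed as in the minimal standard-library sketch below. The function names are hypothetical, and any nonlinear mapping of predictions before PLCC, which some evaluation protocols apply, is not included here. The sample vectors are the three rows of Table 7 and serve only to exercise the functions.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation (SRCC): Pearson on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Prediction values and normalized MOS from Table 7 (three images only).
predicted = [0.1067, 0.0653, 0.0658]
normalized_mos = [0.2711, 0.0619, 0.0650]

if __name__ == "__main__":
    print("PLCC:", pearson(predicted, normalized_mos))
    print("SRCC:", spearman(predicted, normalized_mos))
```

With only three samples the values say little about model quality; the dataset-level figures in Table 6 are computed over full test sets.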
    • Table 7. Predicted scores of the distorted screen content images

      Image ID | Prediction value | MOS | Normalized MOS
      SCI33_5_3.bmp | 0.1067 | 36.7569 | 0.2711
      SCI33_5_4.bmp | 0.0653 | 25.4741 | 0.0619
      SCI33_5_5.bmp | 0.0658 | 25.6376 | 0.0650
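
The listing does not state how the Normalized MOS column is derived from MOS. One plausible reading, assumed here, is a linear (min-max style) rescaling. The sketch below fits a line through two rows of Table 7 and checks that it reproduces the third; the helper name is hypothetical, and the dataset-wide minimum and maximum MOS are not given in the listing, so they are not used.

```python
def fit_line(point_a, point_b):
    """Slope and intercept of the line through two (x, y) points."""
    (x0, y0), (x1, y1) = point_a, point_b
    slope = (y1 - y0) / (x1 - x0)
    return slope, y0 - slope * x0


# (MOS, normalized MOS) pairs from Table 7.
ROW_3 = (36.7569, 0.2711)  # SCI33_5_3.bmp
ROW_4 = (25.4741, 0.0619)  # SCI33_5_4.bmp
ROW_5 = (25.6376, 0.0650)  # SCI33_5_5.bmp

slope, intercept = fit_line(ROW_3, ROW_4)
estimate = slope * ROW_5[0] + intercept

if __name__ == "__main__":
    print("estimated normalized MOS for SCI33_5_5.bmp:", round(estimate, 4))
    print("tabulated value:", ROW_5[1])
```

The estimate agrees with the tabulated value to within the rounding of the inputs, which is consistent with, but does not prove, a linear normalization.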
    • Table 8. Impact of different numbers of groups on the MTA-SCI performance

      Group count | PLCC | SRCC
      k=2 | 0.9471 | 0.9520
      k=3 | 0.9591 | 0.9587
      k=4 | 0.9602 | 0.9609
      k=5 | 0.9572 | 0.9487
    • Table 9. Impact of different asymmetric convolution kernel combinations on the performance of the MTA-SCI

      Kernel combination | PLCC | SRCC
      Kernel=3, 15, 19 | 0.9568 | 0.9585
      Kernel=5, 7, 9 | 0.9569 | 0.9584
      Kernel=5, 13, 19 | 0.9602 | 0.9609
      Kernel=7, 11, 21 | 0.9575 | 0.9563
      Kernel=9, 17, 25 | 0.9581 | 0.9589
      Kernel=15, 23, 27 | 0.8967 | 0.9005
    • Table 10. Impact of ILAM, DFM, and residual connection on the algorithm performance

      No. | ILAM, DFM, RC | PLCC | SRCC
      1 | ××× | 0.8792 | 0.8651
      2 | ×× | 0.8832 | 0.8796
      3 | ×× | 0.9481 | 0.9508
      4 | × | 0.9571 | 0.9592
      5 | | 0.9602 | 0.9609
    Citation

    Ziyi Zhou, Wu Dong, Likun Lu, Qian Ma, Guopeng Hou, Erqing Zhang. Multi-task attention mechanism based no reference quality assessment algorithm for screen content images[J]. Opto-Electronic Engineering, 2025, 52(4): 240309

    Paper Information

    Category: Article

    Received: Dec. 30, 2024

    Accepted: Feb. 25, 2025

    Published Online: Jun. 11, 2025

    Corresponding author: Wu Dong (董武)

    DOI:10.12086/oee.2025.240309
