Multi-task attention mechanism based no reference quality assessment algorithm for screen content images

Ziyi Zhou; Wu Dong; Likun Lu; Qian Ma; Guopeng Hou; Erqing Zhang

doi:10.12086/oee.2025.240309

Opto-Electronic Engineering, Vol. 52, Issue 4, 240309 (2025)

Beijing Key Laboratory of Signal and Information Processing for High-end Printing Equipment, Beijing Institute of Graphic Communication, Beijing 102600, China

Figures & Tables(18)

Fig. 1. Structure of the MTA-SCI proposed in the paper

Fig. 2. Structure diagram of integrated local attention mechanism

Fig. 3. Structure of group-wise attention mechanism with spatial shifts

Fig. 4. Structure of asymmetric convolutional channel attention mechanism

Fig. 5. Structure of dual-channel feature mapping module

Fig. 6. Variation curves of PLCC, SRCC, and loss values obtained on the SCID dataset. (a) Variation curves of PLCC and SRCC; (b) Variation curve of the loss value

Fig. 7. Variation curves of PLCC, SRCC, and loss values obtained on the SIQAD dataset. (a) Variation curves of PLCC and SRCC; (b) Variation curve of the loss value

Fig. 8. Distorted screen content image. (a) Reference image; (b) SCI33_5_3.bmp; (c) SCI33_5_4.bmp; (d) SCI33_5_5.bmp

Table 1. Typical methods of screen content image quality assessment

Category	Method	Type	Feature
The first category	SPQA^[6]	FR	Brightness and sharpness
	ESIM^[9]	FR	Edge contrast
	MSDL^[19]	FR	Feature extraction using log-Gabor filters
	BLIQUP-SCI^[7]	NR	Natural scene statistics features and local texture
	Yang et al.^[8]	NR	The amplitude, variance, entropy, and edge structure of wavelet coefficients
	Huang et al.^[20]	RR	Oriented histogram, local discrete cosine transform coefficients, and gradient of amplitude in color channels
The second category	SR-CNN^[10]	FR	Multi-level CNN features
	QODCNN^[12]	FR/NR	CNN features
	Gao et al.^[15]	NR	CNN features
	Zhang et al.^[16]	NR	CNN features
	MIC-CNN^[13]	NR	CNN features
	SIQA-DF-II^[11]	NR	CNN features
	RIQA^[14]	NR	CNN features
	DAMC^[21]	FR	CNN features
	MTDL^[17]	NR	CNN features

Table 2. Group-based spatial shift operation

X_n	Spatial shift
n=1	Shift1: move tensor x₁ down by one pixel vertically and right by one pixel horizontally
n=2	Shift2: move tensor x₂ down by two pixels vertically and right by two pixels horizontally
n=3	Shift3: move tensor x₃ right by one pixel horizontally and down by one pixel vertically
n=4	Without any processing
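
The group-wise shift in Table 2 can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the offsets are transcribed from the table as listed, while the zero fill for vacated pixels, the equal-sized channel groups, and the helper names `shift_plane` and `group_spatial_shift` are assumptions made here.

```python
# (down, right) pixel offsets for groups n = 1..4, transcribed from Table 2.
SHIFTS = [(1, 1), (2, 2), (1, 1), (0, 0)]


def shift_plane(plane, down, right, fill=0):
    """Shift one H x W plane down/right; vacated pixels get `fill` (assumed 0)."""
    h, w = len(plane), len(plane[0])
    return [
        [plane[r - down][c - right] if r >= down and c >= right else fill
         for c in range(w)]
        for r in range(h)
    ]


def group_spatial_shift(channels, shifts=SHIFTS):
    """Split a list of channel planes into equal groups and shift each group."""
    if len(channels) % len(shifts) != 0:
        raise ValueError("channel count must be divisible by the group count")
    size = len(channels) // len(shifts)
    out = []
    for i, plane in enumerate(channels):
        down, right = shifts[i // size]
        out.append(shift_plane(plane, down, right))
    return out


# Four 3x3 channels, one per group, each holding the values 1..9.
plane = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
shifted = group_spatial_shift([plane] * 4)
for group in shifted:
    print(group)
```

Each group keeps its spatial size, so the shifted groups can be concatenated back along the channel axis; the fourth group passes through unchanged.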

Table 3. Convolutional kernel size and padding method
Convolution layer	Kernel size	Padding size
Conv0	5×5	2×2
Conv0_1	1×5	0×2
Conv0_2	5×1	2×0
Conv1_1	1×13	0×6
Conv1_2	13×1	6×0
Conv2_1	1×19	9×0
Conv2_2	19×1	0×9
Conv3	1×1	1×1
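
To see how the kernel and padding sizes in Table 3 interact, the standard convolution output-size formula can be applied to each row. The sketch below assumes stride 1, dilation 1, and that both columns are given as height×width; none of these is stated in the table, and the helper names are illustrative.

```python
# (name, kernel (h, w), padding (h, w)) transcribed from Table 3 as listed.
LAYERS = [
    ("Conv0", (5, 5), (2, 2)),
    ("Conv0_1", (1, 5), (0, 2)),
    ("Conv0_2", (5, 1), (2, 0)),
    ("Conv1_1", (1, 13), (0, 6)),
    ("Conv1_2", (13, 1), (6, 0)),
    ("Conv2_1", (1, 19), (9, 0)),
    ("Conv2_2", (19, 1), (0, 9)),
    ("Conv3", (1, 1), (1, 1)),
]


def out_size(size, kernel, padding, stride=1):
    """Output length along one axis for a convolution with dilation 1."""
    return (size + 2 * padding - kernel) // stride + 1


def out_shape(shape, kernel, padding):
    """Apply out_size to the height and width of a feature map."""
    return tuple(out_size(s, k, p) for s, k, p in zip(shape, kernel, padding))


for name, kernel, padding in LAYERS:
    h, w = out_shape((32, 32), kernel, padding)
    print(f"{name}: 32x32 -> {h}x{w}")
```

Under these assumptions the rows from Conv0 to Conv1_2 preserve the spatial size, because a kernel of length k with padding (k - 1) / 2 leaves the length unchanged. The Conv2_1, Conv2_2, and Conv3 rows as listed would not, which may mean that those padding entries follow a different convention or are transposed in the listing.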

Table 4. Commonly used screen content image datasets
Dataset	Reference images	Distorted images	Distortion types	Distortion levels	Subjective score type
SCID	40	1800	9	5	MOS
SIQAD	20	980	7	7	DMOS
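
The counts in Table 4 are consistent with a full grid in which every reference image is distorted with every type at every level. A short check, with the dictionary layout and function name chosen here for illustration:

```python
# Counts transcribed from Table 4.
DATASETS = {
    "SCID": {"references": 40, "types": 9, "levels": 5, "distorted": 1800},
    "SIQAD": {"references": 20, "types": 7, "levels": 7, "distorted": 980},
}


def full_grid_count(entry):
    """Distorted images if each reference gets each type at each level."""
    return entry["references"] * entry["types"] * entry["levels"]


for name, entry in DATASETS.items():
    print(name, full_grid_count(entry), entry["distorted"])
```

Note that the two datasets use opposite score directions: a higher MOS means better quality, while a higher DMOS means worse quality.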

Table 5. Environmental configuration and parameters of the experiment
Parameter	Value
Parameter count	307.94577 M
Software environment	Python 3.7.0, PyTorch-GPU 1.13.1, and CUDA 11.3
CPU model	Intel Core i7-13700
GPU model	NVIDIA RTX 4090
Average time per epoch	90 s

Table 6. Performance comparison of various screen content image quality assessment algorithms

Type	Method	SCID SRCC	SCID PLCC	SIQAD SRCC	SIQAD PLCC
FR	MIC-CNN^[13]	-	-	0.9636	0.9669
	ESIM^[9]	0.8478	0.8630	0.8632	0.8788
	DAMC^[21]	0.9617	0.9617	0.9304	0.9373
	SR-CNN^[10]	0.9400	0.9390	0.8943	0.9042
NR	Yang et al.^[8]	0.7562	0.7867	0.8543	0.8738
	QODCNN^[12]	0.8760	0.8820	0.8890	0.9010
	RIQA^[14]	-	-	0.9000	0.9110
	Zhang et al.^[16]	0.9050	0.9133	0.9242	0.9260
	BLIQUP-SCI^[7]	-	-	0.7990	0.7705
	SIQA-DF-II^[11]	-	-	0.8880	0.9000
	Gao et al.^[15]	0.8569	0.8613	0.8962	0.9000
	MTDL^[17]	-	-	0.9233	0.9248
	DFSS-IQA^[29]	0.8146	0.8138	0.8820	0.8818
	Zhang^[30]	0.9445	0.9433	0.8640	0.8889
	MTA-SCI	0.9602	0.9609	0.9233	0.9294
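
The two metrics in Table 6 are the Spearman rank-order correlation coefficient (SRCC), which measures prediction monotonicity, and the Pearson linear correlation coefficient (PLCC), which measures linear agreement with the subjective scores. The following is a dependency-free sketch of both. It computes plain Pearson correlation without any fitted nonlinear mapping, which is an assumption, since the page does not say whether one was applied; the sample values are the three rows of Table 7 and serve only as an illustration.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation (SRCC): Pearson on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Predictions and MOS values for the three images listed in Table 7.
predicted = [0.1067, 0.0653, 0.0658]
mos = [36.7569, 25.4741, 25.6376]
print("SRCC:", spearman(predicted, mos))
print("PLCC:", pearson(predicted, mos))
```

With only three samples the numbers carry no statistical weight; the reported values in Table 6 are computed over the full test sets.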

Table 7. Predicted scores of the distorted screen content images
Image ID	Predicted score	MOS	Normalized MOS
SCI33_5_3.bmp	0.1067	36.7569	0.2711
SCI33_5_4.bmp	0.0653	25.4741	0.0619
SCI33_5_5.bmp	0.0658	25.6376	0.0650
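
Table 7 does not state how the normalized MOS is obtained. One plausible reading, assumed here, is a linear (min-max style) rescaling of the raw MOS. The sketch below fits a line through the first two rows and checks whether the third row falls on it; the function name and the rounding tolerance are illustrative choices.

```python
# (image, MOS, normalized MOS) transcribed from Table 7.
ROWS = [
    ("SCI33_5_3.bmp", 36.7569, 0.2711),
    ("SCI33_5_4.bmp", 25.4741, 0.0619),
    ("SCI33_5_5.bmp", 25.6376, 0.0650),
]


def fit_line(p, q):
    """Return (scale, offset) of the line through two (mos, normalized) points."""
    (x0, y0), (x1, y1) = p, q
    scale = (y1 - y0) / (x1 - x0)
    return scale, y0 - scale * x0


# Fit on the first two rows, then predict the third row's normalized MOS.
scale, offset = fit_line(ROWS[0][1:], ROWS[1][1:])
predicted_third = scale * ROWS[2][1] + offset
print(f"predicted {predicted_third:.4f}, listed {ROWS[2][2]:.4f}")
```

The predicted value agrees with the listed 0.0650 to within the four-decimal rounding of the table, so the three rows are consistent with a linear rescaling. The model's predictions are then compared against this normalized scale rather than the raw MOS.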

Table 8. Impact of different numbers of groups on the MTA-SCI performance
Group count	PLCC	SRCC
k=2	0.9471	0.9520
k=3	0.9591	0.9587
k=4	0.9602	0.9609
k=5	0.9572	0.9487

Table 9. Impact of different asymmetric convolution kernel combinations on the performance of the MTA-SCI
Kernel combination	PLCC	SRCC
Kernel=3, 15, 19	0.9568	0.9585
Kernel=5, 7, 9	0.9569	0.9584
Kernel=5, 13, 19	0.9602	0.9609
Kernel=7, 11, 21	0.9575	0.9563
Kernel=9, 17, 25	0.9581	0.9589
Kernel=15, 23, 27	0.8967	0.9005

Table 10. Impact of ILAM, DFM, and residual connection on the algorithm performance
No.	ILAM	DFM	RC	PLCC	SRCC
1	×	×	×	0.8792	0.8651
2	×	√	×	0.8832	0.8796
3	√	×	×	0.9481	0.9508
4	√	√	×	0.9571	0.9592
5	√	√	√	0.9602	0.9609

Ziyi Zhou, Wu Dong, Likun Lu, Qian Ma, Guopeng Hou, Erqing Zhang. Multi-task attention mechanism based no reference quality assessment algorithm for screen content images[J]. Opto-Electronic Engineering, 2025, 52(4): 240309

Paper Information

Category: Article

Received: Dec. 30, 2024

Accepted: Feb. 25, 2025

Published Online: Jun. 11, 2025

The Author Email: Wu Dong (董武)
