Chinese Journal of Liquid Crystals and Displays, Vol. 38, Issue 3, 356 (2023)
No-reference image quality assessment based on feature tokenizer and Transformer
No-reference IQA methods based on deep learning suffer from insufficient semantic relevance or demanding model training requirements. This paper proposes a no-reference IQA method based on semantic visual feature tokens and a Transformer (VTT-IQA). We first use a deep convolutional neural network to extract high-level semantic features of the image, and then map these semantic features to visual feature tokens. Subsequently, the relationships between visual feature tokens are modelled with the Transformer self-attention mechanism to extract global information. Meanwhile, a shallow neural network extracts the low-level local features of the image and captures its distortion information. Finally, the high-level semantic information and the low-level visual information are integrated to accurately predict image quality. To verify the superiority and robustness of the proposed model, we compared our method with 15 traditional and deep-learning-based no-reference IQA methods on five mainstream IQA datasets and one underwater IQA dataset, using PLCC and SROCC as the performance evaluation metrics. The experimental results show that the proposed method achieves superior performance with fewer parameters (about 1.56 MB). In particular, VTT-IQA achieves an SROCC of 0.958 on LIVE-MD, which contains multiply-distorted images. This demonstrates that VTT-IQA can still accurately evaluate image quality under complex distortion and can meet the requirements of practical applications.
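The pipeline described in the abstract (feature map → visual tokens → token self-attention → fusion with low-level features → scalar quality score) can be sketched as follows. This is a minimal NumPy illustration of the data flow, not the paper's implementation: all dimensions (32 channels, a 7×7 map, 8 tokens) and the random weights are assumptions for demonstration, and the shallow-network branch is stood in for by a random vector.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical sizes (not from the paper): C=32 channels, N=49 spatial
# positions (a 7x7 feature map), L=8 visual tokens, D=32 token dimension.
C, N, L, D = 32, 49, 8, 32

X = rng.standard_normal((N, C))          # flattened high-level feature map

# 1) Feature tokenizer: spatial attention groups positions into L tokens.
W_a = rng.standard_normal((C, L))
A = softmax(X @ W_a, axis=0)             # (N, L) spatial attention per token
T = A.T @ X                              # (L, C) visual feature tokens

# 2) Transformer-style self-attention models relations between tokens.
W_q, W_k, W_v = (rng.standard_normal((C, D)) for _ in range(3))
Q, K, V = T @ W_q, T @ W_k, T @ W_v
T_out = softmax(Q @ K.T / np.sqrt(D), axis=-1) @ V   # (L, D) global context

# 3) Fuse global semantic info with low-level local distortion features
#    (here a random stand-in for the shallow-network branch).
low_level = rng.standard_normal(D)
fused = T_out.mean(axis=0) + low_level
w_head = rng.standard_normal(D)
score = float(fused @ w_head)            # scalar quality prediction
print(T.shape, T_out.shape)
```

A real model would learn `W_a`, the attention projections, and the regression head end-to-end against subjective quality scores; the point here is only the shape of each stage.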
Wei SONG, Jia-jin LI, Xiao-chen LIU, Zhi-xiang LIU, Shao-hua SHI. No-reference image quality assessment based on feature tokenizer and Transformer[J]. Chinese Journal of Liquid Crystals and Displays, 2023, 38(3): 356
Category: Research Articles
Received: Jun. 29, 2022
Accepted: --
Published Online: Apr. 3, 2023
The Author Email: Wei SONG (wsong@shou.edu.cn)