Hyperspectral Estimation of Selenium Content in Selenium-Rich Tea Based on Feature Selection and Machine Learning

WEN Zhu; GUO Song; SHU Tian; ZHAO Long-cai

doi:10.3964/j.issn.1000-0593(2025)09-2590-07

Spectroscopy and Spectral Analysis, Vol. 45, Issue 9, 2590 (2025)

WEN Zhu¹, GUO Song¹, SHU Tian¹, and ZHAO Long-cai²,³

Author Affiliations

¹Guizhou Agricultural Science and Technology Information Institute, Guiyang 550006, China

²College of Natural Resources and Environment, Key Laboratory of Plant Nutrition and Agro-Environment in Northwest China, Ministry of Agriculture and Rural Affairs, Yangling 712100, China

³College of Natural Resources and Environment, Northwest A&F University, Yangling 712100, China

Abstract

Selenium is an important nutritional index of selenium-rich tea, and its content determines the tea's economic and nutritional value. Hyperspectral remote sensing inversion offers non-destructive, real-time, and rapid monitoring. This study used the selenium content of selenium-rich tea from the Nangong River tea garden in Kaiyang County, Guizhou Province, together with the corresponding canopy non-imaging hyperspectral data, as source data. The original spectra were smoothed with a second-order Savitzky-Golay filter, and their potential was further explored through first-order derivative and continuum removal transformations. Independent variables for modeling were obtained using a band elimination combination and several feature selection algorithms, and multiple inversion models of tea selenium content were then constructed with different algorithms. The results showed that: (1) combining spectral transformation with spectral indices improved the ability to retrieve selenium content compared with the original spectrum alone; (2) the successive projections algorithm (SPA) outperformed uninformative variable elimination (UVE) overall, and the continuum removal spectrum was superior to both the original spectrum and the first-derivative spectrum; (3) the multi-factor models were more accurate than the single-factor models, and extreme learning machine regression (ELMR) performed best among the multi-factor models. Among all models, the SPA-ELMR model built on the continuum removal spectrum had the highest accuracy. The coefficient of determination (R²) and normalized root mean square error (nRMSE) of this model were 0.689 and 18.869%, respectively, and the corresponding validation R² and nRMSE were 0.627 and 20.429%. The study examined the response relationship between tea selenium content and spectral reflectance at specific growth stages and constructed single-factor and multi-factor inversion models with appropriate accuracy, providing a theoretical basis for rapid, non-destructive monitoring of selenium content in tea and some technical support for the digital construction of tea gardens.
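
As a rough illustration of two steps named in the abstract, the sketch below shows one common way to compute a continuum-removed spectrum and the two reported accuracy metrics. It is not the authors' code. The function names are hypothetical, the continuum is assumed to be the upper convex hull of the reflectance curve, and nRMSE is assumed to be RMSE divided by the mean of the observed values (the abstract does not state which normalization was used).

```python
import numpy as np


def continuum_removal(wavelengths, reflectance):
    """Divide a spectrum by its upper convex hull (assumed continuum definition).

    Wavelengths must be strictly increasing and reflectance strictly positive.
    """
    w = np.asarray(wavelengths, dtype=float)
    r = np.asarray(reflectance, dtype=float)
    hull = []  # indices of upper-hull vertices, left to right
    for i in range(len(w)):
        # Drop the last vertex while it lies on or below the segment
        # joining the vertex before it to the new point i.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (w[b] - w[a]) * (r[i] - r[a]) - (r[b] - r[a]) * (w[i] - w[a])
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    # Linearly interpolate the hull vertices to get the continuum at every band.
    continuum = np.interp(w, w[hull], r[hull])
    return r / continuum


def r2_and_nrmse(observed, predicted):
    """Return (R², nRMSE in percent); nRMSE normalized by the observed mean (assumption)."""
    y = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    ss_res = np.sum((y - p) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean((y - p) ** 2))
    return r2, 100.0 * rmse / y.mean()


if __name__ == "__main__":
    # Toy five-band spectrum, purely illustrative (not data from the study).
    bands = [400, 500, 600, 700, 800]
    refl = [0.2, 0.1, 0.4, 0.2, 0.4]
    print(continuum_removal(bands, refl))
    print(r2_and_nrmse([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```

Continuum-removed values lie in (0, 1], with 1 at the hull vertices, so absorption features appear as dips relative to a common baseline. If the study normalized RMSE by the observed range instead of the mean, only the last line of `r2_and_nrmse` would change.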

Keywords

Feature selection; Inversion model; Machine learning; Selenium-rich tea; Spectral index

WEN Zhu, GUO Song, SHU Tian, ZHAO Long-cai. Hyperspectral Estimation of Selenium Content in Selenium-Rich Tea Based on Feature Selection and Machine Learning[J]. Spectroscopy and Spectral Analysis, 2025, 45(9): 2590

Paper Information

Received: Nov. 26, 2024

Accepted: Sep. 19, 2025

Published Online: Sep. 19, 2025
