Spectroscopy and Spectral Analysis, Vol. 45, Issue 9, 2590 (2025)

Hyperspectral Estimation of Selenium Content in Selenium-Rich Tea Based on Feature Selection and Machine Learning

WEN Zhu¹, GUO Song¹, SHU Tian¹, and ZHAO Long-cai²,³
Author Affiliations
  • ¹Guizhou Agricultural Science and Technology Information Institute, Guiyang 550006, China
  • ²College of Natural Resources and Environment, Key Laboratory of Plant Nutrition and Agro-Environment in Northwest China, Ministry of Agriculture and Rural Affairs, Yangling 712100, China
  • ³College of Natural Resources and Environment, Northwest A&F University, Yangling 712100, China

    Selenium is an important nutritional index of selenium-rich tea, and its content determines the tea's economic and nutritional value. Hyperspectral remote sensing inversion enables non-destructive, real-time, and rapid monitoring. This study used the selenium content of selenium-rich tea from the Nangong River tea garden in Kaiyang County, Guizhou Province, together with the corresponding canopy non-imaging hyperspectral data, as source data. The original spectrum was preprocessed with a second-order Savitzky-Golay smoothing filter, and its potential was further explored through first-order derivative and continuum removal transformations. The independent variables for modeling were obtained using a band elimination combination and several feature selection algorithms, and multiple inversion models of tea selenium content were then constructed with different algorithms. The results showed that: (1) combining spectral transformation with spectral indices enhanced the ability to retrieve selenium content compared with the original spectrum alone. (2) The successive projections algorithm (SPA) outperformed uninformative variable elimination (UVE) overall, and the continuum removal spectrum was superior to both the original spectrum and the first-derivative spectrum. (3) The multi-factor models were more accurate than the single-factor models, and extreme learning machine regression (ELMR) performed best among the multi-factor models. Among all models, the SPA-ELMR model built on the continuum removal spectrum had the highest accuracy: its coefficient of determination (R²) and normalized root mean square error (nRMSE) were 0.689 and 18.869%, respectively, and the corresponding validation R² and nRMSE were 0.627 and 20.429%, respectively. This study examined the response relationship between tea selenium content and spectral reflectance at specific growth stages and constructed single-factor and multi-factor inversion models of appropriate accuracy. These provide a theoretical basis for the rapid, non-destructive monitoring of selenium content in tea, as well as technical support for the digital construction of tea gardens.
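    The spectral transformations and accuracy metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the Savitzky-Golay window length of 11 bands, the upper-convex-hull definition of the continuum, and the normalization of RMSE by the mean observed value are all assumptions, since the abstract does not specify them. The feature selection (SPA, UVE) and ELMR steps are not covered.

```python
import numpy as np
from scipy.signal import savgol_filter


def smooth_and_derive(wavelengths, reflectance, window_length=11, polyorder=2):
    """Savitzky-Golay smoothing (2nd-order polynomial) plus first derivative.

    window_length is an assumed value; the abstract only states the order.
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)
    smoothed = savgol_filter(reflectance, window_length, polyorder)
    # First derivative of the smoothed spectrum with respect to wavelength.
    first_derivative = np.gradient(smoothed, wavelengths)
    return smoothed, first_derivative


def continuum_removed(wavelengths, reflectance):
    """Divide a spectrum by its upper convex hull (assumed continuum).

    Expects strictly increasing wavelengths and positive reflectance.
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)
    hull = []  # indices of upper-hull vertices, left to right
    for i in range(len(wavelengths)):
        # Drop the last vertex while it lies on or below the chord
        # joining the vertex before it to the current point.
        while len(hull) >= 2:
            x1, y1 = wavelengths[hull[-2]], reflectance[hull[-2]]
            x2, y2 = wavelengths[hull[-1]], reflectance[hull[-1]]
            cross = (x2 - x1) * (reflectance[i] - y1) \
                - (y2 - y1) * (wavelengths[i] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    continuum = np.interp(wavelengths, wavelengths[hull], reflectance[hull])
    return reflectance / continuum


def r2_and_nrmse(observed, predicted):
    """Return (R^2, nRMSE in percent).

    nRMSE is RMSE divided by the mean observed value (an assumption;
    normalizing by the observed range is another common convention).
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = observed - predicted
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    rmse = np.sqrt(np.mean(residual ** 2))
    return 1.0 - ss_res / ss_tot, 100.0 * rmse / observed.mean()
```

    In such a workflow, `continuum_removed` would be applied to each smoothed canopy spectrum before feature selection, and `r2_and_nrmse` would be evaluated separately on the modeling and validation samples.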

    WEN Zhu, GUO Song, SHU Tian, ZHAO Long-cai. Hyperspectral Estimation of Selenium Content in Selenium-Rich Tea Based on Feature Selection and Machine Learning[J]. Spectroscopy and Spectral Analysis, 2025, 45(9): 2590

    Paper Information

    Received: Nov. 26, 2024

    Accepted: Sep. 19, 2025

    Published Online: Sep. 19, 2025

    DOI:10.3964/j.issn.1000-0593(2025)09-2590-07
