Spectroscopy and Spectral Analysis, Volume 45, Issue 7, 1916 (2025)

Rapid Detection for Xylose Content Using Near-Infrared Spectroscopy Technology

LAN Xi-hua, WANG Zhi-guo*, LUAN Xiao-li, and LIU Fei
Author Affiliations
  • Key Laboratory for Advanced Process Control of Light Industry of the Ministry of Education, Jiangnan University, Wuxi 214122, China

    Xylose, a functional sugar with health benefits such as antioxidant activity and support of intestinal health, is widely used in food, medicine, and biofuels, yet effective methods for rapidly measuring its content are still lacking. To address content measurement during xylose production, an online detection method based on near-infrared (NIR) spectroscopy is proposed. First, sample solutions are collected and scanned with a near-infrared spectrometer to obtain raw spectra. The raw spectra are then preprocessed with a first derivative and a smoothing filter to remove noise and baseline drift. Next, the random frog algorithm is used to select spectral variables, and the residual predictive deviation (RPD) is used to search for the optimal number of features. The results show that predictive performance is best when the number of features is between 20 and 30. Taking other indicators into account, 25 features are selected, which fixes the characteristic wavelengths representing xylose content. Because the random frog algorithm relies on random subset selection and random forest regression, it has clear advantages for screening characteristic wavelengths in high-dimensional xylose data, but its results are poorly reproducible. After the wavelength features are obtained, the results are weighted and accumulated to weaken the effect of the algorithm's randomness on the final model. A predictive model for xylose content is then built using liquid chromatography measurements as labels. Finally, the method is used to rapidly determine the xylose content of samples collected at the process site, and its predictions are compared with those of the PLS and Lasso models.
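The preprocessing step described above (first derivative plus smoothing) can be illustrated with a minimal sketch. The abstract does not name the smoothing filter, so a Savitzky-Golay style least-squares derivative is assumed here; the function name `sg_first_derivative`, the window size, and the sample values are hypothetical and not taken from the paper.

```python
def sg_first_derivative(spectrum, half_window=2, step=1.0):
    """Smoothed first derivative of a 1-D spectrum.

    Assumption: a Savitzky-Golay style filter with a linear/quadratic
    local fit. For a symmetric window the least-squares slope at the
    window centre is sum(k * y[i + k]) / sum(k * k), divided by the
    wavelength step. Edge points without a full window are dropped.
    """
    if half_window < 1:
        raise ValueError("half_window must be at least 1")
    n = len(spectrum)
    if n < 2 * half_window + 1:
        raise ValueError("spectrum is shorter than the filter window")

    offsets = range(-half_window, half_window + 1)
    norm = sum(k * k for k in offsets)

    derivative = []
    for i in range(half_window, n - half_window):
        slope = sum(k * spectrum[i + k] for k in offsets)
        derivative.append(slope / (norm * step))
    return derivative


# Hypothetical absorbance values with a constant baseline offset of 0.5.
raw = [0.5 + 0.01 * x for x in range(12)]
smoothed_derivative = sg_first_derivative(raw, half_window=2)
```

A constant baseline offset contributes nothing to the derivative, which is why this kind of filter is commonly used against baseline drift, while averaging over the window damps point-to-point noise.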
The results show a training-set coefficient of determination R² = 0.9377 and a test-set coefficient of determination Rp² = 0.9335. Both are close to 1, indicating that the model explains the training data well and generalizes well. The root mean square error of prediction is RMSEP = 5.8446, and the residual predictive deviation is RPD = 3.8792 > 2.5, indicating that the model predicts the xylose content of samples relatively accurately. In comparison, the RJFA-PLS model outperforms the PLS model on the evaluation indicators, with RMSEP reduced by 112.7% and R², RPD, and Rp² increased by 21.8%, 52.5%, and 24.6%, respectively. The Lasso algorithm, however, predicts xylose content poorly on this dataset. Under the experimental conditions of this study, the model built with the proposed method is more suitable for predicting xylose content than the PLS and Lasso models. The method addresses the lag in xylose content detection results and provides a prerequisite for research on online detection technology for xylose.
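For readers unfamiliar with the evaluation indicators quoted above, the following is a minimal sketch of how R², RMSEP, and RPD are commonly computed. The abstract does not give the formulas, so the usual chemometrics definitions are assumed (RPD taken as the sample standard deviation of the reference values divided by RMSEP); the function name `evaluate` and the data are hypothetical.

```python
import math


def evaluate(y_ref, y_pred):
    """Return R^2, RMSEP and RPD for reference vs. predicted values.

    Assumed definitions (not stated in the source):
      R^2   = 1 - SS_res / SS_tot
      RMSEP = sqrt(mean squared residual)
      RPD   = sample standard deviation of y_ref (n - 1) / RMSEP
    """
    n = len(y_ref)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")

    mean_ref = sum(y_ref) / n
    ss_res = sum((r - p) ** 2 for r, p in zip(y_ref, y_pred))
    ss_tot = sum((r - mean_ref) ** 2 for r in y_ref)
    if ss_tot == 0 or ss_res == 0:
        raise ValueError("degenerate data: zero variance or perfect fit")

    rmsep = math.sqrt(ss_res / n)
    sd_ref = math.sqrt(ss_tot / (n - 1))
    return {"r2": 1.0 - ss_res / ss_tot, "rmsep": rmsep, "rpd": sd_ref / rmsep}


# Hypothetical reference (e.g. chromatography) and predicted contents.
reference = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 18.0, 33.0, 39.0, 48.0]
metrics = evaluate(reference, predicted)
usable = metrics["rpd"] > 2.5  # threshold quoted in the abstract
```

Under these definitions RPD grows as the prediction error shrinks relative to the spread of the reference values, which is why a value above the 2.5 threshold is read as adequate predictive accuracy.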


    LAN Xi-hua, WANG Zhi-guo, LUAN Xiao-li, LIU Fei. Rapid Detection for Xylose Content Using Near-Infrared Spectroscopy Technology[J]. Spectroscopy and Spectral Analysis, 2025, 45(7): 1916

    Paper Information

    Received: Aug. 7, 2024

    Accepted: Jul. 24, 2025

    Published Online: Jul. 24, 2025

    The Author Email: WANG Zhi-guo (zhiguowang@jiangnan.edu.cn)

    DOI:10.3964/j.issn.1000-0593(2025)07-1916-08
