Spectroscopy and Spectral Analysis, Vol. 45, Issue 7, 1916 (2025)
Rapid Detection for Xylose Content Using Near-Infrared Spectroscopy Technology
Xylose, a functional oligosaccharide with health benefits such as antioxidant activity and the promotion of intestinal health, is widely used in food, medicine, and biofuels. However, effective methods for rapidly detecting xylose content are still lacking. To address content detection during xylose production, an online detection method based on near-infrared spectroscopy is proposed. First, sample solutions are collected and scanned with a near-infrared spectrometer to obtain raw spectra. The raw spectra are then preprocessed with first-derivative and smoothing filters to remove noise and baseline drift. Next, the random frog algorithm is used to select spectral variables, and the prediction relative analysis error (RPD) is used to search for the optimal number of features. The results show that the model's predictive performance is best when the number of features is between 20 and 30. Taking other indicators into account, 25 features are selected, which determines the characteristic wavelengths representing xylose content. Because the random frog algorithm relies on random subset selection and random forest regression, it has clear advantages in screening characteristic wavelengths from high-dimensional xylose data, but its results are poorly reproducible. The wavelength-selection results are therefore weighted and accumulated to weaken the effect of the algorithm's uncertainty on the final model. A predictive model for xylose content is then established using liquid chromatography measurements as labels. Finally, the method is used to rapidly determine the xylose content of samples collected from the process site, and its predictions are compared with those of the PLS and Lasso models.
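The preprocessing step described above can be illustrated with a minimal sketch. The abstract does not name the smoothing filter, the window width, or the order of the two operations, so the centered moving average, the default window of 5 points, the smooth-then-differentiate order, and all function names below are assumptions made for illustration only, not the authors' implementation.

```python
def moving_average(values, window=5):
    """Smooth a spectrum with a centered moving average.

    The window shrinks near the two ends so the output keeps the input length.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        segment = values[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def first_derivative(values, step=1.0):
    """Finite-difference first derivative: central inside, one-sided at the ends."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to differentiate")
    deriv = [(values[1] - values[0]) / step]
    for i in range(1, n - 1):
        deriv.append((values[i + 1] - values[i - 1]) / (2.0 * step))
    deriv.append((values[-1] - values[-2]) / step)
    return deriv


def preprocess(spectrum, window=5, step=1.0):
    """Assumed order: smooth first to suppress noise, then differentiate."""
    return first_derivative(moving_average(spectrum, window), step)
```

In this sketch, adding a constant offset to every absorbance value leaves the output unchanged, which is the sense in which a first derivative removes a flat baseline shift.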
The results show a training-set coefficient of determination of R² = 0.9377 and a test-set coefficient of determination of 0.9335. Both values are close to 1, indicating that the model explains the training set data well and generalizes well. The root mean square error of prediction is RMSEP = 5.8446, and the prediction relative analysis error is RPD = 3.8792 > 2.5, indicating that the model predicts the xylose content of samples relatively accurately. Compared with the PLS model, the RJFA-PLS model is superior on the evaluation indicators: RMSEP is reduced by 112.7%, and R², RPD, and the test-set coefficient of determination are increased by 21.8%, 52.5%, and 24.6%, respectively. The Lasso algorithm, however, performs poorly in predicting xylose content on this dataset. Under the experimental conditions of this study, the model established with the above method is more suitable for predicting xylose content than the PLS and Lasso models. The method addresses the lag in xylose content detection results and provides a prerequisite for research on online detection technology for xylose.
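For readers unfamiliar with the evaluation indicators quoted above, the following is a minimal sketch of how R², RMSEP, and RPD are commonly computed from reference values and model predictions. The abstract does not give the formulas, so the definitions here are assumptions: in particular, RPD is taken as the sample standard deviation of the reference values divided by the root mean square error. The function name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, rmse, rpd) for reference values and predictions.

    Assumed definitions (not taken from the paper):
      r2   = 1 - SS_res / SS_tot
      rmse = sqrt(SS_res / n)
      rpd  = sample standard deviation of y_true / rmse
    """
    n = len(y_true)
    if n < 2 or n != len(y_pred):
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    sd = math.sqrt(ss_tot / (n - 1))  # sample standard deviation of the references
    rpd = sd / rmse
    return r2, rmse, rpd
```

Applied to a prediction set, the second value corresponds to RMSEP. Under these definitions, a larger RPD means the prediction error is small relative to the natural spread of the reference values, which is why a threshold such as 2.5 is used as a quality check.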
LAN Xi-hua, WANG Zhi-guo, LUAN Xiao-li, LIU Fei. Rapid Detection for Xylose Content Using Near-Infrared Spectroscopy Technology[J]. Spectroscopy and Spectral Analysis, 2025, 45(7): 1916
Received: Aug. 7, 2024
Accepted: Jul. 24, 2025
Published Online: Jul. 24, 2025
The Author Email: WANG Zhi-guo (zhiguowang@jiangnan.edu.cn)