Spectroscopy and Spectral Analysis, Vol. 44, Issue 10, 2932 (2024)

Study on Sugar Content Detection of Kiwifruit Using Near-Infrared Spectroscopy Combined With Stacking Ensemble Learning

GUO Zhi-qiang¹, ZHANG Bo-tao¹, and ZENG Yun-liu²,*
Author Affiliations
  • ¹ College of Information Engineering, Wuhan University of Technology, Wuhan 430070, China
  • ² National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, National R&D Center for Citrus Preservation, Wuhan 430070, China

    In this study, we combine near-infrared spectroscopy with Stacking ensemble learning for non-destructive analysis of sugar content in kiwifruit. The research focuses on the “Yunhai No.1” kiwifruit variety from Hubei. Using an infrared analyzer, we collected spectra from 280 samples, each covering 1 557 wavelengths in the 4 000 to 10 000 cm⁻¹ range, and measured sugar content with a refractometer. Outliers were identified and excluded with an outlier-sample detection algorithm that combines Monte Carlo random sampling with a t-test. The SPXY algorithm was then used to split the data into training and test sets in a 4:1 ratio. Five preprocessing methods were compared: multiplicative scatter correction (MSC), Savitzky-Golay smoothing (SG), de-trending (DT), vector normalization (VN), and standard normal variate (SNV) transformation. Feature wavelengths were first selected using uninformative variable elimination (UVE), competitive adaptive reweighted sampling (CARS), and the interval variable iterative space shrinkage approach (iVISSA), followed by a secondary selection with the successive projections algorithm (SPA) to remove collinear variables. To address the limited generalization of single models, we designed an ensemble learning model based on the Stacking algorithm. This model used Bayesian ridge regression (BRR), partial least squares regression (PLSR), support vector regression (SVR), and artificial neural networks (ANN) as base learners, with linear regression (LR) as the meta-learner. We assessed the performance of different base-learner combinations and used the Pearson correlation coefficient to analyze how the base learners influence ensemble performance. Experimental results indicated that vector normalization was the most effective of the five preprocessing methods. Among the single models, VN-CARS-PLSR performed best, with a prediction-set coefficient of determination (Rp²) of 0.805 and a root mean square error of prediction (RMSEP) of 0.498; it used 177 feature wavelengths, reducing the data volume by 88.6% relative to the original spectrum. Among the base-learner combinations tested with the Stacking algorithm, the PLSR+SVR+ANN ensemble achieved the highest predictive accuracy, with Rp² of 0.853 and RMSEP of 0.433. The study concludes that the Stacking ensemble model offers more comprehensive modeling capability and better generalization than single models, providing technical support for non-destructive detection of sugar content in kiwifruit.

    GUO Zhi-qiang, ZHANG Bo-tao, ZENG Yun-liu. Study on Sugar Content Detection of Kiwifruit Using Near-Infrared Spectroscopy Combined With Stacking Ensemble Learning[J]. Spectroscopy and Spectral Analysis, 2024, 44(10): 2932

    Paper Information

    Received: Jun. 1, 2023

    Accepted: Jan. 16, 2025

    Published Online: Jan. 16, 2025

    The Author Email: Yun-liu ZENG (zengyl@mail.hzau.edu.cn)

    DOI:10.3964/j.issn.1000-0593(2024)10-2932-09
