Spectroscopy and Spectral Analysis, Vol. 45, Issue 2, 584 (2025)
Hyperspectral Method for Tracing the Origin of Hawthorn Using the Weighted Combination Model
Hawthorns from different origins vary in quality because of differences in growth environment and geographic climate, so determining the geographic origin of hawthorns is of great significance. A combined identification model based on error reciprocal weighting was proposed to improve the stability and accuracy of hawthorn origin traceability. First, hyperspectral information was collected from 456 hawthorns using hyperspectral imaging. Three preprocessing methods were compared: Savitzky-Golay convolution smoothing (SG), Multiplicative Scatter Correction (MSC), and Standard Normal Variate (SNV). BP Neural Network (BPNN) and Random Forest (RF) models were built on both the preprocessed and the original data, and SNV was selected as the preprocessing method for the average spectra based on model accuracy. Next, principal component analysis was applied to the hyperspectral images of the hawthorns, and the 1st principal component image was selected; at the same time, six feature wavelengths were screened based on the weight coefficients over the full wavelength band, and the corresponding average spectral values were used as the representation values of the spectral information. Then, texture features were extracted from the 1st principal component image and from the grayscale images at the feature wavelengths. The spectral representation values of the feature wavelengths were combined with the texture representation values of the feature-wavelength grayscale images and the texture representation values of the principal component image to construct the input vectors of the origin identification model.
Finally, three methods, BPNN, RF, and the weighted combination model (BPNN-RF), were used to build identification models, and two evaluation indexes, accuracy (Acc) and macro-F1 score (macroF1), were used to evaluate the hawthorn origin identification models built from the different input vectors. The results showed that, for the same input vector, the accuracy and macroF1 score of the BPNN-RF model were mostly better than those of the BPNN and RF models. For the input vector consisting of all three kinds of representation values, the BPNN-RF model raised the accuracy on the actual test data set from 89.01% to 98.90% and the macroF1 score from 89.32% to 98.95%. This indicates that the combined BPNN-RF model based on error reciprocal weighting has the strongest discriminative ability and the best performance in identifying hawthorn origin, outperforming the single BPNN and RF models. This study provides methodological support for tracing hawthorn origin using hyperspectral information alone, without relying on physicochemical analysis.
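The error reciprocal weighting described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes one common form: each model's weight is proportional to the reciprocal of its error rate, and the weighted class probabilities are summed before taking the arg max. The function names, the error rates, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
import numpy as np

def reciprocal_error_weights(errors, eps=1e-12):
    """Weights proportional to 1/error, normalised to sum to 1.
    (Assumed form; eps only guards against division by zero.)"""
    inv = 1.0 / (np.asarray(errors, dtype=float) + eps)
    return inv / inv.sum()

def combine_probabilities(prob_list, weights):
    """Weighted sum of per-model class-probability matrices,
    each of shape (n_samples, n_classes)."""
    stacked = np.stack(prob_list, axis=0)  # (n_models, n_samples, n_classes)
    return np.tensordot(weights, stacked, axes=1)

def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))

# Hypothetical validation error rates for the two base models.
bpnn_error, rf_error = 0.10, 0.30
weights = reciprocal_error_weights([bpnn_error, rf_error])

# Toy class probabilities for three samples and two origins (illustrative only).
bpnn_prob = np.array([[0.70, 0.30], [0.30, 0.70], [0.55, 0.45]])
rf_prob = np.array([[0.20, 0.80], [0.40, 0.60], [0.90, 0.10]])

combined = combine_probabilities([bpnn_prob, rf_prob], weights)
y_pred = combined.argmax(axis=1)
y_true = np.array([0, 1, 0])

accuracy = float(np.mean(y_pred == y_true))
macro = macro_f1(y_true, y_pred, n_classes=2)
print(weights, y_pred, accuracy, macro)
```

In this form the model with the lower error receives the larger weight, so a single confident but less reliable model cannot easily override the more reliable one. In practice the error rates would be estimated on held-out data rather than on the test set.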
FANG Ao, YIN Yong, YU Hui-chun, YUAN Yun-xia. Hyperspectral Method for Tracing the Origin of Hawthorn Using the Weighted Combination Model[J]. Spectroscopy and Spectral Analysis, 2025, 45(2): 584
Received: Mar. 8, 2024
Accepted: May 21, 2025
Published Online: May 21, 2025
The Author Email: YIN Yong (yinyong@haust.edu.cn)