Spectroscopy and Spectral Analysis, Volume 44, Issue 8, 2202 (2024)

Discrimination of Chuzhou Chrysanthemum Tea Grades Using Noise Discriminant C-Means Clustering

WU Bin1,*, XIE Chen-ao2, CHEN Yong2, WU Xiao-hong2, and JIA Hong-wen1
Author Affiliations
  • 1[in Chinese]
  • 2[in Chinese]

    Near-infrared (NIR) spectroscopy reveals the organic chemical composition and structure of a sample through its spectral features in the NIR region. In compositional analysis, NIR spectra typically span a large number of wavelengths, so the data are high-dimensional. The spectra are also prone to band overlap and redundancy, which degrade model performance. We therefore propose a noise discriminant C-means clustering (NDCM) algorithm that combines fast generalized noise clustering (FGNC) with fuzzy linear discriminant analysis (FLDA). NDCM extracts discriminant information and compresses the data space during the fuzzy clustering process itself, which yields higher clustering accuracy. The fuzzy membership degrees and cluster centers obtained by running fuzzy C-means clustering (FCM) on the NIR spectral data of Chuzhou chrysanthemum tea serve as the initial fuzzy membership degrees and initial cluster centers of NDCM, respectively, which gives NDCM fast clustering and high accuracy. FCM is sensitive to noisy data, whereas NDCM handles noise in the spectra better. In this study, 240 samples of Chuzhou chrysanthemum tea of three quality grades (special grade, first grade and second grade) were used as experimental samples. A portable NIR spectrometer (NIR-M-F1-C) was used to collect the NIR spectra, each of which is a 400-dimensional vector. First, the spectra were pretreated with Savitzky-Golay filtering and multiplicative scatter correction (MSC) to reduce noise and scattering effects. Second, principal component analysis (PCA) reduced the dimensionality of the spectral data to 6. Next, linear discriminant analysis (LDA) extracted the discriminant information in the spectral data and further reduced the data to 2 dimensions. Finally, FCM, FGNC and NDCM were each used to cluster the processed data and classify the chrysanthemum tea by grade. With the weight index m = 2.5, the clustering accuracies of FCM, FGNC and NDCM were 92.42%, 98.48% and 100%, respectively. The clustering time of NDCM was slightly longer than that of FGNC. FCM needed 27 iterations to converge, while FGNC and NDCM needed 13 and 10 iterations, respectively. NIR spectroscopy combined with MSC, Savitzky-Golay filtering, PCA, LDA and NDCM thus provides a clustering model that accurately identifies the quality grade of Chuzhou chrysanthemum tea.
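
    The abstract names FCM as both the baseline method and the source of the initial memberships and centers for NDCM. As a rough illustration only, the following is a minimal sketch of textbook fuzzy C-means with the weight index m = 2.5; it is not the authors' NDCM or FGNC, whose update rules the abstract does not give. The function names `fuzzy_c_means` and `harden`, the stopping tolerance, and the synthetic 2-D toy data (standing in for the 2-dimensional output of LDA) are all assumptions made for this example.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.5, tol=1e-5, max_iter=100, seed=0):
    """Textbook fuzzy C-means on a list of equal-length feature vectors.

    Returns (centers, memberships, n_iter), where memberships[i][k] is the
    degree to which sample i belongs to cluster k (each row sums to 1).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalized so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([value / total for value in row])

    centers = []
    for iteration in range(1, max_iter + 1):
        # Center update: mean of the samples weighted by u_ik ** m.
        centers = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(n)]
            weight_sum = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / weight_sum
                for d in range(dim)
            ])

        # Membership update: u_ik proportional to d_ik ** (-2 / (m - 1)).
        new_u = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if min(dists) == 0.0:
                # Sample coincides with a center: assign it fully there.
                hit = dists.index(0.0)
                row = [1.0 if k == hit else 0.0 for k in range(n_clusters)]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                row = [value / total for value in inv]
            new_u.append(row)

        # Stop when no membership changes by more than tol.
        change = max(abs(a - b) for ra, rb in zip(u, new_u) for a, b in zip(ra, rb))
        u = new_u
        if change < tol:
            break

    return centers, u, iteration


def harden(memberships):
    """Assign each sample to the cluster with the largest membership."""
    return [row.index(max(row)) for row in memberships]


# Synthetic toy data (an assumption): three 2-D blobs, one per grade.
rng = random.Random(1)
blob_means = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
toy = [[mx + rng.uniform(-0.5, 0.5), my + rng.uniform(-0.5, 0.5)]
       for mx, my in blob_means for _ in range(20)]

centers, memberships, n_iter = fuzzy_c_means(toy, n_clusters=3, m=2.5)
labels = harden(memberships)
print("iterations:", n_iter)
print("hard labels:", labels)
```

    In a setup like the one described, the returned memberships and centers would be the quantities handed to NDCM as its starting point. The iteration count printed here depends on the toy data and the tolerance, so it is not comparable to the 27 iterations reported for FCM in the study.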

    WU Bin, XIE Chen-ao, CHEN Yong, WU Xiao-hong, JIA Hong-wen. Discrimination of Chuzhou Chrysanthemum Tea Grades Using Noise Discriminant C-Means Clustering[J]. Spectroscopy and Spectral Analysis, 2024, 44(8): 2202

    Paper Information

    Received: Apr. 2, 2023

    Accepted: --

    Published Online: Oct. 11, 2024

    The Author Email: Bin WU (wubind2003@163.com)

    DOI:10.3964/j.issn.1000-0593(2024)08-2202-06
