Spectroscopy and Spectral Analysis, Vol. 45, Issue 7, 1866 (2025)
Research on Variety Classification of Starch Based on Terahertz Time-Domain Spectroscopy
As a major storage carbohydrate, starch is a primary source of energy in the human diet, providing more than 50% of the body's energy needs. The starch industry and its deep-processing sector are also fundamental to the national economy and people's livelihoods. However, because starch types are diverse and highly similar in appearance, they are difficult to distinguish directly, and some unscrupulous merchants package lower-priced starches as higher-priced ones to increase profits. Consequently, the classification of starch types has significant practical relevance for food processing and industrial production in China. Terahertz (THz) technology is an effective non-destructive, non-contact, and label-free optical approach: it produces no harmful ionizing radiation when interacting with materials and can simultaneously yield optical parameters of a sample, such as the absorption coefficient. It offers a high signal-to-noise ratio and detection sensitivity, and many scholars have applied it to quality detection of agricultural products. To achieve rapid, non-destructive identification of starch, five of the most common starch samples were selected from cereal and rhizome starches. Spectral information was acquired using terahertz time-domain spectroscopy (THz-TDS), and the absorption coefficients of the different starch varieties in the range of 0.2 to 1.2 THz were calculated from the experimental data. Subsequently, the original spectra were processed using three preprocessing methods: Savitzky-Golay (S-G) smoothing, multiplicative scatter correction (MSC), and standard normal variate (SNV). Principal component analysis (PCA) was then used to extract features; the first three principal components were selected because their cumulative contribution rate exceeded 95%. A multi-class classification model was established using the support vector machine (SVM) method.
Three kernel types (linear, polynomial, and radial basis function) were evaluated for identifying the different starch varieties. The results showed that the PCA-SVM model with a polynomial kernel, combined with S-G smoothing, achieved the best performance for starch variety classification, with an average accuracy of 0.9419 on the test set, a Kappa coefficient of 0.933, and an F1 score of 0.9417. Furthermore, this method was compared with logistic regression (LR), decision tree (DT), and random forest (RF). The results indicated that PCA-SVM outperformed the other methods, demonstrating the feasibility of THz technology for starch variety identification and its practical value for the modernization of the food processing industry and the development of starch-based products.
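The abstract does not give implementation details, so the following is only a minimal NumPy sketch of two of the named preprocessing steps (SNV and MSC) and of choosing principal components by a cumulative contribution rate above 95%. The function names, the synthetic spectra, and the 101-point frequency grid are all assumptions made for illustration and are not taken from the paper. S-G smoothing and the SVM classifier are omitted; in practice they could be supplied by, for example, `scipy.signal.savgol_filter` and `sklearn.svm.SVC`.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) by its own mean and std."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: the mean spectrum)."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        # Least-squares fit: row ~= slope * ref + intercept
        slope, intercept = np.polyfit(ref, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected


def pca_scores(spectra, threshold=0.95):
    """Project onto the fewest leading components whose cumulative variance ratio exceeds threshold."""
    centered = spectra - spectra.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    ratio = s**2 / np.sum(s**2)
    # First index where the cumulative ratio is strictly greater than the threshold
    k = int(np.searchsorted(np.cumsum(ratio), threshold, side="right")) + 1
    k = min(k, len(ratio))
    return centered @ vt[:k].T, ratio[:k]


# Synthetic stand-in for absorption spectra on 0.2 to 1.2 THz (NOT the paper's data).
rng = np.random.default_rng(0)
freq = np.linspace(0.2, 1.2, 101)               # hypothetical frequency grid, THz
base = 5.0 + 20.0 * freq**2                     # hypothetical smooth absorption curve
gains = rng.uniform(0.8, 1.2, size=(30, 1))     # simulated multiplicative scatter
offsets = rng.uniform(-1.0, 1.0, size=(30, 1))  # simulated additive baseline shift
spectra = gains * base + offsets + rng.normal(scale=0.05, size=(30, freq.size))

scores, explained = pca_scores(snv(spectra))
print(scores.shape, explained.sum())
```

The number of components retained for this synthetic data has no bearing on the three components reported in the abstract; the resulting score matrix is what a classifier would be trained on.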
WEI Tao, WANG Heng, GE Hong-yi, JIANG Yu-ying, ZHANG Yuan, WEN Xi-xi, GUO Chun-yan. Research on Variety Classification of Starch Based on Terahertz Time-Domain Spectroscopy[J]. Spectroscopy and Spectral Analysis, 2025, 45(7): 1866
Received: Oct. 24, 2024
Accepted: Jul. 24, 2025
Published Online: Jul. 24, 2025
The Author Email: JIANG Yu-ying (jiangyuying11@163.com)