Infrared and Laser Engineering, Vol. 54, Issue 7, 20250065 (2025)

COD determination method using small-sample UV-Visible absorption spectral data

Peichao ZHENG1, Wei RUAN1, Shubin CHEN2, Haijuan LI2, Yan HOU2, Chenglin LI1, Haonan HE1, Qin YANG1, Jinmei WANG1,*, Biao LI1, and Lianbo GUO3
Author Affiliations
  • 1School of Electronic Science and Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • 2Chongqing Feiyang Measurement and Control Technology Research Institute Co., Ltd., Chongqing 400065, China
  • 3Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430074, China

    Objective: Accurate prediction of chemical oxygen demand (COD) concentration is crucial for water quality monitoring and environmental protection, because COD is an important indicator of organic pollution in water bodies. Traditional COD measurement relies on chemical reagents, is cumbersome and time-consuming, requires strict experimental conditions, and may generate harmful by-products. These drawbacks limit its use in real-time monitoring and large-scale applications. Ultraviolet-visible (UV-Vis) spectroscopy combined with machine learning, particularly Support Vector Regression (SVR), has therefore emerged as an effective alternative for COD prediction. However, SVR models face two challenges: limited sample sizes fail to represent the diversity of the data and lead to overfitting, and hyperparameter optimization is computationally expensive, which increases training time.

    Methods: A novel COD prediction method is proposed that integrates Kernel Principal Component Analysis (KPCA), a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP), and the Newton-Raphson-Based Optimizer (NRBO). KPCA extracts key features from the UV-Vis spectral data and reduces their dimensionality to improve computational efficiency. WGAN-GP augments the data, addressing the limited sample size and enhancing the model's ability to learn complex nonlinear relationships. In the optimization phase, several optimization algorithms are compared in terms of hyperparameter stability, convergence speed, and regression accuracy. Based on this comparison, NRBO is chosen to optimize the SVR hyperparameters, improving prediction accuracy and generalization capability.

    Results and Discussions: Combining KPCA dimensionality reduction with WGAN-GP-based data augmentation noticeably improved the SVR model's predictions for real water samples: R² increased from 0.88442 to 0.9103, the root mean square error (RMSE) decreased from 0.3368 to 0.2964, and the mean absolute error (MAE) decreased from 0.2760 to 0.2406 (see Tab. 2). Three optimization algorithms, NRBO, the Sparrow Search Algorithm (SSA), and Particle Swarm Optimization (PSO), were evaluated in terms of hyperparameter stability (Tab. 3), convergence speed, and regression accuracy (Tab. 4). NRBO combined with SVR yielded the best results. Analyzing the optimization algorithms from these three perspectives helps researchers select the most suitable algorithm for specific data characteristics and task requirements, thereby improving the prediction accuracy and generalization capability of the model.

    Conclusions: This study proposes a novel method for predicting COD concentration under small-sample conditions by integrating KPCA, an improved WGAN-GP, and NRBO. KPCA extracts key spectral features through dimensionality reduction, enhancing computational efficiency. WGAN-GP improves data diversity, enabling SVR to capture nonlinear relationships more accurately when data are limited. NRBO optimizes the SVR hyperparameters, improving both prediction accuracy and generalization capability. Experimental results show that the proposed method achieves superior predictive performance under small-sample conditions: compared with conventional SVR, the coefficient of determination (R²) improves from 0.8842 to 0.96248, RMSE decreases by 36.34%, and MAE decreases by 49.54%.

    The method also holds potential for large-scale data applications. KPCA reduces the computational complexity of high-dimensional data, WGAN-GP enhances sample diversity and improves model robustness, and NRBO demonstrates strong convergence properties in high-dimensional spaces. However, as the dataset grows, both WGAN-GP and NRBO may introduce substantial computational overhead. Future studies could explore alternative generative adversarial networks (GANs) or deep reinforcement learning strategies to optimize performance. Cross-waterbody generalization also remains an open challenge. The current study focuses primarily on the Yangtze River and Jialing River basins, and applying the method to other water bodies may require adjusting the spectral preprocessing to accommodate spectral characteristics that vary with water quality conditions.

    In conclusion, this study provides a novel methodological framework for modeling and optimizing small-sample spectral data, offering technological support for accurate COD concentration prediction in water quality monitoring and pollution control. Future research will explore the integration of alternative GAN architectures with SVR, optimize the computational methods to enhance predictive performance on large-scale datasets, and validate the adaptability of the proposed approach across diverse aquatic environments. These efforts aim to further advance environmental monitoring applications.
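
    The abstract reports model quality as R², RMSE, and MAE. As a minimal sketch of how such figures are typically computed, and not the authors' code, the Python below implements the three standard definitions and the relative reduction between two error values. The function names and the sample values in `y_true` and `y_pred` are hypothetical and chosen only for illustration; the last two lines reuse the RMSE and MAE values quoted in the Results paragraph.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def relative_reduction(before, after):
    """Percentage decrease from `before` to `after`."""
    return (before - after) / before * 100.0


# Hypothetical reference and predicted values (illustration only,
# not data from the paper).
y_true = [2.0, 3.0, 4.0, 5.0]
y_pred = [2.5, 3.0, 3.5, 5.0]

print(r2_score(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))

# Relative reductions implied by the RMSE and MAE values quoted in the
# Results paragraph (KPCA + WGAN-GP versus the baseline SVR).
rmse_drop = relative_reduction(0.3368, 0.2964)
mae_drop = relative_reduction(0.2760, 0.2406)
print(round(rmse_drop, 2), round(mae_drop, 2))
```

    Under these definitions, the RMSE and MAE changes reported for the KPCA and WGAN-GP step alone correspond to reductions of roughly 12% and 13%, which is smaller than the 36.34% and 49.54% reported for the complete method including NRBO.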

    Citation

    Peichao ZHENG, Wei RUAN, Shubin CHEN, Haijuan LI, Yan HOU, Chenglin LI, Haonan HE, Qin YANG, Jinmei WANG, Biao LI, Lianbo GUO. COD determination method using small-sample UV-Visible absorption spectral data[J]. Infrared and Laser Engineering, 2025, 54(7): 20250065

    Paper Information

    Category: Photoelectric measurement

    Received: Jan. 20, 2025

    Accepted: --

    Published Online: Aug. 29, 2025

    Corresponding author email: Jinmei WANG (wangjm@cqupt.edu.cn)

    DOI: 10.3788/IRLA20250065
