Spectroscopy and Spectral Analysis, Volume 42, Issue 8, 2353 (2022)

A Comparative Study of the COD Hyperspectral Inversion Models in Water Based on Machine Learning

Chun-ling WANG1,*, Kai-yuan SHI1,2, Xing MING3,*, Mao-qin CONG3, Xin-yue LIU3, and Wen-ji GUO3
Author Affiliations
  • 1. School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
  • 3. Nanjing Institute of Software Technology, Institute of Software, Chinese Academy of Sciences, Nanjing 210049, China

    Chemical oxygen demand (COD) is an important indicator of organic pollution in water, so measuring the COD content of water quickly and accurately is particularly important. Machine learning is increasingly applied to water quality inversion and has produced a growing body of results. Hyperspectral remote sensing offers high spectral and spatial resolution and many imaging channels, so it has great potential for retrieving the COD of water. This study applies different hyperspectral pre-processing methods to the original hyperspectral data and uses the data before and after processing to compare how different machine learning models and different pre-processing methods perform in inverting the COD content of water. First, 1 548 samples pairing COD content with the corresponding hyperspectral data (400–1 000 nm) were collected in the Baodai River with a ZK-UVIR-I in-situ spectral online water quality monitor. To reduce spectral noise and eliminate the influence of spectral scattering, the original spectra were pre-processed with Savitzky-Golay (SG) smoothing, multiplicative scatter correction (MSC), and SG smoothing combined with MSC. Second, the sample set was randomly divided into a training set (80%) and a test set (20%). COD hyperspectral inversion models based on four machine learning methods (linear regression, random forest, AdaBoost, and XGBoost) were built on the full-band spectra of the pre-processed training set. Three indexes, the coefficient of determination (R²), root mean square error (RMSE), and relative analysis error (RPD), were selected to evaluate the accuracy of the inversion models. The results show that random forest, AdaBoost, and XGBoost all outperform linear regression.
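The pre-processing and evaluation steps named above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the abstract does not give the SG window length or polynomial order (the defaults below are assumptions), the function names are hypothetical, the mean spectrum is assumed as the MSC reference, and RPD is assumed to follow the common convention of the standard deviation of the measured values divided by the RMSE.

```python
import numpy as np

def sg_smooth(spectra, window=11, polyorder=2):
    """Savitzky-Golay smoothing of each row (one spectrum per row).

    window must be odd; window and polyorder are assumed values.
    """
    spectra = np.asarray(spectra, dtype=float)
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Least-squares polynomial fit in the window, evaluated at its centre:
    # the first row of the pseudo-inverse gives the smoothing weights.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    weights = np.linalg.pinv(design)[0]
    # Repeat the edge values so the output keeps the input length.
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="edge")
    return np.stack(
        [np.convolve(row, weights[::-1], mode="valid") for row in padded]
    )

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum
    (the mean spectrum if none is given)."""
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        reference = spectra.mean(axis=0)
    reference = np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, spectrum in enumerate(spectra):
        # Fit spectrum ~ intercept + slope * reference, then undo both terms.
        slope, intercept = np.polyfit(reference, spectrum, 1)
        corrected[i] = (spectrum - intercept) / slope
    return corrected

def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, RPD) for measured and predicted COD values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    rmse = np.sqrt(np.mean(residual ** 2))
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    rpd = np.std(y_true, ddof=1) / rmse  # assumed definition: SD / RMSE
    return r2, rmse, rpd
```

Under these assumptions, the combined pre-processing would be `msc(sg_smooth(spectra))`. When scoring a held-out test set, the MSC reference would normally be the training-set mean spectrum, passed in through `reference`.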
XGBoost gives the best-performing inversion model whether or not the spectral data are pre-processed, with an R² of 0.92, an RMSE of 7.1 mg·L⁻¹, and an RPD of 3.4. Because the original spectra may contain redundant information, principal component analysis (PCA) was used to reduce the dimensionality of the spectra after SG smoothing and MSC processing, and the top ten principal components, with a cumulative contribution rate of 95%, were selected as the model's input variables. With XGBoost used to build the inversion model, the results show that PCA improves the model's accuracy, giving an RPD of 3.8, and shortens the model's training time from 72 seconds to 2.9 seconds. This research can provide new methods and ideas for establishing hyperspectral inversion models for this and similar water areas.
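The PCA step can be illustrated as follows. This is a minimal sketch assuming that "cumulative contribution rate" means the cumulative explained-variance ratio; the function name `pca_reduce` and its return values are hypothetical, and the abstract does not say which PCA implementation the authors used.

```python
import numpy as np

def pca_reduce(spectra, target=0.95):
    """Project spectra onto the fewest principal components whose
    cumulative explained-variance ratio reaches `target`."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    # Rows of vt are the principal axes; singular values give the variances.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    ratio = singular_values ** 2 / np.sum(singular_values ** 2)
    cumulative = np.cumsum(ratio)
    # First component count at which the cumulative ratio reaches the target.
    k = int(np.searchsorted(cumulative, target)) + 1
    k = min(k, len(singular_values))
    components = vt[:k]
    scores = centered @ components.T
    return scores, components, mean, cumulative[:k]
```

The returned `scores` would serve as the model inputs in place of the full-band spectra. To avoid leaking test information, the components and mean would typically be fitted on the training spectra only and then applied to the test spectra as `(test_spectra - mean) @ components.T`.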

    Chun-ling WANG, Kai-yuan SHI, Xing MING, Mao-qin CONG, Xin-yue LIU, Wen-ji GUO. A Comparative Study of the COD Hyperspectral Inversion Models in Water Based on Machine Learning[J]. Spectroscopy and Spectral Analysis, 2022, 42(8): 2353

    Paper Information

    Category: Original Article

    Received: Jun. 15, 2021

    Accepted: --

    Published Online: Mar. 17, 2025

    The Author Email: WANG Chun-ling (wangchl@bjfu.edu.cn)

    DOI:10.3964/j.issn.1000-0593(2022)08-2353-06
