Acta Optica Sinica, Vol. 39, Issue 9, 0930002 (2019)

Estimation of Soil Organic Matter Content Based on Characteristic Variable Selection and Regression Methods

Guanwen Li1,2,**, Xiaohong Gao*, Nengwen Xiao2, and Yunfei Xiao1
Author Affiliations
  • 1 Qinghai Provincial Key Laboratory of Physical Geography and Environmental Process, School of Geography Sciences, Qinghai Normal University, Xining, Qinghai 810008, China
  • 2 Chinese Research Academy of Environmental Sciences, Beijing 100012, China

    Soil hyperspectral data are voluminous and contain substantial redundant spectral information. This paper compares the prediction performance of several characteristic variable selection methods for estimating soil organic matter content. Stability competitive adaptive reweighted sampling (sCARS), the successive projections algorithm (SPA), the genetic algorithm (GA), iteratively retained information variables (IRIV), and sCARS-SPA are used to select characteristic variables from the full spectral data. Partial least squares regression (PLSR), support vector machine (SVM), and random forest (RF) models are then built on these characteristic bands and on the full spectral bands to predict the soil organic matter content. The results show that, compared with full-band modeling, combining the PLSR and SVM models with variable selection improves both modeling efficiency and prediction ability. The accuracy of the RF model built on characteristic variables is not noticeably improved, but the number of variables in the model is significantly reduced and the modeling efficiency is greatly improved. Overall, the RF model is more accurate than the SVM and PLSR models. The prediction model combining IRIV and RF uses only 63 variables; its coefficients of determination (R²) for the calibration and validation sets are 0.941 and 0.96, respectively, and the relative prediction deviation (RPD) of the validation set is 4.8, indicating very good prediction capacity. Compared with modeling on the full bands, combining characteristic variable selection with regression methods effectively improves modeling efficiency while maintaining model accuracy.
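
The two accuracy measures quoted in the abstract, the coefficient of determination and RPD, can be illustrated with a short sketch. The abstract does not state how they were computed, so the definitions below are assumptions: R² as one minus the ratio of residual to total sum of squares, and RPD as the sample standard deviation of the observed values divided by the root mean square error of prediction, a definition widely used in soil spectroscopy. The function names and the sample values are hypothetical and are not data from the paper.

```python
import math
import statistics


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rpd(observed, predicted):
    """Assumed definition: sample SD of observed values divided by RMSE."""
    return statistics.stdev(observed) / rmse(observed, predicted)


# Made-up organic matter contents, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 18.0, 30.0, 42.0, 48.0]

print(round(r_squared(observed, predicted), 3))  # 0.984
print(round(rpd(observed, predicted), 2))        # 8.84
```

Under this definition, a larger RPD means the prediction error is small relative to the natural spread of the measured values, which is why RPD is reported alongside R² rather than instead of it.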

    Guanwen Li, Xiaohong Gao, Nengwen Xiao, Yunfei Xiao. Estimation of Soil Organic Matter Content Based on Characteristic Variable Selection and Regression Methods[J]. Acta Optica Sinica, 2019, 39(9): 0930002

    Paper Information

    Category: Spectroscopy

    Received: Mar. 5, 2019

    Accepted: May 5, 2019

    Published Online: Sep. 9, 2019

    Author Email: Li Guanwen (lgw126522@163.com), Gao Xiaohong (xiaohonggao226@163.com)

    DOI:10.3788/AOS201939.0930002
