Laser & Optoelectronics Progress, Vol. 61, Issue 15, 1530002 (2024)

Identification of Types of Tobacco Leaf Diseases Using Near-Infrared Spectroscopy and Random Forest Algorithm

Ying Liang, Kun Ma, Xinyu Zhang, Qifu Yang, and Jiaquan Wu*
Author Affiliations
  • Faculty of Science, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
    Figures & Tables(10)
    Spectral scanning position of tobacco leaf sample. (a) Healthy; (b) powdery mildew; (c) deep green; (d) mosaic virus; (e) brown-spot
    Process of forming a random forest (N is the number of samples and K is the number of sampling sets)
    Original and pretreated spectra of tobacco leaves for different disease types. (a) Original spectra; (b) spectra after pretreatment
    ROC curves of different disease types based on the random forest algorithm. (a) Healthy; (b) powdery mildew; (c) deep green; (d) mosaic virus; (e) brown-spot
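The random forest construction named in the figure caption above (N samples, K sampling sets) can be sketched as follows. This is a minimal illustration of the bootstrap-sampling and majority-voting steps only, not the authors' implementation. It assumes each sampling set holds N indices drawn with replacement, and the function names `bootstrap_sets` and `majority_vote` are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sets(n_samples, k_sets, seed=0):
    """Draw K sampling sets, each holding N indices sampled with replacement."""
    rng = random.Random(seed)
    return [
        [rng.randrange(n_samples) for _ in range(n_samples)]
        for _ in range(k_sets)
    ]


def majority_vote(predictions):
    """Combine the class labels predicted by the individual trees."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical usage: 10 samples, 5 sampling sets (one per tree).
sets = bootstrap_sets(n_samples=10, k_sets=5)
# Labels that three hypothetical trees might return for one spectrum.
label = majority_vote(["mosaic virus", "healthy", "mosaic virus"])
```

In a full random forest, one decision tree would be trained on each sampling set, usually with a random subset of features considered at each split; that part is omitted here.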
    • Table 1. Samples of different types of tobacco leaf

      | Disease type | Year | Regional source | Spectral dimension /nm | Total sample | Training sample | Test sample |
      |---|---|---|---|---|---|---|
      | Healthy | 2022 | Longtankou village, Fumin county, Yunnan province | 908‒1676 | 80 | 44 | 36 |
      | Powdery mildew | 2022 | Longtankou village, Fumin county, Yunnan province | 908‒1676 | 100 | 64 | 36 |
      | Deep green | 2022 | Longtankou village, Fumin county, Yunnan province | 908‒1676 | 80 | 48 | 28 |
      | Mosaic virus | 2022 | Longtankou village, Fumin county, Yunnan province | 908‒1676 | 120 | 76 | 44 |
      | Brown-spot | 2022 | Longtankou village, Fumin county, Yunnan province | 908‒1676 | 80 | 44 | 36 |
    • Table 2. Comparison of SG+D1 pre-processing based on SVM under different parameters

      | Algorithm | Pre-processing method | Smoothing point | Polynomial order | Training set accuracy /% | Test set accuracy /% |
      |---|---|---|---|---|---|
      | SVM | SG+D1 | 1 | 2 | 71.37 | 73.18 |
      | | SG+D1 | 3 | 2 | 70.40 | 68.42 |
      | | SG+D1 | 5 | 2 | 67.52 | 71.24 |
      | | SG+D1 | 7 | 2 | 66.38 | 69.87 |
      | | SG+D1 | 9 | 2 | 70.46 | 69.63 |
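For readers unfamiliar with SG+D1 (Savitzky-Golay smoothing combined with a first derivative), the sketch below shows one way to compute it for a polynomial order of 2, as listed in Table 2. It is an illustration under stated assumptions rather than the authors' code: the function name `sg_first_derivative` and the `half_window` parameter are hypothetical, the wavelength grid is assumed to be evenly spaced, and how the table's "smoothing point" values map to a window length is not stated on this page.

```python
def sg_first_derivative(y, half_window, spacing=1.0):
    """Savitzky-Golay first derivative for a quadratic (order-2) local fit.

    For a symmetric window of 2 * half_window + 1 equally spaced points, the
    least-squares slope at the window centre is sum(i * y_i) / sum(i ** 2).
    Only interior points that have a full window are returned.
    """
    if half_window < 1:
        raise ValueError("half_window must be at least 1")
    offsets = range(-half_window, half_window + 1)
    norm = sum(i * i for i in offsets) * spacing
    return [
        sum(i * y[c + i] for i in offsets) / norm
        for c in range(half_window, len(y) - half_window)
    ]


# Hypothetical absorbance values on an even wavelength grid (not paper data).
spectrum = [0.10, 0.12, 0.17, 0.25, 0.36, 0.50, 0.67]
d1 = sg_first_derivative(spectrum, half_window=2)
```

A wider window smooths more strongly but can flatten narrow absorption features, which is one reason a parameter comparison such as Table 2 is useful.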
    • Table 3. Accuracies of different pre-processing methods based on the SVM algorithm

      | Algorithm | Pre-processing method | Training set accuracy /% | Test set accuracy /% |
      |---|---|---|---|
      | SVM | SNV | 63.76 | 64.13 |
      | | MSC | 69.92 | 67.93 |
      | | SG | 70.12 | 71.30 |
      | | SG+D1 | 71.37 | 73.18 |
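Two of the methods compared in Table 3, SNV (standard normal variate) and MSC (multiplicative scatter correction), can be sketched as below. This is a minimal illustration, not the authors' code. It assumes the mean spectrum is used as the MSC reference and that SNV uses the sample standard deviation; both are common choices but are not confirmed by this page.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # sample standard deviation (n - 1), an assumption
    return [(x - mu) / sigma for x in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum (assumed reference)."""
    n_points = len(spectra[0])
    reference = [mean([s[j] for s in spectra]) for j in range(n_points)]
    ref_mean = mean(reference)
    ref_var = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Least-squares fit: s ~= intercept + slope * reference.
        slope = sum(
            (r - ref_mean) * (x - s_mean) for r, x in zip(reference, s)
        ) / ref_var
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


# Hypothetical spectra that differ only by an offset and a scale factor.
base = [1.0, 2.0, 4.0, 3.0]
spectra = [base, [2.0 * x + 3.0 for x in base], [0.5 * x - 1.0 for x in base]]
corrected = msc(spectra)
scaled = snv(base)
```

SNV works on each spectrum independently, whereas MSC needs a reference computed from the whole set, so the MSC reference should come from the training samples when the method is applied to new data.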
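    • Table 4. Comparison of model performance on the training set for different types of tobacco leaf disease

      | Algorithm | Type | Training sample | Training correct number | Number of false positives | Precision /% | Sensitivity /% | Specificity /% | F1-score | OAC /% |
      |---|---|---|---|---|---|---|---|---|---|
      | SVM | Healthy | 51 | 50 | 1 | 98.1 | 100.0 | 99.6 | 0.99 | 74.28 |
      | | Powdery mildew | 35 | 28 | 7 | 81.4 | 55.6 | 96.2 | 0.66 | |
      | | Deep green | 23 | 7 | 6 | 71.9 | 53.5 | 96.1 | 0.62 | |
      | | Mosaic virus | 43 | 22 | 21 | 51.8 | 66.2 | 85.8 | 0.58 | |
      | | Brown-spot | 53 | 43 | 10 | 80.3 | 98.1 | 94.1 | 0.88 | |
      | BP neural network | Healthy | 43 | 39 | 4 | 91.3 | 97.7 | 98.3 | 0.94 | 91.67 |
      | | Powdery mildew | 59 | 57 | 2 | 96.4 | 91.5 | 99.1 | 0.94 | |
      | | Deep green | 52 | 45 | 7 | 85.7 | 80.8 | 96.9 | 0.83 | |
      | | Mosaic virus | 70 | 67 | 3 | 95.6 | 92.9 | 98.5 | 0.94 | |
      | | Brown-spot | 52 | 45 | 7 | 87.7 | 96.2 | 96.9 | 0.92 | |
      | PLS-DA | Healthy | 48 | 48 | 0 | 100.0 | 100.0 | 100.0 | 1.00 | 99.63 |
      | | Powdery mildew | 60 | 60 | 0 | 100.0 | 100.0 | 100.0 | 1.00 | |
      | | Deep green | 48 | 47 | 1 | 98.0 | 100.0 | 99.5 | 0.98 | |
      | | Mosaic virus | 71 | 71 | 0 | 100.0 | 98.6 | 100.0 | 0.99 | |
      | | Brown-spot | 48 | 48 | 0 | 100.0 | 100.0 | 100.0 | 1.00 | |
      | RF | Healthy | 44 | 44 | 0 | 100.0 | 100.0 | 100.0 | 1.00 | 100.00 |
      | | Powdery mildew | 64 | 64 | 0 | 100.0 | 100.0 | 100.0 | 1.00 | |
      | | Deep green | 48 | 48 | 0 | 100.0 | 100.0 | 100.0 | 1.00 | |
      | | Mosaic virus | 76 | 76 | 0 | 100.0 | 100.0 | 100.0 | 1.00 | |
      | | Brown-spot | 44 | 44 | 0 | 100.0 | 100.0 | 100.0 | 1.00 | |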
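    • Table 5. Comparison of model performance on the test set for different types of tobacco leaf disease

      | Algorithm | Type | Test sample | Number of correct tests | Number of incorrect tests | Precision /% | Sensitivity /% | Specificity /% | F1-score | OAC /% |
      |---|---|---|---|---|---|---|---|---|---|
      | SVM | Healthy | 29 | 29 | 0 | 100.0 | 100.0 | 100.0 | 1.00 | 67.93 |
      | | Powdery mildew | 16 | 12 | 4 | 76.2 | 43.2 | 96.6 | 0.55 | |
      | | Deep green | 18 | 11 | 5 | 72.0 | 48.6 | 95.2 | 0.58 | |
      | | Mosaic virus | 38 | 21 | 17 | 55.9 | 69.1 | 76.7 | 0.62 | |
      | | Brown-spot | 24 | 14 | 10 | 58.5 | 92.3 | 89.2 | 0.72 | |
      | BP neural network | Healthy | 37 | 32 | 5 | 88.1 | 100.0 | 96.6 | 0.94 | 85.87 |
      | | Powdery mildew | 41 | 38 | 3 | 91.9 | 82.9 | 97.9 | 0.87 | |
      | | Deep green | 28 | 22 | 6 | 80.0 | 57.1 | 97.4 | 0.67 | |
      | | Mosaic virus | 50 | 45 | 5 | 89.6 | 86.0 | 96.3 | 0.88 | |
      | | Brown-spot | 28 | 21 | 7 | 75.7 | 100.0 | 94.2 | 0.87 | |
      | PLS-DA | Healthy | 33 | 32 | 1 | 97.0 | 100.0 | 99.3 | 0.98 | 94.00 |
      | | Powdery mildew | 40 | 30 | 1 | 97.5 | 97.5 | 99.3 | 0.97 | |
      | | Deep green | 37 | 30 | 7 | 81.2 | 93.8 | 95.3 | 0.87 | |
      | | Mosaic virus | 41 | 40 | 1 | 97.6 | 83.8 | 99.2 | 0.90 | |
      | | Brown-spot | 33 | 32 | 1 | 97.0 | 100.0 | 99.3 | 0.98 | |
      | RF | Healthy | 36 | 36 | 0 | 100.0 | 100.0 | 100.0 | 1.00 | 98.10 |
      | | Powdery mildew | 36 | 35 | 1 | 97.3 | 100.0 | 100.0 | 0.99 | |
      | | Deep green | 28 | 28 | 0 | 100.0 | 87.5 | 97.4 | 0.93 | |
      | | Mosaic virus | 44 | 42 | 2 | 95.7 | 100.0 | 100.0 | 0.98 | |
      | | Brown-spot | 36 | 35 | 1 | 97.3 | 100.0 | 100.0 | 0.99 | |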
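The per-class columns in Tables 4 and 5 (precision, sensitivity, specificity, F1-score) are standard one-vs-rest quantities. The sketch below shows how they are usually derived from a confusion matrix. It assumes OAC denotes overall accuracy, which this page does not define, and the confusion matrix is hypothetical rather than taken from the paper.

```python
def per_class_metrics(confusion, labels):
    """One-vs-rest metrics; confusion[i][j] counts true class i predicted as j."""
    total = sum(sum(row) for row in confusion)
    results = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(row[k] for row in confusion) - tp
        tn = total - tp - fn - fp
        # Note: a class that is never predicted would give a zero denominator.
        precision = tp / (tp + fp)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        f1 = 2 * precision * sensitivity / (precision + sensitivity)
        results[label] = (precision, sensitivity, specificity, f1)
    return results


def overall_accuracy(confusion):
    """Fraction of all samples on the diagonal (assumed meaning of OAC)."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / sum(sum(row) for row in confusion)


# Hypothetical 3-class confusion matrix, not data from the paper.
labels = ["healthy", "powdery mildew", "brown-spot"]
confusion = [
    [9, 1, 0],
    [0, 8, 2],
    [0, 1, 9],
]
metrics = per_class_metrics(confusion, labels)
oac = overall_accuracy(confusion)
```

Multiplying the returned fractions by 100 gives percentages in the same units as the table columns.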
    • Table 6. AUC values of different classification algorithms

      | Algorithm | Healthy | Powdery mildew | Deep green | Mosaic virus | Brown-spot |
      |---|---|---|---|---|---|
      | SVM | 1.00 | 0.75 | 0.74 | 0.69 | 0.73 |
      | BP neural network | 0.97 | 0.87 | 0.62 | 0.89 | 0.96 |
      | PLS-DA | 0.98 | 0.90 | 0.94 | 0.89 | 0.95 |
      | RF | 1.00 | 0.91 | 0.96 | 0.99 | 0.99 |
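A per-class AUC like those in Table 6 can be computed one-vs-rest: the samples of one disease type are treated as positives and all others as negatives. The sketch below uses the rank interpretation of AUC (the probability that a positive sample receives a higher score than a negative one). The function name `auc_one_vs_rest` and the example scores are hypothetical, and the page does not state how the authors computed their values.

```python
def auc_one_vs_rest(scores, labels, positive):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for the "healthy" class, not data from the paper.
scores = [0.8, 0.4, 0.6, 0.2]
labels = ["healthy", "healthy", "mosaic virus", "brown-spot"]
auc = auc_one_vs_rest(scores, labels, positive="healthy")
```

This pairwise form is quadratic in the number of samples, which is acceptable for test sets of the size reported in Table 1.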

    Ying Liang, Kun Ma, Xinyu Zhang, Qifu Yang, Jiaquan Wu. Identification of Types of Tobacco Leaf Diseases Using Near-Infrared Spectroscopy and Random Forest Algorithm[J]. Laser & Optoelectronics Progress, 2024, 61(15): 1530002

    Paper Information

    Category: Spectroscopy

    Received: Jun. 5, 2023

    Accepted: Aug. 8, 2023

    Published Online: Aug. 12, 2024

    Author Email: Jiaquan Wu (710866288@qq.com)

    DOI:10.3788/LOP231466

    CSTR:32186.14.LOP231466
