Spectroscopy and Spectral Analysis, Vol. 42, Issue 7, 2148 (2022)

Drugs Identification Using Near-Infrared Spectroscopy Based on Random Forest and CatBoost

Ping JIANG (1), Hao-xiang LU (2), and Zhen-bing LIU (2, *)
Author Affiliations
  • 1. School of Computer and Information Technology, Guangxi Police College, Nanning 530028, China
  • 2. College of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
    Figures & Tables(9)
    The structure of RF-CatBoost
    NIR spectra of pretreated cefixime tablets
    Covariance matrix of drug NIR data before (a) and after (b) pretreatment
    Classification accuracy for different numbers of decision trees in CatBoost on data sets of different sizes in group A (a) and group B (b)
    Standard deviations of each model on data sets of different sizes in group A (a) and group B (b)
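    • Table 1. Near-infrared spectral data of cefixime tablets

      Manufacturer | Non-aluminum-plastic packaging | Aluminum-plastic packaging | Total
      Hunan Fangsheng Pharmaceutical Co., Ltd. (湖南方盛制药股份有限公司) | 54 | 54 | 108
      Jiangsu Chia Tai Qingjiang Pharmaceutical Co., Ltd. (江苏正大清江制药有限公司) | 63 | 56 | 119
      Shandong Lukang Pharmaceutical Co., Ltd. (山东鲁抗医药股份有限公司) | 51 | 40 | 91
      Shandong Luoxin Pharmaceutical Co., Ltd. (山东罗欣药业股份有限公司) | 48 | 48 | 96
      Total | 216 | 198 | 414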
    • Table 2. Training-set configurations of different sizes in groups A and B

      Data set | Total samples | Positive samples | Negative samples
      A        | 40            | 15               | 25
      A        | 60            | 20               | 40
      A        | 80            | 25               | 55
      A        | 100           | 30               | 70
      A        | 120           | 35               | 85
      A        | 140           | 40               | 100
      A        | 160           | 45               | 115
      A        | 180           | 50               | 130
      B        | 30            | 10               | 20
      B        | 50            | 15               | 35
      B        | 70            | 20               | 50
      B        | 90            | 25               | 65
      B        | 110           | 30               | 80
      B        | 130           | 35               | 95
      B        | 150           | 40               | 110
      B        | 170           | 45               | 125
    • Table 3. Classification accuracy of each model on data sets of different sizes in groups A and B (%)

      Group | Training/test set | ELM   | SWELM | SVM   | BP    | Boosting | CatBoost | RF-CatBoost
      A     | 40/176            | 92.36 | 92.33 | 93.68 | 89.36 | 94.44    | 94.99    | 96.79
      A     | 60/156            | 93.65 | 93.88 | 94.09 | 90.31 | 94.95    | 95.03    | 97.89
      A     | 80/136            | 94.99 | 95.11 | 95.35 | 91.01 | 96.22    | 96.85    | 98.82
      A     | 100/116           | 96.03 | 96.29 | 96.88 | 91.85 | 97.05    | 97.52    | 99.05
      A     | 120/96            | 97.88 | 97.95 | 97.23 | 92.99 | 97.99    | 97.89    | 99.95
      A     | 140/76            | 97.99 | 97.64 | 98.88 | 93.35 | 98.85    | 98.98    | 100
      A     | 160/56            | 98.05 | 98.01 | 99.05 | 94.99 | 99.01    | 99.02    | 100
      A     | 180/36            | 98.88 | 98.91 | 99.01 | 95.89 | 99.18    | 99.35    | 100
      B     | 30/168            | 91.28 | 90.99 | 92.34 | 88.75 | 92.86    | 93.95    | 95.97
      B     | 50/148            | 92.69 | 91.67 | 93.38 | 90.31 | 94.01    | 94.98    | 96.59
      B     | 70/128            | 93.19 | 93.11 | 94.20 | 91.11 | 95.19    | 96.82    | 98.79
      B     | 90/108            | 94.25 | 94.26 | 95.38 | 91.85 | 96.21    | 96.98    | 99.92
      B     | 110/88            | 94.95 | 95.89 | 96.21 | 92.99 | 96.89    | 97.99    | 100
      B     | 130/68            | 95.92 | 97.22 | 98.09 | 93.35 | 97.96    | 98.09    | 100
      B     | 150/48            | 97.95 | 98.09 | 98.89 | 94.99 | 98.39    | 98.19    | 100
      B     | 170/28            | 98.88 | 98.85 | 99.00 | 95.89 | 99.08    | 98.51    | 100
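
The figures in Table 3 lend themselves to a quick arithmetic check. The following is a minimal Python sketch, not part of the article. It transcribes the accuracy values, assuming the seven numeric columns are ELM, SWELM, SVM, BP, Boosting, CatBoost and RF-CatBoost in that order (the page header repeats "CatBoost", so reading the last column as RF-CatBoost is an assumption). It then checks that each training/test split sums to a constant within a group and computes the mean accuracy gain of RF-CatBoost over CatBoost. The names `MODELS`, `GROUP_A`, `GROUP_B`, `split_totals` and `mean_gain` are hypothetical helpers introduced only for illustration.

```python
# Column order assumed for the seven accuracy values in each Table 3 row.
MODELS = ["ELM", "SWELM", "SVM", "BP", "Boosting", "CatBoost", "RF-CatBoost"]

# Accuracy (%) keyed by (training samples, test samples), transcribed from Table 3.
GROUP_A = {
    (40, 176): [92.36, 92.33, 93.68, 89.36, 94.44, 94.99, 96.79],
    (60, 156): [93.65, 93.88, 94.09, 90.31, 94.95, 95.03, 97.89],
    (80, 136): [94.99, 95.11, 95.35, 91.01, 96.22, 96.85, 98.82],
    (100, 116): [96.03, 96.29, 96.88, 91.85, 97.05, 97.52, 99.05],
    (120, 96): [97.88, 97.95, 97.23, 92.99, 97.99, 97.89, 99.95],
    (140, 76): [97.99, 97.64, 98.88, 93.35, 98.85, 98.98, 100.0],
    (160, 56): [98.05, 98.01, 99.05, 94.99, 99.01, 99.02, 100.0],
    (180, 36): [98.88, 98.91, 99.01, 95.89, 99.18, 99.35, 100.0],
}
GROUP_B = {
    (30, 168): [91.28, 90.99, 92.34, 88.75, 92.86, 93.95, 95.97],
    (50, 148): [92.69, 91.67, 93.38, 90.31, 94.01, 94.98, 96.59],
    (70, 128): [93.19, 93.11, 94.20, 91.11, 95.19, 96.82, 98.79],
    (90, 108): [94.25, 94.26, 95.38, 91.85, 96.21, 96.98, 99.92],
    (110, 88): [94.95, 95.89, 96.21, 92.99, 96.89, 97.99, 100.0],
    (130, 68): [95.92, 97.22, 98.09, 93.35, 97.96, 98.09, 100.0],
    (150, 48): [97.95, 98.09, 98.89, 94.99, 98.39, 98.19, 100.0],
    (170, 28): [98.88, 98.85, 99.00, 95.89, 99.08, 98.51, 100.0],
}


def split_totals(group):
    """Return the set of distinct (training + test) sample counts in a group."""
    return {train + test for train, test in group}


def mean_gain(group, model="RF-CatBoost", baseline="CatBoost"):
    """Mean accuracy difference (percentage points) between two models."""
    i, j = MODELS.index(model), MODELS.index(baseline)
    gains = [row[i] - row[j] for row in group.values()]
    return sum(gains) / len(gains)


if __name__ == "__main__":
    for name, group in (("A", GROUP_A), ("B", GROUP_B)):
        print(name, split_totals(group), round(mean_gain(group), 2))
```

Under these assumptions, every split sums to 216 samples in group A and 198 in group B, which match the packaging totals in Table 1, and the mean gain of RF-CatBoost over CatBoost is about 1.61 percentage points in group A and 1.97 in group B.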
    • Table 4. Running time of each model on data sets of different sizes in groups A and B (s)

      Group | Training/test set | ELM    | SWELM  | SVM    | BP      | Boosting | CatBoost | RF-CatBoost
      A     | 40/176            | 0.0094 | 0.0031 | 0.0170 | 38.0988 | 15.9288  | 8.7170   | 6.0883
      A     | 60/156            | 0.0130 | 0.0050 | 0.0321 | 38.3397 | 17.9854  | 15.9030  | 7.4083
      A     | 80/136            | 0.0138 | 0.0158 | 0.0633 | 38.4498 | 20.1116  | 23.1474  | 9.2374
      A     | 100/116           | 0.0211 | 0.0171 | 0.1138 | 38.8647 | 22.2730  | 30.4410  | 9.3659
      A     | 120/96            | 0.0311 | 0.0302 | 0.1518 | 39.8008 | 24.8846  | 38.0902  | 10.2973
      A     | 140/76            | 0.0442 | 0.0401 | 0.2198 | 40.8850 | 26.6906  | 45.0626  | 10.9888
      A     | 160/56            | 0.0567 | 0.0597 | 0.2878 | 41.2978 | 28.5652  | 52.3820  | 12.0834
      A     | 180/36            | 0.0787 | 0.0740 | 0.3595 | 42.3807 | 30.6612  | 59.5264  | 12.5308
      B     | 30/168            | 0.0175 | 0.0094 | 0.0073 | 2.1710  | 2.5106   | 1.4962   | 0.4771
      B     | 50/148            | 0.0535 | 0.0237 | 0.0243 | 4.1599  | 3.3227   | 2.3957   | 1.1641
      B     | 70/128            | 0.1275 | 0.0533 | 0.0523 | 6.2080  | 4.0817   | 3.3258   | 2.0313
      B     | 90/108            | 0.2155 | 0.0983 | 0.1027 | 8.3506  | 5.3362   | 4.4220   | 2.9074
      B     | 110/88            | 0.3391 | 0.1670 | 0.1778 | 10.4160 | 6.1743   | 5.4066   | 3.8312
      B     | 130/68            | 0.4958 | 0.2602 | 0.2953 | 12.4040 | 6.8879   | 6.4118   | 4.7850
      B     | 150/48            | 0.7061 | 0.4063 | 0.4393 | 14.3572 | 7.6945   | 7.3631   | 5.7797
      B     | 170/28            | 0.9120 | 0.6168 | 0.6430 | 16.3228 | 8.5655   | 8.3464   | 6.8291
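
To make the running-time comparison in Table 4 concrete, the sketch below computes the ratio of CatBoost to RF-CatBoost running time for the smallest and largest training set of each group. It is an illustration only, not the authors' code. It assumes the last two numeric columns are CatBoost and RF-CatBoost, and that values printed with a digit-group space (for example "59.526 4") are read as 59.5264. `RUNTIME_S` and `runtime_ratio` are hypothetical names.

```python
# Running times in seconds transcribed from Table 4 as (CatBoost, RF-CatBoost),
# keyed by (group, number of training samples). Only the smallest and largest
# training sets of each group are included to keep the sketch short.
RUNTIME_S = {
    ("A", 40): (8.7170, 6.0883),
    ("A", 180): (59.5264, 12.5308),
    ("B", 30): (1.4962, 0.4771),
    ("B", 170): (8.3464, 6.8291),
}


def runtime_ratio(group, n_train):
    """Return CatBoost running time divided by RF-CatBoost running time."""
    catboost_s, rf_catboost_s = RUNTIME_S[(group, n_train)]
    return catboost_s / rf_catboost_s


if __name__ == "__main__":
    for key in sorted(RUNTIME_S):
        print(key, round(runtime_ratio(*key), 2))
```

With these transcribed values, the ratio in group A grows from about 1.4 at 40 training samples to about 4.8 at 180, while in group B it falls from about 3.1 at 30 training samples to about 1.2 at 170.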

    Ping JIANG, Hao-xiang LU, Zhen-bing LIU. Drugs Identification Using Near-Infrared Spectroscopy Based on Random Forest and CatBoost[J]. Spectroscopy and Spectral Analysis, 2022, 42(7): 2148

    Paper Information

    Category: Original Article

    Received: Jan. 12, 2022

    Accepted: --

    Published Online: Nov. 16, 2022

    The Author Email:

    DOI:10.3964/j.issn.1000-0593(2022)07-2148-08
