Laser & Optoelectronics Progress, Volume 54, Issue 10, 103001 (2017)
Feature Selection Algorithm Application in Near-Infrared Spectroscopy Classification Based on Binary Search Combined with Random Forest Pruning
Random forests used for feature selection in high-dimensional spaces suffer from high computational complexity, large model memory overhead, and low classification accuracy. To address these problems, a feature selection algorithm named binary search random forest pruning (BSRFP) is proposed. The algorithm first computes feature importance scores from the Gini impurity index and deletes the features with low importance scores. It then obtains the optimal feature subset and the classifier with the highest classification accuracy by means of a pruning technique that combines binary search with the diversity among base classifiers. To verify the effectiveness of the algorithm, a cigarette quality recognition model is established and compared with other methods. The results show that the binary search algorithm simplifies the feature search process and that the random forest pruning (RFP) algorithm reduces the size of the random forest. The classification accuracy of the random forest pruning algorithm is 96.47%. The features selected by the BSRFP algorithm are more correlated, and the algorithm provides higher accuracy in cigarette quality recognition.
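The abstract does not give implementation details, so the following is only a minimal sketch of the two ingredients it names: a Gini-based split score of the kind that underlies random forest importance scores, and a binary search over the number of top-ranked features to keep. The function names, the `evaluate` callback, the `tolerance` parameter, and the assumption that the score is roughly non-decreasing in the number of retained features are all hypothetical and are not taken from the paper. The pruning of base classifiers by diversity is not shown.

```python
from collections import Counter
from typing import Callable, List, Sequence


def gini_impurity(labels: Sequence) -> float:
    """Gini impurity 1 - sum_k p_k^2 of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(values: Sequence[float], labels: Sequence, threshold: float) -> float:
    """Impurity decrease obtained by splitting the samples at values <= threshold.

    Summing such decreases over all splits that use a feature is one common
    way to score that feature's importance in a tree ensemble.
    """
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


def binary_search_feature_subset(
    ranked_features: List[str],
    evaluate: Callable[[List[str]], float],
    tolerance: float = 0.0,
) -> List[str]:
    """Return the shortest prefix of ranked_features whose score stays within
    `tolerance` of the score obtained with all features.

    Assumption (hypothetical, not from the paper): features are sorted by
    decreasing importance and the score is roughly non-decreasing in the
    prefix length, so a binary search over the prefix length is valid.
    """
    full_score = evaluate(ranked_features)
    lo, hi = 1, len(ranked_features)
    while lo < hi:
        mid = (lo + hi) // 2
        if evaluate(ranked_features[:mid]) >= full_score - tolerance:
            hi = mid        # the top-mid features are sufficient, try fewer
        else:
            lo = mid + 1    # too few features, keep more
    return ranked_features[:lo]
```

In practice `evaluate` would train and validate a classifier on the given feature subset. The binary search needs on the order of log2(n) such evaluations for n ranked features, rather than one per candidate subset size.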
Liu Ming, Li Zhongren, Zhang Haitao, Yu Chunxia, Tang Xinghong, Ding Xiangqian. Feature Selection Algorithm Application in Near-Infrared Spectroscopy Classification Based on Binary Search Combined with Random Forest Pruning[J]. Laser & Optoelectronics Progress, 2017, 54(10): 103001
Category: Spectroscopy
Received: Apr. 26, 2017
Published Online: Oct. 9, 2017