| Type | Method | Principle | Strengths | Applicable scenarios | Limitations | Refs |
|---|---|---|---|---|---|---|
| Supervised | PLS-DA | Projects the predictor and response variables into a new latent space and fits a linear regression model in that space to perform classification or discrimination | Noise filtering, dimensionality reduction, discrimination and classification, sample prediction, good interpretability | Data sets with many, multicollinear explanatory variables, few observations, and high noise | May overfit when variables are highly correlated | [11,12] |
| Supervised | LDA | Maximizes the distance between classes while minimizing the distance within classes | Dimensionality reduction, discrimination and classification, high computational efficiency | Clearly labeled samples in a high-dimensional feature space with an approximately Gaussian distribution | Unsuitable for non-Gaussian data; cannot handle nonlinear problems | [13-15] |
| Supervised | KNN | By majority rule, a sample is assigned to the class to which most of its k nearest neighbors belong | Noise filtering, discrimination and classification, sample prediction, no explicit training phase | Multi-class problems, nonlinear problems, and rare-event classification | High computational cost; sensitive to the choice of distance metric | [16] |
| Supervised | SVM | Finds the optimal hyperplane dividing the samples such that the margin to the closest samples of each class is maximized | Discrimination and classification, sample prediction | Binary classification, small high-dimensional data sets, separable data | Sensitive to parameter tuning and kernel selection | [17,18] |
| Supervised | SIMCA | Builds a PCA model for each class, computes the distance of an unknown sample to each class model, and assigns the sample to the class with the smallest distance | Dimensionality reduction, discrimination and classification, sample prediction | Samples with clear class labels | High computational cost | [19,20] |
| Unsupervised | PCA | Explains the correlation structure among many variables with a small number of principal components (PCs), reducing the dimensionality of the original data set while preserving as much of the original information as possible | Dimensionality reduction, sample classification, maximal retention of original variable information | Data sets with many features and a sufficient sample size, especially when dimensionality reduction is needed to simplify the problem | May fail to extract representative PCs when the sample size is too small or the number of variables too large | [21,22] |
| Unsupervised | HCA | Iteratively merges the most similar samples or variables into the same cluster, so that similar items are grouped together and dissimilar items fall into separate clusters | Discrimination and classification, intuitive and easy to visualize | Samples without clear labels or prior classification | Sensitive to the distance metric and feature weights; difficult to apply to large data sets | [23] |
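To make the table concrete, minimal Python sketches of each method follow, using scikit-learn and SciPy where implementations exist; all data sets, parameter values, and variable names below are illustrative assumptions, not choices from the cited works. For PLS-DA, scikit-learn has no dedicated estimator, so a common simplification is sketched: PLS regression on 0/1 class indicators, with the continuous output thresholded at 0.5 (an assumed convention).

```python
# PLS-DA sketch: PLS regression on 0/1 class labels, thresholded to classify.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic data mimicking "many collinear variables, few samples" (illustrative).
X, y = make_classification(n_samples=60, n_features=100, n_informative=10,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

pls = PLSRegression(n_components=2)        # project into a 2-component latent space
pls.fit(X_tr, y_tr)                        # regress the class labels on the predictors
y_hat = pls.predict(X_te).ravel() > 0.5    # 0.5 cutoff: an assumed convention
print("accuracy:", np.mean(y_hat == y_te))
```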
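A minimal LDA sketch, using the Iris data purely for illustration; the same estimator serves both as a classifier and as a supervised projection onto the discriminant axes:

```python
# LDA sketch: classification plus a supervised 2-D projection.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
lda = LinearDiscriminantAnalysis()
print("5-fold CV accuracy:", cross_val_score(lda, X, y, cv=5).mean())

# Project onto axes that maximize between-class vs. within-class separation.
X_2d = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)
print("projected shape:", X_2d.shape)
```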
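A KNN sketch; standardizing the features first is included because, as the table notes, the method is sensitive to the distance metric (k = 5 is an arbitrary illustrative choice):

```python
# KNN sketch: majority vote among the k nearest neighbors. There is no
# training phase beyond storing the data, so prediction bears the cost.
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scaler = StandardScaler().fit(X_tr)        # scale features: KNN is distance-based
knn = KNeighborsClassifier(n_neighbors=5)  # k=5 chosen arbitrarily for illustration
knn.fit(scaler.transform(X_tr), y_tr)
print("accuracy:", knn.score(scaler.transform(X_te), y_te))
```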
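An SVM sketch on a small, high-dimensional binary problem of the kind the table describes; the RBF kernel and the C/gamma values are illustrative assumptions and would normally be tuned by cross-validation:

```python
# SVM sketch: maximum-margin separating hyperplane (here with an RBF kernel).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=80, n_features=200, n_informative=15,
                           random_state=0)   # small, high-dimensional binary set
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

svm = SVC(kernel="rbf", C=1.0, gamma="scale")  # kernel/params: illustrative only
svm.fit(X_tr, y_tr)
print("accuracy:", svm.score(X_te, y_te))
print("support vectors per class:", svm.n_support_)  # samples defining the margin
```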
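SIMCA has no scikit-learn implementation, so the sketch below is a hand-rolled simplification: one PCA model per class, with the unknown sample assigned to the class whose model reconstructs it best. Full SIMCA additionally applies statistical limits to the score and residual distances, which is omitted here.

```python
# Simplified SIMCA sketch: fit one PCA model per class and classify each
# sample by the smallest PCA reconstruction residual (distance to the model).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)
classes = np.unique(y)
models = {c: PCA(n_components=2).fit(X[y == c]) for c in classes}

def residual(model, samples):
    # Distance to a class model = norm of the PCA reconstruction error.
    recon = model.inverse_transform(model.transform(samples))
    return np.linalg.norm(samples - recon, axis=1)

dists = np.column_stack([residual(models[c], X) for c in classes])
y_hat = classes[np.argmin(dists, axis=1)]
print("training accuracy:", np.mean(y_hat == y))
```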
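A PCA sketch showing the trade-off the table describes: a few PCs in exchange for a quantified fraction of the original variance (three components is an illustrative choice):

```python
# PCA sketch: compress many correlated variables into a few PCs while
# tracking how much of the original variance each component retains.
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_wine(return_X_y=True)
X_std = StandardScaler().fit_transform(X)   # PCA is sensitive to feature scale

pca = PCA(n_components=3)                   # 3 PCs: an illustrative choice
scores = pca.fit_transform(X_std)
print("explained variance ratios:", pca.explained_variance_ratio_.round(3))
print("total variance retained:", pca.explained_variance_ratio_.sum().round(3))
```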
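Finally, an HCA sketch using SciPy's agglomerative linkage; Ward linkage and the three-cluster cut are illustrative assumptions, and in practice the merge tree is usually inspected visually as a dendrogram before choosing a cut level:

```python
# HCA sketch: bottom-up clustering that repeatedly merges the two most
# similar clusters, producing a tree that can be cut at any level.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=30, centers=3, random_state=0)
Z = linkage(X, method="ward")                    # merge history (the cluster tree)
labels = fcluster(Z, t=3, criterion="maxclust")  # cut the tree into 3 clusters
print("cluster sizes:", np.bincount(labels)[1:])
# scipy.cluster.hierarchy.dendrogram(Z) would plot the tree (not shown here).
```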