Spectroscopy and Spectral Analysis, Vol. 45, Issue 2, 463 (2025)
Spectral Binary Star Analysis Based on Rough Set and Cluster Voting Mechanism
A spectral binary star usually refers to a spectrum that shows the characteristics of two dominant components. Because the two components are complex and diverse, the formation of such spectra is complicated, and their signal-to-noise ratio (SNR) is relatively low. Many existing analysis methods separate the spectrum of a two-component system into two spectra, but the separation cannot guarantee the accuracy of the resulting spectra, and the result of any single clustering run has relatively low reliability. This paper proposes a method for analyzing and evaluating binary star spectra based on rough sets and a cluster voting mechanism. Using the idea of multiple clusterings followed by voting, it gives a graded reliability with which each spectrum belongs to its assigned category. The method consists of two parts. First, the spectral binary star data set is reconstructed using clustering algorithms built on different principles: the labels produced by each clustering algorithm are aligned with the Hungarian algorithm and then added as attributes of each spectrum. Second, a voting mechanism is used to reflect the consistency of the clustering results and to assign a category to each spectrum. At the same time, rough sets are defined to trace the characteristics of each spectrum, and the reliability of each spectrum's classification is given by the upper and lower approximation sets. The set of spectral binary stars released in LAMOST DR10 was selected as the object of analysis. Four clustering algorithms were used to reconstruct the spectral data set: partition-based K-means, the model-based Gaussian mixture model (GMM), spectral clustering, and agglomerative clustering. With the lower bound on the number of votes set to 2, voting yields clustering results with reliability grades of 1, 0.75, and 0.5. About one third of the samples have a reliability of 1, indicating that the four clustering results for these samples are completely consistent.
The SNR of each spectrum and the number of votes it received are analyzed statistically. Samples with few votes have relatively low SNR, which is one of the reasons why different clustering algorithms assign them to different categories. We analyzed the physical origin of 6 spectral samples with a reliability of 1, among which binary stars, Galactic nebulae, and target stars were the main ones. The difference in clustering labels may be caused by the difference in flux between the two components, or by data processing steps such as splicing and calibration. In addition, low spectral quality may lead to misjudgment by the pipeline, and the sky distribution of these samples is consistent with research on the distribution characteristics of low-quality data.
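The align-then-vote idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy label lists, and the tie-breaking rule are all assumptions made for illustration. Label alignment is done here by brute force over label permutations, which solves the same assignment problem as the Hungarian algorithm but is only practical for a handful of clusters; for larger problems a routine such as `scipy.optimize.linear_sum_assignment` would be the usual choice. The lower and upper approximations follow one plausible reading of the rough set step (all runs agree versus at least one run agrees), which the abstract does not spell out.

```python
from collections import Counter
from itertools import permutations


def align_labels(reference, labels, n_clusters):
    """Relabel `labels` so they agree with `reference` as much as possible.

    Brute-force stand-in for the Hungarian algorithm (small n_clusters only).
    """
    # overlap[i][j] = number of samples labelled i here and j in the reference
    overlap = [[0] * n_clusters for _ in range(n_clusters)]
    for ref, lab in zip(reference, labels):
        overlap[lab][ref] += 1
    # Pick the label mapping that maximises the total overlap.
    best = max(
        permutations(range(n_clusters)),
        key=lambda perm: sum(overlap[i][perm[i]] for i in range(n_clusters)),
    )
    return [best[lab] for lab in labels]


def vote(aligned_runs, min_votes=2):
    """Return one (category, reliability) pair per sample.

    reliability = winning votes / number of runs. A sample whose best
    category gets fewer than `min_votes` votes is left unassigned (None).
    Ties go to the category seen first (an assumption of this sketch).
    """
    n_runs = len(aligned_runs)
    results = []
    for sample_labels in zip(*aligned_runs):
        category, count = Counter(sample_labels).most_common(1)[0]
        if count < min_votes:
            category = None
        results.append((category, count / n_runs))
    return results


def approximations(aligned_runs, category):
    """Assumed rough-set reading: lower = all runs agree, upper = any run agrees."""
    per_sample = list(zip(*aligned_runs))
    lower = [i for i, labs in enumerate(per_sample) if all(l == category for l in labs)]
    upper = [i for i, labs in enumerate(per_sample) if any(l == category for l in labs)]
    return lower, upper


# Hypothetical labels from four clustering runs on six spectra (toy data).
runs = [
    [0, 0, 0, 1, 1, 1],  # taken as the reference labelling
    [1, 1, 1, 0, 0, 0],  # same partition, label names swapped
    [0, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 1],
]
aligned = [runs[0]] + [align_labels(runs[0], r, n_clusters=2) for r in runs[1:]]
results = vote(aligned, min_votes=2)
lower0, upper0 = approximations(aligned, category=0)

for index, (category, reliability) in enumerate(results):
    print(index, category, reliability)
```

With four runs and a lower bound of 2 votes, the only reliabilities an assigned sample can take are 1, 0.75, and 0.5, which matches the grades reported in the abstract.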
WANG Qi, YANG Hai-feng, CAI Jiang-hui. Spectral Binary Star Analysis Based on Rough Set and Cluster Voting Mechanism[J]. Spectroscopy and Spectral Analysis, 2025, 45(2): 463
Received: Feb. 4, 2024
Accepted: May 21, 2025
Published Online: May 21, 2025
The Author Email: YANG Hai-feng (hfyang@tyust.edu.cn)