Computer Engineering, Volume 51, Issue 8, 53 (2025)
Sequence Alignment Algorithm Based on Combined minimizer Seeds on Pan-Genome Graph
With advances in sequencing technology, human genome analysis has shifted from the analysis of individuals to the analysis of populations. To better represent the genetic variation among samples within a population, the pan-genome graph model has replaced the traditional linear multi-sequence reference genome model, and sequence-to-graph alignment has become a key problem in biological sequence analysis. Existing alignment algorithms employ a seed-and-extend strategy. However, because the branching structure of the graph produces a very large number of paths, the localization and verification phases become time-consuming, and single-seed selection methods need further optimization. To address this issue, this paper proposes a sequence alignment algorithm based on combined minimizer seeds. In the localization phase, the algorithm extends the coverage range of a single seed by hashing minimizer seeds in combination. At the same time, seeds are located using both their sequence content and their relative position information, which significantly reduces the number of false-positive matching positions and thus lowers the workload of the subsequent filtering and verification steps. Experimental results demonstrate that the proposed algorithm reduces candidate positions by approximately 80% and improves time performance by one to three times, while its index memory usage and alignment accuracy remain comparable to those of mainstream alignment algorithms.
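The idea of combining minimizer seeds and matching on both sequence and relative position can be illustrated with a minimal sketch. It is not the authors' implementation: it works on a plain linear string rather than a pan-genome graph, it pairs each minimizer only with its immediate successor, and the function names, the BLAKE2b hash, and the parameters `k` and `w` are all assumptions chosen for illustration.

```python
import hashlib


def kmer_hash(text):
    """Deterministic 64-bit hash (illustrative choice, not taken from the paper)."""
    return int.from_bytes(hashlib.blake2b(text.encode(), digest_size=8).digest(), "big")


def minimizers(seq, k, w):
    """Return (position, k-mer) of the minimum-hash k-mer in each window of w consecutive k-mers."""
    n_kmers = len(seq) - k + 1
    if n_kmers < w:
        return []
    hashes = [kmer_hash(seq[i:i + k]) for i in range(n_kmers)]
    picked = []
    for start in range(n_kmers - w + 1):
        # Ties are broken by the leftmost position, so selected positions never move backwards.
        best = min(range(start, start + w), key=lambda i: (hashes[i], i))
        if not picked or picked[-1][0] != best:
            picked.append((best, seq[best:best + k]))
    return picked


def combined_seeds(seq, k, w):
    """Hash each pair of adjacent minimizers together with the distance between them.

    Assumption: one combined seed per adjacent pair. The key covers both sequences
    and their relative position, so a hit requires all three to agree.
    """
    mins = minimizers(seq, k, w)
    seeds = []
    for (p1, m1), (p2, m2) in zip(mins, mins[1:]):
        gap = p2 - p1
        seeds.append((kmer_hash(f"{m1}|{m2}|{gap}"), p1, gap))
    return seeds


def build_index(ref, k, w):
    """Map each combined-seed key to the reference positions where it starts."""
    index = {}
    for key, pos, _gap in combined_seeds(ref, k, w):
        index.setdefault(key, []).append(pos)
    return index


def candidate_offsets(index, read, k, w):
    """Return the reference offsets at which the read could start, according to seed hits."""
    hits = set()
    for key, pos, _gap in combined_seeds(read, k, w):
        for ref_pos in index.get(key, []):
            hits.add(ref_pos - pos)
    return hits


ref = "ACGTTGCAAGCTTAGGCTAACGGATCCATGCAAGTCCGTAGGCTTAACGGTCATGCCAGTT"
read = ref[10:50]
index = build_index(ref, k=5, w=4)
offsets = candidate_offsets(index, read, k=5, w=4)
```

Because a combined key matches only when both k-mers and their spacing agree, a single hit is more specific than a hit on either minimizer alone, which is the intuition behind the reduction in candidate positions. Handling branching paths in a graph would require additional logic that this sketch does not attempt.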
GAO Jia, XU Yun. Sequence Alignment Algorithm Based on Combined minimizer Seeds on Pan-Genome Graph[J]. Computer Engineering, 2025, 51(8): 53
Received: Jan. 15, 2024
Accepted: Aug. 26, 2025
Published Online: Aug. 26, 2025
The Author Email: XU Yun (xuyun@ustc.edu.cn)