Computer Engineering, Vol. 51, Issue 8, 53 (2025)
Sequence Alignment Algorithm Based on Combined minimizer Seeds on Pan-Genome Graph
GAO Jia, XU Yun. Sequence Alignment Algorithm Based on Combined minimizer Seeds on Pan-Genome Graph[J]. Computer Engineering, 2025, 51(8): 53
Received: Jan. 15, 2024
Accepted: Aug. 26, 2025
Published Online: Aug. 26, 2025
Author Email: XU Yun (xuyun@ustc.edu.cn)