Forensic Sciences Research, Volume 9, Issue 1, owad047 (2024)

Systematical explorations of forensic feature and population genetic diversity of the Chinese Mongolian group from northwest China via a self-constructed InDel panel

Xuebing Chen1, Hui Xu1, Wei Cui1, Ming Zhao1, and Bofeng Zhu1,2,3,*
Author Affiliations
  • 1Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China
  • 2Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, College of Stomatology, Xi'an, China
  • 3Clinical Research Center of Shaanxi Province for Dental and Maxillofacial Diseases, College of Stomatology, Xi'an, China

    This study aimed to investigate the genetic polymorphisms and population characteristics of the Chinese Mongolian group from northwest China (NCM) using a self-developed panel of 43 autosomal insertion/deletion (A-InDel) polymorphic genetic markers. Herein, 288 unrelated healthy individuals from the NCM group were genotyped at the 43 A-InDels by multiplex PCR amplification followed by InDel genotyping on a capillary electrophoresis platform. In addition, multiple population genetic analyses were performed between the NCM group and 27 reference populations. None of the 43 loci deviated from Hardy–Weinberg equilibrium in the NCM group. The observed heterozygosity (Ho) values ranged from 0.312 8 to 0.559 2, and the combined power of discrimination (CPD) and cumulative probability of exclusion (CPE) values in the NCM group were 0.999 999 999 999 999 998 77 and 0.999 814, respectively. The forensic parameter values indicated that this panel was polymorphic and informative in the NCM group and could be used as an effective tool for forensic personal identification. Furthermore, the results of pairwise genetic distances, principal component analysis, multidimensional scaling analysis, phylogenetic tree construction, and admixture analysis among the NCM group and 27 reference populations revealed closer genetic relationships between the NCM group and East Asian populations, especially the Chinese Hui group (CHH) from northwest China, which is consistent with their geographical locations. These findings contribute to the ongoing genetic exploration of the NCM group and to insights into its genetic architecture.

    Introduction

    Insertion/deletion (InDel) polymorphic genetic markers, which are widely distributed in the human genome, play a crucial role in forensic and population genetic studies. Since the initial research was reported by Weber et al. [1], InDel markers have attracted significant attention from forensic geneticists. InDel markers have several advantages over traditional short tandem repeat (STR) loci, making them a complementary tool to the commonly used STRs [2, 3]. Recently, some studies have confirmed the indispensable role of InDel genetic markers in forensic genetics, particularly in population genetics and personal identification from degraded samples [4–6]. Personal identification is a crucial aspect of forensic practice, especially in cases involving degraded samples. However, commonly used STR genotyping of highly degraded samples may fail to provide a complete genotype profile because the larger amplicons are lost [7, 8]. Previously, a novel genotyping system consisting of 43 A-InDels and an Amelogenin gene locus was found to be suitable for personal identification in the Chinese Hui group (CHH) [9]. Moreover, the amplicon sizes of this self-developed multiplex PCR panel are all less than 200 bp, making it an ideal tool for obtaining complete genotype profiles from degraded samples. Nevertheless, this amplification system is still at an early stage in forensic genetics, and further population data are required before its widespread adoption.

    The Mongolian group is an ancient nomadic people with a unique language, script and customs. According to the 2020 census, the Chinese Mongolian population reached 6.29 million, making it one of the most populous ethnic groups in China (http://www.stats.gov.cn/tjsj/ndsj/2021/indexch.htm). Mongolian, which belongs to the Mongolian branch of the Altaic language family, is the national official language of Mongolia. Mongolians are renowned for the Mongol Empire, which was established by Genghis Khan in the 13th century. The establishment of the Mongol Empire and the continuous expansion of its territory promoted gene exchange between the Mongolian group and other nationalities, which is especially reflected in the genetic structure of the Eurasian continent [10, 11]. Polymorphism analyses of diverse genetic markers such as STR [12], single nucleotide polymorphism (SNP) [13] and mitochondrial DNA (mtDNA) [14] also largely affirm the intimate genetic associations between Mongolians and East Asians. An in-depth exploration of the genetic background and structure of the Mongolian group from northwest China (NCM) may help to better understand the background of the Mongolian group and its adjacent groups.

    In the current study, genetic distributions and forensic efficiencies of all the A-InDels using the self-developed panel in the NCM group were further investigated. Moreover, Nei's DA distances, pairwise FST values, principal component analysis (PCA), multidimensional scaling (MDS) analysis, phylogenetic tree construction, and genetic structure analysis based on the same 43 A-InDels were used to reveal the genetic differentiations and relationships among the NCM group and 27 worldwide comparison populations.

    Materials and methods

    Sample collections and comparison populations

    Bloodstain samples were collected from 288 unrelated healthy donors from the Mongolian group residing in northwest China. Prior to this experiment, all donors gave written informed consent. Of the 3 626 individuals from the 27 reference populations, those of 26 populations came from the Phase 3 database of the 1 000 Genomes Project, and the remaining population came from our previously published research [9]. In addition, we combined two sets of genotype data, the Han Chinese in Beijing (CHB) group from the 1 000 Genomes Project Phase 3 and the Beijing Han from our previously published work [15], to create a combined dataset of 404 Beijing Han individuals, which is named CHB in the present study. Detailed information about the 28 populations is presented in Supplementary Table S1.

    PCR amplification and subsequent InDels genotyping

    Genomic DNA was extracted by the Chelex-100 method. The GeneAmp PCR System 9700 Thermal Cycler (Thermo Fisher Scientific, Waltham, MA, USA) was used to amplify the 43 A-InDels following the PCR protocol outlined in our previous study [9]. Subsequently, PCR amplification products were separated and detected through the ABI 3500xL Genetic Analyzer (Thermo Fisher Scientific). InDels genotyping was performed using GeneMapper ID-X software v1.5 (Thermo Fisher Scientific). The positive and negative controls during the experimental procedures were DNA 9947A, 9948, and deionized water, respectively.

    Forensic statistical analysis

    The Hardy–Weinberg equilibrium (HWE) analyses of the 43 A-InDels in the NCM group were conducted with Genepop software v4.7.5 [16], while linkage disequilibrium (LD) analyses for pairwise InDels were performed using SNPAnalyzer v2.0 (Istech, Republic of Korea) software [17]. The allele frequencies of the 43 A-InDels and the values of forensic parameters, including the polymorphism information content (PIC), typical paternity index (TPI), match probability (MP), probability of exclusion (PE), expected heterozygosity (He), and observed heterozygosity (Ho), in the NCM group were calculated with the STRAF online program v1.0.5 [18]. Meanwhile, the combined power of discrimination (CPD) and cumulative probability of exclusion (CPE) values of the 43 A-InDels in the NCM group were calculated with the corresponding formulas in Excel. Moreover, locus-by-locus analysis of molecular variance (AMOVA) was performed on the 28 populations, and pairwise fixation index (FST) values were calculated with Arlequin software v3.5 [19]. In addition, the DA distances between pairs of populations were calculated with the DISPAN program [20].
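
    The text does not spell out the CPD and CPE formulas that were entered in Excel. A minimal sketch, assuming the usual definitions for independent loci (CPD = 1 − ∏MP_i and CPE = 1 − ∏(1 − PE_i)), is given below. The function names are hypothetical, and only the first three loci of Table 1 are used for illustration. Decimal arithmetic is used because a CPD as close to 1 as the value reported here would round to exactly 1.0 in double-precision floating point.

```python
from decimal import Decimal, getcontext

# Enough significant digits that a CPD differing from 1 only at the
# 18th decimal place is not rounded away.
getcontext().prec = 50


def combined_match_probability(match_probabilities):
    """Product of per-locus match probabilities (MP), assuming independent loci."""
    product = Decimal(1)
    for mp in match_probabilities:
        product *= Decimal(str(mp))
    return product


def combined_power_of_discrimination(match_probabilities):
    """CPD = 1 - prod(MP_i)."""
    return Decimal(1) - combined_match_probability(match_probabilities)


def combined_probability_of_exclusion(exclusion_probabilities):
    """CPE = 1 - prod(1 - PE_i)."""
    product = Decimal(1)
    for pe in exclusion_probabilities:
        product *= Decimal(1) - Decimal(str(pe))
    return Decimal(1) - product


# MP and PE of rs142281120, rs146880183 and rs3036240 (Table 1), illustration only.
mp_values = [0.3833, 0.3883, 0.4011]
pe_values = [0.1785, 0.1409, 0.1815]

cpd = combined_power_of_discrimination(mp_values)
cpe = combined_probability_of_exclusion(pe_values)
print(f"CPD over 3 loci: {cpd}")
print(f"CPE over 3 loci: {cpe}")
```

    Passing the MP and PE columns of all 43 loci to the same functions would give the panel-level values, subject to the rounding of the tabulated inputs.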

    Phylogenetic trees were visualized and managed using the iTOL online tool (https://itol.embl.de). Heatmaps of the InDel allele frequencies, pairwise DA distances, and FST values were drawn using Origin v2019 following the previous study [21]. PCA plots, including the population-level plot based on allele frequencies and the individual-level plot derived from the genotyping data of the 43 A-InDels, along with the MDS plot based on pairwise FST values among the 28 populations, were drawn with R software v3.6.2. The genetic structure analysis was performed using the software STRUCTURE v2.3.4 [22]. An estimate of the optimum K value was obtained using the online tool STRUCTURE HARVESTER [23].

    Results

    HWE and LD analyses

    In the present study, LD tests were performed on all pairs of the 43 A-InDel loci to evaluate the independence of each locus in the NCM group (Supplementary Figure S1). All pairwise r² values were less than 0.25 (Supplementary Table S2) in the NCM group, meaning that these loci could be used as independent genetic markers. Meanwhile, all A-InDel loci were found to comply with HWE after Bonferroni correction (significance threshold P = 0.05/43 ≈ 0.0012); the lowest HWE P value in the NCM group was 0.0322 (rs63064161) (Table 1).
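
    The Bonferroni step above is simple arithmetic, sketched here for concreteness. The function names are hypothetical; the three P values are the lowest P-HWE entries in Table 1, and the number of tests is the 43 loci of the panel.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def loci_below(p_values, threshold):
    """Names of loci whose HWE P value falls below the given threshold."""
    return sorted(locus for locus, p in p_values.items() if p < threshold)


# Three lowest HWE P values reported in Table 1.
lowest_hwe_p = {
    "rs63064161": 0.0322,
    "rs10555434": 0.0584,
    "rs79287422": 0.0608,
}

threshold = bonferroni_threshold(alpha=0.05, n_tests=43)
nominal = loci_below(lowest_hwe_p, 0.05)         # before correction
corrected = loci_below(lowest_hwe_p, threshold)  # after correction

print(f"Corrected threshold: {threshold:.4f}")
print(f"Below 0.05 before correction: {nominal}")
print(f"Below threshold after correction: {corrected}")
```

    Only rs63064161 falls below the uncorrected 0.05 level, and no locus falls below the corrected threshold, which matches the statement that all loci comply with HWE after correction.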

    Table 1. Allelic frequencies and forensic parameters of 43 A-InDel loci in the Mongolian group from northwest China (NCM, n = 288)

      Loci         AF+     PIC     MP      PE      TPI     He      Ho      P-HWE
      rs142281120  0.4184  0.3683  0.3833  0.1785  0.9796  0.4875  0.4896  1.0000
      rs146880183  0.3628  0.3555  0.3883  0.1409  0.8944  0.4632  0.4410  0.4634
      rs3036240    0.3785  0.3598  0.4011  0.1815  0.9863  0.4713  0.4931  0.4585
      rs142159306  0.4653  0.3738  0.3740  0.1815  0.9863  0.4985  0.4931  0.9055
      rs35700881   0.4740  0.3743  0.3594  0.1562  0.9290  0.4995  0.4618  0.2316
      rs3830564    0.4115  0.3670  0.3738  0.1562  0.9290  0.4852  0.4618  0.4587
      rs3092307    0.4792  0.3746  0.3831  0.1999  1.0286  0.5000  0.5139  0.7325
      rs79287422   0.5538  0.3721  0.4109  0.2373  1.1163  0.4951  0.5521  0.0608
      rs5852131    0.3889  0.3623  0.3997  0.1875  1.0000  0.4761  0.5000  0.4485
      rs6144473    0.4201  0.3685  0.3844  0.1815  0.9863  0.4881  0.4931  0.9041
      rs33990282   0.6059  0.3635  0.3755  0.1458  0.9057  0.4784  0.4479  0.3230
      rs10537321   0.4913  0.3749  0.3882  0.2097  1.0511  0.5007  0.5243  0.4799
      rs201844336  0.3611  0.3550  0.4171  0.1936  1.0141  0.4622  0.5069  0.1220
      rs4019986    0.3924  0.3631  0.3826  0.1589  0.9351  0.4777  0.4653  0.7071
      rs5825145    0.4479  0.3723  0.3955  0.2130  1.0588  0.4954  0.5278  0.2802
      rs10541072   0.5139  0.3748  0.3826  0.1999  1.0286  0.5005  0.5139  0.7232
      rs10533337   0.6076  0.3631  0.3884  0.1699  0.9600  0.4777  0.4792  1.0000
      rs67941259   0.5694  0.3701  0.3846  0.1875  1.0000  0.4912  0.5000  0.8063
      rs10589141   0.4167  0.3680  0.3707  0.1536  0.9231  0.4870  0.4583  0.3346
      rs3993057    0.4063  0.3661  0.3892  0.1815  0.9863  0.4833  0.4931  0.8031
      rs5882232    0.3906  0.3628  0.4081  0.2031  1.0360  0.4769  0.5174  0.1724
      rs10555434   0.4896  0.3749  0.3521  0.1433  0.9000  0.5007  0.4444  0.0584
      rs63136060   0.5347  0.3738  0.3708  0.1756  0.9730  0.4985  0.4861  0.7213
      rs142623177  0.4722  0.3742  0.3638  0.1643  0.9474  0.4993  0.4722  0.4083
      rs55714089   0.3872  0.3619  0.3810  0.1510  0.9172  0.4754  0.4549  0.5263
      rs63064161   0.5625  0.3711  0.4152  0.2409  1.1250  0.4930  0.5556  0.0322
      rs35974596   0.5590  0.3715  0.3892  0.1999  1.0286  0.4939  0.5139  0.5578
      rs10540867   0.5208  0.3746  0.3950  0.2197  1.0746  0.5000  0.5347  0.2887
      rs140025863  0.4983  0.3750  0.3804  0.1968  1.0213  0.5009  0.5104  0.8178
      rs5822909    0.4670  0.3739  0.3721  0.1785  0.9796  0.4987  0.4896  0.8074
      rs10588341   0.4236  0.3691  0.3684  0.1536  0.9231  0.4892  0.4583  0.3235
      rs10573809   0.6076  0.3631  0.3799  0.1536  0.9231  0.4777  0.4583  0.5289
      rs16646      0.4913  0.3749  0.3769  0.1906  1.0070  0.5007  0.5035  1.0000
      rs147682692  0.3906  0.3628  0.3939  0.1785  0.9796  0.4769  0.4896  0.7054
      rs3830885    0.4809  0.3746  0.3775  0.1906  1.0070  0.5001  0.5035  1.0000
      rs10544053   0.3802  0.3602  0.3895  0.1616  0.9412  0.4721  0.4688  0.8993
      rs10555133   0.5226  0.3745  0.3678  0.1728  0.9664  0.4998  0.4826  0.6310
      rs142392113  0.5191  0.3746  0.3675  0.1728  0.9664  0.5001  0.4826  0.5541
      rs144537609  0.5104  0.3749  0.3686  0.1756  0.9730  0.5007  0.4861  0.6378
      rs5821525    0.6319  0.3570  0.3823  0.1336  0.8780  0.4660  0.4306  0.1993
      rs10584875   0.5451  0.3730  0.3941  0.2130  1.0588  0.4968  0.5278  0.3557
      rs5892949    0.4635  0.3737  0.3868  0.2031  1.0360  0.4982  0.5174  0.5581
      rs3043804    0.4497  0.3725  0.3784  0.1845  0.9931  0.4958  0.4965  1.0000

      AF+: insertion allele frequency; PIC: polymorphism information content; MP: match probability; PE: probability of exclusion; TPI: typical paternity index; He: expected heterozygosity; Ho: observed heterozygosity; P-HWE: P value of the Hardy–Weinberg equilibrium test.

    The allele frequency distributions and forensic parameter evaluations

    The allele frequencies and corresponding forensic parameters of the 43 A-InDels in the NCM group are listed in Table 1. The insertion allele frequencies of the 43 A-InDel loci ranged from 0.3611 (rs201844336) to 0.6319 (rs5821525) in the NCM group, with 30 loci falling between 0.4 and 0.6. The values of PIC, He, and Ho in the NCM group ranged from 0.3550 (rs201844336) to 0.3750 (rs140025863), 0.4622 (rs201844336) to 0.5009 (rs140025863), and 0.4306 (rs5821525) to 0.5556 (rs63064161), respectively. The values of PE in the NCM group ranged from 0.1336 (rs5821525) to 0.2409 (rs63064161). Moreover, the CPD and CPE values of the 43 A-InDels in the NCM group were 0.999 999 999 999 999 998 77 and 0.999 814, respectively. These results indicated that the panel was informative and polymorphic, and that it could be used for individual identification and as an effective complementary tool for forensic paternity testing in the NCM group. In addition, this panel of 43 A-InDels had higher CPE and CPD values in the NCM group than the previously reported InDel systems (Supplementary Table S3), such as the 21-plex InDels [24], 30 InDels [25], 32 InDels [26], and 35 InDels [27] panels.
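
    For readers who want to see how the per-locus columns of Table 1 relate to one another, the sketch below uses closed forms that are common in forensic statistics for a biallelic marker: PIC and the sample-size-corrected He from the insertion allele frequency, and PE and TPI from the observed heterozygosity. Whether STRAF computes the parameters in exactly this way is an assumption, and the function names are hypothetical. MP is omitted because it requires genotype counts, which are not tabulated.

```python
def pic_biallelic(p):
    """PIC for a two-allele locus: 1 - (p^2 + q^2) - 2 p^2 q^2."""
    q = 1.0 - p
    return 1.0 - (p * p + q * q) - 2.0 * p * p * q * q


def expected_heterozygosity(p, n_individuals):
    """Gene diversity with the 2n/(2n - 1) sample-size correction."""
    q = 1.0 - p
    n_alleles = 2 * n_individuals
    return n_alleles / (n_alleles - 1) * (1.0 - p * p - q * q)


def probability_of_exclusion(ho):
    """PE = h^2 (1 - 2 h H^2), with h = heterozygosity and H = homozygosity."""
    hom = 1.0 - ho
    return ho * ho * (1.0 - 2.0 * ho * hom * hom)


def typical_paternity_index(ho):
    """TPI = 1 / (2 H), with H = observed homozygosity."""
    return 1.0 / (2.0 * (1.0 - ho))


# Inputs taken from Table 1 (rounded to four decimals there).
print(round(pic_biallelic(0.3611), 4))                 # rs201844336, tabulated PIC 0.3550
print(round(expected_heterozygosity(0.3611, 288), 4))  # rs201844336, tabulated He 0.4622
print(round(probability_of_exclusion(0.4896), 4))      # rs142281120, tabulated PE 0.1785
print(round(typical_paternity_index(0.4896), 4))       # rs142281120, tabulated TPI 0.9796
```

    Because the inputs are themselves rounded, the computed values can differ from the tabulated ones in the last decimal place.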

    As shown in Supplementary Figure S2, the heatmap based on the insertion allele frequencies of the 43 A-InDel loci revealed different frequency distributions among the 28 populations. The 28 populations were divided into two branches: one branch consisted of the seven African populations, while all other populations gathered in the other branch. Populations with similar allele frequency distributions tended to cluster together, even when they did not come from the same continent. In the East Asian populations, almost all of the 43 A-InDel loci displayed intermediate insertion allele frequencies. In the NCM group, the insertion allele frequencies were between 0.36 and 0.63. Based on the allele frequency distributions, the NCM group and the East Asian subpopulations clustered closely. All insertion allele frequencies of the 43 A-InDels in the 28 populations are shown in Supplementary Table S4.

    To better understand the genetic differentiation among the NCM group and the 27 reference populations based on the 43 A-InDels, the locus-by-locus P values of AMOVA are listed in Supplementary Table S5. The NCM group differed significantly from the East Asian populations at one (Japanese in Tokyo, JPT), three (CHH), four (Kinh in Ho Chi Minh City, Vietnam, KHV; and Southern Han Chinese, CHS), and five (Han Chinese in Beijing, CHB; and Chinese Dai in Xishuangbanna, CDX) loci, respectively. In addition, the NCM group differed significantly from the European populations at 15 (Finnish in Finland, FIN), 18 (Toscani in Italy, TSI; Iberian populations in Spain, IBS; and Utah residents with Northern and Western European ancestry, CEU), and 19 (British in England and Scotland, GBR) loci, and from the African populations at 17 (African Ancestry in Southwest US, ASW), 21 (Luhya in Webuye, Kenya, LWK), 23 (Gambian in Western Division, The Gambia, GWD), 24 (Esan in Nigeria, ESN; and African Caribbean in Barbados, ACB), 25 (Mende in Sierra Leone, MSL), and 28 (Yoruba in Ibadan, Nigeria, YRI) loci, respectively.

    FST and DA distance analyses

    The pairwise FST values and DA distances among the 28 populations are shown in Supplementary Tables S6 and S7. Among all comparisons, the NCM group exhibited the lowest FST value with CHH (0.0039), followed by JPT (0.0047) and CHB (0.0066). Conversely, the highest FST value was found between NCM and YRI (0.0903), followed by ESN (0.0872) and MSL (0.0856). The DA distances between the NCM group and the 27 comparison populations showed similar trends to the FST values. The smallest DA distance was observed between the NCM group and CHH (0.0013), followed by JPT (0.0020) and CHB (0.0025), while the greatest DA distance was between the NCM group and YRI (0.0277), followed by ESN (0.0269) and MSL (0.0269). Furthermore, two colour-coded heatmaps were drawn, one based on the pairwise FST values (Supplementary Figure S3A) and the other on the DA distances (Supplementary Figure S3B). As displayed in the heatmaps, the NCM group exhibited greater FST values and DA distances with populations from Africa, and smaller FST values and DA distances with populations from East Asia.
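
    As a rough illustration of the DA distance reported above, the sketch below implements Nei's DA for biallelic loci, DA = 1 − (1/r) Σ_j Σ_i √(x_ij · y_ij), where r is the number of loci and x_ij, y_ij are the frequencies of allele i at locus j in the two populations. This is not the DISPAN implementation; the function name is hypothetical, and the reference-population frequencies are invented placeholders, not values from Supplementary Table S4.

```python
import math


def nei_da_distance(insertion_freqs_x, insertion_freqs_y):
    """Nei's DA distance between two populations typed at biallelic loci.

    Each argument lists the insertion allele frequency per locus; the
    deletion allele frequency is taken as one minus that value.
    """
    if len(insertion_freqs_x) != len(insertion_freqs_y):
        raise ValueError("both populations need the same loci")
    if not insertion_freqs_x:
        raise ValueError("at least one locus is required")
    total = 0.0
    for px, py in zip(insertion_freqs_x, insertion_freqs_y):
        total += math.sqrt(px * py) + math.sqrt((1.0 - px) * (1.0 - py))
    return 1.0 - total / len(insertion_freqs_x)


# NCM insertion allele frequencies of the first three loci in Table 1.
ncm = [0.4184, 0.3628, 0.3785]
# Hypothetical reference population, for illustration only.
reference = [0.4500, 0.3000, 0.5200]

print(f"DA(NCM, reference) = {nei_da_distance(ncm, reference):.4f}")
```

    The distance is zero for identical frequency vectors and grows as the allele frequencies diverge, which is why the East Asian populations sit close to the NCM group and the African populations far from it.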

    PCA and MDS analyses

    PCA and MDS analyses were also performed to explore the genetic relationships among the NCM group and the 27 comparison populations. The results of the allele frequency-based PCA and the pairwise FST-based MDS are shown in Figure 1A and B, respectively, while the results of the genotype-based PCA are presented in Figure 1C and D. At the population level, the first two principal components cumulatively explained 53.64% of the total variation, with PC1 and PC2 contributing 34.07% and 19.57%, respectively (Figure 1A). A similar population distribution pattern was found in the MDS plot based on pairwise FST values (Figure 1B), further confirming the close relationships between the NCM group and the East Asian populations involved in the present research. At the individual level, 9.85% of the total variation could be attributed to the first two principal components (PC1, 6.67%; PC2, 3.18%), as shown in Figure 1C. Subsequently, we conducted a PCA restricted to the populations from three continents (Africa, Europe, and East Asia) and the NCM group (Figure 1D). Compared with Figure 1C, more obvious boundaries were observed among the African, European, and East Asian populations. Populations from the same continent are displayed in the same colour. The dots of the NCM individuals overlapped most with those of the East Asian populations, suggesting close relationships between the NCM group and the East Asian subpopulations.

    Figure 1. The principal component analysis (PCA) and multidimensional scaling (MDS) for the studied NCM group and 27 reference populations. (A) PCA plot based on the allele frequency values and (B) MDS based on the pairwise FST values among the NCM group and 27 reference populations at the population level. (C) PCA plot from 28 populations at the individual level. (D) PCA plot among the NCM group and populations from three continents (East Asia, Europe, and Africa) at the individual level. NCM: Mongolian group from northwest China; CHH: Chinese Hui group, China; CDX: Chinese Dai in Xishuangbanna, China; CHB: Han Chinese in Beijing, China; CHS: Southern Han Chinese, China; KHV: Kinh in Ho Chi Minh City, Vietnam; JPT: Japanese in Tokyo, Japan; CEU: Utah residents with Northern and Western European ancestry; FIN: Finnish in Finland; GBR: British in England and Scotland; IBS: Iberian populations in Spain; TSI: Toscani in Italy; CLM: Colombian in Medellin, Colombia; MXL: Mexican Ancestry in Los Angeles, CA; PEL: Peruvian in Lima, Peru; PUR: Puerto Rican in Puerto Rico; PJL: Punjabi in Lahore, Pakistan; GIH: Gujarati Indian in Houston, TX; ITU: Indian Telugu in the UK; STU: Sri Lankan Tamil in the UK; BEB: Bengali in Bangladesh; ACB: African Caribbean in Barbados; ASW: African Ancestry in Southwest US; ESN: Esan in Nigeria; GWD: Gambian in Western Division, The Gambia; LWK: Luhya in Webuye, Kenya; MSL: Mende in Sierra Leone; YRI: Yoruba in Ibadan, Nigeria.

    Population genetic structure analyses and phylogenetic relationship reconstructions

    In this study, we performed a STRUCTURE analysis of the 28 populations based on the genotypic data of the 43 A-InDels. The STRUCTURE HARVESTER analysis showed that the optimal K value was 3 (Supplementary Figure S4). Figure 2 and Supplementary Figure S5 show the STRUCTURE results for the 28 populations at K = 2–7. Notably, at the individual level, the genetic structure of the NCM group was similar to those of the East Asian populations but differed from those of the other continental populations at K = 2–7.
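
    The text does not state which criterion was used to pick the optimal K. STRUCTURE HARVESTER is commonly used to apply the ΔK statistic of Evanno et al., so the sketch below assumes that criterion, in the formulation ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd(L(K)), where L(K) is the mean log-likelihood over replicate runs at a given K. The function name is hypothetical and the log-likelihood values are invented for illustration; they are not the STRUCTURE output of this study.

```python
import statistics


def evanno_delta_k(log_likelihoods):
    """Delta K for every K that has both K - 1 and K + 1 available.

    log_likelihoods maps K to the list of ln P(D) values from replicate
    STRUCTURE runs at that K (at least two replicates per K).
    """
    mean_l = {k: statistics.mean(v) for k, v in log_likelihoods.items()}
    delta_k = {}
    for k in sorted(log_likelihoods):
        if k - 1 not in mean_l or k + 1 not in mean_l:
            continue  # the second-order difference is undefined at the ends
        second_difference = abs(mean_l[k + 1] - 2.0 * mean_l[k] + mean_l[k - 1])
        spread = statistics.stdev(log_likelihoods[k])
        if spread == 0:
            continue  # identical replicates: Delta K is undefined
        delta_k[k] = second_difference / spread
    return delta_k


# Hypothetical ln P(D) values, three replicate runs per K.
runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-950.0, -951.0, -949.0],
    3: [-850.0, -851.0, -849.0],
    4: [-848.0, -852.0, -845.0],
    5: [-847.0, -855.0, -840.0],
}

delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
print(f"Delta K per K: {delta_k}")
print(f"K with the largest Delta K: {best_k}")
```

    In this toy input the likelihood climbs steeply up to K = 3 and then plateaus with growing variance between runs, which is the pattern the ΔK statistic is designed to detect.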

    Figure 2. The population genetic structure analysis among the NCM group and 27 reference populations when K = 2–4. NCM: Mongolian group from northwest China; CHH: Chinese Hui group, China; CDX: Chinese Dai in Xishuangbanna, China; CHB: Han Chinese in Beijing, China; CHS: Southern Han Chinese, China; KHV: Kinh in Ho Chi Minh City, Vietnam; JPT: Japanese in Tokyo, Japan; CEU: Utah residents with Northern and Western European ancestry; FIN: Finnish in Finland; GBR: British in England and Scotland; IBS: Iberian populations in Spain; TSI: Toscani in Italy; CLM: Colombian in Medellin, Colombia; MXL: Mexican Ancestry in Los Angeles, CA; PEL: Peruvian in Lima, Peru; PUR: Puerto Rican in Puerto Rico; PJL: Punjabi in Lahore, Pakistan; GIH: Gujarati Indian in Houston, TX; ITU: Indian Telugu in the UK; STU: Sri Lankan Tamil in the UK; BEB: Bengali in Bangladesh; ACB: African Caribbean in Barbados; ASW: African Ancestry in Southwest US; ESN: Esan in Nigeria; GWD: Gambian in Western Division, The Gambia; LWK: Luhya in Webuye, Kenya; MSL: Mende in Sierra Leone; YRI: Yoruba in Ibadan, Nigeria.

    The phylogenetic tree was used to assess the genetic relationships among the 28 populations. As displayed in Figure 3, there were five major clusters in the phylogenetic tree. The seven African populations clustered together (marked in green), as did the seven East Asian populations (marked in red), the five European populations (marked in yellow), and the five South Asian populations (marked in blue). Moreover, the NCM group was situated in the East Asian cluster, in particular proximity to the JPT and CHH groups.

    Figure 3. The phylogenetic tree reconstruction based on the pairwise DA distances among the NCM group and 27 reference populations. NCM: Mongolian group from northwest China; CHH: Chinese Hui group, China; CDX: Chinese Dai in Xishuangbanna, China; CHB: Han Chinese in Beijing, China; CHS: Southern Han Chinese, China; KHV: Kinh in Ho Chi Minh City, Vietnam; JPT: Japanese in Tokyo, Japan; CEU: Utah residents with Northern and Western European ancestry; FIN: Finnish in Finland; GBR: British in England and Scotland; IBS: Iberian populations in Spain; TSI: Toscani in Italy; CLM: Colombian in Medellin, Colombia; MXL: Mexican Ancestry in Los Angeles, CA; PEL: Peruvian in Lima, Peru; PUR: Puerto Rican in Puerto Rico; PJL: Punjabi in Lahore, Pakistan; GIH: Gujarati Indian in Houston, TX; ITU: Indian Telugu in the UK; STU: Sri Lankan Tamil in the UK; BEB: Bengali in Bangladesh; ACB: African Caribbean in Barbados; ASW: African Ancestry in Southwest US; ESN: Esan in Nigeria; GWD: Gambian in Western Division, The Gambia; LWK: Luhya in Webuye, Kenya; MSL: Mende in Sierra Leone; YRI: Yoruba in Ibadan, Nigeria.

    Discussion

    In the present study, we evaluated the genetic features of 43 A-InDels in the NCM group to determine the potential forensic application of this multiplex PCR system for individual identification and paternity testing. The results showed that no A-InDel deviated from HWE. Furthermore, all pairs of A-InDels were confirmed to be in linkage equilibrium, indicating that the loci were independent and could be used in the subsequent population genetic and forensic analyses. In a population, the heterozygosity of a genetic marker is the proportion of heterozygotes among all genotypes. A high degree of heterozygosity suggests that the genetic marker holds great application value in forensic personal identification. In the NCM group, all 43 A-InDels showed observed heterozygosity values above 0.4, and 17 of them had values of 0.5 or higher. Compared with the previous InDel systems used in Chinese Mongolian groups from different regions (Supplementary Table S3), the studied system of 43 A-InDels had higher CPE and CPD values in the NCM group, which indicated that this panel improved personal identification ability and could serve as a supplementary tool for forensic paternity testing in the NCM group.

    We then compared the NCM group with 27 reference populations based on the 43 A-InDels to gain a more in-depth understanding of their genetic relationships. The insertion allele frequencies of the 43 A-InDels were similar among populations from the same continent, with the exception of the American populations. The results of AMOVA, pairwise FST values, and DA distances all indicated that the NCM group had the greatest genetic differentiation from the African populations and the smallest differentiation from the East Asian populations, specifically the CHH group. The Hui and Mongolian groups have a long history of frequent and close social exchanges in China [28]. The Mongolian expeditions to the west and south promoted the migration and integration of ethnic groups. According to relevant historical records and published studies, a large number of Muslims from West and Central Asia, such as Turks, Persians, and Arabs, settled and multiplied in China, where they became known as “Huihui”, which might have led to the close genetic relationships between the Mongolian and Hui groups [29–31]. It is worth noting that there were only 27 reference populations, of which only six were East Asian, and these are not representative of East Asian populations as a whole. In the future, more populations need to be included, especially East Asian sub-populations, to investigate their genetic relationships with the NCM group in detail.

    The outcomes of phylogenetic relationship reconstruction showed that the NCM group closely clustered with East Asian populations, which was consistent with the results obtained from PCA and MDS analyses. The analysis of population genetic structure revealed that the proportions of ancestral compositions in the NCM group were similar to those in East Asian populations. Previously, several researchers have investigated the genetic polymorphisms of Mongolians in China based on different InDel panels. Huang et al. [26] found that the Mongolian population based on the 32-plex InDels panel showed mixed ancestral components related to East Asian and European populations, and Zhang et al. [27] reported that Chinese Mongolian group might have similar genetic structures and closely related genetic relationships to the East Asian populations, which were consistent with our present study. In addition, the polymorphism analyses of other diverse genetic markers were also used to study the genetic characteristics of the Mongolian group. For example, the results of 22 A-STR [12], 23 Y-STR [32], and 60 mtDNA [14] loci primarily confirmed the most intimate genetic relationships between the Mongolian group and the East Asian populations as compared to Mongolian group and non-East Asian populations, which further bolstered the reliability of our research findings.

    Herein, we utilized the self-developed panel to conduct a thorough assessment of the forensic effectiveness of this multiplex PCR amplification system in the NCM group and to investigate the genetic relationships between the NCM group and 27 comparison populations. The forensic parameters based on the 43 A-InDels demonstrated that this in-house panel held great potential as a reliable tool for individual identification. The evaluation of genetic relationships showed that the NCM group had a close relationship with the CHH group. In summary, this study will not only provide a robust foundation for the application of InDels in forensic genetics, but also enrich the resources of the InDel database and promote a more comprehensive understanding of the genetic architecture of the NCM group.

    Acknowledgements

    The authors want to thank the volunteers in this research.

    Authors' contributions

    Bofeng Zhu conceived and designed this study. Xuebing Chen conceived the experiments and wrote the manuscript. Hui Xu collected the samples. Wei Cui and Ming Zhao extracted DNA and helped to conduct the statistical analysis. Bofeng Zhu also revised the manuscript. All authors contributed to the final text and approved it.

    Compliance with ethical standards

    This study was performed in accordance with the principles of the Declaration of Helsinki and approved by the Ethics Committees of Southern Medical University, Guangzhou, China and Xi'an Jiaotong University, Xi'an, China (No. XJTULAC201). Written informed consent was obtained from all the participants.

    Disclosure statement

    The authors declare that they have no conflict of interest.

    Funding

    This work was supported by the National Natural Science Foundation of China [Grant No. 81772031].

    [1] JL Weber, D David, J Heil et al. Human diallelic insertion/deletion polymorphisms. Am J Hum Genet, 71, 854-862(2002).

    [2] YL Wei, CJ Qin, H Dong et al. A validation study of a multiplex INDEL assay for forensic use in four Chinese populations. Forensic Sci Int Genet, 9, e22-e25(2014).

    [3] B Zhu, Q Lan, Y Guo et al. Population genetic diversity and clustering analysis for Chinese Dongxiang group with 30 autosomal InDel loci simultaneously analyzed. Front Genet, 9, 279(2018).

    [4] F Oldoni, V Castella, F Grosjean et al. Sensitive DIP-STR markers for the analysis of unbalanced mixtures from “touch” DNA samples. Forensic Sci Int Genet, 28, 111-117(2017).

    [5] Y Guo, C Shen, H Meng et al. Population differentiations and phylogenetic analysis of Tibet and Qinghai Tibetan groups based on 30 InDel loci. DNA Cell Biol, 35, 787-794(2016).

    [6] Y Guo, C Chen, X Jin et al. Autosomal DIPs for population genetic structure and differentiation analyses of Chinese Xinjiang Kyrgyz ethnic group. Sci Rep, 8, 11054(2018).

    [7] R Alaeddini, SJ Walsh, A Abbas. Forensic implications of genetic analyses from degraded DNA—a review. Forensic Sci Int Genet, 4, 148-157(2010).

    [8] M Takahashi, Y Kato, H Mukoyama et al. Evaluation of five polymorphic microsatellite markers for typing DNA from decomposed human tissues—correlation between the size of the alleles and that of the template DNA. Forensic Sci Int, 90, 1-9(1997).

    [9] R Jin, W Cui, Y Fang et al. A novel panel of 43 insertion/deletion loci for human identifications of forensic degraded DNA samples: development and validation. Front Genet, 12(2021).

    [10] H Bai, X Guo, D Zhang et al. The genome of a Mongolian individual reveals the genetic imprints of Mongolians on modern human populations. Genome Biol Evol, 6, 3122-3136(2014).

    [11] T Zerjal, Y Xue, G Bertorelle et al. The genetic legacy of the Mongols. Am J Hum Genet, 72, 717-721(2003).

    [12] Y Fang, T Xie, Q Lan et al. Multiple genetic analyses to investigate the polymorphisms of Chinese Mongolian population with an efficient short tandem repeat panel. Croat Med J, 60, 191-200(2019).

    [13] R Wu, R Li, N Wang et al. Genetic polymorphism and population structure of Torghut Mongols and comparison with a Mongolian population 3000 kilometers away. Forensic Sci Int Genet, 42, 235-243(2019).

    [14] Q Lan, T Xie, X Jin et al. MtDNA polymorphism analyses in the Chinese Mongolian group: efficiency evaluation and further matrilineal genetic structure exploration. Mol Genet Genomic Med, 7(2019).

    [15] C Zhao, J Yang, H Xu et al. Genetic diversity analysis of forty-three insertion/deletion loci for forensic individual identification in Han Chinese from Beijing based on a novel panel. J Zhejiang Univ Sci B, 23, 241-248(2022).

    [16] F Rousset. Genepop'007: a complete re-implementation of the genepop software for Windows and Linux. Mol Ecol Resour, 8, 103-106(2008).

    [17] J Yoo, Y Lee, Y Kim et al. SNPAnalyzer 2.0: a web-based integrated workbench for linkage disequilibrium analysis and association analysis. BMC Bioinformatics, 9, 290(2008).

    [18] A Gouy, M Zieger. STRAF—a convenient online tool for STR data evaluation in forensic genetics. Forensic Sci Int Genet, 30, 148-151(2017).

    [19] L Excoffier, HE Lischer. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour, 10, 564-567(2010).

    [20] M Nei, F Tajima, Y Tateno. Accuracy of estimated phylogenetic trees from molecular data. II. Gene frequency data. J Mol Evol, 19, 153-170(1983).

    [21] X Chen, S Nie, L Hu et al. Forensic efficacy evaluation and genetic structure exploration of the Yunnan Miao group by a multiplex InDel panel. Electrophoresis, 43, 1765-1773(2022).

    [22] G Evanno, S Regnaut, J Goudet. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol, 14, 2611-2620(2005).

    [23] DA Earl, BM vonHoldt. STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Conserv Genet Resour, 4, 359-361(2012).

    [24] Y Jiang, W He, H Wang et al. Population genetic analysis of a 21-plex DIP panel in seven Chinese ethnic populations. Int J Leg Med, 132, 145-147(2018).

    [25] CT Li, SH Zhang, SM Zhao. Genetic analysis of 30 InDel markers for forensic use in five different Chinese populations. Genet Mol Res, 10, 964-979(2011).

    [26] Y Huang, X Chen, C Liu et al. Genetic analysis of 32 InDels in four ethnic minorities from Chinese Xinjiang. PloS One, 16(2021).

    [27] W Zhang, X Jin, Y Wang et al. Genetic polymorphisms and forensic efficiencies of a set of novel autosomal InDel markers in a Chinese Mongolian group. Biomed Res Int, 2020(2020).

    [28] CC Wang, Y Lu, L Kang et al. The massive assimilation of indigenous East Asian populations in the origin of Muslim Hui people inferred from paternal Y chromosome. Am J Phys Anthropol, 169, 341-347(2019).

    [29] C Chen, Y Li, R Tao et al. The genetic structure of Chinese Hui ethnic group revealed by complete mitochondrial genome analyses using massively parallel sequencing. Genes (Basel), 11, 1352(2020).

    [30] W Hong, S Chen, H Shao et al. HLA class I polymorphism in Mongolian and Hui ethnic groups from northern China. Hum Immunol, 68, 439-448(2007).

    [31] YG Yao, QP Kong, CY Wang et al. Different matrilineal contributions to genetic structure of ethnic groups in the Silk Road region in China. Mol Biol Evol, 21, 2265-2280(2004).

    [32] T Gao, L Yun, S Gao et al. Population genetics of 23 Y-STR loci in the Mongolian minority population in Inner Mongolia of China. Int J Leg Med, 130, 1509-1511(2016).

    Xuebing Chen, Hui Xu, Wei Cui, Ming Zhao, Bofeng Zhu. Systematical explorations of forensic feature and population genetic diversity of the Chinese Mongolian group from northwest China via a self-constructed InDel panel[J]. Forensic Sciences Research, 2024, 9(1): owad047

    Paper Information

    Category: Research Articles

    Received: Oct. 15, 2022

    Accepted: Oct. 16, 2023

    Published Online: Sep. 22, 2025

    The Author Email: Bofeng Zhu (zhubofeng7372@126.com)

    DOI:10.1093/fsr/owad047
