CN107419017B

CN107419017B - Method and system for inferring the origin of individuals of unknown origin among five intercontinental populations

Info

Publication number: CN107419017B
Application number: CN201710610369.8A
Authority: CN
Inventors: 刘京; 李彩霞; 赵雯婷; 江丽; 郝伟琪
Original assignee: Institute of Forensic Science Ministry of Public Security PRC
Current assignee: Institute of Forensic Science Ministry of Public Security PRC
Priority date: 2017-07-25
Filing date: 2017-07-25
Publication date: 2020-09-08
Anticipated expiration: 2037-07-25
Also published as: CN107419017A

Abstract

The invention discloses an SNP system for distinguishing five intercontinental populations and its application. Specifically, the invention provides the use of 28 SNP loci for either of the following: (a) constructing a genotyping database of five intercontinental populations; (b) distinguishing five intercontinental populations. The system not only distinguishes the five intercontinental populations but also has some ability to resolve admixed populations; for samples of known origin, the main ancestry component and the population match agree with the source information. The system can also infer the composition of an individual's ancestry components and can be applied in actual casework.

Description

Method and system for inferring the origin of individuals of unknown origin among five intercontinental populations

Technical Field

The invention belongs to the field of biotechnology and relates to a method and system for inferring, for an individual of unknown origin, which of five intercontinental populations the individual originates from.

Background Art

Detecting DNA polymorphic loci whose distribution differs greatly between populations, known as ancestry informative markers (AIMs), makes it possible to infer the population and geographic origin of the donor of crime-scene DNA. Short tandem repeats (STRs), single nucleotide polymorphisms (SNPs) and insertion/deletion polymorphisms (Indels) can all serve as AIMs for population inference and have performed well. SNPs are supported by databases such as the HapMap Project and 1000 Genomes, and in recent years have become important genetic markers for screening AIMs. Many AIM systems for distinguishing major continental populations have been reported, for example this project group's 27-SNP panel for inferring the three major populations of Africa, East Asia and Europe. Under the detection technology currently available in forensic DNA laboratories, a well-performing SNP population-inference system should, while ensuring population discrimination power, keep that power balanced across populations; use loci that are as informative as possible and as few in number as possible; and use a detection method that is simple and easy to carry out.

Summary of the Invention

The object of the invention is to provide a method and system for inferring, for an individual of unknown origin, which of five intercontinental populations the individual originates from.

The invention claims the use of 28 SNP loci for either of the following:

(a) constructing a genotyping database of five intercontinental populations;

(b) distinguishing five intercontinental populations.

The 28 SNP loci are: rs10483251, rs12142199, rs1229984, rs12402499, rs12498138, rs12594144, rs1426654, rs1557553, rs16891982, rs17822931, rs1871534, rs2080161, rs2139931, rs2789823, rs2814778, rs3751050, rs3827760, rs4657449, rs4749305, rs4792928, rs6054465, rs6437783, rs715605, rs8072587, rs8137373, rs9522149, rs9809818, rs9908046 (Table 2). The 28 SNP loci referred to below are the same.
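For downstream scripts it can help to hold the panel as data. The following is a minimal Python sketch that lists the 28 rsIDs in the order given above and checks that they are 28 distinct, well-formed identifiers. The names `PANEL_RSIDS` and `validate_panel` are illustrative assumptions and are not part of the patent.

```python
import re

# The 28 SNP loci, in the order in which the text lists them.
PANEL_RSIDS = [
    "rs10483251", "rs12142199", "rs1229984", "rs12402499", "rs12498138",
    "rs12594144", "rs1426654", "rs1557553", "rs16891982", "rs17822931",
    "rs1871534", "rs2080161", "rs2139931", "rs2789823", "rs2814778",
    "rs3751050", "rs3827760", "rs4657449", "rs4749305", "rs4792928",
    "rs6054465", "rs6437783", "rs715605", "rs8072587", "rs8137373",
    "rs9522149", "rs9809818", "rs9908046",
]

RSID_PATTERN = re.compile(r"^rs\d+$")


def validate_panel(rsids, expected_count=28):
    """Raise ValueError if the panel is malformed; return True otherwise."""
    if len(rsids) != expected_count:
        raise ValueError(f"expected {expected_count} loci, got {len(rsids)}")
    if len(set(rsids)) != len(rsids):
        raise ValueError("duplicate rsID in panel")
    bad = [r for r in rsids if not RSID_PATTERN.match(r)]
    if bad:
        raise ValueError(f"malformed rsID(s): {bad}")
    return True


if __name__ == "__main__":
    validate_panel(PANEL_RSIDS)
    print(f"panel OK: {len(PANEL_RSIDS)} loci")
```

A check of this kind is cheap insurance against a locus being dropped or duplicated when the list is copied between documents.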

The invention claims a primer pair set for detecting the 28 SNP loci in the human genome.

The primer pair set provided by the invention for detecting the 28 SNP loci in the human genome consists of the following (1)-(28), where each primer pair consists of the two single-stranded DNAs shown in the indicated sequences of the sequence listing:

(1) primer pair 1, for detecting rs10483251: sequences 1 and 2;
(2) primer pair 2, for detecting rs12142199: sequences 4 and 5;
(3) primer pair 3, for detecting rs1229984: sequences 7 and 8;
(4) primer pair 4, for detecting rs12402499: sequences 10 and 11;
(5) primer pair 5, for detecting rs12498138: sequences 13 and 14;
(6) primer pair 6, for detecting rs12594144: sequences 16 and 17;
(7) primer pair 7, for detecting rs1426654: sequences 19 and 20;
(8) primer pair 8, for detecting rs1557553: sequences 22 and 23;
(9) primer pair 9, for detecting rs16891982: sequences 25 and 26;
(10) primer pair 10, for detecting rs17822931: sequences 28 and 29;
(11) primer pair 11, for detecting rs1871534: sequences 31 and 32;
(12) primer pair 12, for detecting rs2080161: sequences 34 and 35;
(13) primer pair 13, for detecting rs2139931: sequences 37 and 38;
(14) primer pair 14, for detecting rs2789823: sequences 40 and 41;
(15) primer pair 15, for detecting rs2814778: sequences 43 and 44;
(16) primer pair 16, for detecting rs3751050: sequences 46 and 47;
(17) primer pair 17, for detecting rs3827760: sequences 49 and 50;
(18) primer pair 18, for detecting rs4657449: sequences 52 and 53;
(19) primer pair 19, for detecting rs4749305: sequences 55 and 56;
(20) primer pair 20, for detecting rs4792928: sequences 58 and 59;
(21) primer pair 21, for detecting rs6054465: sequences 61 and 62;
(22) primer pair 22, for detecting rs6437783: sequences 64 and 65;
(23) primer pair 23, for detecting rs715605: sequences 67 and 68;
(24) primer pair 24, for detecting rs8072587: sequences 70 and 71;
(25) primer pair 25, for detecting rs8137373: sequences 73 and 74;
(26) primer pair 26, for detecting rs9522149: sequences 76 and 77;
(27) primer pair 27, for detecting rs9809818: sequences 79 and 80;
(28) primer pair 28, for detecting rs9908046: sequences 82 and 83.

In the primer pair set, the molar ratio of primer pairs 1 through 28, in that order, is 0.8:0.6:1.5:2.7:3:0.8:2:1.3:1:1:1.5:1:2.7:0.5:0.8:2.5:0.4:1.6:2.5:4:3:0.8:0.8:3:3:6:0.6:0.6. Within each primer pair, the molar ratio of the two primers is 1:1.

The invention claims a single-stranded DNA set for detecting the 28 SNP loci in the human genome.

The single-stranded DNA set provided by the invention for detecting the 28 SNP loci in the human genome consists of the following (1)-(28), where each primer pair consists of the two single-stranded DNAs shown in the indicated sequences of the sequence listing and each extension primer is the single-stranded DNA shown in the indicated sequence:

(1) for detecting rs10483251: primer pair 1 (sequences 1 and 2) and extension primer 1 (sequence 3);
(2) for detecting rs12142199: primer pair 2 (sequences 4 and 5) and extension primer 2 (sequence 6);
(3) for detecting rs1229984: primer pair 3 (sequences 7 and 8) and extension primer 3 (sequence 9);
(4) for detecting rs12402499: primer pair 4 (sequences 10 and 11) and extension primer 4 (sequence 12);
(5) for detecting rs12498138: primer pair 5 (sequences 13 and 14) and extension primer 5 (sequence 15);
(6) for detecting rs12594144: primer pair 6 (sequences 16 and 17) and extension primer 6 (sequence 18);
(7) for detecting rs1426654: primer pair 7 (sequences 19 and 20) and extension primer 7 (sequence 21);
(8) for detecting rs1557553: primer pair 8 (sequences 22 and 23) and extension primer 8 (sequence 24);
(9) for detecting rs16891982: primer pair 9 (sequences 25 and 26) and extension primer 9 (sequence 27);
(10) for detecting rs17822931: primer pair 10 (sequences 28 and 29) and extension primer 10 (sequence 30);
(11) for detecting rs1871534: primer pair 11 (sequences 31 and 32) and extension primer 11 (sequence 33);
(12) for detecting rs2080161: primer pair 12 (sequences 34 and 35) and extension primer 12 (sequence 36);
(13) for detecting rs2139931: primer pair 13 (sequences 37 and 38) and extension primer 13 (sequence 39);
(14) for detecting rs2789823: primer pair 14 (sequences 40 and 41) and extension primer 14 (sequence 42);
(15) for detecting rs2814778: primer pair 15 (sequences 43 and 44) and extension primer 15 (sequence 45);
(16) for detecting rs3751050: primer pair 16 (sequences 46 and 47) and extension primer 16 (sequence 48);
(17) for detecting rs3827760: primer pair 17 (sequences 49 and 50) and extension primer 17 (sequence 51);
(18) for detecting rs4657449: primer pair 18 (sequences 52 and 53) and extension primer 18 (sequence 54);
(19) for detecting rs4749305: primer pair 19 (sequences 55 and 56) and extension primer 19 (sequence 57);
(20) for detecting rs4792928: primer pair 20 (sequences 58 and 59) and extension primer 20 (sequence 60);
(21) for detecting rs6054465: primer pair 21 (sequences 61 and 62) and extension primer 21 (sequence 63);
(22) for detecting rs6437783: primer pair 22 (sequences 64 and 65) and extension primer 22 (sequence 66);
(23) for detecting rs715605: primer pair 23 (sequences 67 and 68) and extension primer 23 (sequence 69);
(24) for detecting rs8072587: primer pair 24 (sequences 70 and 71) and extension primer 24 (sequence 72);
(25) for detecting rs8137373: primer pair 25 (sequences 73 and 74) and extension primer 25 (sequence 75);
(26) for detecting rs9522149: primer pair 26 (sequences 76 and 77) and extension primer 26 (sequence 78);
(27) for detecting rs9809818: primer pair 27 (sequences 79 and 80) and extension primer 27 (sequence 81);
(28) for detecting rs9908046: primer pair 28 (sequences 82 and 83) and extension primer 28 (sequence 84).
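The enumeration above follows a regular numbering pattern: for locus n, the two amplification primers are sequences 3n-2 and 3n-1 of the sequence listing and the extension primer is sequence 3n. The short Python sketch below encodes that pattern, which is inferred from the list itself; the function name `sequence_ids` is a hypothetical helper, not something defined in the patent.

```python
def sequence_ids(locus_number):
    """Return the sequence-listing numbers used for one locus.

    The result is (primer_a, primer_b, extension_primer), following the
    3n-2 / 3n-1 / 3n pattern observed in the enumeration (1)-(28).
    The source does not say which amplification primer is forward or
    reverse, so they are simply labelled a and b here.
    """
    if not 1 <= locus_number <= 28:
        raise ValueError("locus_number must be between 1 and 28")
    base = 3 * locus_number
    return (base - 2, base - 1, base)


if __name__ == "__main__":
    for n in (1, 23, 28):
        print(n, sequence_ids(n))
```

For example, locus 23 (rs715605) maps to sequences 67 and 68 for the primer pair and sequence 69 for the extension primer, in agreement with item (23).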

In the single-stranded DNA set, the molar ratio of primer pairs 1 through 28, in that order, is 0.8:0.6:1.5:2.7:3:0.8:2:1.3:1:1:1.5:1:2.7:0.5:0.8:2.5:0.4:1.6:2.5:4:3:0.8:0.8:3:3:6:0.6:0.6; within each primer pair, the molar ratio of the two primers is 1:1. The molar ratio of extension primers 1 through 28, in that order, is 0.45:0.35:1.2:1.7:4:1:3:0.8:1:0.8:1.8:1.1:1.1:1.1:0.8:1.4:1:0.5:1.3:1.8:1.6:0.9:0.9:2:2.3:3:1:0.6.
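The molar ratios are relative values, so preparing a mix requires choosing a concentration for a ratio of 1. The Python sketch below stores both ratio series and scales them by such a unit concentration. The unit value of 0.1 (in μM) used in the example is an arbitrary assumption for illustration and is not taken from the patent; the function name `to_concentrations` is likewise hypothetical.

```python
# Molar ratios of primer pairs 1-28 and extension primers 1-28, as stated above.
PRIMER_PAIR_RATIOS = [
    0.8, 0.6, 1.5, 2.7, 3, 0.8, 2, 1.3, 1, 1, 1.5, 1, 2.7, 0.5,
    0.8, 2.5, 0.4, 1.6, 2.5, 4, 3, 0.8, 0.8, 3, 3, 6, 0.6, 0.6,
]
EXTENSION_PRIMER_RATIOS = [
    0.45, 0.35, 1.2, 1.7, 4, 1, 3, 0.8, 1, 0.8, 1.8, 1.1, 1.1, 1.1,
    0.8, 1.4, 1, 0.5, 1.3, 1.8, 1.6, 0.9, 0.9, 2, 2.3, 3, 1, 0.6,
]


def to_concentrations(ratios, unit_concentration):
    """Scale relative molar ratios by the concentration chosen for ratio 1."""
    if unit_concentration <= 0:
        raise ValueError("unit_concentration must be positive")
    return [ratio * unit_concentration for ratio in ratios]


if __name__ == "__main__":
    # Assumed unit concentration of 0.1 uM, for illustration only.
    pair_conc = to_concentrations(PRIMER_PAIR_RATIOS, 0.1)
    for number, conc in enumerate(pair_conc, start=1):
        print(f"primer pair {number}: {conc:.2f} uM per primer")
```

Because the two primers within a pair are at 1:1, the value computed for a pair applies to each of its two primers.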

The invention also claims a kit for distinguishing five intercontinental populations.

The kit provided by the invention for distinguishing five intercontinental populations contains the primer pair set or the single-stranded DNA set described above.

As required, the kit may further contain at least one of the following: dNTPs, DNA polymerase, alkaline phosphatase.

The invention also claims the use of a substance for detecting the 28 SNP loci in any of the following:

(b) distinguishing five intercontinental populations.

Here, the substance for detecting the 28 SNP loci may be the primer pair set, the single-stranded DNA set or the kit described above.

The invention also claims a method for constructing a genotyping database of five intercontinental populations.

The method provided by the invention for constructing a genotyping database of five intercontinental populations may specifically include the following steps:

(a1) selecting, from the 1000 Genomes Project and the Human Genome Diversity Project, the genotypes of the five intercontinental populations at the 28 SNP loci to form an original genotype library;

(a2) performing Structure cluster analysis on all samples in the original genotype library and selecting those whose main ancestry component is greater than 90%, which constitute the genotyping database of the five intercontinental populations.

Structure is a free, multi-platform (Windows, Mac, Linux), open-source, classic bioinformatics program that applies a model-based clustering method to multilocus genotype data made up of unlinked markers in order to infer population structure; it is widely used in human genetics, population genetics, forensic genetics and other fields. The parameters used were: 10,000 burn-in steps, 10,000 repetitions, admixture model. The output gives the proportion of each ancestry component for every sample.
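Step (a2) amounts to a threshold filter on the per-sample ancestry proportions that Structure reports. The following Python sketch shows one way to apply it, assuming the proportions have already been parsed into a dictionary. The input layout, the function name `select_reference_samples` and the toy values are all hypothetical and do not come from the patent or from Structure's file format.

```python
def select_reference_samples(ancestry, threshold=0.90):
    """Keep samples whose largest ancestry component exceeds the threshold.

    ancestry: dict mapping sample id -> list of K ancestry proportions.
    Returns a dict mapping each retained sample id to the index of its
    main ancestry component.
    """
    selected = {}
    for sample_id, proportions in ancestry.items():
        main = max(range(len(proportions)), key=proportions.__getitem__)
        if proportions[main] > threshold:  # strictly greater than 90%
            selected[sample_id] = main
    return selected


if __name__ == "__main__":
    # Toy proportions for K = 5 clusters; invented for illustration only.
    toy = {
        "S1": [0.96, 0.01, 0.01, 0.01, 0.01],
        "S2": [0.55, 0.40, 0.02, 0.02, 0.01],
        "S3": [0.02, 0.02, 0.93, 0.02, 0.01],
    }
    print(select_reference_samples(toy))
```

In this toy input, S2 is excluded because no single component exceeds 90%, which mirrors how admixed samples would be kept out of the reference database.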

The invention also claims a method for distinguishing five intercontinental populations.

The method provided by the invention for distinguishing five intercontinental populations may specifically include the following steps:

(b1) constructing a genotyping database of five intercontinental populations according to the method described above;

(b2) extracting genomic DNA from the subject to be tested and detecting the 28 SNP loci, to obtain the subject's raw genotype data at the 28 SNP loci;

(b3) comparing the subject's raw genotype data at the 28 SNP loci with the genotyping database of the five intercontinental populations by an analysis method, thereby determining which of the five intercontinental populations the subject belongs to.
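Step (b3) does not name a specific analysis method at this point. As one possible illustration only, the Python sketch below scores a genotype against per-population allele frequencies with a simple likelihood that assumes Hardy-Weinberg proportions and independent loci, and reports the best-scoring population. This is a swapped-in textbook technique, not the patent's stated procedure; the function names, the frequency floor and the toy frequencies are all assumptions.

```python
import math


def genotype_probability(freq, allele_count):
    """Hardy-Weinberg probability of carrying 0, 1 or 2 copies of an allele
    whose population frequency is freq."""
    other = 1.0 - freq
    if allele_count == 2:
        return freq * freq
    if allele_count == 1:
        return 2.0 * freq * other
    if allele_count == 0:
        return other * other
    raise ValueError("allele_count must be 0, 1 or 2")


def population_log_likelihoods(genotype, frequencies, floor=0.01):
    """Sum log genotype probabilities over loci for each population.

    genotype: dict locus -> copies (0/1/2) of the counted allele.
    frequencies: dict population -> dict locus -> frequency of that allele.
    Frequencies are clamped to [floor, 1 - floor] to avoid log(0).
    """
    scores = {}
    for population, locus_freqs in frequencies.items():
        total = 0.0
        for locus, allele_count in genotype.items():
            freq = min(max(locus_freqs[locus], floor), 1.0 - floor)
            total += math.log(genotype_probability(freq, allele_count))
        scores[population] = total
    return scores


def most_likely_population(genotype, frequencies):
    scores = population_log_likelihoods(genotype, frequencies)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Toy two-locus, two-population data; invented for illustration only.
    toy_freqs = {
        "POP_A": {"locus1": 0.9, "locus2": 0.1},
        "POP_B": {"locus1": 0.1, "locus2": 0.9},
    }
    toy_genotype = {"locus1": 2, "locus2": 0}
    print(most_likely_population(toy_genotype, toy_freqs))
```

In practice the frequencies would come from the reference database built in step (b1), and the likelihoods across the five populations could be compared as ratios rather than reduced to a single best label.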

In both methods above, the 28 SNP loci are detected using the primer pair set, the single-stranded DNA set or the kit described above; a 28-plex PCR amplification is performed, with an annealing temperature of 55°C.

The five intercontinental populations referred to above are those of East Asia, Europe, Africa, Oceania and the Americas.

To distinguish the five intercontinental populations (East Asia, Europe, Africa, Oceania and the Americas), the invention screened out 28 SNP loci and built a multiplex detection system. The system was used to type 712 samples from 16 populations; the results were merged with data for 20 populations from 1000 Genomes and 2 populations from the CEPH panel, giving genotype data for 2,804 individuals in total, and the performance of the system was evaluated by cluster analysis and principal component analysis. Samples with a main ancestry component greater than 90% were selected to build a reference population genotyping database. For 140 individuals of known ancestry, population origin was then inferred using population match probability, individual principal component analysis and related analyses, to assess the system's ability to discriminate population origin in real samples. The results showed that the system not only distinguishes the five intercontinental populations but also has some ability to resolve admixed populations; for samples of known origin, the main ancestry component and the population match agree with the source information. The system can also infer the composition of an individual's ancestry components and can be applied in actual casework.

Brief Description of the Drawings

Figure 1 is the genotyping profile of a sample with a DNA concentration of 5 ng/μL (28-plex SNP detection result). In the figure, 1-28 denote the 28 SNPs (corresponding to the numbering in Table 2), and each number is followed by the corresponding genotype (because the complementary strand is detected for some loci, the genotype shown after a number may be the complementary base of that given in Table 2).

Figure 2 is a principal component analysis based on the allele frequencies of the 28 SNPs typed in 38 populations. A: African individuals; B: American individuals; C: East Asian individuals; D: European individuals; E: admixed populations; F: Oceanian individuals.

Figure 3 shows the Structure analysis of the 38 populations using the 28 SNPs.

Figure 4 shows the population assignment analysis of the test samples. A: African individuals; B: American individuals; C: East Asian individuals (Guangxi Han); D: East Asian individuals (Henan Han); E: European individuals; F: Uyghur individuals; G: Oceanian individuals.

Figure 5 shows the detection profiles of the "9947" and "HWQ" samples obtained with the system of the invention and with the system from the article. A: "9947" sample, article system; B: "HWQ" sample, article system; C: "9947" sample, system of the invention; D: "HWQ" sample, system of the invention. In A-D, 1-28 denote the 28 SNPs (corresponding to the numbers in the "Number" column of Table 6), and each number is followed by the corresponding genotype.

Detailed description of the embodiments

Unless otherwise specified, the experimental methods used in the following examples are conventional methods.

Unless otherwise specified, the materials and reagents used in the following examples are commercially available.

Example 1. Construction and application of the five-intercontinental-population genotyping database of the present invention

I. Materials and methods

1. Sample information

The present invention uses 2000 samples from 20 populations in the public 1000 Genomes database and 92 samples from 2 populations in the Human Genome Diversity Project (HGDP-CEPH), together with 712 samples from 16 populations typed in this study, for a total of 2804 individuals from 38 populations used as validation samples. Details are given in Table 1.
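The totals quoted above can be checked with a few lines of Python. The dictionary keys are labels chosen for this sketch; the counts are the ones stated in the text.

```python
# (number of populations, number of individuals) per data source, as stated in the text
sources = {
    "1000 Genomes": (20, 2000),
    "HGDP-CEPH": (2, 92),
    "typed in this study": (16, 712),
}

total_populations = sum(pops for pops, _ in sources.values())
total_individuals = sum(inds for _, inds in sources.values())

print(total_populations, total_individuals)
```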

Table 1. Population sample information

Note: italics indicate test populations; the number in parentheses is the number of test samples selected.

2. Source of the SNP loci

Based on an analysis of "The Global AIMs Nano set" (de la Puente M, Santos C, Fondevila M, et al. The Global AIMs Nano set: A 31-plex SNaPshot assay of ancestry-informative SNPs [J]. Forensic Science International: Genetics, 2016, 22: 81-88.), the 3 triallelic SNP loci were removed from its 31 AIMs, and the remaining 28 biallelic SNP loci were used to build the multiplex system. Locus information is given in Table 2.

Table 2. Details of the 28 SNP loci

3. Multiplex PCR amplification

The multiplex PCR was run in a 5 μL reaction containing 0.6 μL of 10× PCR buffer (with 15 mmol/L Mg²⁺), 0.9 μL of 25 mmol/L MgCl₂, 0.1 μL of 10 mmol/L dNTP, 0.7 μL of multiplex amplification primers, 0.1 μL of 5 U/μL […] plus DNA polymerase (QIAGEN, Germany) and 1 μL of 5 ng/μL template DNA, made up to 5 μL with water. PCR conditions: 95 °C for 10 min; 40 cycles of 95 °C for 30 s, 55 °C for 40 s and 72 °C for 1 min; final extension at 72 °C for 20 min. The clean-up reaction was 7.5 μL, containing 5 μL of amplification product, 1 μL of H₂O, 0.2 μL of Exo I (10 U/μL), 1 μL of SAP (1 U/μL) and 0.3 μL of 10× SAP buffer. After thorough vortex mixing, the reaction was incubated at 37 °C for 45 min and then at 85 °C for 15 min to inactivate the enzymes.
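For bench use, the water volume and a scaled master mix can be derived from the stated component volumes. The sketch below is illustrative only: the 10% pipetting overage, the component labels and the function names are assumptions and are not part of the described protocol.

```python
REACTION_VOLUME_UL = 5.0

# Per-reaction volumes (uL) as stated for the multiplex PCR
components_ul = {
    "10x PCR buffer": 0.6,
    "25 mmol/L MgCl2": 0.9,
    "10 mmol/L dNTP": 0.1,
    "multiplex amplification primers": 0.7,
    "DNA polymerase (5 U/uL)": 0.1,
    "template DNA (5 ng/uL)": 1.0,
}

def water_volume(components, total):
    """Water needed to bring one reaction up to the total volume."""
    return round(total - sum(components.values()), 3)

def master_mix(components, total, n_reactions, overage=0.10):
    """Scale everything except the template for n reactions.

    `overage` (assumed 10%) compensates for pipetting loss.
    """
    factor = n_reactions * (1 + overage)
    mix = {name: round(vol * factor, 3)
           for name, vol in components.items()
           if not name.startswith("template")}
    mix["water"] = round(water_volume(components, total) * factor, 3)
    return mix

water_per_reaction = water_volume(components_ul, REACTION_VOLUME_UL)
mix_for_10 = master_mix(components_ul, REACTION_VOLUME_UL, n_reactions=10)
print(water_per_reaction)
print(mix_for_10)
```

Template DNA is left out of the master mix on purpose, since it is added to each tube individually.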

Single-base extension was performed with the […] Multiplex Kit (ABI, USA) in a 5.5 μL reaction containing 2 μL of purified PCR product, 1 μL of multiplex extension primers and 2.5 μL of SNaPshot mix. Cycling conditions: 96 °C for 10 s, followed by 33 cycles of 59 °C for 5 s and 60 °C for 30 s. Afterwards, 1 μL of SAP (1 U/μL) was added to the reaction product, which was incubated at 37 °C for 80 min and then at 85 °C for 15 min to remove excess primers and dNTPs.

The extended and purified products were analysed by capillary electrophoresis on a 3130-XL Genetic Analyzer (ABI, USA). The 10 μL loading mix consisted of 1 μL of purified single-base extension product and 9 μL of a mixture of formamide and the GeneScan Liz-120 internal size standard (38:1 by volume) (ABI, USA). Electrophoresis settings: injection time 18 s, injection voltage 3 kV, run voltage 13.4 kV, run time 15 min. Genotypes were called with GeneMapper ID v3.2 software.

The primer sequences used to detect each SNP locus, and their concentrations, are listed in Table 3.

Table 3. Primer sequences and concentrations used to detect each SNP locus

| Locus | Label | Amplification primer (F) | Amplification primer (R) | Amplification primer concentration (μM) | Extension primer | Length (bases) | Extension primer concentration (μM) |
|---|---|---|---|---|---|---|---|
| rs10483251 | EX-28*22 | Sequence 1 | Sequence 2 | 0.8 | Sequence 3 | 48 | 0.45 |
| rs12142199 | EX-28*23 | Sequence 4 | Sequence 5 | 0.6 | Sequence 6 | 63 | 0.35 |
| rs1229984 | EX-28*2 | Sequence 7 | Sequence 8 | 1.5 | Sequence 9 | 73 | 1.2 |
| rs12402499 | EX-28*24 | Sequence 10 | Sequence 11 | 2.7 | Sequence 12 | 51 | 1.7 |
| rs12498138 | EX-28*25 | Sequence 13 | Sequence 14 | 3 | Sequence 15 | 63 | 4 |
| rs12594144 | EX-28*26 | Sequence 16 | Sequence 17 | 0.8 | Sequence 18 | 40 | 1 |
| rs1426654 | EX-28*3 | Sequence 19 | Sequence 20 | 2 | Sequence 21 | 67 | 3 |
| rs1557553 | EX-28*4 | Sequence 22 | Sequence 23 | 1.3 | Sequence 24 | 93 | 0.8 |
| rs16891982 | EX-28*27 | Sequence 25 | Sequence 26 | 1 | Sequence 27 | 80 | 1 |
| rs17822931 | EX-28*28 | Sequence 28 | Sequence 29 | 1 | Sequence 30 | 59 | 0.8 |
| rs1871534 | EX-28*5 | Sequence 31 | Sequence 32 | 1.5 | Sequence 33 | 105 | 1.8 |
| rs2080161 | EX-28*6 | Sequence 34 | Sequence 35 | 1 | Sequence 36 | 48 | 1.1 |
| rs2139931 | EX-28*7 | Sequence 37 | Sequence 38 | 2.7 | Sequence 39 | 109 | 1.1 |
| rs2789823 | EX-28*8 | Sequence 40 | Sequence 41 | 0.5 | Sequence 42 | 51 | 1.1 |
| rs2814778 | EX-28*9 | Sequence 43 | Sequence 44 | 0.8 | Sequence 45 | 89 | 0.8 |
| rs3751050 | EX-28*10 | Sequence 46 | Sequence 47 | 2.5 | Sequence 48 | 44 | 1.4 |
| rs3827760 | EX-28*11 | Sequence 49 | Sequence 50 | 0.4 | Sequence 51 | 89 | 1 |
| rs4657449 | EX-28*12 | Sequence 52 | Sequence 53 | 1.6 | Sequence 54 | 93 | 0.5 |
| rs4749305 | EX-28*13 | Sequence 55 | Sequence 56 | 2.5 | Sequence 57 | 67 | 1.3 |
| rs4792928 | EX-28*14 | Sequence 58 | Sequence 59 | 4 | Sequence 60 | 99 | 1.8 |
| rs6054465 | EX-28*15 | Sequence 61 | Sequence 62 | 3 | Sequence 63 | 84 | 1.6 |
| rs6437783 | EX-28*16 | Sequence 64 | Sequence 65 | 0.8 | Sequence 66 | 45 | 0.9 |
| rs715605 | EX-28*1 | Sequence 67 | Sequence 68 | 0.8 | Sequence 69 | 113 | 0.9 |
| rs8072587 | EX-28*17 | Sequence 70 | Sequence 71 | 3 | Sequence 72 | 96 | 2 |
| rs8137373 | EX-28*18 | Sequence 73 | Sequence 74 | 3 | Sequence 75 | 100 | 2.3 |
| rs9522149 | EX-28*19 | Sequence 76 | Sequence 77 | 6 | Sequence 78 | 72 | 3 |
| rs9809818 | EX-28*20 | Sequence 79 | Sequence 80 | 0.6 | Sequence 81 | 55 | 1 |
| rs9908046 | EX-28*21 | Sequence 82 | Sequence 83 | 0.6 | Sequence 84 | 109 | 0.6 |

II. Software and analysis methods

1. Principal component analysis (PCA)

Principal component analysis was carried out in R v3.2.3: (a) the 38 populations, comprising the public-database samples and the samples typed here, were grouped by continent and region into African, American, East Asian, European, Oceanian and admixed populations (Central Asia, Central-South Asia and South Asia), and a population-level PCA based on allele frequencies was performed; (b) one sample was drawn at random from each of the 7 test populations (Table 1) for individual PCA (PCA was run in R v3.3.2 and the population-assignment plots were drawn with the R package ggplot2).
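A minimal Python sketch of a population-level PCA on allele frequencies is given below. It is not the R workflow used here: it substitutes plain power iteration with deflation for R's routines, and the frequency matrix is hypothetical toy data (4 populations by 5 loci), not values from Table 1 or Table 2.

```python
import math

def center_columns(rows):
    """Subtract the per-locus mean from every population's frequency."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    return [[r[j] - means[j] for j in range(m)] for r in rows]

def covariance(xc):
    """Sample covariance matrix between loci (columns)."""
    n, m = len(xc), len(xc[0])
    return [[sum(row[a] * row[b] for row in xc) / (n - 1) for b in range(m)]
            for a in range(m)]

def leading_eigenpair(c, iterations=1000):
    """Largest eigenvalue and its eigenvector by power iteration."""
    m = len(c)
    v = [float(j + 1) for j in range(m)]  # deterministic, non-uniform start
    for _ in range(iterations):
        w = [sum(c[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[a] * sum(c[a][b] * v[b] for b in range(m)) for a in range(m))
    return lam, v

def pca(rows, n_components=2):
    xc = center_columns(rows)
    c = covariance(xc)
    m = len(c)
    total_variance = sum(c[j][j] for j in range(m))
    components, explained = [], []
    for _ in range(n_components):
        lam, v = leading_eigenpair(c)
        components.append(v)
        explained.append(lam / total_variance)
        # Deflate: remove the component just found before looking for the next
        c = [[c[a][b] - lam * v[a] * v[b] for b in range(m)] for a in range(m)]
    scores = [[sum(x * w for x, w in zip(row, v)) for v in components]
              for row in xc]
    return scores, explained

# Hypothetical allele frequencies: two pairs of similar populations
freqs = [
    [0.95, 0.10, 0.90, 0.05, 0.50],
    [0.90, 0.15, 0.85, 0.10, 0.45],
    [0.10, 0.90, 0.20, 0.95, 0.55],
    [0.15, 0.85, 0.10, 0.90, 0.50],
]
scores, explained = pca(freqs)
for row in scores:
    print([round(x, 3) for x in row])
print([round(e, 3) for e in explained])
```

In the toy data the first two populations land on one side of PC1 and the last two on the other, which is the kind of separation the population-level plot is meant to reveal.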

2. Cluster analysis

For the 38 populations in Table 1, cluster analysis was performed with Structure v2.3.4 (K = 3-7) to examine the genetic structure of each population, and the clustering results were plotted with Distruct 1.1. Individual ancestry components were also computed for the 7 samples used in the individual PCA above.

3. Random population match probability

Population random match probabilities were calculated with forensic intelligence software for 140 randomly selected test samples from 7 populations (marked in Table 1).

III. Experimental results

1. 28-plex SNP results

The results are shown in Figure 1. The 28 AIMs meet the following criteria: (1) their power to discriminate the five intercontinental populations is well balanced; (2) loci are at least 1 Mb apart, which reduces the chance of linkage. The system uses the widely available SNaPshot genotyping technique, and the alleles at all 28 loci can be called unambiguously.
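The 1 Mb spacing criterion can be checked mechanically once marker coordinates are known. The following sketch assumes a simple mapping from marker name to (chromosome, position); the marker names and positions are hypothetical placeholders, not coordinates of the 28 SNPs.

```python
from itertools import combinations

MIN_SPACING_BP = 1_000_000  # 1 Mb, as stated in the criteria

def pairs_too_close(markers, min_spacing=MIN_SPACING_BP):
    """Return pairs of markers on the same chromosome closer than min_spacing."""
    close = []
    for (name_a, (chr_a, pos_a)), (name_b, (chr_b, pos_b)) in combinations(
            sorted(markers.items()), 2):
        if chr_a == chr_b and abs(pos_a - pos_b) < min_spacing:
            close.append((name_a, name_b))
    return close

# Hypothetical markers: snpA and snpB are only 600 kb apart
markers = {
    "snpA": ("1", 1_500_000),
    "snpB": ("1", 2_100_000),
    "snpC": ("1", 4_000_000),
    "snpD": ("2", 1_600_000),
}
violations = pairs_too_close(markers)
print(violations)
```

An empty result would indicate that every same-chromosome pair satisfies the spacing rule.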

2. Evaluation of the discrimination performance of the system

(1) Principal component analysis

The results are shown in Figure 2. Principal component 1 (PC1) and principal component 2 (PC2) together explain 61.3% of the variance, and the 28 loci clearly separate the 38 populations into six groups. Comparison with Table 1 shows that populations that cluster tightly share the same continental ancestry: the 30 populations from the five continents (Africa, Europe, East Asia, the Americas and Oceania) are clearly separated into five groups. Along PC1, the American, East Asian, Oceanian, African and European populations are separated in that order; the East Asian and European populations cluster relatively tightly, whereas the 8 admixed populations are more dispersed but all lie between the East Asian and European populations. These are South Asian and Eurasian admixed populations, and their dispersion reflects a complex genetic structure that requires further study. Along PC2, the Oceanian and American populations are separated from each other.

(2) Cluster analysis

The 28 AIMs were used to analyse the genetic structure of all 2804 samples from the 38 populations. The results are shown in Figure 3. At K = 3 the populations form three clusters: African and European populations are separated, the American, Oceanian and East Asian populations share one ancestry component, and admixed populations such as the Uyghur show a continuous gradient between the European and East Asian components. As K increases, new ancestry components appear first in the American and then in the Oceanian populations. At K = 6 all individuals fall into 6 groups (African, American, Oceanian, East Asian, admixed and European); the admixed populations acquire their own ancestry component, suggesting that the Uyghur and South Asian admixed populations have, through long-term admixture and evolution, become transitional populations with a relatively stable genetic composition. At K = 7 a new component appears in the South Asian admixed populations, indicating that they differ from the Central Asian admixed populations in ancestry and genetic structure. With further increases in K, no additional substructure appears within the East Asian populations, indicating that the system cannot resolve regional subpopulations within East Asia.

(3) Population-origin inference for single individuals

To avoid mixed components while retaining maximal discrimination, K = 6 was taken as the preferred setting for the system. To make the system applicable to forensic casework and improve inference accuracy, individuals with a major ancestry component above 90% (2201 samples in total) were selected as reference samples to build the reference population genotyping database, and the following statistical methods were used to evaluate the system's ability to infer individual origin.
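The 90% selection rule can be sketched as a filter over per-individual membership coefficients of the kind a clustering run reports. The input layout (a list of K proportions per individual), the sample identifiers and the function name are assumptions made for illustration; the values are invented.

```python
THRESHOLD = 0.90  # major ancestry component must exceed 90%

def select_reference(memberships, threshold=THRESHOLD):
    """Keep individuals whose largest membership coefficient exceeds the
    threshold; return {sample_id: index of the major cluster}."""
    selected = {}
    for sample_id, q in memberships.items():
        major = max(range(len(q)), key=lambda k: q[k])
        if q[major] > threshold:
            selected[sample_id] = major
    return selected

# Hypothetical membership coefficients at K = 6
memberships = {
    "ind1": [0.96, 0.01, 0.01, 0.01, 0.005, 0.005],
    "ind2": [0.55, 0.40, 0.02, 0.01, 0.01, 0.01],   # admixed, excluded
    "ind3": [0.02, 0.01, 0.93, 0.02, 0.01, 0.01],
}
reference = select_reference(memberships)
print(reference)
```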

A. Likelihood ratio

For 140 samples of known origin, random population match probabilities were calculated against the reference database (which excludes the 140 test samples), and the likely continental origin of each sample was assessed from the likelihood ratios (LR). The population match probability, i.e. the random match probability (MP), is the estimated probability that a particular multilocus genotype occurs in a population; equivalently, it is the theoretical probability that a sample drawn at random from the population has that DNA profile. The LR is calculated as follows: the match probability of the population with the highest probability for the unknown individual is the denominator, and the match probability of each other population is the numerator, giving in turn a likelihood ratio for each population. The results are given in Table 4. For 137 of the 140 test samples, the inferred ancestry agreed with the sample information. For the other 3 samples, all of Uyghur origin, the inferred origins were admixed, East Asian and European respectively, but their likelihood ratios were all below 100, so the inference does not exclude the recorded origin. Overall, the system inferred the ancestry of the test samples with an absolute accuracy of 97.86% (137/140), and for the remaining 2.14% the recorded origin could not be excluded.

Table 4 shows that the system is highly accurate when inferring the origin of the five intercontinental populations, whereas inference for admixed populations requires the ancestry components and MP values to be considered together. Furthermore, race and ethnicity are not the same concept. The Uyghur live at the junction of Asia and Europe and are a Eurasian admixed population; there is no sharp boundary in the admixture between populations, and the ethnicity recorded in household registration does not always coincide with ancestry. Although the three Uyghur samples did not meet the decision criterion, the inferred results are consistent with their geographic distribution, and multiple factors must be considered together when classifying this population in future.
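A minimal sketch of a match-probability and likelihood-ratio calculation follows. It is not the forensic software used here. It assumes biallelic loci, Hardy-Weinberg proportions and independence between loci, and it orients each ratio as the highest match probability over each other population's match probability so that values are at least 1 and can be compared with a cut-off such as 100; invert the ratio if the opposite orientation is intended. The population names and allele frequencies are hypothetical.

```python
from math import prod

def genotype_probability(allele_count, p):
    """Hardy-Weinberg probability of carrying 0, 1 or 2 copies of the
    reference allele, whose population frequency is p."""
    q = 1.0 - p
    return (q * q, 2.0 * p * q, p * p)[allele_count]

def match_probability(profile, freqs):
    """Product of per-locus genotype probabilities (loci assumed independent)."""
    return prod(genotype_probability(g, p) for g, p in zip(profile, freqs))

def likelihood_ratios(profile, population_freqs):
    """Return the best-matching population, all match probabilities, and the
    ratio best / other for every other population."""
    mp = {pop: match_probability(profile, f)
          for pop, f in population_freqs.items()}
    best = max(mp, key=mp.get)
    lr = {pop: (mp[best] / value if value > 0 else float("inf"))
          for pop, value in mp.items() if pop != best}
    return best, mp, lr

# Hypothetical reference-allele frequencies at three loci
population_freqs = {
    "pop1": [0.9, 0.8, 0.1],
    "pop2": [0.2, 0.3, 0.7],
}
profile = [2, 2, 0]  # copies of the reference allele at each locus

best, mp, lr = likelihood_ratios(profile, population_freqs)
print(best, mp, lr)
```

With these toy numbers the ratio for pop2 is well above 100, so under the assumed convention pop2 would be treated as excluded; a ratio below 100 would leave the alternative population open.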

Table 4. Match probability results for the test samples

B. Population assignment of test individuals

From the 140 test samples above, one sample was drawn at random from each test population for individual principal component analysis against the reference database, to infer its population of origin (Figure 4). The ancestry components computed for the 7 samples are given in Table 5. The individual PCA (Figure 4) shows that all 7 samples of known origin fall within the corresponding ancestral population, and the individual ancestry estimates (Table 5) show that the major ancestry component of each of the 7 test samples exceeds 94%.

Table 5. Ancestry analysis results for the seven test samples

Taken together, the results of this example show that the SNP multiplex system of the present invention can be used effectively for genetic structure analysis of the five major populations (East Asian, European, African, American and Oceanian) and of admixed populations, and for inferring the ancestry of individuals. Given that people today move frequently and over long distances, the system distinguishes the five intercontinental populations with a small number of loci and has broader coverage than our earlier 27-plex SNP system for the three European, Asian and African populations. In forensic DNA testing the two systems can corroborate each other's ancestry inference for unknown samples, providing more, and more accurate, investigative leads.

Example 2. Comparison of the present invention with the prior art

This example uses a standard European DNA sample (9947) and an East Asian DNA sample (HWQ, an Asian individual from the inventors' laboratory) to compare the 28-plex system of the present invention with the 31 AIMs of the article (de la Puente M, Santos C, Fondevila M, et al. The Global AIMs Nano set: A 31-plex SNaPshot assay of ancestry-informative SNPs [J]. Forensic Science International: Genetics, 2016, 22: 81-88.).

28-plex system of the present invention: performed according to the relevant steps of Example 1.

Article 31-AIMs group: performed as described in "de la Puente M, Santos C, Fondevila M, et al. The Global AIMs Nano set: A 31-plex SNaPshot assay of ancestry-informative SNPs [J]. Forensic Science International: Genetics, 2016, 22: 81-88.", i.e. the amplification and extension primers were mixed at the concentrations given in the article (with the three triallelic loci removed).

The results are shown in Table 6 and Figure 5. For the standard European DNA sample "9947", the system of the present invention and the article system gave identical genotypes. For the East Asian DNA sample "HWQ", the two systems disagreed at two loci (bold, underlined entries in Table 6); singleplex amplification and sequencing of the discordant loci confirmed that the result from the 28-plex system of the present invention was correct. Thus, compared with the 31 AIMs of the article, the 28-plex system of the present invention uses fewer SNP loci and also gives more accurate results, which in this example is seen specifically in the more accurate typing of the East Asian sample.
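Finding discordant calls between two assays amounts to a per-locus comparison that ignores allele order. The sketch below illustrates this under stated assumptions: the locus names and genotypes are hypothetical (they are not the Table 6 data), and both assays are assumed to report genotypes on the same strand.

```python
def discordant_loci(calls_a, calls_b):
    """Loci typed by both assays whose genotypes differ.

    Genotypes are two-letter strings; allele order is ignored,
    so "AG" and "GA" count as the same call.
    """
    shared = sorted(set(calls_a) & set(calls_b))
    return [locus for locus in shared
            if sorted(calls_a[locus]) != sorted(calls_b[locus])]

# Hypothetical calls from two assays on the same sample
assay_1 = {"locus1": "AG", "locus2": "CC", "locus3": "TT"}
assay_2 = {"locus1": "GA", "locus2": "CT", "locus3": "TT"}

differences = discordant_loci(assay_1, assay_2)
print(differences)
```

Loci flagged this way are the ones that would go on to confirmatory singleplex amplification and sequencing.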

As Figure 5 shows, when the amplification and extension primers are mixed at the concentrations given in the article (with the three triallelic loci removed), the article's AIMs set shows three main problems: missing peaks at some loci, unbalanced peaks (large peak-height differences in heterozygotes), and overlapping peaks between certain loci, all of which interfere with genotype calling. The 28-plex system of the present invention resolves these problems as follows: (1) for missing peaks, the relative primer concentrations were adjusted; (2) for unbalanced peaks, the annealing temperature was lowered while keeping amplification specific; (3) for overlapping peaks, the lengths of the extension primers were adjusted. With these adjustments the 28-plex system achieves accurate genotyping.

Table 6. Results for "9947" and "HWQ" obtained with the system of the present invention and the article system

<110> Institute of Forensic Science, Ministry of Public Security
<120> A method and system for inferring the origin of five intercontinental populations in individuals of unknown origin
<130> GNCLN171048
<160> 84
<170> PatentIn version 3.5

<210> 1
<211> 22
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 1
gcacgttctt aaccttggct at 22

<210> 2
<211> 21
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 2
ttctgaatat cccacccaca a 21

<210> 3
<211> 48
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 3
gtgccacgtc gtgaaagtct gacaaggaaa aagttatgtg accagatt 48

<210> 4
<211> 20
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 4
aggccttgat gtgcttgaac 20

<210> 5
<211> 20
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 5
cgagaaggcc aaccactact 20

<210> 6
<211> 63
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 6
aacaactgac taaactaggt gccacgtcgt gaaagtctga caatcaaaca tgttcctctg 60
cac 63

<210> 7
<211> 24
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 7
attctgtaga tggtggctgt agga 24

<210> 8
<211> 21
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 8
ctgcctcatg gcctaaaatc a 21

<210> 9
<211> 59
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 9
caactgacta aactaggtgc cacgtcgtga aagtctgaca aaccacgtgg tcatctgtg 59

<210> 10
<211> 20
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 10
tgaagggtat tactagtggc 20

<210> 11
<211> 20
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 11
ttgacagact tctgcttttg 20

<210> 12
<211> 51
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 12
aggtgccacg tcgtgaaagt ctgacaactg cttttgattt caagtatcag t 51

<210> 13
<211> 20
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 13
tcttcttcag ggaatcctgt 20

<210> 14
<211> 21
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 14
gagttacata ggatttgcga g 21

<210> 15
<211> 63
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 15
caactgacta aactaggtgc cacgtcgtga aagtctgaca agggaatcct gttattcaca 60
tta 63

<210> 16
<211> 20
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 16
cctacaagac cacccaccag 20

<210> 17
<211> 20
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 17
ggacccatgg tcattccata 20

<210> 18
<211> 40
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 18
cacgtcgtga aagtctgaca agctcccacc ctgaaaaaga 40

<210> 19
<211> 20
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 19
aattcaggag ctgaactgcc 20

<210> 20
<211> 20
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 20
tgttcagccc ttggattgtc 20

<210> 21
<211> 67
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 21
ctctctctct ctctctctct ctctctctct ctctctctct ctctctcttt cgctgccatg 60
aaagttg 67

<210> 22
<211> 20
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 22
taatacaaga gccgcctgga 20

<210> 23
<211> 21
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 23
cttgcaagga actgcagcta t 21

<210> 24
<211> 93
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 24
taaactaggt gccacgtcgt gaaagtctga caacaactga ctaaactagg tgccacgtcg 60
tgaaagtctg acaacccaaa gcccctggaa aaa 93

<210> 25
<211> 25
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 25
gaataaagtg aggaaaacac ggagt 25

<210> 26
<211> 25
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 26
gtttctcatc tacgaaagag gagtc 25

<210> 27
<211> 80
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 27
cacgtcgtga aagtctgaca acaactgact aaactaggtg ccacgtcgtg aaagtctgac 60
aaggttggat gttggggctt 80

<210> 28
<211> 20
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 28
cctagagtcc cccaaacctc 20

<210> 29
<211> 20
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 29
cacttctggg catctgcttc 20

<210> 30
<211> 59
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 30
aactgactaa actaggtgcc acgtcgtgaa agtctgacaa ctgcattgcc agtgtactc 59

<210> 31
<211> 20
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 31
acatcctgca gaccttcctg 20

<210> 32
<211> 19
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 32
cagaccttgg gcgtcagat 19

<210> 33
<211> 105
<212> DNA
<213> Artificial sequence
<220>
<223>
<400> 33
ctgacaacaa ctgactaaac taggtgccac gtcgtgaaag tctgacaaca actgactaaa 60
ctaggtgcca cgtcgtgaaa gtctgacaac ctggcagtgg gtgca 105

<210> 34<210> 34

<211> 27<211> 27

<212> DNA<212> DNA

<213> 人工序列<213> Artificial sequences

<220><220>

<223><223>

<400> 34<400> 34

gagtatgata taattttgtt cctgctg 27gagtatgata taattttgtt cctgctg 27

<210> 35<210> 35

<211> 23<211> 23

<212> DNA<212> DNA

<213> 人工序列<213> Artificial sequences

<220><220>

<223><223>

<400> 35<400> 35

tggactttat gggttgttgt ttt 23tggactttat gggttgttgt ttt 23

<210> 36<210> 36

<211> 48<211> 48

<212> DNA<212> DNA

<213> 人工序列<213> Artificial sequences

<220><220>

<223><223>

<400> 36<400> 36

cacgtcgtga aagtctgaca attttttgtt ttttttttgc actcatca 48cacgtcgtga aagtctgaca atttttttgtt ttttttttgc actcatca 48

<210> 37

<211> 22

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 37

agtcttggct agggcgttag ta 22

<210> 38

<211> 22

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 38

ctcctagtca tggttgatgt gg 22

<210> 39

<211> 109

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 39

acaacaactg actaaactag gtgccacgtc gtgaaagtct gacaacaact gactaaacta 60

ggtgccacgt cgtgaaagtc tgacaattcg tgttgatgag aaaatttca 109

<210> 40

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 40

agagggcttc tgttcacacc 20

<210> 41

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 41

atgcaccact actgtccaag 20

<210> 42

<211> 51

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 42

aaactaggtg ccacgtcgtg aaagtctgac aaggaggtga gcttcacggg g 51

<210> 43

<211> 21

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 43

aacctgatgg ccctcattag t 21

<210> 44

<211> 19

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 44

atggcaccgt ttggttcag 19

<210> 45

<211> 89

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 45

agtctgacaa ctaggtgcca cgtcgtgaaa gtctgacaac taggtgccac gtcgtgaaag 60

tctgacatct cattagtcct tggctctta 89

<210> 46

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 46

gaaggctccc aactcgttag 20

<210> 47

<211> 21

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 47

gtcattaaag tcaacctagg c 21

<210> 48

<211> 44

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 48

ctctctctct ctctctctct cttgtttagg agagttgaga catc 44

<210> 49

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 49

tgctcagctc cacgtacaac 20

<210> 50

<211> 19

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 50

ctcttcaggc cgaagctct 19

<210> 51

<211> 89

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 51

actaggtgcc acgtcgtgaa agtctgacaa caactgacta aactaggtgc cacgtcgtga 60

aagtctgaca atggcgccac gttttcaca 89

<210> 52

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 52

cccctcggga gaaaacatag 20

<210> 53

<211> 24

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 53

ttctagagtt gaatgagggt caga 24

<210> 54

<211> 93

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 54

aaactaggtg ccacgtcgtg aaagtctgac aacaactgac taaactaggt gccacgtcgt 60

gaaagtctga caagagctaa ggaaagatac gtg 93

<210> 55

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 55

cagcccaacc tactcctctg 20

<210> 56

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 56

tccctacaaa gtggcaaacc 20

<210> 57

<211> 67

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 57

caacaactga ctaaactagg tgccacgtcg tgaaagtctg acaacagtaa atagtaactc 60

catcttc 67

<210> 58

<211> 21

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 58

tctctcagga tatccctttg g 21

<210> 59

<211> 25

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 59

aaaatcttga ttctgtatcg cagtc 25

<210> 60

<211> 101

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 60

caactgacta aactaggtgc cacgtcgtga aagtctgaca acaactgact aaactaggtg 60

ccacgtcgtg aaagtctgac aacgcagtct actagttgtc c 101

<210> 61

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 61

tatggcctca ggttctccac 20

<210> 62

<211> 21

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 62

cacatgatct caccgtttcc t 21

<210> 63

<211> 113

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 63

tgacaacaac tgactaaact aggtgccacg tcgtgaaagt ctgacaacaa ctgactaaac 60

taggtgccac gtcgtgaaag tctgacaaca catgcaaaat caggataata atg 113

<210> 64

<211> 22

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 64

gcaatgagat tagttgcact gg 22

<210> 65

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 65

attatatgcc caccctgctc 20

<210> 66

<211> 44

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 66

tgccacgtcg tgaaagtctg acaactggtt gaggcacact atta 44

<210> 67

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 67

cccagctagg gctagacacc 20

<210> 68

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 68

tcaaagactg agccatgcac 20

<210> 69

<211> 113

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 69

gaaagtctga caacaactga ctaaactagg tgccacgtcg tgaaagtctg acaacaactg 60

actaaactag gtgccacgtc gtgaaagtct gacaaccacc ctaaggggac aga 113

<210> 70

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 70

tggcaacctc acatggtaga 20

<210> 71

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 71

ccaggggagg tagaaagagg 20

<210> 72

<211> 97

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 72

aactgactaa actaggtgcc acgtcgtgaa agtctgacaa caactgacta aactaggtgc 60

cacgtcgtga aagtctgaca acagtctcct gcccggc 97

<210> 73

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 73

ccagagcttt gcagcacttt 20

<210> 74

<211> 19

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 74

caaggacgca gctctctca 19

<210> 75

<211> 101

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 75

acaactgact aaactaggtg ccacgtcgtg aaagtctgac aacaactgac taaactaggt 60

gccacgtcgt gaaagtctga caagagtgtt ttgtgggcct c 101

<210> 76

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 76

agaaaggaga ggaaacaccg 20

<210> 77

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 77

tcagcaactt ctagtcctcg 20

<210> 78

<211> 55

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 78

ctctctctct ctctctctct ctctctctct gacaatctga ggtccttgca gctcc 55

<210> 79

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 79

tgtgtggttt tctcagcgac 20

<210> 80

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 80

agcatggtat gagcactgag 20

<210> 81

<211> 40

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 81

tctctctctc tctctctctc tcctcctaat aagagctggc 40

<210> 82

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 82

ccttggcatg ttcctctctc 20

<210> 83

<211> 24

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 83

tcagaggaat tagaaaggcc taaa 24

<210> 84

<211> 109

<212> DNA

<213> Artificial sequence

<220>

<223>

<400> 84

agtctgacaa caactgacta aactaggtgc cacgtcgtga aagtctgaca acaactgact 60

aaactaggtg ccacgtcgtg aaagtctgac aaggaggtag gagcaccca 109
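In each record of the listing, the length declared in the <211> field should equal the number of residues in the <400> field. The following is a minimal Python sketch of that consistency check, using three records copied from the listing; the helper name `residue_count` is illustrative and not part of the patent.

```python
# Illustrative check: compare the declared <211> length of a sequence-listing
# record with the number of residues actually present in its <400> field.

def residue_count(seq_field: str) -> int:
    """Count nucleotide letters, ignoring spaces and running position counters."""
    return sum(1 for ch in seq_field if ch.isalpha())

# (sequence number, declared <211> length, <400> text) copied from the listing
records = [
    (32, 19, "cagaccttgg gcgtcagat 19"),
    (40, 20, "agagggcttc tgttcacacc 20"),
    (57, 67, "caacaactga ctaaactagg tgccacgtcg tgaaagtctg "
             "acaacagtaa atagtaactc 60 catcttc 67"),
]

# Collect any record whose counted length differs from the declared length.
mismatches = [
    (number, declared, residue_count(text))
    for number, declared, text in records
    if residue_count(text) != declared
]
print(mismatches)
```

An empty list means every checked record is internally consistent.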

Claims

1. Use of a combination of 28 SNP sites in any of the following:

(a) constructing a genotyping database of five intercontinental populations;

(b) distinguishing five intercontinental populations;

wherein the 28 SNP sites are: rs10483251, rs12142199, rs1229984, rs12402499, rs12498138, rs12594144, rs1426654, rs1557553, rs16891982, rs17822931, rs1871534, rs2080161, rs2139931, rs2789823, rs2814778, rs3751050, rs3827760, rs4657449, rs4749305, rs4792928, rs6054465, rs6437783, rs715605, rs8072587, rs8137373, rs9522149, rs9809818, and rs9908046.

2. A primer pair set for detecting 28 SNP sites in the human genome, wherein the 28 SNP sites are: rs10483251, rs12142199, rs1229984, rs12402499, rs12498138, rs12594144, rs1426654, rs1557553, rs16891982, rs17822931, rs1871534, rs2080161, rs2139931, rs2789823, rs2814778, rs3751050, rs3827760, rs4657449, rs4749305, rs4792928, rs6054465, rs6437783, rs715605, rs8072587, rs8137373, rs9522149, rs9809818, and rs9908046; and the primer pair set consists of the following (1)-(28):

(1) Primer pair 1 for detecting rs10483251, consisting of two single-stranded DNAs shown in sequence 1 and sequence 2 in the sequence listing;

(2) Primer pair 2 for detecting rs12142199, consisting of two single-stranded DNAs shown in sequence 4 and sequence 5 in the sequence listing;

(3) Primer pair 3 for detecting rs1229984, consisting of two single-stranded DNAs shown in sequence 7 and sequence 8 in the sequence listing;

(4) Primer pair 4 for detecting rs12402499, consisting of two single-stranded DNAs shown in sequence 10 and sequence 11 in the sequence listing;

(5) Primer pair 5 for detecting rs12498138, consisting of two single-stranded DNAs shown in sequence 13 and sequence 14 in the sequence listing;

(6) Primer pair 6 for detecting rs12594144, consisting of two single-stranded DNAs shown in sequence 16 and sequence 17 in the sequence listing;

(7) Primer pair 7 for detecting rs1426654, consisting of two single-stranded DNAs shown in sequence 19 and sequence 20 in the sequence listing;

(8) Primer pair 8 for detecting rs1557553, consisting of two single-stranded DNAs shown in sequence 22 and sequence 23 in the sequence listing;

(9) primer pair 9 for detecting rs16891982, consisting of two single-stranded DNAs shown in sequence 25 and sequence 26 in the sequence listing;

(10) primer pair 10 for detecting rs17822931, consisting of two single-stranded DNAs shown in sequence 28 and sequence 29 in the sequence listing;

(11) Primer pair 11 for detecting rs1871534, consisting of two single-stranded DNAs shown in sequence 31 and sequence 32 in the sequence listing;

(12) primer pair 12 for detecting rs2080161, consisting of two single-stranded DNAs shown in sequence 34 and sequence 35 in the sequence listing;

(13) primer pair 13 for detecting rs2139931, consisting of two single-stranded DNAs shown in sequence 37 and sequence 38 in the sequence listing;

(14) primer pair 14 for detecting rs2789823, consisting of two single-stranded DNAs shown in sequence 40 and sequence 41 in the sequence listing;

(15) primer pair 15 for detecting rs2814778, consisting of two single-stranded DNAs shown in sequence 43 and sequence 44 in the sequence listing;

(16) primer pair 16 for detecting rs3751050, consisting of two single-stranded DNAs shown in sequence 46 and sequence 47 in the sequence listing;

(17) primer pair 17 for detecting rs3827760, consisting of two single-stranded DNAs shown in sequence 49 and sequence 50 in the sequence listing;

(18) primer pair 18 for detecting rs4657449, consisting of two single-stranded DNAs shown in sequence 52 and sequence 53 in the sequence listing;

(19) primer pair 19 for detecting rs4749305, consisting of two single-stranded DNAs shown in sequence 55 and sequence 56 in the sequence listing;

(20) primer pair 20 for detecting rs4792928, consisting of two single-stranded DNAs shown in sequence 58 and sequence 59 in the sequence listing;

(21) primer pair 21 for detecting rs6054465, consisting of two single-stranded DNAs shown in sequence 61 and sequence 62 in the sequence listing;

(22) primer pair 22 for detecting rs6437783, consisting of two single-stranded DNAs shown in sequence 64 and sequence 65 in the sequence listing;

(23) primer pair 23 for detecting rs715605, consisting of two single-stranded DNAs shown in sequence 67 and sequence 68 in the sequence listing;

(24) primer pair 24 for detecting rs8072587, consisting of two single-stranded DNAs shown in sequence 70 and sequence 71 in the sequence listing;

(25) primer pair 25 for detecting rs8137373, consisting of two single-stranded DNAs shown in sequence 73 and sequence 74 in the sequence listing;

(26) primer pair 26 for detecting rs9522149, consisting of two single-stranded DNAs shown in sequence 76 and sequence 77 in the sequence listing;

(27) primer pair 27 for detecting rs9809818, consisting of two single-stranded DNAs shown in sequence 79 and sequence 80 in the sequence listing;

(28) Primer pair 28 for detecting rs9908046, consisting of two single-stranded DNAs shown in sequence 82 and sequence 83 in the sequence listing.

3. The primer pair set according to claim 2, wherein, in the primer pair set, the molar ratio of primer pair 1 through primer pair 28, taken in that order, is 0.8:0.6:1.5:2.7:3:0.8:2:1.3:1:1:1.5:1:2.7:0.5:0.8:2.5:0.4:1.6:2.5:4:3:0.8:0.8:3:3:6:0.6:0.6;

and the molar ratio of the two primers in each primer pair is 1:1.
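The ratio in claim 3 fixes only the relative amounts of the 28 primer pairs; the absolute concentration is not stated in the claim. The following Python sketch scales the ratio by an assumed concentration per ratio unit. The function name `primer_concentrations` and the value 0.1 µM are hypothetical and chosen only for illustration.

```python
# Molar ratio of primer pairs 1-28, in order, as recited in claim 3.
PAIR_RATIO = [0.8, 0.6, 1.5, 2.7, 3, 0.8, 2, 1.3, 1, 1, 1.5, 1, 2.7, 0.5,
              0.8, 2.5, 0.4, 1.6, 2.5, 4, 3, 0.8, 0.8, 3, 3, 6, 0.6, 0.6]

def primer_concentrations(unit_conc_um: float) -> dict:
    """Map primer pair number -> concentration of each primer in that pair.

    unit_conc_um is an ASSUMED concentration (in µM) for a ratio value of 1;
    it is not taken from the patent. Because the two primers of a pair are
    mixed 1:1, both primers of a pair receive the same concentration.
    """
    return {number: round(ratio * unit_conc_um, 4)
            for number, ratio in enumerate(PAIR_RATIO, start=1)}

conc = primer_concentrations(0.1)  # hypothetical 0.1 µM per ratio unit
print(conc[26])                    # pair 26 has the largest ratio value (6)
```

Changing `unit_conc_um` rescales every primer pair together, so the claimed ratio is preserved.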

4. A single-stranded DNA set for detecting 28 SNP sites in the human genome, wherein the 28 SNP sites are: rs10483251, rs12142199, rs1229984, rs12402499, rs12498138, rs12594144, rs1426654, rs1557553, rs16891982, rs17822931, rs1871534, rs2080161, rs2139931, rs2789823, rs2814778, rs3751050, rs3827760, rs4657449, rs4749305, rs4792928, rs6054465, rs6437783, rs715605, rs8072587, rs8137373, rs9522149, rs9809818, and rs9908046; and the single-stranded DNA set consists of the following (1)-(28):

(1) Primer pair 1 and extension primer 1 for detecting rs10483251; primer pair 1 consists of the two single-stranded DNAs shown in sequence 1 and sequence 2 in the sequence listing; extension primer 1 is the single-stranded DNA shown in sequence 3 in the sequence listing;

(2) Primer pair 2 and extension primer 2 for detecting rs12142199; primer pair 2 consists of the two single-stranded DNAs shown in sequence 4 and sequence 5 in the sequence listing; extension primer 2 is the single-stranded DNA shown in sequence 6 in the sequence listing;

(3) Primer pair 3 and extension primer 3 for detecting rs1229984; primer pair 3 consists of the two single-stranded DNAs shown in sequence 7 and sequence 8 in the sequence listing; extension primer 3 is the single-stranded DNA shown in sequence 9 in the sequence listing;

(4) Primer pair 4 and extension primer 4 for detecting rs12402499; primer pair 4 consists of the two single-stranded DNAs shown in sequence 10 and sequence 11 in the sequence listing; extension primer 4 is the single-stranded DNA shown in sequence 12 in the sequence listing;

(5) Primer pair 5 and extension primer 5 for detecting rs12498138; primer pair 5 consists of the two single-stranded DNAs shown in sequence 13 and sequence 14 in the sequence listing; extension primer 5 is the single-stranded DNA shown in sequence 15 in the sequence listing;

(6) Primer pair 6 and extension primer 6 for detecting rs12594144; primer pair 6 consists of the two single-stranded DNAs shown in sequence 16 and sequence 17 in the sequence listing; extension primer 6 is the single-stranded DNA shown in sequence 18 in the sequence listing;

(7) Primer pair 7 and extension primer 7 for detecting rs1426654; primer pair 7 consists of the two single-stranded DNAs shown in sequence 19 and sequence 20 in the sequence listing; extension primer 7 is the single-stranded DNA shown in sequence 21 in the sequence listing;

(8) Primer pair 8 and extension primer 8 for detecting rs1557553; primer pair 8 consists of the two single-stranded DNAs shown in sequence 22 and sequence 23 in the sequence listing; extension primer 8 is the single-stranded DNA shown in sequence 24 in the sequence listing;

(9) Primer pair 9 and extension primer 9 for detecting rs16891982; primer pair 9 consists of the two single-stranded DNAs shown in sequence 25 and sequence 26 in the sequence listing; extension primer 9 is the single-stranded DNA shown in sequence 27 in the sequence listing;

(10) Primer pair 10 and extension primer 10 for detecting rs17822931; primer pair 10 consists of the two single-stranded DNAs shown in sequence 28 and sequence 29 in the sequence listing; extension primer 10 is the single-stranded DNA shown in sequence 30 in the sequence listing;

(11) Primer pair 11 and extension primer 11 for detecting rs1871534; primer pair 11 consists of the two single-stranded DNAs shown in sequence 31 and sequence 32 in the sequence listing; extension primer 11 is the single-stranded DNA shown in sequence 33 in the sequence listing;

(12) Primer pair 12 and extension primer 12 for detecting rs2080161; primer pair 12 consists of the two single-stranded DNAs shown in sequence 34 and sequence 35 in the sequence listing; extension primer 12 is the single-stranded DNA shown in sequence 36 in the sequence listing;

(13) Primer pair 13 and extension primer 13 for detecting rs2139931; primer pair 13 consists of the two single-stranded DNAs shown in sequence 37 and sequence 38 in the sequence listing; extension primer 13 is the single-stranded DNA shown in sequence 39 in the sequence listing;

(14) Primer pair 14 and extension primer 14 for detecting rs2789823; primer pair 14 consists of the two single-stranded DNAs shown in sequence 40 and sequence 41 in the sequence listing; extension primer 14 is the single-stranded DNA shown in sequence 42 in the sequence listing;

(15) Primer pair 15 and extension primer 15 for detecting rs2814778; primer pair 15 consists of the two single-stranded DNAs shown in sequence 43 and sequence 44 in the sequence listing; extension primer 15 is the single-stranded DNA shown in sequence 45 in the sequence listing;

(16) Primer pair 16 and extension primer 16 for detecting rs3751050; primer pair 16 consists of the two single-stranded DNAs shown in sequence 46 and sequence 47 in the sequence listing; extension primer 16 is the single-stranded DNA shown in sequence 48 in the sequence listing;

(17) Primer pair 17 and extension primer 17 for detecting rs3827760; primer pair 17 consists of the two single-stranded DNAs shown in sequence 49 and sequence 50 in the sequence listing; extension primer 17 is the single-stranded DNA shown in sequence 51 in the sequence listing;

(18) Primer pair 18 and extension primer 18 for detecting rs4657449; primer pair 18 consists of the two single-stranded DNAs shown in sequence 52 and sequence 53 in the sequence listing; extension primer 18 is the single-stranded DNA shown in sequence 54 in the sequence listing;

(19) Primer pair 19 and extension primer 19 for detecting rs4749305; primer pair 19 consists of the two single-stranded DNAs shown in sequence 55 and sequence 56 in the sequence listing; extension primer 19 is the single-stranded DNA shown in sequence 57 in the sequence listing;

(20) Primer pair 20 and extension primer 20 for detecting rs4792928; primer pair 20 consists of the two single-stranded DNAs shown in sequence 58 and sequence 59 in the sequence listing; extension primer 20 is the single-stranded DNA shown in sequence 60 in the sequence listing;

(21) Primer pair 21 and extension primer 21 for detecting rs6054465; primer pair 21 consists of the two single-stranded DNAs shown in sequence 61 and sequence 62 in the sequence listing; extension primer 21 is the single-stranded DNA shown in sequence 63 in the sequence listing;

(22) Primer pair 22 and extension primer 22 for detecting rs6437783; primer pair 22 consists of the two single-stranded DNAs shown in sequence 64 and sequence 65 in the sequence listing; extension primer 22 is the single-stranded DNA shown in sequence 66 in the sequence listing;

(23) Primer pair 23 and extension primer 23 for detecting rs715605; primer pair 23 consists of the two single-stranded DNAs shown in sequence 67 and sequence 68 in the sequence listing; extension primer 23 is the single-stranded DNA shown in sequence 69 in the sequence listing;

(24) Primer pair 24 and extension primer 24 for detecting rs8072587; primer pair 24 consists of the two single-stranded DNAs shown in sequence 70 and sequence 71 in the sequence listing; extension primer 24 is the single-stranded DNA shown in sequence 72 in the sequence listing;

(25) Primer pair 25 and extension primer 25 for detecting rs8137373; primer pair 25 consists of the two single-stranded DNAs shown in sequence 73 and sequence 74 in the sequence listing; extension primer 25 is the single-stranded DNA shown in sequence 75 in the sequence listing;

(26) Primer pair 26 and extension primer 26 for detecting rs9522149; primer pair 26 consists of the two single-stranded DNAs shown in sequence 76 and sequence 77 in the sequence listing; extension primer 26 is the single-stranded DNA shown in sequence 78 in the sequence listing;

(27) Primer pair 27 and extension primer 27 for detecting rs9809818; primer pair 27 consists of the two single-stranded DNAs shown in sequence 79 and sequence 80 in the sequence listing; extension primer 27 is the single-stranded DNA shown in sequence 81 in the sequence listing;

(28) Primer pair 28 and extension primer 28 for detecting rs9908046; primer pair 28 consists of the two single-stranded DNAs shown in sequence 82 and sequence 83 in the sequence listing; extension primer 28 is the single-stranded DNA shown in sequence 84 in the sequence listing.
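Items (1)-(28) follow a regular numbering pattern: the k-th SNP uses sequences 3k-2 and 3k-1 as its primer pair and sequence 3k as its extension primer. The Python sketch below rebuilds that lookup table from the SNP order given in the claim; the dictionary layout and the name `sequence_numbers` are illustrative choices, not part of the patent.

```python
# The 28 SNP sites in the order used by items (1)-(28) of claim 4.
SNPS = [
    "rs10483251", "rs12142199", "rs1229984", "rs12402499", "rs12498138",
    "rs12594144", "rs1426654", "rs1557553", "rs16891982", "rs17822931",
    "rs1871534", "rs2080161", "rs2139931", "rs2789823", "rs2814778",
    "rs3751050", "rs3827760", "rs4657449", "rs4749305", "rs4792928",
    "rs6054465", "rs6437783", "rs715605", "rs8072587", "rs8137373",
    "rs9522149", "rs9809818", "rs9908046",
]

def sequence_numbers(k: int) -> dict:
    """Sequence-listing numbers used by item (k), for k = 1..28."""
    return {"primer_pair": (3 * k - 2, 3 * k - 1), "extension_primer": 3 * k}

# Lookup table: SNP identifier -> sequence-listing numbers of its three oligos.
panel = {snp: sequence_numbers(k) for k, snp in enumerate(SNPS, start=1)}
print(panel["rs1871534"])
```

For example, rs1871534 is item (11), so it maps to sequences 31 and 32 (primer pair) and 33 (extension primer), in agreement with the claim text.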

5. The single-stranded DNA set according to claim 4, wherein, in the single-stranded DNA set, the molar ratio of primer pair 1 through primer pair 28, taken in that order, is 0.8:0.6:1.5:2.7:3:0.8:2:1.3:1:1:1.5:1:2.7:0.5:0.8:2.5:0.4:1.6:2.5:4:3:0.8:0.8:3:3:6:0.6:0.6; the molar ratio of the two primers in each primer pair is 1:1;

and the molar ratio of extension primer 1 through extension primer 28, taken in that order, is 0.45:0.35:1.2:1.7:4:1:3:0.8:1:0.8:1.8:1.1:1.1:1.1:0.8:1.4:1:0.5:1.3:1.8:1.6:0.9:0.9:2:2.3:3:1:0.6.

6. A kit for distinguishing five intercontinental populations, comprising the primer pair set according to claim 2 or 3 or the single-stranded DNA set according to claim 4 or 5, and at least one of the following substances: dNTPs, DNA polymerase, and alkaline phosphatase.

7. Use of a substance for detecting a combination of 28 SNP sites in any of the following:

(b) distinguishing five intercontinental populations.

8. The use according to claim 7, wherein the substance for detecting the combination of 28 SNP sites is the primer pair set according to claim 2 or 3, the single-stranded DNA set according to claim 4 or 5, or the kit according to claim 6.

9. A method for constructing a genotyping database of five intercontinental populations, comprising the steps of:

(a1) selecting, from the 1000 Genomes Project and the Human Genome Diversity Project, the genotype data of five intercontinental populations at the 28 SNP sites to form an original genotyping database;

(a2) performing structure clustering analysis on all samples in the original genotyping database, and selecting the samples whose principal ancestry component is greater than 90% to constitute the genotyping database of five intercontinental populations.
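Step (a2) amounts to a threshold filter on per-sample ancestry proportions. A minimal Python sketch follows, assuming the clustering output is already available as a mapping from sample to ancestry proportions. The sample identifiers, population labels (`pop_A`, `pop_B`) and proportions are invented for illustration, and `select_reference_samples` is a hypothetical name.

```python
def select_reference_samples(memberships: dict, threshold: float = 0.90) -> dict:
    """Keep samples whose largest ancestry component exceeds the threshold.

    memberships: {sample_id: {population_label: ancestry proportion}}
    Returns {sample_id: population_label of the largest component}.
    """
    selected = {}
    for sample_id, proportions in memberships.items():
        population, share = max(proportions.items(), key=lambda item: item[1])
        if share > threshold:  # strictly greater than 90%, as in step (a2)
            selected[sample_id] = population
    return selected

# Invented clustering output for three samples and two populations.
example = {
    "s1": {"pop_A": 0.97, "pop_B": 0.03},  # kept
    "s2": {"pop_A": 0.55, "pop_B": 0.45},  # admixed, dropped
    "s3": {"pop_A": 0.10, "pop_B": 0.90},  # exactly 90%, dropped
}
reference = select_reference_samples(example)
print(reference)
```

The filter keeps only samples with one dominant ancestry component, so admixed samples do not enter the reference database.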

10. A method of distinguishing five intercontinental populations, comprising the steps of:

(b1) constructing a genotyping database of five intercontinental populations according to the method of claim 9;

(b2) extracting the genomic DNA of the subject to be tested and detecting the 28 SNP sites to obtain the original genotype data of the subject at the 28 SNP sites;

(b3) comparing the original genotype data of the subject at the 28 SNP sites with the genotyping database of five intercontinental populations, so as to determine which of the five intercontinental populations the subject belongs to.
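The claim does not specify how the comparison in step (b3) is carried out. One common approach, shown here purely as an assumption, is to score the subject's genotype against each population's allele frequencies under Hardy-Weinberg proportions and pick the population with the highest log-likelihood. In the Python sketch below, the SNP names (`snp_1`, `snp_2`), population labels and frequencies are invented placeholders, and the function names are hypothetical.

```python
import math

def log_likelihood(genotype: dict, freqs: dict) -> float:
    """Log-likelihood of a genotype given one population's allele frequencies.

    genotype: {snp: number of copies of the reference allele (0, 1 or 2)}
    freqs:    {snp: frequency of the reference allele in the population}
    Assumes Hardy-Weinberg proportions and independence between SNPs.
    """
    total = 0.0
    for snp, copies in genotype.items():
        p = min(max(freqs[snp], 0.01), 0.99)  # clamp to avoid log(0)
        q = 1.0 - p
        probability = {2: p * p, 1: 2 * p * q, 0: q * q}[copies]
        total += math.log(probability)
    return total

def assign_population(genotype: dict, database: dict):
    """Return the best-scoring population and all per-population scores."""
    scores = {pop: log_likelihood(genotype, freqs)
              for pop, freqs in database.items()}
    return max(scores, key=scores.get), scores

# Invented allele frequencies for two placeholder SNPs in two populations.
database = {
    "pop_A": {"snp_1": 0.95, "snp_2": 0.05},
    "pop_B": {"snp_1": 0.05, "snp_2": 0.90},
}
subject = {"snp_1": 2, "snp_2": 0}
best, scores = assign_population(subject, database)
print(best)
```

With the real panel, `database` would hold reference-allele frequencies for all 28 SNP sites in each of the five populations, computed from the genotyping database.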

11. The method according to claim 10, wherein the detection of the 28 SNP sites is carried out using the primer pair set according to claim 2 or 3, the single-stranded DNA set according to claim 4 or 5, or the kit according to claim 6; and 28-plex PCR amplification is performed at an annealing temperature of 55°C.