CN107419017B - Method and system for inferring source of five continental ethnic groups of individuals of unknown origin

Publication number
CN107419017B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710610369.8A
Other languages
Chinese (zh)
Other versions
CN107419017A (en)
Inventor
刘京
李彩霞
赵雯婷
江丽
郝伟琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Forensic Science Ministry of Public Security PRC
Original Assignee
Institute of Forensic Science Ministry of Public Security PRC

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers


Abstract

The invention discloses an SNP system for distinguishing five continental populations and its application. Specifically, the invention provides the use of 28 SNP loci in either of the following: (a) constructing a five-continent population genotyping database; (b) distinguishing five continental populations. The system not only distinguishes the five continental populations but also has some ability to resolve admixed populations, and for samples of known origin the inferred ancestry components and population matches are consistent with the recorded source information. It can also infer the ancestral composition of an individual and can be applied in actual casework.

Description

Method and system for inferring source of five continental ethnic groups of individuals of unknown origin
Technical Field
The invention belongs to the field of biotechnology and relates to a method and a system for inferring which of five continental population groups an individual of unknown origin comes from.
Background
Detecting DNA polymorphic sites whose distribution differs greatly among populations, known as ancestry informative markers (AIMs), makes it possible to infer the geographic population of origin of the donor of DNA left at a crime scene. Short tandem repeats (STRs), single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (Indels) and similar markers can all serve as AIMs for population inference and have performed well. Supported by databases such as the HapMap Project and the 1000 Genomes Project, SNPs have in recent years become important genetic markers for screening AIMs. Many AIM panels that distinguish major continental populations have been reported; for example, the 27-SNP panel of this research group, among others, is used to infer three populations: African, East Asian and European. Given the detection technology currently available in forensic DNA laboratories, a well-performing SNP population inference system should, while maintaining its discriminating power, keep that power balanced across all populations; use loci that are highly informative and as few in number as possible; and rely on a detection method that is simple and easy to carry out.
Disclosure of Invention
The invention aims to provide a method and a system for inferring which of five continental populations an individual of unknown origin comes from.
The invention protects the use of 28 SNP loci in either of the following:
(a) constructing a five-continent population genotyping database;
(b) distinguishing five continental populations.
The 28 SNP loci are: rs10483251, rs12142199, rs1229984, rs12402499, rs12498138, rs12594144, rs1426654, rs1557553, rs16891982, rs17822931, rs1871534, rs2080161, rs2139931, rs2789823, rs2814778, rs3751050, rs3827760, rs4657449, rs4749305, rs4792928, rs6054465, rs6437783, rs715605, rs8072587, rs8137373, rs9522149, rs9809818, rs9908046 (Table 2). All later references to the 28 SNP loci mean this same set.
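As an illustration only, the panel can be held as a plain list so that later steps can check whether a genotype record covers every locus. This is a minimal Python sketch; the names `PANEL_28` and `missing_loci` and the `{rsID: genotype}` record layout are assumptions made for the example and do not come from the patent.

```python
# The 28 rs identifiers exactly as listed in the disclosure.
PANEL_28 = [
    "rs10483251", "rs12142199", "rs1229984", "rs12402499", "rs12498138",
    "rs12594144", "rs1426654", "rs1557553", "rs16891982", "rs17822931",
    "rs1871534", "rs2080161", "rs2139931", "rs2789823", "rs2814778",
    "rs3751050", "rs3827760", "rs4657449", "rs4749305", "rs4792928",
    "rs6054465", "rs6437783", "rs715605", "rs8072587", "rs8137373",
    "rs9522149", "rs9809818", "rs9908046",
]


def missing_loci(genotypes):
    """Return the panel loci that have no call in a {rsID: genotype} mapping.

    The mapping layout is a hypothetical convenience format for this sketch.
    """
    return [rs for rs in PANEL_28 if not genotypes.get(rs)]
```

A record with calls at all 28 loci yields an empty list; any locus that is absent or has an empty call is reported.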
The invention protects a primer pair group for detecting the 28 SNP loci in the human genome.
The primer pair group consists of the following items (1) to (28); each primer pair consists of the two single-stranded DNAs shown under the stated sequence numbers in the sequence table:
(1) primer pair 1, for detecting rs10483251: sequence 1 and sequence 2;
(2) primer pair 2, for detecting rs12142199: sequence 4 and sequence 5;
(3) primer pair 3, for detecting rs1229984: sequence 7 and sequence 8;
(4) primer pair 4, for detecting rs12402499: sequence 10 and sequence 11;
(5) primer pair 5, for detecting rs12498138: sequence 13 and sequence 14;
(6) primer pair 6, for detecting rs12594144: sequence 16 and sequence 17;
(7) primer pair 7, for detecting rs1426654: sequence 19 and sequence 20;
(8) primer pair 8, for detecting rs1557553: sequence 22 and sequence 23;
(9) primer pair 9, for detecting rs16891982: sequence 25 and sequence 26;
(10) primer pair 10, for detecting rs17822931: sequence 28 and sequence 29;
(11) primer pair 11, for detecting rs1871534: sequence 31 and sequence 32;
(12) primer pair 12, for detecting rs2080161: sequence 34 and sequence 35;
(13) primer pair 13, for detecting rs2139931: sequence 37 and sequence 38;
(14) primer pair 14, for detecting rs2789823: sequence 40 and sequence 41;
(15) primer pair 15, for detecting rs2814778: sequence 43 and sequence 44;
(16) primer pair 16, for detecting rs3751050: sequence 46 and sequence 47;
(17) primer pair 17, for detecting rs3827760: sequence 49 and sequence 50;
(18) primer pair 18, for detecting rs4657449: sequence 52 and sequence 53;
(19) primer pair 19, for detecting rs4749305: sequence 55 and sequence 56;
(20) primer pair 20, for detecting rs4792928: sequence 58 and sequence 59;
(21) primer pair 21, for detecting rs6054465: sequence 61 and sequence 62;
(22) primer pair 22, for detecting rs6437783: sequence 64 and sequence 65;
(23) primer pair 23, for detecting rs715605: sequence 67 and sequence 68;
(24) primer pair 24, for detecting rs8072587: sequence 70 and sequence 71;
(25) primer pair 25, for detecting rs8137373: sequence 73 and sequence 74;
(26) primer pair 26, for detecting rs9522149: sequence 76 and sequence 77;
(27) primer pair 27, for detecting rs9809818: sequence 79 and sequence 80;
(28) primer pair 28, for detecting rs9908046: sequence 82 and sequence 83.
In the primer pair group, the molar ratio of primer pairs 1 to 28, taken in order, is 0.8 : 0.6 : 1.5 : 2.7 : 3 : 0.8 : 2 : 1.3 : 1 : 1 : 1.5 : 1 : 2.7 : 0.5 : 0.8 : 2.5 : 0.4 : 1.6 : 2.5 : 4 : 3 : 0.8 : 0.8 : 3 : 3 : 6 : 0.6 : 0.6. The molar ratio of the two primers within each primer pair is 1 : 1.
The invention protects a single-stranded DNA group for detecting the 28 SNP loci in the human genome.
The single-stranded DNA group consists of the following items (1) to (28); each primer pair consists of the two single-stranded DNAs, and each extension primer is the single-stranded DNA, shown under the stated sequence numbers in the sequence table:
(1) for detecting rs10483251: primer pair 1 (sequence 1 and sequence 2) and extension primer 1 (sequence 3);
(2) for detecting rs12142199: primer pair 2 (sequence 4 and sequence 5) and extension primer 2 (sequence 6);
(3) for detecting rs1229984: primer pair 3 (sequence 7 and sequence 8) and extension primer 3 (sequence 9);
(4) for detecting rs12402499: primer pair 4 (sequence 10 and sequence 11) and extension primer 4 (sequence 12);
(5) for detecting rs12498138: primer pair 5 (sequence 13 and sequence 14) and extension primer 5 (sequence 15);
(6) for detecting rs12594144: primer pair 6 (sequence 16 and sequence 17) and extension primer 6 (sequence 18);
(7) for detecting rs1426654: primer pair 7 (sequence 19 and sequence 20) and extension primer 7 (sequence 21);
(8) for detecting rs1557553: primer pair 8 (sequence 22 and sequence 23) and extension primer 8 (sequence 24);
(9) for detecting rs16891982: primer pair 9 (sequence 25 and sequence 26) and extension primer 9 (sequence 27);
(10) for detecting rs17822931: primer pair 10 (sequence 28 and sequence 29) and extension primer 10 (sequence 30);
(11) for detecting rs1871534: primer pair 11 (sequence 31 and sequence 32) and extension primer 11 (sequence 33);
(12) for detecting rs2080161: primer pair 12 (sequence 34 and sequence 35) and extension primer 12 (sequence 36);
(13) for detecting rs2139931: primer pair 13 (sequence 37 and sequence 38) and extension primer 13 (sequence 39);
(14) for detecting rs2789823: primer pair 14 (sequence 40 and sequence 41) and extension primer 14 (sequence 42);
(15) for detecting rs2814778: primer pair 15 (sequence 43 and sequence 44) and extension primer 15 (sequence 45);
(16) for detecting rs3751050: primer pair 16 (sequence 46 and sequence 47) and extension primer 16 (sequence 48);
(17) for detecting rs3827760: primer pair 17 (sequence 49 and sequence 50) and extension primer 17 (sequence 51);
(18) for detecting rs4657449: primer pair 18 (sequence 52 and sequence 53) and extension primer 18 (sequence 54);
(19) for detecting rs4749305: primer pair 19 (sequence 55 and sequence 56) and extension primer 19 (sequence 57);
(20) for detecting rs4792928: primer pair 20 (sequence 58 and sequence 59) and extension primer 20 (sequence 60);
(21) for detecting rs6054465: primer pair 21 (sequence 61 and sequence 62) and extension primer 21 (sequence 63);
(22) for detecting rs6437783: primer pair 22 (sequence 64 and sequence 65) and extension primer 22 (sequence 66);
(23) for detecting rs715605: primer pair 23 (sequence 67 and sequence 68) and extension primer 23 (sequence 69);
(24) for detecting rs8072587: primer pair 24 (sequence 70 and sequence 71) and extension primer 24 (sequence 72);
(25) for detecting rs8137373: primer pair 25 (sequence 73 and sequence 74) and extension primer 25 (sequence 75);
(26) for detecting rs9522149: primer pair 26 (sequence 76 and sequence 77) and extension primer 26 (sequence 78);
(27) for detecting rs9809818: primer pair 27 (sequence 79 and sequence 80) and extension primer 27 (sequence 81);
(28) for detecting rs9908046: primer pair 28 (sequence 82 and sequence 83) and extension primer 28 (sequence 84).
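The numbering in items (1) to (28) follows a regular pattern: locus n uses three consecutive sequence-table entries, two for the amplification primer pair and one for the extension primer. A small Python sketch of that mapping is given below; the function name `sequence_ids` and the dictionary keys are illustrative choices, not part of the patent.

```python
def sequence_ids(pair_number):
    """Map primer pair n (1..28) to its entries in the sequence table.

    Pattern read off the list above: pair n uses sequences 3n-2 and 3n-1,
    and extension primer n is sequence 3n.
    """
    if not 1 <= pair_number <= 28:
        raise ValueError("pair_number must be between 1 and 28")
    base = 3 * (pair_number - 1)
    return {"forward": base + 1, "reverse": base + 2, "extension": base + 3}
```

For example, `sequence_ids(13)` gives sequences 37, 38 and 39, matching item (13) for rs2139931.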
In the single-stranded DNA group, the molar ratio of primer pairs 1 to 28, taken in order, is 0.8 : 0.6 : 1.5 : 2.7 : 3 : 0.8 : 2 : 1.3 : 1 : 1 : 1.5 : 1 : 2.7 : 0.5 : 0.8 : 2.5 : 0.4 : 1.6 : 2.5 : 4 : 3 : 0.8 : 0.8 : 3 : 3 : 6 : 0.6 : 0.6; the molar ratio of the two primers within each primer pair is 1 : 1. The molar ratio of extension primers 1 to 28, taken in order, is 0.45 : 0.35 : 1.2 : 1.7 : 4 : 1 : 3 : 0.8 : 1 : 0.8 : 1.8 : 1.1 : 1.1 : 1.1 : 0.8 : 1.4 : 1 : 0.5 : 1.3 : 1.8 : 1.6 : 0.9 : 0.9 : 2 : 2.3 : 3 : 1 : 0.6.
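To make the two ratio series easier to check, the sketch below stores them as Python lists in primer order and converts a series into each primer's share of the total. The list and function names are hypothetical; only the numbers are taken from the text.

```python
# Molar ratios for primer pairs 1..28, in the order given in the text.
AMPLIFICATION_RATIO = [
    0.8, 0.6, 1.5, 2.7, 3, 0.8, 2, 1.3, 1, 1, 1.5, 1, 2.7, 0.5,
    0.8, 2.5, 0.4, 1.6, 2.5, 4, 3, 0.8, 0.8, 3, 3, 6, 0.6, 0.6,
]

# Molar ratios for extension primers 1..28, in the order given in the text.
EXTENSION_RATIO = [
    0.45, 0.35, 1.2, 1.7, 4, 1, 3, 0.8, 1, 0.8, 1.8, 1.1, 1.1, 1.1,
    0.8, 1.4, 1, 0.5, 1.3, 1.8, 1.6, 0.9, 0.9, 2, 2.3, 3, 1, 0.6,
]


def shares(ratios):
    """Convert a ratio series into fractions that sum to 1."""
    total = sum(ratios)
    return [value / total for value in ratios]
```

Both lists have 28 entries; the largest amplification ratio belongs to primer pair 26 and the largest extension ratio to extension primer 5.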
The invention also protects a kit for distinguishing five continental populations.
The kit contains the primer pair group or the single-stranded DNA group described above.
The kit may further contain, as required, at least one of the following: dNTPs, DNA polymerase, alkaline phosphatase.
The invention also protects the use of a substance for detecting the 28 SNP loci in either of the following:
(a) constructing a five-continent population genotyping database;
(b) distinguishing five continental populations.
The substance for detecting the 28 SNP loci may be the primer pair group, the single-stranded DNA group or the kit.
The invention also provides a method for constructing the five-continent population genotyping database.
The method specifically comprises the following steps:
(a1) collecting genotypes at the 28 SNP loci for the five continental populations from the 1000 Genomes Project and the Human Genome Diversity Project to form a raw genotype library;
(a2) performing Structure cluster analysis on all samples in the raw genotype library, and selecting the samples whose ancestry component exceeds 90% to form the five-continent population genotyping database.
Structure is free, open-source, multi-platform (Windows, Mac, Linux) bioinformatics software that applies a model-based clustering method to multilocus genotype data from unlinked markers in order to infer population structure; it is widely used in human genetics, population genetics, forensic genetics and related fields. The parameters used were a burn-in of 10,000, 10,000 repetitions and the admixture model; the run yields the ancestry component proportions of each sample.
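Step (a2) amounts to a threshold filter on the per-sample ancestry proportions that Structure reports. The Python sketch below shows one way to express that filter, assuming the proportions have already been parsed into a `{sample_id: [proportion per cluster]}` mapping; that layout, the function name `select_reference` and the sample values in the usage note are hypothetical, and parsing of the Structure output file is not shown.

```python
def select_reference(q_rows, threshold=0.90):
    """Keep samples whose largest ancestry proportion exceeds the threshold.

    q_rows: {sample_id: [proportion for each inferred cluster]}
    Returns {sample_id: index of the dominant cluster} for retained samples.
    """
    selected = {}
    for sample_id, proportions in q_rows.items():
        top = max(proportions)
        if top > threshold:
            selected[sample_id] = proportions.index(top)
    return selected
```

With made-up input such as `{"s1": [0.95, 0.02, 0.01, 0.01, 0.01], "s2": [0.60, 0.40, 0.0, 0.0, 0.0]}`, only `s1` is retained, assigned to cluster index 0.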
The invention also protects a method for distinguishing five continental populations.
The method specifically comprises the following steps:
(b1) constructing a five-continent population genotyping database according to the method above;
(b2) extracting genomic DNA from the person to be tested and detecting the 28 SNP loci to obtain that person's raw genotype data at the 28 SNP loci;
(b3) comparing, by an analysis method, the raw genotype data of the person to be tested at the 28 SNP loci with the five-continent population genotyping database, thereby determining which of the five continental populations the person belongs to.
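Step (b3) does not name the analysis method at this point. As one possible illustration only, the Python sketch below scores a genotype against each reference population by a log-likelihood computed from allele frequencies under Hardy-Weinberg proportions and picks the best-scoring population. This likelihood approach, the function names, the frequency floor and the data layout are all assumptions made for the example, not the patent's stated procedure.

```python
import math


def genotype_log_likelihood(genotypes, ref_freqs, floor=0.01):
    """Log-likelihood of a genotype profile in one population.

    genotypes: {locus: count of the reference allele (0, 1 or 2)}
    ref_freqs: {locus: reference-allele frequency in that population}
    floor: assumed lower bound that keeps log() away from zero.
    """
    total = 0.0
    for locus, count in genotypes.items():
        p = min(max(ref_freqs[locus], floor), 1.0 - floor)
        q = 1.0 - p
        # Hardy-Weinberg genotype probabilities for 0, 1 or 2 copies.
        prob = (q * q, 2.0 * p * q, p * p)[count]
        total += math.log(prob)
    return total


def best_population(genotypes, database):
    """Return the best-scoring population and all scores.

    database: {population: {locus: reference-allele frequency}}
    """
    scores = {
        pop: genotype_log_likelihood(genotypes, freqs)
        for pop, freqs in database.items()
    }
    return max(scores, key=scores.get), scores
```

In practice the frequencies would come from the reference database built in step (b1); any values used to try the sketch out are placeholders rather than real population data.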
In both methods, the 28 SNP loci are detected with the primer pair group, the single-stranded DNA group or the kit; the 28-plex PCR amplification uses an annealing temperature of 55 ℃.
The five continents referred to above are East Asia, Europe, Africa, Oceania and the Americas.
To distinguish the five continental populations (East Asia, Europe, Africa, Oceania and the Americas), the invention screened out 28 SNP loci and built a multiplex detection system. The system was used to type 712 samples from 16 populations; these results were combined with 20 populations from the 1000 Genomes Project and 2 populations from the CEPH panel, giving genotype data for 2804 individuals in total, and the performance of the system was evaluated by cluster analysis and principal component analysis. Samples with an ancestry component above 90% were selected to build a reference population genotyping database. Population of origin was then inferred for 140 individuals of known ancestry using population match probabilities, individual principal component analysis and related analyses, to evaluate how well the system discriminates population origin in real samples. The results show that the system distinguishes the five continental populations and has some ability to resolve admixed populations, and that for samples of known origin the inferred ancestry components and population matches are consistent with the recorded source information. The system can also infer the ancestral composition of an individual and can be applied in actual casework.
Drawings
FIG. 1 shows the typing result at a DNA concentration of 5 ng/μL (28-plex SNP detection). In the figure, 1-28 denote the 28 SNPs (corresponding to the numbers in Table 2); each number is followed by the corresponding genotype call (where the complementary strand was detected, the call shown after the number is the complement of the base given in Table 2).
FIG. 2 is a principal component analysis of genotype frequencies in the 38 populations based on the 28 SNPs. A: African individuals; B: American individuals; C: East Asian individuals; D: European individuals; E: admixed populations; F: Oceanian individuals.
FIG. 3 shows the Structure analysis results for the 38 populations based on the 28 SNPs.
FIG. 4 shows the population assignment analysis of the test samples. A: African individuals; B: American individuals; C: East Asian individuals (Guangxi Han); D: East Asian individuals (Henan Han); E: European individuals; F: Uyghur individuals; G: Oceanian individuals.
FIG. 5 shows the results of testing the "9947" and "HWQ" samples with the system of the invention and with the published (literature) system. A: sample 9947, literature system; B: sample HWQ, literature system; C: sample 9947, system of the invention; D: sample HWQ, system of the invention. In A-D, 1-28 denote the 28 SNPs (corresponding to the numbers in the "numbering" column of Table 6); each number is followed by the genotype call.
Detailed Description
The experimental procedures used in the following examples are all conventional procedures unless otherwise specified.
Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.
Example 1 Construction and application of the intercontinental population genotyping database of the invention
Materials and methods
1. Sample information
From public databases, 2000 samples from 20 populations in the 1000 Genomes Project and 92 samples from 2 populations in the Human Genome Diversity Project (HGDP-CEPH) were selected; the samples tested in this work comprise 712 samples from 16 populations. In total, 2804 individuals from 38 populations were used as validation samples. Details are shown in Table 1.
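The stated totals follow directly from the three sample sets. A short Python check of that arithmetic (the dictionary layout and names are illustrative only):

```python
# Counts as stated in the sample information paragraph.
SAMPLE_SETS = {
    "1000 Genomes": {"populations": 20, "samples": 2000},
    "HGDP-CEPH": {"populations": 2, "samples": 92},
    "tested in this work": {"populations": 16, "samples": 712},
}

total_populations = sum(s["populations"] for s in SAMPLE_SETS.values())
total_samples = sum(s["samples"] for s in SAMPLE_SETS.values())
```

This gives 38 populations and 2804 individuals, in agreement with the text.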
Table 1 Population sample information
Figure BDA0001359351980000061
Note: italics indicate the tested populations, and parentheses give the number of test samples selected.
2. Sources of SNPs sites
The "Global AIMs Nano set" of 31 AIMs (de la Puente M, Santos C, Fondevila M, et al. The Global AIMs Nano set: A 31-plex SNaPshot assay of ancestry-informative SNPs [J]. Forensic Science International: Genetics, 2016, 22: 81-88.) was analyzed, its 3 triallelic SNP loci were removed, and the remaining 28 biallelic SNP loci were used to build the multiplex system. Locus information is shown in Table 2.
Table 2 Details of the 28 SNP loci
Figure BDA0001359351980000071
3. Multiplex PCR amplification
The multiplex PCR reaction volume is 5 μL, containing 0.6 μL of 10× PCR buffer (containing 15 mmol/L Mg2+), 0.9 μL of 25 mmol/L MgCl2, 0.1 μL of 10 mmol/L dNTPs, 0.7 μL of multiplex amplification primer mix, 0.1 μL of 5 U/μL Plus DNA polymerase (QIAGEN, Germany; product name shown as image Figure BDA0001359351980000072) and 1 μL of 5 ng/μL template DNA, made up to 5 μL with water. PCR conditions: 95 ℃ for 10 min; 40 cycles of 95 ℃ for 30 s, 55 ℃ for 40 s and 72 ℃ for 1 min; final extension at 72 ℃ for 20 min. The purification reaction volume is 7.5 μL, containing 5 μL of amplification product, 1 μL of H2O, 0.2 μL of Exo I (10 U/μL), 1 μL of SAP (1 U/μL) and 0.3 μL of 10× SAP buffer; after thorough mixing, the reaction is incubated at 37 ℃ for 45 min and the enzymes are then inactivated at 85 ℃ for 15 min.
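The recipes above can be checked with simple volume bookkeeping. The Python sketch below computes the water needed to reach the 5 μL PCR volume, confirms that the purification components add up to 7.5 μL, and sums the programmed hold times of the PCR. It reads the SAP buffer entry as 0.3 μL of 10× buffer, which is the reading under which the stated 7.5 μL total adds up; all names are illustrative, and ramp times between temperatures are ignored.

```python
# Volumes in microlitres, taken from the reaction recipes in the text.
PCR_COMPONENTS_UL = {
    "10x PCR buffer": 0.6,
    "25 mmol/L MgCl2": 0.9,
    "10 mmol/L dNTP": 0.1,
    "amplification primer mix": 0.7,
    "DNA polymerase (5 U/uL)": 0.1,
    "template DNA (5 ng/uL)": 1.0,
}
PCR_TOTAL_UL = 5.0

# Assumption: the SAP buffer entry is read as 0.3 uL of 10x buffer.
CLEANUP_COMPONENTS_UL = {
    "amplification product": 5.0,
    "water": 1.0,
    "Exo I (10 U/uL)": 0.2,
    "SAP (1 U/uL)": 1.0,
    "10x SAP buffer": 0.3,
}


def water_to_add(components, total):
    """Volume of water needed to bring the listed components up to total."""
    return round(total - sum(components.values()), 3)


def programme_seconds(initial_s, cycles, step_seconds, final_s):
    """Sum of programmed hold times, ignoring ramping between steps."""
    return initial_s + cycles * sum(step_seconds) + final_s


pcr_water_ul = water_to_add(PCR_COMPONENTS_UL, PCR_TOTAL_UL)
cleanup_total_ul = round(sum(CLEANUP_COMPONENTS_UL.values()), 3)
pcr_hold_s = programme_seconds(10 * 60, 40, (30, 40, 60), 20 * 60)
```

Under these readings the PCR needs 1.6 μL of water per reaction and the programmed hold time is 7000 s, a little under two hours.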
The single-base extension reaction was carried out with the Multiplex Kit (ABI, USA; product name shown as image Figure BDA0001359351980000073) in a 5.5 μL volume containing 2 μL of purified PCR product, 1 μL of multiplex extension primer mix and 2.5 μL of SNaPshot mix. Reaction conditions: 96 ℃ for 10 s; then 33 cycles of 59 ℃ for 5 s and 60 ℃ for 30 s. Then 1 μL of SAP (1 U/μL) was added to the reaction product, which was incubated at 37 ℃ for 80 min and then at 85 ℃ for 15 min to remove excess primers and dNTPs.
The purified extension product was analyzed by capillary electrophoresis on a 3130-XL Genetic Analyzer (ABI, USA). The 10 μL loading mix contained 1 μL of purified single-base extension product and 9 μL of a mixture of formamide and the internal size standard GeneScan LIZ-120 (38:1 by volume) (ABI, USA). Electrophoresis parameters: injection time 18 s, injection voltage 3 kV, run voltage 13.4 kV, run time 15 min. Genotypes were called with GeneMapper ID v3.2 software.
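The electrophoresis loading mix can be scaled to a batch with a short calculation. The sketch below assumes that the 38:1 volume ratio refers to formamide : GeneScan LIZ-120 and that 9 μL of this mixture is combined with 1 μL of extension product per sample; the function name and the absence of any pipetting overage are choices made for the example.

```python
def loading_mix(n_samples, mix_per_sample_ul=9.0,
                formamide_parts=38, standard_parts=1):
    """Volumes of formamide and size standard for n_samples wells.

    Assumes the 38:1 ratio is formamide : size standard by volume.
    No overage is added; increase n_samples if spare volume is wanted.
    """
    total = n_samples * mix_per_sample_ul
    parts = formamide_parts + standard_parts
    return {
        "formamide_ul": total * formamide_parts / parts,
        "size_standard_ul": total * standard_parts / parts,
    }
```

For a single sample this works out to roughly 8.77 μL of formamide and 0.23 μL of size standard.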
The primer sequences used to detect each SNP locus, and their concentrations, are listed in Table 3.
TABLE 3 Primer sequences for detecting each SNP locus and their concentrations
Locus | Code | Amplification primer (F) | Amplification primer (R) | Amplification primer concentration (μM) | Extension primer | Number of bases | Extension primer concentration (μM)
rs10483251 | EX-28*22 | Sequence 1 | Sequence 2 | 0.8 | Sequence 3 | 48 | 0.45
rs12142199 | EX-28*23 | Sequence 4 | Sequence 5 | 0.6 | Sequence 6 | 63 | 0.35
rs1229984 | EX-28*2 | Sequence 7 | Sequence 8 | 1.5 | Sequence 9 | 73 | 1.2
rs12402499 | EX-28*24 | Sequence 10 | Sequence 11 | 2.7 | Sequence 12 | 51 | 1.7
rs12498138 | EX-28*25 | Sequence 13 | Sequence 14 | 3 | Sequence 15 | 63 | 4
rs12594144 | EX-28*26 | Sequence 16 | Sequence 17 | 0.8 | Sequence 18 | 40 | 1
rs1426654 | EX-28*3 | Sequence 19 | Sequence 20 | 2 | Sequence 21 | 67 | 3
rs1557553 | EX-28*4 | Sequence 22 | Sequence 23 | 1.3 | Sequence 24 | 93 | 0.8
rs16891982 | EX-28*27 | Sequence 25 | Sequence 26 | 1 | Sequence 27 | 80 | 1
rs17822931 | EX-28*28 | Sequence 28 | Sequence 29 | 1 | Sequence 30 | 59 | 0.8
rs1871534 | EX-28*5 | Sequence 31 | Sequence 32 | 1.5 | Sequence 33 | 105 | 1.8
rs2080161 | EX-28*6 | Sequence 34 | Sequence 35 | 1 | Sequence 36 | 48 | 1.1
rs2139931 | EX-28*7 | Sequence 37 | Sequence 38 | 2.7 | Sequence 39 | 109 | 1.1
rs2789823 | EX-28*8 | Sequence 40 | Sequence 41 | 0.5 | Sequence 42 | 51 | 1.1
rs2814778 | EX-28*9 | Sequence 43 | Sequence 44 | 0.8 | Sequence 45 | 89 | 0.8
rs3751050 | EX-28*10 | Sequence 46 | Sequence 47 | 2.5 | Sequence 48 | 44 | 1.4
rs3827760 | EX-28*11 | Sequence 49 | Sequence 50 | 0.4 | Sequence 51 | 89 | 1
rs4657449 | EX-28*12 | Sequence 52 | Sequence 53 | 1.6 | Sequence 54 | 93 | 0.5
rs4749305 | EX-28*13 | Sequence 55 | Sequence 56 | 2.5 | Sequence 57 | 67 | 1.3
rs4792928 | EX-28*14 | Sequence 58 | Sequence 59 | 4 | Sequence 60 | 99 | 1.8
rs6054465 | EX-28*15 | Sequence 61 | Sequence 62 | 3 | Sequence 63 | 84 | 1.6
rs6437783 | EX-28*16 | Sequence 64 | Sequence 65 | 0.8 | Sequence 66 | 45 | 0.9
rs715605 | EX-28*1 | Sequence 67 | Sequence 68 | 0.8 | Sequence 69 | 113 | 0.9
rs8072587 | EX-28*17 | Sequence 70 | Sequence 71 | 3 | Sequence 72 | 96 | 2
rs8137373 | EX-28*18 | Sequence 73 | Sequence 74 | 3 | Sequence 75 | 100 | 2.3
rs9522149 | EX-28*19 | Sequence 76 | Sequence 77 | 6 | Sequence 78 | 72 | 3
rs9809818 | EX-28*20 | Sequence 79 | Sequence 80 | 0.6 | Sequence 81 | 55 | 1
rs9908046 | EX-28*21 | Sequence 82 | Sequence 83 | 0.6 | Sequence 84 | 109 | 0.6
Second, software and analysis method
1. Principal Component Analysis (PCA)
Principal component analysis was performed in R v3.2.3: a. the 38 populations, comprising public-database samples and test samples, were divided by continent and region into African, American, East Asian, European, Oceanian and admixed populations (Central Asia, Central Asia and South Asia), and population-level principal component analysis was performed on allele frequencies; b. one sample was randomly drawn from each of the 7 test populations (Table 1) for individual-level principal component analysis (performed in R v3.3.2, with the population classification plots drawn using the R package ggplot2).
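To illustrate what a frequency-based principal component analysis computes, the following is a minimal pure-Python sketch. It is not the R workflow used in this example: the function name, the power-iteration approach and the toy allele frequencies are all assumptions chosen for brevity, and only the first principal component is extracted.

```python
import math


def first_principal_component(freqs, iterations=200):
    """Return (scores, explained_fraction) for PC1.

    freqs is a populations x loci matrix of allele frequencies.
    """
    n, m = len(freqs), len(freqs[0])
    means = [sum(row[j] for row in freqs) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in freqs]
    # Covariance matrix between loci (m x m).
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(m)]
        for a in range(m)
    ]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    vec = [1.0 / (j + 1) for j in range(m)]
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(
        vec[a] * sum(cov[a][b] * vec[b] for b in range(m)) for a in range(m)
    )
    total_variance = sum(cov[j][j] for j in range(m))
    scores = [sum(r[j] * vec[j] for j in range(m)) for r in centered]
    return scores, eigenvalue / total_variance


# Hypothetical frequencies for 4 populations at 3 loci (not patent data).
TOY_FREQS = [
    [0.95, 0.10, 0.05],
    [0.90, 0.15, 0.10],
    [0.10, 0.85, 0.90],
    [0.05, 0.90, 0.95],
]

if __name__ == "__main__":
    pc1, explained = first_principal_component(TOY_FREQS)
    print("PC1 scores:", [round(s, 3) for s in pc1])
    print("variance explained by PC1:", round(explained, 3))
```

In the toy data the first two populations receive PC1 scores of one sign and the last two the opposite sign, which is the kind of separation the classification plots visualize.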
2. Cluster analysis
Cluster analysis of the 38 populations in Table 1 was performed with STRUCTURE v2.3.4 (K = 3 to 7) to analyze the genetic structure of each population, and the population clustering plot was drawn with distruct 1.1; individual ancestral components were also calculated for the 7 samples used in the individual principal component analysis.
STRUCTURE is free, open-source, multi-platform (Windows, Mac, Linux) bioinformatics software. It applies a model-based clustering method to multi-locus genotype data from unlinked markers in order to infer population structure, and is widely used in human genetics, population genetics, forensic genetics and related fields. The parameters used were: a burn-in of 10,000, 10,000 repetitions and the admixture model; the run yields the ancestral component proportions of each sample.
3. Random population match probability
The random population match probability was calculated with forensic intelligence software for 140 test samples randomly selected from the 7 populations (marked in Table 1).
Third, experimental results
1. 28-plex SNP detection results
The results are shown in FIG. 1. The 28 AIMs meet the following criteria: 1) their power to discriminate among the five continental populations is well balanced; 2) loci are at least 1 Mb apart, which reduces the probability of linked inheritance. The system uses the widely adopted SNaPshot typing technology, and the alleles at all 28 loci can be typed unambiguously.
2. Evaluation of the discrimination efficiency of the system
(1) Principal component analysis
The results are shown in FIG. 2. Principal component 1 (PC1) and principal component 2 (PC2) together explain 61.3% of the variation, and the 28 loci clearly separate the 38 populations into six clusters. Combined with Table 1, populations that cluster together share the same continental ancestry: the 30 populations from the five continental regions (Africa, Europe, East Asia, America and Oceania) fall into five distinct clusters. Along PC1 the American, East Asian, Oceanian, African and European populations separate in that order, with the East Asian and European populations each relatively concentrated. The 8 admixed populations are more dispersed but lie between the East Asian and European populations; for the admixed populations in South Asia and Europe, this dispersed distribution further illustrates the complexity of their genetic structure, which requires further study. PC2 separates the Oceanian and American populations.
(2) Cluster analysis
Using the 28 AIMs, genetic structure was analyzed for a total of 2804 samples from the 38 populations described above. The results are shown in FIG. 3. When K = 3, the populations cluster into three major groups: Africa and Europe are distinguished, the American, Oceanian and East Asian populations show a consistent ancestral composition, and the genetic composition of the Uyghur and other admixed populations shows a continuous distribution of European and East Asian ancestral components. As K increases, new major ancestral components appear successively in America and Oceania. When K = 6, all individuals correspond to 6 population groups: the African, American, Oceanian, East Asian, admixed and European populations each have an independent ancestral component, which indicates that the Uyghur and South Asian admixed populations have, after long-term fusion and evolution, become transitional populations with relatively stable genetic components. When K increases to 7, a new component appears in the South Asian admixed populations, indicating differences in ancestral origin and genetic structure among them. As K increases further, no further stratification appears in the East Asian populations, indicating that the system cannot further distinguish East Asian sub-populations.
(3) Population-source inference test for individuals
To avoid mixed components and achieve maximum discrimination, K = 6 was selected as the more suitable setting for the system. In addition, for application to actual forensic casework and to improve inference accuracy, individuals whose major ancestral component exceeded 90% (2201 samples in total) were selected as reference samples to construct a reference population genotyping database, and the following statistical methods were used to evaluate the system's ability to infer individual origin.
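The reference-sample filter described above (major ancestral component above 90%) can be sketched as follows. This is an illustrative assumption about how such a filter might be written, not the inventors' code: the function name, the sample identifiers and the ancestry proportions are hypothetical, and the input is taken to be one row of K = 6 proportions per sample.

```python
def select_reference(q_matrix, threshold=0.90):
    """Return {sample_id: cluster_index} for samples whose largest
    ancestry proportion exceeds the threshold."""
    selected = {}
    for sample_id, proportions in q_matrix.items():
        major = max(proportions)
        if major > threshold:
            selected[sample_id] = proportions.index(major)
    return selected


# Hypothetical ancestry proportions for K = 6 (each row sums to 1).
TOY_Q = {
    "S1": [0.96, 0.01, 0.01, 0.01, 0.005, 0.005],
    "S2": [0.45, 0.40, 0.05, 0.05, 0.03, 0.02],
    "S3": [0.02, 0.02, 0.93, 0.01, 0.01, 0.01],
}

if __name__ == "__main__":
    # S2 is dropped because no single component exceeds 0.90.
    print(select_reference(TOY_Q))
```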
A. Likelihood ratio
The random population match probabilities were calculated for 140 samples of known origin against the reference database (which does not contain the 140 test samples), and their likely continental population origins were inferred from the likelihood ratios (LR). The population match probability, i.e. the random match probability (MP), is the estimated probability that a specific genotype combination across a set of loci occurs in a population; it can also be understood as the theoretical probability that a sample drawn at random from the population has that specific DNA profile. The LR values are calculated as follows: the match probability of the population in which the unknown individual's probability is largest is the denominator and the match probabilities of the other populations are the numerators, giving the likelihood ratios for the different populations in turn. The results are shown in Table 4. For 137 of the 140 test samples the inferred ancestral origin was consistent with the sample information. The other 3 samples, all Uyghur, were inferred as admixed, East Asian and European respectively, but their likelihood ratios were all below 100, so the inferred results do not exclude the sample information. In summary, the system inferred the ancestral origin of the test samples with an absolute accuracy of 97.86%, and for the remaining 2.14% the recorded origin was not excluded. Combined with Table 4, the system is more accurate when inferring origin among the five continental populations, whereas inference for admixed populations requires combined analysis of ancestral components and MP values.
In addition, race and ethnicity are not equivalent concepts. The Uyghur live at the Asia-Europe border and are of mixed European and Asian ancestry; there is no sharp boundary in the fusion between different races, and the ethnicity recorded in household registration does not fully correspond to race. The three Uyghur samples did not reach the discrimination threshold, but the inferred results are consistent with their geographical distribution, so several factors should be considered together in later population differentiation.
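The match-probability comparison described in section A can be sketched as follows. This is a simplified illustration under stated assumptions, not the software used in this example: loci are treated as biallelic and independent, genotype frequencies follow Hardy-Weinberg proportions, and the ratio is oriented as the largest match probability over each other population's (invert it if the opposite orientation is intended). All names and allele frequencies are hypothetical.

```python
from math import prod


def genotype_frequency(genotype, p):
    """Hardy-Weinberg frequency of a biallelic genotype.

    genotype is the count (0, 1 or 2) of the reference allele and p is
    that allele's frequency in the population.
    """
    q = 1.0 - p
    return {2: p * p, 1: 2.0 * p * q, 0: q * q}[genotype]


def match_probability(profile, allele_freqs):
    """Product of per-locus genotype frequencies (loci assumed independent)."""
    return prod(genotype_frequency(g, p) for g, p in zip(profile, allele_freqs))


def rank_populations(profile, populations):
    """Return the best-matching population and, for every other population,
    the ratio of the best match probability to its match probability."""
    mp = {name: match_probability(profile, f) for name, f in populations.items()}
    best = max(mp, key=mp.get)
    ratios = {name: mp[best] / value for name, value in mp.items() if name != best}
    return best, ratios


# Hypothetical reference allele frequencies at 3 loci (not patent data).
TOY_POPULATIONS = {
    "A": [0.9, 0.8, 0.1],
    "B": [0.1, 0.2, 0.9],
}
TOY_PROFILE = [2, 2, 0]  # reference-allele counts at the 3 loci

if __name__ == "__main__":
    best, ratios = rank_populations(TOY_PROFILE, TOY_POPULATIONS)
    print("best match:", best)
    print("ratios:", ratios)
```

A real database would need a rule for alleles with an observed frequency of 0 or 1, since these make a match probability zero and the ratio undefined; the sketch does not handle that case.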
TABLE 4 test sample match probability results
Figure BDA0001359351980000101
B. Population-source classification of test individuals
From the 140 test samples, 1 sample was randomly selected from each test population, and individual principal component analysis was performed against the constructed reference database to infer population origin (FIG. 4); the ancestral components calculated for the 7 samples are given in Table 5. The individual principal component analysis (FIG. 4) shows that all 7 samples of known origin fall within the corresponding ancestral population, and the individual ancestral component calculation (Table 5) shows that the major ancestral component of each of the 7 test samples is above 94%.
TABLE 5 results of ancestral source analysis of seven test samples
Figure BDA0001359351980000102
The results of this example show that the SNP multiplex system provided by the invention can effectively perform genetic structure analysis and individual ancestral origin inference for the five populations of East Asia, Europe, Africa, America and Oceania as well as for admixed populations. Considering that today's population is highly mobile over a wide range, the system uses relatively few loci to distinguish the five continental populations. Compared with the previously established 27-plex SNP system, which covers the continental populations of Europe, Asia and Africa, the two systems can cross-validate each other when inferring the ancestral origin of unknown samples in forensic DNA testing, providing more accurate investigative leads for cases.
Example 2 comparison of the present invention with the prior art
This example compares the 28-plex system of the invention with the 31 AIMs of the literature (de la Puente M, Santos C, Fondevila M, et al. The Global AIMs Nano set: A 31-plex SNaPshot assay of ancestry-informative SNPs [J]. Forensic Science International: Genetics, 2016, 22: 81-88.) using a standard European DNA sample (9947) and an East Asian DNA sample (HWQ, an Asian individual from the inventors' laboratory).
28-plex system of the invention: performed as in Example 1.
31 AIMs of the article: performed as described in de la Puente M, Santos C, Fondevila M, et al. The Global AIMs Nano set: A 31-plex SNaPshot assay of ancestry-informative SNPs [J]. Forensic Science International: Genetics, 2016, 22: 81-88, i.e. the amplification and extension primers were mixed at the concentrations given in the article (excluding the three triallelic SNPs).
The results are shown in Table 6 and FIG. 5. For the standard European DNA sample "9947", the 28-plex system of the invention and the article's system gave consistent results. For the East Asian DNA sample "HWQ", the two systems disagreed at two loci (bold underlined in Table 6); singleplex amplification and sequencing of the discordant loci confirmed that the results of the 28-plex system of the invention were correct. Thus, compared with the 31 AIMs of the article, the system of the invention improves the accuracy of detection while using fewer SNP loci, which in this example appears as more accurate results for the East Asian sample.
As can be seen from FIG. 5, when the amplification and extension primers are mixed at the concentrations given in the article (with the three triallelic SNPs removed), the 31 AIMs show the following main problems: incomplete peaks at some loci, unbalanced peaks (large differences in heterozygote peak heights), and overlapping peaks between certain loci that interfere with genotype calling. The 28-plex system of the invention solves these problems as follows: 1. for incomplete peaks, the primer ratios are adjusted; 2. for unbalanced peaks (large differences in heterozygote peak heights), the annealing temperature is lowered while peak specificity is maintained; 3. for overlapping peaks between loci that interfere with genotype calling, the lengths of the extension primers are adjusted. With these adjustments, the 28-plex system of the invention achieves accurate typing.
TABLE 6 Results for "9947" and "HWQ" in the 28-plex system of the invention and the article's system
Figure BDA0001359351980000111
Figure BDA0001359351980000121
<110> Institute of Forensic Science, Ministry of Public Security
<120> Method and system for inferring the five-continent population origin of individuals of unknown origin
<130>GNCLN171048
<160>84
<170>PatentIn version 3.5
<210>1
<211>22
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>1
gcacgttctt aaccttggct at 22
<210>2
<211>21
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>2
ttctgaatat cccacccaca a 21
<210>3
<211>48
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>3
gtgccacgtc gtgaaagtct gacaaggaaa aagttatgtg accagatt 48
<210>4
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>4
aggccttgat gtgcttgaac 20
<210>5
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>5
cgagaaggcc aaccactact 20
<210>6
<211>63
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>6
aacaactgac taaactaggt gccacgtcgt gaaagtctga caatcaaaca tgttcctctg 60
cac 63
<210>7
<211>24
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>7
attctgtaga tggtggctgt agga 24
<210>8
<211>21
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>8
ctgcctcatg gcctaaaatc a 21
<210>9
<211>59
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>9
caactgacta aactaggtgc cacgtcgtga aagtctgaca aaccacgtgg tcatctgtg 59
<210>10
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>10
tgaagggtat tactagtggc 20
<210>11
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>11
ttgacagact tctgcttttg 20
<210>12
<211>51
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>12
aggtgccacg tcgtgaaagt ctgacaactg cttttgattt caagtatcag t 51
<210>13
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>13
tcttcttcag ggaatcctgt 20
<210>14
<211>21
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>14
gagttacata ggatttgcga g 21
<210>15
<211>63
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>15
caactgacta aactaggtgc cacgtcgtga aagtctgaca agggaatcct gttattcaca 60
tta 63
<210>16
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>16
cctacaagac cacccaccag 20
<210>17
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>17
ggacccatgg tcattccata 20
<210>18
<211>40
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>18
cacgtcgtga aagtctgaca agctcccacc ctgaaaaaga 40
<210>19
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>19
aattcaggag ctgaactgcc 20
<210>20
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>20
tgttcagccc ttggattgtc 20
<210>21
<211>67
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>21
ctctctctct ctctctctct ctctctctct ctctctctct ctctctcttt cgctgccatg 60
aaagttg 67
<210>22
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>22
taatacaaga gccgcctgga 20
<210>23
<211>21
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>23
cttgcaagga actgcagcta t 21
<210>24
<211>93
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>24
taaactaggt gccacgtcgt gaaagtctga caacaactga ctaaactagg tgccacgtcg 60
tgaaagtctg acaacccaaa gcccctggaa aaa 93
<210>25
<211>25
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>25
gaataaagtg aggaaaacac ggagt 25
<210>26
<211>25
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>26
gtttctcatc tacgaaagag gagtc 25
<210>27
<211>80
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>27
cacgtcgtga aagtctgaca acaactgact aaactaggtg ccacgtcgtg aaagtctgac 60
aaggttggat gttggggctt 80
<210>28
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>28
cctagagtcc cccaaacctc 20
<210>29
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>29
cacttctggg catctgcttc 20
<210>30
<211>59
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>30
aactgactaa actaggtgcc acgtcgtgaa agtctgacaa ctgcattgcc agtgtactc 59
<210>31
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>31
acatcctgca gaccttcctg 20
<210>32
<211>19
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>32
cagaccttgg gcgtcagat 19
<210>33
<211>105
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>33
ctgacaacaa ctgactaaac taggtgccac gtcgtgaaag tctgacaaca actgactaaa 60
ctaggtgcca cgtcgtgaaa gtctgacaac ctggcagtgg gtgca 105
<210>34
<211>27
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>34
gagtatgata taattttgtt cctgctg 27
<210>35
<211>23
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>35
tggactttat gggttgttgt ttt 23
<210>36
<211>48
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>36
cacgtcgtga aagtctgaca attttttgtt ttttttttgc actcatca 48
<210>37
<211>22
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>37
agtcttggct agggcgttag ta 22
<210>38
<211>22
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>38
ctcctagtca tggttgatgt gg 22
<210>39
<211>109
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>39
acaacaactg actaaactag gtgccacgtc gtgaaagtct gacaacaact gactaaacta 60
ggtgccacgt cgtgaaagtc tgacaattcg tgttgatgag aaaatttca 109
<210>40
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>40
agagggcttc tgttcacacc 20
<210>41
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>41
atgcaccact actgtccaag 20
<210>42
<211>51
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>42
aaactaggtg ccacgtcgtg aaagtctgac aaggaggtga gcttcacggg g 51
<210>43
<211>21
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>43
aacctgatgg ccctcattag t 21
<210>44
<211>19
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>44
atggcaccgt ttggttcag 19
<210>45
<211>89
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>45
agtctgacaa ctaggtgcca cgtcgtgaaa gtctgacaac taggtgccac gtcgtgaaag 60
tctgacatct cattagtcct tggctctta 89
<210>46
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>46
gaaggctccc aactcgttag 20
<210>47
<211>21
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>47
gtcattaaag tcaacctagg c 21
<210>48
<211>44
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>48
ctctctctct ctctctctct cttgtttagg agagttgaga catc 44
<210>49
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>49
tgctcagctc cacgtacaac 20
<210>50
<211>19
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>50
ctcttcaggc cgaagctct 19
<210>51
<211>89
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>51
actaggtgcc acgtcgtgaa agtctgacaa caactgacta aactaggtgc cacgtcgtga 60
aagtctgaca atggcgccac gttttcaca 89
<210>52
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>52
cccctcggga gaaaacatag 20
<210>53
<211>24
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>53
ttctagagtt gaatgagggt caga 24
<210>54
<211>93
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>54
aaactaggtg ccacgtcgtg aaagtctgac aacaactgac taaactaggt gccacgtcgt 60
gaaagtctga caagagctaa ggaaagatac gtg 93
<210>55
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>55
cagcccaacc tactcctctg 20
<210>56
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>56
tccctacaaa gtggcaaacc 20
<210>57
<211>67
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>57
caacaactga ctaaactagg tgccacgtcg tgaaagtctg acaacagtaa atagtaactc 60
catcttc 67
<210>58
<211>21
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>58
tctctcagga tatccctttg g 21
<210>59
<211>25
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>59
aaaatcttga ttctgtatcg cagtc 25
<210>60
<211>101
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>60
caactgacta aactaggtgc cacgtcgtga aagtctgaca acaactgact aaactaggtg 60
ccacgtcgtg aaagtctgac aacgcagtct actagttgtc c 101
<210>61
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>61
tatggcctca ggttctccac 20
<210>62
<211>21
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>62
cacatgatct caccgtttcc t 21
<210>63
<211>113
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>63
tgacaacaac tgactaaact aggtgccacg tcgtgaaagt ctgacaacaa ctgactaaac 60
taggtgccac gtcgtgaaag tctgacaaca catgcaaaat caggataata atg 113
<210>64
<211>22
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>64
gcaatgagat tagttgcact gg 22
<210>65
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>65
attatatgcc caccctgctc 20
<210>66
<211>44
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>66
tgccacgtcg tgaaagtctg acaactggtt gaggcacact atta 44
<210>67
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>67
cccagctagg gctagacacc 20
<210>68
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>68
tcaaagactg agccatgcac 20
<210>69
<211>113
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>69
gaaagtctga caacaactga ctaaactagg tgccacgtcg tgaaagtctg acaacaactg 60
actaaactag gtgccacgtc gtgaaagtct gacaaccacc ctaaggggac aga 113
<210>70
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>70
tggcaacctc acatggtaga 20
<210>71
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>71
ccaggggagg tagaaagagg 20
<210>72
<211>97
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>72
aactgactaa actaggtgcc acgtcgtgaa agtctgacaa caactgacta aactaggtgc 60
cacgtcgtga aagtctgaca acagtctcct gcccggc 97
<210>73
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>73
ccagagcttt gcagcacttt 20
<210>74
<211>19
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>74
caaggacgca gctctctca 19
<210>75
<211>101
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>75
acaactgact aaactaggtg ccacgtcgtg aaagtctgac aacaactgac taaactaggt 60
gccacgtcgt gaaagtctga caagagtgtt ttgtgggcct c 101
<210>76
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>76
agaaaggaga ggaaacaccg 20
<210>77
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>77
tcagcaactt ctagtcctcg 20
<210>78
<211>55
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>78
ctctctctct ctctctctct ctctctctct gacaatctga ggtccttgca gctcc 55
<210>79
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>79
tgtgtggttt tctcagcgac 20
<210>80
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>80
agcatggtat gagcactgag 20
<210>81
<211>40
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>81
tctctctctc tctctctctc tcctcctaat aagagctggc 40
<210>82
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>82
ccttggcatg ttcctctctc 20
<210>83
<211>24
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>83
tcagaggaat tagaaaggcc taaa 24
<210>84
<211>109
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>84
agtctgacaa caactgacta aactaggtgc cacgtcgtga aagtctgaca acaactgact 60
aaactaggtg ccacgtcgtg aaagtctgac aaggaggtag gagcaccca 109
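Because each record in the listing above declares its length in the <211> field, the declared length can be compared with the number of bases actually present. The following is a minimal sketch of such a check; the function names are hypothetical, the parser assumes the plain-text layout shown above (numeric tags in angle brackets, bases in lowercase blocks followed by a running position count), and the sample text reproduces sequences 2 and 6 from the listing.

```python
import re


def parse_listing(text):
    """Return a list of (seq_id, declared_length, sequence) tuples."""
    records = []
    for block in re.split(r"(?=<210>)", text):
        m_id = re.search(r"<210>\s*(\d+)", block)
        m_len = re.search(r"<211>\s*(\d+)", block)
        m_seq = re.search(r"<400>\s*\d+\s*(.*)", block, flags=re.S)
        if not (m_id and m_len and m_seq):
            continue
        # Keep only tokens made of base letters; this skips the running
        # position counts printed at the end of each sequence line.
        tokens = m_seq.group(1).split()
        seq = "".join(t for t in tokens if re.fullmatch(r"[acgtn]+", t))
        records.append((int(m_id.group(1)), int(m_len.group(1)), seq))
    return records


def length_mismatches(text):
    """Return (seq_id, declared_length, actual_length) where they differ."""
    return [
        (seq_id, declared, len(seq))
        for seq_id, declared, seq in parse_listing(text)
        if declared != len(seq)
    ]


SAMPLE = """<210>2
<211>21
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>2
ttctgaatat cccacccaca a 21
<210>6
<211>63
<212>DNA
<213> Artificial sequence
<220>
<223>
<400>6
aacaactgac taaactaggt gccacgtcgt gaaagtctga caatcaaaca tgttcctctg 60
cac 63
"""

if __name__ == "__main__":
    for seq_id, declared, seq in parse_listing(SAMPLE):
        print(seq_id, declared, len(seq))
    print("mismatches:", length_mismatches(SAMPLE))
```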

Claims (11)

1. Use of a combination of 28 SNP loci in any of the following applications:
(a) constructing a genotyping database of the five continental populations;
(b) distinguishing the five continental populations;
the 28 SNP loci being: rs10483251, rs12142199, rs1229984, rs12402499, rs12498138, rs12594144, rs1426654, rs1557553, rs16891982, rs17822931, rs1871534, rs2080161, rs2139931, rs2789823, rs2814778, rs3751050, rs3827760, rs4657449, rs4749305, rs4792928, rs6054465, rs6437783, rs715605, rs8072587, rs8137373, rs9522149, rs9809818 and rs9908046.
2. A primer pair group for detecting 28 SNP loci in a human genome; the 28 SNP loci are: rs10483251, rs12142199, rs1229984, rs12402499, rs12498138, rs12594144, rs1426654, rs1557553, rs16891982, rs17822931, rs1871534, rs2080161, rs2139931, rs2789823, rs2814778, rs3751050, rs3827760, rs4657449, rs4749305, rs4792928, rs6054465, rs6437783, rs715605, rs8072587, rs8137373, rs9522149, rs9809818 and rs9908046; the primer pair group consists of the following (1) to (28):
(1) the primer pair 1 for detecting rs10483251 consists of two single-stranded DNAs shown as a sequence 1 and a sequence 2 in a sequence table;
(2) a primer pair 2 for detecting rs12142199, which consists of two single-stranded DNAs shown as a sequence 4 and a sequence 5 in a sequence table;
(3) the primer pair 3 for detecting rs1229984 consists of two single-stranded DNAs shown as a sequence 7 and a sequence 8 in a sequence table;
(4) the primer pair 4 for detecting rs12402499 consists of two single-stranded DNAs shown as a sequence 10 and a sequence 11 in a sequence table;
(5) the primer pair 5 for detecting rs12498138 consists of two single-stranded DNAs shown as a sequence 13 and a sequence 14 in a sequence table;
(6) the primer pair 6 for detecting rs12594144 consists of two single-stranded DNAs shown as a sequence 16 and a sequence 17 in a sequence table;
(7) primer pair 7 for detecting rs1426654, which consists of two single-stranded DNAs shown as sequence 19 and sequence 20 in the sequence table;
(8) the primer pair 8 for detecting rs1557553 consists of two single-stranded DNAs shown as a sequence 22 and a sequence 23 in a sequence table;
(9) a primer pair 9 for detecting rs16891982, which consists of two single-stranded DNAs shown as a sequence 25 and a sequence 26 in a sequence table;
(10) the primer pair 10 for detecting rs17822931 consists of two single-stranded DNAs shown as a sequence 28 and a sequence 29 in a sequence table;
(11) a primer pair 11 for detecting rs1871534, which consists of two single-stranded DNAs shown as a sequence 31 and a sequence 32 in a sequence table;
(12) the primer pair 12 for detecting rs2080161 consists of two single-stranded DNAs shown as a sequence 34 and a sequence 35 in a sequence table;
(13) the primer pair 13 for detecting rs2139931 consists of two single-stranded DNAs shown as a sequence 37 and a sequence 38 in a sequence table;
(14) the primer pair 14 for detecting rs2789823 consists of two single-stranded DNAs shown as a sequence 40 and a sequence 41 in a sequence table;
(15) primer pair 15 for detecting rs2814778, which consists of two single-stranded DNAs shown as sequence 43 and sequence 44 in the sequence table;
(16) the primer pair 16 for detecting rs3751050 consists of two single-stranded DNAs shown as a sequence 46 and a sequence 47 in a sequence table;
(17) the primer pair 17 for detecting rs3827760 consists of two single-stranded DNAs shown as a sequence 49 and a sequence 50 in a sequence table;
(18) the primer pair 18 for detecting rs4657449 consists of two single-stranded DNAs shown as a sequence 52 and a sequence 53 in a sequence table;
(19) the primer pair 19 for detecting rs4749305 consists of two single-stranded DNAs shown as a sequence 55 and a sequence 56 in a sequence table;
(20) the primer pair 20 for detecting rs4792928 consists of two single-stranded DNAs shown as a sequence 58 and a sequence 59 in a sequence table;
(21) the primer pair 21 for detecting rs6054465 consists of two single-stranded DNAs shown as a sequence 61 and a sequence 62 in a sequence table;
(22) the primer pair 22 for detecting rs6437783 consists of two single-stranded DNAs shown as a sequence 64 and a sequence 65 in a sequence table;
(23) the primer pair 23 for detecting rs715605 consists of two single-stranded DNAs shown as a sequence 67 and a sequence 68 in a sequence table;
(24) the primer pair 24 for detecting rs8072587 consists of two single-stranded DNAs shown as a sequence 70 and a sequence 71 in a sequence table;
(25) the primer pair 25 for detecting rs8137373 consists of two single-stranded DNAs shown as a sequence 73 and a sequence 74 in a sequence table;
(26) the primer pair 26 for detecting rs9522149 consists of two single-stranded DNAs shown as a sequence 76 and a sequence 77 in a sequence table;
(27) the primer pair 27 for detecting rs9809818 consists of two single-stranded DNAs shown as a sequence 79 and a sequence 80 in a sequence table;
(28) the primer pair 28 for detecting rs9908046 consists of two single-stranded DNAs shown as a sequence 82 and a sequence 83 in a sequence table.
3. The primer pair group according to claim 2, wherein: in the primer pair group, the molar ratio of the primer pair 1, the primer pair 2, the primer pair 3, the primer pair 4, the primer pair 5, the primer pair 6, the primer pair 7, the primer pair 8, the primer pair 9, the primer pair 10, the primer pair 11, the primer pair 12, the primer pair 13, the primer pair 14, the primer pair 15, the primer pair 16, the primer pair 17, the primer pair 18, the primer pair 19, the primer pair 20, the primer pair 21, the primer pair 22, the primer pair 23, the primer pair 24, the primer pair 25, the primer pair 26, the primer pair 27 and the primer pair 28 is 0.8:0.6:1.5:2.7:3:0.8:2:1.3:1:1:1.5:1:2.7:0.5:0.8:2.5:0.4:1.6:2.5:4:3:0.8:0.8:3:3:6:0.6:0.6;
the molar ratio of the two primers in each primer pair is 1:1.
4. A single-stranded DNA group for detecting 28 SNP loci in a human genome; the 28 SNP loci are: rs10483251, rs12142199, rs1229984, rs12402499, rs12498138, rs12594144, rs1426654, rs1557553, rs16891982, rs17822931, rs1871534, rs2080161, rs2139931, rs2789823, rs2814778, rs3751050, rs3827760, rs4657449, rs4749305, rs4792928, rs6054465, rs6437783, rs715605, rs8072587, rs8137373, rs9522149, rs9809818 and rs9908046; the single-stranded DNA group consists of the following (1) to (28):
(1) a primer pair 1 and an extension primer 1 for detecting rs10483251; the primer pair 1 consists of the two single-stranded DNAs shown as sequence 1 and sequence 2 in the sequence listing; the extension primer 1 is the single-stranded DNA shown as sequence 3 in the sequence listing;
(2) a primer pair 2 and an extension primer 2 for detecting rs12142199; the primer pair 2 consists of the two single-stranded DNAs shown as sequence 4 and sequence 5 in the sequence listing; the extension primer 2 is the single-stranded DNA shown as sequence 6 in the sequence listing;
(3) a primer pair 3 and an extension primer 3 for detecting rs1229984; the primer pair 3 consists of the two single-stranded DNAs shown as sequence 7 and sequence 8 in the sequence listing; the extension primer 3 is the single-stranded DNA shown as sequence 9 in the sequence listing;
(4) a primer pair 4 and an extension primer 4 for detecting rs12402499; the primer pair 4 consists of the two single-stranded DNAs shown as sequence 10 and sequence 11 in the sequence listing; the extension primer 4 is the single-stranded DNA shown as sequence 12 in the sequence listing;
(5) a primer pair 5 and an extension primer 5 for detecting rs12498138; the primer pair 5 consists of the two single-stranded DNAs shown as sequence 13 and sequence 14 in the sequence listing; the extension primer 5 is the single-stranded DNA shown as sequence 15 in the sequence listing;
(6) a primer pair 6 and an extension primer 6 for detecting rs12594144; the primer pair 6 consists of the two single-stranded DNAs shown as sequence 16 and sequence 17 in the sequence listing; the extension primer 6 is the single-stranded DNA shown as sequence 18 in the sequence listing;
(7) a primer pair 7 and an extension primer 7 for detecting rs1426654; the primer pair 7 consists of the two single-stranded DNAs shown as sequence 19 and sequence 20 in the sequence listing; the extension primer 7 is the single-stranded DNA shown as sequence 21 in the sequence listing;
(8) a primer pair 8 and an extension primer 8 for detecting rs1557553; the primer pair 8 consists of the two single-stranded DNAs shown as sequence 22 and sequence 23 in the sequence listing; the extension primer 8 is the single-stranded DNA shown as sequence 24 in the sequence listing;
(9) a primer pair 9 and an extension primer 9 for detecting rs16891982; the primer pair 9 consists of the two single-stranded DNAs shown as sequence 25 and sequence 26 in the sequence listing; the extension primer 9 is the single-stranded DNA shown as sequence 27 in the sequence listing;
(10) a primer pair 10 and an extension primer 10 for detecting rs17822931; the primer pair 10 consists of the two single-stranded DNAs shown as sequence 28 and sequence 29 in the sequence listing; the extension primer 10 is the single-stranded DNA shown as sequence 30 in the sequence listing;
(11) a primer pair 11 and an extension primer 11 for detecting rs1871534; the primer pair 11 consists of the two single-stranded DNAs shown as sequence 31 and sequence 32 in the sequence listing; the extension primer 11 is the single-stranded DNA shown as sequence 33 in the sequence listing;
(12) a primer pair 12 and an extension primer 12 for detecting rs2080161; the primer pair 12 consists of the two single-stranded DNAs shown as sequence 34 and sequence 35 in the sequence listing; the extension primer 12 is the single-stranded DNA shown as sequence 36 in the sequence listing;
(13) a primer pair 13 and an extension primer 13 for detecting rs2139931; the primer pair 13 consists of the two single-stranded DNAs shown as sequence 37 and sequence 38 in the sequence listing; the extension primer 13 is the single-stranded DNA shown as sequence 39 in the sequence listing;
(14) a primer pair 14 and an extension primer 14 for detecting rs2789823; the primer pair 14 consists of the two single-stranded DNAs shown as sequence 40 and sequence 41 in the sequence listing; the extension primer 14 is the single-stranded DNA shown as sequence 42 in the sequence listing;
(15) a primer pair 15 and an extension primer 15 for detecting rs2814778; the primer pair 15 consists of the two single-stranded DNAs shown as sequence 43 and sequence 44 in the sequence listing; the extension primer 15 is the single-stranded DNA shown as sequence 45 in the sequence listing;
(16) a primer pair 16 and an extension primer 16 for detecting rs3751050; the primer pair 16 consists of the two single-stranded DNAs shown as sequence 46 and sequence 47 in the sequence listing; the extension primer 16 is the single-stranded DNA shown as sequence 48 in the sequence listing;
(17) a primer pair 17 and an extension primer 17 for detecting rs3827760; the primer pair 17 consists of the two single-stranded DNAs shown as sequence 49 and sequence 50 in the sequence listing; the extension primer 17 is the single-stranded DNA shown as sequence 51 in the sequence listing;
(18) a primer pair 18 and an extension primer 18 for detecting rs4657449; the primer pair 18 consists of the two single-stranded DNAs shown as sequence 52 and sequence 53 in the sequence listing; the extension primer 18 is the single-stranded DNA shown as sequence 54 in the sequence listing;
(19) a primer pair 19 and an extension primer 19 for detecting rs4749305; the primer pair 19 consists of the two single-stranded DNAs shown as sequence 55 and sequence 56 in the sequence listing; the extension primer 19 is the single-stranded DNA shown as sequence 57 in the sequence listing;
(20) a primer pair 20 and an extension primer 20 for detecting rs4792928; the primer pair 20 consists of the two single-stranded DNAs shown as sequence 58 and sequence 59 in the sequence listing; the extension primer 20 is the single-stranded DNA shown as sequence 60 in the sequence listing;
(21) a primer pair 21 and an extension primer 21 for detecting rs6054465; the primer pair 21 consists of the two single-stranded DNAs shown as sequence 61 and sequence 62 in the sequence listing; the extension primer 21 is the single-stranded DNA shown as sequence 63 in the sequence listing;
(22) a primer pair 22 and an extension primer 22 for detecting rs6437783; the primer pair 22 consists of the two single-stranded DNAs shown as sequence 64 and sequence 65 in the sequence listing; the extension primer 22 is the single-stranded DNA shown as sequence 66 in the sequence listing;
(23) a primer pair 23 and an extension primer 23 for detecting rs715605; the primer pair 23 consists of the two single-stranded DNAs shown as sequence 67 and sequence 68 in the sequence listing; the extension primer 23 is the single-stranded DNA shown as sequence 69 in the sequence listing;
(24) a primer pair 24 and an extension primer 24 for detecting rs8072587; the primer pair 24 consists of the two single-stranded DNAs shown as sequence 70 and sequence 71 in the sequence listing; the extension primer 24 is the single-stranded DNA shown as sequence 72 in the sequence listing;
(25) a primer pair 25 and an extension primer 25 for detecting rs8137373; the primer pair 25 consists of the two single-stranded DNAs shown as sequence 73 and sequence 74 in the sequence listing; the extension primer 25 is the single-stranded DNA shown as sequence 75 in the sequence listing;
(26) a primer pair 26 and an extension primer 26 for detecting rs9522149; the primer pair 26 consists of the two single-stranded DNAs shown as sequence 76 and sequence 77 in the sequence listing; the extension primer 26 is the single-stranded DNA shown as sequence 78 in the sequence listing;
(27) a primer pair 27 and an extension primer 27 for detecting rs9809818; the primer pair 27 consists of the two single-stranded DNAs shown as sequence 79 and sequence 80 in the sequence listing; the extension primer 27 is the single-stranded DNA shown as sequence 81 in the sequence listing;
(28) a primer pair 28 and an extension primer 28 for detecting rs9908046; the primer pair 28 consists of the two single-stranded DNAs shown as sequence 82 and sequence 83 in the sequence listing; the extension primer 28 is the single-stranded DNA shown as sequence 84 in the sequence listing.
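The numbering in items (1) to (28) follows a regular pattern: item k uses sequences 3k-2 and 3k-1 for its primer pair and sequence 3k for its extension primer. The following minimal sketch only restates that bookkeeping; the function name `sequence_numbers` and the returned dictionary layout are illustrative assumptions and are not part of the claims.

```python
# rs IDs in the order of items (1) to (28) of claim 4.
SNP_IDS = [
    "rs10483251", "rs12142199", "rs1229984", "rs12402499", "rs12498138",
    "rs12594144", "rs1426654", "rs1557553", "rs16891982", "rs17822931",
    "rs1871534", "rs2080161", "rs2139931", "rs2789823", "rs2814778",
    "rs3751050", "rs3827760", "rs4657449", "rs4749305", "rs4792928",
    "rs6054465", "rs6437783", "rs715605", "rs8072587", "rs8137373",
    "rs9522149", "rs9809818", "rs9908046",
]


def sequence_numbers(k: int) -> dict:
    """Return the sequence-listing numbers used by item (k), with k counted from 1.

    Hypothetical helper: it encodes only the numbering pattern of the claim,
    not the sequences themselves.
    """
    if not 1 <= k <= len(SNP_IDS):
        raise ValueError("k must be between 1 and 28")
    base = 3 * (k - 1)
    return {
        "snp": SNP_IDS[k - 1],
        "primer_pair": (base + 1, base + 2),  # the two amplification primers
        "extension_primer": base + 3,         # the single-base extension primer
    }


if __name__ == "__main__":
    for k in (1, 23, 28):
        print(k, sequence_numbers(k))
```

A lookup of this kind can help when cross-checking an oligonucleotide order sheet against the sequence listing.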
5. The single-stranded DNA set according to claim 4, wherein: in the single-stranded DNA set, the molar ratio of primer pairs 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27 and 28, in that order, is 0.8:0.6:1.5:2.7:3:0.8:2:1.3:1:1:1.5:1:2.7:0.5:0.8:2.5:0.4:1.6:2.5:4:3:0.8:0.8:3:3:6:0.6:0.6; the molar ratio of the two primers in each primer pair is 1:1;
the molar ratio of extension primers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27 and 28, in that order, is 0.45:0.35:1.2:1.7:4:1:3:0.8:1:0.8:1.8:1.1:1.1:1.1:0.8:1.4:1:0.5:1.3:1.8:1.6:0.9:0.9:2:2.3:3:1:0.6.
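The two molar ratios in claim 5 are relative values only. As a minimal sketch, the ratios can be stored as lists and scaled by a chosen unit concentration to obtain working concentrations. The parameter `unit_conc` (the concentration that corresponds to a ratio value of 1) and both function names are assumptions made for illustration; the claims state no absolute concentrations.

```python
import math

# Molar ratio of primer pairs 1 to 28, in the order given in claim 5.
PRIMER_PAIR_RATIO = [
    0.8, 0.6, 1.5, 2.7, 3, 0.8, 2, 1.3, 1, 1, 1.5, 1, 2.7, 0.5,
    0.8, 2.5, 0.4, 1.6, 2.5, 4, 3, 0.8, 0.8, 3, 3, 6, 0.6, 0.6,
]

# Molar ratio of extension primers 1 to 28, in the order given in claim 5.
EXTENSION_PRIMER_RATIO = [
    0.45, 0.35, 1.2, 1.7, 4, 1, 3, 0.8, 1, 0.8, 1.8, 1.1, 1.1, 1.1,
    0.8, 1.4, 1, 0.5, 1.3, 1.8, 1.6, 0.9, 0.9, 2, 2.3, 3, 1, 0.6,
]


def relative_fractions(ratios):
    """Normalize a ratio list so that its entries sum to 1."""
    total = sum(ratios)
    return [r / total for r in ratios]


def scaled_concentrations(ratios, unit_conc):
    """Scale each ratio by unit_conc, a hypothetical user-chosen concentration."""
    return [r * unit_conc for r in ratios]


if __name__ == "__main__":
    # Sanity checks: one ratio per primer pair / extension primer.
    assert len(PRIMER_PAIR_RATIO) == 28
    assert len(EXTENSION_PRIMER_RATIO) == 28
    assert math.isclose(sum(relative_fractions(PRIMER_PAIR_RATIO)), 1.0)
    # Example with an arbitrary unit concentration of 0.1 (units chosen by the user).
    print(scaled_concentrations(PRIMER_PAIR_RATIO, 0.1)[:5])
```

Keeping the ratios in order-indexed lists makes it straightforward to pair entry k with item (k) of claim 4.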
6. A kit for distinguishing the five continental populations, comprising the primer pair set of claim 2 or 3 or the single-stranded DNA set of claim 4 or 5, and at least one of: dNTPs, DNA polymerase and alkaline phosphatase.
7. Use of a substance for detecting a combination of 28 SNP sites in any one of:
(a) constructing a five continental population genotyping database;
(b) distinguishing the five continental populations;
the 28 SNP sites are: rs10483251, rs12142199, rs1229984, rs12402499, rs12498138, rs12594144, rs1426654, rs1557553, rs16891982, rs17822931, rs1871534, rs2080161, rs2139931, rs2789823, rs2814778, rs3751050, rs3827760, rs4657449, rs4749305, rs4792928, rs6054465, rs6437783, rs715605, rs8072587, rs8137373, rs9522149, rs9809818 and rs9908046.
8. The use according to claim 7, wherein: the substance for detecting a combination of 28 SNP sites is the primer pair set according to claim 2 or 3, the single-stranded DNA set according to claim 4 or 5, or the kit according to claim 6.
9. A method for constructing a five continental population genotyping database, comprising the following steps:
(a1) selecting the genotypes at 28 SNP sites of the five continental populations from the 1000 Genomes Project and the Human Genome Diversity Project to form an original genotyping database;
(a2) performing STRUCTURE cluster analysis on all samples in the original genotyping database, and selecting the samples whose principal ancestry component is greater than 90% to form the five continental population genotyping database;
the 28 SNP sites are: rs10483251, rs12142199, rs1229984, rs12402499, rs12498138, rs12594144, rs1426654, rs1557553, rs16891982, rs17822931, rs1871534, rs2080161, rs2139931, rs2789823, rs2814778, rs3751050, rs3827760, rs4657449, rs4749305, rs4792928, rs6054465, rs6437783, rs715605, rs8072587, rs8137373, rs9522149, rs9809818 and rs9908046.
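Step (a2) amounts to a threshold filter on each sample's largest ancestry component. The sketch below assumes the cluster analysis has already produced, for every sample, a list of ancestry proportions (one per cluster, summing to about 1). The function name, the input layout and the sample values are hypothetical and serve only to illustrate the 90% cut-off.

```python
from typing import Dict, List


def select_reference_samples(
    memberships: Dict[str, List[float]], threshold: float = 0.90
) -> Dict[str, int]:
    """Keep samples whose largest ancestry proportion is strictly greater than threshold.

    Returns a mapping from sample ID to the index of its dominant cluster.
    """
    selected = {}
    for sample_id, proportions in memberships.items():
        # Index of the cluster with the highest proportion for this sample.
        top = max(range(len(proportions)), key=proportions.__getitem__)
        if proportions[top] > threshold:
            selected[sample_id] = top
    return selected


if __name__ == "__main__":
    # Made-up proportions for five clusters; not data from the patent.
    example = {
        "sample_1": [0.95, 0.02, 0.01, 0.01, 0.01],
        "sample_2": [0.60, 0.40, 0.00, 0.00, 0.00],  # admixed, dropped
        "sample_3": [0.03, 0.02, 0.93, 0.01, 0.01],
    }
    print(select_reference_samples(example))
```

Samples that fail the filter are simply left out of the reference database, so that each retained sample represents a single continental population.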
10. A method for distinguishing the five continental populations, comprising the following steps:
(b1) constructing a five continental population genotyping database according to the method of claim 9;
(b2) extracting genomic DNA from the individual to be tested, and detecting the 28 SNP sites to obtain the original genotype data of the individual at the 28 SNP sites;
(b3) comparing the original genotype data of the individual at the 28 SNP sites with the five continental population genotyping database, thereby determining to which of the five continental populations the individual belongs;
the 28 SNP sites are: rs10483251, rs12142199, rs1229984, rs12402499, rs12498138, rs12594144, rs1426654, rs1557553, rs16891982, rs17822931, rs1871534, rs2080161, rs2139931, rs2789823, rs2814778, rs3751050, rs3827760, rs4657449, rs4749305, rs4792928, rs6054465, rs6437783, rs715605, rs8072587, rs8137373, rs9522149, rs9809818 and rs9908046.
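Claim 10 does not spell out how the comparison in step (b3) is carried out. Purely as an illustration, one simple way to compare a genotype with reference populations is a per-population likelihood computed from allele frequencies under Hardy-Weinberg proportions. This technique is an assumption swapped in for the example and is not stated to be the method of the patent. All names, the frequency floor and the toy frequencies are hypothetical.

```python
import math
from typing import Dict, Tuple


def genotype_log_likelihood(
    genotype: Dict[str, int], freqs: Dict[str, float], floor: float = 0.01
) -> float:
    """Log-likelihood of a genotype given one population's allele frequencies.

    genotype maps SNP name -> count of a chosen allele (0, 1 or 2).
    freqs maps SNP name -> frequency of that same allele in the population.
    Frequencies are clipped to [floor, 1 - floor] to avoid log(0).
    """
    total = 0.0
    for snp, count in genotype.items():
        p = min(max(freqs[snp], floor), 1.0 - floor)
        q = 1.0 - p
        prob = {2: p * p, 1: 2 * p * q, 0: q * q}[count]
        total += math.log(prob)
    return total


def assign_population(
    genotype: Dict[str, int], database: Dict[str, Dict[str, float]]
) -> Tuple[str, Dict[str, float]]:
    """Return the population with the highest log-likelihood, plus all scores."""
    scores = {
        pop: genotype_log_likelihood(genotype, freqs)
        for pop, freqs in database.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Toy frequencies with placeholder names; not values from any real database.
    toy_db = {
        "pop_A": {"snp_a": 0.9, "snp_b": 0.8},
        "pop_B": {"snp_a": 0.1, "snp_b": 0.2},
    }
    best, scores = assign_population({"snp_a": 2, "snp_b": 1}, toy_db)
    print(best, scores)
```

In practice the reference frequencies would be derived from the genotyping database built in step (b1), and other comparison methods, such as clustering the test individual together with the reference samples, are equally compatible with the wording of the claim.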
11. The method of claim 10, wherein: the 28 SNP sites are detected using the primer pair set of claim 2 or 3, the single-stranded DNA set of claim 4 or 5, or the kit of claim 6; 28-plex PCR amplification is performed at an annealing temperature of 55 °C.
CN201710610369.8A 2017-07-25 2017-07-25 Method and system for inferring source of five continental ethnic groups of individuals of unknown origin Active CN107419017B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710610369.8A CN107419017B (en) 2017-07-25 2017-07-25 Method and system for inferring source of five continental ethnic groups of individuals of unknown origin

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710610369.8A CN107419017B (en) 2017-07-25 2017-07-25 Method and system for inferring source of five continental ethnic groups of individuals of unknown origin

Publications (2)

Publication Number Publication Date
CN107419017A (en) 2017-12-01
CN107419017B (en) 2020-09-08

Family

ID=60430402

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710610369.8A Active CN107419017B (en) 2017-07-25 2017-07-25 Method and system for inferring source of five continental ethnic groups of individuals of unknown origin

Country Status (1)

Country Link
CN (1) CN107419017B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108950005A (en) * 2018-05-06 2018-12-07 朱波峰 A kind of the Forensic detection system and its application of 30 SNP sites of autosome first ancestor
CN108531617A (en) * 2018-05-06 2018-09-14 朱波峰 The detecting system of one 42 SNP site for medical jurisprudence individual ancestors' information inference
CN108411008B (en) * 2018-06-01 2021-07-27 公安部物证鉴定中心 Application of 72 SNP sites and related primers in identification or assisted identification of human ethnic groups
CN110885888B (en) * 2018-09-07 2022-04-29 中国科学院北京基因组研究所 SNP marker combination for deducing different geographical region populations of Asia
CN109852701B (en) * 2018-12-28 2021-01-26 四川大学 Composite system for family source inference and inference method and application thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6544730B1 (en) * 1997-10-27 2003-04-08 Prescott Deininger High density polymorphic genetic locus
CN104212886A (en) * 2014-07-25 2014-12-17 公安部物证鉴定中心 Method and system for performing African, European and East Asian population genetic principal component analysis to unknown-source individual

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Optimization and validation of a 27-plex SNP ancestry inference method; Jiang Li et al.; Hereditas (Beijing); 2017-02-28; Vol. 39, No. 2; pp. 166-173 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant