CN112626223B - STR typing method based on multiplex PCR technology and DNB technology - Google Patents
STR typing method based on multiplex PCR technology and DNB technology
- Publication number
- CN112626223B; application number CN202010814836.0A; related publication CN112626223A
- Authority
- CN
- China
- Prior art keywords
- str
- pcr
- dnb
- technology
- sequences
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6888—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6813—Hybridisation assays
- C12Q1/6834—Enzymatic or biochemical coupling of nucleic acids to a solid phase
- C12Q1/6837—Enzymatic or biochemical coupling of nucleic acids to a solid phase using probe arrays or probe chips
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6844—Nucleic acid amplification reactions
- C12Q1/686—Polymerase chain reaction [PCR]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
Abstract
The invention discloses an STR typing method based on multiplex PCR technology and DNB technology, comprising the following steps: S1, extracting DNA from the sample to be tested; S2, designing multiplex PCR primers for the selected STR loci and establishing an evaluation system for cross-interconnection of STR sequences during DNB cyclization; S3, using the DNA from step S1 as the template, performing a first round of PCR amplification with the multiplex PCR primers designed in step S2; S4, purifying the first-round PCR product and performing a second round of PCR amplification to introduce a high-throughput Index; S5, recovering and purifying the second-round PCR amplification product; and S6, performing DNB sequencing on the PCR product. The invention provides an evaluation system for cross-interconnection of STR sequences that is suited to DNB sequencing and cyclization, and thereby successfully establishes an STR typing method based on multiplex PCR technology and DNB technology.
Description
Technical Field
The invention relates to the field of biotechnology, in particular to an STR typing method based on a multiplex PCR technology and a DNB technology.
Background
Short tandem repeat (STR) typing is currently the main means of individual identification and paternity testing in forensic science in China and abroad; it is simple, rapid, reliable, highly sensitive and reproducible. Combining multiple genetic markers such as autosomal STRs, Y-STRs and X-STRs is of particular value for duo paternity cases (one parent and the child) and for more complex paternity cases.
The STR typing method currently used in forensic science is mainly PCR-capillary electrophoresis. Its main principle is as follows: (1) DNA extraction: DNA samples of high purity, good integrity and suitable concentration are extracted using methods such as magnetic beads, phenol/chloroform or Chelex. (2) PCR amplification: the polymerase chain reaction (PCR) uses the two strands of the DNA to be amplified as templates and, mediated by a pair of synthetic oligonucleotide primers, rapidly and specifically amplifies the target DNA fragment through the enzymatic action of a thermostable DNA polymerase. (3) Capillary electrophoresis: based on the Sanger enzymatic method, a thermostable DNA polymerase copies the template to be tested to generate a series of chain-terminated fragments. In this process the reaction products are labeled with 4 different fluorescent dyes and separated by capillary electrophoresis; a fixed laser source excites the fluorescent signals, a CCD (charge-coupled device) detects and collects the data, and a computer analyzes and processes them to obtain the sequence information. (4) Determining the STR type of the sample: after electrophoresis, the peaks of the DNA fragments labeled with the different fluorescent colors are detected, the fragment lengths are determined by comparison with an internal size standard, and the fragments are finally compared with allelic ladders measured in the same way to obtain the STR type of the sample.
STR technology based on traditional PCR-capillary electrophoresis has two main drawbacks. First, few forensic loci can be detected simultaneously and throughput is low. Although upgraded multicolor fluorescence systems have increased the number of loci detectable in a single run, that number still rarely exceeds 60, which cannot meet the needs of front-line public security departments, while the number of autosomal STRs, Y-STRs and X-STRs required in forensic testing increases year by year. Second, the completeness of the sequence information obtained is limited. Traditional STR typing based on first-generation CE technology yields only the relative length of the amplicon; it cannot provide the complete base sequences of the STR repeat region and flanking regions, and cannot fully account for possible sequence variation and its effect on the result, which limits the application scenarios of forensic DNA testing.
High-throughput sequencing, also known as "next-generation" sequencing, can sequence millions of DNA molecules simultaneously, enabling detailed and comprehensive analysis of the transcriptome and genome of a species; it is a milestone in the development of sequencing technology. Patent CN102943111A discloses the use of, and a method for, high-throughput DNA sequencing to determine short tandem repeat loci in the human genome, and patent CN104673907A discloses a system and detection method for high-throughput STR typing; however, the accuracy of these STR typing methods based on conventional high-throughput sequencing still needs improvement. DNB sequencing is a newer sequencing technology developed from existing second-generation sequencing: genomic DNA is first fragmented, adapter sequences are added, and the fragments are cyclized into single-stranded circular DNA; the single-stranded circular DNA is then amplified by 2-3 orders of magnitude using rolling circle amplification (RCA); the resulting amplification products are called DNA nanoballs (DNB); finally, the DNBs are fixed onto an arrayed silicon chip by DNB loading technology.
Compared with other second-generation sequencing technologies, DNB sequencing has several advantages: (1) DNBs enhance signal strength by increasing the copy number of the DNA to be tested, thereby improving sequencing accuracy; (2) unlike exponential PCR amplification, amplification errors in rolling circle amplification do not accumulate; (3) a DNB is the same size as an active site on the chip and each site holds only one DNB, so signal spots do not interfere with one another; (4) combining the arrayed sequencing chip with DNB sequencing makes maximum use of the imaging system's pixels and of the chip area. At present, no STR typing method based on multiplex PCR technology and DNB technology is known in the prior art. In the conventional DNB sequencing method, single-stranded circular DNA is formed first and DNBs are then generated and sequenced; this step has little effect on random genomic fragments or on ordinary sequences without a fixed structure. STR sequences, however, have a highly repetitive structure, diverse repeat types and generally unbalanced base composition, so DNB generation and amplification after cyclization may be severely biased, and the single-stranded DNA itself may even fail to cyclize normally, seriously affecting the balance of the sequencing results. How to perform STR typing with multiplex PCR technology and DNB technology in light of these characteristics of STR sequences is therefore a problem that needs to be solved.
Disclosure of Invention
The invention aims to overcome the defects and shortcomings in the prior art and provide an STR typing method based on a multiplex PCR technology and a DNB technology.
The above object of the present invention is achieved by the following technical solutions:
An STR typing method based on multiplex PCR technology and DNB technology comprises the following steps:
S1, pretreating the sample to be tested and extracting its DNA;
S2, designing multiplex PCR primers for the STR loci to be tested, establishing an evaluation system for cross-interconnection of STR sequences during DNB cyclization, and adjusting the amplification directions of the multiplex PCR primers;
S3, using the DNA from step S1 as the template, performing a first round of PCR amplification with the multiplex PCR primers designed in step S2;
S4, purifying the first-round PCR product and performing a second round of PCR amplification to introduce a high-throughput Index;
S5, recovering and purifying the second-round PCR amplification product;
S6, performing DNB sequencing on the PCR product;
In step S2, the evaluation system is as follows: any two amplification product sequences, of lengths m and n, are shifted against each other one position at a time and compared in every possible base-pairing state, and for each state an interconnection score, namely the number of matched bases minus the number of unmatched bases, is calculated; this generates (m × n) base comparisons and (m + n - 1) interconnection scores, and the highest score is taken as the final interconnection score of the two sequences; for a set of p different sequences, self-comparison and pairwise comparison yield 1/2 (p² + p) groups of comparison results; according to the comparison results, the repeat-motif type combination that gives the lowest total interconnection score is selected, and the amplification directions of all primers are adjusted accordingly, so that no serious cross-interconnection occurs between the amplified STR sequences.
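The scoring rule of step S2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names are hypothetical, and reading the second sequence in reverse (antiparallel) is an assumption made here to model Watson-Crick pairing between two strands.

```python
# Hypothetical sketch of the interconnection score; names are illustrative.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def shift_scores(seq_a: str, seq_b: str) -> list[int]:
    """Score every relative shift of seq_b against seq_a.

    Assumption: seq_b is read in reverse (antiparallel), so a position
    counts as matched when the two facing bases are complementary.
    """
    b_rev = seq_b[::-1]
    m, n = len(seq_a), len(b_rev)
    scores = []
    for shift in range(-(n - 1), m):  # m + n - 1 shifts in total
        score = 0
        for j in range(n):
            i = shift + j
            if 0 <= i < m:  # only overlapping positions are compared
                matched = COMPLEMENT[seq_a[i]] == b_rev[j]
                score += 1 if matched else -1
        scores.append(score)
    return scores


def interconnection_score(seq_a: str, seq_b: str) -> int:
    """Final score: the highest score over all shifts."""
    return max(shift_scores(seq_a, seq_b))
```

For example, `interconnection_score("AGAT" * 6, "ATCT" * 6)` evaluates to 24, because ATCT is the reverse complement of AGAT and the two 24 bp sequences can pair along their full length.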
The invention uses multiplex PCR technology to detect several different STR loci simultaneously in one reaction system. Because STR loci have a highly repetitive structure, diverse repeat types and generally unbalanced base composition, DNB generation and amplification after cyclization may be severely biased, and the single-stranded DNA itself may even fail to cyclize normally. To address this, the invention designs an evaluation system for cross-interconnection of STR sequences suited to DNB sequencing and cyclization: a group of STR target sequences to be cyclized is subjected to a strict evaluation of sequence complementarity (each sequence against itself, forward and reverse orientations, and between sequences), and an optimal combination of sequence directions is chosen accordingly, so that serious cross-interconnection between STR sequences is avoided as far as possible.
Preferably, the STR loci to be tested comprise at least two of D1S1656, D2S1338, D2S441, TPOX, D3S1358, FGA, CSF1PO, D5S818, D7S820, D8S1179, D10S1248, TH01, D12S391, vWA, D13S317, D16S539, D18S51, D19S433, D21S11 and D22S1045.
Preferably, when designing primers for the STR loci, in addition to following the conventional multiplex PCR primer design principles, the invention also takes into account the balance among amplicons of STR loci of different lengths, types and copy numbers, so that the designed PCR amplification primers accurately cover the core repeat region used for STR length calculation; and, because read quality is higher at the start of a second-generation sequencing read, optimal sequencing quality can be achieved by adjusting the start position of the STR repeat region.
Further preferably, the multiplex PCR primers for each STR locus comprise an upstream primer and a downstream primer, whose sequences are shown in order in SEQ ID NO. 1-40.
Preferably, the amplification system of the first-round PCR amplification reaction is: JN-1500 MIX 6 μL, genomic DNA 20-300 ng, 3× EnzymoHT 10 μL, H₂O to 30 μL.
Preferably, the amplification program of the first-round PCR amplification reaction is: 98 ℃ for 3-5 min; 20 cycles of 98 ℃ for 20 s and 60 ℃ for 6 min; 72 ℃ for 6 min.
Preferably, the amplification system of the second-round PCR amplification reaction is: 2× HIFI_enzyme 15 μL, Nuclease-Free H₂O 13 μL, Primer_MGI-F 1 μL, MGI_Bar_xxx 1 μL, 30 μL in total.
The object of the invention can also be achieved by using, in step S4, a high-throughput Index commonly used in the prior art. However, to make full use of the advantages of second-generation sequencing for large, high-throughput sample sets, the invention also provides an ultra-high-throughput Index set that satisfies the following strict conditions: base balance and laser balance; GC content of 30-70%; runs of identical consecutive bases no longer than 5; no index identical to its own reverse complement; Hamming distance greater than 3; and no conflict with the indexes of current mainstream platforms.
Preferably, the amplification program of the second-round PCR amplification reaction is: 98 ℃ for 2 min; 6 cycles of 98 ℃ for 15 s, 58 ℃ for 15 s and 72 ℃ for 30 s; 72 ℃ for 2 min.
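As a small worked example, the second-round program can be written down as data and its total hold time added up. This is only a sketch: the `Step` class and function name are hypothetical, and ramp times between temperatures are ignored, so the real run time on an instrument would be longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # hold temperature in degrees Celsius
    seconds: int  # hold time in seconds


def total_hold_seconds(initial: Step, cycle: list[Step], final: Step,
                       cycles: int) -> int:
    """Sum of hold times only; ramping between steps is not modeled."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + cycles * per_cycle + final.seconds


# Second-round program as stated in the text.
INITIAL = Step(98, 120)
CYCLE = [Step(98, 15), Step(58, 15), Step(72, 30)]
FINAL = Step(72, 120)

six_cycle_total = total_hold_seconds(INITIAL, CYCLE, FINAL, cycles=6)
```

With 6 cycles the hold times add up to 120 + 6 × 60 + 120 = 600 s, that is, 10 minutes.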
The invention also claims the use of any of the above methods in individual identification and/or paternity testing.
Compared with the prior art, the invention has the following beneficial effects:
The invention provides an STR typing method based on multiplex PCR technology and DNB technology. Using an evaluation system for cross-interconnection of STR sequences suited to DNB sequencing and cyclization, a group of STR target sequences to be cyclized is subjected to a strict evaluation of sequence complementarity (each sequence against itself, forward and reverse orientations, and between sequences), and an optimal combination of sequence directions is chosen accordingly. This avoids serious cross-interconnection between STR sequences and ensures balanced sequencing results. An STR typing method based on multiplex PCR technology and DNB technology is thereby successfully established and can be used for individual identification and/or paternity testing.
Drawings
FIG. 1 is a flow chart showing the operation of the STR typing method based on the multiplex PCR technique and DNB technique of the present invention.
FIG. 2 illustrates how the invention handles severe cross-interconnection between two STR repeat sequences.
FIG. 3 shows an example set of 96 indexes from the ultra-high-throughput Index system optimized in the invention.
FIG. 4 shows the final sequencing result for standard 2800M.
Detailed Description
The invention is further illustrated below with reference to the drawings and specific examples, which do not limit the invention in any way. Unless otherwise specified, the reagents, methods and apparatus used in the invention are conventional in the art.
Reagents and materials used in the following examples are commercially available unless otherwise specified.
Example 1
An STR typing method based on multiplex PCR technology and DNB technology; the operation flow is shown in FIG. 1. The method mainly comprises the following steps:
1. Multiplex PCR amplification technology based on second-generation sequencing
(1) Primer design for multiplex PCR amplicons: a specific STR panel is detected using multiplex PCR amplicon capture combined with second-generation sequencing, and a set of specific amplification primers is designed according to the sequence features of the STR loci. To suit the high throughput of second-generation sequencing, primers for several-hundred-plex PCR or more are often designed at once. In addition to following the conventional multiplex PCR primer design principles, the balance among amplicons of STR loci of different lengths, types and copy numbers is also taken into account, so that the designed PCR amplification primers accurately cover the core repeat region used for STR length calculation; and, because read quality is higher at the start of a second-generation sequencing read, optimal sequencing quality can be achieved by adjusting the start position of the STR repeat region.
Examples of primer designs are shown in Table 1 below:
TABLE 1
Note: Chr: chromosome; LEN: length; TM: melting temperature, i.e. the temperature at which 50% of the primer is in the dissociated state.
(2) Design of an STR sequence cross-interconnection evaluation system suitable for DNB cyclization
A group of STR target sequences to be cyclized is subjected to a strict evaluation of sequence complementarity (each sequence against itself, forward and reverse orientations, and between sequences), and an optimal combination of sequence directions is chosen accordingly, so that serious cross-interconnection between STR sequences is avoided as far as possible.
Example: as shown in FIG. 2, when severe cross-interconnection exists between two STR repeat sequences, it can be avoided by adjusting the sequencing direction of one of the STR sequences.
Based on this principle, a complete STR sequence cross-interconnection evaluation system suitable for DNB sequencing and cyclization was designed. It considers together the repeat type, repeat length, base ratio and other factors of a set of STR sequences and optimizes the sequencing directions of all STR sequences as a whole, thereby ensuring normal subsequent cyclization and DNB amplification and achieving balance among all loci in the sequencing result.
Principle: for any two sequences of lengths m and n, the sequences are shifted against each other one position at a time and compared in every possible Watson-Crick base-pairing state, and for each state an interconnection score, namely the number of matched bases minus the number of unmatched bases, is calculated. This procedure produces (m × n) base comparisons and (m + n - 1) interconnection scores, and the highest score is taken as the final interconnection score of the two sequences. For a set of p different sequences, self-comparison and pairwise comparison yield 1/2 (p² + p) groups of comparison results. According to these results, the repeat-motif type combination that gives the lowest total interconnection score is selected, and the amplification directions of all primers are adjusted accordingly, so that no serious cross-interconnection occurs between the amplified STR sequences.
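The selection of sequence directions described in this principle can be sketched as an exhaustive search. The code below is an illustration under explicit assumptions, not the patent's script: all names are hypothetical, the "total score" is assumed to be the plain sum of all self and pairwise scores, the second sequence is assumed to pair antiparallel, and `repeats` is an illustrative parameter for expanding each motif into a repeat sequence.

```python
from itertools import combinations_with_replacement, product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def interconnection_score(seq_a: str, seq_b: str) -> int:
    """Highest (matched - unmatched) count over all m + n - 1 shifts.

    Assumption: seq_b is read antiparallel to seq_a.
    """
    b_rev = seq_b[::-1]
    m, n = len(seq_a), len(b_rev)
    best = -max(m, n)  # lower than any reachable score
    for shift in range(-(n - 1), m):
        score = 0
        for j in range(n):
            i = shift + j
            if 0 <= i < m:
                score += 1 if COMPLEMENT[seq_a[i]] == b_rev[j] else -1
        best = max(best, score)
    return best


def total_score(seqs: list[str]) -> int:
    """Assumed total: sum over the (p^2 + p) / 2 self and pairwise scores."""
    return sum(
        interconnection_score(a, b)
        for a, b in combinations_with_replacement(seqs, 2)
    )


def best_orientations(motifs: list[str], repeats: int = 6):
    """Try every forward/reverse assignment and keep the lowest total."""
    best_total, best_flips = None, None
    for flips in product((False, True), repeat=len(motifs)):
        seqs = [
            reverse_complement(motif * repeats) if flip else motif * repeats
            for motif, flip in zip(motifs, flips)
        ]
        total = total_score(seqs)
        if best_total is None or total < best_total:
            best_total, best_flips = total, flips
    return best_total, best_flips
```

With the two hypothetical motifs `["AGAT", "ATCT"]`, which are reverse complements of each other, the search flips exactly one of them so that the two repeat sequences no longer pair along their full length. Exhaustive search grows as 2^k in the number of motif types, which is acceptable for a handful of types but would need a heuristic for larger sets.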
(3) Target region amplification: the first round of PCR amplification was performed according to the reaction system and amplification conditions listed below.
Amplification system: the reactions are set up on a clean bench in 0.2 mL PCR tubes or a 96-well PCR plate according to the system described in Table 2 below:
TABLE 2
The PCR amplification procedure is shown in Table 3:
TABLE 3
* If the DNA input is less than 5 ng, add 1 cycle.
(4) Design of ultra-high throughput Index system suitable for current mainstream second generation sequencing platform
As a preferred embodiment, to maximize the efficiency of a single-end Index system and make full use of the advantages of second-generation sequencing for large, high-throughput sample sets, the invention designs an ultra-high-throughput Index system suitable for current mainstream second-generation sequencing platforms. Under the strict conditions of base balance and laser balance, GC content of 30-70%, runs of identical consecutive bases no longer than 5, no index identical to its own reverse complement, Hamming distance greater than 3, and no conflict with the indexes of current mainstream platforms, the entire space of 10 bp indexes was searched, and an ultra-high-throughput Index system comprising 2016 indexes (which can be combined arbitrarily in sets of 96 or 32), together with corresponding efficient demultiplexing scripts, was finally obtained. Demultiplexing allows a 1 bp mismatch by default. After multiple batches of experimental testing, the final demultiplexing rate of the Index system reached 94% and the proportion of effective data usable for sample locus analysis was higher than 80%, fully exploiting the ability of high-throughput sequencing platforms to sequence large numbers of samples in parallel.
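The per-index conditions listed above can be sketched as simple filters. This is a hedged illustration only: the function names are hypothetical, the greedy selection order is an assumption, and the base-balance, laser-balance and platform-conflict conditions are not modeled here.

```python
from itertools import groupby

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def gc_fraction(index: str) -> float:
    return sum(base in "GC" for base in index) / len(index)


def longest_run(index: str) -> int:
    """Length of the longest run of identical consecutive bases."""
    return max(sum(1 for _ in group) for _, group in groupby(index))


def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length indexes."""
    return sum(x != y for x, y in zip(a, b))


def passes_single_filters(index: str) -> bool:
    """GC 30-70%, runs <= 5, and not equal to its own reverse complement."""
    return (
        0.30 <= gc_fraction(index) <= 0.70
        and longest_run(index) <= 5
        and index != reverse_complement(index)
    )


def greedy_select(candidates, min_distance: int = 4) -> list[str]:
    """Keep candidates whose Hamming distance to every kept index is > 3.

    Assumption: a simple first-come greedy pass; the original search
    strategy is not described in the text.
    """
    chosen: list[str] = []
    for cand in candidates:
        if passes_single_filters(cand) and all(
            hamming(cand, kept) >= min_distance for kept in chosen
        ):
            chosen.append(cand)
    return chosen
```

A Hamming distance greater than 3 between any two indexes is what makes a 1 bp mismatch tolerance safe during demultiplexing: a read with one error is still closer to its true index than to any other.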
Table 4 below gives the base balance assessment of the 2016-index high-throughput Index system:
TABLE 4
Pos | A | C | G | T | Highest base proportion |
1 | 506 | 504 | 511 | 495 | 25.35% |
2 | 508 | 504 | 506 | 498 | 25.20% |
3 | 504 | 506 | 507 | 499 | 25.15% |
4 | 507 | 502 | 506 | 501 | 25.15% |
5 | 501 | 507 | 506 | 502 | 25.15% |
6 | 505 | 504 | 502 | 505 | 25.05% |
7 | 503 | 505 | 505 | 503 | 25.05% |
8 | 506 | 503 | 505 | 502 | 25.10% |
9 | 504 | 504 | 505 | 503 | 25.05% |
10 | 500 | 508 | 503 | 505 | 25.20% |
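The last column of Table 4 can be reproduced from the counts as the largest base count at each position divided by the row total. The short check below is a sketch with hypothetical names; the counts are copied from the table.

```python
# Per-position base counts (A, C, G, T) copied from Table 4.
COUNTS = [
    (506, 504, 511, 495),
    (508, 504, 506, 498),
    (504, 506, 507, 499),
    (507, 502, 506, 501),
    (501, 507, 506, 502),
    (505, 504, 502, 505),
    (503, 505, 505, 503),
    (506, 503, 505, 502),
    (504, 504, 505, 503),
    (500, 508, 503, 505),
]

# Percentages reported in the last column of Table 4.
REPORTED = ["25.35", "25.20", "25.15", "25.15", "25.15",
            "25.05", "25.05", "25.10", "25.05", "25.20"]


def highest_base_share(row) -> float:
    """Share (in percent) of the most frequent base at one position."""
    return 100.0 * max(row) / sum(row)


computed = [f"{highest_base_share(row):.2f}" for row in COUNTS]
```

Every row sums to 2016, and the computed shares agree with the reported column; for position 1, 511 / 2016 ≈ 25.35%. A perfectly balanced position would give 25%.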
An example set of 96 indexes is shown in FIG. 3.
(5) After purification of the first-round PCR product, a second round of PCR amplification is performed to introduce the Index system from step (4) for second-generation sequencing. The second-round PCR amplification system is shown in Table 5:
TABLE 5
* Use a different Barcode/Index for each sample.
The PCR amplification procedure is shown in Table 6:
TABLE 6
* Typically 6 cycles; 8 cycles are used when the input DNA is less than 10 ng.
(6) Recovery of PCR products. The PCR product is purified using a purification kit or equivalent magnetic beads.
2. Second generation sequencing (DNB sequencing technique)
DNB sequencing was performed on the PCR products using a BGI (Huada) sequencer.
Example 2
An STR typing method based on multiplex PCR technology and DNB technology, carried out as in Example 1, wherein:
(1) STR locus information: taking the 20 autosomal STRs of the Expanded CODIS core loci as an example, the specific locus information is shown in Table 7 below:
TABLE 7. 20 autosomal STRs of the Expanded CODIS core loci
(2) STR locus primer information
The initial multiplex PCR primer design, shown in Table 8 below, fully covers the STR repeat regions:
TABLE 8
(3) Assessment of STR sequence cross-interconnection during DNB cyclization
According to the amplification direction of the initial multiplex PCR primers, the repeat sequences of the amplification products were obtained and classified by repeat type, repeat length and base composition. As shown in Table 9, 5 repeat motif types were obtained in total:
TABLE 9
The cross-interconnection reactions that may occur between all repeat motif types were then assessed computationally.
Principle: any two sequences, of lengths m and n, are aligned by shifting one against the other one position at a time; at each shift, the overlapping bases are checked for Watson-Crick pairing, and an interconnection score, namely the number of matched bases minus the number of unmatched bases, is calculated. This procedure produces m × n base pairs and (m+n-1) interconnection scores, and the highest score is taken as the final interconnection score of the two sequences. For a set of p different sequences, self and pairwise comparisons yield (p² + p)/2 comparison results.
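The scoring principle above can be sketched as follows. The sketch makes assumptions the text does not spell out: the two strands are taken to pair in antiparallel orientation (the second sequence is reversed before sliding), only the overlapping bases contribute to a score, and all function names are hypothetical.

```python
from itertools import combinations_with_replacement
from typing import Dict, List, Tuple

WATSON_CRICK = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}


def interconnect_scores(a: str, b: str) -> List[int]:
    """Score every shift of b against a (matches minus mismatches).

    b is reversed so that the strands pair antiparallel (an assumption).
    The result always has len(a) + len(b) - 1 entries, and the overlaps
    over all shifts add up to len(a) * len(b) compared base pairs.
    """
    b_rev = b[::-1]
    m, n = len(a), len(b_rev)
    scores = []
    for shift in range(-(n - 1), m):
        matched = unmatched = 0
        for j in range(n):
            i = shift + j
            if 0 <= i < m:  # only overlapping positions are compared
                if (a[i], b_rev[j]) in WATSON_CRICK:
                    matched += 1
                else:
                    unmatched += 1
        scores.append(matched - unmatched)
    return scores


def interconnect_score(a: str, b: str) -> int:
    """Final interconnection score: the highest score over all shifts."""
    return max(interconnect_scores(a, b))


def all_pair_scores(seqs: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Self and pairwise comparisons: (p*p + p) / 2 results for p sequences."""
    return {
        (x, y): interconnect_score(seqs[x], seqs[y])
        for x, y in combinations_with_replacement(seqs, 2)
    }
```

As a check on the counts, two 24 bp sequences give 47 scores, and a sequence compared with its own reverse complement reaches the maximum score of 24 at the fully overlapping shift; a set of 10 sequences gives 55 comparison results.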
Specifically, in this example, each repeat motif type is first uniformly repeated 6 times. For example, two motif types with a repeat length of 4, each repeated 6 times, give two sequences of 24 bp; comparing them base by base generates 576 base pairs and 47 interconnection scores (to limit the amount of computation, each motif may instead be repeated 4-10 times, depending on its repeat type, to give a repeat sequence of 20-30 bp). The set contains 10 motif types (forward and reverse), and self and pairwise comparisons generate 55 comparison results (highest scores), of which 15 have scores of more than 7; these are listed in Table 10 below:
TABLE 10
From the results table, the combination of repeat motif types that minimizes the total score of the system's interconnection reactions is selected, and the amplification directions of all primers are adjusted accordingly, to ensure that no severe cross-interconnection occurs between the amplified STR sequences.
Principle: one orientation, forward or reverse, is selected for each type. For each interconnection reaction between two of the selected types, the numbers of loci a and b among the 20 STRs that belong to the two types and the highest interconnection score S are determined, and the score of that reaction in the system is calculated as a × b × S. The sum of the scores of all interconnection reactions for a given combination of types is the total system score.
Specifically, in this example, if the selected combination of repeat motif types is 1_rc, 2_rc, 3_rc, 4_rc, 5_rc, the only interconnection reaction that occurs is between 5_rc and 5_rc. The only locus belonging to this type is D22S1045, so the locus counts are 1 and 1, and the highest interconnection score is 7; the score of this reaction in the system is therefore 1 × 1 × 7 = 7. Since this combination has no other interconnection reaction, the total system score is also 7. Complete calculation showed that this total is the lowest among all combinations, so this combination is the optimal combination of repeat motif types. The primer amplification directions of all loci are adjusted accordingly; for example, the original primer amplification direction of D1S1656 is forward with repeat motif type 1, and after it is adjusted to reverse amplification the repeat motif type becomes 1_rc.
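The selection of the optimal orientation combination can be sketched as an exhaustive search over one orientation per motif type. Everything in the example data is an assumption made for illustration: apart from the 5_rc/5_rc score of 7 and the single locus of type 5 taken from the text, the locus counts and pair scores are invented placeholders (Tables 9 and 10 are not reproduced here), pairs absent from the score table are treated as not reacting, and the function names are hypothetical.

```python
from itertools import product
from typing import Dict, Sequence, Tuple

PairScores = Dict[Tuple[str, str], int]


def base_type(oriented: str) -> str:
    """Strip the orientation suffix: '5_rc' -> '5'."""
    return oriented.split("_")[0]


def system_score(selection: Sequence[str], site_counts: Dict[str, int],
                 pair_scores: PairScores) -> int:
    """Sum a * b * S over all reacting pairs (self-pairs included)."""
    total = 0
    for i, first in enumerate(selection):
        for second in selection[i:]:
            s = pair_scores.get((first, second),
                                pair_scores.get((second, first), 0))
            a = site_counts[base_type(first)]
            b = site_counts[base_type(second)]
            total += a * b * s
    return total


def best_orientation(types: Sequence[str], site_counts: Dict[str, int],
                     pair_scores: PairScores) -> Tuple[Tuple[str, ...], int]:
    """Try forward or reverse for every type and keep the lowest total."""
    options = [(t, t + "_rc") for t in types]
    best = min(product(*options),
               key=lambda sel: system_score(sel, site_counts, pair_scores))
    return best, system_score(best, site_counts, pair_scores)


# Illustrative data only: type 5 has one locus (D22S1045) and 5_rc/5_rc = 7
# as stated in the text; every other number below is a made-up placeholder.
SITE_COUNTS = {"1": 12, "2": 4, "5": 1}
PAIR_SCORES = {
    ("1", "1"): 8, ("1", "2_rc"): 10, ("1_rc", "2"): 10,
    ("2", "5"): 8, ("2", "5_rc"): 8, ("2_rc", "5"): 8,
    ("5", "5"): 7, ("5_rc", "5_rc"): 7,
}
```

With this placeholder data, `best_orientation(["1", "2", "5"], SITE_COUNTS, PAIR_SCORES)` selects the all-reverse combination with a total of 7, mirroring the outcome described in the example; with 5 motif types the same search covers only 2^5 = 32 combinations, so brute force is sufficient.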
After the cross-interconnection assessment and primer adjustment, the sequencing directions of the 20 autosomal STRs of the Expanded CODIS core loci are optimized as a whole, which ensures normal subsequent cyclization and DNB amplification and thus balanced sequencing results across all loci.
(4) Typing results of the final test
The final sequencing result for the standard 2800M is shown in FIG. 4. The invention successfully establishes an STR typing method based on multiplex PCR technology and DNB technology. By performing STR typing on a multiplex PCR amplification system and exploiting the many loci, high throughput and low cost of DNB sequencing technology, the method ensures the accuracy and reliability of the sequencing results and is better suited to the future standardization of STR detection.
Sequence listing
<110> Guangzhou deep dawn Gene technology Co., ltd
GUANGZHOU CRIMINAL SCIENCE AND TECHNOLOGY Research Institute
<120> an STR typing method based on multiplex PCR technology and DNB technology
<141> 2020-08-13
<160> 40
<170> SIPOSequenceListing 1.0
<210> 1
<211> 22
<212> DNA
<213> person (Homo sapiens)
<400> 1
gcagcacaaa actcgtttag ca 22
<210> 2
<211> 26
<212> DNA
<213> person (Homo sapiens)
<400> 2
tataagttca agcctgtgtt gctcaa 26
<210> 3
<211> 25
<212> DNA
<213> person (Homo sapiens)
<400> 3
atgcctacat ccctagtacc tagca 25
<210> 4
<211> 24
<212> DNA
<213> person (Homo sapiens)
<400> 4
ccagtggatt tggaaacaga aatg 24
<210> 5
<211> 22
<212> DNA
<213> person (Homo sapiens)
<400> 5
ctgtaacaag ggctacagga at 22
<210> 6
<211> 29
<212> DNA
<213> person (Homo sapiens)
<400> 6
caccacaccc agccataaat aacatatta 29
<210> 7
<211> 23
<212> DNA
<213> person (Homo sapiens)
<400> 7
caccttcctc tgcttcactt ttc 23
<210> 8
<211> 23
<212> DNA
<213> person (Homo sapiens)
<400> 8
ccttctgtcc ttgtcagcgt tta 23
<210> 9
<211> 21
<212> DNA
<213> person (Homo sapiens)
<400> 9
ctgcagtcca atctgggtga c 21
<210> 10
<211> 23
<212> DNA
<213> person (Homo sapiens)
<400> 10
ctcatgaaat caacagaggc ttg 23
<210> 11
<211> 29
<212> DNA
<213> person (Homo sapiens)
<400> 11
atcacggtct gaaatcgaaa atatggtta 29
<210> 12
<211> 26
<212> DNA
<213> person (Homo sapiens)
<400> 12
ctgcagggca taacattatc caaaag 26
<210> 13
<211> 22
<212> DNA
<213> person (Homo sapiens)
<400> 13
acttggacag catttcctgt gt 22
<210> 14
<211> 23
<212> DNA
<213> person (Homo sapiens)
<400> 14
cagattgtac agaggaggca ctt 23
<210> 15
<211> 22
<212> DNA
<213> person (Homo sapiens)
<400> 15
ctctcccatc tggatagtgg ac 22
<210> 16
<211> 23
<212> DNA
<213> person (Homo sapiens)
<400> 16
gtgacaaggg tgattttcct ctt 23
<210> 17
<211> 30
<212> DNA
<213> person (Homo sapiens)
<400> 17
attgtgaggt cttaaaatct gaggtatcaa 30
<210> 18
<211> 30
<212> DNA
<213> person (Homo sapiens)
<400> 18
aaagggtatg atagaacact tgtcatagtt 30
<210> 19
<211> 22
<212> DNA
<213> person (Homo sapiens)
<400> 19
cacggcctgg caacttatat gt 22
<210> 20
<211> 30
<212> DNA
<213> person (Homo sapiens)
<400> 20
gctgtcaaaa accgtatgta ttcttgtttc 30
<210> 21
<211> 27
<212> DNA
<213> person (Homo sapiens)
<400> 21
aagcttagta cttaactcac tgccttg 27
<210> 22
<211> 30
<212> DNA
<213> person (Homo sapiens)
<400> 22
ttcccttgtc ttgttattaa aggaacaact 30
<210> 23
<211> 25
<212> DNA
<213> person (Homo sapiens)
<400> 23
aaatgacact gctacaactc acacc 25
<210> 24
<211> 22
<212> DNA
<213> person (Homo sapiens)
<400> 24
cattggcctg ttcctccctt at 22
<210> 25
<211> 28
<212> DNA
<213> person (Homo sapiens)
<400> 25
gtgatagtag tttcttctgg tgaaggaa 28
<210> 26
<211> 25
<212> DNA
<213> person (Homo sapiens)
<400> 26
cttgcagatg gactgtcatg agatt 25
<210> 27
<211> 33
<212> DNA
<213> person (Homo sapiens)
<400> 27
gagataggac agatgataaa tacataggat gga 33
<210> 28
<211> 29
<212> DNA
<213> person (Homo sapiens)
<400> 28
cactttgccc ttattatttt gtgaactcc 29
<210> 29
<211> 23
<212> DNA
<213> person (Homo sapiens)
<400> 29
attctgccta cagccaatgt gaa 23
<210> 30
<211> 25
<212> DNA
<213> person (Homo sapiens)
<400> 30
caaatctcct ccttcaactt gggtt 25
<210> 31
<211> 28
<212> DNA
<213> person (Homo sapiens)
<400> 31
ggtctaagag cttgtaaaaa gtgtacaa 28
<210> 32
<211> 23
<212> DNA
<213> person (Homo sapiens)
<400> 32
gcgtttgtgt gtgcatctgt aag 23
<210> 33
<211> 26
<212> DNA
<213> person (Homo sapiens)
<400> 33
cacttcactc tgagtgacaa attgag 26
<210> 34
<211> 27
<212> DNA
<213> person (Homo sapiens)
<400> 34
gcaacaacac aaataaacaa accgtca 27
<210> 35
<211> 24
<212> DNA
<213> person (Homo sapiens)
<400> 35
aaggaacagg tggtgttggt taca 24
<210> 36
<211> 28
<212> DNA
<213> person (Homo sapiens)
<400> 36
gttgaggctg caaaaagcta taattgta 28
<210> 37
<211> 24
<212> DNA
<213> person (Homo sapiens)
<400> 37
atatgtgagt caattcccca agtg 24
<210> 38
<211> 28
<212> DNA
<213> person (Homo sapiens)
<400> 38
tgtattagtc aatgttctcc agagacag 28
<210> 39
<211> 22
<212> DNA
<213> person (Homo sapiens)
<400> 39
cctctccacc ctatagaccc tg 22
<210> 40
<211> 27
<212> DNA
<213> person (Homo sapiens)
<400> 40
cctcagctgt agaatggaaa tagtgac 27
Claims (6)
1. An STR typing method based on a multiplex PCR technology and a DNB technology is characterized by comprising the following steps:
S1, preprocessing a sample to be tested, and extracting DNA from the sample to be tested;
S2, designing multiplex PCR primers for the STR loci to be tested, establishing an evaluation system for cross-interconnection of STR sequences in DNB cyclization, and adjusting the amplification direction of the multiplex PCR primers;
S3, using the DNA of the sample to be tested from step S1 as a template, performing a first round of PCR amplification reaction with the multiplex PCR primers designed in step S2;
S4, purifying the first-round PCR product, and performing a second round of PCR amplification to introduce a high-throughput Index;
S5, recovering and purifying the second-round PCR amplification product;
S6, performing DNB sequencing on the PCR product;
wherein in step S2, the evaluation system is as follows: for any two amplification product sequences of lengths m and n, one sequence is shifted against the other one position at a time, the two are compared in each possible base-pairing state, and an interconnection score, namely the number of matched bases minus the number of unmatched bases, is calculated; this generates m × n base pairs and (m+n-1) interconnection scores, and the highest score is taken as the final interconnection score of the two sequences; for a set of p different sequences, self and pairwise comparisons generate (p² + p)/2 comparison results; according to the comparison results, the combination of repeat motif types with the lowest total score of system interconnection reactions is selected, and the amplification directions of all primers are adjusted accordingly, so as to ensure that no severe cross-interconnection occurs between the amplified STR sequences;
the high-throughput Index satisfies the following strict conditions: compliance with the base-balance and laser-balance principles, GC content of 30-70%, no more than 5 consecutive identical bases, no self reverse complementarity, a Hamming distance greater than 3, and no conflict with the Indexes of the current mainstream platforms.
2. The STR typing method of claim 1, wherein the STR loci to be tested comprise at least two of D1S1656, D2S1338, D2S441, TPOX, D3S1358, FGA, CSF1PO, D5S818, D7S820, D8S1179, D10S1248, TH01, D12S391, vWA, D13S317, D16S539, D18S51, D19S433, D21S11 or D22S1045.
3. The STR typing method according to claim 1, wherein the multiplex PCR primers not only conform to the design principles of conventional multiplex PCR primers but also balance the amplicons of STR loci of different lengths, types and copy numbers, so that the designed PCR amplification primers accurately cover the core repeat region used for STR length calculation; and, because the quality of a second-generation sequencing read is higher at its start, the optimal sequencing quality is achieved by adjusting the starting position of the STR repeat region.
4. The STR typing method of claim 2 or 3 wherein the STR locus multiplex PCR primer comprises an upstream primer and a downstream primer, respectively, having the sequences shown in SEQ ID NOS 1 to 40.
5. The STR typing method according to claim 1, wherein the amplification program of the first round of PCR amplification reaction is: 98 ℃ for 3 to 5 min; 20 cycles of 98 ℃ for 20 s and 60 ℃ for 6 min; and 72 ℃ for 6 min.
6. The STR typing method according to claim 1, wherein the amplification program of the second round of PCR amplification reaction is: 98 ℃ for 2 min; 6 cycles of 98 ℃ for 15 s, 58 ℃ for 15 s and 72 ℃ for 30 s; and 72 ℃ for 2 min.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010814836.0A CN112626223B (en) | 2020-08-13 | 2020-08-13 | STR typing method based on multiplex PCR technology and DNB technology |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112626223A (en) | 2021-04-09 |
CN112626223B (en) | 2024-03-22 |
Family
ID=75300084
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010814836.0A Active CN112626223B (en) | 2020-08-13 | 2020-08-13 | STR typing method based on multiplex PCR technology and DNB technology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112626223B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109022559A (en) * | 2018-08-21 | 2018-12-18 | 华中农业大学 | A kind of molecular mark detection method based on two generation sequencing technologies |
CN110878334A (en) * | 2019-11-12 | 2020-03-13 | 北京康普森生物技术有限公司 | Primer for sequencing amplicon and two-step PCR library building method |
Also Published As
Publication number | Publication date |
---|---|
CN112626223A (en) | 2021-04-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||