CN112626223B - STR typing method based on multiplex PCR technology and DNB technology - Google Patents

Info

Publication number
CN112626223B
CN112626223B CN202010814836.0A CN202010814836A CN112626223B CN 112626223 B CN112626223 B CN 112626223B CN 202010814836 A CN202010814836 A CN 202010814836A CN 112626223 B CN112626223 B CN 112626223B
Authority
CN
China
Prior art keywords
str
pcr
dnb
technology
sequences
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010814836.0A
Other languages
Chinese (zh)
Other versions
CN112626223A (en)
Inventor
文少卿
雷波
王凌翔
刘超
刘长晖
刘宏
韩晓龙
徐曲毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou criminal science and technology research institute
Guangzhou Shenxiao Gene Technology Co ltd
Original Assignee
Guangzhou criminal science and technology research institute
Guangzhou Shenxiao Gene Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou criminal science and technology research institute, Guangzhou Shenxiao Gene Technology Co ltd filed Critical Guangzhou criminal science and technology research institute
Priority to CN202010814836.0A priority Critical patent/CN112626223B/en
Publication of CN112626223A publication Critical patent/CN112626223A/en
Application granted granted Critical
Publication of CN112626223B publication Critical patent/CN112626223B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/6813: Hybridisation assays
    • C12Q1/6834: Enzymatic or biochemical coupling of nucleic acids to a solid phase
    • C12Q1/6837: Enzymatic or biochemical coupling of nucleic acids to a solid phase using probe arrays or probe chips
    • C12Q1/6844: Nucleic acid amplification reactions
    • C12Q1/686: Polymerase chain reaction [PCR]
    • C12Q1/6869: Methods for sequencing
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Zoology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Analytical Chemistry (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses an STR typing method based on multiplex PCR technology and DNB technology, comprising the following steps: S1, extracting DNA from the sample to be tested; S2, designing multiplex PCR primers for the selected STR loci and establishing an evaluation system for cross-interconnection of STR sequences during DNB circularization; S3, using the DNA from step S1 as the template, performing a first round of PCR amplification with the multiplex PCR primers designed in step S2; S4, purifying the first-round PCR product and performing a second round of PCR amplification to introduce a high-throughput Index; S5, recovering and purifying the second-round PCR product; and S6, performing DNB sequencing on the PCR product. By providing an evaluation system for cross-interconnection of STR sequences that is suited to DNB sequencing and circularization, the invention successfully establishes an STR typing method based on multiplex PCR technology and DNB technology.

Description

STR typing method based on multiplex PCR technology and DNB technology
Technical Field
The invention relates to the field of biotechnology, in particular to an STR typing method based on a multiplex PCR technology and a DNB technology.
Background
Short tandem repeat (STR) typing is currently the main means of individual identification and paternity testing in forensic science, both in China and abroad. It is simple, rapid, reliable, highly sensitive and reproducible. Combining multiple genetic markers such as autosomal STRs, Y-STRs and X-STRs is of particular value for duo paternity testing and for more complex kinship testing.
The STR typing method currently used in forensic science is mainly PCR-capillary electrophoresis. Its main principle is as follows. (1) DNA extraction: DNA samples of high purity, good integrity and suitable concentration are extracted using methods such as magnetic beads, phenol/chloroform or Chelex. (2) PCR amplification: the polymerase chain reaction (PCR) uses the two strands of the DNA to be amplified as templates and, mediated by a pair of synthetic oligonucleotide primers, rapidly and specifically amplifies the target DNA fragment through the enzymatic action of a thermostable DNA polymerase. (3) Capillary electrophoresis: this is based on the Sanger enzymatic method, in which a thermostable DNA polymerase copies the template under test to generate a series of chain-terminated fragments. In the process, the reaction products are labeled with 4 different fluorescent dyes and separated by capillary electrophoresis; a fixed laser source excites the fluorescent signals, a CCD (charge-coupled device) detects and collects the data, and a computer analyzes and processes them to obtain the sequence information. (4) Determining the STR type of the sample: after electrophoresis, the peaks of the DNA fragments in each fluorescent color are detected, the fragment lengths are determined by comparison with an internal size standard for each color, and the fragments are finally compared with allelic ladders measured in the same way to obtain the STR type of the sample.
The existing STR technology based on conventional PCR-capillary electrophoresis has two main shortcomings. First, few forensic loci can be detected simultaneously and the throughput is low. Although upgraded multicolor fluorescence systems have increased the number of loci that can be detected in a single run, that number still rarely exceeds 60, which cannot meet the detection needs of front-line public security departments, while the number of autosomal STRs, Y-STRs and X-STRs that forensic work requires is increasing year by year. Second, the sequence information obtained is incomplete. The conventional STR typing method based on first-generation CE technology can only obtain the relative length of the amplicon. It cannot obtain the complete base sequences of the STR repeat region and the flanking regions, and so cannot fully account for possible sequence variation and its effect on the typing result, which limits the application scenarios of forensic DNA testing.
High-throughput sequencing, also known as next-generation sequencing, allows millions of DNA molecules to be sequenced simultaneously, which enables detailed and comprehensive analysis of the transcriptome and genome of a species and marks a milestone in the development of sequencing technology. Patent CN102943111A discloses the use of high-throughput DNA sequencing for determining short tandem repeat loci in the human genome and a method therefor, and patent CN104673907A discloses a system for high-throughput STR typing and a detection method thereof, but the accuracy of these STR typing methods based on conventional high-throughput sequencing still needs to be improved. DNB sequencing is a newer technology developed on the basis of existing second-generation sequencing. Genomic DNA is first fragmented, adapter sequences are added, and the fragments are circularized to form single-stranded circular DNA. The single-stranded circular DNA is then amplified by 2-3 orders of magnitude using rolling circle amplification (RCA); the resulting amplification product is called a DNA nanoball (DNB). Finally, the nanoballs are fixed on a patterned array silicon chip by DNB loading technology.
Compared with other second-generation sequencing technologies, DNB sequencing has several advantages: (1) DNB enhances signal strength by increasing the copy number of the DNA to be tested, thereby improving sequencing accuracy; (2) unlike exponential PCR amplification, the amplification errors of rolling circle amplification do not accumulate; (3) a DNB is the same size as an active site on the chip, and each site binds only one DNB, so signal points do not interfere with one another; (4) combining the patterned array chip with DNB sequencing makes maximal use of both the imaging system's pixels and the area of the sequencing chip. At present, no STR typing method based on multiplex PCR technology and DNB technology is known in the prior art. The conventional DNB sequencing workflow forms single-stranded circular DNA and then generates DNBs for sequencing; this step has little effect on random genomic fragments or on ordinary sequences without a fixed structure. STR sequences, however, are highly repetitive, come in many repeat types and generally have unbalanced base composition, so DNB generation and amplification after circularization may be severely biased, and the single-stranded DNA may even fail to circularize normally, which seriously affects the balance of the sequencing results. How to perform STR typing with multiplex PCR technology and DNB technology in a way that takes the characteristics of STR sequences into account is therefore a problem that needs to be solved.
Disclosure of Invention
The invention aims to overcome the defects and shortcomings in the prior art and provide an STR typing method based on a multiplex PCR technology and a DNB technology.
The above object of the present invention is achieved by the following technical solutions:
An STR typing method based on multiplex PCR technology and DNB technology comprises the following steps:
S1, pretreating the sample to be tested and extracting its DNA;
S2, designing multiplex PCR primers for the STR loci to be tested, establishing an evaluation system for cross-interconnection of STR sequences during DNB circularization, and adjusting the amplification direction of the multiplex PCR primers;
S3, using the DNA from step S1 as the template, performing a first round of PCR amplification with the multiplex PCR primers designed in step S2;
S4, purifying the first-round PCR product and performing a second round of PCR amplification to introduce a high-throughput Index;
S5, recovering and purifying the second-round PCR product;
S6, performing DNB sequencing on the PCR product;
wherein the evaluation system of step S2 is as follows: any two amplification product sequences, of lengths m and n, are shifted against each other one base at a time and compared in every possible base-pairing state, and an interconnection score, namely the number of matched bases minus the number of unmatched bases, is calculated for each state; this generates (m × n) base comparisons and (m + n − 1) interconnection scores, and the highest score is taken as the final interconnection score of the two sequences; for a set of p different sequences, self-comparison and pairwise comparison generate 1/2 (p² + p) groups of comparison results; according to the comparison results, the combination of repeat motif types that gives the lowest total system interconnection score is selected, and the amplification directions of all primers are adjusted accordingly, so as to ensure that no severe cross-interconnection occurs between the amplified STR sequences.
The invention uses multiplex PCR to detect a plurality of different STR loci simultaneously in one reaction system. STR loci are highly repetitive, come in many repeat types and generally have unbalanced base composition, so DNB generation and amplification after circularization may be severely biased, and the single-stranded DNA may even fail to circularize normally. To address this, the invention designs an evaluation system for cross-interconnection of STR sequences that is suited to DNB sequencing and circularization. The group of STR target sequences that need to be circularized is subjected to a strict sequence-complementarity self-connection evaluation (each sequence against itself, forward and reverse orientations, and between sequences), so as to arrive at an optimal combination of sequence directions and ensure, as far as possible, that no severe cross-interconnection reaction occurs between STR sequences.
Preferably, the STR loci to be tested comprise at least two of D1S1656, D2S1338, D2S441, TPOX, D3S1358, FGA, CSF1PO, D5S818, D7S820, D8S1179, D10S1248, TH01, D12S391, vWA, D13S317, D16S539, D18S51, D19S433, D21S11 and D22S1045.
Preferably, when primers are designed for the STR loci, in addition to following the conventional multiplex PCR primer design principles, the invention also takes into account the balance among amplicons of STR loci of different lengths, types and copy numbers, so that the designed PCR amplification primers accurately cover the core repeat region used for STR length calculation. Because second-generation sequencing reads are of higher quality at their start, the start position of the STR repeat region is adjusted to achieve optimal sequencing quality.
Further preferably, the multiplex PCR primers for each STR locus comprise an upstream primer and a downstream primer, whose sequences are shown in order in SEQ ID NO. 1-40.
Preferably, the amplification system of the first round PCR amplification reaction is: JN-1500 MIX 6 μL, genomic DNA 20-300 ng, 3× EnzymoHT 10 μL, H2O to 30 μL.
Preferably, the amplification procedure of the first round PCR amplification reaction is: 98 ℃ for 3-5 min; 20 cycles of 98 ℃ for 20 s and 60 ℃ for 6 min; and 72 ℃ for 6 min.
Preferably, the amplification system of the second round PCR amplification reaction is: 2× HIFI_enzyme 15 μL, Nuclease-Free H2O 13 μL, Primer_MGI-F 1 μL, MGI_Bar_xxx 1 μL, 30 μL in total.
The object of the invention can also be achieved with a high-throughput Index commonly used in the prior art in step S4. However, to take full advantage of second-generation sequencing for high-throughput, large-sample work, the invention also provides an ultra-high-throughput Index set that satisfies the following strict conditions: base balance and laser balance, GC content of 30-70%, no more than 5 consecutive identical bases, an Index not identical to its own reverse complement, a Hamming distance greater than 3, and no conflict with the Indexes of the current mainstream platforms.
Preferably, the amplification procedure of the second round PCR amplification reaction is: 98 ℃ for 2 min; 6 cycles of 98 ℃ for 15 s, 58 ℃ for 15 s and 72 ℃ for 30 s; and 72 ℃ for 2 min.
The invention also claims the use of any of the above methods in individual identification and/or paternity testing.
Compared with the prior art, the invention has the following beneficial effects:
The invention provides an STR typing method based on multiplex PCR technology and DNB technology. Using an evaluation system for cross-interconnection of STR sequences that is suited to DNB sequencing and circularization, the group of STR target sequences that need to be circularized is subjected to a strict sequence-complementarity self-connection evaluation (each sequence against itself, forward and reverse orientations, and between sequences), and an optimal combination of sequence directions is chosen accordingly. This avoids severe cross-interconnection reactions between STR sequences and ensures the balance of the sequencing results. An STR typing method based on multiplex PCR technology and DNB technology is thereby successfully established, which can be used for individual identification and/or paternity testing.
Drawings
FIG. 1 is a flow chart showing the operation of the STR typing method based on the multiplex PCR technique and DNB technique of the present invention.
FIG. 2 is a diagram illustrating how the invention handles severe cross-interconnection between two STR repeat sequences.
FIG. 3 shows an example set of 96 Indexes from the ultra-high-throughput Index system optimized in accordance with the invention.
FIG. 4 shows the final sequencing result for the 2800M standard.
Detailed Description
The invention is further illustrated below with reference to the drawings and specific examples, which are not intended to limit the invention in any way. Unless specifically stated otherwise, the reagents, methods and apparatus employed in the invention are those conventional in the art.
Reagents and materials used in the following examples are commercially available unless otherwise specified.
Example 1
The operation flow of an STR typing method based on multiplex PCR technology and DNB technology is shown in FIG. 1. The method mainly comprises the following steps:
1. Multiplex PCR amplification technology based on second-generation sequencing
(1) Primer design for multiplex PCR amplicons: a specific STR panel is detected by multiplex PCR amplicon capture combined with second-generation sequencing, and a set of specific amplification primers is designed according to the sequence characteristics of the regions containing the STR sites. To match the high throughput of second-generation sequencing, primer sets of several-hundred-plex or more are often designed at one time. In addition to following the conventional multiplex PCR primer design principles, the balance among amplicons of STR loci of different lengths, types and copy numbers is also taken into account, so that the designed PCR amplification primers accurately cover the core repeat region used for STR length calculation. Because second-generation sequencing reads are of higher quality at their start, the start position of the STR repeat region is adjusted to achieve optimal sequencing quality.
Examples of primer designs are shown in Table 1 below:
TABLE 1
Note: Chr: chromosome; LEN: length; TM: temperature at which 50% of the primer is in the dissociated state.
(2) Design of an STR sequence cross-interconnection evaluation system suitable for DNB circularization
A strict sequence-complementarity self-connection evaluation is carried out on the group of STR target sequences that need to be circularized (each sequence against itself, forward and reverse orientations, and between sequences), so as to arrive at an optimal combination of sequence directions and ensure, as far as possible, that no severe cross-interconnection reaction occurs between STR sequences.
Example: as shown in FIG. 2, when severe cross-interconnection exists between two STR repeat sequences, it can be avoided by reversing the sequencing direction of one of them.
Based on this principle, a complete STR sequence cross-interconnection evaluation system suited to DNB sequencing and circularization is designed. It takes into account factors such as the repeat type, repeat length and base composition of the set of STR sequences, and optimizes the sequencing direction of all STR sequences as a whole, thereby ensuring normal subsequent circularization and DNB amplification and achieving balance among all sites in the sequencing result.
Principle: any two sequences, of lengths m and n, are shifted against each other one base at a time and compared in every possible Watson-Crick base-pairing state, and an interconnection score, namely the number of matched bases minus the number of unmatched bases, is calculated for each state. This procedure produces (m × n) base comparisons and (m + n − 1) interconnection scores, and the highest score is taken as the final interconnection score of the two sequences. For a set of p different sequences, self-comparison and pairwise comparison yield 1/2 (p² + p) groups of comparison results. According to these results, the combination of repeat motif types that gives the lowest total system interconnection score is selected, and the amplification directions of all primers are adjusted accordingly, so as to ensure that no severe cross-interconnection occurs between the amplified STR sequences.
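The scoring principle above can be sketched in a few lines of Python. This is only an illustrative reading of the text, not the patent's actual script: the function name `interconnection_score` is hypothetical, and the choice to pair the two strands antiparallel (by reversing the second sequence before sliding) is an assumption, since the text does not state the strand orientation used.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def interconnection_score(seq_a: str, seq_b: str) -> int:
    """Highest (matched - unmatched) score over all m + n - 1 relative shifts.

    Assumption: the strands pair antiparallel, so seq_b is reversed first.
    """
    a = seq_a.upper()
    b = seq_b.upper()[::-1]
    m, n = len(a), len(b)
    best = None
    # Shifts run from -(n - 1) to m - 1, i.e. m + n - 1 pairing states.
    for shift in range(-(n - 1), m):
        matched = unmatched = 0
        for j in range(n):
            i = shift + j
            if 0 <= i < m:  # only overlapping positions are compared
                if COMPLEMENT.get(a[i]) == b[j]:
                    matched += 1
                else:
                    unmatched += 1
        score = matched - unmatched
        best = score if best is None else max(best, score)
    return best


if __name__ == "__main__":
    # A poly-A tract pairs fully with a poly-T tract.
    print(interconnection_score("AAAA", "TTTT"))
    # ACGT is its own reverse complement, so it pairs fully with itself.
    print(interconnection_score("ACGT", "ACGT"))
```

Summed over all shifts, the inner loop performs exactly m × n base comparisons, which matches the count given in the principle.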
(3) Target region amplification: the first round of PCR amplification was performed according to the reaction system and amplification conditions listed below.
Amplification system: reactions are prepared on a clean bench in 0.2 mL PCR tubes or a 96-well PCR plate according to the system described in Table 2 below:
TABLE 2
The PCR amplification procedure is shown in Table 3:
TABLE 3
* If the DNA input is less than 5 ng, add 1 cycle.
(4) Design of ultra-high throughput Index system suitable for current mainstream second generation sequencing platform
As a preferred embodiment, in order to make full use of a single-ended Index system and of the advantages of second-generation sequencing for high-throughput, large-sample work, the invention designs an ultra-high-throughput Index system suited to the current mainstream second-generation sequencing platforms. All possible 10 bp Indexes were searched under the following strict conditions: base balance and laser balance, GC content of 30-70%, no more than 5 consecutive identical bases, an Index not identical to its own reverse complement, a Hamming distance greater than 3, and no conflict with the Indexes of the current mainstream platforms. The result is an optimized ultra-high-throughput Index system comprising 2016 Indexes (which can be freely combined in sets of 96 or 32), together with a corresponding efficient demultiplexing script. By default, data splitting tolerates a 1 bp error. Across multiple batches of experimental tests, the final data splitting rate of the Index system reached 94%, and more than 80% of the data was usable for sample site analysis, fully exploiting the capacity of the high-throughput sequencing platform for parallel sequencing of large numbers of samples.
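A minimal sketch of how some of the screening conditions listed above might be checked is given below. It covers only the GC content, homopolymer length, self reverse complement and Hamming distance conditions; base balance, laser balance and platform-conflict checks are left out. All function names are hypothetical, the greedy selection strategy is an assumption rather than the search actually used, and "not identical to its own reverse complement" is one interpretation of the corresponding condition.

```python
from itertools import groupby

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(index: str) -> float:
    """Fraction of G and C bases in the index."""
    return sum(base in "GC" for base in index) / len(index)


def longest_run(index: str) -> int:
    """Length of the longest stretch of one repeated base."""
    return max(len(list(group)) for _, group in groupby(index))


def reverse_complement(index: str) -> str:
    return index.translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length indexes."""
    return sum(x != y for x, y in zip(a, b))


def passes_single_index_rules(index: str) -> bool:
    """GC 30-70%, runs of at most 5, not equal to its reverse complement."""
    return (
        0.30 <= gc_fraction(index) <= 0.70
        and longest_run(index) <= 5
        and index != reverse_complement(index)
    )


def greedy_select(candidates, min_distance=4):
    """Keep each candidate that is far enough from all indexes kept so far.

    A Hamming distance "greater than 3" is read here as at least 4.
    """
    chosen = []
    for index in candidates:
        if not passes_single_index_rules(index):
            continue
        if all(hamming(index, kept) >= min_distance for kept in chosen):
            chosen.append(index)
    return chosen


if __name__ == "__main__":
    demo = ["ACGTACGTAC", "ACGTACGTAA", "TTTTTTTTTT", "GGCATCAGTC"]
    print(greedy_select(demo))
```

A minimum pairwise distance of 4 is also what makes the stated 1 bp error tolerance safe: a read with a single mismatch is still at least 3 mismatches away from every other Index.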
Table 4 below gives the base balance assessment of the high-throughput 2016-Index system:
TABLE 4
Pos    A      C      G      T      Highest base fraction
1      506    504    511    495    25.35%
2      508    504    506    498    25.20%
3      504    506    507    499    25.15%
4      507    502    506    501    25.15%
5      501    507    506    502    25.15%
6      505    504    502    505    25.05%
7      503    505    505    503    25.05%
8      506    503    505    502    25.10%
9      504    504    505    503    25.05%
10     500    508    503    505    25.20%
An example set of 96 Indexes is shown in FIG. 3.
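The per-position figures in Table 4 can be reproduced for any Index set with a short tally. The sketch below assumes the last column is the count of the most frequent base at a position divided by the total number of Indexes, which is consistent with the tabulated values (for example 511 / 2016 ≈ 25.35% at position 1). The function name `base_balance` and the four-Index toy set are illustrative only.

```python
from collections import Counter


def base_balance(indexes):
    """Per-position base counts and the share of the most frequent base.

    Returns one tuple per position: (pos, A, C, G, T, highest_fraction).
    Assumes all indexes have the same length.
    """
    total = len(indexes)
    rows = []
    for pos in range(len(indexes[0])):
        counts = Counter(index[pos] for index in indexes)
        highest = max(counts[base] for base in "ACGT")
        rows.append(
            (pos + 1, counts["A"], counts["C"], counts["G"], counts["T"],
             highest / total)
        )
    return rows


if __name__ == "__main__":
    # Toy set: every base appears exactly once at every position.
    toy = ["ACGT", "CGTA", "GTAC", "TACG"]
    for pos, a, c, g, t, frac in base_balance(toy):
        print(f"{pos}\t{a}\t{c}\t{g}\t{t}\t{frac:.2%}")
```

A perfectly balanced set gives 25% in the last column at every position, so the 25.05-25.35% values in Table 4 indicate a nearly uniform base distribution.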
(5) After the first-round PCR product is purified, a second round of PCR amplification is performed to introduce the Index system described in step (4) for second-generation sequencing. The system of the second round of PCR amplification is shown in Table 5:
TABLE 5
* Use a different Barcode/Index for each sample.
The PCR amplification procedure is shown in Table 6:
TABLE 6
* Typically 6 cycles; use 8 cycles when the input DNA is less than 10 ng.
(6) PCR product recovery. The PCR product is purified using a purification kit or equivalent magnetic beads.
2. Second generation sequencing (DNB sequencing technique)
DNB sequencing was performed on the PCR products using a BGI (Huada) sequencer.
Example 2
An STR typing method based on multiplex PCR technology and DNB technology, identical to the method of Example 1, wherein:
(1) STR locus information: taking the 20 autosomal STRs of the expanded CODIS core loci as an example, the specific site information is shown in Table 7 below:
TABLE 7 The 20 autosomal STRs of the expanded CODIS core loci
(2) STR locus primer information
The initial multiplex PCR primer design, which fully covers the STR repeat regions, is shown in Table 8 below:
TABLE 8
(3) STR sequence cross-interconnect assessment for DNB cyclization
From the amplification direction of the initial multiplex PCR primers, the repeat sequences of the amplification products were obtained and classified by repeat type, repeat length and base composition. As shown in Table 9, 5 repeat motif types were obtained in total:
TABLE 9
The cross-interconnection reactions that may exist between all repeat motif types were then assessed computationally.
Principle: any two sequences, of lengths m and n, are shifted against each other one base at a time and compared in every possible Watson-Crick base-pairing state, and an interconnection score, namely the number of matched bases minus the number of unmatched bases, is calculated for each state. This procedure produces (m × n) base comparisons and (m + n − 1) interconnection scores, and the highest score is taken as the final interconnection score of the two sequences. For a set of p different sequences, self-comparison and pairwise comparison yield 1/2 (p² + p) groups of comparison results.
Specifically, in this example each repeat motif type is first repeated a uniform 6 times. For example, two motif types with a repeat length of 4, each repeated 6 times, give two sequences of 24 bp; comparing them base by base generates 576 base comparisons and 47 interconnection scores. (To keep the amount of computation manageable, the motif may be repeated 4-10 times, as appropriate to the repeat type, to give a repeat sequence of 20-30 bp.) The set contains 10 motif types (including forward and reverse), and self-comparison and pairwise comparison generate 55 groups of comparison results (highest scores), of which 15 groups have scores of more than 7, as listed in Table 10 below:
TABLE 10
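The counts quoted in this example (576 base comparisons, 47 interconnection scores, 55 comparison groups) follow directly from the formulas in the principle, and can be checked with a small sketch. The helper names are hypothetical, and the five 4-base motifs below are placeholders chosen for illustration; they are not the motif types of Table 9.

```python
from itertools import combinations_with_replacement


def overlap_lengths(m: int, n: int):
    """Number of compared base pairs at each of the m + n - 1 shifts."""
    return [min(m, shift + n) - max(0, shift) for shift in range(-(n - 1), m)]


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def oriented_repeats(motifs, repeats=6):
    """Forward and reverse-complement repeat sequences for each motif."""
    sequences = {}
    for name, motif in motifs.items():
        forward = motif * repeats
        sequences[name] = forward
        sequences[name + "_rc"] = reverse_complement(forward)
    return sequences


if __name__ == "__main__":
    # Placeholder motifs (hypothetical, not taken from Table 9).
    motifs = {"1": "TCTA", "2": "GGAA", "3": "AATG", "4": "TATC", "5": "CTTT"}
    seqs = oriented_repeats(motifs)
    lengths = overlap_lengths(24, 24)
    pairs = list(combinations_with_replacement(sorted(seqs), 2))
    print(len(seqs), sum(lengths), len(lengths), len(pairs))
```

With p = 10 oriented sequences, the self plus pairwise comparisons number 1/2 (10² + 10) = 55, matching the number of comparison groups stated above.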
From the results table, the combination of repeat motif types that gives the lowest total system interconnection score is selected, and the amplification directions of all primers are adjusted accordingly, so as to ensure that no severe cross-interconnection occurs between the amplified STR sequences.
Principle: the score of each pairwise reaction is calculated as follows. One orientation, forward or reverse, is selected for each type. For an interconnection reaction between two types, count the numbers of sites, a and b, among all 20 STRs that belong to the two types, and take the highest interconnection score S of the reaction; the score of that reaction in the system is then a × b × S. The total system score for a given combination of types is the sum of the scores of all interconnection reactions in the system.
Specifically, in this example, if the selected combination of repeat motif types is 1_rc, 2_rc, 3_rc, 4_rc, 5_rc, the only interconnection reaction is between 5_rc and 5_rc. The only site of this type is D22S1045, so the site counts are 1 and 1, and the highest interconnection score is 7; the score of this reaction in the system is therefore 1 × 1 × 7 = 7. Since this combination has no other interconnection reaction, its total system score is also 7. Complete calculation shows that this is the lowest total system score among all combinations, so this combination is the optimal combination of repeat motif types. The primer amplification directions of all sites are adjusted accordingly. For example, the original primer amplification direction of D1S1656 is forward and its repeat motif type is 1; after adjustment to reverse amplification, its repeat motif type becomes 1_rc.
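The selection step can be sketched as an exhaustive search over the forward/reverse choice for each motif type. This is an illustrative reading only: the function names are hypothetical, the rule that a pair counts as an interconnection reaction when its highest score is at least 7 is an assumption based on the worked example, and the toy scores in the demo are invented for illustration and are not the values of Table 10.

```python
from itertools import combinations_with_replacement, product


def system_total(selection, site_counts, pair_scores, threshold=7):
    """Sum of a * b * S over all interconnecting pairs in one combination.

    selection   : oriented type names, e.g. ("1_rc", "2_rc")
    site_counts : number of STR sites per base type, e.g. {"1": 3}
    pair_scores : highest interconnection score per oriented pair
    threshold   : assumed minimum score that counts as a reaction
    """
    total = 0
    for x, y in combinations_with_replacement(selection, 2):
        score = pair_scores.get((x, y), pair_scores.get((y, x), 0))
        if score >= threshold:
            a = site_counts[x.split("_")[0]]
            b = site_counts[y.split("_")[0]]
            total += a * b * score
    return total


def best_combination(types, site_counts, pair_scores, threshold=7):
    """Try both orientations of every type and keep the lowest total."""
    options = [(t, t + "_rc") for t in types]
    return min(
        product(*options),
        key=lambda sel: system_total(sel, site_counts, pair_scores, threshold),
    )


if __name__ == "__main__":
    # Invented toy scores for two types (not from Table 10).
    counts = {"1": 3, "2": 1}
    scores = {
        ("1", "1"): 10, ("1", "2"): 8, ("2", "2"): 9,
        ("1_rc", "2"): 8, ("2_rc", "2_rc"): 7,
    }
    best = best_combination(["1", "2"], counts, scores)
    print(best, system_total(best, counts, scores))
```

With 5 motif types there are only 2^5 = 32 orientation combinations, so exhaustive enumeration is cheap; the single-type case `system_total(("5_rc",), {"5": 1}, {("5_rc", "5_rc"): 7})` reproduces the 1 × 1 × 7 = 7 of the worked example.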
After the cross-interconnection evaluation and primer adjustment, the sequencing direction of the 20 autosomal STRs of the expanded CODIS core loci is optimized as a whole, which ensures normal subsequent circularization and DNB amplification and thus balance among all sites in the sequencing result.
(4) Final typing results
The final sequencing result for the 2800M standard is shown in FIG. 4. The invention thus successfully establishes an STR typing method based on multiplex PCR technology and DNB technology. Built on a multiplex PCR amplification system, the method performs STR typing using the advantages of DNB sequencing technology (many loci, high throughput and low cost), which ensures the accuracy and reliability of the sequencing result and makes the method better suited to the future standardization of STR detection.
Sequence listing
<110> Guangzhou Shenxiao Gene Technology Co., Ltd.
Guangzhou Criminal Science and Technology Research Institute
<120> an STR typing method based on multiplex PCR technology and DNB technology
<141> 2020-08-13
<160> 40
<170> SIPOSequenceListing 1.0
<210> 1
<211> 22
<212> DNA
<213> person (Homo sapiens)
<400> 1
gcagcacaaa actcgtttag ca 22
<210> 2
<211> 26
<212> DNA
<213> person (Homo sapiens)
<400> 2
tataagttca agcctgtgtt gctcaa 26
<210> 3
<211> 25
<212> DNA
<213> person (Homo sapiens)
<400> 3
atgcctacat ccctagtacc tagca 25
<210> 4
<211> 24
<212> DNA
<213> person (Homo sapiens)
<400> 4
ccagtggatt tggaaacaga aatg 24
<210> 5
<211> 22
<212> DNA
<213> person (Homo sapiens)
<400> 5
ctgtaacaag ggctacagga at 22
<210> 6
<211> 29
<212> DNA
<213> person (Homo sapiens)
<400> 6
caccacaccc agccataaat aacatatta 29
<210> 7
<211> 23
<212> DNA
<213> person (Homo sapiens)
<400> 7
caccttcctc tgcttcactt ttc 23
<210> 8
<211> 23
<212> DNA
<213> person (Homo sapiens)
<400> 8
ccttctgtcc ttgtcagcgt tta 23
<210> 9
<211> 21
<212> DNA
<213> person (Homo sapiens)
<400> 9
ctgcagtcca atctgggtga c 21
<210> 10
<211> 23
<212> DNA
<213> person (Homo sapiens)
<400> 10
ctcatgaaat caacagaggc ttg 23
<210> 11
<211> 29
<212> DNA
<213> person (Homo sapiens)
<400> 11
atcacggtct gaaatcgaaa atatggtta 29
<210> 12
<211> 26
<212> DNA
<213> person (Homo sapiens)
<400> 12
ctgcagggca taacattatc caaaag 26
<210> 13
<211> 22
<212> DNA
<213> person (Homo sapiens)
<400> 13
acttggacag catttcctgt gt 22
<210> 14
<211> 23
<212> DNA
<213> person (Homo sapiens)
<400> 14
cagattgtac agaggaggca ctt 23
<210> 15
<211> 22
<212> DNA
<213> person (Homo sapiens)
<400> 15
ctctcccatc tggatagtgg ac 22
<210> 16
<211> 23
<212> DNA
<213> person (Homo sapiens)
<400> 16
gtgacaaggg tgattttcct ctt 23
<210> 17
<211> 30
<212> DNA
<213> person (Homo sapiens)
<400> 17
attgtgaggt cttaaaatct gaggtatcaa 30
<210> 18
<211> 30
<212> DNA
<213> person (Homo sapiens)
<400> 18
aaagggtatg atagaacact tgtcatagtt 30
<210> 19
<211> 22
<212> DNA
<213> person (Homo sapiens)
<400> 19
cacggcctgg caacttatat gt 22
<210> 20
<211> 30
<212> DNA
<213> person (Homo sapiens)
<400> 20
gctgtcaaaa accgtatgta ttcttgtttc 30
<210> 21
<211> 27
<212> DNA
<213> person (Homo sapiens)
<400> 21
aagcttagta cttaactcac tgccttg 27
<210> 22
<211> 30
<212> DNA
<213> person (Homo sapiens)
<400> 22
ttcccttgtc ttgttattaa aggaacaact 30
<210> 23
<211> 25
<212> DNA
<213> person (Homo sapiens)
<400> 23
aaatgacact gctacaactc acacc 25
<210> 24
<211> 22
<212> DNA
<213> person (Homo sapiens)
<400> 24
cattggcctg ttcctccctt at 22
<210> 25
<211> 28
<212> DNA
<213> person (Homo sapiens)
<400> 25
gtgatagtag tttcttctgg tgaaggaa 28
<210> 26
<211> 25
<212> DNA
<213> person (Homo sapiens)
<400> 26
cttgcagatg gactgtcatg agatt 25
<210> 27
<211> 33
<212> DNA
<213> person (Homo sapiens)
<400> 27
gagataggac agatgataaa tacataggat gga 33
<210> 28
<211> 29
<212> DNA
<213> person (Homo sapiens)
<400> 28
cactttgccc ttattatttt gtgaactcc 29
<210> 29
<211> 23
<212> DNA
<213> person (Homo sapiens)
<400> 29
attctgccta cagccaatgt gaa 23
<210> 30
<211> 25
<212> DNA
<213> person (Homo sapiens)
<400> 30
caaatctcct ccttcaactt gggtt 25
<210> 31
<211> 28
<212> DNA
<213> person (Homo sapiens)
<400> 31
ggtctaagag cttgtaaaaa gtgtacaa 28
<210> 32
<211> 23
<212> DNA
<213> person (Homo sapiens)
<400> 32
gcgtttgtgt gtgcatctgt aag 23
<210> 33
<211> 26
<212> DNA
<213> person (Homo sapiens)
<400> 33
cacttcactc tgagtgacaa attgag 26
<210> 34
<211> 27
<212> DNA
<213> person (Homo sapiens)
<400> 34
gcaacaacac aaataaacaa accgtca 27
<210> 35
<211> 24
<212> DNA
<213> person (Homo sapiens)
<400> 35
aaggaacagg tggtgttggt taca 24
<210> 36
<211> 28
<212> DNA
<213> person (Homo sapiens)
<400> 36
gttgaggctg caaaaagcta taattgta 28
<210> 37
<211> 24
<212> DNA
<213> person (Homo sapiens)
<400> 37
atatgtgagt caattcccca agtg 24
<210> 38
<211> 28
<212> DNA
<213> person (Homo sapiens)
<400> 38
tgtattagtc aatgttctcc agagacag 28
<210> 39
<211> 22
<212> DNA
<213> person (Homo sapiens)
<400> 39
cctctccacc ctatagaccc tg 22
<210> 40
<211> 27
<212> DNA
<213> person (Homo sapiens)
<400> 40
cctcagctgt agaatggaaa tagtgac 27

Claims (6)

1. An STR typing method based on multiplex PCR technology and DNB technology, characterized by comprising the following steps:
S1, pretreating a sample to be tested and extracting the DNA of the sample to be tested;
S2, designing multiplex PCR primers for the STR loci to be tested, establishing an evaluation system for cross interconnection of STR sequences during DNB circularization, and adjusting the amplification direction of the multiplex PCR primers;
S3, performing a first round of PCR amplification using the DNA of the sample to be tested from step S1 as the template and the multiplex PCR primers designed in step S2;
S4, purifying the first-round PCR product and performing a second round of PCR amplification to introduce a high-throughput Index;
S5, recovering and purifying the second-round PCR amplification product;
S6, performing DNB sequencing on the PCR product;
wherein, in step S2, the evaluation system is as follows: any two amplification product sequences of lengths m and n are shifted against each other one base at a time and compared in every possible base-pairing state, and an interconnection score, namely the number of matched bases minus the number of unmatched bases, is calculated for each state; this generates (m × n) base pairs and (m + n - 1) interconnection scores, and the highest score is taken as the final interconnection score of the two sequences; for a set of p different sequences, self and pairwise comparison gives (p² + p)/2 groups of comparison results; according to the comparison results, the combination of repeat motif types with the lowest total interconnection score of the system is selected, and the amplification directions of all primers are adjusted accordingly, so that no serious cross interconnection occurs between the amplified STR sequences;
the high-throughput Index must, in accordance with the principles of base balance and laser balance, satisfy the following strict conditions: a GC content of 30-70%, no more than 5 consecutive identical bases, a sequence that differs from its own reverse complement, a Hamming distance greater than 3, and no conflict with the Indexes of current mainstream platforms.
2. The STR typing method according to claim 1, wherein the STR loci to be tested comprise at least two of D1S1656, D2S1338, D2S441, TPOX, D3S1358, FGA, CSF1PO, D5S818, D7S820, D8S1179, D10S1248, TH01, D12S391, vWA, D13S317, D16S539, D18S51, D19S433, D21S11 or D22S1045.
3. The STR typing method according to claim 1, wherein the multiplex PCR primers not only conform to the design principles of conventional multiplex PCR primers but also balance the amplicons of STR sites of different lengths, types and copy numbers, so that the designed PCR amplification primers accurately cover the core repeat region used for STR length calculation; and wherein, because the quality at the start of a second-generation sequencing read is higher, the start position of the STR repeat region is adjusted to achieve optimal sequencing quality.
4. The STR typing method according to claim 2 or 3, wherein the multiplex PCR primers for the STR loci comprise upstream primers and downstream primers whose sequences are shown in SEQ ID NOs: 1 to 40.
5. The STR typing method according to claim 1, wherein the amplification program of the first round of PCR amplification is: 98 ℃ for 3 to 5 min; 20 cycles of 98 ℃ for 20 s and 60 ℃ for 6 min; and 72 ℃ for 6 min.
6. The STR typing method according to claim 1, wherein the amplification program of the second round of PCR amplification is: 98 ℃ for 2 min; 6 cycles of 98 ℃ for 15 s, 58 ℃ for 15 s and 72 ℃ for 30 s; and 72 ℃ for 2 min.
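The Index conditions recited in claim 1 can be illustrated with a small filter. This is only a sketch under assumptions, not the inventors' selection procedure: it assumes that all Indexes have equal length, that "no conflict with the Indexes of current mainstream platforms" means absence from a caller-supplied `reserved` collection (a hypothetical parameter), and it does not model base balance or laser balance. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def longest_run(seq):
    """Length of the longest stretch of consecutive identical bases."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of differing positions (assumes equal-length Indexes)."""
    return sum(x != y for x, y in zip(a, b))


def passes_single_index_rules(seq):
    """GC content 30-70%, runs of at most 5, not its own reverse complement."""
    return (
        0.30 <= gc_fraction(seq) <= 0.70
        and longest_run(seq) <= 5
        and seq != reverse_complement(seq)
    )


def select_indexes(candidates, reserved=()):
    """Greedily keep candidates whose Hamming distance to all kept ones is > 3."""
    chosen = []
    for seq in candidates:
        if not passes_single_index_rules(seq) or seq in reserved:
            continue
        if all(hamming(seq, other) > 3 for other in chosen):
            chosen.append(seq)
    return chosen
```

For example, with the made-up candidates `["AACCGGTA", "AACCGGAT", "TTGGCCAT"]`, the second is dropped because it differs from the first at only two positions, while the third differs at all eight and is kept. The greedy order-dependent selection is a design simplification; a real Index set would also need to be checked for base and laser balance across each sequencing cycle.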
CN202010814836.0A 2020-08-13 2020-08-13 STR typing method based on multiplex PCR technology and DNB technology Active CN112626223B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010814836.0A CN112626223B (en) 2020-08-13 2020-08-13 STR typing method based on multiplex PCR technology and DNB technology

Publications (2)

Publication Number Publication Date
CN112626223A CN112626223A (en) 2021-04-09
CN112626223B true CN112626223B (en) 2024-03-22

Family

ID=75300084

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010814836.0A Active CN112626223B (en) 2020-08-13 2020-08-13 STR typing method based on multiplex PCR technology and DNB technology

Country Status (1)

Country Link
CN (1) CN112626223B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109022559A (en) * 2018-08-21 2018-12-18 华中农业大学 A kind of molecular mark detection method based on two generation sequencing technologies
CN110878334A (en) * 2019-11-12 2020-03-13 北京康普森生物技术有限公司 Primer for sequencing amplicon and two-step PCR library building method

Also Published As

Publication number Publication date
CN112626223A (en) 2021-04-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant