WO2012000152A1 - Pcr-sequencing method based on technology of dna molecular index and strategy of dna-breaking incompletely - Google Patents

Info

Publication number
WO2012000152A1
Authority
WO
WIPO (PCT)
Prior art keywords
pairs
primer
pcr
hla
dna
Prior art date
Application number
PCT/CN2010/001834
Other languages
French (fr)
Chinese (zh)
Inventor
李剑
刘涛
赵美茹
张现东
Original Assignee
深圳华大基因科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大基因科技有限公司
Publication of WO2012000152A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing

Definitions

  • The invention relates to the field of nucleic acid sequencing, in particular PCR sequencing. In another aspect, the invention also relates to PCR-index/barcode technology.
  • The method of the present invention is particularly applicable to second-generation sequencing, especially paired-end (Pair-End) sequencing, and can also be used for HLA genotyping.
  • PCR sequencing is a technique in which a DNA fragment of a target gene is obtained by PCR and then sequenced to obtain the DNA sequence information of the target gene. Being intuitive and accurate, it has long been widely used in gene mutation detection and genotyping.
  • In PCR-index/barcode technology, a primer index (primer tag) sequence is added at the 5' end of each PCR primer so that a unique primer tag is introduced into each sample during PCR. When the samples are analyzed by second-generation DNA sequencing (large-scale, high-throughput, single-molecule sequencers represented by Illumina GA, Roche 454 and ABI SOLiD), only the PCR step has to be carried out sample by sample; in all other experimental steps multiple samples can be pooled and processed together, and the results for each sample are finally retrieved through its unique primer tag sequence. The method is low in cost, high in throughput, and can examine multiple different gene loci in a large number of samples simultaneously.
  • Compared with first-generation (Sanger) DNA sequencing, second-generation DNA sequencing has a shorter continuous read length.
  • Take Illumina GA (Illumina's Genome Analyzer sequencer) as an example.
  • The current maximum read length of Illumina GA is 200 bp.
  • When a PCR product is longer than 200 bp, direct PCR sequencing on Illumina GA cannot read through the entire PCR product, so the complete DNA sequence information of the PCR product under test cannot be obtained.
  • Short read lengths limit the application of second-generation sequencing. Besides gradually improving the sequencing technology itself to obtain longer reads, developing new techniques that overcome the read-length limitation of existing second-generation sequencers in PCR sequencing applications has become an urgent priority.
  • Illumina GA has ultra-high sequencing throughput, but its read length is only 200 bp. The Roche 454 GS-FLX can reach 500 bp, but its sequencing cost is higher and its throughput lower. First-generation sequencers can read more than 1000 bp, but their throughput and cost cannot compete with second-generation sequencers.
  • The second-generation sequencing techniques used in the invention include paired-end sequencing, and PCR sequencing of PCR templates for which a DNA reference sequence is available.
  • The present invention provides a method for PCR sequencing that reduces the limitation imposed by the short read length of the sequencer itself and broadens the use of second-generation DNA sequencing in PCR sequencing applications.
  • The design of the primer tags depends on the experimental platform used. Given the characteristics of the Illumina GA sequencing platform, the following points were considered when designing the primer tags: (1) avoid runs of 3 or more identical bases in the tag sequence; (2) at each position across all primer tags, the combined content of bases A and C is between 30% and 70% of all bases; (3) the GC content of each tag sequence is between 40% and 60%; (4) the tags differ from one another by more than 4 bases; (5) avoid sequences highly similar to the Illumina GA sequencing primers; (6) minimize severe hairpin and dimer formation in the PCR primers once the tag has been added to them.
  • The present invention introduces a primer tag at each end of the PCR product through the PCR reaction (the two tag sequences may be the same or different), so that the primer tag at either end of the PCR product specifically identifies the sample the product came from.
  • The resulting PCR products are then subjected to incomplete fragmentation.
  • Incomplete fragmentation means that the final product contains both intact, unbroken PCR products and partially broken PCR products.
  • the breaking method includes, but is not limited to, a chemical breaking method (for example, enzymatic cutting) and a physical breaking method, and the physical breaking method includes an ultrasonic breaking method or a mechanical breaking method.
  • The fragmented DNA is electrophoresed on a 2% agarose gel, and all DNA bands between the maximum read length of the sequencer and the longest DNA length the sequencer can handle are recovered by gel purification (for the Illumina GA sequencer the longest usable DNA is 700 bp; this is the original DNA length and does not include the library adapters). Purification and recovery methods include, but are not limited to, electrophoresis with gel excision and magnetic bead recovery. The recovered DNA fragments are then used to build a sequencing library following the library construction workflow of a second-generation sequencer and are sequenced; the library is preferably built with a PCR-Free workflow and preferably sequenced by the Pair-End method.
  • Library adapter tag (adapter index) technology means adding different library adapters to multiple sequencing libraries (the adapters differ in sequence, and the part that differs is called the adapter index) to build tagged sequencing libraries, so that several differently tagged libraries can be pooled and sequenced together and the sequencing results of each tagged library can finally be distinguished from one another.
  • PCR-Free library construction combined with library adapter tag technology means that the library adapters are ligated directly to both ends of the DNA fragments in the sequencing library; because no PCR is involved in introducing the adapters, this is called PCR-Free library construction.
  • The adapters can be attached using DNA ligase. No PCR takes place anywhere in library construction, which avoids PCR-introduced errors that would make the final result inaccurate when building a pooled library of PCR products with high sequence similarity.
  • With this approach, when second-generation sequencing is applied to PCR sequencing, the length of PCR product that can actually be sequenced exceeds the maximum read length of the sequencer.
  • a set of primer tags (primer indices) comprising at least 10, or at least 20, 30, 40, 50, 60, 70, 80 or 90 pairs, or all 95 pairs, of the 95 pairs of primer tags shown in Table 1 (or the set consists of 10-95 pairs of the 95 pairs shown in Table 1, e.g. 10-95, 20-95, 30-95, 40-95, 50-95, 60-95, 70-95, 80-95 or 90-95 pairs, or 95 pairs), and
  • the set of primer tags preferably comprises at least PI-1 to PI-10, or PI-11 to PI-20, or PI-21 to PI-30, or PI-31 in the 95 pairs of primer tags shown in Table 1.
  • each pair of primer tags is combined with a PCR primer pair for amplifying a sequence of interest to form a pair of tag primers,
  • in which the 5' ends of the forward and reverse PCR primers carry (optionally via a linker sequence) a forward primer tag and a reverse primer tag, respectively.
  • the PCR primers are PCR primers for amplifying a specific HLA gene, preferably PCR primers for amplifying exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1; preferably the PCR primers are those shown in Table 2.
  • a set of tag primers comprising the set of primer tags described above and a PCR primer pair for amplifying a sequence of interest, wherein each pair of primer tags is combined with a PCR primer pair,
  • and the 5' ends of the forward and reverse PCR primers carry (optionally via a linker sequence) a forward primer tag and a reverse primer tag, respectively.
  • the PCR primers in the tag primers described above are PCR primers for amplifying a specific HLA gene, preferably PCR primers for amplifying exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1; preferably the PCR primers are those shown in Table 2.
  • the tag primers are used in a PCR sequencing method.
  • a method of determining a nucleotide sequence of a nucleic acid of interest in a sample comprising:
  • providing n samples, n being an integer greater than or equal to 1, the samples preferably being from a mammal, more preferably a human, in particular human blood samples; optionally, dividing the n samples to be analyzed into m groups, m being an integer and n > m > 1;
  • for each sample, a pair of tag primers is used and, in the presence of template from that sample, PCR amplification is carried out under conditions suitable for amplifying the nucleic acid of interest, wherein each pair of tag primers consists of a forward tag primer and a reverse tag primer, each comprising a primer tag (both may be degenerate primers);
  • the forward tag primer and the reverse tag primer may contain the same or different primer tags, and the primer tags in the tag primer pairs used for different samples differ from one another;
  • the recovered DNA mixture is sequenced using a second-generation sequencing technique, preferably a Pair-End technique (e.g., Illumina GA, Illumina Hiseq 2000), to obtain the sequences of the fragmented DNA;
  • the sequencing results obtained are assigned to their samples; an alignment program (such as Blast or BWA) is used to locate each read on the corresponding DNA reference sequence of the PCR product, and the complete nucleic acid of interest is assembled from the sequences of the fragmented DNA through sequence overlap and linkage relationships.
  • a method of determining a nucleotide sequence of a nucleic acid of interest in a sample comprising:
  • providing n samples, n being an integer greater than or equal to 1, the samples preferably being from a human, in particular human blood samples; optionally, dividing the n samples to be analyzed into m groups, m being an integer and n > m > 1;
  • for each sample, a pair of tag primers is used and, in the presence of template from that sample, PCR is performed under conditions suitable for amplifying the nucleic acid of interest, wherein each pair of tag primers consists of a forward tag primer and a reverse tag primer, each comprising a primer tag (both may be degenerate primers);
  • the forward tag primer and the reverse tag primer may contain the same or different primer tags, and the primer tags in the tag primer pairs used for different samples differ from one another;
  • sequencing: the recovered DNA mixture is sequenced using a second-generation sequencing technique, preferably a Pair-End technique (e.g., Illumina GA, Illumina Hiseq 2000), to obtain the sequences of the fragmented DNA;
  • each pair of primer tags and a PCR primer pair are combined into a pair of tag primers, in which the 5' ends of the forward and reverse PCR primers carry (optionally via a linker sequence) a forward primer tag and a reverse primer tag, respectively.
  • the PCR primers are PCR primers for amplifying a specific HLA gene, preferably PCR primers for amplifying exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1; preferably the PCR primers are those shown in Table 2.
  • the primer tag is designed for PCR primers, preferably for PCR primers for amplifying a specific gene of HLA, more preferably for use.
  • the set of primer tags preferably comprises at least PI-1 to PI-10, or PI-11 to PI-20, or PI-21 to PI-30, or PI-31 to PI-40, or PI-41 to PI-50, or PI-51 to PI-60, or PI-61 to PI-70, or PI-71 to PI-80, or PI-81 to PI-90, or PI-91 to PI-95 of the 95 pairs of primer tags shown in Table 1, or any combination of two or more of these.
  • DNA fragmentation methods include chemical and physical methods, wherein the chemical methods include enzymatic digestion and the physical methods include ultrasonic or mechanical fragmentation. After the DNA is fragmented, fragments of 450-750 bp in length are isolated.
  • an HLA typing method comprising: sequencing a sample (especially a blood sample) from a patient using the sequencing method described above, and aligning the sequencing result with the standard sequence data in an HLA database (e.g., the IMGT HLA professional database); a sequence alignment result with a 100% match gives the HLA genotype of the corresponding sample.
  • Figure 1: Primer tagging, DNA fragmentation, and DNA sequencing.
  • Forward and reverse primer tag sequences Index-NF/R are introduced at the two ends of the PCR product of sample No. N (1).
  • Physical fragmentation of the PCR products yields products with a primer tag sequence at one end, products with no primer tag sequence at either end, and completely unbroken products. All DNA bands between the maximum read length of the sequencer and the longest DNA length the sequencer can handle are recovered by gel excision and purification and then sequenced (2). Index-NF/R is used to retrieve, from the sequencing results, the reads belonging to the PCR product of sample No. N; the known reference sequence of the PCR product is used to locate each read relative to the reference, and the reads are assembled into the complete PCR product according to the overlap and linkage relationships between them (3, 4).
  • Figure 2: Electrophoresis of the PCR products of the corresponding HLA-A/B/DRB1 exons of sample No. 1. The electropherogram shows a series of single bands of 300-500 bp. Lane M is the molecular weight marker (DL 2000, Takara); lanes 1-7 are the PCR amplification products of the HLA-A/B/DRB1 exons of sample No. 1 (A2, A3, A4, B2, B3, B4, DRB1-2); the negative control (N) shows no amplified band. The results for the other samples are similar.
  • Figure 3: DNA electrophoresis after fragmentation of HLA-Mix (before and after gel excision); the excised region is 450-750 bp.
  • Lane M is the molecular weight marker (NEB 50 bp DNA Ladder),
  • lane 1 is HLA-Mix before gel excision,
  • lane 2 is the HLA-Mix gel after excision.
  • Figure 4: Screenshot of the consensus sequence construction program for sample No. 1, illustrating how the complete sequence of the PCR product is assembled from the primer tags and the overlap between DNA fragments.
  • Primer tags are introduced by PCR at both ends of the PCR products of exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1 of the samples to be analyzed, so that each PCR product specifically carries its sample information.
  • The PCR amplification products of the three loci HLA-A/B/DRB1 of each sample were mixed together to obtain a PCR product pool; the pool was incompletely fragmented by ultrasound, and a PCR-Free sequencing library was constructed.
  • The sequencing library was electrophoresed on 2% low-melting-point agarose, and all DNA bands between 450 bp and 750 bp in length were recovered by gel purification. (Library adapters are added to both ends of the DNA fragments during PCR-Free library construction, so a fragment appears about 250 bp longer on the gel than its actual length; recovering the 450 bp to 750 bp region therefore corresponds to recovering original DNA fragments of 200 bp to 500 bp.)
  • The recovered DNA was sequenced by Illumina GA PE-100.
  • The reads of every tested sample can be retrieved through the primer tag sequences, and the sequence of each entire PCR product is assembled using the known reference sequence of the DNA fragment together with the overlap and linkage relationships between reads. Aligning the assembled sequences of the original PCR products with the standard database of the corresponding HLA-A/B/DRB1 exons then gives the HLA-A/B/DRB1 genotype.
  • DNA was extracted with the KingFisher automatic extractor (Thermo, USA) from 95 blood samples of known HLA-SBT typing (from the Chinese Hematopoietic Stem Cell Donor Database, hereinafter "Zhonghua Marrow Bank").
  • The main steps are as follows: take out the 6 deep-well plates and 1 shallow-well plate supplied with the Kingfisher automatic extractor, add the specified amounts of the matching reagents according to the instructions, and label the plates. Place all reagent-loaded plates in the required positions, select the program "Bioeasy a 200ul Blood DNA-KF.msz", and press "start" to run the nucleic acid extraction program. When the program ends, collect approximately 100 µL of eluate from the Elution plate as the extracted DNA.
  • Example 2
  • The PCR primers are PCR primers for exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1.
  • A primer tag is introduced at both ends of the PCR product by the PCR reaction to specifically label PCR products from different samples.
  • Each set of PCR tag primers consists of a pair of bidirectional primer tags (Table 1) and the PCR primers (Table 2) for amplifying exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1, wherein the 5' end of each forward PCR primer is joined to the forward primer tag of a primer tag pair and the 5' end of each reverse PCR primer is joined to the reverse primer tag of that pair.
  • The primer tag is added directly to the 5' end of the PCR primer when the primer is synthesized.
  • the 95 DNAs obtained in the sample extraction step of Example 1 were sequentially numbered 1-95, and the PCR reaction was carried out in a 96-well plate, a total of 7 plates, numbered HLA-P-A2, HLA-P-A3, HLA, respectively.
  • a negative control without template was set in the plate, and the primer used in the negative control was the same as the corresponding primer of template 1.
  • D2-F1, D2-F2, D2-F3, D2-F4, D2-F5, D2-F6 and D2-F7 are forward primers for amplifying exon 2 of HLA-DRB1;
  • D2-R is the reverse primer for amplifying exon 2 of HLA-DRB1.
  • the PCR reaction system of HLA-DRB1 is as follows:
  • PI nf A/B/D2-F 1/2/3/4/5/6/7 denotes an F primer of HLA-A/B/DRB1 carrying the nth forward primer tag sequence (Table 1) at its 5' end, and PI nr A/B/D2-R 2/3/4 denotes an R primer of HLA-A/B/DRB1 carrying the nth reverse primer tag sequence at its 5' end (here n ≤ 95); the others follow by analogy.
  • Each sample corresponds to a specific set of PCR primers (PI nf A/B/D2-F 1/2/3/4/5/6/7, PI nr A/B/D2-R 2/3/4).
  • Figure 2 shows the electrophoresis results of the PCR products of the corresponding HLA-A/B/DRB1 exons of sample No. 1.
  • The DNA molecular weight marker is DL 2000 (Takara).
  • The gel shows a series of single bands of 300-500 bp, indicating that the HLA-A/B/DRB1 exons (A2, A3, A4, B2, B3, B4, DRB1-2) of sample No. 1 were successfully amplified by PCR; the negative control (N) shows no amplified band.
  • The results for the other samples are similar.
  • Example 3
  • … labeled HLA-A3-Mix, HLA-A4-Mix, HLA-B2-Mix, HLA-B3-Mix, HLA-B4-Mix and HLA-D2-Mix, and mixed by shaking. 200 ul was taken from each of HLA-A2-Mix, HLA-A3-Mix, HLA-A4-Mix, HLA-B2-Mix, HLA-B3-Mix, HLA-B4-Mix and HLA-D2-Mix and combined in a 3 ml EP tube, labeled HLA-Mix. 500 ul of the DNA mixture from HLA-Mix was purified with the Qiagen DNA Purification kit (QIAGEN; see the kit instructions for details), giving 200 ul of purified DNA, and the concentration of HLA-Mix DNA was determined to be 48 ng/ul with a Nanodrop 8000 (Thermo Fisher Scientific).
  • T4 DNA Polymerase: 5 µl
  • the reaction conditions are: Thermomixer (Thermomixer, Eppendorf)
  • The reaction product was recovered with the QIAquick PCR Purification Kit and dissolved in 34 µl of EB (QIAGEN Elution Buffer).
  • An A-tailing reaction (adding an A to the fragment ends) was performed on the DNA recovered in the previous step.
  • The reaction system was as follows (reagents were purchased from Enzymatics):
  • The reaction conditions were: incubation at 37 °C for 30 min in a constant-temperature mixer (Thermomixer, Eppendorf).
  • The reaction product was recovered and purified with the MiniElute PCR Purification Kit (QIAGEN) and dissolved in 13 µl of EB (QIAGEN Elution Buffer).
  • A PCR-Free library adapter is a designed stretch of bases whose primary function is to help immobilize DNA molecules on the sequencing chip and to provide a binding site for the universal sequencing primers.
  • PCR-Free library adapters can be ligated directly to both ends of the DNA fragments in the sequencing library by DNA ligase; because no PCR is involved in introducing them, they are called PCR-Free library adapters.
  • PCR-free oligonucleotide linker mix (30mM) (PCR-free Adapter ⁇
  • Incubation at 20 °C for 15 min in a constant-temperature mixer (Thermomixer, Eppendorf).
  • the reaction product was purified by Ampure Beads (Beckman Coulter Genomics) and dissolved in 50 ul of deionized water.
  • the DNA concentration was determined by real-time PCR (QPCR) as follows:
  • The sequencing output of Illumina GA is a set of DNA reads. By searching the reads for the forward and reverse primer tag sequences and the primer sequences, a database of reads is established for the PCR product of each HLA-A/B/DRB1 exon corresponding to each primer tag. The reads of each exon are mapped to the reference sequence of the corresponding exon with BWA (Burrows-Wheeler Aligner) (reference sequence source: http://www.ebi.ac.uk/imgt/hla/), and a consensus sequence is built for each database;
  • using the overlap and Pair-End linkage relationships between reads, the reads can be assembled into the corresponding sequence of each exon of HLA-A/B/DRB1.
  • The DNA sequences obtained were aligned with the sequence database of the corresponding HLA-A/B/DRB1 exons in the IMGT HLA professional database; a sequence alignment result with a 100% match gives the HLA-A/B/DRB1 genotype of the corresponding sample.
  • Figure 4 shows a screenshot of the exon 2 consensus sequence construction for the HLA-A locus of sample No. 1. For all 95 samples, the typing results obtained were fully consistent with the originally known typing results.
  • the specific results of the samples No. 1-32 are as follows:

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention provides a sequencing method for nucleic acid, which method comprises: amplifying the target nucleic acid using indexed primers, breaking the amplicons incompletely and purifying and recovering mixed DNAs, sequencing the mixed DNAs by a next-generation sequencing method to obtain the sequences of the broken DNAs, and assembling the sequences of the broken DNAs into a complete sequence for the target nucleic acid. The present invention further provides primer indexes for the sequencing method.

Description

A PCR sequencing method based on DNA molecular tag technology and an incomplete DNA fragmentation strategy
Technical field
The invention relates to the field of nucleic acid sequencing, in particular PCR sequencing. In another aspect, the invention also relates to PCR-index/barcode technology. The method of the invention is particularly suitable for second-generation sequencing, especially paired-end (Pair-End) sequencing, and can also be used for HLA genotyping.
Background art
PCR sequencing is a technique in which a DNA fragment of a target gene is obtained by PCR and then sequenced to obtain the DNA sequence information of the target gene. Being intuitive and accurate, it has long been widely used in gene mutation detection and genotyping.
In PCR-index/barcode technology, a primer index (primer tag) sequence is added at the 5' end of each PCR primer so that a unique primer tag is introduced into each sample during PCR. When the samples are analyzed by second-generation DNA sequencing (large-scale, high-throughput, single-molecule sequencers represented by Illumina GA, Roche 454 and ABI SOLiD), only the PCR step has to be carried out sample by sample; in all other experimental steps multiple samples can be pooled and processed together, and the results for each sample are finally retrieved through its unique primer tag sequence. The method is low in cost, high in throughput, and can examine multiple different gene loci in a large number of samples simultaneously.
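The retrieval of per-sample results by primer tag can be illustrated with a minimal Python sketch. It assumes exact tag matches at the start of each read and tags of equal length; the tag sequences, sample names, and the function `demultiplex` are hypothetical and are not the tags of Table 1. Mismatch tolerance and paired-end mate handling are deliberately left out.

```python
from collections import defaultdict

# Hypothetical tag table: sample id -> (forward tag, reverse tag).
# These sequences are invented for illustration only.
TAGS = {
    "sample_1": ("TCGCAGAC", "TGACTACG"),
    "sample_2": ("ATGCTGCA", "CAGTGATC"),
}

def demultiplex(reads, tags):
    """Assign each read to the sample whose forward or reverse tag it starts with."""
    by_prefix = {}
    for sample, (fwd, rev) in tags.items():
        by_prefix[fwd] = sample
        by_prefix[rev] = sample
    tag_len = len(next(iter(by_prefix)))  # assumes all tags share one length
    bins = defaultdict(list)
    for read in reads:
        # Reads that start with no known tag are kept in a separate bin.
        sample = by_prefix.get(read[:tag_len], "unassigned")
        bins[sample].append(read)
    return dict(bins)

reads = [
    "TCGCAGACGGATTACAGG",  # starts with the forward tag of sample_1
    "CAGTGATCTTGACCA",     # starts with the reverse tag of sample_2
    "GGGGGGGGACGT",        # carries no known tag
]
result = demultiplex(reads, TAGS)
```

A dictionary lookup on the read prefix keeps the assignment linear in the number of reads, which matters when many samples are pooled in one run.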
Compared with first-generation (Sanger) DNA sequencing, second-generation DNA sequencing has a shorter continuous read length. Taking Illumina GA (Illumina's Genome Analyzer sequencer) as an example, its current maximum read length is 200 bp; when a PCR product is longer than 200 bp, direct PCR sequencing on Illumina GA cannot read through the whole product, so the complete DNA sequence information of the PCR product under test cannot be obtained. Short read lengths limit the application of second-generation sequencing. Besides gradually improving the sequencing technology itself to obtain longer reads, developing new techniques that overcome the read-length limitation of existing second-generation sequencers in PCR sequencing applications has become an urgent priority.
Summary of the invention
At present, when second-generation sequencing is used to sequence specific gene-related sequences in a large number of samples simultaneously, a PCR sequencing strategy is generally adopted that directly combines primer tags with second-generation sequencing. When the read length of the sequencer covers the full length of the PCR product, this is sufficient. When it does not, one option is to switch to a second-generation sequencer with a longer read length (for example, replacing Illumina GA with Roche 454 GS-FLX); if the read length is still insufficient, the only remaining option is to sacrifice cost and throughput and fall back on a first-generation sequencer.
现实情况是, Illumina GA具有超高测序通量, 但其测长才 200bp; Roche 454 GS-FLX的测长虽然可以达到 500bp, 但其测 序成本较高, 通量较小; 第一代测序仪的测长虽然可以达到 lOOObp以上, 但其通量和成本没法和第二代测序仪相比。  The reality is that Illumina GA has ultra-high sequencing throughput, but its measurement length is only 200bp; although the length of Roche 454 GS-FLX can reach 500bp, its sequencing cost is higher and the throughput is smaller; the first generation of sequencer Although the length measurement can reach more than lOObp, its throughput and cost cannot be compared with the second generation sequencer.
有没有能够兼顾成本和通量, 能提高测序仪可以测通的 PCR 产物长度的技术呢? 本专利中的引物标签 + DNA不完全打断策略 + 二代测序技术的组合能在充分利用第二代测序仪高通量、 低成 本特点的同时,使测序仪能够测通的 PCR产物长度达到测序仪本 身测长以上, 大大扩大了其适用范围。 其中, 本发明的所采用的 二代测序技术包括第二代测序技术中的 Pair-end测序技术, 以及 针对 PCR模板有 DNA参考序列 (DNA Reference Sequence ) 的 PCR测序技术。 Is there a technology that combines cost and throughput to increase the length of PCR products that a sequencer can measure? The combination of primer label + DNA incomplete disruption strategy and second-generation sequencing technology in this patent can make the length of the PCR product that the sequencer can measure while making full use of the high-throughput and low-cost characteristics of the second-generation sequencer. The sequencer itself measures more than the length, greatly expanding its scope of application. Wherein the use of the present invention The second-generation sequencing technology includes the Pair-end sequencing technology in the second-generation sequencing technology, and the PCR sequencing technology for the PCR template with the DNA Reference Sequence.
本发明提供了用于 PCR测序的方法,通过所述方法减少了自 身测序读长较短造成的限制,扩大了第二代 DNA测序技术在 PCR 测序应用领域的应用。  The present invention provides a method for PCR sequencing, which reduces the limitation caused by the short read length of the self-sequencing and expands the application of the second generation DNA sequencing technology in the field of PCR sequencing applications.
The design of primer indexes depends on the experimental platform used. Considering the characteristics of the Illumina GA platform, the following points were taken into account when designing the primer indexes: (1) runs of three or more identical bases are avoided within an index sequence; (2) at each position across all primer indexes, bases A and C together account for 30%-70% of all bases; (3) the GC content of each index sequence is 40%-60%; (4) any two primer indexes differ at more than 4 bases; (5) sequences highly similar to the Illumina GA sequencing primers are avoided in the index sequences; (6) severe hairpin and dimer formation caused by adding the index sequence to a PCR primer is minimized.
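The sequence-level design rules above lend themselves to an automated check. The following is a minimal sketch in Python covering rules 1 to 4 only; rules 5 and 6 (similarity to sequencing primers, hairpin and dimer formation) need external sequences or thermodynamic models and are not attempted. The function names and the example sequences in the usage note are hypothetical and are not taken from Table 1.

```python
from itertools import combinations


def has_homopolymer(seq, run=3):
    """Return True if seq contains `run` or more identical consecutive bases (rule 1)."""
    count = 1
    for prev, cur in zip(seq, seq[1:]):
        count = count + 1 if cur == prev else 1
        if count >= run:
            return True
    return False


def gc_fraction(seq):
    """Fraction of G and C bases in seq (rule 3)."""
    return sum(base in "GC" for base in seq) / len(seq)


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def check_index(seq):
    """Per-sequence rules: no run of 3+ identical bases, GC content 40%-60%."""
    return (not has_homopolymer(seq)) and 0.40 <= gc_fraction(seq) <= 0.60


def check_index_set(indexes, min_diff=5):
    """Set-level rules for equal-length indexes.

    Rule 4: every pair differs at more than 4 bases (at least `min_diff`).
    Rule 2: at every position, A + C make up 30%-70% of the bases.
    """
    for a, b in combinations(indexes, 2):
        if hamming(a, b) < min_diff:
            return False
    for column in zip(*indexes):
        ac = sum(base in "AC" for base in column) / len(column)
        if not 0.30 <= ac <= 0.70:
            return False
    return True
```

As a usage example, the two hypothetical 12-base indexes `"ACGTACGTACGT"` and `"TGCATGCATGCA"` each pass `check_index`, and together they pass `check_index_set`, because they differ at every position and each position holds exactly one A or C.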
In the invention, a primer index is introduced at each end of the PCR product by the PCR reaction (the two index sequences may be the same or different), so that the index at either end specifically marks the sample from which the PCR product originated. The resulting PCR products are then subjected to incomplete fragmentation, meaning that the fragmentation products include both intact, unfragmented PCR products and partially fragmented PCR products. Fragmentation methods include, but are not limited to, chemical methods (for example, enzymatic digestion) and physical methods such as sonication or mechanical shearing. The fragmented DNA is run on a 2% agarose gel, and all DNA bands between the maximum read length of the sequencer and the longest DNA length the sequencer can accept are excised and purified (for the Illumina GA the longest acceptable DNA is 700 bp; this is the length of the original DNA, excluding the library adapter sequences). Recovery is not limited to gel excision; magnetic bead recovery, for example, may also be used. A sequencing library is then constructed from the recovered DNA fragments following the library construction procedure of the second-generation sequencer, preferably a PCR-free procedure, and sequenced, preferably by the paired-end method. PCR-free library construction follows methods known to those skilled in the art. In the resulting data, the reads belonging to each sample are identified by the primer index sequences; each read is mapped with BWA to the corresponding DNA reference sequence of the PCR product, and the complete sequence of the PCR product is assembled from the overlaps and linkage between reads (Figure 1). Linkage here means the pairing relationship between the two reads of a pair, which is inherent to paired-end sequencing.
"Adapter" or "library adapter" indexing is a library-labeling technique in which a different library adapter is added to each of several sequencing libraries (the adapters differ in sequence, and the part that differs is called the adapter index) to build indexed sequencing libraries. Multiple indexed libraries can then be pooled and sequenced together, and the sequencing results of each library can still be told apart.
Used together with the adapter indexing described above, the PCR sequencing method based on DNA molecular indexing and incomplete DNA fragmentation can greatly increase the number of samples that can be uniquely labeled without increasing the number of primer indexes.
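The gain in labeling capacity is a simple product: each sample is identified by the combination of the library adapter of its pool and its primer index pair. A minimal sketch, assuming 95 primer index pairs (as in Table 1) and a hypothetical count of 4 adapter-indexed libraries; the label names are placeholders:

```python
from itertools import product


def sample_labels(adapter_indexes, primer_index_pairs):
    """Every (adapter index, primer index pair) combination is one unique sample label."""
    return list(product(adapter_indexes, primer_index_pairs))


primer_pairs = [f"PI-{i}" for i in range(1, 96)]    # 95 primer index pairs
adapters = [f"adapter-{j}" for j in range(1, 5)]    # 4 libraries (hypothetical number)

labels = sample_labels(adapters, primer_pairs)
print(len(labels))  # 380 uniquely labeled samples from the same 95 primer index pairs
```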
PCR-free library construction combined with adapter indexing means that the library adapters are ligated directly to both ends of the DNA fragments in the sequencing library, for example with DNA ligase. Because no PCR is involved in introducing the adapters, or anywhere else in library construction, the procedure is called PCR-free library construction. This avoids PCR-introduced errors that would otherwise make the final results inaccurate when a library is built from a pool of PCR products with high sequence similarity.
In the invention, adding primer indexes to the forward and reverse PCR primers, combined with incomplete DNA fragmentation, allows second-generation sequencing applied to PCR sequencing to read PCR products longer than the maximum read length of the sequencer.
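Why incomplete fragmentation extends the readable length can be illustrated with a small coverage calculation. The sketch below is only a toy model: the amplicon length, read length, and fragment end points are hypothetical, and every fragment is assumed to yield one read from each end, as in paired-end sequencing.

```python
def covered_positions(fragments, read_len):
    """Positions of the amplicon covered by paired-end reads.

    Each fragment is a half-open (start, end) interval on the amplicon;
    one read of up to `read_len` bases is taken from each end of the fragment.
    """
    covered = set()
    for start, end in fragments:
        covered.update(range(start, min(start + read_len, end)))
        covered.update(range(max(end - read_len, start), end))
    return covered


AMPLICON_LEN = 500   # hypothetical PCR product length (bp)
READ_LEN = 100       # hypothetical read length (bp)

# Intact product only: the two reads reach 100 bp in from each end.
intact = [(0, AMPLICON_LEN)]

# Incomplete fragmentation: the intact product plus fragments that keep one
# index-bearing end and were cut at various interior points.
partial = (intact
           + [(0, end) for end in range(200, 500, 50)]
           + [(start, AMPLICON_LEN) for start in range(50, 350, 50)])

print(len(covered_positions(intact, READ_LEN)))    # 200
print(len(covered_positions(partial, READ_LEN)))   # 500
```

In this model the read at the index-bearing end ties the pair to its sample, while its mate lands at the interior cut point, so the set of fragments as a whole covers the full amplicon.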
In one aspect, the invention provides a set of primer indexes comprising at least 10, or at least 20, or at least 30, or at least 40, or at least 50, or at least 60, or at least 70, or at least 80, or at least 90, or all 95 of the 95 pairs of primer indexes shown in Table 1 (or the set consists of 10-95 of the 95 pairs shown in Table 1, for example 10-95, 20-95, 30-95, 40-95, 50-95, 60-95, 70-95, 80-95, 90-95, or 95 pairs), and
the set of primer indexes preferably comprises at least PI-1 to PI-10, or PI-11 to PI-20, or PI-21 to PI-30, or PI-31 to PI-40, or PI-41 to PI-50, or PI-51 to PI-60, or PI-61 to PI-70, or PI-71 to PI-80, or PI-81 to PI-90, or PI-91 to PI-95 of the 95 pairs shown in Table 1, or any combination of two or more of these groups.
In another aspect, the invention provides the use of the primer indexes in a PCR sequencing method, wherein, in particular, each pair of primer indexes is combined with a PCR primer pair for amplifying the target sequence to form a pair of index primers: the forward and reverse PCR primers carry at their 5' ends, directly or optionally via a linker sequence, the forward primer index and the reverse primer index, respectively.
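The construction of an index primer is a plain concatenation in the 5' to 3' direction: index, optional linker, then the PCR primer. A minimal sketch follows; the function names and all sequences in the usage note are hypothetical placeholders, not sequences from Table 1 or Table 2. IUPAC degenerate base codes are accepted because the index primers may be degenerate.

```python
IUPAC = set("ACGTRYSWKMBDHVN")  # standard bases plus degenerate base codes


def make_index_primer(index, pcr_primer, linker=""):
    """Return 5'-index-(linker)-PCR primer-3' as one uppercase sequence."""
    parts = {"index": index, "linker": linker, "pcr_primer": pcr_primer}
    for name, seq in parts.items():
        bad = set(seq.upper()) - IUPAC
        if bad:
            raise ValueError(f"{name} contains non-nucleotide characters: {sorted(bad)}")
    return (index + linker + pcr_primer).upper()


def make_index_primer_pair(fwd_index, rev_index, fwd_primer, rev_primer, linker=""):
    """Attach the forward index to the forward primer and the reverse index to the reverse primer."""
    return (make_index_primer(fwd_index, fwd_primer, linker),
            make_index_primer(rev_index, rev_primer, linker))
```

For example, `make_index_primer("ACGTACGTACGT", "GGTCCTAGGA")` returns `"ACGTACGTACGTGGTCCTAGGA"`, with the index at the 5' end so that it becomes the first part of any read that starts at that end of the PCR product.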
In a specific embodiment of the invention, the PCR primers are primers for amplifying specific HLA genes, preferably primers for amplifying exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1; preferred PCR primers are shown in Table 2.
In another aspect, the invention provides a set of index primers formed by combining the set of primer indexes described above with PCR primer pairs for amplifying the target sequence, wherein each pair of primer indexes is combined with a PCR primer pair to form a pair of index primers, and the forward and reverse PCR primers carry at their 5' ends, directly or optionally via a linker sequence, the forward primer index and the reverse primer index, respectively.
In a specific embodiment of the invention, the PCR primers in the index primers described above are primers for amplifying specific HLA genes, preferably primers for amplifying exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1; preferred PCR primers are shown in Table 2. In another specific embodiment, the index primers are used in a PCR sequencing method.
In another aspect, the invention provides a method for determining the nucleotide sequence of a target nucleic acid in a sample, comprising:
1) Providing n samples, where n is an integer greater than or equal to 1; the samples are preferably from a mammal, more preferably a human, and in particular are human blood samples. Optionally, the n samples to be analyzed are divided into m groups, where m is an integer and n ≥ m ≥ 1.
2) Amplification: for each sample, PCR amplification is performed with a pair of index primers in the presence of template from that sample, under conditions suitable for amplifying the target nucleic acid. Each pair of index primers consists of a forward index primer and a reverse index primer (either may be degenerate), each containing a primer index; the primer indexes in the forward and reverse index primers may be the same or different, and the primer indexes in the index primer pairs used for different samples differ from one another.
3) Fragmentation: the resulting PCR product library is incompletely fragmented.
4) Sequencing: the recovered DNA mixture is sequenced by a second-generation technique, preferably a paired-end technique (for example, on the Illumina GA or Illumina HiSeq 2000), to obtain the sequences of the fragmented DNA.
5) Assembly: the sequencing results are assigned one-to-one to their samples using each sample's unique primer indexes; each read is mapped to the corresponding DNA reference sequence of the PCR product with an alignment program (for example, BLAST or BWA); and the complete target nucleic acid is assembled from the sequences of the fragmented DNA using the overlaps and linkage between reads.
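The first part of the assembly step, assigning reads to samples by primer index, can be sketched as follows. This is an illustrative simplification: it uses exact prefix matching with no tolerance for sequencing errors, represents a read pair as two plain strings, and all names and sequences are hypothetical. A pair is assigned when either mate starts with an index of exactly one sample; pairs from fragments that carry no index at either end cannot be assigned and are dropped.

```python
def assign_sample(read1, read2, index_to_sample):
    """Return the sample whose primer index starts either mate, else None."""
    hits = set()
    for read in (read1, read2):
        for index, sample in index_to_sample.items():
            if read.startswith(index):
                hits.add(sample)
    if len(hits) == 1:
        return hits.pop()
    return None  # no index found, or indexes of different samples on the two mates


def demultiplex(read_pairs, index_to_sample):
    """Group read pairs by the sample their primer index points to."""
    by_sample = {}
    for read1, read2 in read_pairs:
        sample = assign_sample(read1, read2, index_to_sample)
        if sample is not None:
            by_sample.setdefault(sample, []).append((read1, read2))
    return by_sample
```

Both the forward and the reverse index of a sample would be entered in `index_to_sample` under the same sample name, so a fragment that kept either end of the PCR product can still be traced back to its sample.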
In another aspect, the invention provides a method for determining the nucleotide sequence of a target nucleic acid in a sample, comprising:
1) Providing n samples, where n is an integer greater than or equal to 1; the samples are preferably from a human, and in particular are human blood samples. Optionally, the n samples to be analyzed are divided into m groups, where m is an integer and n ≥ m ≥ 1.
2) Amplification: for each sample, PCR amplification is performed with a pair of index primers in the presence of template from that sample, under conditions suitable for amplifying the target nucleic acid. Each pair of index primers consists of a forward index primer and a reverse index primer (either may be degenerate), each containing a primer index; the primer indexes in the forward and reverse index primers may be the same or different, and the primer indexes in the index primer pairs used for different samples differ from one another.
3) Pooling: the PCR amplification products of the samples are mixed together to obtain a PCR product library.
4) Fragmentation: the resulting PCR product library is incompletely fragmented.
5) Library construction: a PCR-free sequencing library is constructed from the fragmented PCR product library, and all DNA bands between the maximum read length of the sequencer used and the longest DNA length that sequencer can accept are recovered. Different library adapters may be added to different libraries to distinguish the PCR-free sequencing libraries from one another.
6) Sequencing: the recovered DNA mixture is sequenced by a second-generation technique, preferably a paired-end technique (for example, on the Illumina GA or Illumina HiSeq 2000), to obtain the sequences of the fragmented DNA.
7) Assembly: the sequencing results are assigned one-to-one to their samples using the library adapter sequence of each library and the unique primer indexes of each sample; each read is mapped to the corresponding DNA reference sequence of the PCR product with an alignment program (for example, BLAST or BWA); and the complete target nucleic acid is assembled from the sequences of the fragmented DNA using the overlaps and linkage between reads.
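The last part of the assembly step, deriving one sequence from reads that have already been placed on the reference, can be sketched as a per-position majority vote. This is only an illustration under stated assumptions: the mapping positions are taken as given (in the method they would come from an aligner such as BWA), alignments are treated as ungapped, and a simple majority vote would hide heterozygous positions, which matter for HLA typing. The function name and input format are hypothetical.

```python
from collections import Counter


def consensus(aligned_reads, ref_len):
    """Majority-vote consensus from ungapped alignments.

    aligned_reads: iterable of (ref_start, sequence) with 0-based start positions.
    Positions with no coverage are reported as "N".
    """
    columns = [Counter() for _ in range(ref_len)]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < ref_len:
                columns[pos][base] += 1
    return "".join(col.most_common(1)[0][0] if col else "N" for col in columns)
```

For instance, three overlapping reads `(0, "ACGTAC")`, `(3, "TACGGT")` and `(6, "GGTTAA")` over a 12-base reference give `"ACGTACGGTTAA"`: each read is shorter than the target, but the overlaps join them into the full sequence.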
In a specific embodiment of the sequencing method described above, each pair of primer indexes is combined with a PCR primer pair to form a pair of index primers, and the forward and reverse PCR primers carry at their 5' ends, directly or optionally via a linker sequence, the forward primer index and the reverse primer index, respectively.
In a specific embodiment of the sequencing method described above, the PCR primers are primers for amplifying specific HLA genes, preferably primers for amplifying exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1; preferred PCR primers are shown in Table 2.
In a specific embodiment of the sequencing method described above, the primer indexes are designed for the PCR primers, preferably for PCR primers that amplify specific HLA genes, more preferably for PCR primers that amplify exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1, in particular the PCR primers shown in Table 2. The primer indexes in particular comprise at least 10, or at least 20, or at least 30, or at least 40, or at least 50, or at least 60, or at least 70, or at least 80, or at least 90, or all 95 of the 95 pairs of primer indexes shown in Table 1 (or the set of primer indexes consists of 10-95 of the 95 pairs shown in Table 1, for example 10-95, 20-95, 30-95, 40-95, 50-95, 60-95, 70-95, 80-95, 90-95, or 95 pairs), and
the set of primer indexes preferably comprises at least PI-1 to PI-10, or PI-11 to PI-20, or PI-21 to PI-30, or PI-31 to PI-40, or PI-41 to PI-50, or PI-51 to PI-60, or PI-61 to PI-70, or PI-71 to PI-80, or PI-81 to PI-90, or PI-91 to PI-95 of the 95 pairs shown in Table 1, or any combination of two or more of these groups.
In a specific embodiment of the sequencing method described above, the DNA fragmentation is performed by a chemical or a physical method; chemical methods include enzymatic digestion, and physical methods include sonication and mechanical shearing. After fragmentation, fragments 450-750 bp in length are isolated.
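The relationship between the size window cut from the gel and the underlying DNA fragment lengths can be written out explicitly. The sketch below assumes, as stated in the Principle paragraph of the examples, that library adapters add roughly 250 bp to the apparent length of each fragment on the gel; the function names and the example fragment lengths are hypothetical.

```python
ADAPTER_BP = 250  # approximate length added by the library adapters (from the examples)


def gel_window_to_insert_window(gel_min, gel_max, adapter_bp=ADAPTER_BP):
    """Convert a gel size window (with adapters) to original fragment lengths."""
    return gel_min - adapter_bp, gel_max - adapter_bp


def select_fragments(insert_lengths, gel_min=450, gel_max=750, adapter_bp=ADAPTER_BP):
    """Keep the original fragment lengths that fall inside the excised gel window."""
    low, high = gel_window_to_insert_window(gel_min, gel_max, adapter_bp)
    return [length for length in insert_lengths if low <= length <= high]


print(gel_window_to_insert_window(450, 750))         # (200, 500)
print(select_fragments([120, 200, 350, 500, 620]))   # [200, 350, 500]
```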
In another aspect, the invention provides an HLA typing method comprising: sequencing a sample (in particular a blood sample) from a patient by the sequencing method described above, and aligning the sequencing result against the standard sequences in an HLA database (such as the IMGT/HLA database); a standard sequence that matches 100% gives the HLA genotype of the corresponding sample.
Brief Description of the Drawings
Figure 1: Schematic of sequence assembly after primer indexing, DNA fragmentation and DNA sequencing. The forward and reverse primer index sequences Index-N-F/R are introduced at the two ends of the PCR product of sample N (1). After physical fragmentation the products include fragments carrying a primer index at one end, fragments carrying no primer index at either end, and completely unfragmented product; all DNA bands between the maximum read length of the sequencer and the longest DNA length the sequencer can accept are excised and purified for sequencing (2). Index-N-F/R is used to retrieve from the sequencing results the reads belonging to the PCR product of sample N; the known reference sequence of the PCR product is used to position each read, and the reads are assembled into the complete sequence of the PCR product according to their overlaps and linkage (3, 4).
Figure 2: Electrophoresis of the PCR products of the HLA-A/B/DRB1 exons for sample 1. Each PCR product appears as a single band of 300-500 bp. Lane M is the molecular weight marker (DL 2000, Takara); lanes 1-7 are the PCR products of the individual HLA-A/B/DRB1 exons of sample 1 (A2, A3, A4, B2, B3, B4, DRB1-2); the negative control (N) shows no amplification band. Results for the other samples are similar.
Figure 3: Electrophoresis of HLA-Mix after fragmentation, before and after gel excision; the excised region is 450-750 bp. Lane M is the molecular weight marker (NEB 50 bp DNA Ladder); lane 1 shows HLA-Mix before excision, and lane 2 shows the gel after excision.
Figure 4: Screenshot of the consensus sequence construction program for sample 1, illustrating assembly of the complete sequence of a PCR product from the primer indexes and the overlaps between DNA fragments.
Detailed Description of the Embodiments
Embodiments of the invention are described in detail below with reference to examples. Those skilled in the art will understand that the following examples only illustrate the invention and should not be regarded as limiting its scope.
In the examples, exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1 were genotyped in 95 samples (PCR product lengths of 290-500 bp) using the combination of primer indexes + incomplete DNA fragmentation + Illumina GA Pair-End 100 sequencing. This demonstrates that the strategy can type gene fragments longer than the read length of the sequencer while taking full advantage of the high throughput and low cost of second-generation sequencers.
Principle: for each sample to be analyzed, primer indexes are introduced by PCR at both ends of the PCR products of exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1, so that they specifically mark the sample origin of each PCR product. The PCR products of the three loci HLA-A/B/DRB1 from the samples within each group are pooled to obtain a PCR product library. The library is incompletely fragmented by sonication and used to construct a PCR-free sequencing library. The sequencing library is run on a 2% low-melting-point agarose gel, and all DNA bands between 450 bp and 750 bp are excised and purified. (Library adapters are added to both ends of the DNA fragments during PCR-free library construction, so the fragments appear about 250 bp longer on the gel than their actual length; recovering the 450-750 bp region therefore corresponds to recovering original DNA fragments of 200-500 bp.) The recovered DNA is sequenced on the Illumina GA in PE-100 mode. The reads of each sample are identified by the primer index sequences, and the sequence of each whole PCR product is assembled using the known reference sequence together with the overlaps and linkage between fragment sequences. Aligning the assembled sequences against the standard database for the corresponding HLA-A/B/DRB1 exons then gives the HLA-A/B/DRB1 genotype.
Example 1
Sample extraction
DNA was extracted from 95 blood samples with known HLA-SBT typing results (from the China hematopoietic stem cell donor database, hereinafter the "China Marrow Donor Program") using a KingFisher automated extraction instrument (Thermo, USA). The main steps were as follows: six deep-well plates and one shallow-well plate supplied for the KingFisher instrument were taken out, the specified amounts of the supplied reagents were added according to the manual, and the plates were labeled. All reagent-loaded plates were placed in their designated positions, the program "Bioeasy_200ul Blood DNA_KF.msz" was selected, and "Start" was pressed to run the nucleic acid extraction. When the program finished, about 100 µl of eluate was collected from the Elution plate; this was the extracted DNA.
Example 2
PCR amplification
Different PCR index primers were made by synthesizing PCR primers carrying different primer indexes at their 5' ends, so that a different pair of index primers could be used for each sample. The PCR primers target exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1. The PCR reaction then introduces the primer indexes at both ends of the PCR products, specifically marking PCR products from different samples.
The 95 DNA samples were amplified with 95 sets of PCR index primers. Each set consists of one pair of primer indexes (Table 1) and the PCR primers for amplifying exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1 (Table 2): the forward primer index of the pair is attached to the 5' end of each forward PCR primer, and the reverse primer index of the pair to the 5' end of each reverse PCR primer. The primer indexes were added directly to the 5' ends of the PCR primers during primer synthesis.
The 95 DNA samples obtained in the sample extraction step of Example 1 were numbered 1-95. PCR was performed in 96-well plates, seven plates in total, labeled HLA-P-A2, HLA-P-A3, HLA-P-A4, HLA-P-B2, HLA-P-B3, HLA-P-B4 and HLA-P-DRB1-2 (A2/A3/A4, B2/B3/B4 and DRB1-2 denote the amplified regions). Each plate included a negative control without template, which used the same primers as sample 1. The sample number corresponding to each pair of primer indexes was recorded during the experiment.
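The plate setup above (95 samples plus one no-template control per 96-well plate, seven plates) can be recorded as a simple lookup. The sketch below makes several assumptions that the text does not state: wells are filled in row-major order (A1 to A12, then B1 onward), sample i uses primer index pair PI-i, and the negative control occupies the last well, H12. The function names are hypothetical.

```python
import string

PLATES = ["HLA-P-A2", "HLA-P-A3", "HLA-P-A4",
          "HLA-P-B2", "HLA-P-B3", "HLA-P-B4", "HLA-P-DRB1-2"]


def well_for(position):
    """Map position 1..96 to a well name, assuming row-major order (assumption)."""
    if not 1 <= position <= 96:
        raise ValueError(f"position must be 1..96, got {position}")
    row, col = divmod(position - 1, 12)
    return f"{string.ascii_uppercase[row]}{col + 1}"


def plate_layout():
    """Well -> content for one plate: samples 1-95 plus one negative control."""
    layout = {well_for(i): f"sample {i} / PI-{i}" for i in range(1, 96)}
    # Placement of the control in the last well is an assumption.
    layout[well_for(96)] = "negative control (no template, primers of sample 1)"
    return layout


# The same layout is repeated on each of the seven plates, one per amplified region.
all_plates = {plate: plate_layout() for plate in PLATES}
```

Keeping this mapping alongside the run is one way to satisfy the requirement that the sample number for each pair of primer indexes be recorded.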
Table 1. Primer index information
[Table 1 (image imgf000013_0001; cell text not recoverable from the extraction): for each primer index pair (PI-1, PI-2, ...), the forward primer index, the reverse primer index, the corresponding 96-well plate position, and the corresponding sample number.]
89 8i IV3XDOXD1DV3 89-Id89 8i IV3XDOXD1DV3 89-Id
L9 Ld DIDDIVDYIDDI V03V010XV3V1 L9-ldL9 Ld DIDDIVDYIDDI V03V010XV3V1 L9-ld
99 9Λ l lV lOOOVi)丄 130X03I3VDVX 99-Id99 9Λ l lV lOOOVi)丄 130X03I3VDVX 99-Id
S9 丄 ) 130丄 I3IDVIVOV3VX S9"Id 9 Λ P9-ldS9 丄 ) 130丄 I3IDVIVOV3VX S9"Id 9 Λ P9-ld
£9 £Λ £9-ld£9 £Λ £9-ld
19 ΖΛ 3VOXODI3V13X Z9-U19 ΖΛ 3VOXODI3V13X Z9-U
19 ΙΛ I3IVIOX3V1V3 19-Id19 ΙΛ I3IVIOX3V1V3 19-Id
09 ΠΉ V013VD1V130X O VDIDIO:)丄 ) V 09-Id09 ΠΉ V013VD1V130X O VDIDIO:)丄 ) V 09-Id
6S na 0V0I03V0313V 6S"Id6S na 0V0I03V0313V 6S"Id
8S oia VDOVO丄 3 丄 V0V3V03I0X93 8S"Id8S oia VDOVO丄 3 丄 V0V3V03I0X93 8S"Id
LS DDDyiDDDYDDl LS ldLS DDDyiDDDYDDl LS ld
9S 83 3V9IV030VXVX fJV JD工 V工 VO丄 3 9S-Id9S 83 3V9IV030VXVX fJV JD Work V VO丄 3 9S-Id
,a D03V3VXVD0VX SS-Id , a D03V3VXVD0VX SS-Id
PS 93 丄:) 3VI303I3V03V PS 93 丄:) 3VI303I3V03V
PI-79  TGTATCACGAGC  ATGATCGTATAC  G7   79
PI-80  TACTGCTATCTC  CGCTGCATAGCG  G8   80
PI-81  CGCGAGCTCGTC  ACTCGATGAGCT  G9   81
PI-82  TAGAGTCTGTAT  TGTCTATCACAT  G10  82
PI-83  TACTATCGCTCT  TATGTGACATAC  G11  83
PI-84  TAGATGACGCTC  TACTCGTAGCGC  G12  84
PI-85  TCGCGTGACATC  ATCTACTGACGT  H1   85
PI-86  ACACGCTCTACT  ACAGTAGCGCAC  H2   86
PI-87  TACATAGTCTCG  CTAGTATCATGA  H3   87
PI-88  TGAGTAGCACGC  TCGATCATGCAG  H4   88
PI-89  TAGATGCTATAC  TACATGCACTCA  H5   89
PI-90  ATCGATGTCACG  CAGCTCGACTAC  H6   90
PI-91  ATCATATGTAGC  CTCTACAGTCAC  H7   91
PI-92  TAGCATCGATAT  AGATAGCACATC  H8   92
PI-93  TGATCGACGCTC  CTAGATATCGTC  H9   93
PI-94  TGCAGCTCATAG  TACAGACTGCAC  H10  94
PI-95  CGACGTAGAGTC  CAGTAGCACTAC  H11  95

Table 2. PCR primers used to amplify the corresponding exons of HLA-A/B/DRB1, before addition of the primer tags
Figure imgf000017_0001
A-F4   GTGTCCCATGACAGATGCAAAA   HLA-A exon 4, 430 bp
A-R4   GGCCCTGACCCTGCTAAAGG     HLA-A exon 4, 430 bp
B-F2   AGGAGCGAGGGGACCGCA       HLA-B exon 2, 400 bp
B-R2   CGGGCCGGGGTCACTCAC       HLA-B exon 2, 400 bp
B-F3   CGGGGCCAGGGTCTCACA       HLA-B exon 3, 370 bp
B-R3   GAGGCCATCCCCGGCGAC       HLA-B exon 3, 370 bp
B-F4   GCTGGTCACATGGGTGGTCCTA   HLA-B exon 4, 380 bp
B-R4   CTCCTTACCCCATCTCAGGGTG   HLA-B exon 4, 380 bp
D2-F1  CACGTTTCTTGGAGTACTCTA    HLA-DRB1 exon 2, 300 bp
D2-F2  GTTTCTTGTGGCAgCTTAAgTT   HLA-DRB1 exon 2, 300 bp
D2-F3  CCTGTGGCAGGGTAAGTATA     HLA-DRB1 exon 2, 300 bp
D2-F4  GTTTCTTGAAGCAGGATAAGTT   HLA-DRB1 exon 2, 300 bp
D2-F5  GCACGTTTCTTGGAGGAGG      HLA-DRB1 exon 2, 300 bp
D2-F6  TTTCCTGTGGCAGCCTAAGA     HLA-DRB1 exon 2, 300 bp
D2-F7  GTTTCTTGGAGCAGGTTAAAC    HLA-DRB1 exon 2, 300 bp
D2-R   CCTCACCTCGCCGCTGCAC      HLA-DRB1 exon 2, 300 bp
D2-F1, D2-F2, D2-F3, D2-F4, D2-F5, D2-F6 and D2-F7 are forward primers for amplifying exon 2 of HLA-DRB1, and D2-R is the reverse primer for amplifying exon 2 of HLA-DRB1.
The PCR program for HLA-A/B/DRB1 was as follows:
96 °C, 2 min
95 °C, 30 s → 60 °C, 30 s → 72 °C, 20 s (32 cycles)
Hold: ∞
The PCR reaction system for HLA-A/B was as follows (all reagents were purchased from Promega (Beijing) Biotechnology Co., Ltd.):
Figure imgf000019_0001
The PCR reaction system for HLA-DRB1 was as follows:
Figure imgf000019_0002
Figure imgf000020_0001
Here, PInf-A/B/D2-F1/2/3/4/5/6/7 denotes an F primer of HLA-A/B/DRB1 whose 5' end carries the No. n forward primer tag sequence (Table 1), and PInr-A/B/D2-R2/3/4 denotes an R primer of HLA-A/B/DRB1 whose 5' end carries the No. n reverse primer tag sequence (here n ≤ 95); the other designations follow by analogy. Each sample corresponds to one specific set of PCR primers (PInf-A/B/D2-F1/2/3/4/5/6/7, PInr-A/B/D2-R2/3/4).
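The naming scheme above can be illustrated with a minimal Python sketch. It assumes that the tag is joined directly to the 5' end of the primer with no linker sequence; the sequences are the PI-79 tag pair from Table 1 and the A-F4/A-R4 primers from Table 2, while the dictionary layout and the function name `tagged_primer_pair` are hypothetical and not part of the patent.

```python
# Illustrative sketch only: the data layout and helper name are assumptions.
# Sequences are copied from Table 1 (PI-79) and Table 2 (A-F4 / A-R4).

TAGS = {79: ("TGTATCACGAGC", "ATGATCGTATAC")}  # tag No. -> (forward tag, reverse tag)
PRIMERS = {"A4": ("GTGTCCCATGACAGATGCAAAA", "GGCCCTGACCCTGCTAAAGG")}  # exon -> (F, R)


def tagged_primer_pair(tag_no, amplicon):
    """Prepend the forward/reverse tag to the 5' end of the F/R primer.

    Sequences are written 5'->3', so the tag is simply placed in front.
    Direct concatenation (no linker) is an assumption of this sketch.
    """
    fwd_tag, rev_tag = TAGS[tag_no]
    fwd_primer, rev_primer = PRIMERS[amplicon]
    return fwd_tag + fwd_primer, rev_tag + rev_primer


pair = tagged_primer_pair(79, "A4")
print(pair[0])  # forward tag primer for sample 79, HLA-A exon 4
print(pair[1])  # reverse tag primer for sample 79, HLA-A exon 4
```

Because every sample uses a different tag pair, the same target primers yield a distinct primer set per sample, which is what later allows reads from the pooled products to be traced back to their sample.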
The PCR reactions were run on a PTC-200 thermal cycler (Bio-Rad). After PCR, 2 μL of each PCR product was checked by electrophoresis on a 1% agarose gel. Figure 2 shows the electrophoresis result for the PCR products of the HLA-A/B/DRB1 exons of sample No. 1; the DNA molecular weight marker is DL 2000 (Takara). The gel shows a series of single bands of 300-500 bp, indicating that the HLA-A/B/DRB1 exons (A2, A3, A4, B2, B3, B4, DRB1-2) of sample No. 1 were amplified successfully, while the negative control (N) shows no amplification band. The results for the other samples were similar.

Example 3
Mixing and purification of PCR products
From the PCR products remaining in 96-well plate HLA-P-A2 (negative control excluded), 20 μL was taken from each well and pooled in a 3 mL EP tube labeled HLA-A2-Mix. The other six 96-well plates were handled in the same way, giving pools labeled HLA-A3-Mix, HLA-A4-Mix, HLA-B2-Mix, HLA-B3-Mix, HLA-B4-Mix and HLA-D2-Mix. After vortex mixing, 200 μL was taken from each of HLA-A2-Mix, HLA-A3-Mix, HLA-A4-Mix, HLA-B2-Mix, HLA-B3-Mix, HLA-B4-Mix and HLA-D2-Mix and combined in a 3 mL EP tube labeled HLA-Mix. A 500 μL aliquot of the HLA-Mix DNA mixture was column-purified with the Qiagen DNA Purification kit (QIAGEN) (see the kit manual for the detailed purification steps), yielding 200 μL of purified DNA. The HLA-Mix DNA concentration measured on a Nanodrop 8000 (Thermo Fisher Scientific) was 48 ng/μL.

Example 4
Shearing of the PCR products and construction of the Illumina GA PCR-Free sequencing library
1. DNA shearing
A total of 5 μg of DNA from the purified HLA-Mix was sheared on a Covaris S2 DNA shearing instrument (Covaris) in Covaris microtubes with AFA fiber and snap cap. The shearing conditions were as follows:
Frequency sweeping
Figure imgf000021_0002
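As a quick consistency check on the quantities reported for the purified HLA-Mix (200 μL at 48 ng/μL) and the 5 μg used for shearing, the short Python calculation below derives the total yield and the volume that 5 μg corresponds to. The variable names are illustrative, and the volume actually pipetted is not stated in the text.

```python
# Values taken from Examples 3 and 4; variable names are illustrative only.
concentration_ng_per_ul = 48.0   # Nanodrop reading for purified HLA-Mix
eluate_volume_ul = 200.0         # volume of purified DNA obtained
shearing_input_ug = 5.0          # DNA amount loaded for Covaris shearing

# Total purified DNA available, in micrograms (ng -> ug is a factor of 1000).
total_yield_ug = concentration_ng_per_ul * eluate_volume_ul / 1000.0

# Volume of purified HLA-Mix that contains the shearing input.
volume_needed_ul = shearing_input_ug * 1000.0 / concentration_ng_per_ul

print(f"total yield: {total_yield_ug:.1f} ug")
print(f"volume for shearing input: {volume_needed_ul:.1f} uL")
```

Under these figures the purified pool holds roughly twice the DNA required for one shearing run, so the stated 5 μg input is compatible with the measured concentration.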
2. Purification after shearing
All sheared products of HLA-Mix were recovered and purified with the QIAquick PCR Purification Kit and dissolved in 37.5 μL of EB (QIAGEN Elution Buffer).
3. End-repair reaction
The purified, sheared HLA-Mix was subjected to a DNA end-repair reaction with the following reaction system:
Figure imgf000021_0001
10x Polynucleotide Kinase Buffer (B904)   10 μL
dNTP Solution Set (10 mM each)            4 μL
T4 DNA Polymerase                         5 μL
Klenow Fragment
T4 Polynucleotide Kinase
Total volume                              100 μL
Reaction conditions: Thermomixer (Eppendorf), 20 °C, 30 min.
The reaction product was recovered and purified with the QIAquick PCR Purification Kit and dissolved in 34 μL of EB (QIAGEN Elution Buffer).
4. Addition of A to the 3' ends
An A was added to the 3' ends of the DNA recovered in the previous step, using the following reaction system (reagents purchased from Enzymatics):
DNA from the previous step   32 μL
10x blue buffer              5 μL
dATP (1 mM, GE)              10 μL
Klenow (3'-5' exo-)          3 μL
Total volume                 50 μL
Reaction conditions: Thermomixer (Eppendorf), 37 °C, 30 min.
The reaction product was recovered and purified with the MinElute PCR Purification Kit (QIAGEN) and dissolved in 13 μL of EB (QIAGEN Elution Buffer).
5. Ligation of the Illumina GA PCR-Free library adapter
The term "PCR-Free library adapter" refers to a designed stretch of bases whose main functions are to help immobilize DNA molecules on the sequencing chip and to provide the binding site for the universal sequencing primers. A PCR-Free library adapter can be ligated directly to both ends of the DNA fragments in the sequencing library by DNA ligase; because no PCR is involved in introducing the adapter, it is called a PCR-Free library adapter.
The A-tailed product was ligated to the Illumina GA PCR-Free library adapter using the following reaction system (reagents purchased from Illumina):
DNA from the previous step                    11 μL
2x Rapid ligation buffer
PCR-free Adapter oligo mix (30 mM)            1 μL
T4 DNA Ligase (Rapid, L603-HC-L)              3 μL
Total volume                                  30 μL
Reaction conditions: Thermomixer (Eppendorf), 20 °C, 15 min.
The reaction product was purified with Ampure Beads (Beckman Coulter Genomics) and dissolved in 50 μL of deionized water. The DNA concentration determined by real-time quantitative PCR (QPCR) was as follows:
Figure imgf000023_0001
6. Gel excision and recovery
30 μL of HLA-Mix was recovered on a 2% low-melting-point agarose gel. Electrophoresis conditions were 100 V for 100 min, and the DNA marker was the 50 bp DNA marker from NEB. DNA fragments in the 450-750 bp range were excised from the gel and recovered (Figure 3). The gel-recovered product was purified with the QIAquick PCR Purification Kit (QIAGEN); the volume after purification was 32 μL, and the DNA concentration determined by real-time quantitative PCR (QPCR) was 10.16 nM.

Example 5
Illumina GA sequencing
Based on the QPCR result, 10 pmol of DNA was sequenced with the Illumina GA PE-100 program; for the detailed procedure, see the Illumina GA operating manual (Illumina GA IIx).

Example 6
Analysis of results
The sequencing output of the Illumina GA is a collection of DNA sequences. By searching the reads for the forward and reverse primer tag sequences and the primer sequences, a database of sequencing reads is built for the PCR product of each HLA-A/B/DRB1 exon of the sample corresponding to each primer tag. The reads of each exon are mapped to the reference sequence of that exon with BWA (Burrows-Wheeler Aligner) (source of reference sequences: http://www.ebi.ac.uk/imgt/hla/), and a consensus sequence is built for each database; the DNA sequences in the database are then filtered and corrected for sequencing errors. The corrected DNA sequences can be assembled into the sequence of each HLA-A/B/DRB1 exon through sequence overlap and linkage (Pair-End linkage) relationships. The resulting DNA sequences are compared with the sequence database of the corresponding HLA-A/B/DRB1 exons in the IMGT HLA database; an alignment with a 100% match gives the HLA-A/B/DRB1 genotype of the corresponding sample. Figure 4 shows, as an example, a screenshot of the program constructing the consensus sequence of exon 2 of the HLA-A locus for sample No. 1.
For all 95 samples, the typing results obtained were fully consistent with the previously known typing results. The results for samples 1-32 are listed in the two tables that follow.
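The first analysis step, sorting read pairs by primer tag and primer sequence, might look like the following Python sketch. It is not the program used by the inventors: it assumes exact matching, assumes that a read from the end of a PCR product begins with the tag followed directly by the primer, and assigns a read pair when either mate carries such a prefix (the Pair-End linkage then carries the other mate along). The names `build_lookup` and `assign_pair` are hypothetical; the sequences come from Tables 1 and 2.

```python
# Illustrative sketch only: exact-prefix matching is an assumption, and the
# function names are hypothetical. Sequences are taken from Tables 1 and 2.

TAGS = {
    79: ("TGTATCACGAGC", "ATGATCGTATAC"),  # PI-79: (forward tag, reverse tag)
    80: ("TACTGCTATCTC", "CGCTGCATAGCG"),  # PI-80
}
PRIMERS = {"A4": ("GTGTCCCATGACAGATGCAAAA", "GGCCCTGACCCTGCTAAAGG")}  # A-F4, A-R4


def build_lookup(tags, primers):
    """Map every expected read prefix (tag + primer) to (sample No., amplicon)."""
    lookup = {}
    for sample_no, (fwd_tag, rev_tag) in tags.items():
        for amplicon, (fwd_primer, rev_primer) in primers.items():
            lookup[fwd_tag + fwd_primer] = (sample_no, amplicon)
            lookup[rev_tag + rev_primer] = (sample_no, amplicon)
    return lookup


def assign_pair(read1, read2, lookup):
    """Return (sample No., amplicon) if either mate starts with a known prefix."""
    for read in (read1, read2):
        for prefix, target in lookup.items():
            if read.startswith(prefix):
                return target
    return None  # neither mate carries a recognizable tag + primer


lookup = build_lookup(TAGS, PRIMERS)
tagged_mate = TAGS[80][0] + PRIMERS["A4"][0] + "ACGTACGT"  # made-up trailing bases
internal_mate = "GGGGCCCCAAAATTTT"                         # made-up internal read
print(assign_pair(tagged_mate, internal_mate, lookup))
print(assign_pair(internal_mate, internal_mate, lookup))
```

A production pipeline would also need to tolerate sequencing errors inside the tag and primer, which this sketch deliberately ignores; the binned reads would then go on to alignment and consensus building as described above.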
Sample No.   Original HLA-A/B/DRB1 type
1    A*02:03  A*11:01  B*38:02  B*48:01  DRB1*14:54  DRB1*15:01
2    A*01:01  A*30:01  B*08:01  B*13:02  DRB1*03:01  DRB1*07:01
3    A*01:01  A*02:01  B*15:11  B*47:01  DRB1*13:02  DRB1*15:01
4    A*24:08  A*26:01  B*40:01  B*51:01  DRB1*04:04  DRB1*09:01
5    A*01:01  A*24:02  B*54:01  B*55:02  DRB1*04:05  DRB1*09:01
6    A*01:01  A*03:02  B*15:11  B*37:01  DRB1*10:01  DRB1*14:54
7    A*11:01  A*30:01  B*13:02  B*15:18  DRB1*04:04  DRB1*07:01
8    A*01:01  A*02:01  B*35:03  B*81:01  DRB1*11:01  DRB1*15:01
9    A*02:06  A*31:01  B*27:07  B*40:02  DRB1*03:01  DRB1*13:02
10   A*01:01  A*66:01  B*37:01  B*49:01  DRB1*10:01  DRB1*13:02
11   A*01:01  A*03:01  B*35:01  B*52:01  DRB1*01:01  DRB1*15:02
12   A*11:01  A*11:01  B*15:01  B*15:05  DRB1*04:06  DRB1*15:01
13   A*01:01  A*11:02  B*07:02  B*15:02  DRB1*09:01  DRB1*15:01
14   A*01:01  A*02:01  B*52:01  B*67:01  DRB1*15:02  DRB1*16:02
15   A*01:01  A*02:05  B*15:17  B*50:01  DRB1*07:01  DRB1*15:01
16   A*01:01  A*11:01  B*37:01  B*40:02  DRB1*10:01  DRB1*12:02
17   A*24:07  A*32:01  B*35:05  B*40:01  DRB1*03:01  DRB1*04:05
18   A*11:01  A*24:02  B*13:01  B*35:01  DRB1*16:02  DRB1*16:02
19   A*11:01  A*11:01  B*40:02  B*55:12  DRB1*04:05  DRB1*15:01
20   A*02:11  A*24:02  B*40:01  B*40:06  DRB1*11:01  DRB1*15:01
21   A*01:01  A*02:06  B*51:01  B*57:01  DRB1*07:01  DRB1*12:01
22   A*01:01  A*29:01  B*07:05  B*15:01  DRB1*04:05  DRB1*07:01
23   A*01:01  A*02:07  B*37:01  B*46:01  DRB1*04:03  DRB1*10:01
24   A*24:85  A*30:01  B*13:02  B*55:02  DRB1*07:01  DRB1*15:01
25   A*11:01  A*31:01  B*07:06  B*51:01  DRB1*12:02  DRB1*14:05
26   A*01:01  A*11:01  B*46:01  B*57:01  DRB1*07:01  DRB1*08:03
27   A*01:01  A*02:01  B*15:18  B*37:01  DRB1*04:01  DRB1*15:01
28   A*01:01  A*24:02  B*37:01  B*46:01  DRB1*09:01  DRB1*10:01
29   A*26:01  A*66:01  B*40:40  B*41:02  DRB1*12:01  DRB1*15:01
30   A*02:01  A*29:02  B*13:02  B*45:01  DRB1*03:01  DRB1*12:02
31   A*01:01  A*11:03  B*15:01  B*57:01  DRB1*07:01  DRB1*15:01
32   A*11:01  A*26:01  B*35:03  B*38:01  DRB1*11:03  DRB1*14:04

Sample No.   Measured HLA-A/B/DRB1 type
1    A*02:03  A*11:01  B*38:02  B*48:01  DRB1*14:54  DRB1*15:01
2    A*01:01  A*30:01  B*08:01  B*13:02  DRB1*03:01  DRB1*07:01
3    A*01:01  A*02:01  B*15:11  B*47:01  DRB1*13:02  DRB1*15:01
4    A*24:08  A*26:01  B*40:01  B*51:01  DRB1*04:04  DRB1*09:01
5    A*01:01  A*24:02  B*54:01  B*55:02  DRB1*04:05  DRB1*09:01
6    A*01:01  A*03:02  B*15:11  B*37:01  DRB1*10:01  DRB1*14:54
7    A*11:01  A*30:01  B*13:02  B*15:18  DRB1*04:04  DRB1*07:01
8    A*01:01  A*02:01  B*35:03  B*81:01  DRB1*11:01  DRB1*15:01
9    A*02:06  A*31:01  B*27:07  B*40:02  DRB1*03:01  DRB1*13:02
10   A*01:01  A*66:01  B*37:01  B*49:01  DRB1*10:01  DRB1*13:02
11   A*01:01  A*03:01  B*35:01  B*52:01  DRB1*01:01  DRB1*15:02
12   A*11:01  A*11:01  B*15:01  B*15:05  DRB1*04:06  DRB1*15:01
13   A*01:01  A*11:02  B*07:02  B*15:02  DRB1*09:01  DRB1*15:01
14   A*01:01  A*02:01  B*52:01  B*67:01  DRB1*15:02  DRB1*16:02
15   A*01:01  A*02:05  B*15:17  B*50:01  DRB1*07:01  DRB1*15:01
16   A*01:01  A*11:01  B*37:01  B*40:02  DRB1*10:01  DRB1*12:02
17   A*24:07  A*32:01  B*35:05  B*40:01  DRB1*03:01  DRB1*04:05
18   A*11:01  A*24:02  B*13:01  B*35:01  DRB1*16:02  DRB1*16:02
19   A*11:01  A*11:01  B*40:02  B*55:12  DRB1*04:05  DRB1*15:01
20   A*02:11  A*24:02  B*40:01  B*40:06  DRB1*11:01  DRB1*15:01
21   A*01:01  A*02:06  B*51:01  B*57:01  DRB1*07:01  DRB1*12:01
22   A*01:01  A*29:01  B*07:05  B*15:01  DRB1*04:05  DRB1*07:01
23   A*01:01  A*02:07  B*37:01  B*46:01  DRB1*04:03  DRB1*10:01
24   A*24:85  A*30:01  B*13:02  B*55:02  DRB1*07:01  DRB1*15:01
25   A*11:01  A*31:01  B*07:06  B*51:01  DRB1*12:02  DRB1*14:05
26   A*01:01  A*11:01  B*46:01  B*57:01  DRB1*07:01  DRB1*08:03
27   A*01:01  A*02:01  B*15:18  B*37:01  DRB1*04:01  DRB1*15:01
28   A*01:01  A*24:02  B*37:01  B*46:01  DRB1*09:01  DRB1*10:01
29   A*26:01  A*66:01  B*40:40  B*41:02  DRB1*12:01  DRB1*15:01
30   A*02:01  A*29:02  B*13:02  B*45:01  DRB1*03:01  DRB1*12:02
31   A*01:01  A*11:03  B*15:01  B*57:01  DRB1*07:01  DRB1*15:01
32   A*11:01  A*26:01  B*35:03  B*38:01  DRB1*11:03  DRB1*14:04

Note: among the HLA-DRB1 types, DRB1*1201 does not exclude the possibility of DRB1*1206/1210/1217, and DRB1*1454 does not exclude the possibility of DRB1*1401, because these alleles have identical sequences in exon 2 of HLA-DRB1.

Although specific embodiments of the invention have been described in detail, those skilled in the art will understand that, in light of all the teachings disclosed, various modifications and substitutions may be made to those details, and that such changes fall within the scope of protection of the invention. The full scope of the invention is given by the appended claims and any equivalents thereof.

References
[1]. http://www.ebi.ac.uk/imgt/hla/stats.html
[2]. Tiercy J M. Molecular basis of HLA polymorphism: implications in clinical transplantation. [J]. Transpl Immunol, 2002, 9: 173-180.
[3]. C. Antoine, S. Muller, A. Cant, et al. Long-term survival and transplantation of haemopoietic stem cells for immunodeficiencies: report of the European experience 1968-99. [J]. The Lancet, 2003, 9357: 553-560.
[4]. H. A. Erlich, G. Opelz, J. Hansen, et al. HLA DNA Typing and Transplantation. [J]. Immunity, 2001, 14: 347-356.
[5]. Lillo R, Balas A, Vicario JL, et al. Two new HLA class allele, DPB1*02014, by sequence-based typing. [J]. Tissue Antigens, 2002, 59: 47-48.
[6]. A. Dormoy, N. Froelich, Leisenbach, et al. Mono-allelic amplification of exons 2-4 using allele group-specific primers for sequence-based typing (SBT) of the HLA-A, -B and -C genes: Preparation and validation of ready-to-use pre-SBT mini-kits. [J]. Tissue Antigens, 2003, 62: 201-216.
[7]. Elaine R. Mardis. The impact of next-generation sequencing technology on genetics. [J]. Trends in Genetics, 2008, 24: 133-141.
[8]. Christian Hoffmann, Nana Minkah, Jeremy Leipzig. DNA barcoding and pyrosequencing to identify rare HIV drug resistance mutations. [J]. Nucleic Acids Research, 2007, 1-8.
[9]. Shannon J. Odelberg, Robert B. Weiss, Akira Hata. Template-switching during DNA synthesis by Thermus aquaticus DNA polymerase I. [J]. Nucleic Acids Research, 1995, 23: 2049-2057.
[10]. Sayer D, Whidborne R, Brestovac B. HLA-DRB1 DNA sequencing based typing: an approach suitable for high throughput typing including unrelated bone marrow registry donors. [J]. Tissue Antigens, 2001, 57(1): 46-54.
[11]. Iwanka Kozarewa, Zemin Ning, Michael A Quail. Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G+C)-biased genomes. [J]. Nature Methods, 2009, 6: 291-295.

Claims

权 利 要 求 Rights request
1. 一种测定样品中目的核酸的核苷酸序列的方法, 其包括: A method of determining a nucleotide sequence of a nucleic acid of interest in a sample, comprising:
1 )提供 n个样品, n为大于等于 1的整数, 所述样品优选地来 自哺乳动物, 更优选是人, 特别是人的血样; 可选地, 将待分析 的 n个样品分成 m个小组, m为整数且 n > m > l;  1) providing n samples, n being an integer greater than or equal to 1, the sample preferably being from a mammal, more preferably a human, in particular a human blood sample; alternatively, dividing the n samples to be analyzed into m groups , m is an integer and n > m > l;
2 )扩增: 对于每一个样品, 使用一对或多对标签引物, 在存 在来自该样品的模板时, 在适于扩增目的核酸的条件下进行 PCR 扩增, 其中, 每一对标签引物由包含引物标签的正向标签引物和 反向标签引物 (均可以是简并引物)构成, 其中正向标签引物和 反向标签引物所包含的引物标签可以相同或者不同; 不同样品所 用标签引物对中的引物标签彼此不同;  2) Amplification: For each sample, one or more pairs of label primers are used, and in the presence of a template from the sample, PCR amplification is performed under conditions suitable for amplifying the nucleic acid of interest, wherein each pair of label primers The forward label primer and the reverse label primer (both may be degenerate primers) including the primer label, wherein the forward label primer and the reverse label primer may contain the same or different primer labels; the label primer pair used for different samples The primer labels in each other are different from each other;
3 ) 混合: 当 η>1时, 将各样品的 PCR扩增产物混合在一起; 3) mixing: when η>1, the PCR amplification products of each sample are mixed together;
4 )打断: 将所得的扩增产物进行不完全打断, 并进行纯化回 收; 4) Interruption: the obtained amplification product is incompletely interrupted, and purified and recovered;
5 )测序: 将回收的 DNA混合物利用二代测序技术, 优选的是 Pair-End技术(例如 IUumina GA、 IUumina Hiseq 2000 )进行测序, 获得打断后的 DNA的序列; 和  5) Sequencing: The recovered DNA mixture is sequenced using a second-generation sequencing technique, preferably a Pair-End technique (eg, IUumina GA, IUumina Hiseq 2000), to obtain a sequence of interrupted DNA;
6 )拼接:基于每个样品独特的引物标签将获得的测序结果与 样品 对应, 利用比对程序 (例如 Blast, BWA程序)把各个测 序序列定位到 PCR产物的相应 DNA参考序列上, 通过序列重叠和 连锁关系, 从打断后的 DNA的序列拼接出完整的目的核酸。  6) Stitching: The sequencing results obtained are based on the unique primer labels of each sample, and the sequencing sequences (for example, Blast, BWA program) are used to locate each sequencing sequence on the corresponding DNA reference sequence of the PCR product, by overlapping the sequences. And the linkage relationship, the complete target nucleic acid is spliced from the sequence of the interrupted DNA.
2. 权利要求 1所述的方法,其中每一对引物标签与 PCR引物对 组合成一对标签引物, 正反 PCR引物的 5,端分别具有(或者任选 通过连接序列连接)正向引物标签和反向引物标签。  2. The method of claim 1 wherein each pair of primer tags and PCR primer pairs are combined into a pair of tag primers, the 5th ends of the forward and reverse PCR primers having (or optionally joined by a linker sequence) a forward primer tag and Reverse primer label.
3. The method of claim 1, wherein the PCR primers are PCR primers for amplifying exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1, preferably the PCR primers shown in Table 2.
4. The method of claim 1, wherein the primer tags are designed for the PCR primers, preferably for PCR primers for amplifying specific HLA genes, more preferably for PCR primers for amplifying exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1, in particular the PCR primers shown in Table 2; the primer tags in particular comprise at least 10 pairs, or at least 20 pairs, or at least 30 pairs, or at least 40 pairs, or at least 50 pairs, or at least 60 pairs, or at least 70 pairs, or at least 80 pairs, or at least 90 pairs, or all 95 pairs of the 95 pairs of primer tags shown in Table 1 (or the set of primer tags consists of 10-95 pairs (for example 10-95 pairs, 20-95 pairs, 30-95 pairs, 40-95 pairs, 50-95 pairs, 60-95 pairs, 70-95 pairs, 80-95 pairs, 90-95 pairs, or 95 pairs) of the 95 pairs of primer tags shown in Table 1); and the set of primer tags preferably comprises at least PI-1 to PI-10, or PI-11 to PI-20, or PI-21 to PI-30, or PI-31 to PI-40, or PI-41 to PI-50, or PI-51 to PI-60, or PI-61 to PI-70, or PI-71 to PI-80, or PI-81 to PI-90, or PI-91 to PI-95 of the 95 pairs of primer tags shown in Table 1, or any combination of two or more of these groups.
5. The method of claim 1, wherein the DNA fragmentation comprises chemical fragmentation methods and physical fragmentation methods, wherein the chemical methods include enzymatic digestion, and the physical fragmentation methods include ultrasonic fragmentation or mechanical fragmentation.
6. The method of claim 1, wherein, after the DNA fragmentation, all DNA bands ranging in length from the maximum read length of the sequencer to the longest DNA length the sequencer can accommodate are purified and recovered, wherein the purification and recovery methods include, but are not limited to, recovery by gel electrophoresis and band excision, and may also be magnetic bead recovery.
7. The method of claim 1, wherein the method may alternatively comprise steps 1) to 4) of claim 1 and the following steps:
5) Library construction: constructing a PCR-free sequencing library from the library of fragmented PCR products, wherein different library adapters may be added to the libraries to distinguish different PCR-free sequencing libraries, and purifying and recovering all DNA bands ranging in length from the maximum read length of the sequencer used to the longest DNA length that sequencer can accommodate, preferably DNA fragments of 450-750 bp;
6) Sequencing: sequencing the recovered DNA mixture using second-generation sequencing technology, preferably paired-end technology (for example Illumina GA or Illumina HiSeq 2000), to obtain the sequences of the fragmented DNA;
7) Assembly: placing the sequencing results in one-to-one correspondence with the samples on the basis of the different library adapter sequences of each library and the primer tags unique to each sample, mapping each sequence read to the corresponding DNA reference sequence of the PCR product with an alignment program (for example BLAST or BWA), and assembling the complete nucleic acid of interest from the sequences of the fragmented DNA through sequence overlap and linkage relationships.
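The size-selection window in claims 6 and 7 can be expressed as a simple filter on fragment length. The Python sketch below uses the preferred 450-750 bp range from claim 7 as default bounds; treating both bounds as inclusive is an assumption, and the function name `size_select` and the example lengths are hypothetical.

```python
def size_select(fragment_lengths, min_len=450, max_len=750):
    """Keep fragment lengths (in bp) inside the recovery window.

    min_len stands in for the lower bound of the window and max_len for the
    longest fragment the sequencer can accommodate; both are inclusive here.
    """
    return [n for n in fragment_lengths if min_len <= n <= max_len]


# Hypothetical fragment lengths in bp, for illustration only.
lengths = [120, 450, 600, 750, 900]
kept = size_select(lengths)
```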
8. Use of the method of any one of claims 1-7 for HLA typing, characterized by comprising: sequencing a sample from a patient (in particular a blood sample) by the method of any one of claims 1-7, and aligning the sequencing results against the sequence data for HLA-DRB1 exon 2 in an HLA database (for example the IMGT/HLA database), wherein an alignment result with a 100% match gives the HLA-DRB1 genotype of the corresponding sample.
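The 100% match criterion in claim 8 can be sketched as an exact string comparison between an assembled exon 2 sequence and each database entry. In the Python sketch below, the function name `call_alleles`, the allele names, and the sequences are hypothetical placeholders rather than real IMGT/HLA records, and a real comparison would first align the sequences over the same exon 2 coordinates.

```python
def call_alleles(consensus, allele_db):
    """Return the names of all database alleles identical to the consensus.

    consensus -- assembled exon 2 sequence for one sample
    allele_db -- dict mapping allele name to its exon 2 reference sequence
    More than one name can be returned when alleles share the same exon 2.
    """
    return sorted(name for name, seq in allele_db.items() if seq == consensus)


# Hypothetical allele names and sequences, for illustration only.
db = {
    "allele_X": "ACGTTGCA",
    "allele_Y": "ACGTTGCC",
    "allele_Z": "ACGTTGCA",
}
matches = call_alleles("ACGTTGCA", db)
no_match = call_alleles("AAAAAAAA", db)
```

An empty result means that no database entry matches at 100%, in which case the sketch makes no genotype call.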
9. A set of primer tags, comprising at least 10 pairs, or at least 20 pairs, or at least 30 pairs, or at least 40 pairs, or at least 50 pairs, or at least 60 pairs, or at least 70 pairs, or at least 80 pairs, or at least 90 pairs, or all 95 pairs of the 95 pairs of primer tags shown in Table 1 (or the set of primer tags consists of 10-95 pairs (for example 10-95 pairs, 20-95 pairs, 30-95 pairs, 40-95 pairs, 50-95 pairs, 60-95 pairs, 70-95 pairs, 80-95 pairs, 90-95 pairs, or 95 pairs) of the 95 pairs of primer tags shown in Table 1), wherein the set of primer tags preferably comprises at least PI-1 to PI-10, or PI-11 to PI-20, or PI-21 to PI-30, or PI-31 to PI-40, or PI-41 to PI-50, or PI-51 to PI-60, or PI-61 to PI-70, or PI-71 to PI-80, or PI-81 to PI-90, or PI-91 to PI-95 of the 95 pairs of primer tags shown in Table 1, or any combination of two or more of these groups.
10. Use of the set of primer tags of claim 9 in a PCR sequencing method, wherein, in particular, each pair of primer tags is combined with a PCR primer pair for amplifying the target sequence to be tested to form a pair of tag primers, the forward and reverse PCR primers carrying at their 5' ends the forward primer tag and the reverse primer tag, respectively, optionally joined through a linker sequence.
11. The use of claim 10, wherein the PCR primers are PCR primers for amplifying specific HLA genes, preferably PCR primers for amplifying exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1, preferably the PCR primers shown in Table 2.
12. A set of tag primers formed by combining the set of primer tags of claim 9 with PCR primer pairs for amplifying the target sequence to be tested, wherein each pair of primer tags is combined with a PCR primer pair to form a pair of tag primers, the forward and reverse PCR primers each carrying one primer tag at the 5' end, optionally joined through a linker sequence.
13. The tag primers of claim 12, wherein the PCR primers are PCR primers for amplifying specific HLA genes, preferably PCR primers for amplifying exons 2, 3 and 4 of HLA-A/B and exon 2 of HLA-DRB1, preferably the PCR primers shown in Table 2.
14. Use of the tag primers of claim 12 in a PCR sequencing method.
PCT/CN2010/001834 2010-06-30 2010-11-15 Pcr-sequencing method based on technology of dna molecular index and strategy of dna-breaking incompletely WO2012000152A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201010213717.6A CN101921840B (en) 2010-06-30 2010-06-30 DNA molecular label technology and DNA incomplete interrupt policy-based PCR sequencing method
CN201010213717.6 2010-06-30

Publications (1)

Publication Number Publication Date
WO2012000152A1 true WO2012000152A1 (en) 2012-01-05

Family

ID=43337055

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2010/001834 WO2012000152A1 (en) 2010-06-30 2010-11-15 Pcr-sequencing method based on technology of dna molecular index and strategy of dna-breaking incompletely

Country Status (3)

Country Link
CN (1) CN101921840B (en)
SA (1) SA111320572B1 (en)
WO (1) WO2012000152A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10889860B2 (en) * 2013-09-24 2021-01-12 Georgetown University Compositions and methods for single G-level HLA typing

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008027558A2 (en) 2006-08-31 2008-03-06 Codon Devices, Inc. Iterative nucleic acid assembly using activation of vector-encoded traits
WO2012083505A1 (en) * 2010-12-24 2012-06-28 深圳华大基因科技有限公司 Method for hla-c genotyping and related primers thereof
CN101921841B (en) * 2010-06-30 2014-03-12 深圳华大基因科技有限公司 HLA (Human Leukocyte Antigen) gene high-resolution genotyping method based on Illumina GA sequencing technology
BR112012032586B1 (en) 2010-06-30 2021-08-17 Bgi Genomics Co., Ltd METHODS FOR DETERMINING THE NUCLEOTIDE SEQUENCE OF A NUCLEIC ACID OF INTEREST AND FOR DETERMINING THE HLA GENOTYPE IN A SAMPLE
WO2012064975A1 (en) 2010-11-12 2012-05-18 Gen9, Inc. Protein arrays and methods of using and making the same
ES2548400T3 (en) 2010-11-12 2015-10-16 Gen9, Inc. Methods and devices for nucleic acid synthesis
CN103270170B (en) * 2010-12-24 2015-07-29 深圳华大基因医学有限公司 The method of HLA-DQB1 gene type and relevant primer thereof
CN102653784B (en) * 2011-03-03 2015-01-21 深圳华大基因科技服务有限公司 Tag used for multiple nucleic acid sequencing and application method thereof
CN102690809B (en) * 2011-03-24 2013-12-04 深圳华大基因科技服务有限公司 DNA index and application thereof in construction and sequencing of mate-paired indexed library
EP2944693B1 (en) 2011-08-26 2019-04-24 Gen9, Inc. Compositions and methods for high fidelity assembly of nucleic acids
US9150853B2 (en) 2012-03-21 2015-10-06 Gen9, Inc. Methods for screening proteins using DNA encoded chemical libraries as templates for enzyme catalysis
EP4001427A1 (en) 2012-04-24 2022-05-25 Gen9, Inc. Methods for sorting nucleic acids and multiplexed preparative in vitro cloning
JP6509727B2 (en) 2012-06-25 2019-05-15 ギンゴー バイオワークス, インコーポレイテッド Methods for nucleic acid assembly and high-throughput sequencing
CN102758026B (en) * 2012-06-29 2014-05-07 深圳华大基因科技有限公司 HiSeq sequencing technology-based method for detecting hepatitis B virus type and drug resistance gene
CN103045591B (en) * 2013-01-05 2014-03-12 上海荻硕贝肯生物科技有限公司 HLA gene specific PCR amplification primer, HLA typing method and kit
CN103617375B (en) * 2013-12-02 2017-08-25 深圳华大基因健康科技有限公司 The method and system of PCR product sequencing and typing
CN111748606A (en) * 2014-06-24 2020-10-09 北京贝瑞和康医学检验实验室有限公司 Method and kit for quickly constructing plasma DNA sequencing library
CN104232631B (en) * 2014-08-26 2017-12-15 深圳华大基因股份有限公司 Label, Tag primer, kit and application thereof
CN104293941B (en) * 2014-09-30 2017-01-11 天津华大基因科技有限公司 Method for constructing sequencing library and application of sequencing library
CN104561294B (en) * 2014-12-26 2018-03-30 北京诺禾致源科技股份有限公司 The construction method and sequence measurement of Genotyping sequencing library
CN107858408A (en) * 2016-09-19 2018-03-30 深圳华大基因科技服务有限公司 A kind of generation sequence assemble method of genome two and system
CN106754904B (en) * 2016-12-21 2019-03-15 南京诺唯赞生物科技有限公司 The specific molecular label of cDNA a kind of and its application
CN108728903A (en) * 2017-04-21 2018-11-02 深圳市乐土精准医疗科技有限公司 The banking process of thalassemia large sample screening is used for based on high-flux sequence
CN109801679B (en) * 2019-01-15 2021-02-02 广州柿宝生物科技有限公司 Mathematical sequence reconstruction method for long-chain molecules
CN113564228A (en) * 2021-09-26 2021-10-29 天津诺禾致源生物信息科技有限公司 Automatic sample processing method and device and automatic sample processing system
CN117437978A (en) * 2023-12-12 2024-01-23 北京旌准医疗科技有限公司 Low-frequency gene mutation analysis method and device for second-generation sequencing data and application of low-frequency gene mutation analysis method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008045575A2 (en) * 2006-10-13 2008-04-17 J. Craig Venter Institute, Inc. Sequencing method
CN101654691A (en) * 2009-09-23 2010-02-24 深圳华大基因科技有限公司 Method for amplifying and typing HLA gene and relevant primer thereof

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB0400584D0 (en) * 2004-01-12 2004-02-11 Solexa Ltd Nucleic acid chacterisation
EP1929039B2 (en) * 2005-09-29 2013-11-20 Keygene N.V. High throughput screening of mutagenized populations
EP2121983A2 (en) * 2007-02-02 2009-11-25 Illumina Cambridge Limited Methods for indexing samples and sequencing multiple nucleotide templates
WO2009049889A1 (en) * 2007-10-16 2009-04-23 Roche Diagnostics Gmbh High resolution, high throughput hla genotyping by clonal sequencing
CN101434988B (en) * 2007-11-16 2013-05-01 深圳华因康基因科技有限公司 High throughput oligonucleotide sequencing method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
RICHARD CRONN ET AL.: "Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology", NUCLEIC ACIDS RESEARCH, vol. 36, no. 19, 2008, pages E122 *

Also Published As

Publication number Publication date
SA111320572B1 (en) 2015-04-29
CN101921840A (en) 2010-12-22
CN101921840B (en) 2014-06-25

Similar Documents

Publication Publication Date Title
WO2012000152A1 (en) Pcr-sequencing method based on technology of dna molecular index and strategy of dna-breaking incompletely
JP5968879B2 (en) PCR sequencing method based on DNA molecular tag technology and DNA incomplete fragmentation technology and HLA genotyping method using the same
WO2012000150A1 (en) Pcr primers for determining hla-a,b genotypes and methods for using the same
WO2012000153A1 (en) High resolution typing method of hla gene based on illumina ga sequencing technology
WO2007106509A2 (en) Methods and means for nucleic acid sequencing
AU2013246050A1 (en) Detection and quantitation of sample contamination in immune repertoire analysis
JP2018524014A5 (en)
WO2012037880A1 (en) Dna tag and application thereof
WO2010039991A2 (en) Method of generating informative dna templates for high-throughput sequencing applications
WO2018147438A1 (en) Pcr primer set for hla gene, and sequencing method using same
WO2012037883A1 (en) Nucleic acid tags and use thereof
WO2012037875A1 (en) Dna tags and use thereof
CN109295500B (en) Single cell methylation sequencing technology and application thereof
CN108642201B (en) SNP (Single nucleotide polymorphism) marker related to millet plant height character as well as detection primer and application thereof
CN109777867B (en) Method for detecting deafness susceptibility gene mutation by combining overlap extension PCR with Sanger sequencing
WO2012083506A1 (en) Method for hla-dqb1 genotyping and related primers thereof
WO2020232635A1 (en) Method and system for constructing sequencing library on the basis of methylated dna target region, and use thereof
TWI542696B (en) HLA - C genotyping and its related primers
CN105603052B (en) Probe and use thereof
Wu et al. Strategies for unambiguous detection of allelic heterozygosity via direct DNA sequencing of PCR products: application to the HLA DRB1 locus
CN112126987A (en) Library construction method for sequencing methylation amplicon
JP2005323565A (en) Method for detecting presence of monobasic mutational polymorphism in target dna sequence, and kit
CN108728567B (en) SNP (Single nucleotide polymorphism) marker related to width character of millet flag leaf as well as detection primer and application thereof
NEDUVAT et al. Cost-Effective TA Cloning Applied to Sanger Sequencing and HLA Allele Typing
CN116179671A (en) Amplification primer group, kit and method for HLA genotyping

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10853858

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10853858

Country of ref document: EP

Kind code of ref document: A1