WO2012037882A1 - DNA tags and their application - Google Patents

DNA tags and their application

Info

Publication number
WO2012037882A1
WO2012037882A1 PCT/CN2011/079904 CN2011079904W WO2012037882A1 WO 2012037882 A1 WO2012037882 A1 WO 2012037882A1 CN 2011079904 W CN2011079904 W CN 2011079904W WO 2012037882 A1 WO2012037882 A1 WO 2012037882A1
Authority
WO
WIPO (PCT)
Prior art keywords
dna
tag
pcr
index
primer
Prior art date
Application number
PCT/CN2011/079904
Other languages
English (en)
French (fr)
Inventor
章文蔚
张艳艳
于竞
田方
陈海燕
龚梅花
周妍
王俊
Original Assignee
深圳华大基因科技有限公司
深圳华大基因研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大基因科技有限公司, 深圳华大基因研究院
Publication of WO2012037882A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing

Definitions

  • the invention relates to the field of nucleic acid sequencing technology, in particular to the field of DNA sequencing technology.
  • the invention relates to DNA tags for DNA sequencing and their use. More specifically, the present invention provides a DNA tag, a DNA tag linker, PCR tag primers, a DNA tag library and a method for preparing it, a method for determining DNA sequence information, a method for determining sequence information of a plurality of DNA samples, and a kit for constructing a DNA tag library.
  • DNA sequencing technology is one of the important methods of molecular biological analysis. It not only provides important data for basic biological research such as gene expression and gene regulation, but also plays an important role in applied research such as disease diagnosis and gene therapy.
  • Solexa DNA Sequencing Platform Illumina
  • SBS Sequencing By Synthesis
  • Illumina has introduced a DNA tag (also known as index) library construction method based on the Solexa DNA sequencing platform. As shown in Fig. 1, three PCR primers are used in this process, and the DNA tag library is constructed by PCR (Preparing Samples for Multiplexed Paired-End Sequencing; Illumina part #1005361 Rev. B, incorporated herein by reference in its entirety).
  • the inventors of the present application found that the above-described method for preparing a tag library has some drawbacks. First, Illumina currently provides only 12 tag sequences, each 6 bp in length. This number is small, and as Solexa sequencing throughput increases, a large number of samples cannot be mixed and sequenced together, which wastes sequencing resources and limits sequencing throughput. Second, the above method introduces the tag sequence into the library of target fragments by a PCR reaction, and amplifying the target fragments requires three PCR primers (two common PCR primers and one PCR tag primer, as shown in Figure 1), which is time-consuming, consumes reagents, is costly, and gives low PCR amplification efficiency.
  • in the above tag library construction method, only the PCR tag primer carries a tag: one tag sequence is introduced into each DNA library by PCR, and after sequencing the DNA tag libraries are distinguished by the unique tag in each library. With the 12 tag sequences provided, therefore, at most 12 DNA samples can be mixed and sequenced simultaneously, and mixed sequencing of a large number of samples cannot be achieved.
  • a DNA tag (herein, simply referred to as a "tag") that can be used to construct a library of DNA tags is presented.
  • the invention proposes a set of isolated DNA tags.
  • the sample source of the DNA can be accurately characterized by linking the DNA tag to the sample DNA or its equivalent.
  • a DNA tag library of a plurality of samples (herein, sometimes referred to as a "tag library") can be simultaneously constructed. DNA tag libraries derived from different samples can thus be mixed and sequenced together, and the DNA sequences can be classified based on the DNA tag to obtain DNA sequence information for the various samples. This makes full use of high-throughput sequencing techniques, for example sequencing multiple DNA tag libraries at once using Solexa sequencing technology, thereby improving the sequencing efficiency and throughput of the DNA tag library.
  • the inventors have surprisingly found that the construction of a DNA tag library using a DNA tag according to an embodiment of the present invention enables precise discrimination of a plurality of DNA tag libraries, and the resulting sequencing data results are very stable and reproducible.
  • the present invention also provides a set of isolated oligonucleotides for introducing the above DNA tag into sample DNA or its equivalent.
  • a set of isolated oligonucleotides according to an embodiment of the invention, each having a first strand and a second strand, the first strand consisting of the nucleotides represented by SEQ ID NO: (3N-1) and the second strand consisting of the nucleotides represented by SEQ ID NO: (3N), respectively
  • these oligonucleotides each carry one of the DNA tags of the present invention as described above and have a sticky end T; thus, the corresponding DNA tag can be introduced into the DNA or its equivalent by a ligation reaction.
  • a corresponding Y-shaped DNA tag linker can be formed by subjecting the sense sequence DNA Index-NF_adapter and its corresponding antisense sequence DNA Index-NR_adapter to an equimolar annealing treatment.
  • the DNA tag is introduced into the DNA of the sample or its equivalent, thereby enabling the construction of a DNA tag library having a DNA tag.
  • the inventors have surprisingly found that when DNA tag libraries containing various DNA tags are constructed for the same sample with oligonucleotides having different tags, the stability and reproducibility of the resulting sequencing data are very good.
  • the human whole blood sample DNA tag libraries constructed using Index 1-59 exhibit a correlation of at least 0.99 when the data are analyzed using the Pearson coefficient. Details of the algorithm for the Pearson coefficient can be found in the relevant literature, for example: 't Hoen, P. A., Y. Ariyurek, et al. (2008).
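The correlation figure quoted above can be illustrated with a minimal sketch of the Pearson coefficient. The read counts below are hypothetical and are not data from the patent; the function name `pearson` is likewise chosen only for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-region read counts from two libraries of the same sample,
# each built with a different tag.
library_a = [120, 98, 143, 110, 87]
library_b = [118, 101, 140, 112, 85]
r = pearson(library_a, library_b)
print(f"r = {r:.4f}")
```

A value of r close to 1 for two libraries of the same sample is what the text describes as reproducible output across tags.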
  • the invention also provides a set of isolated PCR tag primers for introducing the above DNA tag into sample DNA or equivalents thereof.
  • a set of isolated PCR tag primers according to an embodiment of the invention each consisting of the nucleotides set forth in SEQ ID NOs: 178-236.
  • the set of PCR tag primers is sometimes referred to as "first PCR tag primer” or "PCR 1.0 tag primer”, which respectively have a DNA tag according to an embodiment of the present invention as described above.
  • by a PCR reaction using the PCR1.0 tag primer, the PCR1.0 tag primer can be introduced into the DNA of the sample or its equivalent, thereby introducing the corresponding DNA tag into the DNA or its equivalent.
  • PCR1.0 tag primer sequences (SEQ ID NOs: 178-236): only fragments of the table survive. Each surviving row shows the segment CACGACGCTCTTCCGATCT with its SEQ ID NO (183, 185, 187-191, 193, 195, 197, 199, 201, 203, 205, 207, 208, 215, 217-219, 225, 231, 233, 235); the row for SEQ ID NO: 205 is followed by the fragment AATGATACGGCGACCACCGAGATCTAATAGAGACACTCTTTCCCT.
  • the invention provides a further set of isolated PCR tag primers for introducing said DNA tag into sample DNA or its equivalent.
  • the set of isolated PCR tag primers according to an embodiment of the invention, each consisting of the nucleotides set forth in SEQ ID NOs: 237-295.
  • the set of PCR tag primers is sometimes referred to as "second PCR tag primer” or "PCR 2.0 tag primer”, which respectively have a DNA tag according to an embodiment of the present invention as described above.
  • by a PCR reaction using the PCR2.0 tag primer, the PCR2.0 tag primer can be introduced into the DNA of the sample or its equivalent, thereby introducing the corresponding DNA tag into the DNA or its equivalent.
  • PCR2.0 tag primer ( PCR2.0_index_N Primer ) sequence
  • using the above DNA tag linkers and the two sets of PCR tag primers, a tag combination can be introduced into a DNA sample by linker ligation and a PCR reaction. That is, the tags in the DNA tag linker, the PCR1.0 tag primer, and the PCR2.0 tag primer are combined, and the resulting tag combination is introduced into the same DNA sample by linker ligation and a PCR reaction.
  • the 59 tag sequences can generate 205379 (59 × 59 × 59) different tag combinations through the combined use of DNA tag linkers, PCR1.0 tag primers, and PCR2.0 tag primers.
  • different tag combinations are introduced into different DNA samples to construct the DNA tag libraries of the various samples, and the DNA tag libraries can then be distinguished according to the tag combination in each library.
  • mixed sequencing of a very large number of samples can thus be achieved, meeting the needs of high-throughput sequencing and reducing sequencing costs.
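The combination count stated above can be checked with a short sketch. It enumerates (linker tag, PCR1.0 tag, PCR2.0 tag) index triples and hands one triple to each sample; the sample names and the enumeration order are assumptions made for illustration only.

```python
from itertools import product

NUM_TAGS = 59  # number of tags available at each of the three positions


def tag_combinations(num_tags=NUM_TAGS):
    """Return an iterator of (linker, pcr1, pcr2) index triples, 1..num_tags."""
    indices = range(1, num_tags + 1)
    return product(indices, indices, indices)


def assign_combinations(sample_names, num_tags=NUM_TAGS):
    """Give each sample its own tag combination, in enumeration order."""
    if len(sample_names) > num_tags ** 3:
        raise ValueError("more samples than available tag combinations")
    return dict(zip(sample_names, tag_combinations(num_tags)))


total = sum(1 for _ in tag_combinations())
print(total)  # equals 59 ** 3

# Hypothetical sample names.
assignment = assign_combinations(["sample_A", "sample_B", "sample_C"])
print(assignment)
```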
  • the present invention provides a method of preparing a DNA tag library.
  • the method comprises the steps of: fragmenting a DNA sample to obtain a DNA fragment; performing end repair of the DNA fragment to obtain an end-repaired DNA fragment; adding a base A at the 3' end of the end-repaired DNA fragment to obtain a DNA fragment having a sticky end A; ligating the DNA fragment having the sticky end A to a DNA tag linker to obtain a ligation product to which a DNA tag linker is attached, wherein the DNA
  • the tag linker is one of the isolated oligonucleotides according to an embodiment of the present invention; the ligation product is subjected to a PCR reaction to obtain a PCR amplification product, wherein the PCR reaction uses a first PCR tag primer and a second a PCR tag primer, wherein the first PCR tag primer is a set of isolated PCR tag primers consisting of the nu
  • the PCR amplification product comprises a fragment of interest, a DNA linker, and a DNA tag, wherein the sequence of the target fragment corresponds to the sequence of the DNA fragment; and the PCR amplification product is isolated and recovered, the PCR amplification
  • the product constitutes the DNA tag library.
  • the present invention also provides a DNA tag library obtained by a method of preparing a DNA tag library according to an embodiment of the present invention.
  • the present invention also provides a method of determining DNA sample sequence information.
  • the method comprises the steps of: establishing a DNA tag library of the DNA sample according to a method of preparing a DNA tag library according to an embodiment of the present invention; and sequencing the DNA tag library to determine the sequence information of the DNA sample. With this method, the sequence information of the DNA sample in the DNA tag library and the sequence information of the DNA tag combination can be obtained efficiently, so that the source of the DNA sample can be identified.
  • the inventors have surprisingly found that the use of the method according to an embodiment of the present invention to determine DNA sample sequence information can effectively reduce the problem of data output bias and can accurately distinguish a plurality of DNA tag libraries.
  • the present invention also provides a method of determining a plurality of DNA sample sequence information.
  • the method comprises the steps of: establishing, independently for each of the plurality of samples, a DNA tag library of the DNA sample according to the method of constructing a DNA tag library according to an embodiment of the present invention, wherein different DNA samples use different combinations of DNA tags of known sequence; combining the DNA tag libraries of the plurality of samples to obtain a DNA tag library mixture; sequencing the DNA tag library mixture using the Solexa sequencing technology to obtain sequence information of the DNA samples and sequence information of the tag combinations; and classifying the sequence information of the DNA samples based on the sequence information of the tag combinations to determine the DNA sequence information of the plurality of samples.
  • the method according to an embodiment of the present invention can make full use of high-throughput sequencing technology, for example using Solexa sequencing technology to sequence the DNA tag libraries of various samples simultaneously. This improves the efficiency and throughput of DNA tag library sequencing and, at the same time, the efficiency of determining the sequence information of a variety of DNA samples.
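The classification step of this method can be sketched as a lookup from tag combination to sample. This is a minimal illustration, not the procedure used in the patent: reads are modeled as already split into a tag combination and an insert sequence, and all tag and insert sequences below are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, combo_to_sample):
    """Group insert sequences by sample using each read's tag combination.

    reads: iterable of (tag_combination, insert_sequence) pairs.
    combo_to_sample: dict mapping a known tag combination to a sample name.
    Reads whose combination is not in the table go to the "unassigned" bin.
    """
    by_sample = defaultdict(list)
    for combo, insert in reads:
        by_sample[combo_to_sample.get(combo, "unassigned")].append(insert)
    return dict(by_sample)


# Hypothetical (linker tag, PCR1.0 tag, PCR2.0 tag) combinations.
combo_to_sample = {
    ("ACGTACG", "GTACGTA", "CATGCAT"): "sample_A",
    ("TGCATGC", "ACGTACG", "GTACGTA"): "sample_B",
}
reads = [
    (("ACGTACG", "GTACGTA", "CATGCAT"), "TTGACCA"),
    (("TGCATGC", "ACGTACG", "GTACGTA"), "GGCATTA"),
    (("ACGTACG", "GTACGTA", "CATGCAT"), "CCATGGA"),
    (("AAAAAAA", "CCCCCCC", "GGGGGGG"), "ACACACA"),
]
result = demultiplex(reads, combo_to_sample)
print(result)
```

An exact-match lookup is the simplest choice; tolerating sequencing errors in the tags would need a distance-based match instead.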
  • a kit for constructing a DNA tag library, comprising: 59 isolated oligonucleotides according to an embodiment of the present invention, each isolated oligonucleotide having a first strand consisting of the nucleotides represented by SEQ ID NO: (3N-1) and a second strand consisting of the nucleotides represented by SEQ ID NO: (3N), wherein, for the same oligonucleotide, the N values of the first strand and the second strand are the same, and N is any integer from 1 to 59
  • 59 isolated first PCR tag primers, each consisting of the nucleotides set forth in SEQ ID NOs: 178-236, respectively
  • 59 isolated second PCR-tagged primers each consisting of the nucleotides set forth in SEQ ID NOs: 237-295
  • Figure 1 shows a schematic flow diagram of a method for constructing a DNA tag library provided by Illumina
  • FIG. 2 is a schematic flow chart showing a method for constructing a DNA tag library according to an embodiment of the present invention
  • FIG. 3 is a view showing a DNA tag library constructed by a method for constructing a DNA tag library according to an embodiment of the present invention and a different tag combination thereof;
  • FIG. 4 is a schematic diagram showing a DNA tag library constructed by a method of constructing a DNA tag library and a different tag combination thereof according to an embodiment of the present invention
  • Figure 5 is a schematic diagram showing a DNA tag library constructed by a method of constructing a DNA tag library and a different tag combination thereof according to an embodiment of the present invention
  • Fig. 6 shows the results of electrophoresis detection of a DNA tag library constructed by the method of constructing a DNA tag library according to an embodiment of the present invention.
  • the present invention proposes a number of isolated DNA tags.
  • the DNA tags consist of the nucleotides represented by SEQ ID NO: (3N-2), where N is any integer from 1 to 59.
  • DNA as used in the present invention may be any polymer comprising deoxyribonucleotides, including but not limited to modified or unmodified DNA.
  • a DNA tag library having a tag is obtained by linking the DNA tag to the DNA of the sample or its equivalent. Sequencing the DNA tag library yields both the sequence of the sample DNA and the sequence of the tag, and the tag sequence then accurately identifies the sample from which the DNA came.
  • a DNA tag library of a plurality of samples can be simultaneously constructed, and the DNA sequence of the sample can be classified based on the DNA tag by mixing and simultaneously sequencing the DNA tag library derived from different samples.
  • the expression "DNA tag linked to the DNA of the sample or its equivalent" should be understood broadly: the DNA tag may be linked directly to the DNA of the sample to construct a DNA tag library, or may be linked to a nucleic acid having the same sequence as the DNA of the sample (for example, the corresponding RNA sequence or cDNA sequence).
  • the inventors of the present application found that, to design effective DNA tags, it is first necessary to consider recognizability and the recognition rate between tag sequences. Second, when tags for fewer than 12 samples are mixed, the GT content at each base position across the mixed tags must be considered. In Solexa sequencing, bases G and T share the same excitation light, and bases A and C share the same excitation light, so the "GT" content and the "AC" content at each position must be balanced. A "GT" content of 50% guarantees the highest tag recognition rate and the lowest error rate. Finally, the repeatability and accuracy of the data output must be considered.
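The per-position GT balance described above can be checked with a small sketch. The tag sequences are hypothetical 7-mers chosen for the example and are not tags from the patent; the tolerance parameter is also an assumption.

```python
def gt_fraction_per_position(tags):
    """Fraction of tags carrying G or T at each position of a tag pool."""
    if not tags:
        raise ValueError("tag pool is empty")
    lengths = {len(tag) for tag in tags}
    if len(lengths) != 1:
        raise ValueError("all tags must have the same length")
    length = lengths.pop()
    return [
        sum(tag[i] in "GT" for tag in tags) / len(tags)
        for i in range(length)
    ]


def is_gt_balanced(tags, tolerance=0.0):
    """True if every position is within `tolerance` of 50% G/T content."""
    return all(abs(f - 0.5) <= tolerance for f in gt_fraction_per_position(tags))


# Hypothetical pool of four 7 bp tags to be mixed in one lane.
pool = ["ACGTACG", "GTACGTA", "CATGCAT", "TGCATGC"]
fractions = gt_fraction_per_position(pool)
print(fractions, is_gt_balanced(pool))
```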
  • to construct and sequence DNA tag libraries efficiently, the set of DNA tags must give reliable and highly reproducible results: for the same DNA sample, DNA tag libraries constructed with different tags from the set must give consistent sequencing results. In addition, runs of 3 or more consecutive identical bases in the tag sequence must be avoided, because such runs increase the error rate during synthesis or sequencing, and the DNA tag linker itself should as far as possible not form a hairpin structure.
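The rule about consecutive bases can be expressed as a simple filter. This sketch reads "3 or more consecutive bases" as a run of identical bases, which is an interpretation of the text; hairpin prediction is not covered here, and the example sequences are hypothetical.

```python
import re

# Any character followed by at least two more copies of itself.
_RUN_OF_THREE = re.compile(r"(.)\1{2,}")


def has_long_run(seq):
    """True if seq contains 3 or more consecutive identical bases."""
    return _RUN_OF_THREE.search(seq) is not None


def filter_candidates(candidates):
    """Keep only candidate tags without a run of 3 or more identical bases."""
    return [tag for tag in candidates if not has_long_run(tag)]


# Hypothetical candidate tags.
kept = filter_candidates(["ACGGGTA", "ACGGTAC", "TTTACGA", "AACCGGT"])
print(kept)
```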
  • the inventors of the present application carried out extensive screening and selected a set of isolated DNA tags according to an embodiment of the present invention, consisting of the nucleotides represented by SEQ ID NO: (3N-2), respectively. The sequences are as shown in Table 1 above and will not be described again.
  • these tags can be applied to the construction of any DNA tag library. There are currently no reports of these tags being used for library construction and Solexa sequencing of DNA samples.
  • the DNA tags used are nucleic acid sequences 7 bp in length, and the difference between the tags is more than 3 bases. The set of DNA tags comprises or consists of at least 5, or at least 10, or at least 15, or at least 20, or at least 25, or at least 30, or at least 35, or at least 40, or 45, or at least 50, or at least 55, or all 59 of the 59 DNA tags shown in Table 1 or of DNA tags differing from them by one base.
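The pairwise-difference requirement can be checked by computing the Hamming distance between every pair of equal-length tags. The tags below are hypothetical, and the threshold is left as a parameter because the exact cutoff should be taken from the text rather than from this sketch.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(tags):
    """Smallest Hamming distance over all pairs in a tag set."""
    return min(hamming(a, b) for a, b in combinations(tags, 2))


def all_pairs_differ_by(tags, min_distance):
    """True if every pair of tags differs in at least min_distance positions."""
    return min_pairwise_distance(tags) >= min_distance


# Hypothetical 7 bp tags.
tags = ["ACGTACG", "ACGATGC", "TGCTACG"]
print(min_pairwise_distance(tags))
```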
  • the set of DNA tags preferably includes at least, among the 59 DNA tags shown in Table 1, DNA Index 1 to DNA Index 5, or DNA Index 6 to DNA Index 10, or DNA Index 11
  • the 1 base difference comprises a substitution, addition or deletion of 1 base in the sequence of 59 tags shown in Table 1.
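The set of sequences that differ from a tag by one base, in the sense just defined, can be enumerated directly. This is a minimal sketch; the function name and the example tag are hypothetical.

```python
BASES = "ACGT"


def one_base_variants(tag):
    """All sequences reachable from tag by one substitution, addition or deletion."""
    variants = set()
    for i in range(len(tag)):
        # Deletion of the base at position i.
        variants.add(tag[:i] + tag[i + 1:])
        # Substitution of the base at position i.
        for base in BASES:
            if base != tag[i]:
                variants.add(tag[:i] + base + tag[i + 1:])
    for i in range(len(tag) + 1):
        # Addition of one base before position i (or at the end).
        for base in BASES:
            variants.add(tag[:i] + base + tag[i:])
    variants.discard(tag)
    return variants


variants = one_base_variants("ACGT")  # hypothetical 4 bp example
print(len(variants))
```

Collecting the results in a set removes duplicates, since inserting a base next to an identical base gives the same sequence from two positions.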
  • the present invention also provides the use of a tag according to an embodiment of the present invention for constructing and sequencing a DNA tag library, wherein the DNA tag linker of the DNA tag library comprises a DNA tag of the embodiment of the present invention, thereby constituting the corresponding DNA tag linkers.
  • the DNA tag is inserted into the 3' end of the DNA tag linker, or is ligated to the 3' end of the DNA tag linker with or without a linker. Preferably it is inserted into the 3' end of the DNA tag linker and, according to a particular example, more preferably inserted 1 base from the 3' end of the DNA tag linker. The PCR tag primers of the DNA tag library likewise contain the DNA tag according to an embodiment of the present invention, thereby constituting the corresponding PCR tag primers.
  • the DNA tag is inserted into the PCR tag primer.
  • Oligonucleotides, PCR tag primers, and construction of DNA tag libraries
  • the present invention provides a set of isolated oligonucleotides which can be used to introduce a DNA tag as described above into DNA of a sample, thereby constructing a DNA tag library.
  • the invention provides a set of isolated oligonucleotides, each having a sticky end T and having a first strand and a second strand, the sticky end T being formed on the first strand of each oligonucleotide.
  • the first strand is composed of the nucleotides represented by SEQ ID NO: (3N-1), and the second strand is composed of the nucleotides represented by SEQ ID NO: (3N), respectively.
  • the N values of the first strand and the second strand are the same, that is, when the corresponding nucleotides in the sequence listing are used as the first strand and the second strand, respectively, the core of the first strand is formed.
  • the corresponding oligonucleotides can be formed by annealing the first strand and the second strand constituting the corresponding oligonucleotide, respectively.
  • the above oligonucleotides each carry one of the DNA tags according to the embodiments of the present invention as described above and have sticky ends; thus, the corresponding DNA tags can be introduced into the DNA of the sample or its equivalent by a ligation reaction. Specifically, the sequences of these oligonucleotides are as shown in Table 1 above and will not be described again.
  • the oligonucleotide sequence (DNA tag linker) provided according to an embodiment of the present invention has high stability. This finding was primarily based on the analysis of the structural stability of these oligonucleotide sequences by Lasergene software (http://www.dnastar.com/) in accordance with some embodiments of the present invention.
  • the affinity parameter between the duplexes can be determined by analyzing the energy values formed between the two sequences, thereby predicting the most stable dimer overall formed by the DNA tag linker and its energy value; a larger absolute energy value (kcal/mol) indicates a more stable duplex.
  • the secondary structure of the DNA tag N adapter and the most stable dimer overall ("Y-type" structure), with its energy, are provided below according to an embodiment of the present invention. DNA index 1 adapter:
  • the most stable dimer overall: 20 bp, -38.3 kcal/mol 5' TACACTCT TCCCTACACGACGCTCTTCCGATCTAAGAGGCT 3'
  • the most stable dimer overall: 20 bp, -37.4 kcal/mol 5' TACAC TC TTCC C ACAC CACGCT CT TCCGATC TAAGC TTGT 3'
  • the most stable dimer overall: 20 bp, -38.9 kcal/mol 5' TACACT CT TTCC C ACACGACGCT CT TCCGATC CCT GAGGT 3'
  • the most stable dimer overall: 20 bp, -35.2 kcal/mol 5' T ACACTC TTTCC C AC AC GACGC TC TTCCGATCTCTC T ATTT 3'
  • the most stable dimer overall: 20 bp, -39.3 kcal/mol 5' TACACTCTTTCC CTACAC GACGCTCTTCCGATCTGCGATGTT 3'
  • DNA index 43 adapter: the most stable dimer overall: 20 bp, -38.6 kcal/mol 5' TACACTCTTTCCCTACACGACGCTCT CCGATCTGGTGGACT 3'
  • the present invention provides a set of DNA tag linkers comprising the DNA tags described above, wherein the DNA tag linker of the DNA tag library comprises the tag at its 3' end. Preferably, these DNA tag linkers include or consist of at least 5, or at least 10, or at least 15, or at least 20, or at least 25, or at least 30, or at least 35, or at least 40, or 45, or at least 50, or at least 55, or all 59 of the 59 DNA tag linkers shown in Table 2 or of DNA tag linkers which differ from them by one base in the DNA tag sequence contained therein.
  • these DNA tag linkers preferably include at least, among the 59 DNA tag linkers shown in Table 2, DNA Index 1F/R_adapter to DNA Index 5F/R_adapter, or DNA Index 6F/R_adapter to DNA Index 10F/R_adapter, or DNA Index 11F/R_adapter to DNA Index 15F/R_adapter, or DNA Index 16F/R_adapter to DNA Index 20F/R_adapter, or DNA Index 21F/R_adapter to DNA Index 25F/R_adapter, or DNA Index 26F/R_adapter to DNA Index 30F/R_adapter, or DNA Index 31F/R_adapter to DNA Index 35F/R_adapter, or DNA Index 36F/R_adapter to DNA Index 40F/R_adapter, or DNA Index 41F/R_adapter to DNA Index 45F/R_adapter, or DNA Index 46F/R_adapter to DNA Index 50F/R_adapter, or DNA Index 41F/R_adapter to
  • the present invention provides two sets of isolated PCR tag primers which can be used to introduce the DNA tag described above into the DNA of the sample, thereby constructing a DNA tag library.
  • of the two sets of isolated PCR tag primers, one set consists of the nucleotides set forth in SEQ ID NOs: 178-236, respectively, and the other set consists of the nucleotides set forth in SEQ ID NOs: 237-295, respectively. In the embodiment of the present invention, the two sets of PCR tag primers each carry a DNA tag according to the embodiment of the present invention, and by a PCR reaction using the PCR tag primers, the primers, and with them the corresponding DNA tags, can be introduced into the DNA of the sample or its equivalent.
  • sequences of these PCR tag primers are as shown in Table 2 and Table 3 above, and are not described herein again.
  • the set of PCR tag primers consisting of the nucleotides set forth in SEQ ID NOs: 178-236, respectively, is used as the first PCR tag primer (sometimes also referred to as "PCR 1.0 tag primer"), and the set of PCR tag primers consisting of the nucleotides set forth in SEQ ID NOs: 237-295, respectively, is used as the second PCR tag primer (sometimes referred to as "PCR 2.0 tag primer").
  • by a PCR reaction, the PCR1.0 tag primer and the PCR2.0 tag primer can be simultaneously introduced into the DNA of one sample or its equivalent, thereby introducing the corresponding DNA tags into the DNA or its equivalent.
  • here, the corresponding DNA tags refer to the DNA tags respectively included in the two primers used in the same PCR reaction.
  • the invention provides two sets of PCR tag primers comprising a DNA tag according to an embodiment of the invention described above at the 3' end.
  • the set of PCR tag primers consisting of the nucleotides represented by SEQ ID NOs: 178-236, respectively, includes or consists of at least 5, or at least 10, or at least 15, or at least 20, or at least 25, or at least 30, or at least 35, or at least 40, or 45, or at least 50, or at least 55, or all 59 of the 59 PCR1.0 tag primers shown in Table 2 or of PCR1.0 tag primers differing from them by 1 base in the DNA tag sequence contained therein.
  • these PCR1.0 tag primers preferably include at least, among the 59 PCR1.0 tag primers shown in Table 2, PCR1.0_Index_1 Primer to PCR1.0_Index_5 Primer, or PCR1.0_Index_6 Primer to PCR1.0_Index_10 Primer, or PCR1.0_Index_11 Primer to PCR1.0_Index_15 Primer, or PCR1.0_Index_16 Primer to PCR1.0_Index_20 Primer, or PCR1.0_Index_21 Primer to PCR1.0_Index_25 Primer, or PCR1.0_Index_26 Primer to PCR1.0_Index_30 Primer, or PCR1.0_Index_31 Primer to PCR1.0_Index_35 Primer, or PCR1.0_Index_36 Primer to PCR1.0_Index_40 Primer, or PCR1.0_Index_41 Primer to PCR1.0_Index_45 Primer, or PCR1.0_Index_46 Primer to PCR1.0_Index_50 Primer, or PCR1.0_Index_51 Primer to PCR1.0_Index_55 Primer, or PCR1.0_Index_55 Primer
  • the set of PCR tag primers consisting of the nucleotides represented by SEQ ID NOs: 237-295, respectively, includes or consists of at least 5, or at least 10, or at least 15, or at least 20, or at least 25, or at least 30, or at least 35, or at least 40, or 45, or at least 50, or at least 55, or all 59 of the 59 PCR2.0 tag primers shown in Table 3 or of PCR2.0 tag primers differing from them by 1 base in the DNA tag sequence contained therein.
  • these PCR2.0 tag primers preferably include at least, among the 59 PCR2.0 tag primers shown in Table 3, PCR2.0_Index_1 Primer to PCR2.0_Index_5 Primer, or PCR2.0_Index_6 Primer to PCR2.0_Index_10 Primer, or PCR2.0_Index_11 Primer to PCR2.0_Index_15 Primer, or PCR2.0_Index_16 Primer to PCR2.0_Index_20 Primer, or PCR2.0_Index_21 Primer to PCR2.0_Index_25 Primer, or PCR2.0_Index_26 Primer to PCR2.0_Index_30 Primer, or PCR2.0_Index_31 Primer to PCR2.0_Index_35 Primer, or PCR2.0_Index_36 Primer to PCR2.0_Index_40 Primer, or PCR2.0_Index_41 Primer to PCR2.0_Index_45 Primer, or PCR2.0_Index_46 Primer to PCR2.0_Index_50 Primer, or PCR2.0_Index_51 Primer to PCR2.
  • PCR2.0_Index_55 Primer to PCR2.0_Index_59 Primer, or a combination of any two or more of them.
  • the 1-base difference comprises a substitution, addition or deletion of 1 base in the tag sequence.
  • the use of PCR tag primers for DNA tag library construction and sequencing is also provided.
  • a DNA tag library constructed using the above DNA tag linker and PCR tag primer is also provided.
  • the present invention also provides a method of constructing a DNA tag library using the above DNA tag linker and PCR tag primer.
  • the method includes: First, a DNA sample is fragmented to obtain a DNA fragment.
  • the DNA sample is fragmented by ultrasonication.
  • the source of the DNA sample is not particularly limited.
  • the DNA sample is a human DNA sample. More specifically, it can be a human genomic DNA sample.
  • the inventors have found that a DNA tag library of a plurality of common model organisms can be efficiently constructed using the method according to an embodiment of the present invention.
  • the obtained DNA fragment has a length of about 180 bp, whereby the efficiency of constructing a DNA tag library and subsequent sequencing can be further improved.
  • the DNA fragment is end-repaired to obtain a DNA fragment that has been repaired at the end.
  • the end-repaired DNA fragment has two oligonucleotide strands, and base A is added at the 3' end of both oligonucleotide strands.
  • a DNA fragment having a sticky end A is ligated to a DNA tag linker to obtain a ligation product to which a DNA tag linker is attached.
  • both ends of the DNA fragment are ligated to a DNA tag linker.
  • the DNA fragment having the sticky end A is linked to the DNA tag linker by attaching the DNA tag linker at the 3' end of both oligonucleotide strands of the DNA fragment having the sticky end A.
  • the DNA tag linker is one of a group of isolated oligonucleotides according to an embodiment of the present invention
  • the DNA tag linker comprises one of the above-described set of isolated DNA tags according to an embodiment of the present invention.
  • the resulting ligation product is subjected to a PCR reaction to obtain a PCR amplification product.
  • the PCR reaction uses a first PCR tag primer and a second PCR tag primer
  • the first PCR tag primer is one of the set of isolated PCR tag primers according to an embodiment of the present invention consisting of the nucleotides represented by SEQ ID NOs: 178-236, respectively, and the second PCR tag primer is one of the set of isolated PCR tag primers consisting of the nucleotides set forth in SEQ ID NOs: 237-295, respectively.
  • the first PCR tag primer and the second PCR tag primer comprise different DNA tags.
  • the PCR amplification product comprises a fragment of interest, a DNA linker, and a DNA tag, wherein the sequence of the target fragment corresponds to the sequence of the DNA fragment.
  • the statement that the sequence of the target fragment corresponds to the sequence of the DNA fragment means that the sequence of the DNA fragment can be derived directly from the sequence of the target fragment; for example, the sequence of the target fragment can be identical to the sequence of the DNA fragment, or fully complementary to it, or can even differ by the addition or removal of a known number of known bases, as long as the sequence of the DNA fragment can be obtained by a limited calculation.
  • the obtained PCR amplification product is separated and recovered, and the PCR amplification product constitutes the DNA tag library.
  • the method for separating and recovering the amplified product is also not particularly limited, and those skilled in the art can select an appropriate method and apparatus for separation according to the characteristics of the amplified product, for example, by electrophoresis followed by recovery of a PCR amplification product of a specific length.
  • PCR amplification products having a length of about 380-400 bp are preferably recovered.
  • the present invention provides a method of constructing a DNA tag library, comprising:
  • n is an integer with 1 ≤ n ≤ 59, preferably an integer with 2 ≤ n ≤ 59; the DNA sample can come from any eukaryotic or prokaryotic DNA sample, including but not limited to a human DNA sample;
  • the breaking method includes, but is not limited to, an ultrasonic interrupting method, and preferably the disrupted DNA band is concentrated at about 250 bp;
  • each tag linker is attached to both ends of the DNA fragment
  • the linked product obtained in the step 5) is subjected to gel recovery and purification, preferably by electrophoresis and recovery by 2% agarose gel, and the recovered products of the respective DNA samples are mixed together;
  • PCR reaction using a mixture of the recovered products of the step 6) as a template, performing PCR amplification under conditions suitable for amplifying the nucleic acid of interest, and purifying and purifying the PCR product, preferably recovering the 380-400 bp target fragment.
  • a DNA tag library constructed by the above method for constructing a DNA tag library according to an embodiment of the present invention has DNA tag linkers comprising or consisting of at least 5, or at least 10, or at least 15, or at least 20, or at least 25, or at least 30, or at least 35, or at least 40, or 45, or at least 50, or at least 55, or all 59 of the 59 DNA tag linkers shown in Table 1 or of DNA tag linkers whose DNA tag sequences differ from them by one base.
  • the DNA tag linker preferably comprises at least the DNA Index 1F/ of the 59 DNA tag linkers shown in Table 1.
  • 1 base difference comprises a substitution, addition or deletion of 1 base in the tag.
  • in the above steps of the method for constructing a DNA tag library according to an embodiment of the present invention, the primers used in the PCR reaction are as follows: one of a set of isolated PCR tag primers consisting of the nucleotides set forth in SEQ ID NOs: 178-236, respectively, is used as PCR Primer 1.0, and one of a set of isolated PCR tag primers consisting of the nucleotides set forth in SEQ ID NOs: 237-295, respectively, is used as PCR Primer 2.0.
  • a DNA tag according to an embodiment of the present invention can be efficiently introduced into a DNA tag library constructed for a DNA sample.
  • the DNA tag library can be sequenced to obtain sequence information of the DNA sample and sequence information of the DNA tag, thereby enabling differentiation of the source of the DNA sample.
  • the DNA tag linker, the PCR1.0 tag primer, and the PCR2.0 tag primer used in the method of constructing a DNA tag library according to an embodiment of the present invention each contain a tag, whereby the DNA tag library constructed according to the method carries 3 tags, and these 3 tags form a "tag combination". With the 59 tag sequences of the embodiments of the present invention, 205,379 (59 x 59 x 59) different tag combinations of 3 tags can be generated.
  • using the above DNA tag linker and the two sets of PCR tag primers, a tag combination sequence can be introduced into the DNA sample by linker ligation and a PCR reaction.
  • the tags are introduced when the DNA tag linker, the PCR1.0 tag primer and the PCR2.0 tag primer act together on the DNA library.
  • hybrid sequencing of an extremely large number of samples can be finally achieved by constructing a huge cluster of tags.
  • the method for constructing a DNA tag library provided by the present invention has been significantly improved, so that it makes full use of the high-throughput sequencing platform, meets the needs of high-throughput sequencing, saves sequencing resources and reduces sequencing costs.
  • the present invention optimizes the library construction method provided by Illumina: instead of using three PCR primers (two common primers and one PCR tag primer) to introduce a tag, only two PCR primers (the PCR1.0 tag primer and the PCR2.0 tag primer) are needed to introduce the tags, which reduces the difficulty of the PCR reaction, improves the specificity of the PCR amplification, and improves the efficiency of the PCR amplification reaction. The present invention also improves the recognition efficiency of the tag sequence, increases the efficiency of DNA tag library construction, and reduces the cost of library construction.
  • referring to FIG. 1 and FIG. 2, FIG. 1 shows a flow chart of Illumina's method for constructing a DNA tag library, and FIG. 2 shows a flow chart of the method for constructing a DNA tag library according to an embodiment of the present invention. So far, neither this DNA library construction method nor the tag sequences of the tag combinations introduced by these DNA tag linkers, PCR1.0 tag primers and PCR2.0 tag primers have been reported.
  • DNA tag libraries containing various DNA tag combinations are constructed using DNA tag linkers, PCR1.0 tag primers, and PCR2.0 tag primers that carry different tag combinations.
  • the resulting sequencing data results are very stable and reproducible.
  • the present invention also provides a kit for constructing a DNA tag library.
  • the PCR tag primers are
  • a DNA tag according to an embodiment of the present invention can be conveniently introduced into a constructed DNA tag library.
  • other components for constructing a DNA tag library can also be included in the kit, and details are not described herein.
  • the present invention also provides a DNA tag library constructed according to the method of constructing a DNA tag library of the present invention.
  • the tagged DNA tag library can be effectively applied to high-throughput sequencing technologies such as Solexa technology, so that the obtained nucleic acid sequence information such as DNA sequence information can be accurately classified by sample source by obtaining a tag sequence.
  • the present invention also provides a method of determining DNA sample sequence information.
  • it comprises: constructing a DNA tag library according to a method of constructing a DNA tag library according to an embodiment of the present invention; and then, the constructed DNA tag library is sequenced to determine sequence information of the DNA sample. Based on this method, the sequence information of the DNA sample in the DNA tag library and the sequence information of the DNA tag can be efficiently obtained, thereby enabling differentiation of the source of the DNA sample.
  • the inventors have surprisingly found that the use of the method according to an embodiment of the present invention to determine DNA sample sequence information can effectively reduce the problem of data output bias and can accurately distinguish a plurality of DNA tag libraries.
  • the constructed DNA tag library can be sequenced by any known method, and the type thereof is not particularly limited. According to some examples of the invention, DNA tag libraries can be sequenced using Solexa sequencing technology. According to an embodiment of the present invention, suitable sequencing primers can be selected for sequencing according to specific conditions.
  • the present invention provides a method of determining sequence information for a plurality of DNA samples.
  • the method comprises the steps of: for each of the plurality of samples, independently constructing a DNA tag library of the DNA sample according to the method for constructing a DNA tag library according to an embodiment of the present invention, wherein different DNA samples use combinations of DNA tags that differ from one another and have known sequences.
  • the term "plurality" means at least two.
  • the expression "combinations of DNA tags that differ from one another and have known sequences" means that the tag combination of the DNA tag library constructed for one DNA sample differs from the tag combinations of the DNA tag libraries of the other samples, and that, since the three tag sequences constituting each tag combination are known, the sequence of each tag combination is known.
  • the label combination means that the DNA tag linker, the PCR1.0 tag primer and the PCR2.0 tag primer used in the method for constructing the DNA tag library according to the embodiment of the present invention each contain a tag, thereby constructing according to the method
  • the three DNA tags in each tag combination may be all the same, or may be completely different, or may be any two identical.
  • the label combination "different from each other" means that there is at least one difference in the DNA label between the label combinations, that is, at least one label is different between the six labels of any two label combinations, that is, There must be one label different, or two labels may be different, or three labels may be different, or four labels may be different, or five labels may be different, or even six labels may be different.
  • 205,379 different tag combinations of 3 tags can be produced.
  • a tag combination can be introduced into a DNA library, and by introducing different tag combinations it is possible to construct DNA tag libraries of a plurality of DNA samples, so that, after the DNA tag libraries have been sequenced, they can be distinguished based on the differences between the tag combinations in the different DNA tag libraries.
  • hybrid sequencing of an extremely large number of samples can be finally realized by constructing a huge cluster of tags.
  • the obtained DNA tag libraries of the various samples are combined to obtain a DNA tag library mixture, and the DNA tag library mixture is sequenced by Solexa sequencing technology, thereby obtaining the sequence information of the DNA samples and the sequence information of the tags. Finally, based on the sequence information of the tag combinations, the sequence information of the DNA samples is classified to determine the sequence information of the plurality of DNA samples.
  • the method according to an embodiment of the present invention can make full use of high-throughput sequencing technology, for example, using Solexa sequencing technology to simultaneously sequence DNA libraries of various samples, thereby improving the efficiency and throughput of DNA library sequencing. At the same time, the efficiency of determining sequence information of a plurality of DNA samples can be improved.
  • the sequencing method and the sequencing primer used in the prior art have been described in detail above and will not be described again here.
  • the pMD18-T plasmid vector (Takara, Japan) was used as a template, and primers were designed using Primer Premier 5.0 software:
  • pMD18-T Primer 1: CGGGGAGAGGCGGTTTGCGTATTGG;
  • pMD18-T Primer 2: TTTTGTGATGCTCGTCAGGGGGGCG.
  • a fragment of 250 bp in length was amplified by PCR, the concentration of the amplified product was measured with a NanoDrop 1000 instrument (NanoDrop, USA), and then, according to the concentration, 1 μg of this PCR product was taken as the DNA fragment for library construction and made up with water to a volume of 35 μl.
  • the PCR product is then purified using the QIAquick PCR Purification Kit.
  • the reaction, in a total volume of 100 μl, was incubated in a Thermomixer comfort at 20 °C for 30 min, then purified with the QIAquick PCR Purification Kit, and finally the sample was dissolved in 32 μl of EB solution.
  • the reaction, in a total volume of 50 μl, was incubated in a Thermomixer comfort at 37 °C for 30 min, then purified with the MinElute PCR Purification Kit, and finally the sample was dissolved in 10 μl of EB solution.
  • the DNA tag linker can be any of the 59 DNA tag linkers in Table 1 (each consists of two complementary sequences, DNA Index-NF_adapter and DNA Index-NR_adapter).
  • the ligation product was electrophoretically separated in 2% agarose gel; the target fragment strip was then transferred to an Eppendorf tube.
  • the gel slice was purified with the QIAquick gel purification kit and the recovered product was dissolved in 20 μl of EB solution.
  • the reaction mixture was prepared according to the following reaction system, with the reagents kept on ice.
  • PCR1.0_indexN Primer can be any of the 59 PCR1.0_indexN Primer tag primers in Table 2; PCR2.0_indexN Primer can be any of the 59 PCR2.0_indexN Primer tag primers in Table 3.
  • the PCR product was electrophoresed in a 2% agarose gel, the target fragment was excised and recovered, purified with the QIAquick gel purification kit, and the recovered product was dissolved in 30 μl of EB solution.
  • FIGS 3, 4, and 5 show schematic diagrams of DNA tag libraries constructed by methods of constructing DNA tag libraries and their different tag combinations in accordance with an embodiment of the present invention.
  • the DNA tag library can be distinguished by the sequence information of different tag combinations. Specifically, the DNA tag library can be distinguished by the combination of the tags in the DNA tag linker, the PCR 1.0 index Primer, and the PCR 2.0 index Primer, wherein the number of tag combinations can reach 205,379 (59 x 59 x 59).
  • DNA tag libraries of a large number (up to 205,379) of DNA samples can be constructed by introducing different tag combinations into the DNA libraries through the DNA tag linker, the PCR1.0 tag primer, and the PCR2.0 tag primer.
  • the DNA tag libraries can be distinguished based on the tag combination after the DNA tag libraries have been sequenced.
  • hybrid sequencing of an extremely large number of samples can be finally realized by constructing a huge cluster of tags.
  • the information sequence of the DNA fragment is: GTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAA
  • the sequence information of the library constructed as shown in Figure 3 is as follows: >Index tagA-1: index1+index1+index1
  • the fragment information sequence of the DNA fragment is:
  • the sequence information of the library constructed as shown in Figure 4 is as follows: >Index tagB-1: index1+index1+index1
  • FIG. 6 shows the results of electrophoretic detection of a DNA tag library constructed by the method of constructing a DNA tag library according to an embodiment of the present invention.
  • the target fragment library is 390 bp as indicated by the arrow;
  • the D2000 marker band sizes are: 2000 bp, 1000 bp, 750 bp, 500 bp, 250 bp, 100 bp; the lanes are: 1, D2000 marker; 2, Index tagA-1; 3, Index tagA-2; 4, Index tagA-3; 5, Index tagA-58; 6, Index tagA-59; 7, Index tagB-1; 8, Index tagB-2; 9, Index tagB-3; 10, Index tagB-58; 11, Index tagB-59; 12, Index tagC-1; 13, Index tagC-2; 14, Index tagC-3; 15, Index tagC-58; 16, Index tagC-59; 17, Index tagD-1; 18, Index tagD-2; 19, Index tagD-58; 20, Index tagD-59; 21, Index tagE-1; 22, Index tag
  • the DNA tag library kit can be applied to DNA sequencing and can effectively improve the sequencing throughput of sequencing platforms such as the Solexa sequencing platform.

Description

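DNA Tag and Use Thereof
Priority Information
This application claims priority to and the benefit of Patent Application No. 201010299271.3, filed with the State Intellectual Property Office of China on September 21, 2010, the entire contents of which are incorporated herein by reference.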
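Technical Field
The invention relates to the field of nucleic acid sequencing technology, in particular to the field of DNA sequencing technology. Specifically, the invention relates to DNA tags for DNA sequencing and their use. More specifically, the present invention provides a DNA tag for constructing a DNA tag library, a DNA tag linker, a PCR tag primer, a DNA tag library and a method for preparing it, a method for determining the sequence information of a DNA sample, a method for determining the sequence information of a plurality of DNA samples, and a kit for constructing a DNA tag library.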
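Background
DNA sequencing is one of the important analytical methods of molecular biology. It not only supplies important data for basic biological research such as gene expression and gene regulation, but also plays an important role in applied research such as disease diagnostics and gene therapy. The Solexa DNA sequencing platform (Illumina) uses sequencing by synthesis (SBS) and is characterized by a small sample requirement, high throughput, high accuracy, a simple and easy-to-operate automated platform, and powerful functionality (see, for example, Paired-End sequencing User Guide, Illumina part #1003880; Preparing samples for ChIP sequencing for DNA, Illumina part #11257047 Rev. A; mRNA sequencing sample preparation Guide, Illumina part #1004898 Rev. D; Preparing 2-5kb samples for mate pair library sequencing, Illumina part #1005363 Rev. B, each incorporated herein by reference in its entirety).
However, current methods for sequencing sample DNA still need to be improved.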
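Summary of the Invention
The present invention was completed on the basis of the following findings of the inventors:
Illumina has introduced a DNA tag (also called index) library construction method based on the Solexa DNA sequencing platform. As shown in FIG. 1, this DNA tag library construction workflow uses three PCR primers and introduces the tag by PCR to construct the DNA tag library (Preparing samples for multiplexed Paired-End sequencing, Illumina part #1005361 Rev. B, incorporated herein by reference in its entirety). The inventors of the present application found that this tag library preparation method has several drawbacks. First, Illumina currently provides only 12 tag sequences, each 6 bp long; because the number of tags is small, large numbers of samples cannot be pooled for sequencing as Solexa sequencing throughput increases, which wastes sequencing resources and limits sequencing throughput. Second, the method introduces the tag sequence into the target fragment library by a PCR reaction, and the PCR amplification of the target fragment requires three PCR primers (two common PCR primers and one PCR tag primer, as shown in FIG. 1), which consumes time and materials, is costly, and gives a low PCR amplification efficiency. Third, the method uses only PCR tag primers, so that a single tag sequence is introduced into each DNA library by the PCR reaction alone, and after sequencing the sequence information of the DNA tag libraries is distinguished by the sequence of the single tag in each DNA tag library; with the 12 tag sequences provided, at most 12 DNA samples can be pooled and sequenced at the same time, so pooled sequencing of large numbers of samples cannot be achieved.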
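The present invention aims to solve at least one of the problems of the prior art. To this end, one aspect of the present invention proposes a DNA tag (sometimes referred to herein simply as a "tag") that can be used to construct a DNA tag library. According to one aspect of the invention, the present invention proposes a set of isolated DNA tags. According to some embodiments of the invention, these isolated DNA tags consist of the nucleotides set forth in SEQ ID NO: (3N-2), respectively, where N is any integer from 1 to 59. In this specification, these DNA tags are named DNA Index-N, where N is any integer from 1 to 59, and their sequences are shown in Table 1 below. With the DNA tags according to embodiments of the present invention, the sample source of a DNA can be characterized precisely by attaching a DNA tag to the sample DNA or its equivalent. The DNA tags can therefore be used to construct DNA tag libraries (sometimes referred to herein as "tag libraries") for a plurality of samples at the same time, so that the DNA tag libraries derived from different samples can be pooled and then sequenced, and the DNA sequences of the DNA tag libraries can be classified on the basis of the DNA tags to obtain the DNA sequence information of the plurality of samples. High-throughput sequencing technology, for example Solexa sequencing technology, can thus be fully exploited to sequence multiple DNA tag libraries at the same time, which improves the sequencing efficiency and throughput of DNA tag libraries. The inventors surprisingly found that DNA tag libraries constructed with the DNA tags according to embodiments of the present invention can be distinguished from one another accurately, and that the resulting sequencing data are very stable and reproducible.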
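According to another aspect of the invention, the present invention further provides a set of isolated oligonucleotides for introducing the above DNA tags into sample DNA or its equivalent. The set of isolated oligonucleotides according to an embodiment of the present invention has first strands and second strands, the first strands consisting of the nucleotides set forth in SEQ ID NO: (3N-1), respectively, and the second strands consisting of the nucleotides set forth in SEQ ID NO: (3N), respectively, wherein, for the same oligonucleotide, the first strand and the second strand have the same value of N, and N is any integer from 1 to 59. According to an embodiment of the present invention, these oligonucleotides (sometimes referred to in this specification as "DNA tag linkers" or "tag linkers") each carry a DNA tag according to an embodiment of the present invention as described above and have a sticky end T, so that the corresponding DNA tag can be introduced into DNA or its equivalent by a ligation reaction. Analogously to the naming of the DNA tags, in this specification the oligonucleotide (DNA tag linker) corresponding to the DNA tag DNA Index-N is named DNA Index-N adapter, where N is any integer from 1 to 59. Further, the first strand (sometimes referred to herein as the "sense sequence") and the second strand (sometimes referred to herein as the "antisense sequence") of the DNA tag linker are named DNA Index-NF_adapter and DNA Index-NR_adapter, respectively, where N is any integer from 1 to 59, and their sequences are shown in Table 1 below (all sequences in the table are given in the 5' to 3' direction). According to an embodiment of the present invention, the corresponding Y-shaped DNA tag linker can be formed by annealing equimolar amounts of the sense sequence DNA Index-NF_adapter and its corresponding antisense sequence DNA Index-NR_adapter.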
Table 1 Sequences of the DNA tags (DNA Index-N) and DNA tag linkers (DNA Index-N_adapter)
[The rows from DNA Index-1 up to the beginning of DNA Index-57F_adapter are provided as images in the source: Figure imgf000003_0001 to Figure imgf000009_0001]
Name    Sequence (SEQ ID NO:)
DNA Index-57F_adapter    …T (170)
DNA Index-57R_adapter    5-Phos/CTTGTAAAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC (171)
DNA Index-58    TTGACCG (172)
DNA Index-58F_adapter    TACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGACCGT (173)
DNA Index-58R_adapter    5-Phos/CGGTCAAAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC (174)
DNA Index-59    TTGGTGC (175)
DNA Index-59F_adapter    TACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGGTGCT (176)
DNA Index-59R_adapter    5-Phos/GCACCAAAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC (177)
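The rows of Table 1 that survive as text suggest a regular layout: the first strand ends with the tag followed by a single T, and the second strand starts with the reverse complement of the tag. The sketch below checks this for DNA Index-58 and DNA Index-59 only. The sequences are copied from the table; the helper names and the reading of the layout are assumptions made for illustration, not statements from the specification.

```python
# Sequences copied from Table 1 (rows 58 and 59); the 5'-phosphate
# prefix ("5-Phos/") of the second strands is omitted here.
TAGS = {58: "TTGACCG", 59: "TTGGTGC"}
FIRST_STRANDS = {
    58: "TACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGACCGT",
    59: "TACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGGTGCT",
}
SECOND_STRANDS = {
    58: "CGGTCAAAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC",
    59: "GCACCAAAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def linker_layout_ok(n: int) -> bool:
    """Check the assumed layout for linker n (hypothetical helper).

    First strand: ... + tag + 'T' (the T overhang).
    Second strand: reverse complement of the tag + ...
    """
    tag = TAGS[n]
    return (FIRST_STRANDS[n].endswith(tag + "T")
            and SECOND_STRANDS[n].startswith(reverse_complement(tag)))


for n in sorted(TAGS):
    print(n, TAGS[n], reverse_complement(TAGS[n]), linker_layout_ok(n))
```

The same reverse-complemented 7-mers (CGGTCAA, GCACCAA) also appear inside the PCR1.0 and PCR2.0 tag primers for index 58 and 59 in Tables 2 and 3.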
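With the oligonucleotides according to embodiments of the present invention (which may also be called DNA tag linkers), the DNA tags can be introduced into the DNA of a sample or its equivalent, so that a DNA tag library carrying DNA tags can be constructed. In addition, the inventors surprisingly found that when oligonucleotides carrying different tags are used to construct DNA tag libraries containing the various DNA tags for the same sample, the resulting sequencing data are very stable and reproducible. According to an embodiment of the present invention, when the data were analyzed with the Pearson coefficient, the DNA tag libraries of a human whole blood sample constructed with Index 1-59 all showed a correlation of at least 0.99. For details of the calculation of the Pearson coefficient, see the relevant literature, for example: 't Hoen, P. A., Y. Ariyurek, et al. (2008). "Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms." Nucleic Acids Res 36(21): e141, incorporated herein by reference in its entirety. The higher the reproducibility, the closer the Pearson coefficient is to 1.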
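For orientation, the Pearson coefficient between two libraries can be computed as below. This is a minimal sketch of the standard textbook formula, not the exact procedure of the cited reference; the per-region read counts are made-up illustrative numbers, not data from the specification.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical read counts per genomic region for the same sample,
# sequenced from two libraries built with different tags.
library_a = [120, 340, 95, 410, 230]
library_b = [118, 352, 90, 405, 241]
print(round(pearson(library_a, library_b), 4))
```

A value close to 1 indicates that the two libraries give nearly proportional counts, which is the sense in which the text uses the coefficient as a reproducibility measure.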
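According to another aspect of the invention, the present invention further provides a set of isolated PCR tag primers for introducing the above DNA tags into sample DNA or its equivalent. The set of isolated PCR tag primers according to an embodiment of the present invention consists of the nucleotides set forth in SEQ ID NOs: 178-236, respectively. In this specification, this set of PCR tag primers is sometimes called the "first PCR tag primers" or "PCR1.0 tag primers". Each of them carries a DNA tag according to an embodiment of the present invention as described above, and a PCR reaction using a PCR1.0 tag primer introduces the PCR1.0 tag primer into the DNA of the sample or its equivalent, thereby introducing the corresponding DNA tag into the DNA or its equivalent. Analogously to the naming of the DNA tags, in this specification the PCR1.0 tag primer corresponding to the DNA tag DNA Index-N is named PCR1.0_index_N Primer, where N is any integer from 1 to 59, and the sequences are shown in Table 2 below (all sequences in the table are given in the 5' to 3' direction).
Table 2 Sequences of the PCR1.0 tag primers (PCR1.0_index_N Primer)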
[The rows from PCR1.0_index_1 Primer up to the beginning of PCR1.0_index_6 Primer are provided as an image in the source: Figure imgf000010_0001]
Name    Sequence (SEQ ID NO:)
PCR1.0_index_6 Primer    …CACGACGCTCTTCCGATCT (183)
PCR1.0_index_7 Primer    AATGATACGGCGACCACCGAGATCTAGAAGGTACACTCTTTCCCTACACGACGCTCTTCCGATCT (184)
PCR1.0_index_8 Primer    AATGATACGGCGACCACCGAGATCTGTTGAGTACACTCTTTCCCTACACGACGCTCTTCCGATCT (185)
PCR1.0_index_9 Primer    AATGATACGGCGACCACCGAGATCTGGATTCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (186)
PCR1.0_index_10 Primer    AATGATACGGCGACCACCGAGATCTCTGACCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (187)
PCR1.0_index_11 Primer    AATGATACGGCGACCACCGAGATCTTCAGACTACACTCTTTCCCTACACGACGCTCTTCCGATCT (188)
PCR1.0_index_12 Primer    AATGATACGGCGACCACCGAGATCTAACAACTACACTCTTTCCCTACACGACGCTCTTCCGATCT (189)
PCR1.0_index_13 Primer    AATGATACGGCGACCACCGAGATCTTTATGATACACTCTTTCCCTACACGACGCTCTTCCGATCT (190)
PCR1.0_index_14 Primer    AATGATACGGCGACCACCGAGATCTACGCGATACACTCTTTCCCTACACGACGCTCTTCCGATCT (191)
PCR1.0_index_15 Primer    AATGATACGGCGACCACCGAGATCTAGTGCATACACTCTTTCCCTACACGACGCTCTTCCGATCT (192)
PCR1.0_index_16 Primer    AATGATACGGCGACCACCGAGATCTTATCAATACACTCTTTCCCTACACGACGCTCTTCCGATCT (193)
PCR1.0_index_17 Primer    AATGATACGGCGACCACCGAGATCTATCCTTGACACTCTTTCCCTACACGACGCTCTTCCGATCT (194)
PCR1.0_index_18 Primer    AATGATACGGCGACCACCGAGATCTGGTCGTGACACTCTTTCCCTACACGACGCTCTTCCGATCT (195)
PCR1.0_index_19 Primer    AATGATACGGCGACCACCGAGATCTGCGGATGACACTCTTTCCCTACACGACGCTCTTCCGATCT (196)
PCR1.0_index_20 Primer    AATGATACGGCGACCACCGAGATCTTTAGCGGACACTCTTTCCCTACACGACGCTCTTCCGATCT (197)
PCR1.0_index_21 Primer    AATGATACGGCGACCACCGAGATCTAGGTAGGACACTCTTTCCCTACACGACGCTCTTCCGATCT (198)
PCR1.0_index_22 Primer    AATGATACGGCGACCACCGAGATCTCCTCAGGACACTCTTTCCCTACACGACGCTCTTCCGATCT (199)
PCR1.0_index_23 Primer    AATGATACGGCGACCACCGAGATCTTCTATCGACACTCTTTCCCTACACGACGCTCTTCCGATCT (200)
PCR1.0_index_24 Primer    AATGATACGGCGACCACCGAGATCTCCGTGCGACACTCTTTCCCTACACGACGCTCTTCCGATCT (201)
PCR1.0_index_25 Primer    AATGATACGGCGACCACCGAGATCTTAACGCGACACTCTTTCCCTACACGACGCTCTTCCGATCT (202)
PCR1.0_index_26 Primer    AATGATACGGCGACCACCGAGATCTGATTACGACACTCTTTCCCTACACGACGCTCTTCCGATCT (203)
PCR1.0_index_27 Primer    AATGATACGGCGACCACCGAGATCTTAGTTAGACACTCTTTCCCTACACGACGCTCTTCCGATCT (204)
PCR1.0_index_28 Primer    AATGATACGGCGACCACCGAGATCTCGACTAGACACTCTTTCCCTACACGACGCTCTTCCGATCT (205)
PCR1.0_index_29 Primer    AATGATACGGCGACCACCGAGATCTAATAGAGACACTCTTTCCCTACACGACGCTCTTCCGATCT (206)
PCR1.0_index_30 Primer    AATGATACGGCGACCACCGAGATCTGTCACAGACACTCTTTCCCTACACGACGCTCTTCCGATCT (207)
PCR1.0_index_31 Primer    AATGATACGGCGACCACCGAGATCTGATGTTCACACTCTTTCCCTACACGACGCTCTTCCGATCT (208)
PCR1.0_index_32 Primer    AATGATACGGCGACCACCGAGATCTTAGAGTCACACTCTTTCCCTACACGACGCTCTTCCGATCT (209)
PCR1.0_index_33 Primer    AATGATACGGCGACCACCGAGATCTAGCACTCACACTCTTTCCCTACACGACGCTCTTCCGATCT (210)
PCR1.0_index_34 Primer    AATGATACGGCGACCACCGAGATCTCTTAATCACACTCTTTCCCTACACGACGCTCTTCCGATCT (211)
PCR1.0_index_35 Primer    AATGATACGGCGACCACCGAGATCTTACCTGCACACTCTTTCCCTACACGACGCTCTTCCGATCT (212)
PCR1.0_index_36 Primer    AATGATACGGCGACCACCGAGATCTGTGATGCACACTCTTTCCCTACACGACGCTCTTCCGATCT (213)
PCR1.0_index_37 Primer    AATGATACGGCGACCACCGAGATCTATTCGGCACACTCTTTCCCTACACGACGCTCTTCCGATCT (214)
PCR1.0_index_38 Primer    AATGATACGGCGACCACCGAGATCTACATCGCACACTCTTTCCCTACACGACGCTCTTCCGATCT (215)
PCR1.0_index_39 Primer    AATGATACGGCGACCACCGAGATCTCGCGAGCACACTCTTTCCCTACACGACGCTCTTCCGATCT (216)
PCR1.0_index_40 Primer    AATGATACGGCGACCACCGAGATCTAGGCTCCACACTCTTTCCCTACACGACGCTCTTCCGATCT (217)
PCR1.0_index_41 Primer    AATGATACGGCGACCACCGAGATCTTGTTGCCACACTCTTTCCCTACACGACGCTCTTCCGATCT (218)
PCR1.0_index_42 Primer    AATGATACGGCGACCACCGAGATCTCTAGGCCACACTCTTTCCCTACACGACGCTCTTCCGATCT (219)
PCR1.0_index_43 Primer    AATGATACGGCGACCACCGAGATCTGTCCACCACACTCTTTCCCTACACGACGCTCTTCCGATCT (220)
PCR1.0_index_44 Primer    AATGATACGGCGACCACCGAGATCTACCGTACACACTCTTTCCCTACACGACGCTCTTCCGATCT (221)
PCR1.0_index_45 Primer    AATGATACGGCGACCACCGAGATCTTTGCCACACACTCTTTCCCTACACGACGCTCTTCCGATCT (222)
PCR1.0_index_46 Primer    AATGATACGGCGACCACCGAGATCTCAATAACACACTCTTTCCCTACACGACGCTCTTCCGATCT (223)
PCR1.0_index_47 Primer    AATGATACGGCGACCACCGAGATCTCCGATTAACACTCTTTCCCTACACGACGCTCTTCCGATCT (224)
PCR1.0_index_48 Primer    AATGATACGGCGACCACCGAGATCTTCTTATAACACTCTTTCCCTACACGACGCTCTTCCGATCT (225)
PCR1.0_index_49 Primer    AATGATACGGCGACCACCGAGATCTAGAGATAACACTCTTTCCCTACACGACGCTCTTCCGATCT (226)
PCR1.0_index_50 Primer    AATGATACGGCGACCACCGAGATCTCTCTGGAACACTCTTTCCCTACACGACGCTCTTCCGATCT (227)
PCR1.0_index_51 Primer    AATGATACGGCGACCACCGAGATCTTGTCCGAACACTCTTTCCCTACACGACGCTCTTCCGATCT (228)
PCR1.0_index_52 Primer    AATGATACGGCGACCACCGAGATCTAAGACGAACACTCTTTCCCTACACGACGCTCTTCCGATCT (229)
PCR1.0_index_53 Primer    AATGATACGGCGACCACCGAGATCTATAATCAACACTCTTTCCCTACACGACGCTCTTCCGATCT (230)
PCR1.0_index_54 Primer    AATGATACGGCGACCACCGAGATCTACTGGCAACACTCTTTCCCTACACGACGCTCTTCCGATCT (231)
PCR1.0_index_55 Primer    AATGATACGGCGACCACCGAGATCTGGCAGCAACACTCTTTCCCTACACGACGCTCTTCCGATCT (232)
PCR1.0_index_56 Primer    AATGATACGGCGACCACCGAGATCTTACGCCAACACTCTTTCCCTACACGACGCTCTTCCGATCT (233)
PCR1.0_index_57 Primer    AATGATACGGCGACCACCGAGATCTCTTGTAAACACTCTTTCCCTACACGACGCTCTTCCGATCT (234)
PCR1.0_index_58 Primer    AATGATACGGCGACCACCGAGATCTCGGTCAAACACTCTTTCCCTACACGACGCTCTTCCGATCT (235)
PCR1.0_index_59 Primer    AATGATACGGCGACCACCGAGATCTGCACCAAACACTCTTTCCCTACACGACGCTCTTCCGATCT (236)
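According to yet another aspect of the invention, the present invention provides a further set of isolated PCR tag primers for introducing the above DNA tags into sample DNA or its equivalent. This set of isolated PCR tag primers according to an embodiment of the present invention consists of the nucleotides set forth in SEQ ID NOs: 237-295, respectively. In this specification, this set of PCR tag primers is sometimes called the "second PCR tag primers" or "PCR2.0 tag primers". Each of them carries a DNA tag according to an embodiment of the present invention as described above, and a PCR reaction using a PCR2.0 tag primer introduces the PCR2.0 tag primer into the DNA of the sample or its equivalent, thereby introducing the corresponding DNA tag into the DNA or its equivalent. Analogously to the naming of the DNA tags, in this specification the PCR2.0 tag primer corresponding to the DNA tag DNA Index-N is named PCR2.0_index_N Primer, where N is any integer from 1 to 59, and the sequences are shown in Table 3 below (all sequences in the table are given in the 5' to 3' direction).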
Table 3 Sequences of the PCR2.0 tag primers (PCR2.0_index_N Primer)
Name    Sequence (SEQ ID NO:)
PCR2.0_index_1 Primer    CAAGCAGAAGACGGCATACGAGATTGCGGTTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (237)
PCR2.0_index_2 Primer    CAAGCAGAAGACGGCATACGAGATGCCTCTTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (238)
PCR2.0_index_3 Primer    CAAGCAGAAGACGGCATACGAGATCAAGCTTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (239)
PCR2.0_index_4 Primer    CAAGCAGAAGACGGCATACGAGATCGGCATTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (240)
PCR2.0_index_5 Primer    CAAGCAGAAGACGGCATACGAGATCATATGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (241)
PCR2.0_index_6 Primer    CAAGCAGAAGACGGCATACGAGATGAGTGGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (242)
PCR2.0_index_7 Primer    CAAGCAGAAGACGGCATACGAGATAGAAGGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (243)
PCR2.0_index_8 Primer    CAAGCAGAAGACGGCATACGAGATGTTGAGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (244)
[The rows from PCR2.0_index_9 Primer to PCR2.0_index_53 Primer are provided as images in the source: Figure imgf000014_0001, Figure imgf000015_0001]
PCR2.0_index_54 Primer    CAAGCAGAAGACGGCATACGAGATACTGGCAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (290)
PCR2.0_index_55 Primer    CAAGCAGAAGACGGCATACGAGATGGCAGCAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (291)
PCR2.0_index_56 Primer    CAAGCAGAAGACGGCATACGAGATTACGCCAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (292)
PCR2.0_index_57 Primer    CAAGCAGAAGACGGCATACGAGATCTTGTAAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (293)
PCR2.0_index_58 Primer    CAAGCAGAAGACGGCATACGAGATCGGTCAAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (294)
PCR2.0_index_59 Primer    CAAGCAGAAGACGGCATACGAGATGCACCAAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (295)
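Both of the above sets of isolated PCR tag primers according to embodiments of the present invention can effectively introduce DNA tags into the DNA of a sample or its equivalent, so that a DNA tag library carrying DNA tags can be constructed. In addition, the inventors surprisingly found that when PCR tag primers carrying different tags are used to construct DNA tag libraries containing the various DNA tags for the same sample, the resulting sequencing data are very stable and reproducible.
According to an embodiment of the present invention, a tag combination sequence can be introduced into a DNA sample by linker ligation and a PCR reaction using the above DNA tag linkers and the two sets of PCR tag primers; that is, the tags in the DNA tag linker, the PCR1.0 tag primer and the PCR2.0 tag primer are permuted and combined, and the resulting tag combination is then introduced into one and the same DNA sample by linker ligation and the PCR reaction. Thus, according to an embodiment of the present invention, the 59 tag sequences can give 205,379 (59 × 59 × 59) different tag combinations, so that different tag combinations can be introduced into DNA samples through the joint action of the DNA tag linker, the PCR1.0 tag primer and the PCR2.0 tag primer to construct DNA tag libraries of a plurality of DNA samples, and the DNA tag libraries can be distinguished by the differences between the tag combinations in the different DNA tag libraries. Ultimately, by building a very large tag set, pooled sequencing of an extremely large number of samples can be achieved, which meets the needs of high-throughput sequencing and lowers sequencing costs.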
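The figure of 205,379 follows from choosing one of 59 tags independently at each of three positions (linker, PCR1.0 primer, PCR2.0 primer). A short sketch that reproduces the count and enumerates the combinations by index number; the function and variable names are illustrative only.

```python
from itertools import islice, product

N_TAGS = 59        # number of tag sequences in Tables 1-3
N_POSITIONS = 3    # linker tag, PCR1.0 primer tag, PCR2.0 primer tag


def count_tag_combinations(n_tags: int = N_TAGS, positions: int = N_POSITIONS) -> int:
    """Number of ordered tag combinations when each position is chosen independently."""
    return n_tags ** positions


def iter_tag_combinations(n_tags: int = N_TAGS, positions: int = N_POSITIONS):
    """Yield combinations as tuples of index numbers, e.g. (1, 1, 1), (1, 1, 2), ..."""
    return product(range(1, n_tags + 1), repeat=positions)


print(count_tag_combinations())                 # 59 * 59 * 59
print(list(islice(iter_tag_combinations(), 3))) # first three combinations
```

By comparison, a single 59-tag position would distinguish only 59 libraries, which is why the combination of three tagged components matters for the pooling capacity described above.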
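According to yet another aspect of the invention, the present invention provides a method for preparing a DNA tag library. According to an embodiment of the present invention, the method comprises the following steps: fragmenting a DNA sample to obtain DNA fragments; end-repairing the DNA fragments to obtain end-repaired DNA fragments; adding base A to the 3' ends of the end-repaired DNA fragments to obtain DNA fragments having a sticky end A; ligating the DNA fragments having the sticky end A to a DNA tag linker to obtain a ligation product carrying the DNA tag linker, wherein the DNA tag linker is one of the isolated oligonucleotides according to an embodiment of the present invention; subjecting the ligation product to a PCR reaction to obtain a PCR amplification product, wherein the PCR reaction uses a first PCR tag primer and a second PCR tag primer, the first PCR tag primer being one of the set of isolated PCR tag primers consisting of the nucleotides set forth in SEQ ID NOs: 178-236, respectively, according to an embodiment of the present invention, and the second PCR tag primer being one of the set of isolated PCR tag primers consisting of the nucleotides set forth in SEQ ID NOs: 237-295, respectively, according to an embodiment of the present invention, and wherein the PCR amplification product comprises a target fragment, a DNA linker and DNA tags, the sequence of the target fragment corresponding to the sequence of the DNA fragment; and separating and recovering the PCR amplification product, the PCR amplification product constituting the DNA tag library. With the method for constructing a DNA tag library according to an embodiment of the present invention, the DNA tag combination according to an embodiment of the present invention can be effectively introduced into the DNA tag library constructed for the sample DNA. By sequencing the DNA tag library, the sequence information of the sample DNA and the sequence information of the DNA tag combination can thus be obtained, so that the source of the sample DNA can be distinguished. In addition, the inventors surprisingly found that when, on the basis of this method, different tag combinations are used to construct DNA tag libraries containing the various DNA tag combinations for the same sample, the resulting sequencing data are very stable and reproducible.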
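Further, the present invention also provides a DNA tag library obtained by the method for preparing a DNA tag library according to an embodiment of the present invention.
According to yet another aspect of the invention, the present invention also provides a method for determining the sequence information of a DNA sample. According to an embodiment of the present invention, the method comprises the following steps: constructing a DNA tag library of the DNA sample by the method for preparing a DNA tag library according to an embodiment of the present invention; and sequencing the DNA tag library to determine the sequence information of the DNA sample. With this method, the sequence information of the DNA sample in the DNA tag library and the sequence information of the DNA tag combination can be obtained effectively, so that the source of the DNA sample can be distinguished. In addition, the inventors surprisingly found that determining the sequence information of DNA samples by the method according to an embodiment of the present invention effectively reduces bias in the data output and allows multiple DNA tag libraries to be distinguished accurately.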
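According to a further aspect of the invention, the present invention also provides a method for determining the sequence information of a plurality of DNA samples. According to an embodiment of the present invention, the method comprises the following steps: for each of the plurality of samples, independently constructing a DNA tag library of the DNA sample by the method for constructing a DNA tag library according to an embodiment of the present invention, wherein different DNA samples use combinations of DNA tags that differ from one another and have known sequences; combining the DNA tag libraries of the plurality of samples to obtain a DNA tag library mixture; sequencing the DNA tag library mixture by Solexa sequencing technology to obtain the sequence information of the DNA samples and the sequence information of the tag combinations; and classifying the sequence information of the DNA samples on the basis of the sequence information of the tag combinations to determine the DNA sequence information of the plurality of samples. The method according to this embodiment of the present invention can thus make full use of high-throughput sequencing technology, for example Solexa sequencing technology, to sequence the DNA tag libraries of a plurality of samples at the same time, which improves the efficiency and throughput of DNA tag library sequencing and at the same time improves the efficiency of determining the sequence information of a plurality of DNA samples.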
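The final classification step can be pictured as a lookup from tag combination to sample. The sketch below is a simplification under stated assumptions: each read is assumed to arrive as a tuple with the three tags already extracted and normalized to the orientation of Table 1, and the sample names and insert sequences are invented for illustration. It is not the pipeline used by the inventors.

```python
from collections import defaultdict

# Hypothetical sample sheet: (linker tag, PCR1.0 tag, PCR2.0 tag) -> sample name.
# The tag sequences are DNA Index-58 and DNA Index-59 from Table 1.
SAMPLE_SHEET = {
    ("TTGACCG", "TTGACCG", "TTGACCG"): "sample_A",
    ("TTGACCG", "TTGACCG", "TTGGTGC"): "sample_B",
    ("TTGGTGC", "TTGACCG", "TTGACCG"): "sample_C",
}


def demultiplex(reads, sample_sheet):
    """Group insert sequences by the sample their tag combination maps to.

    `reads` is an iterable of (linker_tag, pcr1_tag, pcr2_tag, insert) tuples.
    Reads whose combination is not in the sample sheet go to "unassigned".
    """
    bins = defaultdict(list)
    for linker_tag, pcr1_tag, pcr2_tag, insert in reads:
        sample = sample_sheet.get((linker_tag, pcr1_tag, pcr2_tag), "unassigned")
        bins[sample].append(insert)
    return dict(bins)


reads = [
    ("TTGACCG", "TTGACCG", "TTGACCG", "ACGTACGT"),
    ("TTGACCG", "TTGACCG", "TTGGTGC", "GGCCTTAA"),
    ("TTGGTGC", "TTGGTGC", "TTGGTGC", "TTTTAAAA"),  # combination not in the sheet
]
print(demultiplex(reads, SAMPLE_SHEET))
```

Two combinations count as different as soon as one of the three tags differs, which is why sample_A and sample_B above are separable even though they share two tags.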
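According to a further aspect of the invention, a kit for constructing a DNA tag library is also provided. According to an embodiment of the present invention, the kit comprises: 59 isolated oligonucleotides, each having a first strand and a second strand, the first strands consisting of the nucleotides set forth in SEQ ID NO: (3N-1), respectively, and the second strands consisting of the nucleotides set forth in SEQ ID NO: (3N), respectively, wherein, for the same oligonucleotide, the first strand and the second strand have the same value of N, and N is any integer from 1 to 59; 59 isolated first PCR tag primers consisting of the nucleotides set forth in SEQ ID NOs: 178-236, respectively; and 59 isolated second PCR tag primers consisting of the nucleotides set forth in SEQ ID NOs: 237-295, respectively; wherein the 59 isolated DNA tag linkers, the 59 isolated first PCR tag primers and the 59 isolated second PCR tag primers are each provided in separate containers. With this kit, the DNA tags according to embodiments of the present invention can be conveniently introduced into the constructed DNA tag library.
Additional aspects and advantages of the invention will be set forth in part in the description that follows, and in part will become apparent from the description or be learned by practice of the invention.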
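Brief Description of the Drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the drawings, in which:
FIG. 1 shows a flow chart of the DNA tag library construction method provided by Illumina;
FIG. 2 shows a flow chart of the DNA tag library construction method according to an embodiment of the present invention;
FIG. 3 shows a schematic diagram of a DNA tag library constructed by the method for constructing a DNA tag library according to an embodiment of the present invention, and of its different tag combinations;
FIG. 4 shows a schematic diagram of a DNA tag library constructed by the method for constructing a DNA tag library according to an embodiment of the present invention, and of its different tag combinations;
FIG. 5 shows a schematic diagram of a DNA tag library constructed by the method for constructing a DNA tag library according to an embodiment of the present invention, and of its different tag combinations;
FIG. 6 shows the results of electrophoretic detection of DNA tag libraries constructed by the method for constructing a DNA tag library according to an embodiment of the present invention.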
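Detailed Description of the Invention
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and are not to be construed as limiting it.
It should be noted that the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly specifying the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more of that feature. Further, in the description of the present invention, "a plurality" means two or more unless otherwise stated.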
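DNA Tags
According to one aspect of the present application, the invention proposes isolated DNA tags. According to an embodiment of the present invention, these isolated DNA tags consist of the nucleotide sequences set forth in SEQ ID NO: (3N-2), respectively, where N is any integer from 1 to 59. In this specification, these DNA tags are named DNA Index-N, where N is any integer from 1 to 59; their sequences are shown in Table 1 above and are not repeated here.
The term "DNA" as used in the present invention can refer to any polymer comprising deoxyribonucleotides, including but not limited to modified or unmodified DNA. With the DNA tags according to embodiments of the present invention, a tagged DNA tag library is obtained by attaching a DNA tag to the DNA of a sample or its equivalent. By sequencing the DNA tag library, the sequence of the sample DNA and the sequence of the tag can be obtained, and the sample source of the DNA can then be characterized precisely on the basis of the tag sequence. The DNA tags can thus be used to construct DNA tag libraries of a plurality of samples at the same time, so that the DNA tag libraries derived from different samples can be pooled and sequenced together, and the DNA sequences of the samples can be classified on the basis of the DNA tags to obtain the DNA sequence information of the plurality of samples. High-throughput sequencing technology, for example Solexa sequencing technology, can therefore be fully exploited to sequence the DNA of a plurality of samples at the same time, which improves the efficiency and throughput of high-throughput sequencing and lowers the cost of determining the sequence information of DNA samples. The expression "a DNA tag is attached to the DNA of a sample or its equivalent" as used here is to be understood broadly: the DNA tag can be attached directly to the DNA of the sample to construct the DNA tag library, or it can be attached to a nucleic acid that has the same sequence as the DNA of the sample (for example the corresponding RNA sequence or cDNA sequence, which has the same sequence as the DNA).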
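The inventors of the present application found the following. To design effective DNA tags, the first consideration is how well the tag sequences can be told apart and how reliably they are recognized. Second, when fewer than 12 samples are pooled, the GT content at every base position of the pooled tags must be taken into account. In Solexa sequencing, bases G and T are excited by the same laser, and bases A and C are excited by the same laser, so the "GT" content and the "AC" content must be "balanced"; the optimal "GT" content is 50%, which gives the highest tag recognition rate and the lowest error rate. Finally, the reproducibility and accuracy of the data output must be considered: for DNA tag libraries to be constructed and sequenced effectively, the set of DNA tags must give reliable, highly reproducible results, meaning that for the same DNA sample, DNA tag libraries constructed with different tags from the set yield consistent sequencing results, so that the experimental results are reliable and reproducible. In addition, runs of 3 or more consecutive identical bases in a tag sequence must be avoided, because such runs increase the error rate during synthesis or sequencing, and the formation of hairpin structures by the DNA tag linker itself should also be avoided as far as possible.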
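Two of the design criteria above lend themselves to simple checks: the absence of runs of three or more identical bases, and the per-position G/T fraction of the tags that are pooled together. The sketch below is an illustrative reading of those criteria, not the screening procedure the inventors actually used; the function names are assumptions, and only the two tag sequences readable in Table 1 are used as input.

```python
def has_homopolymer_run(tag: str, min_run: int = 3) -> bool:
    """True if the tag contains `min_run` or more identical consecutive bases."""
    run = 1
    for previous, current in zip(tag, tag[1:]):
        run = run + 1 if current == previous else 1
        if run >= min_run:
            return True
    return False


def gt_fraction_per_position(tags):
    """Fraction of pooled tags that carry G or T at each base position."""
    if not tags or len({len(t) for t in tags}) != 1:
        raise ValueError("tags must be non-empty and of equal length")
    length = len(tags[0])
    return [sum(t[i] in "GT" for t in tags) / len(tags) for i in range(length)]


pool = ["TTGACCG", "TTGGTGC"]  # DNA Index-58 and DNA Index-59 from Table 1
print([has_homopolymer_run(t) for t in pool])
print(gt_fraction_per_position(pool))
```

For this two-tag pool, the first three positions are G or T in both tags, so a pool made of only these two tags would be far from the 50% balance the text recommends for small pools; a real pool would be chosen so that each position is closer to 0.5.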
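To this end, the inventors carried out extensive screening and selected a set of isolated DNA tags according to embodiments of the invention, consisting respectively of the nucleotide sequences shown in SEQ ID NO: (3N-2), where N is any integer from 1 to 59. Their sequences are shown in Table 1 above and are not repeated here. The inventors further found that any two of these tags differ at 3 or more base positions, and that a sequencing error or synthesis error in any one of a tag's 7 bases does not affect the final identification of the tag. These tags can be applied to the construction of any DNA tag library. To date there has been no report of these tags being used in library construction for DNA sample sequencing followed by Solexa sequencing.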
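The link between a minimum difference of 3 bases and tolerance of a single-base error can be made concrete with the Hamming distance: if every pair of tags differs at 3 or more positions, a read carrying one substitution is still within distance 1 of exactly one tag. The Python sketch below assumes substitution errors only; the function names and the three example tags are hypothetical and are not taken from Table 1.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(tags: list[str]) -> int:
    """Smallest Hamming distance between any two tags in the set."""
    return min(hamming(a, b) for a, b in combinations(tags, 2))


def assign_tag(read_tag: str, tags: list[str], max_mismatches: int = 1):
    """Return the unique tag within max_mismatches of read_tag, else None."""
    hits = [t for t in tags if hamming(read_tag, t) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


# Hypothetical 7-base tags with pairwise distance >= 3.
tags = ["ACGTACG", "ACCATGG", "TGGTCCA"]
print(min_pairwise_distance(tags))
print(assign_tag("ACGTACC", tags))   # one substitution relative to the first tag
print(assign_tag("TTTTTTT", tags))   # too far from every tag
```

Insertions and deletions shift the remaining bases and would need a different distance measure; they are outside this sketch.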
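According to some embodiments of the invention, the DNA tags used are nucleic acid sequences 7 bp in length, and the tags differ from one another by 3 or more bases. The set of DNA tags consists of at least 5, or at least 10, or at least 15, or at least 20, at least 25, or at least 30, or at least 35, or at least 40, or 45, or at least 50, or at least 55, or all 59 of the DNA tags shown in Table 1 or DNA tags differing from them by 1 base. Specifically, according to embodiments of the invention, the set of DNA tags preferably includes at least DNA Index1 to DNA Index5, or DNA Index6 to DNA Index10, or DNA Index11 to DNA Index15, or DNA Index16 to DNA Index20, or DNA Index21 to DNA Index25, or DNA Index26 to DNA Index30, or DNA Index31 to DNA Index35, or DNA Index36 to DNA Index40, or DNA Index41 to DNA Index45, or DNA Index46 to DNA Index50, or DNA Index51 to DNA Index55, or DNA Index55 to DNA Index59 of the 59 DNA tags shown in Table 1, or any combination of two or more of these groups. In some specific examples of the invention, "differing by 1 base" includes substitution, addition, or deletion of 1 base in the sequence of any of the 59 tags shown in Table 1.
According to embodiments of the invention, the invention also provides the use of the tags according to embodiments of the invention for constructing and sequencing DNA tag libraries, wherein the DNA tag adapters of the DNA tag library contain the DNA tags according to embodiments of the invention, thereby forming the respective corresponding DNA tag adapters. According to embodiments of this use, the DNA tag is inserted into the 3' end of the DNA tag adapter, or is attached to the 3' end of the DNA adapter with or without a linker; preferably it is inserted into the 3' end of the DNA tag adapter and, according to specific examples, more preferably it is inserted into the DNA tag adapter at a position 1 base from the adapter's 3' end. The PCR tag primers of the DNA tag library likewise contain the DNA tags according to embodiments of the invention, thereby forming the respective corresponding PCR tag primers. According to embodiments of this use, the DNA tag is inserted into the PCR tag primer.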
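Oligonucleotides, PCR Tag Primers, and Construction of DNA Tag Libraries
According to yet another aspect, the invention provides a set of isolated oligonucleotides that can be used to introduce the DNA tags described above into the DNA of a sample and thereby construct a DNA tag library. According to embodiments of the invention, each oligonucleotide in the set has a sticky-end T and has a first strand and a second strand, the sticky-end T being formed on the first strand of each oligonucleotide. According to embodiments of the invention, the first strands consist respectively of the nucleotides shown in SEQ ID NO: (3N-1) and the second strands consist respectively of the nucleotides shown in SEQ ID NO: (3N), wherein the first and second strands of the same oligonucleotide take the same value of N, and N is any integer from 1 to 59. That the two strands of one oligonucleotide take the same value of N means that, when the corresponding nucleotides in the sequence listing are used as the first and second strands, the two strands can form a stable dimer with a sticky end; for example, when N=10, SEQ ID NO: 29 is used as the first strand and SEQ ID NO: 30 as the second strand. Those skilled in the art will understand that each oligonucleotide can be formed by annealing its first strand with its second strand. In this specification, the oligonucleotides corresponding to the DNA tags (the DNA tag adapters) are named DNA Index-N adapter, and their first and second strands are named DNA Index-NF_adapter and DNA Index-NR_adapter respectively, where N is any integer from 1 to 59; their sequences are shown in Table 1 above and are not repeated here. According to embodiments of the invention, each of these oligonucleotides carries a DNA tag according to embodiments of the invention as described above and has a sticky end, so that the corresponding DNA tag can be introduced into the DNA of a sample or its equivalent by a ligation reaction.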
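The numbering scheme can be written out as a small lookup. The Python sketch below maps a tag index N to the SEQ ID NOs of its tag (3N-2), adapter first strand (3N-1), and adapter second strand (3N), as stated in the text. The function name is hypothetical, and the mapping of primer N to SEQ ID NO 177+N and 236+N is an assumption: the text gives only the ranges 178-236 and 237-295, not the order within them.

```python
def seq_ids_for_index(n: int) -> dict[str, int]:
    """Map tag index N (1 to 59) to the SEQ ID NOs of its associated sequences."""
    if not 1 <= n <= 59:
        raise ValueError("N must be an integer from 1 to 59")
    return {
        "tag": 3 * n - 2,                   # DNA Index-N
        "adapter_first_strand": 3 * n - 1,  # DNA Index-NF_adapter
        "adapter_second_strand": 3 * n,     # DNA Index-NR_adapter
        # Assumption: primers are listed in index order within each range.
        "pcr1_primer": 177 + n,             # within SEQ ID NO: 178-236
        "pcr2_primer": 236 + n,             # within SEQ ID NO: 237-295
    }


print(seq_ids_for_index(10))
```

For N=10 this reproduces the example in the text: SEQ ID NO: 29 for the first strand and SEQ ID NO: 30 for the second strand.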
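The inventors found that the oligonucleotide sequences (DNA tag adapters) provided according to embodiments of the invention are highly stable. According to some embodiments of the invention, this finding comes from analyzing the structural stability of these oligonucleotide sequences with Lasergene software (http://www.dnastar.com/). With Lasergene's PrimerSelect program, the affinity between the two strands of a duplex can be judged from the energy value of the structure they form, which allows prediction of the most stable dimer structure formed by each DNA tag adapter ("the most stable dimer overall") and its energy value; the larger the absolute value of the energy (kcal/mol), the more stable the duplex structure. The results of this structural stability and affinity analysis for each of the 59 DNA tag adapters shown in Table 1 above are given below; they indicate that the "Y-shaped" structures formed by these DNA tag adapters are very stable.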
The secondary structure of each DNA tag adapter (DNA indexN adapter) according to embodiments of the invention, that is, its most stable dimer structure ("the most stable dimer overall"), the "Y-shaped" structure, is listed below with its energy value. Each entry gives the length of the predicted duplex and its energy; the two-strand alignment diagrams printed with each entry are not legible in this copy (the strand sequences are those listed in Table 1). A "?" marks a digit that is illegible in the source.
DNA index1 adapter: 20 bp, -40.3 kcal/mol
DNA index2 adapter: 20 bp, -38.3 kcal/mol
DNA index3 adapter: 20 bp, -37.4 kcal/mol
DNA index4 adapter: 20 bp, -?0.5 kcal/mol
DNA index5 adapter: 20 bp, -34.5 kcal/mol
DNA index6 adapter: 20 bp, -36.2 kcal/mol
DNA index7 adapter: 20 bp, -36.5 kcal/mol
DNA index8 adapter: 20 bp, -35.0 kcal/mol
DNA index9 adapter: 20 bp, -36.7 kcal/mol
DNA index10 adapter: 20 bp, -36.5 kcal/mol
DNA index11 adapter: 20 bp, -35.0 kcal/mol
DNA index12 adapter: 20 bp, -35.3 kcal/mol
DNA index13 adapter: 20 bp, -3?.8 kcal/mol
DNA index14 adapter: 20 bp, -40.1 kcal/mol
DNA index15 adapter: 20 bp, -36.7 kcal/mol
DNA index16 adapter: 20 bp, -34.8 kcal/mol
DNA index17 adapter: 20 bp, -37.6 kcal/mol
DNA index18 adapter: 20 bp, -38.8 kcal/mol
DNA index19 adapter: 20 bp, -40.8 kcal/mol
DNA index20 adapter: 20 bp, -40.3 kcal/mol
DNA index21 adapter: 20 bp, -37.7 kcal/mol
DNA index22 adapter: 20 bp, -38.9 kcal/mol
DNA index23 adapter: 20 bp, -36.9 kcal/mol
DNA index24 adapter: 20 bp, -42.6 kcal/mol
DNA index25 adapter: 20 bp, -40.5 kcal/mol
DNA index26 adapter: 20 bp, -36.9 kcal/mol
DNA index27 adapter: 20 bp, -34.4 kcal/mol
DNA index28 adapter: 20 bp, -36.7 kcal/mol
DNA index29 adapter: 20 bp, -35.2 kcal/mol
DNA index30 adapter: 20 bp, -35.6 kcal/mol
DNA index31 adapter: 20 bp, -36.1 kcal/mol
DNA index32 adapter: 20 bp, -35.0 kcal/mol
DNA index33 adapter: 20 bp, -37.4 kcal/mol
DNA index34 adapter: 20 bp, -35.8 kcal/mol
DNA index35 adapter: 20 bp, -38.3 kcal/mol
DNA index36 adapter: 20 bp, -37.6 kcal/mol
DNA index37 adapter: 20 bp, -41.1 kcal/mol
DNA index38 adapter: 20 bp, -39.3 kcal/mol
DNA index39 adapter: 20 bp, -42.9 kcal/mol
DNA index40 adapter: 20 bp, -?0.4 kcal/mol
DNA index41 adapter: 20 bp, -3?.5 kcal/mol
DNA index42 adapter: 20 bp, -39.8 kcal/mol
DNA index43 adapter: 20 bp, -38.6 kcal/mol
DNA index44 adapter: 20 bp, -37.9 kcal/mol
DNA index45 adapter: 20 bp, -39.5 kcal/mol
DNA index46 adapter: 20 bp, -35.8 kcal/mol
DNA index47 adapter: 20 bp, -39.0 kcal/mol
DNA index48 adapter: 20 bp, -3?.9 kcal/mol
DNA index49 adapter: 20 bp, -35.2 kcal/mol
DNA index50 adapter: 20 bp, -37.7 kcal/mol
DNA index51 adapter: 20 bp, -39.4 kcal/mol
DNA index52 adapter: 20 bp, -37.9 kcal/mol
DNA index53 adapter: 20 bp, -35.7 kcal/mol
DNA index54 adapter: 20 bp, -39.2 kcal/mol
DNA index55 adapter: 20 bp, -?1.0 kcal/mol
DNA index56 adapter: 20 bp, -40.3 kcal/mol
DNA index57 adapter: 20 bp, -35.9 kcal/mol
DNA index58 adapter: 20 bp, -39.7 kcal/mol
DNA index59 adapter: 20 bp, -39.5 kcal/mol
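According to some embodiments, the invention provides a set of DNA tag adapters containing the DNA tags described above, wherein each DNA tag adapter of the DNA tag library contains the tag at its 3' end and is preferably used as an adapter. These DNA tag adapters comprise or consist of at least 5, or at least 10, or at least 15, or at least 20, at least 25, or at least 30, or at least 35, or at least 40, or 45, or at least 50, or at least 55, or all 59 of the 59 DNA tag adapters shown in Table 2 or DNA tag adapters whose DNA tag sequence differs from theirs by 1 base. According to specific examples of the invention, these DNA tag adapters preferably include at least DNA Index1F/R_adapter to DNA Index5F/R_adapter, or DNA Index6F/R_adapter to DNA Index10F/R_adapter, or DNA Index11F/R_adapter to DNA Index15F/R_adapter, or DNA Index16F/R_adapter to DNA Index20F/R_adapter, or DNA Index21F/R_adapter to DNA Index25F/R_adapter, or DNA Index26F/R_adapter to DNA Index30F/R_adapter, or DNA Index31F/R_adapter to DNA Index35F/R_adapter, or DNA Index36F/R_adapter to DNA Index40F/R_adapter, or DNA Index41F/R_adapter to DNA Index45F/R_adapter, or DNA Index46F/R_adapter to DNA Index50F/R_adapter, or DNA Index51F/R_adapter to DNA Index55F/R_adapter, or DNA Index55F/R_adapter to DNA Index59F/R_adapter of the 59 DNA tag adapters shown in Table 2, or any combination of two or more of these groups. According to specific examples, differing by 1 base includes substitution, addition, or deletion of 1 base in the tag sequence. According to embodiments of the invention, the use of the DNA tag adapters for constructing and sequencing DNA tag libraries is also provided.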
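According to another aspect, the invention provides two sets of isolated PCR tag primers that can be used to introduce the DNA tags described above into the DNA of a sample and thereby construct a DNA tag library. According to embodiments of the invention, one of the two sets consists respectively of the nucleotides shown in SEQ ID NO: 178-236 and the other consists respectively of the nucleotides shown in SEQ ID NO: 237-295. According to embodiments of the invention, the primers of both sets carry DNA tags according to embodiments of the invention as described above; through a PCR reaction using the PCR tag primers, the primers are incorporated into the DNA of the sample or its equivalent, thereby introducing the corresponding DNA tags into the DNA or its equivalent. The sequences of these PCR tag primers are shown in Tables 2 and 3 above and are not repeated here. In addition, according to some embodiments of the invention, a PCR reaction is run with a primer from the set consisting of SEQ ID NO: 178-236 as the first PCR tag primer (sometimes also called the "PCR1.0 tag primer") and a primer from the set consisting of SEQ ID NO: 237-295 as the second PCR tag primer (sometimes also called the "PCR2.0 tag primer"), so that the PCR1.0 tag primer and the PCR2.0 tag primer are introduced into the DNA of one sample or its equivalent at the same time, thereby introducing the corresponding DNA tags into the DNA or its equivalent. Here, "the corresponding DNA tags" means the DNA tags contained in each of the two primers used in the same PCR reaction, the PCR1.0 tag primer and the PCR2.0 tag primer.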
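According to some embodiments, the invention provides two sets of PCR tag primers that contain, at the 3' end, the DNA tags according to embodiments of the invention described above. According to embodiments of the invention, the first PCR tag primers of the PCR reaction (PCR1.0 tag primers), the set consisting respectively of the nucleotides shown in SEQ ID NO: 178-236, comprise or consist of at least 5, or at least 10, or at least 15, or at least 20, at least 25, or at least 30, or at least 35, or at least 40, or 45, or at least 50, or at least 55, or all 59 of the 59 PCR1.0 tag primers shown in Table 2 or PCR1.0 tag primers whose DNA tag sequence differs from theirs by 1 base. According to some specific examples of the invention, these PCR1.0 tag primers preferably include at least PCR1.0_Index_1 Primer to PCR1.0_Index_5 Primer, or PCR1.0_Index_6 Primer to PCR1.0_Index_10 Primer, or PCR1.0_Index_11 Primer to PCR1.0_Index_15 Primer, or PCR1.0_Index_16 Primer to PCR1.0_Index_20 Primer, or PCR1.0_Index_21 Primer to PCR1.0_Index_25 Primer, or PCR1.0_Index_26 Primer to PCR1.0_Index_30 Primer, or PCR1.0_Index_31 Primer to PCR1.0_Index_35 Primer, or PCR1.0_Index_36 Primer to PCR1.0_Index_40 Primer, or PCR1.0_Index_41 Primer to PCR1.0_Index_45 Primer, or PCR1.0_Index_46 Primer to PCR1.0_Index_50 Primer, or PCR1.0_Index_51 Primer to PCR1.0_Index_55 Primer, or PCR1.0_Index_55 Primer to PCR1.0_Index_59 Primer of the 59 PCR1.0 tag primers shown in Table 2, or any combination of two or more of these groups. According to embodiments of the invention, the second PCR tag primers of the PCR reaction (PCR2.0 tag primers), the set consisting respectively of the nucleotides shown in SEQ ID NO: 237-295, comprise or consist of at least 5, or at least 10, or at least 15, or at least 20, at least 25, or at least 30, or at least 35, or at least 40, or 45, or at least 50, or at least 55, or all 59 of the 59 PCR2.0 tag primers shown in Table 3 or PCR2.0 tag primers whose DNA tag sequence differs from theirs by 1 base. According to some specific examples of the invention, these PCR2.0 tag primers preferably include at least PCR2.0_Index_1 Primer to PCR2.0_Index_5 Primer, or PCR2.0_Index_6 Primer to PCR2.0_Index_10 Primer, or PCR2.0_Index_11 Primer to PCR2.0_Index_15 Primer, or PCR2.0_Index_16 Primer to PCR2.0_Index_20 Primer, or PCR2.0_Index_21 Primer to PCR2.0_Index_25 Primer, or PCR2.0_Index_26 Primer to PCR2.0_Index_30 Primer, or PCR2.0_Index_31 Primer to PCR2.0_Index_35 Primer, or PCR2.0_Index_36 Primer to PCR2.0_Index_40 Primer, or PCR2.0_Index_41 Primer to PCR2.0_Index_45 Primer, or PCR2.0_Index_46 Primer to PCR2.0_Index_50 Primer, or PCR2.0_Index_51 Primer to PCR2.0_Index_55 Primer, or PCR2.0_Index_55 Primer to PCR2.0_Index_59 Primer of the 59 PCR2.0 tag primers shown in Table 3, or any combination of two or more of these groups. According to specific examples, differing by 1 base includes substitution, addition, or deletion of 1 base in the tag sequence. According to embodiments of the invention, the use of the PCR tag primers for constructing and sequencing DNA tag libraries is also provided.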
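Accordingly, embodiments of the invention also provide DNA tag libraries constructed with the above DNA tag adapters and PCR tag primers.
According to another aspect, the invention also provides a method for constructing a DNA tag library using the above DNA tag adapters and PCR tag primers. Specifically, according to embodiments of the invention and with reference to Figure 2, the method comprises the following.
First, the DNA sample is fragmented to obtain DNA fragments. According to embodiments of the invention, the DNA sample is fragmented by sonication. According to embodiments of the invention, the source of the DNA sample is not particularly limited. According to specific examples of the invention, the DNA sample is a human DNA sample, more specifically a human genomic DNA sample. The inventors found that the method according to embodiments of the invention can effectively construct DNA tag libraries for a variety of common model organisms. According to embodiments of the invention, the DNA fragments obtained are about 180 bp in length, which further improves the efficiency of DNA tag library construction and subsequent sequencing.
Second, the DNA fragments are end-repaired to obtain end-repaired DNA fragments.
Next, a base A is added to the 3' ends of the end-repaired DNA fragments to obtain DNA fragments with a sticky-end A. According to embodiments of the invention, an end-repaired DNA fragment has two oligonucleotide strands, and the base A is added to the 3' end of each of the two strands.
Next, the DNA fragments with a sticky-end A are ligated to a DNA tag adapter to obtain ligation products carrying the DNA tag adapter. According to embodiments of the invention, a DNA tag adapter is ligated to both ends of the DNA fragment. According to some specific examples of the invention, this ligation is achieved by attaching a DNA tag adapter to the 3' end of each of the two oligonucleotide strands of the DNA fragment with a sticky-end A. According to embodiments of the invention, the DNA tag adapter is one of the set of isolated oligonucleotides according to embodiments of the invention, and contains one of the set of isolated DNA tags according to embodiments of the invention described above.
Then, the ligation products are subjected to a PCR reaction to obtain PCR amplification products. The PCR reaction uses a first PCR tag primer and a second PCR tag primer. In this specification, the first PCR tag primer is one of the set of isolated PCR tag primers according to embodiments of the invention consisting respectively of the nucleotides shown in SEQ ID NO: 178-236, and the second PCR tag primer is one of the set of isolated PCR tag primers consisting respectively of the nucleotides shown in SEQ ID NO: 237-295. According to embodiments of the invention, the first PCR tag primer and the second PCR tag primer contain different DNA tags. The PCR amplification products contain the target fragment, the DNA adapter, and the DNA tags, where the sequence of the target fragment corresponds to the sequence of the DNA fragment. "Corresponds" here means that the sequence of the DNA fragment can be derived directly from the sequence of the target fragment: the target fragment may be identical to the DNA fragment, fully complementary to it, or even extended or shortened by a known number of known bases, provided that the sequence of the DNA can be obtained by a finite amount of computation.
Finally, the PCR amplification products are isolated and recovered; these PCR amplification products constitute the DNA tag library. According to embodiments of the invention, the method of isolating and recovering the amplification products is not particularly limited, and those skilled in the art can choose suitable methods and equipment according to the characteristics of the products, for example electrophoresis followed by recovery of PCR amplification products of a specific length. According to embodiments of the invention, PCR amplification products about 380-400 bp in length are preferably recovered.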
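Further, according to embodiments of the invention, the invention provides a method for constructing a DNA tag library, comprising:
1) providing n DNA samples, where n is an integer with 1 ≤ n ≤ 59, preferably 2 ≤ n ≤ 59; the DNA samples may be any eukaryotic or prokaryotic DNA samples, including but not limited to human DNA samples;
2) fragmenting the human genomic DNA, by methods including but not limited to sonication, preferably so that the fragmented DNA bands are concentrated around 250 bp;
3) end repair;
4) adding an "A" base to the 3' ends of the DNA fragments;
5) ligating DNA tag adapters, preferably with each tag adapter ligated to both ends of the DNA fragment;
6) gel-recovering and purifying the ligation products of step 5), preferably by electrophoresis on a 2% agarose gel followed by recovery, and pooling the recovered products of the individual DNA samples;
7) PCR reaction: using the pooled recovered products of step 6) as template, carrying out PCR amplification under conditions suitable for amplifying the target nucleic acid, and gel-recovering and purifying the PCR products, preferably recovering target fragments of 380-400 bp.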
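According to embodiments of the invention, in a DNA tag library constructed by the above method according to embodiments of the invention, the DNA tag adapters comprise or consist of at least 5, or at least 10, or at least 15, or at least 20, at least 25, or at least 30, or at least 35, or at least 40, or 45, or at least 50, or at least 55, or all 59 of the 59 DNA tag adapters shown in Table 1 or DNA tag adapters whose DNA tag sequence differs from theirs by 1 base. According to embodiments of the invention, the DNA tag adapters used in the above method preferably include at least DNA Index1F/R_adapter to DNA Index5F/R_adapter, or DNA Index6F/R_adapter to DNA Index10F/R_adapter, or DNA Index11F/R_adapter to DNA Index15F/R_adapter, or DNA Index16F/R_adapter to DNA Index20F/R_adapter, or DNA Index21F/R_adapter to DNA Index25F/R_adapter, or DNA Index26F/R_adapter to DNA Index30F/R_adapter, or DNA Index31F/R_adapter to DNA Index35F/R_adapter, or DNA Index36F/R_adapter to DNA Index40F/R_adapter, or DNA Index41F/R_adapter to DNA Index45F/R_adapter, or DNA Index46F/R_adapter to DNA Index50F/R_adapter, or DNA Index51F/R_adapter to DNA Index55F/R_adapter, or DNA Index55F/R_adapter to DNA Index59F/R_adapter of the 59 DNA tag adapters shown in Table 1, or any combination of two or more of these groups. According to embodiments of the invention, differing by 1 base includes substitution, addition, or deletion of 1 base in the tag. According to embodiments of the invention, the primers used in the PCR reaction of step 7) of the above method are one of the set of isolated PCR tag primers consisting respectively of the nucleotides shown in SEQ ID NO: 178-236, used as PCR Primer 1.0, and one of the set of isolated PCR tag primers consisting respectively of the nucleotides shown in SEQ ID NO: 237-295, used as PCR Primer 2.0.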
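With the method for constructing a DNA tag library according to embodiments of the invention, the DNA tags according to embodiments of the invention can be effectively introduced into the DNA tag library constructed for a DNA sample. Sequencing the DNA tag library then yields both the sequence information of the DNA sample and the sequence information of the DNA tags, so that the source of the DNA sample can be distinguished. In this method the DNA tag adapter, the PCR1.0 tag primer, and the PCR2.0 tag primer each contain one tag, so a DNA tag library constructed by the method carries 3 tags, and these 3 tags make up one "tag combination". The 59 tag sequences according to embodiments of the invention can produce 205379 different three-tag combinations. According to embodiments of the invention, the tag combination sequence is introduced into the DNA sample by adapter ligation and PCR using the DNA tag adapters and the two sets of PCR tag primers described above. Introducing the DNA tag adapter, the PCR1.0 tag primer, and the PCR2.0 tag primer into a DNA library together introduces a tag combination into that library; by introducing different tag combinations, DNA tag libraries can be constructed for many (205379) DNA samples, and after sequencing the libraries can be told apart by their tag combinations. According to embodiments of the invention, a very large collection of tag combinations can thus be built, ultimately allowing pooled sequencing of a very large number of samples. Compared with Illumina's approach, which supports DNA tag libraries for at most 12 DNA samples in one pooled sequencing run, the method provided by the invention is a clear improvement: it makes full use of high-throughput sequencing platforms, meets the demands of high-throughput sequencing, saves sequencing resources, and so lowers sequencing cost. In addition, according to embodiments of the invention, the library construction method provided by Illumina, which introduces the tag with 3 PCR primers (two common primers and one PCR tag primer), is optimized so that the tags are introduced with only two PCR primers (the PCR1.0 tag primer and the PCR2.0 tag primer). This lowers the difficulty of the PCR reaction, raises the specificity of PCR amplification, and thereby raises the efficiency of the PCR amplification reaction. The invention also improves the efficiency of tag sequence recognition, which improves the efficiency of DNA tag library construction and lowers its cost. For details, compare Figure 1, the flow chart of Illumina's DNA tag library construction method, with Figure 2, the flow chart of the DNA tag library construction method according to embodiments of the invention. To date there has been no report of a DNA library construction method that introduces tag combinations through these DNA tag adapters, PCR1.0 tag primers, and PCR2.0 tag primers, or of the tag sequences themselves.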
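The figure of 205379 is the number of ordered triples that can be formed when each of the three tag positions (adapter, PCR1.0 primer, PCR2.0 primer) is chosen independently from 59 tags. A minimal Python sketch of that count follows; the function and variable names are hypothetical.

```python
from itertools import product

N_TAGS = 59  # number of DNA tags described in the text


def tag_combinations(n_tags: int = N_TAGS):
    """Yield (adapter_index, pcr1_index, pcr2_index) for every three-tag combination."""
    return product(range(1, n_tags + 1), repeat=3)


total = sum(1 for _ in tag_combinations())
print(total)  # 205379
```

Because the three positions are independent, the count is simply 59 × 59 × 59; the enumeration is shown only to make explicit that the same tag index may appear in more than one position.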
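In addition, the inventors surprisingly found that when DNA tag libraries containing various DNA tag combinations are constructed from the same sample by the above method, using DNA tag adapters, PCR1.0 tag primers, and PCR2.0 tag primers with different tag combinations, the resulting sequencing data are very stable and reproducible.
According to yet another aspect, the invention also provides a kit for constructing a DNA tag library. According to embodiments of the invention, the kit comprises: 59 isolated oligonucleotides, each having a first strand and a second strand, the first strands consisting respectively of the nucleotides shown in SEQ ID NO: (3N-1) and the second strands consisting respectively of the nucleotides shown in SEQ ID NO: (3N), wherein the first and second strands of the same oligonucleotide take the same value of N, and N is any integer from 1 to 59; 59 isolated first PCR tag primers, consisting respectively of the nucleotides shown in SEQ ID NO: 178-236; and 59 isolated second PCR tag primers, consisting respectively of the nucleotides shown in SEQ ID NO: 237-295; wherein the 59 isolated DNA tag adapters, the 59 isolated first PCR tag primers, and the 59 isolated second PCR tag primers are each provided in separate containers. With this kit, the DNA tags according to embodiments of the invention can be conveniently introduced into the DNA tag library being constructed. Those skilled in the art will of course understand that the kit may also contain other conventional components for constructing DNA tag libraries, which are not described further here.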
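DNA Tag Libraries and Sequencing Methods
According to yet another aspect, the invention also provides a DNA tag library constructed by the method for constructing a DNA tag library according to the invention. Such a tagged DNA library can be used effectively with high-throughput sequencing technologies such as Solexa, so that the nucleic acid sequence information obtained, for example DNA sequence information, can be sorted precisely by sample of origin on the basis of the tag sequences.
According to yet another aspect, the invention also provides a method for determining the sequence information of a DNA sample. According to embodiments of the invention, the method comprises: constructing a DNA tag library by the method for constructing a DNA tag library according to embodiments of the invention; and then sequencing the constructed DNA tag library to determine the sequence information of the DNA sample. This method effectively yields the sequence information of the DNA sample in the DNA tag library together with the sequence information of the DNA tags, so that the source of the DNA sample can be distinguished. In addition, the inventors surprisingly found that determining DNA sample sequence information by the method according to embodiments of the invention effectively reduces bias in data output and allows multiple DNA tag libraries to be distinguished precisely. According to embodiments of the invention, the constructed DNA tag library may be sequenced by any known method, and the type of method is not particularly limited. According to some examples of the invention, the DNA tag library may be sequenced with Solexa sequencing technology. According to embodiments of the invention, suitable sequencing primers may be chosen according to the circumstances.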
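Further, the above method for determining DNA sample sequence information can be applied to multiple samples. For example, according to embodiments of the invention, the invention provides a method for determining the sequence information of multiple DNA samples. According to embodiments of the invention, it comprises the following steps. For each of the multiple samples, a DNA tag library of that DNA sample is constructed independently by the method for constructing a DNA tag library according to embodiments of the invention, with different DNA samples using mutually different combinations of DNA tags of known sequence. The term "multiple" as used here means at least two. The expression "mutually different combinations of DNA tags of known sequence" means that the DNA tag combination contained in the DNA tag library constructed for one DNA sample differs from the tag combinations of the DNA tag libraries of the other samples, and that, because the sequences of the 3 tags making up each tag combination are known, the sequence of each tag combination is known. A tag combination arises because the DNA tag adapter, the PCR1.0 tag primer, and the PCR2.0 tag primer used in the method each contain one tag, so that a DNA tag library constructed by the method carries 3 tags; these 3 tags can be regarded as one combination, referred to here as a "tag combination". According to embodiments of the invention, the 3 DNA tags in a tag combination may all be the same, may all be different, or any 2 of them may be the same. In this specification, tag combinations being "mutually different" means that any two tag combinations differ in at least one DNA tag; that is, among the 6 tags of any 2 tag combinations at least one tag is different: 1 tag must differ, and 2, 3, 4, 5, or even all 6 tags may differ. The 59 tag sequences according to embodiments of the invention can thus produce 205379 different three-tag combinations. According to embodiments of the invention, introducing the DNA tag adapter, the PCR1.0 tag primer, and the PCR2.0 tag primer into a DNA library together introduces one tag combination into that library; by introducing different tag combinations, DNA tag libraries of multiple DNA samples can be constructed, and after sequencing the libraries can be told apart by their tag combinations. According to embodiments of the invention, a very large collection of tag combinations can thus be built, ultimately allowing pooled sequencing of a very large number of samples. Specifically, the DNA tag libraries of the multiple samples are combined to obtain a DNA tag library mixture, and the mixture is sequenced with Solexa sequencing technology to obtain the sequence information of the DNA samples and the sequence information of the tags. Finally, the sequence information of the DNA samples is sorted on the basis of the sequence information of the tag combinations to determine the sequence information of the multiple DNA samples. The method according to embodiments of the invention can therefore make full use of high-throughput sequencing technologies such as Solexa to sequence the DNA libraries of multiple samples simultaneously, improving the efficiency and throughput of DNA library sequencing and the efficiency of determining the sequence information of multiple DNA samples. The sequencing methods and sequencing primers have been described in detail above and are not repeated here.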
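The final sorting step can be sketched as a lookup from tag combination to sample. The Python sketch below assumes that the three tags of each read have already been extracted as a tuple; how and where the tags are read on the sequencer is not specified here, and the read layout, sample names, function name, and tag sequences are all hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, sample_sheet):
    """Group insert sequences by sample, keyed on the three-tag combination.

    reads: iterable of (adapter_tag, pcr1_tag, pcr2_tag, insert_sequence)
    sample_sheet: dict mapping (adapter_tag, pcr1_tag, pcr2_tag) -> sample name
    """
    bins = defaultdict(list)
    for adapter_tag, pcr1_tag, pcr2_tag, insert in reads:
        key = (adapter_tag, pcr1_tag, pcr2_tag)
        bins[sample_sheet.get(key, "unassigned")].append(insert)
    return dict(bins)


# Hypothetical tag combinations; they differ in at least one of the three tags.
sample_sheet = {
    ("ACGTACG", "ACGTACG", "ACGTACG"): "sample_A",
    ("ACGTACG", "ACCATGG", "ACGTACG"): "sample_B",
}
reads = [
    ("ACGTACG", "ACGTACG", "ACGTACG", "GTTTTTCC"),
    ("ACGTACG", "ACCATGG", "ACGTACG", "CATAGGCT"),
    ("TGGTCCA", "TGGTCCA", "TGGTCCA", "CCGCCCCC"),
]
print(demultiplex(reads, sample_sheet))
```

Using the whole triple as the dictionary key is what allows two samples to share one or two tags and still be separated, as long as the combinations are mutually different.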
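It should be noted that the method for determining DNA sample sequence information according to embodiments of the invention was completed by the inventors of the present application only after painstaking creative work and optimization.
The solutions of the invention are explained below with reference to examples. Those skilled in the art will understand that the following examples serve only to illustrate the invention and should not be regarded as limiting its scope. Where specific techniques or conditions are not indicated in the examples, the techniques or conditions described in the literature of the field were followed (for example, J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd edition, translated by Huang Peitang et al., Science Press), or the product instructions were followed. Reagents and instruments for which no manufacturer is indicated are conventional, commercially available products, for example products purchased from Illumina.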
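Example 1:
1.1 Preparation of the DNA template
The pMD18-T plasmid vector (Takara, Japan) was used as template, and primers were designed with Primer Premier 5.0 software: pMD18-T primer 1, CGGGGAGAGGCGGTTTGCGTATTGG; pMD18-T primer 2, TTTTGTGATGCTCGTCAGGGGGGCG. A fragment 250 bp in length was amplified by PCR, the concentration of the amplification product was measured on a NanoDrop 1000 instrument (NanoDrop, USA), and, based on that concentration, 1 µg of the PCR product was taken as the DNA fragment for library construction and made up to a volume of 35 µL with water.
pMD18-T plasmid DNA template 2 µL
Taq polymerase 0.5 µL
pMD18-T primer 1 1 µL
pMD18-T primer 2 1 µL
dNTP mix 5 µL
10× PCR buffer 5 µL
ddH2O 35.5 µL
Total volume 50 µL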
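A reaction mix such as the one above can be checked by summing the component volumes against the stated total. The Python sketch below does this for the template-amplification PCR mix; the dictionary keys and the function name are hypothetical labels, while the volumes are those listed above.

```python
# Component volumes (in microlitres) of the 50-microlitre PCR mix listed above.
pcr_mix_ul = {
    "pMD18-T plasmid DNA template": 2.0,
    "Taq polymerase": 0.5,
    "pMD18-T primer 1": 1.0,
    "pMD18-T primer 2": 1.0,
    "dNTP mix": 5.0,
    "10x PCR buffer": 5.0,
    "ddH2O": 35.5,
}


def total_volume(mix: dict[str, float]) -> float:
    """Sum the component volumes of a reaction mix."""
    return sum(mix.values())


print(total_volume(pcr_mix_ul))  # 50.0
```

The same check can be applied to the end-repair, A-tailing, ligation, and tag-primer PCR mixes in the following steps.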
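PCR reaction conditions
98 °C 30 s
[cycling steps given as an image in the original; not recoverable]
72 °C 5 min
4 °C hold
The PCR product was then purified with the QIAquick PCR purification kit.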
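1.2 End repair
Prepare the reaction mix in the following proportions:
DNA template 35 µL
T4 DNA ligase buffer 50 µL
dNTP mix 4 µL
T4 DNA polymerase 5 µL
Klenow DNA polymerase 1 µL
T4 polynucleotide kinase 5 µL
Total volume 100 µL
Set the Thermomixer comfort to 20 °C and react for 30 min, then purify with the QIAquick PCR purification kit and finally dissolve the sample in 32 µL of EB solution.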
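1.3 Adding base A to the 3' ends of the DNA fragments
Prepare the reaction mix in the following proportions:
End-repaired DNA 32 µL
Klenow enzyme buffer 5 µL
dATP (1 mM) 10 µL
Klenow enzyme (3' to 5' exonuclease activity) 3 µL
Total volume 50 µL
Set the Thermomixer comfort to 37 °C and react for 30 min, then purify with the MinElute PCR purification kit and finally dissolve the sample in 10 µL of EB solution.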
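1.4 Ligation of the DNA tag adapter
Prepare the reaction mix in the following proportions:
DNA 10 µL
T4 DNA ligase buffer 25 µL
DNA tag adapter 10 µL
T4 DNA ligase 5 µL
Total volume 50 µL
Note: the DNA tag adapter may be any one of the 59 DNA tag adapters in Table 1 (each consisting of two complementary sequences, forward and reverse: DNA Index-NF_adapter and DNA Index-NR_adapter).
Set the Thermomixer comfort to 20 °C and react for 15 min, then purify with the QIAquick PCR purification kit and finally dissolve the sample in 30 µL of EB solution.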
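1.5 Gel recovery and purification of the ligation product
Separate the ligation product by electrophoresis on a 2% agarose gel, then excise the band of the target fragment and transfer it to an Eppendorf tube. Recover the DNA with the QIAquick gel purification kit and dissolve the recovered product in 20 µL of EB solution.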
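1.6 Introduction of the PCR tag primers by PCR
PCR reaction: prepare the reaction mix according to the following reaction system, keeping the reagents on ice.
Gel-purified DNA 10 µL
Phusion DNA polymerase 25 µL
PCR1.0_indexN Primer 1 µL
PCR2.0_indexN Primer 1 µL
ddH2O 13 µL
Total volume 50 µL
Note: the PCR1.0_indexN Primer may be any one of the 59 PCR1.0_indexN Primers in Table 2; the PCR2.0_indexN Primer may be any one of the 59 PCR2.0_indexN Primers in Table 3.
PCR reaction conditions
98 °C 30 s
15 cycles of:
98 °C 10 s
65 °C 30 s
72 °C 30 s
72 °C 5 min
4 °C hold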
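1.7 Gel recovery and purification of the PCR product
Separate the PCR product by electrophoresis on a 2% agarose gel, excise and recover the target fragment, purify it with the QIAquick gel purification kit, and dissolve the recovered product in 30 µL of EB solution.
1.8 Testing of the prepared DNA product
1) Measure the library yield with an Agilent 2100 Bioanalyzer.
2) Quantify the library yield by QPCR.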
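Figures 3, 4, and 5 show schematic diagrams of DNA tag libraries constructed by the method for constructing a DNA tag library according to embodiments of the invention, and of their different tag combinations. These DNA tag libraries can be told apart by the sequence information of their tag combinations, specifically by the combination of the tags in the DNA tag adapter, the PCR1.0 index Primer, and the PCR2.0 index Primer; the number of such tag combinations can reach 205379 (59 × 59 × 59). According to embodiments of the invention, introducing different tag combinations into DNA libraries through the DNA tag adapters, PCR1.0 tag primers, and PCR2.0 tag primers allows DNA tag libraries to be constructed for many (205379) DNA samples, which can then be distinguished by their tag combinations after sequencing. According to embodiments of the invention, a very large collection of tag combinations can thus be built, ultimately allowing pooled sequencing of a very large number of samples. For example,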
As shown in Figure 3,
the sequence of the DNA fragment is:
GTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAA
The sequences of the libraries constructed as shown in Figure 3 are as follows (the full library sequences are given as images in the original; only the tag combination of each library is listed here):
>Index tagA-1: index1+index1+index1
>Index tagA-2: index2+index1+index1
>Index tagA-3: index3+index1+index1
>Index tagA-58: index58+index1+index1
>Index tagA-59: index59+index1+index1
As shown in Figure 4,
the sequence of the DNA fragment is:
GTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAA
The sequences of the libraries constructed as shown in Figure 4 are as follows (library sequences given as images in the original):
>Index tagB-1: index1+index1+index1
>Index tagB-2: index1+index2+index1
>Index tagB-3: index1+index3+index1
>Index tagB-58: index1+index58+index1
GTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAA
The sequences of the libraries constructed as shown in Figure 5 are as follows (library sequences given as images in the original):
>Index tagC-1: index1+index1+index1
>Index tagC-2: index1+index1+index2
>Index tagC-3: index1+index1+index3
>Index tagC-58: index1+index1+index58
>Index tagC-59: index1+index1+index59
Figure 6 shows the electrophoresis results for the DNA tag libraries constructed by the method for constructing a DNA tag library according to embodiments of the invention. In Figure 6, the target library fragment, indicated by the arrow, is 390 bp; the bands of the D2000 marker are, in order, 2000 bp, 1000 bp, 750 bp, 500 bp, 250 bp, and 100 bp. Lanes: 1, D2000 marker; 2, Index tagA-1; 3, Index tagA-2; 4, Index tagA-3; 5, Index tagA-58; 6, Index tagA-59; 7, Index tagB-1; 8, Index tagB-2; 9, Index tagB-3; 10, Index tagB-58; 11, Index tagB-59; 12, Index tagC-1; 13, Index tagC-2; 14, Index tagC-3; 15, Index tagC-58; 16, Index tagC-59; 17, Index tagD-1; 18, Index tagD-2; 19, Index tagD-58; 20, Index tagD-59; 21, Index tagE-1; 22, Index tagE-2; 23, Index tagE-58; 24, Index tagE-59; 25, 50 bp marker.
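Industrial Applicability
The DNA tags for constructing DNA tag libraries, the DNA tag adapters, the PCR tag primers, the DNA tag libraries and methods for preparing them, the method for determining the sequence information of a DNA sample, the method for determining the sequence information of multiple DNA samples, and the kit for constructing DNA tag libraries of the invention can be applied to DNA sequencing and can effectively increase the sequencing throughput of sequencing platforms such as the Solexa platform.
Although specific embodiments of the invention have been described in detail, those skilled in the art will understand that, in light of all the teachings disclosed, various modifications and substitutions may be made to those details, and that all such changes fall within the scope of protection of the invention. The full scope of the invention is given by the appended claims and any equivalents thereof.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "illustrative embodiment", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. In this specification, illustrative uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.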

Claims

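1. A set of isolated DNA tags, consisting respectively of the nucleotides shown in SEQ ID NO: (3N-2), where N is any integer from 1 to 59.
2. A set of isolated oligonucleotides, the isolated oligonucleotides having a first strand and a second strand, the first strands consisting respectively of the nucleotides shown in SEQ ID NO: (3N-1) and the second strands consisting respectively of the nucleotides shown in SEQ ID NO: (3N), wherein the first and second strands of the same oligonucleotide take the same value of N, and N is any integer from 1 to 59.
3. A set of isolated PCR tag primers, consisting respectively of the nucleotides shown in SEQ ID NO: 178-236.
4. A set of isolated PCR tag primers, consisting respectively of the nucleotides shown in SEQ ID NO: 237-295.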
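5. A method for constructing a DNA library, characterized by comprising the following steps:
fragmenting a DNA sample to obtain DNA fragments;
end-repairing the DNA fragments to obtain end-repaired DNA fragments;
adding a base A to the 3' ends of the end-repaired DNA fragments to obtain DNA fragments with a sticky-end A;
ligating the DNA fragments with a sticky-end A to a DNA tag adapter to obtain ligation products carrying the DNA tag adapter, wherein the DNA tag adapter is one selected from the isolated oligonucleotides of claim 2;
subjecting the ligation products to a PCR reaction to obtain PCR amplification products, wherein the PCR reaction uses a first PCR tag primer and a second PCR tag primer, the first PCR tag primer being one selected from the set of isolated PCR tag primers of claim 3 and the second PCR tag primer being one selected from the set of isolated PCR tag primers of claim 4, and the PCR amplification products contain a target fragment, a DNA adapter, and DNA tags, wherein the sequence of the target fragment corresponds to the sequence of the DNA fragment; and
isolating and recovering the PCR amplification products, the PCR amplification products constituting the DNA tag library.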
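6. The method according to claim 5, characterized in that the DNA sample is fragmented by sonication.
7. The method according to claim 5, characterized in that the DNA sample is a human DNA sample.
8. The method according to claim 5, characterized in that the DNA fragments are about 250 bp in length.
9. The method according to claim 5, characterized in that a DNA tag adapter is ligated to both ends of the DNA fragment.
10. The method according to claim 5, characterized in that the first PCR tag primer and the second PCR tag primer contain different DNA tags.
11. A DNA tag library constructed by the method according to any one of claims 5-10.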
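12. A method for determining the sequence information of a DNA sample, comprising the following steps:
constructing a DNA tag library of the DNA sample by the method according to any one of claims 5-10; and
sequencing the DNA tag library to determine the sequence information of the DNA sample.
13. The method for determining the sequence information of a DNA sample according to claim 12, characterized in that the DNA tag library is sequenced with Solexa sequencing technology.
14. A method for determining the sequence information of multiple DNA samples, comprising the following steps:
for each of the multiple samples, independently constructing a DNA tag library of the DNA sample by the method according to any one of claims 5-10, wherein different DNA samples use mutually different combinations of DNA tags of known sequence;
combining the DNA tag libraries of the multiple samples to obtain a DNA tag library mixture;
sequencing the DNA tag library mixture with Solexa sequencing technology to obtain the sequence information of the DNA samples and the sequence information of the tag combinations; and
sorting the sequence information of the DNA samples on the basis of the sequence information of the tag combinations to determine the DNA sequence information of the multiple samples.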
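15. A kit for constructing a DNA tag library, comprising:
59 isolated oligonucleotides, the isolated oligonucleotides having a first strand and a second strand, the first strands consisting respectively of the nucleotides shown in SEQ ID NO: (3N-1) and the second strands consisting respectively of the nucleotides shown in SEQ ID NO: (3N), wherein the first and second strands of the same oligonucleotide take the same value of N, and N is an integer from 1 to 59;
59 isolated first PCR tag primers, consisting respectively of the nucleotides shown in SEQ ID NO: 178-236; and
59 isolated second PCR tag primers, consisting respectively of the nucleotides shown in SEQ ID NO: 237-295;
wherein the 59 isolated DNA tag adapters, the 59 isolated first PCR tag primers, and the 59 isolated second PCR tag primers are each provided in separate containers.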
PCT/CN2011/079904 2010-09-21 2011-09-20 DNA tags and use thereof WO2012037882A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201010299271.3 2010-09-21
CN 201010299271 CN102409048B (zh) 2010-09-21 2010-09-21 Method for constructing a DNA tag library based on high-throughput sequencing

Publications (1)

Publication Number Publication Date
WO2012037882A1 true WO2012037882A1 (zh) 2012-03-29

Family

ID=45873446

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/079904 WO2012037882A1 (zh) 2010-09-21 2011-09-20 DNA tags and use thereof

Country Status (3)

Country Link
CN (1) CN102409048B (zh)
HK (1) HK1168630A1 (zh)
WO (1) WO2012037882A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014047561A1 (en) * 2012-09-21 2014-03-27 The Broad Institute Inc. Compositions and methods for labeling of agents
WO2014093825A1 (en) * 2012-12-14 2014-06-19 Chronix Biomedical Personalized biomarkers for cancer
WO2014143158A1 (en) * 2013-03-13 2014-09-18 The Broad Institute, Inc. Compositions and methods for labeling of agents
EP2898071A1 (en) * 2012-09-21 2015-07-29 The Broad Institute, Inc. Compositions and methods for long insert, paired end libraries of nucleic acids in emulsion droplets
EP4008796A1 (en) * 2017-01-27 2022-06-08 Integrated DNA Technologies, Inc. Construction of next generation sequencing (ngs) libraries using competitive strand displacement
US11692219B2 (en) 2017-01-27 2023-07-04 Integrated Dna Technologies, Inc. Construction of next generation sequencing (NGS) libraries using competitive strand displacement

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103571822B (zh) * 2012-07-20 2016-03-30 中国科学院植物研究所 一种用于新一代测序分析的多重目的dna片段富集方法
US11591637B2 (en) 2012-08-14 2023-02-28 10X Genomics, Inc. Compositions and methods for sample processing
US10323279B2 (en) 2012-08-14 2019-06-18 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10400280B2 (en) 2012-08-14 2019-09-03 10X Genomics, Inc. Methods and systems for processing polynucleotides
US9701998B2 (en) 2012-12-14 2017-07-11 10X Genomics, Inc. Methods and systems for processing polynucleotides
CN102978205B (zh) * 2012-11-19 2014-08-20 北京诺禾致源生物信息科技有限公司 一种应用于标记开发的高通量测序的接头及其运用方法
CN102978206A (zh) * 2012-11-27 2013-03-20 北京诺禾致源生物信息科技有限公司 一种应用于混合建库的高通量测序接头及其建库方法
US10533221B2 (en) 2012-12-14 2020-01-14 10X Genomics, Inc. Methods and systems for processing polynucleotides
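CN108753766A (zh) * 2013-02-08 2018-11-06 10X Genomics, Inc. Polynucleotide barcode generation
CN104232626A (zh) * 2013-06-13 2014-12-24 深圳华大基因科技有限公司 Barcodes for reduced-representation genome sequencing libraries and design method thereof
CN104834833B (zh) * 2014-02-12 2017-12-05 深圳华大基因科技有限公司 Method and device for detecting single nucleotide polymorphisms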
EP3889325A1 (en) 2014-06-26 2021-10-06 10X Genomics, Inc. Methods of analyzing nucleic acids from individual cells or cell populations
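CN105316320B (zh) * 2014-07-30 2020-02-21 天津华大基因科技有限公司 DNA tag, PCR primer and application thereof
CN105296471B (zh) * 2014-08-01 2020-02-21 天津华大基因科技有限公司 DNA tag, PCR primer and application thereof
CN105442051A (zh) * 2014-09-26 2016-03-30 深圳华大基因科技有限公司 Method for screening a gene library
CN104293938B (zh) * 2014-09-30 2017-11-03 天津华大基因科技有限公司 Method for constructing sequencing library and application thereof
CN104294371B (zh) * 2014-09-30 2017-07-04 天津华大基因科技有限公司 Method for constructing sequencing library and application thereof
CN104293940B (zh) * 2014-09-30 2017-07-28 天津华大基因科技有限公司 Method for constructing sequencing library and application thereof
CN104264231B (zh) * 2014-09-30 2017-04-19 天津华大基因科技有限公司 Method for constructing sequencing library and application thereof
CN105780129B (zh) * 2014-12-15 2019-06-11 天津华大基因科技有限公司 Method for constructing target region sequencing library
CN105401222A (zh) * 2015-12-30 2016-03-16 安诺优达基因科技(北京)有限公司 Method for constructing a DNA library for sequencing
CN108148900A (zh) * 2018-01-24 2018-06-12 深圳因合生物科技有限公司 Sequencing method, kit and application thereof for reducing sequencing errors based on molecular tags and next-generation sequencing
CN110317876A (zh) * 2019-08-02 2019-10-11 苏州宏元生物科技有限公司 Use of a set of chromosomal instability variants in preparing a reagent or kit for diagnosing multiple myeloma and evaluating prognosis
CN110452985A (zh) * 2019-08-02 2019-11-15 苏州宏元生物科技有限公司 Use of a set of chromosomal instability variants in preparing a reagent or kit for diagnosing liver cancer and evaluating prognosis
CN110408702A (zh) * 2019-08-02 2019-11-05 苏州宏元生物科技有限公司 Use of a set of chromosomal instability variants in preparing a reagent or kit for diagnosing breast cancer and evaluating prognosis
CN110317877A (zh) * 2019-08-02 2019-10-11 苏州宏元生物科技有限公司 Use of a set of chromosomal instability variants in preparing a reagent or kit for diagnosing urothelial carcinoma and evaluating prognosis
CN113584600A (zh) * 2021-08-11 2021-11-02 翌圣生物科技(上海)股份有限公司 Single-stranded DNA library construction method for whole-genome methylation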

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006137733A1 (en) * 2005-06-23 2006-12-28 Keygene N.V. Strategies for high throughput identification and detection of polymorphisms

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2121983A2 (en) * 2007-02-02 2009-11-25 Illumina Cambridge Limited Methods for indexing samples and sequencing multiple nucleotide templates
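CN100564618C (zh) * 2007-06-13 2009-12-02 北京万达因生物医学技术有限责任公司 Parallel detection method by molecular replacement tag sequencing, i.e., microsphere array analysis of an oligonucleotide code tag molecular library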

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006137733A1 (en) * 2005-06-23 2006-12-28 Keygene N.V. Strategies for high throughput identification and detection of polymorphisms

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WILHELM J.ANSORGE.: "Next-generation DNA sequencing techniques.", NEW BIOTECHNOLOGY., vol. 25, no. 4, April 2009 (2009-04-01), pages 195 - 203 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014047561A1 (en) * 2012-09-21 2014-03-27 The Broad Institute Inc. Compositions and methods for labeling of agents
EP2898071A1 (en) * 2012-09-21 2015-07-29 The Broad Institute, Inc. Compositions and methods for long insert, paired end libraries of nucleic acids in emulsion droplets
EP2898071A4 (en) * 2012-09-21 2016-07-20 Broad Inst Inc Compositions and methods for long insert, paired end libraries of nucleic acids in emulsion droplets
US10738299B2 (en) 2012-09-21 2020-08-11 The Broad Institute, Inc. Compositions and methods for labeling of agents
US11643650B2 (en) 2012-09-21 2023-05-09 The Broad Institute, Inc. Compositions and methods for labeling of agents
WO2014093825A1 (en) * 2012-12-14 2014-06-19 Chronix Biomedical Personalized biomarkers for cancer
US9909186B2 (en) 2012-12-14 2018-03-06 Chronix Biomedical Personalized biomarkers for cancer
WO2014143158A1 (en) * 2013-03-13 2014-09-18 The Broad Institute, Inc. Compositions and methods for labeling of agents
EP4008796A1 (en) * 2017-01-27 2022-06-08 Integrated DNA Technologies, Inc. Construction of next generation sequencing (NGS) libraries using competitive strand displacement
US11692219B2 (en) 2017-01-27 2023-07-04 Integrated Dna Technologies, Inc. Construction of next generation sequencing (NGS) libraries using competitive strand displacement

Also Published As

Publication number Publication date
CN102409048B (zh) 2013-10-23
HK1168630A1 (en) 2013-01-04
CN102409048A (zh) 2012-04-11

Similar Documents

Publication Publication Date Title
WO2012037882A1 (zh) DNA tag and application thereof
WO2012037876A1 (zh) DNA tag and application thereof
WO2012037880A1 (zh) DNA tag and application thereof
EP2427569B1 (en) The use of class IIB restriction endonucleases in 2nd generation sequencing applications
US20210363570A1 (en) Method for increasing throughput of single molecule sequencing by concatenating short DNA fragments
WO2012037877A1 (zh) DNA tag and application thereof
AU2021204166B2 (en) Reagents, kits and methods for molecular barcoding
US8921076B2 (en) Method for genome complexity reduction and polymorphism detection
CA2892646A1 (en) Methods for targeted genomic analysis
TW201321518A (zh) Library preparation method for trace nucleic acid samples and application thereof
US20180223350A1 (en) Duplex adapters and duplex sequencing
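WO2012116661A1 (zh) DNA tag and application thereof
WO2012037884A1 (zh) DNA tag and application thereof
WO2012037883A1 (zh) Nucleic acid tag and application thereof
WO2012126398A1 (zh) DNA tag and use thereof
WO2018147438A1 (ja) PCR primer set for HLA genes and sequencing method using the same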
US20230017673A1 (en) Methods and Reagents for Molecular Barcoding
JP2020536525A (ja) Probe and method for enriching target regions by applying the probe to high-throughput sequencing
WO2012037875A1 (zh) DNA tag and application thereof
US20140336058A1 (en) Method and kit for characterizing rna in a composition
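WO2012037881A1 (zh) Nucleic acid tag and application thereof
WO2018113799A1 (zh) Method and kit for constructing reduced-representation genome library
WO2012037879A1 (zh) Nucleic acid tag and application thereof
WO2014086037A1 (zh) Method for constructing nucleic acid sequencing library and application thereof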
WO2010064040A1 (en) Method for use in polynucleotide sequencing

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 11826409

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN EP: public notification in the EP bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 05/08/2013)

122 EP: PCT application non-entry in European phase

Ref document number: 11826409

Country of ref document: EP

Kind code of ref document: A1