CN101351552A - Paired end sequencing - Google Patents

Publication number
CN101351552A
CN101351552A CNA2006800281812A CN200680028181A CN101351552A CN 101351552 A CN101351552 A CN 101351552A CN A2006800281812 A CNA2006800281812 A CN A2006800281812A CN 200680028181 A CN200680028181 A CN 200680028181A CN 101351552 A CN101351552 A CN 101351552A
Authority
CN
China
Prior art keywords
nucleic acid
target nucleic
connector
dna
dna construction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2006800281812A
Other languages
Chinese (zh)
Inventor
J·伯卡
Z·陈
M·埃格霍姆
B·C·戈德温
S·K·哈奇森
J·H·利蒙
G·J·萨基斯
J·F·西蒙斯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
454 Life Science Corp
Original Assignee
454 Life Science Corp

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention provides a method of preparing target nucleic acid fragments to produce a smaller nucleic acid that comprises the two ends of the target nucleic acid. Specifically, the invention provides cloning and DNA manipulation strategies to isolate the two ends of a large target nucleic acid into a single small DNA construct for rapid cloning, sequencing, or amplification.

Description

Paired end sequencing
Cross-reference to related applications
This application claims priority to U.S. Provisional Patent Application Serial Nos. 60/688,042, filed June 6, 2005; 60/717,964, filed September 16, 2005; and 60/771,818, filed February 8, 2006. The contents of these provisional applications are incorporated herein by reference.
Each application and patent mentioned herein, each document cited or referenced in each of those applications and patents (including during the prosecution of each issued patent; "application cited documents"), each U.S. and foreign application or patent corresponding to and/or claiming priority from any of those applications and patents, and each document cited or referenced in the application cited documents, is expressly incorporated herein by reference. More generally, documents or references are cited in this text, either in a reference list before the claims or in the text itself; each of these documents or references ("herein cited references"), and each document or reference cited in the herein cited references (including any manufacturer's specifications, instructions, etc.), is expressly incorporated herein by reference. Documents incorporated herein by reference may be used in the practice of the invention.
Government interests
This invention was made with United States Government support under grant number R01HG003562 awarded by the NIH. The United States Government may have certain rights in the invention.
Field of the invention
The present invention relates to the fields of nucleic acid sequencing, genome sequencing, and the assembly of sequencing results into contiguous sequence.
Background of the invention
One method of sequencing a large target nucleic acid (such as the human genome) is shotgun sequencing. In shotgun sequencing, the target nucleic acid is fragmented or subcloned to produce a series of overlapping nucleic acid fragments, and the sequences of these fragments are determined. The complete sequence of the target nucleic acid can then be constructed from the overlaps and the sequence information of each fragment.
A disadvantage of shotgun sequencing is that assembly may be difficult if the target nucleic acid sequence contains numerous small repeated segments (tandem or inverted repeats). The inability to assemble the genome sequence in repeat regions leads to gaps in the assembled sequence. Therefore, after the initial assembly of the nucleotide sequence, the gaps in the sequence must be filled and the ambiguities in the assembly must be resolved.
One way to resolve these gaps is to sequence larger clones or fragments, because these larger fragments should be long enough to span the repeat regions. However, sequencing large nucleic acid fragments is more difficult and more time-consuming in current sequencing devices.
Another way to span gaps in the sequence is to determine the sequences of the two ends of a large fragment. In contrast to the single read from one end of a shotgun sequencing fragment, a pair of reads from the two ends has a known spacing and orientation. Using relatively long fragments also helps to assemble sequences that contain interspersed repeat elements. Methods of this type (Smith, M.W. et al., Nature Genetics 7:40-47 (1994)) are known in the art as paired end sequencing. The present invention includes novel methods, systems, and compositions that can be used for paired end sequencing and for other nucleic acid techniques.
Summary of the invention
One embodiment of the invention relates to a method of obtaining a DNA construct that comprises the two end regions of a target nucleic acid, where the target nucleic acid may be a large segment of an organism's genome. The method comprises the following steps:
(a) fragmenting a large nucleic acid molecule to produce a target nucleic acid;
(b) ligating a capturing element to the target nucleic acid to form a first circular nucleic acid molecule;
(c) digesting the first circular nucleic acid with a restriction endonuclease that cuts the target nucleic acid but does not cut the capturing element, to produce a linear nucleic acid containing the two ends of the target nucleic acid separated by the capturing element;
(d) ligating the linear nucleic acid to a dividing element to form a second circular nucleic acid;
(e) converting the second circular nucleic acid into a circular single-stranded nucleic acid;
(f) annealing a first oligonucleotide to the circular single-stranded nucleic acid and amplifying the circular single-stranded nucleic acid by rolling circle amplification to produce a single-stranded rolling circle amplification product;
(g) annealing a second oligonucleotide to the single-stranded rolling circle amplification product to form a plurality of double-stranded regions in the product; and
(h) digesting the single-stranded rolling circle amplification product into small fragments with a restriction endonuclease that cuts the plurality of double-stranded regions, to produce DNA constructs containing the two end regions of the target nucleic acid.
Another embodiment of the invention relates to a second method of obtaining a DNA construct containing the two end regions of a target nucleic acid. The method comprises the following steps:
(a) fragmenting a large nucleic acid molecule to produce a target nucleic acid;
(b) ligating a connector to each end of the target nucleic acid;
(c) ligating a feature tag to the target nucleic acid to form a circular nucleic acid molecule; and
(d) digesting the circular nucleic acid with a restriction endonuclease that cuts the target nucleic acid but does not cut the connector or the feature tag, to produce a DNA construct containing the two end regions of the target nucleic acid.
The methods of the invention can be performed on a large number of target DNA fragments simultaneously to produce a library of DNA constructs containing the ends of large DNA fragments. An advantage of the invention is that the library can be constructed in vitro, without the use of prokaryotic or eukaryotic host cells.
These and other embodiments are disclosed by, are apparent from, and are encompassed by the following detailed description.
Brief description of the drawings
The following detailed description, given by way of example but not intended to limit the invention to the specific embodiments described, may be understood in conjunction with the accompanying drawings, which are incorporated herein by reference, in which:
Fig. 1 is a schematic of one embodiment of the paired end sequencing strategy. The numerical labels indicate the origin of the nucleic acids. "101" denotes one flanking region of the capturing element, shown for example on the left side of Fig. 3A. "102" denotes the second flanking region of the capturing element, shown for example on the right side of Fig. 3A. "103" denotes the capturing element. "104" denotes the fragmented (and optionally size-fractionated) starting nucleic acid. "105" denotes the dividing element. "106" denotes the polymerase.
Fig. 2 is a schematic of a second embodiment of the paired end sequencing strategy.
Fig. 3 shows the sequences and design of the capture fragments. The sequences are identified as follows:
Paired end capture fragment product: SEQ ID NO:1
Oligonucleotide 1: SEQ ID NO:2
Oligonucleotide 2: SEQ ID NO:3
Oligonucleotide 3: SEQ ID NO:4
Oligonucleotide 4: SEQ ID NO:5
Paired end capture fragment product (type IIS, MmeI): SEQ ID NO:6
Short connector paired end capture fragment: SEQ ID NO:7
Short connector paired end capture fragment (type IIS, MmeI): SEQ ID NO:8
Fig. 4 shows one embodiment of an RE fragment.
Fig. 5 shows another embodiment of an RE fragment.
Fig. 6 shows a paired end read method that uses hairpin connectors. The hairpin connector has the following sequence:
[Sequence shown as an image in the original: Figure A20068002818100131]
(SEQ ID NO:27)
The hairpin connector is one continuous nucleotide sequence; the diagram above separates it into four regions. From left to right, the four regions are the hairpin region, a restriction endonuclease recognition site, a biotinylated region, and a type IIS restriction endonuclease recognition site. "601" denotes a hairpin connector. "603" denotes genomic DNA. "Met" denotes methylated DNA. "602" denotes a hairpin connector dimer. "604" denotes a hairpin connector cut by a restriction endonuclease. "605" denotes two hairpin connectors that have been cut by a restriction endonuclease and then ligated again. "SA" denotes a streptavidin bead. "Bio" denotes biotin (for example, biotinylated DNA).
Fig. 7 shows a modification of the paired end method.
Fig. 8 shows a paired end read method that uses overhang connectors.
Fig. 9 shows "tag-primed" double-ended sequencing, a method of sequencing the products of the invention.
Fig. 10 shows circularization by connector ligation.
Fig. 11 shows ssDNA-based circularization.
Fig. 12 is a schematic of another embodiment of the paired end sequencing strategy: paired-read PET random fragmentation. SPRI refers to solid phase reversible immobilization.
Fig. 13 shows paired-read PET random fragmentation sequencing data obtained by sequencing E. coli K12.
Fig. 14 shows various methods of cutting double-stranded DNA with E. coli endonuclease V. The boxed nucleotide "I" denotes deoxyinosine.
Fig. 14A shows a method in which the nucleotide sequence of the double-stranded DNA directs double-strand cutting by E. coli endonuclease V so as to produce palindromic 3' single-stranded overhangs. Note that the 3' single-stranded overhangs contain deoxyinosine residues.
Fig. 14B shows a method in which the nucleotide sequence of the double-stranded DNA directs double-strand cutting by E. coli endonuclease V so as to produce non-palindromic 3' single-stranded overhangs. Note that the 3' single-stranded overhangs contain deoxyinosine residues.
Fig. 14C shows a method in which the nucleotide sequence of the double-stranded DNA directs double-strand cutting by E. coli endonuclease V so as to produce palindromic 5' single-stranded overhangs. Note that the 5' single-stranded overhangs do not contain deoxyinosine residues.
Fig. 14D shows a method in which the nucleotide sequence of the double-stranded DNA directs double-strand cutting by E. coli endonuclease V so as to produce non-palindromic 5' single-stranded overhangs. Note that the 5' single-stranded overhangs do not contain deoxyinosine residues.
Fig. 14E shows a method in which the nucleotide sequence of the double-stranded DNA directs double-strand cutting by E. coli endonuclease V so as to produce blunt ends.
Fig. 15 is a schematic of another embodiment of the paired end sequencing strategy, which uses E. coli endonuclease V to make double-strand cuts in hairpin connectors that contain deoxyinosine on opposite strands (deoxyinosine hairpin connectors).
Fig. 16 shows the distribution of paired-read distances obtained by sequencing E. coli K12 genomic DNA with the deoxyinosine hairpin connector method shown in Fig. 15.
Fig. 17 is a schematic of another embodiment of the paired end sequencing method of the invention. The nucleotide sequences of the hairpin connector, the paired end connectors ("A" and "B"), and the PCR primers "F-PCR" and "R-PCR" are shown in Fig. 18. Each paired end connector has double-stranded and single-stranded portions, as shown in Fig. 18. "Bio" denotes biotin. "Met" denotes a methylated base. "SA-bead" denotes a streptavidin-coated microparticle. "EcoRI" and "MmeI" denote the recognition sites of the restriction endonucleases EcoRI and MmeI, respectively.
Fig. 18 shows the nucleotide sequences and modifications of the connector and primer oligonucleotides shown in Fig. 17. Fig. 18A shows the hairpin connector sequence. "iBiodT" denotes an internal biotin-labeled deoxythymidine. "Bio" denotes biotin. "EcoRI" and "MmeI" denote the recognition sites of the restriction endonucleases EcoRI and MmeI, respectively. Fig. 18B shows the paired end connector and PCR primer nucleotide sequences. Each paired end connector ("A" and "B") is produced by annealing two single-stranded oligonucleotides: "A top" and "A bottom", or "B top" and "B bottom". The 5' ends of the polynucleotide sequences shown in Fig. 18B are not phosphorylated.
Fig. 19 is a schematic of one embodiment of a method for performing polynucleotide ligation in a water-in-oil emulsion.
Fig. 20 shows the depth of coverage of E. coli K12 genomic DNA achieved with paired end sequencing data obtained with or without carrier DNA containing MmeI sites.
Detailed description of the invention
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although numerous methods and materials similar or equivalent to those described herein can be used to practice the invention, the preferred materials and methods are described herein.
The present invention relates to a rapid and cost-effective method of isolating and sequencing the two ends of a large nucleic acid fragment. The method is fast and amenable to automation, and it allows large DNA fragments to be sequenced and linked.
Paired end sequencing retains numerous important advantages over conventional clone-by-clone shotgun sequencing and is in fact complementary to it. The most important of these advantages is the ability to rapidly generate large genome scaffolds even when the genome contains interspersed repeat elements. The methods of the invention can be used to produce a library of DNA fragments in which each fragment comprises the ends of a larger DNA fragment.
First method
In one embodiment, paired end sequencing can be performed by the following steps:
Step 1A
The starting material can be any nucleic acid, including, for example, genomic DNA, cDNA, RNA, PCR products, and episomes. Although the methods of the invention are particularly effective for starting material consisting of long nucleic acids, the invention can also be applied to small nucleic acids such as cosmids, plasmids, small PCR products, and mitochondrial DNA.
The DNA can be from any source. For example, the DNA can be from the genome of an organism whose DNA sequence is unknown or incompletely known. As another example, the DNA can be from the genome of an organism whose DNA sequence is known. Sequencing the DNA of a known genome allows researchers to collect data on genome polymorphisms and to correlate genotypes with disease.
The nucleic acid starting material can be of a known size or a known size range. For example, the starting material can be a cDNA library or a genomic library whose average insert size and size distribution are known.
Alternatively, the nucleic acid starting material can be fragmented (Fig. 1A) by any of numerous common methods, including nebulization; sonication; HydroShear; enzymatic cleavage (for example, DNase treatment, including restriction DNase treatment, RNase treatment (including restriction RNase treatment), and digestion with restriction endonucleases); use of a pre-fragmented library (for example, a cDNA library); chemically induced (for example, NaOH) fragmentation; heat-induced fragmentation; and transposon-mediated mutagenesis, which can introduce cleavage sites, such as restriction endonuclease cleavage sites, throughout the DNA sample. See Goryshin I.Y. and Reznikoff W.S., J Biol Chem. 1998 Mar 27; 273(13):7367-74; Reznikoff W.S. et al., Methods Mol Biol. 2004; 260:83-96; Oscar R. et al., Journal of Bacteriology, April 2001, pp. 2384-2388, vol. 183, no. 7; Pelicic, V. et al., Journal of Bacteriology, October 2000, pp. 5391-5398, vol. 182.
Some fragmentation methods, such as nebulization, can produce a population of target DNA fragments whose sizes differ by only a factor of two. Other fragmentation methods, such as restriction enzyme digestion, produce a wide range of fragment sizes. If large nucleic acid fragments are desired, other methods such as HydroShearing may be preferred. In HydroShearing (Genomic Solutions, Ann Arbor, Mich., USA), DNA in solution is passed through a tube with an abrupt contraction. As the fluid approaches the contraction, it accelerates to maintain the volumetric flow rate through the smaller cross-section. During this acceleration, drag forces stretch the DNA until it snaps. The DNA continues to fragment until the pieces are too short for the shearing forces to break their chemical bonds. The flow rate of the fluid and the size of the contraction determine the final DNA fragment size. Other methods of preparing nucleic acid starting material can be found in International Patent Application No. WO 04/070007, which is incorporated herein by reference in its entirety.
Depending on the fragmentation method used, the ends of the DNA fragments may need polishing. That is, the ends of the double-stranded DNA may need to be treated to make them blunt and suitable for ligation. This step varies with the fragmentation method, in ways known in the art. For example, mechanically sheared DNA can be polished with Bal31 to trim overhanging sequences, and the ends can be filled in with a polymerase such as Klenow or T4 polymerase and dNTPs to produce blunt ends.
Step 1B
When fragment sizes vary more than desired, the nucleic acid fragments can be size-fractionated to reduce this variation.
Size fractionation is an optional step that can be performed by numerous methods known in the art. Methods of size fractionation include gel methods such as pulsed-field gel electrophoresis, sedimentation through a sucrose or cesium chloride gradient, and size exclusion chromatography (gel permeation chromatography). The size range selected depends on the length of the region that the paired end sequencing is to span.
A preferred technique for size fractionation is gel electrophoresis (see Fig. 1B). In a preferred embodiment, the size-fractionated DNA fragments have sizes within 25% of one another. For example, a 5 kb size fraction should contain fragments of 5 kb ± 1 kb (that is, 4 kb to 6 kb), and a 50 kb size fraction should contain fragments of 50 kb ± 10 kb (that is, 40 kb to 60 kb).
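By way of illustration only, the size windows in the examples above can be computed as follows. This is a minimal sketch: the function names and the default tolerance of 0.20 (chosen because it reproduces the 5 kb ± 1 kb and 50 kb ± 10 kb examples) are assumptions and are not part of the disclosure.

```python
def size_window(target_bp, tolerance=0.20):
    """Return the (low, high) size window in base pairs around a target size.

    tolerance=0.20 is an assumed value that reproduces the worked examples
    in the text (5 kb +/- 1 kb and 50 kb +/- 10 kb).
    """
    low = round(target_bp * (1 - tolerance))
    high = round(target_bp * (1 + tolerance))
    return low, high


def select_fragments(lengths_bp, target_bp, tolerance=0.20):
    """Keep only the fragment lengths that fall inside the size window."""
    low, high = size_window(target_bp, tolerance)
    return [n for n in lengths_bp if low <= n <= high]


# Hypothetical fragment lengths from a sheared sample.
lengths = [2500, 4100, 5000, 5900, 7200]
selected = select_fragments(lengths, 5000)
```

In this sketch the selection is a simple numeric filter; in practice the window is set by the gel region that is excised.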
Step 1C
In this step, a "capturing element" is prepared. The capturing element is a linear double-stranded nucleic acid that can have single-stranded or double-stranded ends for ligation to the nucleic acid fragments of the previous step. The capturing element can be amplified as part of a circular nucleic acid (for example, the plasmid shown in Fig. 1C) that contains forward and reverse connector ends (shown as the thick regions of the circle in Fig. 1C). This circular plasmid can be cut before the capturing element is used. The connector ends contain nucleotide sequences that can serve in later steps as hybridization sites for potential PCR primers and sequencing primers.
The capturing element can contain additional elements between the two connector ends, such as restriction endonuclease recognition and/or cleavage sites, antibiotic resistance markers, prokaryotic or eukaryotic origins of replication, or combinations of these elements. Examples of antibiotic resistance markers include, but are not limited to, genes conferring resistance to ampicillin, tetracycline, neomycin, kanamycin, streptomycin, bleomycin, zeocin, chloramphenicol, and the like. Prokaryotic origins of replication can include OriC, OriV, and the like. Eukaryotic origins of replication can include, but are not limited to, autonomously replicating sequences (ARS). In addition, the capturing element can contain restriction endonuclease recognition and/or cleavage sites (preferably unique and rare sites) that can be used later (step L) to digest the nucleic acid product into small fragments for amplification (by PCR). The capturing element can also contain a label or tag, such as biotin, to facilitate purification or enrichment of the paired end sequencing nucleic acids.
Step 1D
The capturing element is linearized using known techniques, such as restriction endonuclease digestion (blunt or sticky ends can be used for the preparation of different fragments; see below and Fig. 1D). To prevent concatemer formation (that is, ligation of multiple capturing elements to one another), the capturing element can be dephosphorylated or modified with a topoisomerase for TA cloning.
Step 1E
The capturing element is ligated to the fragments (or size-fractionated fragments) from step A or B to form a circular nucleic acid containing one capturing element and one target DNA fragment (Fig. 1E). The capturing element and the target DNA are joined by well-known methods, for example ligation with DNA ligase or ligation by a topoisomerase cloning strategy.
Step 1F
The previous step yields a population of capturing elements ligated to DNA fragments that can be quite large. This step removes the large internal region of the target DNA fragment to produce a cloned insert of a size that may be better suited to automated DNA sequencing (Fig. 1F).
In this step, the captured genomic DNA (that is, the circular nucleic acid produced in step E) is digested with a restriction endonuclease that can have one or more cleavage sites in the genomic DNA. In general, any restriction endonuclease can be used for the "internal cut" as long as it does not cut within the capturing element. An internal cut is a cut that lies within the target DNA but not within the capturing element. The restriction enzyme for internal cutting and the capturing element can be matched so that the capturing element does not contain a cleavage site for the selected enzyme. Restriction endonucleases and their use are well known in the art and can readily be applied to the present method. In addition, a combination of several restriction enzymes, each restricted to internal cutting, can be used to further reduce the size of the target DNA fragment.
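By way of illustration only, the requirement that the internal-cutting enzyme have no site in the capturing element can be checked in software. The sketch below assumes a hypothetical capturing element sequence and a small hand-written table of three well-known recognition sequences; the function names are likewise illustrative and not part of the disclosure.

```python
# Recognition sequences of three common restriction endonucleases.
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def has_site(site, dna):
    """True if the recognition site occurs on either strand of a linear DNA."""
    dna = dna.upper()
    return site in dna or reverse_complement(site) in dna


def internal_cutters(enzymes, capture_element):
    """Names of enzymes with no recognition site in the capturing element."""
    return [name for name, site in enzymes.items()
            if not has_site(site, capture_element)]


# Hypothetical capturing element containing BamHI and HindIII sites only.
capture_element = "ACGTGGATCCTTAGCAAGCTTCGA"
candidates = internal_cutters(ENZYMES, capture_element)
```

An enzyme returned by `internal_cutters` still has to cut the target DNA to be useful; this sketch checks only the exclusion condition stated above.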
In a preferred embodiment, one or more of these restriction endonucleases cut the genomic DNA to within 50-150 bases of the capturing element.
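The internal digestion of this step can be sketched as a string operation. This is a simplified model under stated assumptions: the circle is represented as the capturing element followed by the target, the cut is modeled as removing the whole recognition site rather than cutting at the enzyme's true position within it, and all sequences and function names are hypothetical.

```python
def digest_circle(capture, target, site):
    """Model internal digestion of a circle made of capture + target.

    Returns (right_end, capture, left_end): the linear piece that keeps the
    capturing element, flanked by the two ends of the target. Returns None
    if the enzyme has no site in the target. Simplification: the recognition
    site itself is dropped instead of being split at the real cut position.
    """
    first = target.find(site)
    if first == -1:
        return None
    last = target.rfind(site)
    left_end = target[:first]                # target start up to the first site
    right_end = target[last + len(site):]    # after the last site to target end
    return right_end, capture, left_end


def flanks_in_range(right_end, left_end, low=50, high=150):
    """Check that both retained target ends are 50-150 bases long."""
    return all(low <= len(end) <= high for end in (right_end, left_end))


# Hypothetical sequences: 60-base and 120-base ends around a 500-base interior.
capture = "G" * 40
target = "A" * 60 + "GAATTC" + "C" * 500 + "GAATTC" + "T" * 120
right_end, kept_capture, left_end = digest_circle(capture, target, "GAATTC")
linear_product = right_end + kept_capture + left_end
```

Because the circle joins the end of the target back to the capturing element, the retained piece reads right end, capturing element, left end, which is why the two target ends flank the capturing element in the linear product.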
Step 1G
In this step, a "dividing element", which is a double-stranded nucleic acid of known sequence, is ligated between the ends of the digested genomic material from the previous step to form a circular nucleic acid (Fig. 1G). The dividing element serves two purposes. First, it can contain a priming site for rolling circle amplification of the small circle (see step I below). Second, because its sequence is known, it can serve as a marker for the ends of the paired genomic ends (allowing the ligated ends to be trimmed and making software analysis of the ligated ends easier). That is, during subsequent sequencing of the genomic fragment, the sequence of the dividing element can signal that the entire genomic fragment has been sequenced. The dividing element can also contain additional elements, such as restriction endonuclease recognition and/or cleavage sites, antibiotic resistance markers, prokaryotic or eukaryotic origins of replication, or combinations of these elements. Although elements such as antibiotic resistance markers and origins of replication are optional, one advantage of the methods of the invention is that they do not require host cells (for example, E. coli) for cloning, amplification, or other manipulation of the nucleic acids. The dividing element can also be biotinylated, or labeled with a marker or tag, to facilitate purification or enrichment of the nucleic acids used for paired end sequencing.
Step 1H
The circular nucleic acid produced in the previous step (the small circle) is made single-stranded to produce single-stranded nucleic acid. This is done with standard DNA denaturation techniques, by changing the salt concentration, temperature, or pH of the solution. Other DNA denaturation techniques are known to those skilled in the art. After denaturation, the DNA circles from the same small circle may remain interlinked, but this does not affect the methods of the invention (Fig. 1H).
Step 1I
A primer is annealed to the dividing element, which contains a sequence to which the primer can anneal. The dividing element thus serves as the initiation site for rolling circle amplification (Fig. 1I).
Step 1J
The sample is amplified by rolling circle amplification to produce a long single-stranded product (Fig. 1J). One advantage of this rolling circle amplification step is that elements lacking a dividing element are not amplified, and elements that are not closed circles amplify poorly.
Step 1K
One or more cap oligonucleotides are added and annealed to the single-stranded restriction sites that flank the forward and reverse connectors, making these regions double-stranded (Fig. 1L). The cap oligonucleotides can be complementary to at least a portion of the capturing element, to at least a portion of the connector regions, or to both.
Step 1L
The capped single-stranded DNA is cut at the capped sites into small fragments (Fig. 1M). These small fragments have ends of known sequence and can easily be amplified with conventional amplification techniques such as PCR.
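Steps 1J to 1L can be sketched as follows. This is a minimal model under stated assumptions: the rolling circle product is represented as tandem copies of one unit written as it appears in the product strand, every occurrence of the site is assumed to be covered by a cap oligonucleotide (and therefore cuttable), and the unit sequence, the single site per unit, and the function names are hypothetical.

```python
def rolling_circle_product(unit, copies):
    """Model the long single-stranded product as tandem copies of one unit."""
    return unit * copies


def cut_capped_sites(strand, site, offset):
    """Cut a strand at every occurrence of a capped restriction site.

    offset is the cut position within the site (1 for EcoRI, G^AATTC).
    Assumes every site is made double-stranded by a cap oligonucleotide.
    """
    cut_positions = []
    index = strand.find(site)
    while index != -1:
        cut_positions.append(index + offset)
        index = strand.find(site, index + 1)

    pieces, previous = [], 0
    for position in cut_positions:
        pieces.append(strand[previous:position])
        previous = position
    pieces.append(strand[previous:])
    return pieces


# Hypothetical 14-base unit containing one EcoRI site.
unit = "ACGTGAATTCTTGA"
product = rolling_circle_product(unit, 3)
pieces = cut_capped_sites(product, "GAATTC", 1)
internal = pieces[1:-1]
```

In this model each internal piece is one unit long and carries the same known sequence at its ends, which is the property that makes the fragments easy to amplify by PCR.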
Second method
In a second embodiment, paired end sequencing can be performed according to the following steps:
Step 2A-Fragmentation of sample DNA
Fragmentation and size fractionation of the target nucleic acid are the same as in the previous embodiment.
Step 2B-Methylation and end polishing
If desired, the fragmented target nucleic acid can be methylated with any methylase. The preferred methylase is one that affects restriction endonuclease digestion. Methylases can be used in at least two different strategies. In one preferred embodiment, methylation allows cutting by a restriction endonuclease that cuts only at methylated restriction sites. In another preferred embodiment, methylation prevents cutting by a restriction endonuclease that cuts only unmethylated DNA.
The end polishing step is the same as described for the first method.
Step 2C-Ligation of tag connectors
In this step, connectors are ligated to the ends of the target nucleic acid fragment (Fig. 2I) to produce a fragment with a connector at each end. The connectors can be of any size, but are preferably 10-30 bases, more preferably 12-15 bases. To prevent the formation of concatemers of connectors and/or target nucleic acid fragments, each connector can have one blunt end and one non-matching sticky end (that is, an end with a 5' or 3' overhang). After the connectors have been ligated to the DNA fragments and the ligase has been removed, the sticky ends can be filled in with polymerase and dNTPs.
The connectors of this section can be capture fragments. Examples of capture fragments are shown in Figs. 4 and 5.
To prevent concatemer formation, the connector can be a hairpin connector (Fig. 6A). Hairpin connectors (for example, Fig. 6) prevent concatemer formation because they cannot form any multimer larger than a dimer. Another way to prevent concatemers is to use connectors in which the 5' end of one or both strands is not phosphorylated.
Other connectors that can be used include non-phosphorylated connectors, which have the advantage of requiring fewer processing steps, although they also require a phosphorylation step with a kinase.
As discussed elsewhere in this disclosure, the connectors can be methylated, biotinylated, or both.
Step 2D-Exonuclease digestion and gel purification
DNA fragments ligated to two hairpin connectors can be purified with an exonuclease. This exonuclease purification relies on the fact that double-stranded DNA with a hairpin connector ligated to each end is a DNA molecule with no exposed 5' or 3' end. The other DNA in the ligation mixture, such as double-stranded DNA fragments ligated to only one hairpin connector, unligated DNA fragments, and unligated connectors, is sensitive to exonuclease (Fig. 6B). Exposing the ligation mixture to exonuclease therefore removes most of the DNA, except for DNA fragments ligated to two hairpin connectors and hairpin connector dimers. Because hairpin connector dimers are much smaller than the DNA fragments, they can be removed with known techniques, such as size fractionation columns (for example, spin columns), agarose or polyacrylamide gel electrophoresis, or any of the other polynucleotide size discrimination methods known in the art and/or discussed elsewhere in this disclosure.
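The selection logic of this step can be sketched as a two-stage filter. This is an illustrative model only: the `Molecule` record, the lengths, and the 500 bp size cutoff are hypothetical values introduced for the example and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    name: str
    length_bp: int
    hairpin_ends: int  # number of ends closed by a hairpin connector (0, 1, or 2)


def resists_exonuclease(molecule):
    """A molecule with both ends closed has no exposed 5' or 3' end."""
    return molecule.hairpin_ends == 2


def purify(mixture, min_length_bp):
    """Exonuclease digestion followed by size selection."""
    protected = [m for m in mixture if resists_exonuclease(m)]
    return [m for m in protected if m.length_bp >= min_length_bp]


# Hypothetical ligation mixture.
mixture = [
    Molecule("fragment + 2 hairpins", 3000, 2),
    Molecule("fragment + 1 hairpin", 3000, 1),
    Molecule("unligated fragment", 3000, 0),
    Molecule("free hairpin connector", 40, 1),
    Molecule("hairpin connector dimer", 80, 2),
]
survivors = [m for m in mixture if resists_exonuclease(m)]
purified = purify(mixture, min_length_bp=500)
```

The hairpin connector dimer survives the exonuclease because both of its ends are closed, which is why the size selection stage is needed as well.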
In one embodiment, the connector can be biotinylated to facilitate isolation/enrichment of the tagged fragments.
In another embodiment, fragments containing the connector can be purified by annealing them to a capture oligonucleotide complementary to the tag sequence.
Step 2E - Preparation of fragments for circularization
After connectors have been added to both ends of the target nucleic acid fragment, the fragment is circularized.
To prepare the target nucleic acid for self-circularization, it may be desirable, for various reasons, to cut within the connector region. For example, if hairpin connectors are used, the DNA fragment cannot self-circularize because it has no free 5' or 3' end. As another example, if the connectors leave the DNA fragment blunt-ended, cutting can give the connectors 5' or 3' overhangs, and these overhangs (so-called "sticky ends") greatly improve ligation efficiency. Moreover, digestion within the connector region should allow selection of DNA fragments carrying two connectors, one ligated at each end. This is because the connectors can be digested with a restriction endonuclease so that matching sticky ends are left. After cutting, a DNA fragment with only one connector (unwanted material) should have one sticky end in the connector region and one blunt end, and should self-circularize poorly. Therefore, only DNA fragments with a connector at both ends should circularize.
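The reasoning above can be restated as a small Python sketch. It assumes, for illustration only, that the fragment itself is blunt-ended and that cutting in the connector region always leaves a sticky end; the function names are hypothetical.

```python
def ends_after_connector_digest(connectors_ligated):
    """End types of a blunt-ended fragment after cutting in the connector region.

    connectors_ligated is the number of fragment ends carrying a connector
    (0, 1 or 2). Each ligated connector is assumed to be cut to a sticky end.
    """
    if connectors_ligated not in (0, 1, 2):
        raise ValueError("a linear fragment has exactly two ends")
    return ("sticky",) * connectors_ligated + ("blunt",) * (2 - connectors_ligated)


def favoured_for_circularization(connectors_ligated):
    # Only two matching sticky ends are expected to self-ligate efficiently.
    return ends_after_connector_digest(connectors_ligated) == ("sticky", "sticky")


for n in (0, 1, 2):
    print(n, ends_after_connector_digest(n), favoured_for_circularization(n))
```

The sketch treats the outcome as all-or-nothing, whereas the text only says that fragments with a single connector should circularize with difficulty.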
Restriction cutting of the connector can be accomplished in a number of ways. In one method, the connector is methylated and ligated to unmethylated DNA. The construct is then digested with a restriction endonuclease that cuts only methylated DNA. Because only the connector is methylated, only the connector is cut.
In another method, the DNA fragment is methylated and the connector is not. Cutting with a restriction endonuclease that recognizes and cuts only unmethylated DNA then restricts cutting to the connector. This can be accomplished by using methylated starting DNA or by in vitro methylation.
It should be understood that in some cases the connector need not be digested. For example, if the fragments from the previous step have only blunt ends, digestion of the connector is optional.
It should also be understood that the DNA fragments can be treated to facilitate ligation/circularization. For example, if the connector is blocked or lacks a 5' phosphate, the blocking group can be removed or a phosphate can be added so that the fragment is readily ligated.
Step 2F - Ligation of ends to form circularized fragments
Circularization can be carried out by numerous methods.
In one embodiment, ligase is added to the reaction mixture with a suitable ligase buffer so that the DNA fragments self-circularize.
In one embodiment, ligation is carried out at a dilute DNA concentration to promote self-ligation and hinder concatemer formation.
In another embodiment, ligation is carried out in a water-in-oil emulsion in which each aqueous droplet contains about one fragment to be circularized, as described elsewhere in this disclosure.
In one embodiment, a feature tag is ligated to the target nucleic acid fragment, and the fragment is self-circularized (see Fig. 2). The feature tag is a double-stranded nucleic acid sequence of 24-30 base pairs. This "feature tag" is similar to the "dividing element" of the previous embodiment in that it serves as a marker for the junction between the paired genomic ends (and allows the joined ends to be trimmed for easier software analysis). During subsequent sequencing of the genomic fragments, the sequence of the feature tag signals the boundary between the two ends of the target nucleic acid sequence.
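How software might use the feature tag as a boundary marker can be sketched as follows. This is a minimal illustration under stated assumptions: the 24-base tag sequence and the read are invented for the example, and the helper names are hypothetical. A real pipeline would also need to tolerate sequencing errors in the tag.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def split_at_tag(read, tag):
    """Split a read into the two paired ends flanking the feature tag.

    The tag is searched on both strands. Returns (left_end, right_end),
    or None when the read does not contain the tag.
    """
    for candidate in (tag, reverse_complement(tag)):
        start = read.find(candidate)
        if start != -1:
            return read[:start], read[start + len(candidate):]
    return None


# Hypothetical 24-base feature tag and read, for illustration only.
TAG = "ACGTTGCAAGCTTAGGCCTATCGA"
read = "GGGAAATTT" + TAG + "CCCTTTAAA"
print(split_at_tag(read, TAG))
```

Reads for which `split_at_tag` returns `None` correspond to the fragments that do not contain the feature tag.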
Step 2G
After the feature tag has been added and the fragment circularized, the target nucleic acid fragment is further digested and fragmented. Fragmentation can be carried out by any fragmentation method listed in this disclosure; see, for example, step 1A above. Alternatively, the target DNA can be digested with one or more restriction endonucleases to produce fragments.
In a preferred embodiment, the nucleic acid is fragmented with a nebulizer until the average fragment size is about 200-300 bp. As shown in Fig. 2, some of these fragments should contain the feature tag and others should not.
At this point, the nucleic acid fragments can be sequenced by standard techniques. Methods of sequencing nucleic acid fragments are known. A preferred sequencing method is described in international patent application No. WO 05/003375, filed January 28, 2004.
Step 2H
In an optional step, fragments containing the feature tag can be enriched from fragments lacking it. One enrichment method involves using a biotinylated feature tag in the sample preparation steps. After fragmentation, fragments containing the feature tag should be biotinylated and can be purified on a streptavidin column or with streptavidin beads in solution.
After enrichment, the nucleic acid fragments can be sequenced by standard techniques, including automated techniques such as those described in international patent application No. WO 05/003375, filed January 28, 2004.
The third method
Paired end sequencing can also be carried out by a third method.
Step 3A-3E
In this method, steps A to E can be carried out as in the second method (i.e. as steps 2A to 2E). In the third method, however, each connector contains a type IIS restriction endonuclease site that directs cutting of the DNA about 15-25 bp from the restriction endonuclease recognition site. Different type IIS restriction endonucleases are known to cut at different distances from their recognition sites, and the use of different type IIS restriction endonucleases to adjust this distance is contemplated.
Step 3F - Ligation of ends to form circularized fragments
Step 3F can be carried out as in the second method (step 2F), except that no feature tag is used (see Fig. 6D).
Optional enrichment step
In any method of the invention, an exonuclease can be used after ligation to remove uncircularized fragments and to reduce the presence of concatemerized fragments. Because a correctly circularized DNA fragment has no exposed 5' or 3' end, it is resistant to exonuclease digestion. In addition, larger concatemers are more likely to contain nicks and should therefore have a higher chance of exposing a 5' or 3' end. Exonuclease treatment should therefore also remove these nicked concatemers.
Optional rolling circle amplification
The circularized DNA can be amplified by rolling circle amplification. Briefly, an oligonucleotide is hybridized to one strand of the circularized DNA, and this oligonucleotide primer is extended by a polymerase. Because the template is circular, the polymerase produces a single-stranded concatemer containing multiple copies of the target DNA. This single-stranded concatemer can be made double-stranded by hybridizing a second primer to it and extending that second primer. For example, the second primer can be complementary to the connector sequence of the single-stranded concatemer. The resulting double-stranded concatemer can be used directly in the next step.
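The structure of the rolling circle product can be illustrated with a short Python sketch. It is a deliberate simplification: the circle and connector sequences are invented, the starting position of the first primer on the circle is ignored, and the product is modelled simply as repeated copies of the reverse complement of the template strand.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def rolling_circle_product(circle_strand, copies):
    """Single-stranded concatemer obtained by copying a circular template.

    Each trip of the polymerase around the circle adds one more copy of the
    strand complementary to the template.
    """
    return reverse_complement(circle_strand) * copies


# Hypothetical circle containing a hypothetical connector sequence.
CONNECTOR = "GGCATC"
circle = "TTTT" + CONNECTOR + "AAAA"
concatemer = rolling_circle_product(circle, copies=3)

# A second primer complementary to the connector region of the concatemer
# finds one binding site per copy.
second_primer_sites = concatemer.count(reverse_complement(CONNECTOR))
print(concatemer, second_primer_sites)
```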
Step 3G - Digestion/fragmentation of DNA
In this step, the circularized nucleic acid, or the concatemerized nucleic acid from rolling circle amplification, is digested with a type IIS restriction endonuclease (Fig. 6D). As stated for step 3A, each connector contains at least one type of type IIS restriction endonuclease cleavage site. The type IIS restriction endonuclease recognizes its site on the connector and cuts the nucleic acid about 10-20 base pairs away. Examples of type IIS restriction endonucleases include MmeI (about 20 bp), EcoP15I (25 bp) and BpmI (14 bp).
This step produces short DNA fragments (10-100 bp) that contain the two ends of the larger DNA fragment, with the connector region between the two ends (Fig. 6E). An alternative way to produce the same structure is to randomly fragment the circularized nucleic acid by any of the numerous DNA fragmentation methods described elsewhere in this disclosure (for example in step 1A). This allows fragments of any size (100 bp, 150 bp, 200 bp, 250 bp, 300 bp or more) to be prepared.
Either method also produces other DNA fragments that lack the connector region in the middle (Fig. 6E). However, because the connector region is biotinylated, DNA containing the connector region can be selectively purified using a solid support with affinity for biotin (e.g. streptavidin beads, avidin beads, BCCP beads, etc.).
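The effect of the type IIS cut distance on the length of the captured end can be sketched in Python. The cut distances are the approximate values quoted above; the recognition sequence passed in, the example sequence and the function name are assumptions made for illustration, and only the top strand is modelled.

```python
# Approximate cut distances (bp from the recognition site) quoted in the text.
CUT_DISTANCE_BP = {"MmeI": 20, "EcoP15I": 25, "BpmI": 14}


def captured_end(top_strand, recognition_site, enzyme):
    """Return the target bases between the recognition site and the cut.

    The recognition site is assumed to sit at the edge of the connector and
    to point into the adjacent target DNA. Returns None if the site is
    absent or the sequence is too short to reach the cut position.
    """
    site_start = top_strand.find(recognition_site)
    if site_start == -1:
        return None
    tag_start = site_start + len(recognition_site)
    tag_end = tag_start + CUT_DISTANCE_BP[enzyme]
    if tag_end > len(top_strand):
        return None
    return top_strand[tag_start:tag_end]


# Hypothetical construct: connector ending in an illustrative site, then target DNA.
construct = "AATTC" + "TCCAAC" + "A" * 20 + "G" * 10
print(captured_end(construct, "TCCAAC", "MmeI"))
print(captured_end(construct, "TCCAAC", "BpmI"))
```

Swapping the enzyme name changes only the length of the captured end, which is the adjustment the text contemplates when it mentions using different type IIS enzymes.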
Step 3H - Sequencing
The products of any method of the invention can be sequenced manually or by automated sequencing techniques. Manual sequencing by methods such as Sanger sequencing or Maxam-Gilbert sequencing is well known. Automated sequencing can be performed, for example, with an automated sequencer such as 454 Sequencing™, developed by 454 Life Sciences Corporation (Branford, Conn.). 454 Sequencing™ is also described in application WO 05/003375, filed January 28, 2004, and in co-pending U.S. patent applications USSN 10/767,779, filed January 28, 2004; USSN 60/476,602, filed June 6, 2003; USSN 60/476,504, filed June 6, 2003; USSN 60/443,471, filed January 29, 2003; USSN 60/476,313, filed January 6, 2003; USSN 60/476,592, filed June 6, 2003; USSN 60/465,071, filed April 23, 2003; and USSN 60/497,985, filed August 25, 2003.
Briefly, in an automated sequencing method such as that developed by 454 Life Sciences Corp., one sequencing connector (sequencing connector A) can be ligated to one end of the DNA fragment, and a second sequencing connector (sequencing connector B) can be ligated to the other end. After ligation, the DNA fragments can be purified away from any unligated sequencing connectors by binding the biotin to a solid support. The isolated nucleic acid fragments can be placed in individual reaction chambers and further amplified by PCR using primers specific for sequencing connector A and sequencing connector B. By attaching a biotin moiety to the A or B connector, single-stranded DNA consisting preferentially of A-B fragments can be isolated. The amplified nucleic acid can be sequenced with a sequencing primer specific for sequencing connector A or sequencing connector B, or with a sequencing primer specific for the connector between the two ends (e.g. the hairpin connector).
Once a plurality of these fragments containing the ends of larger DNA fragments has been prepared, they can be sequenced, and the paired-end sequence information can be assembled to produce a partial or complete genome sequence map.
The fourth method
Paired end sequencing can be carried out using a variant of the above methods, called paired-read PET random fragmentation, which is outlined in Figure 12. Experimental results obtained with this fourth method are shown in Figure 13.
Step 4A to 4E
In this method, steps A to D can be carried out as described in the second or third method (i.e. as steps 2A-2D or steps 3A-3D). Alternatively, step 4D can be carried out using SPRI (solid phase reversible immobilization) to purify the exonuclease-treated fragments. For example, the nucleic acid fragments in Figure 12 are ligated to a biotinylated primer and can be purified using, for example, beads coated with streptavidin, avidin, reduced-affinity streptavidin or reduced-affinity avidin.
Step 4E can be carried out as described in step 2E or step 3E.
Step 4F can be carried out as described in step 3F. Briefly, the linear DNA fragments produced in the previous step can be circularized by any known circularization method, as described in step 2F or step 3F above.
In addition, the optional enrichment step described above after step 3F can be carried out to enrich for circular nucleic acid. Briefly, uncircularized nucleic acid can be removed by an exonuclease that degrades nucleic acid with free ends. Covalently closed circular nucleic acid has no free ends and resists exonuclease attack. Exonuclease treatment should therefore enrich circular nucleic acid while removing linear nucleic acid.
Step 4G
After circularization, fragmentation can be carried out by any fragmentation method listed in this disclosure. A preferred method is to fragment the circular nucleic acid by mechanical shearing. Mechanical shearing can be performed, for example, by vortexing, by forcing the nucleic acid in solution through a small orifice, or by other similar methods described elsewhere in this disclosure. One advantage of mechanical shearing is that it produces nucleic acids of different lengths (see the nucleic acids after step G in Figure 12).
DNA fragments lacking the connector region in the middle are also produced; see Figure 12. However, because the connector region is biotinylated, DNA containing the connector region can be selectively purified using a solid or semi-solid support with affinity for biotin (e.g. streptavidin beads, avidin beads, BCCP beads, etc.).
Step 4H
The products of method 4 can be sequenced by any available manual or automated method. Such methods are detailed in step 3H above.
The paired-read PET random fragmentation described above and outlined in Figure 12 offers numerous advantages. First, method 4 gives higher confidence in assembly, because mechanical shearing can produce longer fragments, and longer fragments give longer reads. Longer reads allow the target sequence to be assembled with higher confidence. Second, the longer fragments produced by mechanical shearing make it possible to generate paired-end reads that cover longer nucleic acid regions. By covering longer nucleic acid regions, method 4 facilitates gap closure and is more likely to cover nucleic acid regions that are difficult to analyze, such as repeat regions or regions of high GC content. Method 4 thus offers improved gap closure performance. Third, because method 4 provides gap closure capability, it can be used on its own to sequence complete genomes, since each individual end can be used to build the assembly.
An example of the advantages of method 4 can be seen in Figure 13, which shows E. coli K12 genomic DNA sequenced by method 4. As the figure shows, this method can give a markedly longer read-length distribution, ranging from less than 50 to about 400. In addition, fragments of about 3 kb whose ends are sequenced can be produced. This shows that method 4 offers better gap closure performance than the other methods.
The fifth method
Paired end sequencing can be carried out using a variant of the above methods, as outlined in Figure 15.
In this method, the connector can be designed as a deoxyinosine hairpin connector, which incorporates deoxyinosine nucleotides (also referred to herein as inosine) on opposite strands of the double-stranded region of the hairpin. E. coli endonuclease V (EndoV) introduces a single-strand nick between the second and third nucleotides 3' of an inosine nucleotide (Yao M and Kow Y W, J Biol Chem. 1995, 270 (48): 28609-16; Yao M and Kow Y W, J Biol Chem. 1994, 269 (50): 31390-6; Yao M et al., Ann N Y Acad Sci. 1994, 726: 315-6; Yao M et al., J Biol Chem. 1994, 269 (23): 16260-8).
As shown in Figure 14, the relative positions of the inosines in the hairpin connector determine whether cutting of both strands by EndoV produces a 3' single-stranded overhang (Figure 14A and Figure 14B), a 5' single-stranded overhang (Figure 14C and Figure 14D) or a blunt end (no overhang) (Figure 14E). The sequence of the hairpin connector can also be designed so that EndoV cutting produces a non-palindromic (Figure 14A and B) or palindromic (Figure 14A and C) single-stranded overhang. It is well known in the art that deoxyinosine pairs with any of the four bases A, G, C and T and with itself (Watkins and SantaLucia, 2005, Nucleic Acids Res. 33 (19): 6258-67). In addition, the connector can contain a type IIS restriction endonuclease recognition site (e.g. MmeI) as discussed elsewhere in this disclosure.
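The EndoV nicking rule stated above (a nick between the second and third nucleotides 3' of the inosine) can be written as a small Python helper. This is a sketch on a single strand only; the letter `I` standing for inosine, the example sequence and the function name are assumptions made for illustration.

```python
def endov_nick(strand):
    """Split a 5'->3' strand at the EndoV nick.

    'I' marks the inosine. The nick falls between the second and third
    nucleotides on the 3' side of the inosine. Returns (upstream, downstream),
    or None if there is no inosine or no third nucleotide 3' of it.
    """
    inosine = strand.find("I")
    if inosine == -1:
        return None
    nick = inosine + 3  # index of the third nucleotide 3' of the inosine
    if nick >= len(strand):
        return None
    return strand[:nick], strand[nick:]


# Hypothetical strand written 5'->3'.
print(endov_nick("ACGTIACGT"))
```

Applying the same rule to each strand of the hairpin stem and comparing the two nick positions is what determines whether the resulting end is a 3' overhang, a 5' overhang or blunt.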
Step 5A (Figure 15 step A)
In this method, step A can be carried out essentially as described for step 1A. The target DNA can be fragmented by any physical or biochemical method known in the art, as described above. Optionally, the resulting fragments can be size-fractionated by any size-fractionation method described elsewhere in this disclosure.
Steps 5B and 5C (Figure 15 steps B and C)
The ends of the target DNA can be polished by any polishing method described herein and ligated to the deoxyinosine hairpin connector described above to form connector-tagged target DNA.
Step 5D (Figure 15 step D)
The ligation reaction can be treated with one or more exonucleases (as discussed elsewhere herein) and size-fractionated by any method described herein to enrich for the desired reaction product.
Step 5E (Figure 15 step E)
The connector-tagged target nucleic acid is cut with EndoV. The conditions for the cleavage reaction can be any of those described by Yao et al. (Yao M and Kow Y W, J Biol Chem. 1995, 270 (48): 28609-16; Yao M and Kow Y W, J Biol Chem. 1994, 269 (50): 31390-6; Yao M et al., Ann N Y Acad Sci. 1994, 726: 315-6; and Yao M et al., J Biol Chem. 1994, 269 (23): 16260-8). The skilled person will appreciate that similar conditions can also be used.
Steps 5F-H (Figure 15 steps F-H)
In this fifth method, steps F-H can be carried out as described in the second, third or fourth method (i.e. as steps 2F-H, steps 3F-H or steps 4F-H).
The deoxyinosine hairpin connector of the fifth method is advantageous because EndoV cuts only where the DNA contains inosine or certain lesions or base mismatches. The target nucleic acid will therefore not be cut by the EndoV treatment. Because the EndoV sites are unique to the connector, the target DNA does not need to be protected by methylation as in some of the embodiments above. Eliminating the methylation step saves time and removes the problems associated with incomplete methylation of the target DNA. Moreover, EndoV digestion is very fast compared with EcoRI digestion, which shortens the time needed to carry out the method.
An example of paired-read results obtained by the deoxyinosine hairpin connector method is shown in Figure 16. E. coli K12 genomic DNA was prepared and sequenced according to the fifth method (Figure 15). The mean distance between paired reads was 2070 bp (standard deviation = 594).
The sixth method
In other embodiments, paired-end sequencing can be carried out by a method comprising some or all of the following steps, as shown in Figures 17 and 18.
Step 6A - Fragmentation of target DNA (Figure 17A)
According to the sixth method, the polynucleotide molecules of a target DNA sample, such as genomic DNA, are fragmented into molecules longer than about 500 bases, about 1000 bases, about 2000 bases, about 5000 bases, about 10,000 bases, about 20,000 bases, about 50,000 bases, about 100,000 bases, about 250,000 bases, about 1,000,000 bases or about 5,000,000 bases. In preferred embodiments, the fragments range from about 1.5 kb to about 5 kb. Fragmentation can be accomplished by any physical and/or biochemical method described elsewhere in this disclosure. In a preferred embodiment, the target DNA is randomly sheared by physical force, for example using a [Figure A20068002818100321] device (Genomic Solutions). The sheared DNA of the desired fragment size can then be purified. This optional size selection can be achieved by any size-selection method known in the art and disclosed herein, such as electrophoresis and/or liquid chromatography (LC). In a preferred embodiment, the sheared DNA sample is size-selected by purification on [Figure A20068002818100322] size exclusion beads (Agencourt; Hawkins et al., Nucleic Acids Res. 1995 (23): 4742-4743). For example, sequencing the (paired) ends of fragments of about 2-2.5 kb can be used for contig ordering in a typical bacterial genome sequencing experiment. Larger fragments may be advantageous for sequencing the genomes of higher organisms such as fungi, plants and animals.
Step 6B - Methylation of certain restriction sites (Figure 17B)
As described below, after the connectors have been ligated to the target DNA fragments, the connectors can be cut with one or more restriction enzymes in preparation for circularization. To prevent the selected restriction enzyme from digesting the target DNA, the target DNA is protected from digestion by modification with the corresponding methylase. In a preferred embodiment, the connector is a hairpin connector containing an EcoRI restriction site (Figure 18A). Therefore, in a preferred embodiment, EcoRI methylase is used to methylate the EcoRI restriction sites present in the sample DNA fragments, so that the fragments remain intact when EcoRI sticky ends are later generated from the hairpin connectors for circularization by ligation.
Step 6C - Fragment end polishing and phosphorylation (Figure 17C)
Hydrodynamic shearing of DNA produces some fragments with frayed ends (single-stranded overhangs). Blunt ends are preferred for the subsequent connector ligation. Therefore, optionally, any frayed ends can be made blunt, for convenient enzymatic ligation, by "filling in" with a DNA polymerase and/or by "chewing back" with an exonuclease (e.g. mung bean nuclease). Advantageously, some DNA polymerases also have exonuclease activity. Optionally, after the blunting reaction, the 5' ends of the fragments are phosphorylated, preferably with a polynucleotide kinase. In a preferred embodiment, T4 DNA polymerase and T4 polynucleotide kinase (T4 PNK) are used for fill-in and phosphorylation, respectively. T4 DNA polymerase "fills in" 3'-recessed ends (5' overhangs) of DNA through its 5'→3' polymerase activity, and its single-strand 3'→5' exonuclease activity removes 3' protruding ends. The kinase activity of T4 PNK adds phosphate groups to 5'-hydroxyl termini.
Step 6D - Hairpin connector ligation (Figure 17D and Figure 18A)
According to the invention, double-stranded oligonucleotide connectors are ligated to the ends of the target DNA fragments. In a preferred embodiment, the connectors are hairpin connectors (Figure 18A). One advantage of hairpin connectors is that connector-connector ligation events produce only connector dimers, i.e. formation of multimeric connector concatemers is prevented. In addition, the hairpin structure protects the sample fragments from the exonuclease digestion used to remove unligated fragments (step 6E). A preferred hairpin connector design, shown in Figure 18A, contains EcoRI and MmeI restriction sites. EcoRI is used to create sticky ends at the ends of each fragment (step 6F) so that it can be circularized (step 6G). MmeI is a type IIS restriction enzyme that cuts DNA 20 bp from its recognition site; it is used to cut into the ends of the circularized sample fragment to generate the paired-end tags to be sequenced. The skilled person will appreciate that EcoRI can be replaced by any of a large number of other endonucleases, with corresponding changes to the nucleotide sequence of the connector oligonucleotides and use of the appropriate methylase to protect the target DNA fragments. Likewise, MmeI can be replaced by another type IIS restriction enzyme, provided the selected enzyme cuts far enough from its restriction site to generate paired ends long enough to permit downstream sequence assembly. In a preferred embodiment, the hairpin connector is biotinylated, for example at the site shown in Figure 18A. Other biotinylation sites are also suitable and can be selected by the skilled person. The biotin moiety allows optional selection of paired-end fragments containing the connector, and optional immobilization of the paired-end library fragments (after MmeI digestion) during paired-end connector ligation, the fill-in reaction (fragment repair) and paired-end library amplification.
Step 6E - Exonuclease selection (Figure 17E)
Preferably, hairpin connector ligation is followed by exonuclease digestion to remove any DNA that is not correctly fitted with a hairpin connector at both ends; purification on SPRI size exclusion beads removes unwanted small molecules such as connector-connector dimers. The exonuclease digestion can be carried out with one or more of the various exonucleases well known in the art. Preferably, digestion is accomplished with a combination of activities that together digest single-stranded and double-stranded DNA in both the 3'→5' and 5'→3' directions. In a preferred embodiment, the exonuclease mixture comprises E. coli exonuclease I (3'→5' single-strand exonuclease), phage λ exonuclease (5'→3' single-strand and double-strand exonuclease) and phage T7 exonuclease (5'→3' double-strand exonuclease that can initiate at gaps and nicks).
Step 6F - EcoRI digestion (Figure 17F)
In a preferred embodiment, endonucleolytic cleavage by EcoRI is used to create sticky ends at the ends of each fragment by cutting the hairpin connectors (Figure 18A), which allows the fragments to circularize. Digestion with EcoRI removes the hairpin structures at the fragment ends and leaves sticky ends. Internal EcoRI sites present in the sample DNA are protected by the methylation carried out earlier in step 6B.
Step 6G - Circularization (Figure 17G)
The fragments are then circularized by intramolecular ligation of their sticky EcoRI ends. The ligation site therefore carries two partial hairpin connectors (head to head, with a reconstituted EcoRI site; 44 bp in total), flanked on each side by an end of the sample fragment. Another exonuclease digestion is carried out to remove any uncircularized DNA.
Step 6H - MmeI digestion (Figure 17H)
The circularized DNA fragments are then restricted with MmeI. This type IIS restriction enzyme cuts about 20 bp from its restriction site (it cuts at 20/18 nucleotides, leaving a 2-nucleotide 3' overhang; the enzyme also produces small amounts of product cut 19-22 bp from the site). The hairpin connectors have an MmeI site at the end that is ligated to the sample DNA fragment (Figure 18A); restriction at these sites produces the paired-end DNA library fragments, each of which contains the ligated "double" hairpin connector (44 bp) and two 20 bp ends of the sample fragment, for a total length of 84 bp.
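The fragment length quoted above follows from simple arithmetic, which can be checked with a few lines of Python. The function name is hypothetical; the 44 bp connector length and the 19-22 bp cut range are the figures given in the text, and the range calculation assumes both ends are cut at the same distance.

```python
def library_fragment_length(double_connector_bp=44, end_tag_bp=20):
    # One ligated "double" hairpin connector flanked by one sample end per side.
    return double_connector_bp + 2 * end_tag_bp


print(library_fragment_length())               # nominal MmeI cut at 20 bp
print(library_fragment_length(end_tag_bp=19))  # shortest minor product
print(library_fragment_length(end_tag_bp=22))  # longest minor product
```

Under these assumptions the minor products with cuts at 19-22 bp would give library fragments of roughly 82 to 88 bp.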
Step 6I - Isolation with streptavidin beads (Figure 17I)
Optionally, MmeI restriction fragments that lack the ligated "double" hairpin connector, and therefore lack the biotin tag, can be removed in this step. The library of paired-end fragments can be immobilized (and separated from the other MmeI restriction fragments) by binding the biotin tag present in the hairpin connector to streptavidin or avidin beads.
Step 6J - Paired-end connector ligation (Figure 17J)
In this step, the ends of the paired-end library fragments produced in step 6H, and optionally purified in step 6I, are ligated to double-stranded connectors called paired-end library connectors or paired-end connectors (Figure 18B). These paired-end connectors provide priming regions to support amplification and nucleotide sequencing, and can also contain a short (e.g. 4-nucleotide) "sequencing key" sequence that can be used for well finding on the 454 Sequencing™ system. The connectors can have a "degenerate" 2-base single-stranded 3' overhang. Degenerate means that the 2 overhanging bases are random, i.e. each of them can be G, A, T or C. If an enzyme other than MmeI is used, the skilled person should readily be able to design paired-end connectors compatible with that enzyme. The exemplary connectors shown in Figure 18B are designed to strongly favor directional ligation of the paired-end library fragments to each connector: because they carry a degenerate 2 bp 3' overhang at their 3' end, these connectors can ligate only to the ends of MmeI-generated paired-end library fragments (provided the 5' ends of the connectors are not phosphorylated, see below). The connectors can be combined with the paired-end library fragments in a ligation reaction containing a large molar excess of connectors (15:1 connector:fragment ratio), both to maximize use of the paired-end library fragments and to minimize the potential for forming concatemers of paired-end library fragments. The connectors themselves can be unphosphorylated to minimize connector dimer formation, although the ligation products must then be repaired by a subsequent fill-in reaction (step 6K).
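To make the meaning of a "degenerate" 2-base overhang concrete, the following Python sketch enumerates every overhang such a connector population would contain. It is illustrative only; the function and constant names are hypothetical.

```python
from itertools import product

BASES = "GATC"


def degenerate_overhangs(length=2):
    """All possible overhangs when every position can be G, A, T or C."""
    return ["".join(bases) for bases in product(BASES, repeat=length)]


overhangs = degenerate_overhangs()
print(len(overhangs), overhangs)
```

With two random positions there are 4 x 4 = 16 possible overhangs, so for any 2-nucleotide 3' overhang left by MmeI the connector pool contains a complementary overhang.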
Step 6K - Fill-in reaction (Fig. 6K)
If the paired-end connectors ligated in step 6J are unphosphorylated, a gap will exist at their junction with the 3' ends of the paired-end library DNA fragment. These two "gaps" or "nicks" can be repaired with a strand-displacing DNA polymerase: the polymerase recognizes the nick, displaces the nicked strand (up to the free 3' end of each connector) and extends the strand in a way that repairs the nick and forms full-length dsDNA. In a preferred embodiment, Bst DNA polymerase (large fragment) is used. Other strand-displacing DNA polymerases known in the art are also suitable for this step, for example phi29 DNA polymerase and DNA polymerase I (Klenow fragment).
Step 6L - Amplification (Fig. 6L)
Optionally, the connector-ligated paired-end DNA library can be amplified. Preferably, amplification is carried out by PCR, but other nucleic acid amplification methods known in the art and/or described herein can also be used. Preferably, the oligonucleotides F-PCR and R-PCR shown in Figure 18B are used as PCR primers.
The connector-ligated paired-end DNA library is then sequenced, either after amplification (as described in the preceding paragraph) or without amplification. Preferably, individual molecules of the library are sequenced. If the chosen DNA sequencing method requires a large number of identical template molecules in each individual sequencing reaction, the individual molecules of the library can be clonally amplified. Preferably, clonal amplification is carried out by bead emulsion PCR as described in international patent application Nos. WO 2005/003375, WO 2004/069849 and WO 2005/073410, each of which is incorporated herein by reference in its entirety.
Of course, any combination of corresponding steps of the above 6 methods is also contemplated and is included in the present invention.
As can be seen from the above disclosure, methods 1, 2, 3, 4, 5 and 6 share similarities. In particular, similar steps in methods 2, 3, 4, 5 and 6 are especially alike and can be combined and interchanged between methods to produce equivalent or advantageous results.
Having described the general methods of paired-end sequencing, variants of these methods are now described.
In one variant, the hairpin adaptor may be replaced with an overhang adaptor (Figure 8). The overhang adaptor may be biotinylated and may, for example, have the following sequence:
5′OH-AATTC---AAACCCTTTCGGT---TCCAAC-3′OH (Seq ID NO:28)
| ||||||||||||| ||||||
3′ OH-G---TTTGGGAAAGCCA---AGGTTG-5′PO4(Seq ID NO:29)
The six 3'-terminal nucleotides of the upper strand (Seq ID NO:28), i.e., TCCAAC, together with the complementary nucleotides of the lower strand (Seq ID NO:29), form the recognition site of the type IIS restriction enzyme MmeI.
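The structure of this adaptor can be checked with a few lines of Python. The following is only a minimal sketch: it assumes that the dashes in the printed sequences are layout spacers rather than nucleotides, and it uses the published MmeI recognition sequence 5'-TCCRAC-3' (R = A or G). The function and variable names are illustrative and are not part of the disclosure.

```python
import re

# Adaptor strands as printed above; the dashes are treated as layout
# spacers (an assumption) and removed.
UPPER_5_TO_3 = "AATTC---AAACCCTTTCGGT---TCCAAC".replace("-", "")
LOWER_3_TO_5 = "G---TTTGGGAAAGCCA---AGGTTG".replace("-", "")

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
MMEI_SITE = re.compile(r"TCC[AG]AC")  # 5'-TCCRAC-3', R = A or G


def five_prime_overhang(upper, lower_3_to_5):
    """Return the unpaired 5' bases of the upper strand.

    The lower strand is written 3'->5' and is right-aligned with the
    upper strand; a ValueError is raised if the paired region is not
    Watson-Crick complementary.
    """
    n_unpaired = len(upper) - len(lower_3_to_5)
    paired = upper[n_unpaired:]
    if any(COMPLEMENT[u] != l for u, l in zip(paired, lower_3_to_5)):
        raise ValueError("strands are not complementary")
    return upper[:n_unpaired]


overhang = five_prime_overhang(UPPER_5_TO_3, LOWER_3_TO_5)
site = MMEI_SITE.search(UPPER_5_TO_3)
site_at_3_prime_end = site is not None and site.end() == len(UPPER_5_TO_3)
print(overhang, site.group(), site_at_3_prime_end)
```

Under these assumptions the two strands pair over 20 positions, the upper strand carries a 4-base 5' overhang, and the only MmeI site lies at the very 3' end of the upper strand, so that MmeI cleavage is directed into the adjoining genomic DNA.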
This variant is carried out in a manner similar to method 3. Starting genomic DNA (Figure 8A) is fragmented and polished (Figure 8B), and overhang adaptors are ligated to the fragment ends (Figure 8C). Dimers of the overhang adaptor can be removed by size-exclusion chromatography (e.g., spin columns) or by charge-based chromatography. Higher-order concatemers of the overhang adaptor cannot form because the 5' overhang lacks a phosphate. After removal of the overhang adaptor dimers (Figure 8D), the fragments are treated with kinase so that they can self-ligate (Figure 8E). Self-ligation (i.e., circularization) is followed by exonuclease digestion to remove unligated, non-circular DNA. Because DNA fragments that were not ligated to overhang adaptors have blunt ends as a result of polishing, they are expected to ligate less efficiently than fragments carrying 5' overhangs (sticky ends) from an overhang adaptor ligated to each side. After circularization, digestion with MmeI removes the DNA distal to the overhang adaptors (see Figure 8F), leaving about 20 bases of original genomic DNA on each side of the ligated overhang adaptors (Figure 8G). Fragments carrying the overhang adaptor are purified using streptavidin beads, which bind the biotinylated adaptor (Figure 8H).
The resulting fragments can be sequenced by any available method, for example the methods provided in this disclosure (e.g., step 3H).
Nucleic acids produced by the methods of the invention can be sequenced using one or more primers complementary to the ends of the sequence. That is, according to the sequencing scheme described in step 3H, sequencing adaptor A and sequencing adaptor B are ligated to the fragment ends, and the fragments are then sequenced. Because the terminal sequences of the fragments are known to be sequencing adaptor A or B, the fragments can be sequenced using sequencing primers complementary to sequencing adaptor A or B. In addition, the sequence in the middle of each fragment, which contains the ligated adaptor, is known (see, e.g., 703 in Figure 7). Sequencing can therefore also start from the middle, using a primer complementary to this middle region. Furthermore, a sequencing primer for the terminal region and a sequencing primer for the middle region can be hybridized to the fragment to be sequenced at the same time (see Figure 9). One primer is protected and the other is not. In Figure 9, the primer hybridized to the end is protected by a phosphate group. The first round of sequencing starts from the unprotected primer (Figure 9, middle primer). After the first round, extension of the first primer can optionally be terminated, for example by incorporating a complementary dideoxynucleotide. Alternatively, extension of the first primer may proceed to the end of the template strand, where it stops. The second, protected primer can then be deprotected and extended in a second round of sequencing to determine the sequence at the end of the fragment. This method can yield two long paired-end sequencing reads from a single template, which may be single-stranded.
In a second variant, fragmented starting DNA (Figure 10A) is ligated to adaptors that have a 3' CC overhang and, optionally, an internal type IIS restriction endonuclease site. The ligated fragments cannot self-ligate or self-circularize because their ends do not match (they are not complementary). However, these fragments can be joined using a linker that has a 5' GG overhang on both sides (Figure 10B). After ligation, the nucleic acid fragments can be purified away from non-circular DNA by standard gel and column chromatography as discussed above, or by exonuclease digestion of the uncircularized molecules. The resulting circular DNA (Figure 10D) can be cut with MmeI as in the other methods, and the resulting DNA can be sequenced.
In another variant, the methods of the invention can be used to produce A/B-adapted ssDNA (Figure 11, step 1). This single-stranded fragment can be circularized by hybridization to an oligonucleotide containing sequence complementary to the A/B adaptors (Figure 11, step 2) and ligation in the presence of ligase. In addition to facilitating ligation, the oligonucleotide can serve as a primer for rolling circle amplification of the circularized ssDNA (Figure 11, step 3). The rolling-circle-amplified DNA can be cut as described in steps 1K and 1L of method 1 (Figures 1L and 1M). After amplification, standard library preparation and sequencing techniques can be applied to the product (Figure 11, step 4).
Certain embodiments of the invention are based on a surprising discovery: in paired-end sequencing experiments on the genome of E. coli strain K12 whose protocol included MmeI cleavage according to the methods described herein, the depth of reads covering the genome varied greatly (Figure 20, "carrier free (-)"). Depth refers to the number of reads that map to essentially the same region of the genome. This variation in depth correlated with the density of MmeI sites in the genome (Figure 20). Unexpectedly and surprisingly, the inventors found that adding double-stranded DNA known to contain MmeI sites (marked "(+)" in Figure 20), namely E. coli B strain DNA ("EcoliBStrain (+)"), salmon sperm DNA ("SalSprmDNA (+)"), or a PCR amplification product known to contain an MmeI site ("AmpPosMmeI (+)"), greatly reduced and randomized the variation in depth of genome coverage. By contrast, compared with the "carrier free" control, adding double-stranded DNA without MmeI sites (marked "(-)" in Figure 20), namely poly(dIdC) ("dIdC (-)") or a PCR amplification product known not to contain an MmeI site ("AmpNegMmeI (-)"), did not change the pattern of depth variation across the genome. Using MmeI-positive carrier DNA therefore provides paired-end reads that are more evenly distributed across the genome, which is advantageous. These surprising findings are further supported by the data shown in the tables below:
Table 1. Effect of MmeI carrier DNA on the depth distribution and length of paired-end reads

| Sample | Depth Ave | Depth STDEV | Depth %CV | Length Ave | Length STDEV | Length %CV |
|---|---|---|---|---|---|---|
| Stratagene_SS_dsDNA | 25.59 | 9.27 | 36.2% | 2,219 | 618 | 27.8% |
| EcoliBStrain | 21.99 | 8.32 | 37.8% | 2,210 | 618 | 28.0% |
| AmpPos | 22.82 | 7.51 | 32.9% | 2,199 | 618 | 28.1% |
| dIdC | 22.17 | 26.55 | 119.7% | 2,397 | 651 | 27.2% |
| AmpNeg | 21.10 | 22.93 | 108.7% | 2,363 | 639 | 27.0% |
| Negative | 23.05 | 26.01 | 112.8% | 2,385 | 654 | 27.4% |
Table 1 shows coverage depth statistics for E. coli K12. The top 3 samples (rows) had MmeI-positive carrier DNA added; the bottom 3 samples had MmeI-negative carrier DNA added. Column headers: "Depth Ave" = average depth; "Depth STDEV" = standard deviation of depth; "Depth %CV" = Depth STDEV divided by Depth Ave (this quotient represents the variation in depth normalized to the average depth); "Length Ave" = average distance between paired reads in the genome; "Length STDEV" = standard deviation of the distance between paired reads in the genome; "Length %CV" = Length STDEV divided by Length Ave.
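The %CV columns follow directly from the definition just given (standard deviation divided by mean). As an illustrative sketch, the values can be recomputed from the means and standard deviations reported in Table 1; the variable and function names below are chosen for this example only, and small differences from the reported values are expected because the table entries are rounded.

```python
# Rows of Table 1: (sample, depth mean, depth SD, reported depth %CV,
#                   length mean, length SD, reported length %CV)
TABLE_1 = [
    ("Stratagene_SS_dsDNA", 25.59, 9.27, 36.2, 2219, 618, 27.8),
    ("EcoliBStrain", 21.99, 8.32, 37.8, 2210, 618, 28.0),
    ("AmpPos", 22.82, 7.51, 32.9, 2199, 618, 28.1),
    ("dIdC", 22.17, 26.55, 119.7, 2397, 651, 27.2),
    ("AmpNeg", 21.10, 22.93, 108.7, 2363, 639, 27.0),
    ("Negative", 23.05, 26.01, 112.8, 2385, 654, 27.4),
]


def percent_cv(mean, stdev):
    """Coefficient of variation in percent: 100 * SD / mean."""
    return 100.0 * stdev / mean


# Recompute both %CV columns for every sample.
recomputed = {
    name: (percent_cv(d_mean, d_sd), percent_cv(l_mean, l_sd))
    for name, d_mean, d_sd, _, l_mean, l_sd, _ in TABLE_1
}

for name, d_mean, d_sd, d_cv, l_mean, l_sd, l_cv in TABLE_1:
    depth_cv, length_cv = recomputed[name]
    print(f"{name:22s} depth %CV {depth_cv:6.1f} (reported {d_cv}) "
          f"length %CV {length_cv:5.1f} (reported {l_cv})")
```

The recomputed depth %CV separates the samples into two groups (roughly 33-38% for the first three rows and above 100% for the last three), while the length %CV stays near 27-28% for all samples.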
Table 1 shows that, consistent with Figure 20, adding MmeI-positive carrier DNA greatly reduced the variation in coverage depth across the E. coli K12 genome (see the Depth STDEV and Depth %CV values; smaller values are favorable). This results in a more uniform distribution of paired-end reads across the genome, which is advantageous.
Table 2. Effect of paired-end sequencing with MmeI-positive carrier DNA on scaffolding of the E. coli K12 genome

| | Stratagene SS dsDNA (+) | E.Coli Bstrain (+) | Amplified Positive (+) | dIdC (-) | Amplified Negative (-) | Carrier free (-) |
|---|---|---|---|---|---|---|
| Number of scaffolds | 25 | 22 | 19 | 56 | 53 | 48 |
| Number of bases scaffolded | 4,565,936 | 4,569,196 | 4,571,112 | 4,553,955 | 4,548,402 | 4,550,228 |
| Percent of genome scaffolded | 98.41% | 98.48% | 98.52% | 98.15% | 98.03% | 98.07% |
Table 2 shows the effect of paired-end sequencing data obtained with MmeI-positive carrier DNA on the scaffolding of shotgun contigs. When paired-end reads were used to assemble 121 large contigs obtained by shotgun sequencing of E. coli K12 genomic DNA on a GS20 sequencing instrument (454 Life Sciences, Branford, Conn., USA), the paired-end reads produced with MmeI-positive carrier DNA (columns "Stratagene SS dsDNA (+)", "E.Coli Bstrain (+)" and "Amplified Positive (+)") yielded fewer scaffolds (19-25), i.e., larger scaffolds, than paired-end reads produced without carrier DNA or with carrier DNA lacking MmeI sites (48-56 scaffolds). Using MmeI-positive carrier DNA therefore improves the genome assembly obtained by paired-end sequencing performed according to the invention.
In certain embodiments, the methods of the invention include the use of double-stranded "carrier DNA" in any step that involves DNA cleavage with the restriction endonuclease MmeI. The carrier DNA must contain MmeI sites. Endonucleolytic cleavage by MmeI is most efficient when the number of moles of MmeI enzyme molecules approximately equals the number of moles of MmeI sites present in the DNA sample (product catalog of New England Biolabs, Ipswich, Mass., USA). In the methods of the invention, the number of MmeI sites can be difficult to estimate, both because reliable measurement of low DNA concentrations (typically on the order of nanograms to 10 ng) is difficult and time-consuming, and because the number of MmeI sites varies with the target DNA to be sequenced. Accurately calculating the amount of MmeI enzyme to add to the reaction (to reach a stoichiometric concentration) is therefore problematic. To overcome this difficulty and satisfy the need to balance the number of MmeI sites against the number of MmeI enzyme molecules, some methods of the invention include adding an excess of carrier DNA (relative to the sample DNA). The amount of MmeI enzyme added to the reaction can then be calculated from the known amount of carrier DNA, and the number of MmeI sites in the (circular) sample DNA becomes negligible. Measuring the DNA concentration of the sample DNA thus becomes unnecessary, which speeds up the method and reduces the cost and time it requires. The amount of carrier DNA can be several-fold, up to about 10-fold, up to about 100-fold, up to about 1000-fold or more, that of the sample DNA. In a preferred embodiment, 2 μg of sonicated double-stranded salmon sperm DNA, 2 units of MmeI and all required reagents (for example 1X NEBuffer 4 (New England Biolabs) and 50 μM S-adenosylmethionine (SAM)) are added to the sample DNA in a volume of 100 μl and incubated at about 37 °C for about 15 minutes. The skilled person will recognize that the reaction temperature and duration can be adjusted within practical ranges.
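The reasoning that excess carrier DNA makes the sample's MmeI sites negligible can be illustrated with a rough calculation. The sketch below rests on assumptions that are not taken from the disclosure: an average mass of 650 g/mol per base pair, and a random sequence with equal base frequencies, in which the non-palindromic site TCCRAC is expected about once per 1024 bp when both strands are counted. Real carrier and sample DNA will deviate from this model, so the numbers are order-of-magnitude estimates only.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one dsDNA base pair

# TCCRAC matches 2 of the 4**6 possible hexamers and is not palindromic,
# so it can occur on either strand (assumption: random sequence with
# equal base frequencies).
SITES_PER_BP = 2 * 2 / 4**6  # = 1/1024


def pmol_mmei_sites(dna_ng):
    """Expected picomoles of MmeI sites in dna_ng nanograms of dsDNA."""
    mol_bp = dna_ng * 1e-9 / AVG_BP_MASS_G_PER_MOL
    return mol_bp * SITES_PER_BP * 1e12


carrier_pmol = pmol_mmei_sites(2000)  # 2 ug carrier DNA, as in the text
sample_pmol = pmol_mmei_sites(10)     # 10 ng sample DNA, upper end of the range
sample_fraction = sample_pmol / (sample_pmol + carrier_pmol)

print(f"carrier sites: {carrier_pmol:.2f} pmol")
print(f"sample sites:  {sample_pmol:.4f} pmol")
print(f"sample share of all sites: {sample_fraction:.2%}")
```

Under these assumptions, 2 μg of carrier contributes about 3 pmol of sites and a 10 ng sample less than 1% of the total, so an enzyme amount chosen for the carrier alone remains close to stoichiometric whatever the exact sample amount.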
As described above, carrier DNA containing an excess of MmeI sites, together with an approximately stoichiometric amount of MmeI enzyme, can optionally be used in the MmeI restriction digest of any method described in this disclosure that includes MmeI digestion (for example in step 6H of the sixth method (Figure 17H)). The skilled person will also appreciate that the strategy of adding "carrier DNA" containing MmeI sites is useful in any MmeI restriction digestion reaction, especially reactions in which the amount of sample DNA is low and/or the number of MmeI sites in the sample DNA is unknown.
Ligation in water-in-oil emulsions
The invention also includes methods for circularizing nucleic acid molecules. Conventionally, circularization of nucleic acid molecules is achieved by ligation at low nucleic acid concentration. Low concentration favors the desired intramolecular ligation (i.e., circularization), which follows first-order reaction kinetics, over intermolecular events, which follow second-order (or higher-order) kinetics (F.M. Ausubel et al. (eds.), 2001, Current Protocols in Molecular Biology, John Wiley & Sons Inc.). Even at high dilution, however, intermolecular events cannot be prevented, and diluting the nucleic acid to an extreme degree is impractical. Intermolecular ligation (concatemers, double circles, etc.) reduces the incidence of the desired intramolecular circularization events. In some cases, intermolecular ligation products can be detrimental to downstream applications. In summary, the conventional approach has at least two main drawbacks. First, the need to dilute the starting nucleic acid increases the reaction volume and the associated reagent cost. High dilution also makes efficient recovery of the reaction products difficult. Second, a substantial number of intermolecular ligation events do occur, reducing the yield of the desired intramolecular ligation product.
The invention includes methods that substantially eliminate the problems associated with the conventional circularization methods described above. For example, according to the invention, the ligation reaction need not be performed at high dilution, i.e., at low nucleic acid concentration. In one embodiment, individual linear double-stranded DNA molecules with compatible ligatable ends (e.g., blunt ends or staggered ("sticky") ends) are ligated in physically separated reaction environments. An aqueous solution containing the DNA to be ligated and all reagents required for ligation (e.g., DNA ligase, ligase buffer, ATP, etc.) is emulsified in oil, preferably in the presence of a surfactant that stabilizes the emulsion. Suitable compositions and methods for producing emulsions are discussed in more detail below. The resulting water-in-oil emulsion comprises droplets (microreactors), each of which contains 0, 1 or more DNA molecules. The number of DNA molecules per microreactor can be adjusted by changing the DNA concentration and the droplet size. For the skilled person, calculating suitable conditions from the nucleic acid concentration, the polynucleotide size (length measured in bases) and the average droplet volume is a matter of routine optimization. The ideal droplet contains a single ligatable DNA molecule. It is understood, however, that within a population of microreactors the number of DNA molecules per microreactor will vary, partly because of the variable size of the microreactors and partly because of the random distribution of the DNA molecules. Some microreactors will therefore contain no DNA molecule, some will contain 1 DNA molecule, and some will contain 2 or more DNA molecules. Those skilled in the art will recognize that yield and cost (reagent use) can be balanced as needed by changing the average number of DNA molecules per microreactor.
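The routine calculation mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the disclosure: droplets are treated as uniform spheres, double-stranded DNA is assigned an average mass of 650 g/mol per base pair, and the number of molecules per droplet is modeled as Poisson-distributed, which is one common way to describe random loading. All function names and example values are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23
AVG_BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one dsDNA base pair


def mean_molecules_per_droplet(dna_ng_per_ul, fragment_bp, droplet_diameter_um):
    """Average number of dsDNA molecules in one spherical droplet."""
    mol_per_ul = dna_ng_per_ul * 1e-9 / (fragment_bp * AVG_BP_MASS_G_PER_MOL)
    molecules_per_ul = mol_per_ul * AVOGADRO
    radius_um = droplet_diameter_um / 2
    volume_um3 = 4 / 3 * math.pi * radius_um**3
    volume_ul = volume_um3 * 1e-9  # 1 ul = 1e9 cubic micrometers
    return molecules_per_ul * volume_ul


def occupancy(mean):
    """Poisson fractions of droplets with 0, exactly 1, or 2+ molecules."""
    p_empty = math.exp(-mean)
    p_single = mean * p_empty
    return {"empty": p_empty, "single": p_single,
            "multiple": 1.0 - p_empty - p_single}


# Hypothetical example: 3 kb fragments at 0.002 ng/ul in 10 um droplets.
mean = mean_molecules_per_droplet(0.002, 3000, 10)
print(f"mean molecules per droplet: {mean:.3f}")
for label, fraction in occupancy(mean).items():
    print(f"{label:8s} {fraction:.3f}")
```

In this model, lowering the mean occupancy reduces the share of droplets holding two or more molecules (fewer intermolecular products) at the price of more empty droplets (more reagent per circularized molecule), which is the yield-versus-cost balance described in the text.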
Preferably, the ligation mixture is kept cold (e.g., at 0-4 °C) while it is assembled and until emulsification is complete. This prevents the ligation reaction from proceeding before the desired emulsified environment has formed, and thus prevents the formation of unwanted intermolecular bonds. The emulsified ligation reaction is then incubated at a temperature that permits ligation. The incubation time can range from several minutes to 1 hour, several hours, overnight, 24 hours or more than 1 day. After this incubation, the ligation reaction can be stopped before, during and after breaking of the emulsion, to prevent undesired intermolecular ligation in the pooled ligation reaction. The ligation reaction can be stopped by lowering the temperature to about 0-4 °C (ice water), by heat-inactivating the ligase, by adding EDTA, by adding a ligase inhibitor, etc., or by any combination of these methods.
The skilled person can readily apply the above methods of the invention to the circularization of single-stranded or double-stranded RNA or of single-stranded or double-stranded DNA. For example, the ends of a linear single-stranded polynucleotide molecule can be brought directly side by side by annealing a capping oligonucleotide (also called a bridging oligonucleotide) that has a portion complementary to each end of the linear single-stranded polynucleotide molecule, as described in step 1K of method 1 (see Figure 1L and Figure 11).
The emulsified ligation reaction can then be incubated at a suitable temperature. For example, for "sticky-end" ligation with T4 DNA ligase, a suitable incubation temperature is 16 °C, although a fairly wide range of temperatures is acceptable. Ligation conditions for DNA and other molecules are well known in the art. One advantage of performing the circularization reaction in an emulsion is that an extended reaction time is neutral, or even beneficial, to the success of the method. For example, in the ideal case of no more than 1 DNA molecule per microreactor, the incubation time can be extended until most DNA molecules have circularized. By contrast, with the conventional non-emulsion method described above, extended incubation would produce a higher proportion of intermolecular ligation products. A further advantage of the emulsion-based ligation method of the invention is thus that the reaction can be allowed to proceed for a relatively long period without increasing the occurrence of intermolecular ligation. The longer incubation gives a higher amount of circularized product without increasing the risk of intermolecular ligation. Moreover, because the molecules are separated physically rather than in a concentration-dependent manner, the reaction volume can be much lower (i.e., the nucleic acid concentration in the aqueous phase can be much higher) for the same number of ligation events, which reduces reagent cost and makes the sample easier to handle. The skilled person will appreciate that, for ligation to occur in a given droplet, the droplet must contain sufficient reagents, including at least one ligase molecule.
Breaking the emulsion and isolating circularized DNA
After ligation, the ligation reaction can be stopped and the emulsion "broken" (also referred to in the art as "demulsification"). Many methods of breaking emulsions exist (see, e.g., U.S. Patent No. 5,989,892 and the references cited therein), and those skilled in the art should be able to select a suitable method. Breaking the emulsion can be followed by a nucleic acid isolation step, which can be performed by any suitable nucleic acid isolation method. Once the nucleic acid has been isolated, unligated material can be removed by any method suited to the task, one of which is exonuclease digestion of the sample. The specific exonuclease used depends in part on the type of molecule to be treated (single-stranded or double-stranded DNA or RNA) and on other considerations, for example a reaction temperature that fits conveniently into the procedure. After exonuclease treatment, the circularized material can be purified by one of the many methods known in the art, for example phenol/chloroform extraction or any commercially available purification kit suited to the purpose.
It has been observed that, with the conventional dilution-based circularization method described above, recovery of the desired circularized product decreases as the length of the linear input DNA molecule increases. The emulsion ligation method of the invention is particularly useful for circularizing long polynucleotide molecules, for example molecules longer than about 500 bases, about 1000 bases, about 2000 bases, about 5000 bases, about 10,000 bases, about 20,000 bases, about 50,000 bases, about 100,000 bases, about 250,000 bases, about 1,000,000 bases or about 5,000,000 bases, or indeed molecules of any size considered desirable in the experimental protocol of interest.
The emulsion ligation methods described herein can be used for a variety of ligation reactions, whether or not they produce circularization. The emulsion ligation methods described above can therefore be used in any ligation step of the various methods described herein, especially ligation reactions in which circularization of the input nucleic acid is desired.
Emulsification
An emulsion is a heterogeneous system of two immiscible liquid phases in which one phase is dispersed in the other as droplets of microscopic or colloidal size. The emulsions of the invention must be able to form microcapsules (microreactors). An emulsion can be produced from any suitable combination of immiscible fluids. The emulsions of the invention have a hydrophilic phase (containing the biochemical components), which is present as finely divided droplets (the disperse, internal or discontinuous phase), and a hydrophobic immiscible fluid ("oil"), which is the matrix in which these droplets are suspended (the non-disperse, continuous or external phase). Such an emulsion is called "water-in-oil" (W/O). It has the advantage that the entire aqueous phase containing the biochemical components is compartmentalized in discrete droplets (the internal phase). The external phase, being a hydrophobic oil, generally contains none of the biochemical components and is therefore inert.
In certain embodiments, the microreactors contain nucleic acid and the reagents required for ligation. A plurality of the microreactors may each contain exactly 1 polynucleotide molecule. In certain embodiments, a thermostable water-in-oil emulsion is required, for example if the ligase is to be heat-inactivated after the reaction, or if a thermostable ligase (e.g., Taq DNA ligase) is used for ligation at elevated temperature. The emulsion can be formed by any suitable method known in the art. One method of producing an emulsion is described below, but any method of preparing an emulsion can be used. Such methods are known in the art and include adjuvant methods, counter-flow methods, cross-current methods, shaking, rotating-drum methods and membrane methods. In addition, the size of the microcapsules can be adjusted by changing the flow rate and speed of the components. For example, in dropwise addition, the size of the drops and the total delivery time can be varied. In certain embodiments, droplets can be produced in a microfluidic device, for example as described in Link et al. (Angew. Chem. Int. Ed., 2006, 45, 2556-2560), which is incorporated herein by reference in its entirety.
At least some of the microreactors should be large enough to contain sufficient nucleic acid and other ligation reagents. At the same time, at least some of the microreactors should be small enough that a portion of the microreactor population contains a single polynucleotide molecule capable of self-ligation. In certain embodiments, the emulsion is thermostable. Preferably, the droplets formed range in diameter from about 100 nm to about 500 μm, more preferably from about 1 μm to about 100 μm. Advantageously, cross-flow fluid mixing, optionally combined with an electric field, allows control of droplet formation and of droplet size uniformity.
Various emulsions suitable for biological reactions are described in Griffiths and Tawfik, EMBO, 22, pp. 24-35 (2003); Ghadessy et al., Proc. Natl. Acad. Sci. USA 98, pp. 4552-4557 (2001); U.S. Patent No. 6,489,103 and WO 02/22869, each of which is incorporated herein by reference in its entirety. In a preferred embodiment, the oil is silicone oil.
Surfactants
The emulsions of the invention can be stabilized by adding one or more surface-active agents (emulsion stabilizers; surfactants). These surfactants, also called emulsifiers, act at the water/oil interface to prevent (or at least delay) separation of the phases. Many oils and many emulsifiers can be used to produce water-in-oil emulsions; a recent compilation lists more than 16,000 surfactants, many of which are used as emulsifiers (Ash, M. and Ash, I. (1993) Handbook of Industrial Surfactants. Gower, Aldershot). Emulsion stabilizers for use in the methods of the invention include Atlox 4912, sorbitan monooleate (Span 80; ICI), polyoxyethylene sorbitan monooleate (Tween 80; ICI) and other suitable stabilizers that are known and commercially available. In various embodiments, the surfactant is provided in the oil phase of the emulsion at a volume/volume concentration of 0.5-50%, preferably 10-45%, more preferably 30-40%.
In certain embodiments, chemically inert silicone-based surfactants, such as silicone copolymers, are used. In one embodiment, the silicone copolymer used is a polysiloxane-polycetyl-polyethylene glycol copolymer (cetyl dimethicone copolyol), for example EM90 (Goldschmidt).
The chemically inert silicone-based surfactant can be provided as the only surfactant in the emulsion composition, or as one of several surfactants. Mixtures of different surfactants can therefore be used.
In specific embodiments, one surfactant used is Dow 749 Fluid (used at 1-50%, preferably 10-45%, more preferably 25-35% w/w). In other specific embodiments, one surfactant used is Dow 5225C Formulation Aid (used at 1-50%, preferably 10-45%, more preferably 35-45% w/w). In a preferred embodiment, the oil/surfactant mixture consists of 40% (w/w) Dow 5225C Formulation Aid, 30% (w/w) Dow 749 Fluid and 30% (w/w) silicone oil.
The methods of the invention provide numerous benefits and advantages over existing methods. One advantage over the prior art is that the prepared fragments need not be cloned and propagated in a eukaryotic or prokaryotic host. This is particularly useful where the target sequence contains multiple repeated sequences, which can rearrange during propagation as episomes in host cells.
Another advantage of the disclosed methods is that they provide not only contig sequences but also the end sequences of long contigs and the orientation of those end sequences, thereby facilitating genome assembly. Such long contigs can be more than 100 bp, more than 300 bp, more than 500 bp, more than 1 kb, more than 5 kb, more than 10 kb, more than 100 kb, more than 1 Mb, more than 10 Mb or longer. This sequence and orientation information can be used to facilitate genome assembly and to close gaps.
In addition, paired-end reads provide a second level of confidence in genome assembly. For example, if paired-end sequencing of a given DNA sequence agrees with conventional contig sequencing, the confidence level for that sequence increases. Conversely, if the two sets of sequence data contradict each other, confidence decreases, and further analysis and/or sequencing is needed to locate the source of the contradiction.
The presence or absence of open reading frames in paired-end reads also gives an indication of the location of open reading frames. For example, if both sequenced ends of a contig contain an open reading frame, the whole contig may be one open reading frame. This can be confirmed by standard sequencing techniques. Alternatively, specific PCR primers can be constructed from the information at the two ends to amplify the two ends, and the amplified region can be sequenced to determine whether an open reading frame is present.
The methods of the invention will also improve understanding of genome organization and structure. Because paired-end sequencing can span regions that are difficult to sequence, genome structure can be inferred even when those regions have not been sequenced. Regions that are difficult to sequence may be, for example, repeat regions and regions of secondary structure. In such cases, the number and location of these difficult regions can be mapped in the genome even though their sequences are unknown.
The methods of the invention also allow haplotyping of genomes over extended distances. For example, specific primers can be prepared to amplify a genomic region containing two SNPs linked over a long distance. The two ends of the amplified region can be sequenced using the methods of the invention to determine the haplotype, without sequencing the nucleic acid between the two SNPs. This approach is particularly useful where the two SNPs span a region that is uneconomical to sequence. Such regions include long regions, regions with repeated sequences and regions of secondary structure.
The biotinylated adaptor of the method provides an additional advantage (Figure 7). Figure 7A shows nucleic acids ligated to sequencing primers A and B, in a form ready for sequencing. Some of these nucleic acids are contaminants that do not contain the two ends of a single contig region (701). The nucleic acid fragments that contain the two ends of a contig are labeled 702. Because nucleic acid 702 is the only nucleic acid species that contains biotin, it can be purified using streptavidin beads (Figure 7B). After purification, this material is ready for sequencing. By using affinity purification, the proportion of sequenced fragments that yield useful information can be increased significantly.
This is particularly useful when the contaminating DNA (701) is long, for example when each contaminating nucleic acid (701) in Figure 7D is several kb long. Sequencing these contaminants would consume a considerable share of the reagents, labor and computing capacity available for the project. In such cases, purifying the appropriate fragments beforehand by affinity chromatography (Figure 7E) should provide significant savings in labor and reagents.
The skilled person will immediately recognize that endonucleolytic cleavage by EndoV of any double-stranded DNA containing inosine on opposite strands (as shown in Figure 14, with or without a hairpin) can produce single-stranded overhangs (sticky ends), and that the overhang can have virtually any nucleotide sequence. The invention also includes polynucleotide designs and methods substantially similar to those of Figure 14 but without the hairpin. It is also apparent that, as described above, the methods and compositions of the invention shown in Figure 14, with or without a hairpin, can be used in many molecular biology and recombinant DNA techniques in which introduction of a unique endonuclease site is desired. Such techniques include, but are not limited to, construction of DNA and cDNA libraries, various subcloning strategies, and any method that benefits from a unique endonuclease site in a primer, adaptor or linker.
Paired-end nucleic acid constructs produced by any of the methods described herein can be sequenced by any sequencing technique known in the art. Standard sequencing methods such as Sanger sequencing or Maxam-Gilbert sequencing are well known in the art. Sequencing can also be performed, for example, using the automated sequencing method known as 454 Sequencing™, developed by 454 Life Sciences Corporation (Branford, Conn., USA) and described in, for example, International Application No. WO/05003375, filed January 28, 2004, and U.S. Patent Application Serial Nos. 10/767,779, filed January 28, 2004; 60/476,602, filed June 6, 2003; 60/476,504, filed June 6, 2003; 60/443,471, filed January 29, 2003; 60/476,313, filed June 6, 2003; 60/476,592, filed June 6, 2003; 60/465,071, filed April 23, 2003; and 60/497,985, filed August 25, 2003. Other sequencing methods known in the art are also contemplated, for example any of the sequencing-by-synthesis or sequencing-by-ligation methods reviewed by Metzger (Genome Res. 2005 Dec; 15(12):1767-76, incorporated herein by reference); these methods can be used with the paired-end sequencing methods of the invention.
Throughout this disclosure, the terms "biotin," "avidin" and "streptavidin" are used to describe one member of a binding pair. These terms merely illustrate one way of using a binding pair. Accordingly, the terms biotin, avidin or streptavidin may be replaced by either member of any binding pair. A binding pair can be any two molecules that bind specifically to each other, and includes at least binding pairs such as FLAG/anti-FLAG antibody, biotin/avidin, biotin/streptavidin, receptor/ligand, antigen/antibody, polyhistidine/nickel, protein A/antibody, and derivatives thereof. Other binding pairs are known and are disclosed in the literature.
All patents, patent applications and references mentioned anywhere in this disclosure are incorporated herein by reference in their entirety.
The invention will now be further described by way of the following non-limiting examples.
Examples
Example 1: Oligonucleotide design
The oligonucleotides used in the following experiments were designed and synthesized.
The capture element oligonucleotide shown at the top of Fig. 3A was designed to include the UA3 adaptors and key sequences. A NotI site lies between the adaptors. The complete construct (capture element) can be built using nested oligonucleotides and PCR. The final product sequence was synthesized and cloned.
The Type IIS capture fragment oligonucleotide shown at the bottom of Fig. 3A is similar to the capture fragment described above, except that a sequence representing a Type IIS restriction endonuclease site (for example MmeI) is included in the capture fragment after the key sequence. These Type IIS restriction endonuclease sites allow any construct prepared with these capture elements to be cut out with the Type IIS restriction endonuclease. As is known in the art, Type IIS restriction endonucleases cut DNA at a defined distance from the recognition site; in the case of MmeI, the cut falls 20/18 bases away.
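The 20/18 cleavage geometry can be illustrated with a short Python sketch. It assumes the commonly documented MmeI recognition sequence TCCRAC (R = A or G) and scans only the strand given; the function names `mmei_cuts` and `captured_tag` are hypothetical and are not part of the disclosure.

```python
import re

# Assumption: MmeI recognizes TCCRAC (R = A or G) and cuts 20 nt downstream
# on the top strand and 18 nt downstream on the bottom strand.
MMEI_SITE = re.compile(r"TCC[AG]AC")
SITE_LENGTH = 6
TOP_OFFSET = 20
BOTTOM_OFFSET = 18


def mmei_cuts(seq):
    """Return (site_start, top_cut, bottom_cut) for each site on the given strand.

    Positions are 0-based indices into seq. A cut position p means the break
    falls between seq[p - 1] and seq[p]. Sites whose top-strand cut would lie
    beyond the end of the sequence are skipped.
    """
    seq = seq.upper()
    cuts = []
    for match in MMEI_SITE.finditer(seq):
        top = match.end() + TOP_OFFSET
        bottom = match.end() + BOTTOM_OFFSET
        if top <= len(seq):
            cuts.append((match.start(), top, bottom))
    return cuts


def captured_tag(seq, cut):
    """Return the top-strand bases between the recognition site and the top cut."""
    site_start, top, _bottom = cut
    return seq[site_start + SITE_LENGTH:top]
```

Because the top-strand cut lies two bases beyond the bottom-strand cut, each digestion leaves a 2-base 3' overhang, and the tag carried along with the capture element is about 20 bases long. A fuller treatment would also scan the reverse complement.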
The short adaptor capture fragment oligonucleotide was designed to include the SAD1 adaptors and key sequences (Fig. 3B). A NotI site likewise lies between the adaptors. This oligonucleotide can also be synthesized with an MmeI Type IIS restriction endonuclease cleavage site after the key sequence (see Fig. 3B, short adaptor capture fragment (Type IIS)).
Example 2: Protocol for hairpin adaptor paired-end sequencing
E. coli K12 DNA (20 μg) in 100 μl of solution was hydrodynamically sheared for 20 cycles at speed 10 using a standard HydroShear device (Genomic Solutions, Ann Arbor, Mich., USA). The sheared DNA was methylated by combining 50 μl DNA (5 μg), 34.75 μl H2O, 10 μl methylase buffer, 0.25 μl 32 mM SAM and 5 μl EcoRI methylase (40,000 units/ml, New England Biolabs (NEB), Ipswich, Mass., USA). The reaction was incubated at 37 °C for 30 minutes. After methylation, the sheared, methylated DNA was purified on a Qiagen MinElute PCR purification column according to the manufacturer's instructions. The purified DNA was eluted from the column with 10 μl EB buffer.
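As a bookkeeping aid, the total volume and final concentrations of a reaction mix such as the methylation reaction above can be checked with a few lines of Python. This is only a sketch: the helper `final_concentrations` and the component labels are hypothetical, and components without a stated stock concentration are carried as `None`.

```python
def final_concentrations(components):
    """Return (total_volume_ul, {name: final_concentration}).

    components is a list of (name, volume_ul, stock_concentration) tuples.
    Components whose stock concentration is None contribute volume only.
    The final concentration is in the same unit as the stock concentration.
    """
    total = sum(volume for _name, volume, _stock in components)
    finals = {
        name: stock * volume / total
        for name, volume, stock in components
        if stock is not None
    }
    return total, finals


# Volumes and stock concentrations as stated for the methylation reaction.
METHYLATION_MIX = [
    ("sheared DNA", 50.0, None),
    ("H2O", 34.75, None),
    ("methylase buffer", 10.0, None),
    ("SAM (mM)", 0.25, 32.0),
    ("EcoRI methylase (units/ml)", 5.0, 40000.0),
]

if __name__ == "__main__":
    total, finals = final_concentrations(METHYLATION_MIX)
    print(f"total volume: {total} ul")
    for name, value in finals.items():
        print(f"{name}: {value:g}")
```

With the stated volumes the mix totals 100 μl, so the SAM stock is diluted 400-fold and the methylase 20-fold.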
The sheared, methylated DNA was polished to produce blunt-ended sheared material. The 10 μl of DNA was added to a reaction mixture containing 13 μl H2O, 5 μl 10× polishing buffer, 5 μl 1 mg/ml bovine serum albumin, 5 μl 10 mM ATP, 3 μl 10 mM dNTPs, 5 μl 10 U/μl T4 polynucleotide kinase and 5 μl 3 U/μl T4 DNA polymerase. The reaction was incubated at 12 °C for 15 minutes, after which the temperature was raised to 25 °C and incubation continued for a further 15 minutes. The reaction was then purified on a Qiagen MinElute PCR purification column according to the manufacturer's instructions.
Hairpin adaptors were ligated to the sheared, blunt-ended DNA fragments by combining 10 μl of the sheared DNA (5 μg), 17.5 μl H2O, 50 μl 2× Quick Ligase buffer, 20 μl 10 μM hairpin adaptor and 2.5 μl Quick Ligase (T4 DNA ligase, NEB). The reaction was incubated at 25 °C for 15 minutes, after which ligated fragments were selected by adding to the mixture 2 μl λ exonuclease, 1 μl RecJ (30,000 units/ml, NEB), 1 μl T7 exonuclease (10,000 units/ml, NEB) and 1 μl exonuclease I (20,000 units/ml, NEB). The reaction was incubated at 37 °C for 30 minutes, after which the sample was purified on a Qiagen MinElute PCR purification column. The treated DNA was then passed through an Invitrogen PureLink column according to the manufacturer's instructions and eluted from the column in a volume of 50 μl.
The ligated, exonuclease-treated DNA was digested with EcoRI. A reaction containing 50 μl DNA, 30 μl H2O, 10 μl EcoRI buffer and 10 μl EcoRI (20,000 units/ml) was incubated overnight at 37 °C. The cleaved product was purified on a Qiagen QiaQuick column according to the manufacturer's instructions. The cleaved product was then ligated again, in a reaction containing 50 μl DNA, 20 μl Buffer 4 (New England Biolabs), 2 μl 100 mM ATP, 123 μl H2O and 5 μl ligase (as above), to produce closed circular DNA. The ligation was incubated at 25 °C for 15 minutes, after which it was subjected to another round of exonuclease treatment by adding 1 μl λ exonuclease (5,000 units/ml, NEB), 0.5 μl RecJ (as above), 0.5 μl T7 exonuclease (as above) and 0.5 μl exonuclease I (as above). The exonuclease reaction was incubated at 37 °C for 30 minutes, after which the sample was purified on a Qiagen MinElute PCR purification column.
The treated DNA was then digested with MmeI in a reaction mixture containing 10 μl DNA, 78.75 μl H2O, 10 μl Buffer 4 (New England Biolabs), 0.25 μl SAM and 0.5 μl MmeI (2,000 units/ml, NEB). The digestion was carried out at 60 °C for 60 minutes, and the reaction was then purified on a Qiagen QiaQuick column buffered with 3 M sodium acetate at a final concentration of 0.1%. The column was washed with 700 μl 8.0 M guanidine hydrochloride, and the sample was loaded onto the column according to the manufacturer's instructions. The DNA was eluted in 30 μl EB buffer and diluted to a final volume of 100 μl.
Streptavidin magnetic beads (50 μl) (Dynal Dynabeads M270, Invitrogen, Carlsbad, Calif., USA) were prepared by washing with 2× bead binding buffer and resuspending the beads in 100 μl 2× bead binding buffer. The 100 μl DNA sample was then added to the beads and mixed at room temperature for 20 minutes. The beads were washed twice in wash buffer. The SAD7 adaptor set (A/B set, in which single-stranded oligonucleotides SAD7Ftop and SAD7Fbot are annealed to form the A adaptor, and single-stranded oligonucleotides SAD7Rtop and SAD7Rbot are annealed to form the B adaptor) (SAD7Ftop: 5'-CCGCCCAGCATCGCCTCAGNN-3' (SEQ ID NO:51); SAD7Fbot: 5'-CTGAGGCGATGCTGG-3' (SEQ ID NO:52); SAD7Rtop: 5'-CCGCCCGAGCACCGCTCAGNN-3' (SEQ ID NO:53); SAD7Rbot: 5'-CTGAGCGGTGCTCGG-3' (SEQ ID NO:54), where N is any of the four bases A, G, T or C) was ligated to the DNA bound to the streptavidin beads by adding to the bead-DNA mixture a ligation mixture containing 15 μl H2O, 25 μl Quick Ligase buffer, 5 μl SAD7 adaptor set and 5 μl Quick Ligase (as above). The ligation was incubated at 25 °C for 15 minutes, and the beads were then washed twice with bead wash buffer.
A nucleotide fill-in reaction was performed by adding to the beads a mixture containing 40 μl H2O, 5 μl 10× fill-in buffer, 2 μl 10 mM dNTPs and 3 μl fill-in polymerase (Bst DNA polymerase, 8,000 units/ml, NEB). The reaction was incubated at 37 °C for 20 minutes, and the beads were washed twice with wash buffer. The beads were then suspended in 25 μl TE buffer.
The bead-bound DNA was then subjected to PCR in a reaction mixture containing 30 μl H2O, 5 μl 10× Advantage 2 buffer, 2 μl 10 mM dNTPs, 1 μl 100 μM forward primer (SAD7FPCR: 5'-Bio-CCGCCCAGCATCGCC-3' (SEQ ID NO:55)), 1 μl 100 μM reverse primer (SAD7RPCR: 5'-CCGCCCGAGCACCGC-3' (SEQ ID NO:56)), 10 μl bead-bound DNA and 1 μl Advantage 2 polymerase mix (Clontech, Mountain View, Calif., USA). PCR was performed using the following program: (a) 94 °C, 4 minutes; (b) 94 °C, 15 seconds; (c) 64 °C, 15 seconds, with steps (b) and (c) performed for 19 cycles; (d) 68 °C, 2 minutes; after which the reaction was held at 14 °C.
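The thermocycling program above can be written down as a small data structure, which makes it easy to compare with other cycle counts. The sketch below is illustrative only: the names `Step` and `program_seconds` are hypothetical, "19 cycles" is read as 19 passes through steps (b) and (c), and only programmed hold times are counted (ramp times and the final 14 °C hold are ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One programmed hold: a temperature in degrees C and a duration in seconds."""
    temp_c: float
    seconds: int


def program_seconds(initial, cycle, n_cycles, final):
    """Sum the programmed hold times of a PCR program, ignoring ramp times."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


INITIAL = Step(94, 4 * 60)             # step (a): 94 C, 4 minutes
CYCLE = (Step(94, 15), Step(64, 15))   # steps (b) and (c)
FINAL = Step(68, 2 * 60)               # step (d): 68 C, 2 minutes

if __name__ == "__main__":
    for n_cycles in (19, 24):
        total = program_seconds(INITIAL, CYCLE, n_cycles, FINAL)
        print(f"{n_cycles} cycles: {total} s of programmed hold time")
```

Under these assumptions the 19-cycle program amounts to 930 seconds of hold time, about 15.5 minutes before ramping is taken into account.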
The PCR product was purified on a Qiagen MinElute PCR purification column, and the purified product was then electrophoresed on a 1.5% agarose gel at 5 V/cm to check for the presence of a 120 bp product. The 120 bp fragment was excised from the gel and recovered using the Qiagen MinElute gel extraction protocol. The 120 bp fragment was eluted in 18 μl EB buffer. The double-stranded product was bound to streptavidin beads, and the beads were washed twice with bead wash buffer. The single-stranded product was eluted in 125 mM NaOH and purified on a Qiagen MinElute PCR purification column. This material was then sequenced on a 454 Life Sciences Corporation automated sequencing system using standard 454 Life Sciences Corporation (Branford, Conn., USA) sequencing methods.
Example 3: Method for non-hairpin adaptor paired-end sequencing
E. coli K12 DNA (5 μg) in a volume of 100 μl was hydrodynamically sheared for 20 cycles at speed 11 using a standard device (HydroShear, as above). The sheared DNA was purified on a Qiagen MinElute PCR purification column according to the manufacturer's instructions and eluted with 23 μl EB buffer. The purified, sheared DNA was blunt-end polished in a reaction mixture containing 23 μl DNA, 5 μl 10× polishing buffer, 5 μl 1 mg/ml bovine serum albumin, 5 μl 10 mM ATP, 3 μl 10 mM dNTPs, 5 μl 10 U/μl T4 polynucleotide kinase and 5 μl 3 U/μl T4 DNA polymerase. The reaction was incubated at 12 °C for 15 minutes, after which the temperature was raised to 25 °C and incubation continued for a further 15 minutes. The reaction was then purified on a Qiagen MinElute PCR purification column according to the manufacturer's instructions. Non-hairpin adaptors were ligated to 2 μg of the purified, sheared DNA in a reaction mixture containing 25 μl 2× Quick Ligase buffer, 18.5 μl 10 μM non-hairpin adaptor and 2.5 μl Quick Ligase (as above). The ligation was incubated at 25 °C for 15 minutes, after which the sample was passed through a Sephacryl S-400 spin column and then through a Qiagen MinElute PCR purification column. The DNA was then eluted from the column with 10 μl EB buffer.
The purified, ligated DNA was then subjected to a kinase reaction in a mixture containing 13 μl H2O, 25 μl 2× buffer, 10 μl DNA and 2 μl 10 U/μl T4 polynucleotide kinase. The reaction was incubated at 37 °C for 60 minutes, after which the sample was electrophoresed on a 1% agarose gel at 5 V/cm. The 1500-4000 bp band was excised from the gel and recovered using the Qiagen MinElute gel extraction protocol.
The purified DNA was subjected to another round of ligation, to produce circular DNA, in a reaction mixture containing 18 μl DNA, 20 μl Buffer 4 (New England Biolabs), 2 μl ATP, 150 μl H2O and 10 μl ligase (as above). The reaction was incubated at 25 °C for 15 minutes, after which a mixture containing 2 μl λ exonuclease (as above), 1 μl RecJ (as above), 1 μl T7 exonuclease (as above) and 1 μl exonuclease I (as above) was added and the reaction was incubated at 37 °C for 30 minutes. After the exonuclease reaction, the DNA was purified on a Qiagen MinElute PCR purification column and eluted with 20 μl EB buffer.
The purified, ligated DNA was then added to a mixture containing 68.6 μl H2O, 10 μl Buffer 4 (New England Biolabs), 0.2 μl SAM and 1 μl MmeI restriction endonuclease (as above). The DNA was digested at 37 °C for 30 minutes and then purified on a Qiagen QiaQuick column that had been pre-buffered with 3 M sodium acetate at a final concentration of 0.1% and washed with 700 μl 8.0 M guanidine hydrochloride. The purified DNA was then eluted with 30 μl EB buffer and adjusted to a volume of 100 μl.
Streptavidin magnetic beads (50 μl) (as above) were washed with 2× bead binding buffer and resuspended in 100 μl bead binding buffer. The beads were then mixed with the 100 μl DNA sample and allowed to bind at room temperature for 20 minutes. The beads were then washed twice with wash buffer and subjected to ligation with the SAD7 adaptor set (A/B set) (as above). A mixture containing 15 μl H2O, 25 μl Quick Ligase buffer, 5 μl SAD7 adaptors and 5 μl Quick Ligase (as above) was added to the bead-bound DNA and incubated at 25 °C for 15 minutes, after which the beads were washed twice with wash buffer.
The bead-bound DNA was subjected to a fill-in reaction in a mixture containing 40 μl H2O, 5 μl 10× fill-in buffer, 2 μl 10 mM dNTPs and 3 μl fill-in polymerase (as above). The reaction proceeded at 37 °C for 20 minutes, after which the beads were washed twice with wash buffer and suspended in 25 μl TE buffer. The bead-bound DNA was amplified in a reaction mixture containing 30 μl H2O, 5 μl 10× Advantage 2 buffer, 2 μl dNTPs, 0.5 μl 100 μM forward primer (as above), 0.5 μl 100 μM reverse primer (as above), 10 μl bead-bound DNA and 1 μl Advantage 2 enzyme (as above). The PCR was run under the following conditions: (a) 94 °C, 4 minutes; (b) 94 °C, 15 seconds; (c) 64 °C, 15 seconds, with steps (b) and (c) repeated for 24 cycles; (d) 68 °C, 2 minutes; after which the PCR reaction was held at 14 °C. The PCR product was purified on a Qiagen MinElute PCR purification column and electrophoresed on a 1.5% agarose gel at 5 V/cm. The 120 bp product was excised from the gel and recovered using the Qiagen MinElute gel extraction protocol. The DNA was then eluted in 18 μl EB buffer.
The double-stranded DNA was bound to streptavidin beads, and the beads were washed twice with wash buffer. The single-stranded DNA was then eluted with 125 mM NaOH and purified on a Qiagen MinElute PCR purification column. The purified material was subjected to standard 454 emulsification and sequencing methods.
Using the methods described above, we obtained the following results:
E. coli contigs were generated from four runs of standard 454 sequencing in 60x60 experiments (approximately 1,300,000 reads): 303 contigs larger than 1000 bp were produced, with a mean size of 16,858 bp and a maximum size of 94,060 bp. Table 3 contains additional results obtained using the above methods.
Table 3: Results of the paired-end sequencing methods
Paired reads | Region            | Adaptor set | Total oriented contig sets | Mean size of ordered contig set | Largest ordered contig set
19,605       | 1, 14×43          | Hairpin     | 15                         | 308,129 bp                      | 2,989,419 bp
71,822       | Multiple, 14×43   | Hairpin     | 11                         | 420,302 bp                      | 3,330,963 bp
20,571       | 2, 14×43          | Overhang    | 19                         | 243,197 bp                      | 1,512,859 bp
First, all paired reads were analyzed by BLAST searching against the E. coli K12 genome obtained from GenBank. Reads matching the reference genome with an expect value below 0.1 were retained. For all reads containing two separate BLAST hits separated by the internal linker sequence, the distance separating the hits in the genome was analyzed, and a read was retained only if that distance was less than 5,000 bp. The retained reads were then sorted by the genomic positions of their first and second hits, and tested to see whether an overlap occurred with the next selected paired read. Each of these ordered contigs was then tested, in the same manner as above, for mate pairs overlapping the 454 sequencing contigs.
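The filtering and ordering step described above can be sketched in Python as follows. This is not the analysis code used to generate Table 3; it is a minimal illustration that assumes the BLAST hits have already been parsed into simple records, treats the genome as linear, and uses hypothetical names (`Hit`, `PairedRead`, `keep_pair`, `ordered_pairs`).

```python
from typing import NamedTuple, Optional

MAX_EVALUE = 0.1      # reads must match with an expect value below this
MAX_DISTANCE = 5000   # the two hits must lie less than this many bp apart


class Hit(NamedTuple):
    position: int     # genomic start coordinate of the BLAST hit
    evalue: float     # BLAST expect value of the hit


class PairedRead(NamedTuple):
    name: str
    left: Optional[Hit]    # hit for the tag before the internal linker
    right: Optional[Hit]   # hit for the tag after the internal linker


def keep_pair(read):
    """Keep a read only if both tags hit the genome, closely enough together."""
    if read.left is None or read.right is None:
        return False
    if read.left.evalue >= MAX_EVALUE or read.right.evalue >= MAX_EVALUE:
        return False
    return abs(read.left.position - read.right.position) < MAX_DISTANCE


def ordered_pairs(reads):
    """Filter the paired reads, then sort by first and second hit position."""
    kept = [read for read in reads if keep_pair(read)]
    return sorted(kept, key=lambda r: (r.left.position, r.right.position))
```

Sorting the retained pairs by genomic position places neighbouring pairs next to each other, so the overlap test against the next selected pair becomes a single pass over the sorted list. A circular genome would additionally need the distance measured around the origin.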
Having thus described in detail preferred embodiments of the present invention, it is to be understood that the invention defined by the above paragraphs is not limited to the particular details set forth in the above description, as many apparent variations thereof are possible without departing from the spirit or scope of the present invention. Modifications and variations of the methods described herein will be apparent to those skilled in the art and are intended to be encompassed by the following claims.

Claims (80)

1. A method for obtaining a DNA construct comprising two end regions of a target nucleic acid, said method comprising the steps of:
(a) fragmenting a large nucleic acid molecule to produce the target nucleic acid;
(b) ligating a capture element to said target nucleic acid to form a first circular nucleic acid molecule;
(c) digesting said first circular nucleic acid with a restriction endonuclease that cuts said target nucleic acid but does not cut said capture element, to produce a linear nucleic acid comprising the two ends of said target nucleic acid, said two ends being separated by said capture element;
(d) ligating said linear nucleic acid with a separator element to form a second circular nucleic acid;
(e) converting said second circular nucleic acid into a circular single-stranded nucleic acid;
(f) annealing a first oligonucleotide to said circular single-stranded nucleic acid, and amplifying said circular single-stranded nucleic acid by rolling circle amplification to produce a single-stranded rolling circle amplification product;
(g) annealing a second oligonucleotide to said single-stranded rolling circle amplification product to form a plurality of double-stranded regions in said single-stranded rolling circle amplification product; and
(h) digesting said single-stranded rolling circle amplification product into small fragments with a restriction endonuclease that cuts said plurality of double-stranded regions, to produce said DNA construct comprising the two end regions of the target nucleic acid.
2. The method of claim 1, further comprising, after said fragmenting step, the step of size-fractionating the target nucleic acid fragments and selecting target nucleic acid fragments of a preferred size.
3. The method of claim 1, wherein said first oligonucleotide anneals to said capture element or said separator element.
4. The method of claim 1, wherein said second oligonucleotide anneals to said capture element or said separator element.
5. The method of claim 1, wherein said restriction endonuclease is a Type I or Type IIS restriction endonuclease.
6. The method of claim 1, wherein said target nucleic acid is at least 50 kb, at least 20 kb, at least 10 kb or at least 5 kb.
7. The method of claim 1, wherein said target nucleic acid is between 50 kb and 3 kb, between 20 kb and 3 kb, or between 10 kb and 3 kb.
8. The method of claim 1, wherein said capture element or said separator element comprises a marker gene.
9. The method of claim 8, wherein said marker gene is an antibiotic resistance gene.
10. The method of claim 1, wherein said capture element or said separator element comprises a eukaryotic or prokaryotic origin of replication.
11. The method of claim 1, wherein said capture element or said separator element is biotinylated.
12. The method of claim 11, further comprising, after said digesting step, the step of isolating the nucleic acid fragments that contain the capture element.
13. A method for obtaining a DNA construct comprising two end regions of a target nucleic acid, said method comprising the steps of:
(a) fragmenting a large nucleic acid molecule to produce the target nucleic acid;
(b) ligating an adaptor to each end of said target nucleic acid;
(c) ligating a signature tag to said target nucleic acid to form a circular nucleic acid molecule;
(d) digesting said circular nucleic acid with a restriction endonuclease that cuts said target nucleic acid but does not cut said adaptor or said signature tag, to produce said DNA construct comprising the two end regions of the target nucleic acid.
14. The method of claim 13, further comprising, after said fragmenting step, the step of size-fractionating the target nucleic acid fragments and selecting target nucleic acid fragments of a preferred size.
15. The method of claim 13, further comprising the step of amplifying said DNA construct after said step (d).
16. The method of claim 15, wherein said amplifying is performed by:
(e) ligating adaptors to the ends of said DNA construct; and
(f) amplifying said DNA construct by PCR.
17. The method of claim 13, wherein said restriction endonuclease is a Type I or Type IIS restriction endonuclease.
18. The method of claim 13, wherein said target nucleic acid is at least 50 kb, at least 20 kb, at least 10 kb or at least 5 kb.
19. The method of claim 13, wherein said target nucleic acid is between 50 kb and 3 kb, between 20 kb and 3 kb, or between 10 kb and 3 kb.
20. The method of claim 13, wherein said signature tag comprises a marker gene or an origin of replication.
21. The method of claim 1, wherein said adaptor or said signature tag is biotinylated.
22. The method of claim 21, further comprising, after said digesting step, the step of isolating the nucleic acid fragments that contain the signature tag or the adaptor.
23. A method for obtaining a DNA construct comprising two end regions of a target nucleic acid, said method comprising the steps of:
(a) fragmenting a large nucleic acid molecule to produce the target nucleic acid;
(b) ligating a first adaptor to one end of said target nucleic acid and a second adaptor to the second end of said target nucleic acid, to form an adaptor-tagged target nucleic acid;
(c) circularizing said adaptor-tagged target nucleic acid by ligating said first adaptor to said second adaptor, to form a circular nucleic acid molecule comprising a target nucleic acid region and an adaptor region;
(d) fragmenting said circular nucleic acid molecule in the target nucleic acid region, to produce said DNA construct comprising the two end regions of the target nucleic acid.
24. The method of claim 23, wherein said large nucleic acid or said target nucleic acid is methylated with a methylase before step (b).
25. The method of claim 24, wherein said methylation prevents cleavage of the target nucleic acid by one or more restriction endonucleases.
26. The method of claim 23, wherein said adaptors are hairpin adaptors.
27. The method of claim 26, further comprising the following steps after step (b):
(b1) treating said adaptor-tagged target nucleic acid with an exonuclease to digest any target nucleic acid that is not ligated to a hairpin adaptor at both ends;
(b2) removing said exonuclease from said adaptor-tagged target nucleic acid.
28. The method of claim 23, further comprising the following step between step (b) and step (c):
(b3) digesting said adaptor-tagged target nucleic acid with a restriction endonuclease that cuts said first and second adaptors but does not cut said target nucleic acid, to produce an adaptor-tagged target nucleic acid having a cut adaptor at each end.
29. The method of claim 23, further comprising, after step (c), the step of removing uncircularized target nucleic acid.
30. The method of claim 29, wherein removing uncircularized target nucleic acid comprises contacting said target nucleic acid with an exonuclease.
31. The method of claim 23, wherein step (d) is performed by mechanical shearing.
32. The method of claim 23, wherein said adaptors are biotinylated.
33. The method of claim 32, further comprising, after step (b), the step of purifying said adaptor-tagged target nucleic acid by affinity purification using a solid support coated with avidin or streptavidin.
34. The method of claim 32, further comprising, after step (d), the step of purifying said DNA construct by affinity purification using a solid support coated with avidin or streptavidin.
35. The method of claim 23, wherein said target nucleic acid is at least 50 kb, at least 20 kb, at least 10 kb or at least 5 kb.
36. The method of claim 23, wherein said target nucleic acid is between 50 kb and 3 kb, between 20 kb and 3 kb, or between 10 kb and 3 kb.
37. The method of claim 23, wherein said target nucleic acid is between at least 500 bp and 1 kb, between 1 kb and 3 kb, or between 500 bp and 3 kb.
38. The method of claim 23, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 10 kb.
39. The method of claim 23, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 20 kb.
40. The method of claim 23, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 40 kb.
41. The method of claim 23, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 5 kb.
42. The method of claim 23, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 3 kb.
43. The method of claim 23, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 1 kb.
44. The method of claim 23, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 500 bp.
45. The method of claim 23, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 300 bp.
46. A method for obtaining a DNA construct comprising two end regions of a target nucleic acid, said method comprising the steps of:
(a) fragmenting a large nucleic acid molecule to produce the target nucleic acid;
(b) ligating a capture element to said target nucleic acid to form a circular nucleic acid molecule;
(c) digesting said circular nucleic acid with a restriction endonuclease that cuts said target nucleic acid but does not cut said capture element, to produce said DNA construct comprising the two end regions of the target nucleic acid, said two end regions being separated by said capture element.
47. The method of claim 46, wherein said capture element is a nucleic acid comprising one member of a binding pair.
48. The method of claim 47, wherein said binding pair is selected from FLAG/anti-FLAG antibody, biotin/avidin and biotin/streptavidin.
49. The method of claim 47, wherein said capture element is biotinylated.
50. The method of claim 47, further comprising the step of enriching said DNA construct by affinity purification using a solid support comprising the second member of said binding pair.
51. The method of claim 46, wherein said large nucleic acid or said target nucleic acid is methylated with a methylase before step (b).
52. The method of claim 51, wherein said methylation prevents cleavage of the target nucleic acid by one or more restriction endonucleases.
53. The method of claim 46, further comprising, after step (b), the step of removing uncircularized target nucleic acid.
54. The method of claim 53, wherein removing uncircularized target nucleic acid comprises contacting said target nucleic acid with an exonuclease.
55. The method of claim 46, wherein said target nucleic acid is at least 50 kb, at least 20 kb, at least 10 kb or at least 5 kb.
56. The method of claim 46, wherein said target nucleic acid is between 50 kb and 3 kb, between 20 kb and 3 kb, or between 10 kb and 3 kb.
57. The method of claim 46, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 5 kb.
58. The method of claim 46, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 3 kb.
59. The method of claim 46, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 1 kb.
60. The method of claim 46, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 500 bp.
61. The method of claim 46, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 300 bp.
62. The method of claim 46, wherein said target nucleic acid is between at least 500 bp and 1 kb, between 1 kb and 3 kb, or between 500 bp and 3 kb.
63. The method of claim 46, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 10 kb.
64. The method of claim 46, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 20 kb.
65. The method of claim 46, wherein the size of said DNA construct comprising the two end regions of the target nucleic acid is less than 40 kb.
66. A method of circularizing a linear polynucleotide molecule, said method comprising:
A) providing a linear polynucleotide molecule in an aqueous solution;
B) emulsifying said aqueous solution in an oil; and
C) covalently joining the ends of the linear polynucleotide molecule,
thereby circularizing said linear polynucleotide molecule.
67. The method of claim 66, wherein said linear polynucleotide molecule is selected from single-stranded DNA, double-stranded DNA, single-stranded RNA and double-stranded RNA.
68. The method of claim 66, wherein the two ends of said linear polynucleotide molecule are compatible ends suitable for ligation.
69. The method of claim 66, wherein the ends of said linear polynucleotide molecule are joined by ligation.
70. The method of claim 66, wherein said oil is mixed with one or more surfactants.
71. A method of circularizing linear polynucleotide molecules, said method comprising:
A) providing a population of linear polynucleotide molecules in an aqueous solution;
B) emulsifying said aqueous solution in an oil to produce a plurality of aqueous microreactors; and
C) covalently joining the ends of a linear polynucleotide molecule in at least one microreactor,
thereby circularizing said linear polynucleotide molecule.
72. The method of claim 71, wherein a plurality of the microreactors contain exactly one polynucleotide molecule.
73. The method of claim 71, wherein the plurality of microreactors comprise a water-in-oil emulsion.
74. the method for claim 23, wherein in step (d), described ringed nucleus acid molecule is by digesting by fragmentation with restriction endonuclease.
75. each method in the claim 17,46 or 74, wherein said restriction endonuclease is MmeI.
76. the method for claim 75 wherein adds the carrier DNA that contains the MmeI restriction site in the MmeI digestive process.
77. the method for claim 76, wherein the amount mole number of the described carrier DNA of Jia Ruing surpasses annular nucleic acid.
78. the method for claim 76, wherein the MmeI site in MmeI enzyme and the carrier DNA exists with stoichiometric quantity.
79. the method for claim 26, described method is further comprising the steps of between step (b) and step (c):
(b4) with the described first and second hair clip connectors of cutting and do not cut the target nucleic acid of the described connector mark of endonuclease digestion of described target nucleic acid, to be created in the target nucleic acid that two ends all have the connector mark of the connector that cut.
80. the method for claim 79, wherein said hair clip connector has at least one Hypoxanthine deoxyriboside in every chain of its double stranded region, and wherein said endonuclease is endonuclease V.
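
As an illustration of claims 71 and 72, the fraction of microreactors that hold exactly one polynucleotide molecule can be estimated with a short calculation. The sketch below assumes (the claims do not state this) that molecules are distributed among microreactors independently, so that occupancy follows a Poisson distribution. The function names and the example mean loading are hypothetical and are not taken from the patent.

```python
import math


def occupancy_probability(mean_per_reactor: float, k: int) -> float:
    """Poisson probability that one microreactor holds exactly k molecules.

    mean_per_reactor is the assumed average number of molecules per
    microreactor (total molecules divided by total microreactors).
    """
    if mean_per_reactor < 0 or k < 0:
        raise ValueError("mean_per_reactor and k must be non-negative")
    return math.exp(-mean_per_reactor) * mean_per_reactor ** k / math.factorial(k)


def single_fraction_of_occupied(mean_per_reactor: float) -> float:
    """Among non-empty microreactors, the fraction holding exactly one molecule."""
    if mean_per_reactor <= 0:
        raise ValueError("mean_per_reactor must be positive")
    p_empty = occupancy_probability(mean_per_reactor, 0)
    p_single = occupancy_probability(mean_per_reactor, 1)
    return p_single / (1.0 - p_empty)


if __name__ == "__main__":
    # Hypothetical loading: on average 0.1 molecule per microreactor.
    mean = 0.1
    print(f"P(empty)    = {occupancy_probability(mean, 0):.4f}")
    print(f"P(single)   = {occupancy_probability(mean, 1):.4f}")
    print(f"single among occupied = {single_fraction_of_occupied(mean):.4f}")
```

Under this assumption, a mean loading well below one molecule per microreactor leaves most microreactors empty, but nearly all occupied microreactors then contain a single molecule, so the two ends available for joining in step c) belong to the same molecule.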
CNA2006800281812A 2005-06-06 2006-06-06 Paired end sequencing Pending CN101351552A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US68804205P 2005-06-06 2005-06-06
US60/688,042 2005-06-06
US60/717,964 2005-09-16
US60/771,818 2006-02-08

Publications (1)

Publication Number Publication Date
CN101351552A true CN101351552A (en) 2009-01-21

Family

ID=40269690

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2006800281812A Pending CN101351552A (en) 2005-06-06 2006-06-06 Paired end sequencing

Country Status (1)

Country Link
CN (1) CN101351552A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012079486A1 (en) * 2010-12-16 2012-06-21 深圳华大基因科技有限公司 Method of preparing dna sample for sequencing and use thereof
CN102534811A (en) * 2010-12-16 2012-07-04 深圳华大基因科技有限公司 DNA (deoxyribonucleic acid) library and preparation method thereof, as well as DNA sequencing method and device
CN102212612A (en) * 2011-03-23 2011-10-12 上海美吉生物医药科技有限公司 Constructing method of double-end library for high throughput 454 sequencing
CN103890175A (en) * 2011-08-31 2014-06-25 学校法人久留米大学 Method for exclusive selection of circularized DNA from monomolecular DNA when circularizing DNA molecules
CN103890175B (en) * 2011-08-31 2015-12-09 学校法人久留米大学 The method of the cyclized DNA formed by unit molecule is only selected in the cyclisation of DNA molecular
CN107109478A (en) * 2014-09-11 2017-08-29 伊卢米纳剑桥有限公司 The method for obtaining paired end sequencing information
CN113279067A (en) * 2014-09-24 2021-08-20 赛科恩斯生物科学公司 Method for generating double-stranded adapters
CN105986020A (en) * 2015-02-11 2016-10-05 深圳华大基因研究院 Method and device for constructing sequencing library
CN105986020B (en) * 2015-02-11 2019-08-09 深圳华大智造科技有限公司 Construct the method and device of sequencing library
CN105349646A (en) * 2015-11-13 2016-02-24 中国农业科学院蔬菜花卉研究所 A BAC end sequencing method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1128725

Country of ref document: HK

C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20090121

REG Reference to a national code

Ref country code: HK

Ref legal event code: WD

Ref document number: 1128725

Country of ref document: HK