An embodiment of a method for obtaining a DNA construct comprising two end regions of a target nucleic acid in an in vitro reaction is described that comprises the steps of: fragmenting a large nucleic acid molecule to produce a target nucleic acid molecule; ligating a recombination adaptor element to each end of the target nucleic acid molecule to produce an adapted target nucleic acid molecule; exposing the adapted target nucleic acid to a site specific recombinase to produce a circular nucleic acid product and a linear nucleic acid product from the adapted target nucleic acid, wherein the circular nucleic acid product comprises the target nucleic acid molecule; and fragmenting the circular nucleic acid product to produce a template nucleic acid molecule comprising a sequence region from each end of the target nucleic acid molecule.


Paired-End Sequencing Method
FIELD OF THE INVENTION
[0001] The present invention relates to the fields of nucleic acid sequencing, genome sequencing, and the assembly of sequencing results into contiguous sequences.
[0002] BACKGROUND OF THE INVENTION
[0003] One approach to sequencing a large target nucleic acid (for example, a human genome) is shotgun sequencing. In shotgun sequencing, the target nucleic acid is fragmented or subcloned to produce a series of overlapping nucleic acid fragments, and the sequences of these fragments are determined. From the overlaps between the fragment sequences and knowledge of each fragment's sequence, the complete target nucleic acid sequence can be constructed.
[0004] One disadvantage of shotgun sequencing is that assembly can be very difficult if the target nucleic acid sequence contains many small repeats (tandem repeats or inverted repeats). The inability to assemble genomic sequence across repeat regions leads to gaps in the assembled sequence. Thus, after the initial sequence assembly, the gaps in sequence coverage must be filled and the ambiguities in the assembly must be resolved.
[0005] One way to address these gaps is to sequence larger clones or fragments, because these larger fragments may be long enough to span the repeat regions. However, sequencing large nucleic acid fragments is difficult and time-consuming on existing sequencers.
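As an illustration of the overlap-based reconstruction described in [0003], the following minimal Python sketch greedily merges reads that share a suffix-prefix overlap. The function names, the `min_len` threshold, and the toy reads are assumptions made for this example; they are not part of the disclosed method, and real assemblers are considerably more elaborate.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b` (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of contigs with the longest overlap (hypothetical helper)."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge; more than one contig means a gap
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)] + [merged]


# Three overlapping reads collapse into a single contig.
print(greedy_assemble(["ATGCGT", "CGTACC", "ACCGGA"]))  # ['ATGCGTACCGGA']
# A read with no overlap stays separate, leaving a gap in the assembly.
print(greedy_assemble(["ATGCGT", "CGTACC", "TTTGGA"]))  # ['TTTGGA', 'ATGCGTACC']
```

When no read bridges two regions, as in the second call, the assembly is left as separate contigs, which corresponds to the gaps discussed in [0004].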
[0006] Another way to span gaps in a sequence is to determine the sequences of both ends of a large fragment. In contrast to the single sequence read obtained from one end of a shotgun-sequenced fragment, a pair of sequence reads from the two ends has a known spacing and orientation. Using relatively long fragments also helps in assembling sequences that contain interspersed repetitive elements. This type of approach (Smith, M.W. et al., Nature Genetics 7:40-47 (1994)) is known in the art as paired-end sequencing. The present invention includes novel methods, systems, and compositions for paired-end sequencing and other nucleic acid techniques.
[0007] SUMMARY OF THE INVENTION
[0008] One embodiment of the invention relates to a method for obtaining, in an in vitro reaction, a DNA construct comprising the two end regions of a target nucleic acid, where the target nucleic acid may be a large segment derived from the genome of an organism.
The method comprises the following steps:
[0009] An embodiment of a method for obtaining a DNA construct comprising the two end regions of a target nucleic acid in an in vitro reaction is described, the method comprising the steps of: fragmenting a large nucleic acid molecule to produce a target nucleic acid molecule; ligating a recombination adaptor element to each end of the target nucleic acid molecule to produce an adapted target nucleic acid molecule; exposing the adapted target nucleic acid to a site-specific recombinase to produce a circular nucleic acid product and a linear nucleic acid product from the adapted target nucleic acid, wherein the circular nucleic acid product comprises the target nucleic acid molecule; and fragmenting the circular nucleic acid product to produce a template nucleic acid molecule comprising a sequence region from each end of the target nucleic acid molecule.
[0010] In some implementations, the method further comprises the step of removing non-circular molecules using an exonuclease.
In addition, in some implementations, the method further comprises the steps of: adding a plurality of circular carrier DNA molecules to the circular nucleic acid product; fragmenting the circular nucleic acid product and the carrier DNA molecules to produce template molecules and a plurality of linear carrier molecules; measuring the efficiency of fragmentation from the template molecules and the linear carrier molecules; amplifying the template molecules to produce a population comprising a plurality of substantially identical copies, wherein the linear carrier molecules are not amplifiable; and sequencing the population to generate sequence data comprising the sequence composition of the template nucleic acid.
[0011] The method of the invention can be carried out simultaneously on a large number of target DNA fragments to produce a library of DNA constructs that contain the ends of large DNA fragments. One advantage of the invention is that the library can be constructed in vitro without the use of prokaryotic or eukaryotic host cells.
[0012] Accordingly, the invention relates to a method for obtaining a DNA construct comprising the two end regions of a target nucleic acid in an in vitro reaction, the method comprising the steps of:
[0013] - fragmenting a nucleic acid to produce a target nucleic acid molecule;
[0014] - ligating a recombination adaptor element to each end of the target nucleic acid molecule to produce an adapted target nucleic acid molecule;
[0015] - exposing the adapted target nucleic acid to a site-specific recombinase to produce a circular nucleic acid product and a linear nucleic acid product from the adapted target nucleic acid, wherein the circular nucleic acid product comprises the target nucleic acid molecule; and
[0016] - fragmenting the circular nucleic acid product to produce a template nucleic acid molecule comprising a sequence region from each end of the target nucleic acid molecule.
[0017] The nucleic acid to be fragmented may consist of very large molecules. For example, the nucleic acid may be genomic DNA that is substantially unsheared or has not previously been fragmented. In this case, the new method is particularly suitable for target nucleic acid molecules with a length selected from at least 3 kb, at least 8 kb, at least 10 kb, at least 20 kb, at least 50 kb, and at least 100 kb.
[0018] A prominent example of a site-specific recombinase that can be used in the context of the invention is Cre recombinase.
[0019] A preferred method of fragmenting the circular nucleic acid product comprises a nebulization step.
Preferably, the step of fragmenting the circular nucleic acid product further comprises a first cleavage of the circular nucleic acid product with a type II restriction endonuclease and a second cleavage by nebulization, wherein the type II restriction endonuclease cuts at a restriction site in the hybrid adaptor region of the circular nucleic acid product and generates a short sequence region from the target nucleic acid, while nebulization generates a long sequence region from the target nucleic acid. For example, the type II restriction endonuclease comprises MmeI and the short sequence region comprises a sequence length of 20 bp.
[0020] In a first embodiment, the method further comprises a step of removing non-circular molecules after the step of exposing the adapted target nucleic acid to the site-specific recombinase. The non-circular molecules preferably comprise the linear nucleic acid product and adaptor dimer products, wherein an adaptor dimer product results from two recombination adaptor elements ligating to each other. The method likewise preferably comprises the step of removing the non-circular molecules using at least one exonuclease.
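The MmeI cleavage described in [0019] can be illustrated in silico. The sketch below assumes a simplified single-strand model in which the hybrid adaptor carries an outward-facing MmeI recognition site (5'-TCCRAC-3') at each end, so that the 20 bases of target sequence immediately flanking the adaptor are retained as short tags. The function `paired_tags`, the toy adaptor, and the toy junction are hypothetical; the 2-nt 3' overhang that MmeI leaves on real DNA is ignored.

```python
import re

MMEI_SITE = re.compile(r"TCC[AG]AC")  # MmeI recognition sequence, 5'-TCCRAC-3'
TAG_LEN = 20                          # short sequence region length given in [0019]


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def paired_tags(junction: str, adaptor: str, tag_len: int = TAG_LEN) -> tuple[str, str]:
    """Return the target-derived tags to the left and right of `adaptor` (hypothetical helper).

    Assumption: the adaptor ends with an MmeI site reading into the right-hand
    target end and starts with the reverse complement of an MmeI site reading
    into the left-hand target end.
    """
    if not MMEI_SITE.fullmatch(adaptor[-6:]):
        raise ValueError("adaptor lacks an outward-facing MmeI site at its right end")
    if not MMEI_SITE.fullmatch(revcomp(adaptor[:6])):
        raise ValueError("adaptor lacks an outward-facing MmeI site at its left end")
    start = junction.find(adaptor)
    if start == -1:
        raise ValueError("adaptor not found in junction")
    end = start + len(adaptor)
    if start < tag_len or len(junction) - end < tag_len:
        raise ValueError("not enough flanking target sequence for a full tag")
    return junction[start - tag_len:start], junction[end:end + tag_len]


# Toy junction: the two target ends (C-run and G-run) flank the adaptor.
adaptor = "GTTGGA" + "ACGTACGT" + "TCCAAC"
junction = "AAAAA" + "C" * 20 + adaptor + "G" * 20 + "TTTTT"
left_tag, right_tag = paired_tags(junction, adaptor)
print(left_tag, right_tag)
```

The two returned tags correspond to the short sequence regions from the two ends of the target; in the double-cleavage variant of [0019], nebulization would supply the longer region instead.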
[0021] The method preferably further comprises the following steps:
[0022] - adding a plurality of circular carrier DNA molecules to the circular nucleic acid product;
[0023] - fragmenting the circular nucleic acid product and the carrier DNA molecules to produce template molecules and a plurality of linear carrier molecules;
[0024] - measuring the efficiency of fragmentation from the template molecules and the linear carrier molecules;
[0025] - amplifying the template molecules to produce a population comprising a plurality of substantially identical copies, wherein the linear carrier molecules are not amplifiable; and
[0026] - sequencing the population to generate sequence data comprising the sequence composition of the template nucleic acid.
[0027] It is particularly preferred that the circular carrier molecules comprise pUC19. It is likewise particularly preferred that the circular carrier molecules comprise damaged DNA, wherein the damaged DNA is not amplifiable. The damage may be of a type selected from UV damage, alkylation/methylation, X-ray damage, hydrolysis, and oxidative damage.
[0028] In a second embodiment, which is compatible with and can be combined with the first embodiment disclosed above, the method of the invention further comprises the steps of:
[0029] - amplifying the template nucleic acid to produce a population comprising a plurality of substantially identical copies; and
[0030] - sequencing the population to generate sequence data comprising the sequence composition of the template nucleic acid.
[0031] The method preferably further comprises the step of ligating a second set of adaptor elements to the template nucleic acid molecule, wherein the second set of adaptor elements comprises a first primer element and a second primer element, and wherein the amplification step uses the first primer element and the sequencing step uses the second primer element.
[0032] The sequence composition of the template nucleic acid also preferably comprises the sequence composition of each sequence region from the ends of the target molecule.
[0033] In a third embodiment, which is compatible with and can be combined with the first and second embodiments disclosed above, the recombination adaptor elements comprise a first recombination adaptor element and a second recombination adaptor element, both of which comprise a directional element.
[0034] Preferably, the circular nucleic acid product and the linear nucleic acid product are produced when the directional elements in the first and second recombination adaptor elements are in an identical directional relationship.
Thus, the first and second recombination adaptor elements may each comprise a blunt end that ligates to the target nucleic acid molecule in an orientation that promotes the identical directional relationship of the directional elements.
[0035] The first and second recombination adaptor elements also preferably comprise an overhang that prevents the formation of adaptor concatemers. The directional element also preferably comprises a lox sequence element. The first and second recombination adaptor elements also preferably comprise palindromic sequence elements flanking both ends of the directional element.
[0036] In a fourth embodiment, which is compatible with and can be combined with the first and second embodiments disclosed above, the circular nucleic acid product comprises a first hybrid recombination adaptor and the linear nucleic acid product comprises a second hybrid recombination adaptor, wherein the first and second hybrid recombination adaptors comprise elements derived from the ligated recombination adaptors.
[0037] The template nucleic acid preferably comprises the first hybrid recombination adaptor located between the end sequence regions. Highly preferably, the template nucleic acid comprises at least one enrichment tag bound to the first hybrid recombination adaptor. The enrichment tag may be, for example, a biotin tag.
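The orientation rule in [0034] can be made concrete with a small string model. The sketch below assumes the directional element is the canonical 34-bp loxP site and treats an adapted target as a plain top-strand string; the function `cre_recombine` and the toy flank and target sequences are hypothetical. Two lox sites in the same orientation yield a circular product carrying the target plus a linear product, whereas inverted sites only invert the intervening segment. The model does not distinguish the hybrid adaptors of [0036] from the input adaptors.

```python
import re

# Canonical loxP: two 13-bp inverted repeats around an asymmetric 8-bp spacer.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def cre_recombine(molecule: str) -> dict:
    """Model Cre action on a linear molecule carrying exactly two lox sites (hypothetical helper)."""
    fwd = [m.start() for m in re.finditer(LOXP, molecule)]
    rev = [m.start() for m in re.finditer(revcomp(LOXP), molecule)]
    n = len(LOXP)
    if len(fwd) + len(rev) != 2:
        raise ValueError("expected exactly two lox sites")
    if len(fwd) == 2 or len(rev) == 2:
        # Same orientation: excision. The circle (written here linearised at the
        # lox site) holds one lox site plus everything between the two sites.
        a, b = fwd or rev
        return {"circular": molecule[a:b], "linear": molecule[:a] + molecule[b:]}
    # Opposite orientation: the segment between the sites is inverted; no circle forms.
    a, b = sorted((fwd[0], rev[0]))
    inner = molecule[a + n:b]
    return {"circular": None, "linear": molecule[:a + n] + revcomp(inner) + molecule[b:]}


left, target, right = "GGGG", "ACGTTGCA" * 3, "CCCC"
same = cre_recombine(left + LOXP + target + LOXP + right)
print(same["circular"] == LOXP + target)   # True: the circle carries the target
print(same["linear"] == left + LOXP + right)  # True: adaptor flanks joined by one lox site
```

In this model only the same-orientation case produces a circle that survives exonuclease removal of non-circular molecules as described in [0020], which is why adaptor design that favours the identical directional relationship matters.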
[0038] In addition, the invention relates to a method for obtaining a plurality of DNA constructs comprising the two end regions of target nucleic acids in an in vitro reaction, the method comprising the steps of:
[0039] - fragmenting large nucleic acid molecules to produce a plurality of target nucleic acid molecules;
[0040] - ligating a recombination adaptor element to each end of the target nucleic acid molecules to produce a plurality of adapted target nucleic acid molecules;
[0041] - exposing the adapted target nucleic acid molecules to a site-specific recombinase to produce a plurality of circular nucleic acid products and a plurality of linear nucleic acid products from the adapted target nucleic acid molecules, wherein the circular nucleic acid products comprise the target nucleic acid molecules; and
[0042] - fragmenting the circular nucleic acid products to produce a plurality of template nucleic acid molecules comprising a sequence region from each end of the target nucleic acid molecules.
[0043] Furthermore, the invention provides a kit for carrying out the methods disclosed above, the kit comprising:
[0044] - a plurality of recombination adaptor elements; and
[0045] - a site-specific recombinase.
[0046] The site-specific recombinase is preferably Cre recombinase.
[0047] Such a kit may in particular comprise:
[0048] - a plurality of recombination adaptor elements;
[0049] - a site-specific recombinase, for example Cre recombinase;
[0050] - an exonuclease; and
[0051] - circular carrier DNA, for example pUC19 DNA.
[0052] BRIEF DESCRIPTION OF THE DRAWINGS
[0053] The following detailed description, which is given by way of example and is not intended to limit the invention to the specific embodiments described, may be understood in conjunction with the accompanying drawings, which are incorporated herein by reference, in which:
[0054] Figure 1 shows a schematic of one embodiment of the paired-end sequencing strategy. The numeric labels indicate the origin of the nucleic acids. "101" denotes one flanking region of the capture element, for example as shown on the left of Figure 3A. "102" denotes the second flanking region of the capture element, for example as shown on the right of Figure 3A. "103" denotes the capture element. "104" denotes the fragmented (and optionally size-fractionated) starting nucleic acid. "105" denotes the separator element. "106" denotes the polymerase.
[0055]-[0067] Figure 2 shows a schematic of a second embodiment of the paired-end sequencing strategy. Figure 3 shows the sequences and design of the capture fragments.
The sequence identifiers are as follows:
Paired-end capture fragment product: SEQ ID NO: 1
Oligo 1: SEQ ID NO: 2
Oligo 2: SEQ ID NO: 3
Oligo 3: SEQ ID NO: 4
Oligo 4: SEQ ID NO: 5
Paired-end capture fragment product (type IIS, MmeI): SEQ ID NO: 6
Short-adaptor paired-end capture fragment: SEQ ID NO: 7
Short-adaptor paired-end capture fragment (type IIS, MmeI): SEQ ID NO: 8
Figure 4 shows one embodiment of an RE fragment.
Figure 5 shows another embodiment of an RE fragment.
Figure 6 shows a paired-end read approach using hairpin adaptors. The hairpin adaptor has the following sequence:
[0068]-[0075]
      A
    A/ \AAACCCG---GAATTC---AAACCCTTTCGGT---TCCAAC-3' OH
    |   |||||||   ||||||   |||||||||||||   ||||||
    T\ /TTTGGGC---CTTAAG---TTTGGGAAAGCCA---AGGTTG-5' PO4
      T
(SEQ ID NO: 27)
[0076] The hairpin adaptor is one continuous nucleic acid sequence, shown above divided into four regions for the purpose of description. From left to right, the four regions are the hairpin region, a restriction endonuclease recognition site, a biotinylated region, and a type IIS restriction endonuclease recognition site. "601" denotes the hairpin adaptor. "603" denotes genomic DNA. "Met" denotes methylated DNA. "602" denotes a hairpin adaptor dimer.
"604" denotes a hairpin adaptor cleaved by the restriction endonuclease. "605" denotes two hairpin adaptors that have been cleaved by the restriction endonuclease and religated. "SA" denotes streptavidin beads. "Bio" denotes biotin (for example, biotinylated DNA).
[0077] Figure 7 shows a modification of the paired-end method.
[0078] Figure 8 shows a paired-end read approach with overhang adaptors.
[0079] Figure 9 shows "tag primed" double-ended sequencing, a method for sequencing the products of the invention.
[0080] Figure 10 shows adaptor ligation to form a circle.
[0081] Figure 11 shows ssDNA-based circularization.
[0082] Figure 12 shows a schematic of another embodiment of the paired-end sequencing strategy: Paired-Reads PET Random Fragmentation. SPRI refers to solid-phase reversible immobilization.
[0083] Figure 13 shows Paired-Reads PET Random Fragmentation sequencing data obtained from sequencing E. coli K12.
[0084] Figure 14 shows various methods of cleaving double-stranded DNA with E. coli endonuclease V. The boxed nucleotide "I" denotes deoxyinosine.
[0085] Figure 14A shows a method in which the nucleotide sequence of the double-stranded DNA directs double-strand cleavage by E. coli endonuclease V so as to produce 3' single-stranded palindromic overhangs.
Note that the 3' single-stranded overhangs contain deoxyinosine residues.
[0086] Figure 14B shows a method in which the nucleotide sequence of the double-stranded DNA directs double-strand cleavage by E. coli endonuclease V so as to produce 3' single-stranded non-palindromic overhangs. Note that the 3' single-stranded overhangs contain deoxyinosine residues.
[0087] Figure 14C shows a method in which the nucleotide sequence of the double-stranded DNA directs double-strand cleavage by E. coli endonuclease V so as to produce 5' single-stranded palindromic overhangs. Note that the 5' single-stranded overhangs do not contain deoxyinosine residues.
[0088] Figure 14D shows a method in which the nucleotide sequence of the double-stranded DNA directs double-strand cleavage by E. coli endonuclease V so as to produce 5' single-stranded non-palindromic overhangs. Note that the 5' single-stranded overhangs do not contain deoxyinosine residues.
[0089] Figure 14E shows a method in which the nucleotide sequence of the double-stranded DNA directs double-strand cleavage by E. coli endonuclease V so as to produce blunt ends.
[0090] Figure 15 shows a schematic of another embodiment of the paired-end sequencing strategy, in which hairpin adaptors containing deoxyinosine on opposite strands (deoxyinosine hairpin adaptors) are cleaved on both strands by E. coli endonuclease V.
[0091] Figure 16 shows the distribution of paired-read distances obtained from sequencing E. coli K12 genomic DNA using the deoxyinosine hairpin adaptor method described in Figure 15.
[0092] Figure 17 shows a schematic of another embodiment of the paired-end sequencing method of the invention. The nucleotide sequences of the hairpin adaptor, the paired-end adaptors ("A" and "B"), and the PCR primers "F-PCR" and "R-PCR" are shown in Figure 18. Each paired-end adaptor has double-stranded and single-stranded portions as shown in Figure 18. "Bio" denotes biotin. "Met" denotes methylated bases. "SA beads" denotes streptavidin-coated microparticles. "EcoRI" and "MmeI" denote the recognition sites of the restriction endonucleases EcoRI and MmeI, respectively.
[0093] Figure 18 shows the nucleotide sequences and modifications of the adaptor and primer oligonucleotides shown in Figure 17. Figure 18A shows the hairpin adaptor sequence. "iBiodT" denotes internal biotin-labeled deoxythymidine. "Bio" denotes biotin. "EcoRI" and "MmeI" denote the recognition sites of the restriction endonucleases EcoRI and MmeI, respectively.
[0094] Figure 18B shows the nucleotide sequences of the paired-end adaptors and PCR primers. Each paired-end adaptor ("A" and "B") is produced by annealing two single-stranded oligonucleotides: "A top strand" and "A bottom strand", or "B top strand" and "B bottom strand". The 5' ends of the polynucleotide sequences shown in Figure 18B are not phosphorylated.
[0095] Figure 19 shows a schematic of one embodiment of a method for ligating polynucleotides in a water-in-oil emulsion.
[0096] Figure 20 shows plots of the depth of coverage of E. coli K12 genomic DNA derived from paired-end sequencing data obtained with or without carrier DNA containing MmeI sites.
[0097] Figure 21 shows a schematic of one embodiment of a method for a recombination-based paired-end strategy.
[0098] Figure 22 shows one embodiment of the adaptors used in the recombination-based strategy of Figure 21 and the adaptor products produced from them. SEQ ID NOs: 57-64 are described herein in order of appearance.
[0099] Figure 23 shows a schematic of the products of the recombination-based strategy of Figure 21 according to adaptor directionality.
[0100] Figure 24 shows the distribution of paired-read distances obtained from E. coli K12 genomic DNA at least in part according to the recombination-based method described in Figure 21.
[0101] Figure 25 shows a schematic of the advantages provided by sequence information obtained from long paired-end fragments generated by the recombination-based method described in Figure 21.
[0102] DETAILED DESCRIPTION OF THE INVENTION
[0103] Unless otherwise stated, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention belongs. Although many methods and materials similar or equivalent to those described herein can be used in the practice of the invention, the preferred materials and methods are described herein.
[0104] The invention relates to cost-effective, rapid methods for isolating and sequencing both ends of large nucleic acid fragments.
The methods are rapid and amenable to automation, and allow large DNA fragments to be sequenced and linked.
[0105] Paired-end sequencing has several important advantages over conventional clone-by-clone shotgun sequencing and is in fact complementary to it. Chief among these advantages is the ability to rapidly generate scaffolding for large genomes, even when the genome is interspersed with repetitive elements. The methods of the invention can be used to generate, in an in vitro reaction, a library of DNA fragments that contain the ends of larger DNA fragments. By using paired distance spacings between these ends of at least 10 kb or more, the methods of the invention can even be applied to assemble an entire genome scaffold with minimal sequencing effort.
[0106] First Method
[0107] In one embodiment, paired-end sequencing may be carried out in the following steps:
[0108] Step 1A
[0109] The starting material can be any nucleic acid, including, for example, genomic DNA, cDNA, RNA, PCR products, episomes, and the like. Although the methods of the invention are especially effective for long stretches of starting nucleic acid, the invention is also applicable to small nucleic acids such as cosmids, plasmids, small PCR products, mitochondrial DNA, and the like.
[0110] The DNA may be from any source.
For example, the DNA may come from the genome of an organism whose DNA sequence is unknown or incompletely known. As another example, the DNA may come from the genome of an organism whose DNA sequence is known. Sequencing genomic DNA of known sequence allows researchers to collect data on genomic polymorphisms and to correlate genotypes with disease.
[0111] The starting nucleic acid may be of known size or known size range. For example, the starting material may be a cDNA library or genomic library in which the average insert size and distribution are known.
[0112] Alternatively, the starting nucleic acid may be fragmented (Figure 1A) by any of a number of common methods, including nebulization, sonication, hydrodynamic shearing (HydroShear), ultrasonic fragmentation, enzymatic cleavage (for example, DNase treatment (including limited DNase treatment), RNase treatment (including limited RNase treatment), and digestion with restriction endonucleases), use of prefragmented libraries (for example, cDNA libraries), chemically induced fragmentation (for example, with NaOH), heat-induced fragmentation, and transposon-mediated mutagenesis, which can introduce cleavage sites, such as restriction endonuclease cleavage sites, throughout the DNA sample. See Goryshin IY and Reznikoff WS, J Biol Chem. 1998 Mar 27; 273(13):7367-74; Reznikoff WS et al., Methods Mol Biol. 2004; 260:83-96; Oscar R. et al., Journal of Bacteriology, April 2001, pp. 2384-2388, Vol. 183, No. 7; Pelicic, V. et al., Journal of Bacteriology, August 2000, pp. 5391-5398, Vol. 182.
[0113] Some fragmentation methods (for example, nebulization) can produce populations of target DNA fragments whose sizes differ by only a factor of two. Other fractionation methods (for example, restriction endonuclease digestion) produce a wider size range. If large nucleic acid fragments are required, still other methods (for example, hydrodynamic shearing) may be advantageous. In hydrodynamic shearing (Genomic Solutions, Ann Arbor, MI, USA), DNA in solution is passed through a tube with an abrupt constriction. As the solution approaches the constriction, the fluid accelerates to maintain the volumetric flow rate through the narrower region. During this acceleration, drag forces stretch the DNA until it snaps. The DNA continues to fragment until the pieces are too short for the shearing forces to break chemical bonds. The flow rate of the fluid and the size of the constriction determine the final DNA fragment size. Other methods for preparing starting nucleic acid can be found in International Patent Application No. WO04/070007, which is incorporated by reference in its entirety.
[0114] Depending on the fragmentation method used, the DNA ends may require polishing. That is, the ends of the double-stranded DNA may need to be treated to make them blunt and suitable for ligation. This step will vary with the fragmentation method, in ways known in the art. For example, mechanically sheared DNA may be polished with Bal31 to trim overhanging sequence, and filled in with a polymerase such as Klenow or T4 polymerase and dNTPs to produce blunt ends.

[0115] Step 1B

[0116] When fragment sizes vary more than desired, the nucleic acid fragments may be size fractionated to reduce the variation.

[0117] Size fractionation is an optional step that can be performed by a variety of methods known in the art. Methods for size fractionation include gel methods (e.g., pulsed-field gel electrophoresis), sedimentation through a sucrose or cesium chloride gradient, and size exclusion chromatography (gel permeation chromatography). The choice of a particular size range will depend on the length of the region to be spanned by paired-end sequencing.

[0118] A preferred technique for size fractionation is gel electrophoresis (see Figure 1B). In a preferred embodiment, the size-fractionated DNA fragments have a size distribution within 25% of each other. For example, a 5 kb size fraction may comprise fragments of 5 kb +/- 1 kb (i.e., 4 kb to 6 kb), and a 50 kb size fraction may comprise fragments of 50 kb +/- 10 kb (i.e., 40 kb to 60 kb).
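
The size windows in paragraph [0118] can be illustrated with a short sketch. This is not part of the disclosed method; the function names and the `tolerance` parameter are hypothetical, and the default of 0.20 is chosen only because it reproduces the 5 kb +/- 1 kb and 50 kb +/- 10 kb examples above.

```python
def size_window(nominal_bp, tolerance=0.20):
    """Return the (low, high) bounds, in bp, of a size fraction.

    `tolerance` is an illustrative assumption; 0.20 reproduces the
    5 kb +/- 1 kb and 50 kb +/- 10 kb examples in the text.
    """
    delta = round(nominal_bp * tolerance)
    return nominal_bp - delta, nominal_bp + delta


def select_fraction(fragment_lengths, nominal_bp, tolerance=0.20):
    """Keep only the fragment lengths that fall inside the size window."""
    low, high = size_window(nominal_bp, tolerance)
    return [n for n in fragment_lengths if low <= n <= high]
```

For instance, `select_fraction(lengths, 5000)` would retain only fragments between 4 kb and 6 kb, mimicking the excision of a gel band.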
[0119] Step 1C

[0120] In this step, a "capture element" is prepared. The capture element is a linear double-stranded nucleic acid that may have single-stranded or double-stranded ends for ligation to the nucleic acid fragments from the previous step. The capture element may be propagated as a circular nucleic acid (e.g., the plasmid depicted in Figure 1C) containing forward and reverse adaptor ends (drawn as the thick-lined region of the circle in Figure 1C). The capture element is used after this circular plasmid has been cleaved. The adaptor ends contain nucleic acid sequences that can serve as hybridization sites for potential PCR primers and sequencing primers in subsequent steps.

[0121] Between the two adaptor ends, the capture element may contain additional elements, such as restriction endonuclease recognition and/or cleavage sites, antibiotic resistance markers, prokaryotic or eukaryotic origins of replication, or combinations of these elements. Examples of such antibiotic resistance markers include, without limitation, genes conferring resistance to ampicillin, tetracycline, neomycin, kanamycin, streptomycin, bleomycin, zeocin, chloramphenicol, and the like. Prokaryotic origins of replication may include, among others, OriC and OriV. Eukaryotic origins of replication may include, but are not limited to, autonomously replicating sequences (ARS).
In addition, the capture element may contain restriction endonuclease recognition and/or cleavage sites (preferably unique, rare sites) that can be used to digest the subsequent nucleic acid product (step 1L) into small fragments that are amplifiable by PCR. The capture element may also carry a label or tag, such as biotin, to facilitate purification or enrichment of the nucleic acid used for paired-end sequencing.

[0122] Step 1D

[0123] The capture element is linearized using known techniques, such as restriction endonuclease digestion (blunt or sticky ends may be used for different fragment preparations; see below and Figure 1D). To prevent concatemer formation (i.e., multiple capture elements ligating to one another), the capture element may be dephosphorylated, or modified with topoisomerase for TA cloning.

[0124] Step 1E

[0125] The capture element is ligated to the fragments (or size-fractionated fragments) from step 1A or 1B to form a circular nucleic acid comprising one capture element and one fragment of target DNA (Figure 1E). The capture element is joined to the target DNA by known methods, for example by DNA ligase or by a topoisomerase cloning strategy.

[0126] Step 1F

[0127] The preceding steps yield a collection of capture elements ligated to DNA fragments that may be of considerable size. This step removes the large internal region of the target DNA fragment, producing an insert whose size is better suited to automated DNA sequencing (Figure 1F).
[0128] In this step, the captured genomic DNA (i.e., the circular nucleic acid produced in step 1E) is digested with one or more restriction endonucleases that may have one or more cleavage sites within the genomic DNA. In general, any restriction endonuclease can be used for "internal cleavage" as long as it does not cut within the capture element. Internal cleavage means cleavage that occurs inside the target DNA and does not cut the capture element. The capture element can be designed to lack cleavage sites for the selected restriction endonucleases, and the internal-cleavage enzymes are chosen accordingly. Restriction endonucleases and their use are well known in the art and are readily applied to the method of the invention. In addition, combinations of several restriction endonucleases, each confined to internal cleavage, may be used to further reduce the size of the target DNA fragment.

[0129] In a preferred embodiment, the genomic DNA is cleaved by one or more of these restriction endonucleases to within 50-150 bases of the capture element.

[0130] Step 1G

[0131] In this step, a "spacer element", which is a double-stranded nucleic acid of known sequence, is ligated between the ends of the digested genomic material from the previous step to form a circular nucleic acid (Figure 1G). The spacer element serves two purposes. First, the spacer element may contain a priming site for rolling circle amplification of the small circle (see below, step 1I).
Second, because the sequence of the spacer element is known, it can serve as an identifier marking each of the paired genomic ends (enabling trimming and straightforward software analysis of the joined ends). That is, during subsequent sequencing of the genomic fragment, the sequence of the spacer element signals that an entire genomic fragment has been sequenced. Such spacer elements may also contain additional elements, such as restriction endonuclease recognition and/or cleavage sites, antibiotic resistance markers, prokaryotic or eukaryotic origins of replication, or combinations of these elements. Although elements such as antibiotic resistance markers and origins of replication may optionally be present, one advantage of the method of the invention is that it does not require a host cell (e.g., E. coli) for cloning, amplification, or other manipulation of the nucleic acid. The spacer element may also be biotinylated, or otherwise labeled or tagged, to facilitate purification or enrichment of the nucleic acid used for paired-end sequencing.

[0132] Step 1H

[0133] The circular nucleic acid produced in the previous step (i.e., the small circle) is rendered single-stranded to produce single-stranded nucleic acid. This can be done with standard DNA denaturation techniques, by changing the salt, temperature, or pH of the solution. Other DNA denaturation techniques are known to those skilled in the art.
After denaturation, the DNA circles derived from the same small circle may remain interlinked, but this does not affect the method of the invention (Figure 1H).

[0134] Step 1I

[0135] A primer is annealed to the spacer element, which contains a sequence to which the primer can anneal. The spacer sequence thus serves as the initiation region for rolling circle amplification (Figure 1I).

[0136] Step 1J

[0137] The sample is amplified by rolling circle amplification to produce a long single-stranded product (Figure 1J). One advantage of this rolling circle amplification step is that elements lacking a spacer element will not be amplified, and elements that are not closed circles will amplify poorly.

[0138] Step 1K

[0139] One or more capping oligonucleotides (capping oligos) are annealed to the single-stranded restriction sites flanking the forward and reverse adaptors, rendering those regions double-stranded (Figure 1L). A capping oligonucleotide may be complementary to at least part of the capture element, at least part of the adaptor region, or both.

[0140] Step 1L

[0141] The capped single-stranded DNA is cleaved into small fragments at the capped sites (Figure 1M). These small fragments have ends of known sequence and can be readily amplified using conventional amplification techniques (e.g., PCR).

[0142] Second method

[0143] In a second embodiment, paired-end sequencing may be performed by the following steps:

[0144] Step 2A - Fragmentation of sample DNA

[0145] Fragmentation and size fractionation of the target nucleic acid are the same as in the preceding embodiment.
[0146] Step 2B - Methylation and end polishing

[0147] If desired, the fragmented target nucleic acid may be methylated with any methylase. Preferred methylases are those that affect restriction endonuclease digestion. The methylase can be used in at least two different strategies. In one preferred embodiment, the methylase enables cleavage by a restriction endonuclease that cuts only at methylated restriction sites. In another preferred embodiment, the methylase prevents cleavage by a restriction endonuclease that cuts only unmethylated DNA.

[0148] The end polishing step is the same as described for the first method.

[0149] Step 2C - Ligation of tag adaptors

[0150] In this step, adaptors are ligated to the ends of the target nucleic acid fragments (Figure 2, I), producing fragments with an adaptor at each end. The adaptors may be of any size, but are preferably 10-30 bases, more preferably 12-15 bases. To prevent the formation of concatemers of adaptors and/or target nucleic acid fragments, an adaptor may comprise one blunt end and one incompatible sticky end (i.e., an end with a 5' or 3' overhang). After the adaptors are ligated to the DNA fragments, the ligase is removed and the sticky ends are filled in with polymerase and dNTPs.

[0151] The adaptor of this section may be a capture fragment. Examples of capture fragments are shown in Figures 4 and 5.

[0152] To prevent concatemer formation, the adaptors may be hairpin adaptors (Figure 6A).
The use of hairpin adaptors (e.g., Figure 6) prevents concatemer formation because hairpin adaptors cannot form any multimer larger than a dimer. Another way to prevent concatemers is to use adaptors in which the 5' end of one or both strands is not phosphorylated.

[0153] Other adaptors that may be used include unphosphorylated adaptors, which have the advantage of requiring fewer processing steps but still require a phosphorylation step using a kinase.

[0154] As discussed elsewhere in this disclosure, the adaptors may be methylated, biotinylated, or both.

[0155] Step 2D - Exonuclease digestion and gel purification

[0156] DNA fragments ligated to two hairpin adaptors can be purified using exonuclease. This exonuclease purification exploits the fact that double-stranded DNA ligated to a hairpin adaptor at each end is a DNA molecule with no exposed 5' or 3' end. Other DNA in the ligation mixture, such as double-stranded DNA fragments ligated to only one hairpin adaptor, unligated DNA fragments, and unligated adaptors, is susceptible to exonuclease (Figure 6B). Exposing the ligation mixture to exonuclease therefore removes most DNA other than DNA fragments ligated to two hairpin adaptors and hairpin adaptor dimers. Because hairpin adaptor dimers are significantly smaller than the DNA fragments, they can be removed by known techniques, such as a sizing column (e.g., a spin column), agarose or acrylamide gel electrophoresis, or one of the other polynucleotide size discrimination methods known in the art and/or discussed elsewhere in this disclosure.

[0157] In one embodiment, the adaptors may be biotinylated to facilitate isolation/enrichment of the fragments carrying the tag.

[0158] In another embodiment, the adaptor-containing fragments may be purified by annealing them to a capture oligonucleotide complementary to the tag sequence.

[0159] Step 2E - Preparation of fragments for circularization

[0160] After adaptors have been added to both ends of a target nucleic acid fragment, the fragment is circularized.

[0161] To prepare the target nucleic acid for self-circularization, it may be necessary to cleave the adaptor region, for several reasons. For example, if hairpin adaptors are used, the DNA fragment will not self-circularize because it has no free 5' or 3' end. As another example, if the adaptors leave the DNA fragment with blunt ends, cleavage can give the adaptors 5' or 3' overhangs, and these overhangs (so-called "sticky ends") greatly increase ligation efficiency.
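In addition, digestion of the adaptor region allows selection of DNA fragments that have two adaptors (one ligated at each end). This is because the adaptors can be designed so that cleavage with a restriction endonuclease leaves compatible sticky ends. After cleavage within the adaptor region, a DNA fragment with only one adaptor (the undesired type) will have one sticky end and one blunt end and may have difficulty self-circularizing. Thus, only DNA fragments with an adaptor at each end can circularize.

[0162] Restriction cleavage of the adaptors can be accomplished in several ways. In one method, the adaptors are methylated and ligated to unmethylated DNA. The construct is then digested with a restriction endonuclease that cuts only methylated DNA. Because only the adaptors are methylated, only the adaptors can be cut.

[0163] In another method, the DNA fragments may be methylated while the adaptors are not. Cleavage with a restriction endonuclease that recognizes and cuts only unmethylated DNA restricts cleavage to the adaptors. This can be achieved by using starting DNA that is already methylated or that has been methylated in vitro.

[0164] It is understood that in some cases the adaptors need not be digested. For example, if the fragments from the above steps contain only blunt ends, digestion of the adaptors is optional.

[0165] It is also understood that the DNA fragments may be treated to facilitate ligation/circularization. For example, if the adaptors are blocked or lack a 5' phosphate, the blocking group may be removed, or a phosphate may be added, to make the fragments amenable to ligation.

[0166] Step 2F - Ligation of ends to form circularized fragments

[0167] A variety of methods may be used for circularization.

[0168] In one embodiment, ligase is added to a reaction mixture with a suitable ligase buffer, allowing the DNA fragments to recircularize.

[0169] In one embodiment, ligation is performed at a dilute DNA concentration to promote self-ligation and discourage concatemer formation.

[0170] In another embodiment, ligation is performed in a water-in-oil emulsion, as described elsewhere in this disclosure, in which each aqueous droplet contains approximately one fragment to be circularized.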

[0171] In one embodiment, a signature tag is ligated to the target nucleic acid fragment and the fragment is self-circularized (see Figure 2). The signature tag is a double-stranded nucleic acid sequence of 24-30 base pairs. This "signature tag" is analogous to the "spacer element" of the embodiment above, in that it can serve as an identifier marking each of the paired genomic ends (enabling trimming and straightforward software analysis of the joined ends). During subsequent sequencing of the genomic fragment, the sequence of the signature tag marks the boundary between the two ends of the target nucleic acid sequence.
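
The role of the signature tag as a software-recognizable boundary can be sketched as follows. This is only an illustrative sketch under simplifying assumptions, not the analysis software of the disclosure: it assumes exact (error-free) matching of the tag, and the function names are hypothetical.

```python
# Translation table used to complement a DNA sequence.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_at_tag(read, tag):
    """Split a read into the two target-end sequences flanking the tag.

    The tag is searched for in both orientations, because a read may
    come from either strand. Returns (left_flank, right_flank), or
    None when the read does not contain the tag.
    """
    for candidate in (tag, reverse_complement(tag)):
        idx = read.find(candidate)
        if idx != -1:
            return read[:idx], read[idx + len(candidate):]
    return None
```

Reads for which `split_at_tag` returns `None` correspond to fragments that do not span the junction between the two ends. A practical implementation would need to tolerate sequencing errors within the tag.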

[0172] Step 2G

[0173] After addition of the signature tag and self-circularization, the target nucleic acid fragment is further digested or fragmented. Fragmentation may be performed by any of the fragmentation methods given in this disclosure; see, for example, step 1A above. Alternatively, the target DNA may be digested with one or more restriction endonucleases to produce fragments.

[0174] In a preferred embodiment, the nucleic acid is fragmented with a nebulizer until the average fragment size is about 200-300 bp. As shown in Figure 2, some of these fragments will contain the signature tag and others will not.

[0175] At this point, the nucleic acid fragments may be sequenced using standard techniques. Methods for sequencing nucleic acid fragments are known. A preferred sequencing method is described in International Patent Application WO 05/003375, filed January 28, 2004.

[0176] Step 2H

[0177] In an optional step, fragments containing the signature tag may be enriched relative to fragments lacking it. One method of enrichment involves using a biotinylated signature tag in the sample preparation steps. After fragmentation, the fragments that contain the signature tag are therefore biotinylated and can be purified on a streptavidin column or with streptavidin beads in solution.

[0178] After enrichment, the nucleic acid fragments may be sequenced using standard techniques, including automated techniques such as those described in International Patent Application WO 05/003375, filed January 28, 2004.

[0179] Third method

[0180] Paired-end sequencing may be performed by a third method.

[0181] Steps 3A-3E

[0182] In this method, steps A through E may be performed as described for the second method (i.e., as in steps 2A-2E). In addition, in the third method each adaptor contains a type IIS restriction endonuclease site, which directs cleavage of the DNA about 15-25 bp away from the restriction endonuclease recognition site. Different type IIS restriction endonucleases are known to cut at different distances from their recognition sites, and the use of different type IIS restriction endonucleases to adjust this distance is contemplated.

[0183] Step 3F - Ligation of ends to form circularized fragments

[0184] Step 3F may be performed as in the second method (step 2F), except that no signature tag is used (see Figure 6D).

[0185] Optional enrichment step

[0186] In any method of the invention, exonuclease may be used after ligation to remove non-circularized fragments and to reduce the presence of concatemerized fragments. A properly recircularized DNA fragment has no exposed 5' or 3' end and is therefore resistant to exonuclease digestion. In addition, larger concatemers are more likely to have an exposed 5' or 3' end as a result of nicks, so exonuclease treatment also removes these nicked concatemers.

[0187] Optional rolling circle amplification

[0188] The circularized DNA may be amplified by rolling circle amplification. Briefly, an oligonucleotide is hybridized to one strand of the recircularized DNA, and this oligonucleotide primer is extended with a polymerase. Because the template is a circle, the polymerase produces a single-stranded concatemer containing multiple repeats of the target DNA. The single-stranded concatemer can be made double-stranded by hybridizing a second primer to it and extending from that second primer (for example, the second primer may be complementary to an adaptor sequence in the single-stranded concatemer). The resulting double-stranded concatemer can be used directly in the next step.
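
Why a circular template yields a concatemer can be shown with a minimal sequence-level sketch. It is an idealized model rather than a description of the reaction: it assumes an error-free polymerase, a primer-binding site that does not span the arbitrary start of the string representing the circle, and a whole number of revolutions. All names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def rolling_circle_product(circle, primer_site, revolutions):
    """Model the single-stranded product of rolling circle amplification.

    `circle` is one strand of the circular template, written 5'->3'
    from an arbitrary starting point. `primer_site` is the template
    sequence to which the primer anneals, so the primer itself is the
    reverse complement of `primer_site`. The product starts with the
    primer and repeats the complement of the circle once per revolution.
    """
    start = circle.find(primer_site)
    if start == -1:
        raise ValueError("primer site not found in template")
    end = start + len(primer_site)
    # Rotate the circle so that the linear form ends with the primer site.
    linear = circle[end:] + circle[:end]
    return reverse_complement(linear) * revolutions
```

The product is a tandem repeat of the complement of the whole circle, which is why a second primer complementary to the adaptor sequence in the product finds one binding site per repeat.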

[0189] Step 3G - Digestion/fragmentation of the DNA

[0190] In this step, the circularized nucleic acid, or the concatemerized nucleic acid from rolling circle amplification, is digested with a type IIS restriction endonuclease (Figure 6D). As described in step 3A, each adaptor contains at least one type IIS restriction endonuclease cleavage site. The type IIS restriction endonuclease recognizes its site on the adaptor and cleaves the nucleic acid about 10-20 base pairs away. Examples of type IIS restriction endonucleases include MmeI (about 20 bp), EcoP15I (25 bp), and BpmI (14 bp).

[0191] This step produces short DNA fragments (10-100 bp) that comprise the two ends of a larger DNA fragment with the adaptor region between them (Figure 6E). An alternative way to produce the same structure is to randomly fragment the circularized nucleic acid using any of the DNA fragmentation methods described elsewhere in this disclosure (e.g., in step 1A). This allows fragments of any size to be prepared (100 bp, 150 bp, 200 bp, 250 bp, 300 bp, or more).
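
The structure released by the type IIS digestion can be sketched at the sequence level. This is a simplified illustration, not the disclosed protocol: it assumes that the recognition sites sit at the two edges of the adaptor and point outward, that both cuts fall exactly `reach` bases into the target, and that the staggered ends left by real enzymes can be ignored. The function name and parameters are hypothetical.

```python
def type_iis_ditag(target, adaptor, reach):
    """Return the end-adaptor-end fragment released from a circle.

    The circle is modeled as `target` closed on itself through
    `adaptor`, so the 3' end of the target runs into the adaptor,
    which runs into the 5' end of the target. `reach` is the assumed
    cutting distance, in bases, from the adaptor into the target.
    """
    if reach <= 0:
        raise ValueError("reach must be positive")
    if 2 * reach > len(target):
        raise ValueError("target too short: the two cuts would overlap")
    return target[-reach:] + adaptor + target[:reach]
```

With a cutting distance of about 20 bases and a short adaptor, the released fragment falls within the 10-100 bp range given above, and its two flanks are the sequences from opposite ends of the original target fragment.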

[0192] With the alternative method, other DNA fragments that lack an adaptor region in the middle are also produced (Figure 6E). However, because the adaptor region is biotinylated, DNA containing the adaptor region can be selectively purified using a solid support with affinity for biotin, such as streptavidin beads, avidin beads, BCCP beads, and the like.

[0193] Step 3H - Sequencing

[0194] Any product of the methods of the invention may be sequenced manually or by automated sequencing techniques. Manual sequencing by methods such as Sanger sequencing or Maxam-Gilbert sequencing is well known. Automated sequencing may be performed, for example, with an automated sequencing method such as 454 Sequencing™, developed by 454 Life Sciences Corporation (Branford, CT), which is also described in application WO/05003375, filed January 28, 2004, and in copending U.S. patent applications USSN 10/767,779, filed January 28, 2004; USSN 60/476,602, filed June 6, 2003; USSN 60/476,504, filed June 6, 2003; USSN 60/443,471, filed January 29, 2003; USSN 60/476,313, filed June 6, 2003; USSN 60/476,592, filed June 6, 2003; USSN 60/465,071, filed April 23, 2003; and USSN 60/497,985, filed August 25, 2003.

[0195] Briefly, in an automated sequencing method (such as the sequencing method developed by 454 Life Sciences Corp.), one sequencing adaptor (sequencing adaptor A) may be ligated to one end of a DNA fragment and a second sequencing adaptor (sequencing adaptor B) to the other end. After ligation, the DNA fragments can be purified away from any unligated sequencing adaptors by binding the biotin to a solid support. The isolated nucleic acid fragments may be placed in individual reaction chambers and further amplified by PCR using primers specific for sequencing adaptor A and sequencing adaptor B. Single-stranded DNA consisting preferentially of A-B fragments may be isolated by attaching a biotin moiety to either the A or the B adaptor. The amplified nucleic acid may be sequenced using a sequencing primer specific for sequencing adaptor A or sequencing adaptor B, or a sequencing primer specific for the adaptor located between the two ends (e.g., the hairpin adaptor).

[0196] Once a large number of these fragments containing the ends of larger DNA fragments have been made, they can be sequenced, and the paired-end sequence information can be assembled to produce a partial or complete sequence map of the genome.
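
How paired-end information orders contigs into a scaffold can be sketched as below. This is a deliberately minimal illustration and not the assembly procedure of the disclosure: it assumes that each paired end has already been mapped to a contig, that orientation and distance information are ignored, and that each contig has at most one well-supported successor. The function name and the `min_support` threshold are hypothetical.

```python
from collections import Counter


def scaffold_order(pairs, min_support=2):
    """Chain contigs into scaffolds using paired-end links.

    `pairs` is an iterable of (contig_a, contig_b) tuples, one per
    read pair whose two ends map to different contigs, with contig_a
    upstream of contig_b. Links seen fewer than `min_support` times
    are discarded as unreliable.
    """
    support = Counter(pairs)
    # Simplification: if a contig has several supported successors,
    # only one of them is kept.
    successor = {a: b for (a, b), n in support.items() if n >= min_support}
    has_predecessor = set(successor.values())

    scaffolds = []
    for start in sorted(c for c in successor if c not in has_predecessor):
        chain = [start]
        # Follow links until the chain ends or would revisit a contig.
        while chain[-1] in successor and successor[chain[-1]] not in chain:
            chain.append(successor[chain[-1]])
        scaffolds.append(chain)
    return scaffolds
```

A link supported by several independent pairs places two contigs in order even when the sequence between them, for example a repeat, could not be assembled, which is the scaffolding advantage described in paragraph [0105].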

[0197] Fourth method

[0198] Paired-end sequencing may also be performed by a variation of the above methods, referred to as paired-read PET random fragmentation, which is shown in Figure 12. Experimental results obtained with this fourth method are shown in Figure 13.

[0199] Steps 4A-4E

[0200] In this method, steps A through D may be performed as described for the second or third method (i.e., as in steps 2A-2D or steps 3A-3D). Alternatively, step 4D may be performed using SPRI (solid-phase reversible immobilization) to purify the exonuclease-treated fragments. For example, the nucleic acid fragments in Figure 12 are ligated to biotinylated primers and can be purified using, for example, beads coated with streptavidin, avidin, low-affinity streptavidin, or low-affinity avidin.

[0201] Step 4E may be performed as described for step 2E or step 3E.

[0202] Step 4F may be performed as described for step 3F. Briefly, the linear DNA fragments produced in the previous step may be circularized by any known circularization method, as described for step 2F or step 3F.

[0203] In addition, the optional enrichment step described above in connection with step 3F may be performed to enrich for circular nucleic acid. Briefly, nucleic acid that has not circularized can be removed with an exonuclease, which degrades nucleic acid having free ends. Covalently closed circular nucleic acid has no free ends and is resistant to exonuclease attack. Treatment with exonuclease therefore enriches for circular nucleic acid while removing linear nucleic acid.

[0204] Step 4G

[0205] After self-circularization, fragmentation may be performed by any fragmentation method listed in this disclosure. A preferred method is mechanical shearing of the circular nucleic acid. For example, mechanical shearing may be performed by vortexing, by forcing the nucleic acid in solution through a small orifice, or by other similar methods described elsewhere in this disclosure. One advantage of mechanical shearing is that it produces nucleic acids of varying lengths (see the nucleic acids after step G in Figure 12).

[0206] DNA fragments lacking an adapter region in the middle are also produced (see Figure 12). However, because the adapter region is biotinylated, DNA containing the adapter region can be selectively purified using a solid or semi-solid support with affinity for biotin (e.g., streptavidin beads, avidin beads, BCCP beads, etc.).
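The shearing of step 4G and the affinity selection described above can be pictured with a small Python model. This is only an illustrative sketch, not part of the disclosed method: the sequences, the `ADAPTER` string, the cut positions, and both function names are hypothetical, and capture on biotin-binding beads is reduced to the question of whether a fragment still carries the complete adapter.

```python
def shear_circle(circle, cuts):
    """Cut a circular sequence at the given positions and return linear pieces.

    A cut at position p falls between bases p-1 and p (0-based).
    """
    cuts = sorted(set(c % len(circle) for c in cuts))
    if not cuts:
        return [circle]  # uncut circle, returned unchanged
    doubled = circle + circle  # lets a piece run across the origin
    ends = cuts[1:] + [cuts[0] + len(circle)]
    return [doubled[start:end] for start, end in zip(cuts, ends)]


def select_adapter_fragments(pieces, adapter):
    """Model biotin-based capture: keep pieces that carry the whole adapter."""
    return [p for p in pieces if adapter in p]


ADAPTER = "GGATCCGG"  # hypothetical adapter sequence
# The adapter joins the two ends of the target in the circularized molecule.
circle = "AAAACCCC" + ADAPTER + "TTTTGGGG"
pieces = shear_circle(circle, cuts=[4, 20, 22])
captured = select_adapter_fragments(pieces, ADAPTER)
# pieces   -> ['CCCCGGATCCGGTTTT', 'GG', 'GGAAAA']
# captured -> ['CCCCGGATCCGGTTTT']
```

Only the captured piece holds the adapter flanked by sequence from both ends of the target; the adapter-free pieces correspond to the fragments that the affinity step leaves behind.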

[0207] Step 4H

[0208] The products of method 4 may be sequenced by any available manual or automated method. See step 3H above for details of such methods.

[0209] The paired-read PET random fragmentation method described above and shown in Figure 12 offers several advantages. First, method 4 gives higher confidence in assembly, because mechanical shearing can produce longer fragments, which in turn allow longer reads. Longer reads give higher confidence in assembling the target sequence. Second, the longer fragments made possible by mechanical shearing yield paired-end reads that span longer nucleic acid regions. By spanning longer regions, method 4 facilitates gap closure and is also more likely to span regions that are difficult to analyze. Such difficult regions may be, for example, repeat regions or regions of high GC content. Method 4 thus offers improved gap-closure performance. Third, because method 4 provides gap-closure capability, it can be used on its own to sequence a complete genome when the individual ends can be used to build the assembly.

[0210] An example of the advantages of method 4 is shown in Figure 13, which depicts E. coli K12 genomic DNA sequenced by method 4. As can be seen, this method yields a markedly longer read-length distribution, ranging from less than 50 to about 400. In addition, fragments of about 3 kb can be generated and their ends sequenced. This shows that method 4 provides better gap-closure performance than the other methods.

[0211] Fifth method

[0212] Paired-end sequencing may be performed using a variation of the above methods, as shown in Figure 15.

[0213] In this method, the adapter may be designed as a deoxyinosine hairpin adapter, which incorporates deoxyinosine nucleotides (also called inosine herein) on opposite strands of the double-stranded region of the hairpin. E. coli endonuclease V (EndoV) introduces a single-strand nick between the second and third nucleotides 3' of the inosine nucleotide (Yao M and Kow YW, J Biol Chem. 1995, 270(48): 28609-16; Yao M and Kow YW, J Biol Chem. 1994, 269(50): 31390-6; Yao M et al., Ann NY Acad Sci. 1994, 726: 315-6; Yao M et al., J Biol Chem. 1994, 269(23): 16260-8).

[0214] As shown in Figure 14, the relative placement of the inosines in the hairpin adapter determines whether cleavage of both strands by EndoV produces a 3' single-stranded overhang (Figures 14A and 14B), a 5' single-stranded overhang (Figures 14C and 14D), or a blunt end with no overhang (Figure 14E). The hairpin adapter sequence can also be designed so that EndoV cleavage produces non-palindromic (Figures 14A and 14B) or palindromic (Figures 14A and 14C) single-stranded overhangs. It is well known in the art that deoxyinosine pairs with any of the four bases A, G, C, and T, as well as with itself (Watkins and SantaLucia, 2005, Nucleic Acids Res. 33(19): 6258-67). In addition, the adapter may contain a type IIS restriction endonuclease recognition site (e.g., MmeI), as described elsewhere in this disclosure.
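The dependence of the end type on inosine placement can be worked through with a short Python sketch. It is a geometric illustration only, under stated assumptions: the function name, the column numbering, and the `nick_offset` parameter are hypothetical, and the default offset of 2 reflects a literal reading of "between the second and third nucleotides 3' of the inosine" with the inosine itself not counted. The actual adapter designs are those of Figure 14.

```python
def endov_end_type(top_inosine, bottom_inosine, nick_offset=2):
    """Classify the duplex end left after EndoV nicks both strands.

    Columns are numbered 0, 1, 2, ... from left to right along the duplex.
    The top strand reads 5'->3' left to right; the bottom strand reads
    3'->5' left to right. Each nick is placed `nick_offset` nucleotides
    to the 3' side of the inosine on its own strand.
    Returns (kind, overhang_length).
    """
    # Top strand: 3' is to the right, so the nick is the boundary
    # immediately left of column top_inosine + nick_offset + 1.
    top_cut = top_inosine + nick_offset + 1
    # Bottom strand: 3' is to the left, so the nick is the boundary
    # immediately left of column bottom_inosine - nick_offset.
    bottom_cut = bottom_inosine - nick_offset
    if top_cut > bottom_cut:
        # The top strand's 3' end extends past the bottom strand's cut.
        return ("3' overhang", top_cut - bottom_cut)
    if top_cut < bottom_cut:
        # The bottom strand's 5' end extends past the top strand's cut.
        return ("5' overhang", bottom_cut - top_cut)
    return ("blunt", 0)


print(endov_end_type(5, 6))   # inosines nearly opposite each other
print(endov_end_type(2, 11))  # bottom inosine far to the right
print(endov_end_type(3, 8))   # the two nicks coincide
```

Under these assumptions the three calls give a 4-nucleotide 3' overhang, a 4-nucleotide 5' overhang, and a blunt end, respectively. Changing `nick_offset` changes the overhang lengths but not the general rule that shifting one inosine relative to the other switches the end type.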

[0215] Step 5A (Figure 15, step A)

[0216] In this method, step A may be performed essentially as described in step 1A. The target DNA may be fragmented by any physical or biochemical method known in the art, as described above. The resulting fragments may optionally be size-fractionated by any size-fractionation method described elsewhere in this disclosure.

[0217] Steps 5B and 5C (Figure 15, steps B+C)

[0218] The ends of the target DNA may be polished by any polishing method described herein and ligated to the deoxyinosine hairpin adapter described above to form adapter-tagged target DNA.

[0219] Step 5D (Figure 15, step D)

[0220] The ligation reaction may be treated with one or more exonucleases (as discussed elsewhere herein) and size-fractionated by any method described herein to enrich the desired reaction product.

[0221] Step 5E (Figure 15, step E)

[0222] The adapter-tagged target nucleic acid is cleaved with EndoV. The cleavage reaction conditions may be any of those disclosed by Yao et al. (Yao M and Kow YW, J Biol Chem. 1995, 270(48): 28609-16; Yao M and Kow YW, J Biol Chem. 1994, 269(50): 31390-6; Yao M et al., Ann NY Acad Sci. 1994, 726: 315-6; and Yao M et al., J Biol Chem. 1994, 269(23): 16260-8). The skilled artisan will appreciate that similar conditions may also be used.

[0223] Steps 5F-H (Figure 15, steps F-H)

[0224] In this fifth method, steps F-H may be performed as described for the second, third, or fourth method (i.e., as in steps 2F-H, steps 3F-H, or steps 4F-H).

[0225] The deoxyinosine hairpin adapter of the fifth method is advantageous because EndoV cleaves only where inosine, certain DNA lesions, or base mismatches are present. The target nucleic acid is therefore not cleaved by EndoV treatment. Consequently, when the EndoV site is unique to the adapter, the target DNA need not be protected by methylation as in some of the embodiments above. Eliminating the methylation step saves time and avoids the problems associated with incomplete methylation of the target DNA. In addition, EndoV digestion is very fast compared with EcoRI digestion, which shortens the time needed to carry out the method.

[0226] An example of paired-read results obtained with the deoxyinosine hairpin adapter method is shown in Figure 16. E. coli K12 genomic DNA was prepared according to the fifth method (Figure 15) and sequenced. The mean distance between paired reads was 2070 bp (standard deviation = 594).
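Summary statistics of this kind can be derived from the mapped positions of the two reads of each pair. The Python sketch below shows one way to do so; the coordinates are invented for illustration and are not the data behind Figure 16, the function name is hypothetical, and the sample standard deviation is used only as an assumption, since the text does not say which estimator was applied.

```python
from statistics import mean, stdev


def pair_distances(pairs):
    """Return the distance between the two mapped reads of each pair."""
    return [abs(right - left) for left, right in pairs]


# Hypothetical (left, right) reference coordinates of three read pairs.
pairs = [(100, 2100), (5000, 7300), (9000, 10700)]
distances = pair_distances(pairs)  # [2000, 2300, 1700]
avg = mean(distances)              # 2000
sd = stdev(distances)              # 300.0 (sample standard deviation)
```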

[0227] Sixth method

[0228] In a further embodiment, paired-end sequencing may be performed by a method comprising some or all of the following steps; see Figures 17 and 18.

[0229] Step 6A - Fragmentation of target DNA (Figure 17A)

[0230] According to the sixth method, the polynucleotide molecules of a target DNA sample (e.g., genomic DNA) are fragmented into molecules of greater than about 500 bases, greater than about 1000 bases, greater than about 2000 bases, greater than about 5000 bases, greater than about 10,000 bases, greater than about 20,000 bases, greater than about 50,000 bases, greater than about 100,000 bases, greater than about 250,000 bases, greater than about 1 million bases, or greater than about 5 million bases. In a preferred embodiment, fragment lengths range from about 1.5 kb to about 5 kb. Fragmentation may be accomplished by any physical and/or biochemical method described elsewhere in this disclosure. In a preferred embodiment, the target DNA is randomly sheared by physical force, for example using a HydroShear® instrument (Genomic Solutions). The sheared DNA is then purified according to the desired fragment size. This optional size selection may be achieved by any size-selection method known in the art and disclosed herein, such as electrophoresis and/or liquid chromatography. In a preferred embodiment, the sheared DNA sample is size-selected by purification on SPRI® size-exclusion beads (Agencourt; Hawkins et al., Nucleic Acids Res. 1995 (23): 4742-4743). For example, in a classic bacterial genome sequencing experiment, sequencing the (paired) ends of fragments of about 2-2.5 kb allows contig ordering. Larger fragments may be advantageous for sequencing the genomes of higher organisms (e.g., fungi, plants, and animals).

[0231] Step 6B - Methylation of certain restriction sites (Figure 17B)

[0232] As described below, after the adapters are ligated to the target DNA fragments, the adapters may be cleaved with one or more restriction endonucleases in preparation for circularization. To prevent the target DNA from being digested by the chosen restriction endonuclease, the target DNA is protected by modification with the corresponding methylase. In a preferred embodiment, the adapter is a hairpin adapter and carries an EcoRI restriction site (Figure 18A). Thus, in a preferred embodiment, EcoRI methylase is used to methylate the EcoRI restriction sites present in the sample DNA fragments, which protects the integrity of the DNA fragments when EcoRI cohesive ends are generated from the hairpin adapters before circularization by ligation.

[0233] Step 6C - Fragment end polishing and phosphorylation (Figure 17C)

[0234] Hydrodynamic shearing of DNA produces some fragments with frayed ends (single-stranded overhangs). Blunt ends are preferred for the subsequent adapter ligation. Any frayed ends are therefore optionally blunted and made ready for ligation, enzymatically by "filling in" with a DNA polymerase and/or by "chewing back" with an exonuclease (e.g., mung bean nuclease). Advantageously, some DNA polymerases also have exonuclease activity. Optionally, after the blunting reaction, the 5' ends of the fragments are preferably phosphorylated with a polynucleotide kinase. In a preferred embodiment, T4 DNA polymerase and T4 polynucleotide kinase (T4 PNK) are used for fill-in and phosphorylation, respectively. T4 DNA polymerase "fills in" the 3'-recessed ends (5' overhangs) of the DNA through its 5'→3' polymerase activity, while its single-strand 3'→5' exonuclease activity removes 3' overhangs. The kinase activity of T4 PNK adds a phosphate group to 5'-hydroxyl ends.

[0235] Step 6D - Hairpin adapter ligation (Figures 17D and 18A)

[0236] According to the invention, double-stranded oligonucleotide adapters are ligated to the ends of the target DNA fragments. In a preferred embodiment, the adapter is a hairpin adapter (Figure 18A). One advantage of hairpin adapters is that ligation events between adapters produce only adapter dimers; that is, formation of multimeric adapter concatemers is prevented. In addition, the hairpin structure protects the sample fragments from digestion by the exonucleases used to remove unligated fragments (step 6E). A preferred hairpin adapter design, shown in Figure 18A, contains EcoRI and MmeI restriction sites. EcoRI is used to generate cohesive ends at the ends of each fragment (step 6F) for circularization (step 6G). MmeI is a type IIS restriction endonuclease that cuts DNA 20 bp from its recognition site; it is used to cut into the ends of the circularized sample fragment, producing the paired-end tags to be sequenced. The skilled artisan will appreciate that EcoRI may be replaced by any of a variety of other endonucleases, with accompanying changes to the nucleotide sequence of the adapter oligonucleotide and with use of the appropriate methylase to protect the target DNA fragments. Likewise, MmeI may be replaced by other type IIS restriction endonucleases, provided the chosen enzyme cuts far enough from its restriction site to produce paired ends long enough for downstream sequence assembly. In a preferred embodiment, the hairpin adapter is biotinylated, for example at the site shown in Figure 18A. Other biotinylation sites are also suitable and may be chosen by the skilled artisan. The biotin moiety allows optional selection of adapter-containing paired-end fragments and optional immobilization of the paired-end library fragments (after MmeI digestion) during paired-end adapter ligation, during the fill-in reaction (fragment repair), and during paired-end library amplification.

[0237] Step 6E - Exonuclease selection (Figure 17E)

[0238] Ligation of the hairpin adapters is preferably followed by exonuclease digestion, to remove any DNA not properly capped with a hairpin adapter at both ends, and by purification on SPRI size-exclusion beads, to remove unwanted small species such as adapter-adapter dimers. Exonuclease digestion may be performed with one or more of the various exonucleases known in the art. Digestion is preferably done with a combination of activities that together digest single-stranded and double-stranded DNA in both the 3'→5' and 5'→3' directions. In a preferred embodiment, the exonuclease mixture contains E. coli exonuclease I (a 3'→5' single-strand exonuclease), phage λ exonuclease (a 5'→3' single-strand and double-strand exonuclease), and phage T7 exonuclease (a 5'→3' double-strand exonuclease that can initiate at gaps and nicks).

[0239] Step 6F - EcoRI digestion (Figure 17F)

[0240] In a preferred embodiment, endonucleolytic cleavage by EcoRI is used to generate cohesive ends at the ends of each fragment by cutting the hairpin adapters (Figure 18A), which allows the fragments to be circularized. EcoRI digestion removes the hairpin structure at the fragment ends and leaves cohesive ends. Internal EcoRI sites present in the sample DNA are protected by the methylation performed earlier in step 6B.

[0241] Step 6G - Circularization (Figure 17G)

[0242] The fragments are then circularized by intramolecular ligation of their EcoRI cohesive ends. The ligation junction thus carries two partial hairpin adapters (head to head, with a reconstituted EcoRI site; 44 bp in total), flanked on either side by the ends of the sample fragment. A further exonuclease digestion is performed to remove any non-circularized DNA.

[0243] Step 6H - MmeI digestion (Figure 17H)

[0244] The circularized DNA fragments are then digested with MmeI. This type IIS restriction endonuclease cuts about 20 bp from its restriction site (leaving a 2-nt 3' overhang, i.e., cutting at 20/18 nt; the enzyme also yields some minor products whose cuts range from 19 bp to 22 bp from the site). MmeI sites are located at the ends of the hairpin adapters joined to the sample DNA fragment (Figure 18A). Digestion at these sites produces paired-end DNA library fragments, each containing the ligated "double" hairpin adapter (44 bp) and the two 20-bp ends of the sample fragment, for a total length of 84 bp.
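The 84-bp figure follows from 20 + 44 + 20, and the layout of the library fragment can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function name and the placeholder sequences are hypothetical (the real adapter is the one in Figure 18A), the cut is treated as blunt at exactly 20 bp, and the 2-nt 3' overhang and the minor 19-22 bp products are ignored.

```python
def mmei_library_fragment(target, double_adapter, reach=20):
    """Return the tag-adapter-tag fragment released from a circularized target.

    In the circle, the double adapter joins the end of the target to its
    start, so the released fragment holds the last `reach` bases of the
    target, the adapter, and then the first `reach` bases of the target.
    """
    if len(target) < 2 * reach:
        raise ValueError("target too short to give two separate tags")
    upstream_tag = target[-reach:]    # end of the target
    downstream_tag = target[:reach]   # start of the target
    return upstream_tag + double_adapter + downstream_tag


double_adapter = "N" * 44                      # placeholder, 44 bp
target = "A" * 20 + "C" * 1960 + "G" * 20      # hypothetical 2-kb fragment
fragment = mmei_library_fragment(target, double_adapter)
print(len(fragment))  # 84
```

The two 20-bp tags lie about 2 kb apart in the original fragment but sit only 44 bp apart in the library fragment, which is what makes both ends readable from one short template.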

[0245] Step 6I - Purification of paired-end library fragments (Figure 17I)

[0246] In this step, MmeI restriction fragments that lack the biotin tag, i.e., that lack the ligated "double" hairpin adapter, may optionally be removed. Binding the biotin tag present in the hairpin adapter to streptavidin or avidin beads immobilizes the library of paired-end fragments (and separates it from the other MmeI restriction fragments).

[0247] Step 6J - Paired-end adapter ligation (Figure 17J)

[0248] In this step, the ends of the paired-end library fragments produced in step 6H and optionally purified in step 6I are ligated to double-stranded adapters called paired-end library adapters or paired-end adapters (Figure 18B). These paired-end adapters provide priming regions that support both amplification and nucleotide sequencing, and may also contain a short (e.g., 4-nucleotide) "sequencing key" sequence used for precise identification in the 454 Sequencing™ system. The adapters may have a "degenerate" 2-base single-stranded 3' overhang. Degenerate means that the two overhanging bases are random, i.e., each may be G, A, T, or C. If an enzyme other than MmeI is used, the skilled artisan can readily design paired-end adapters compatible with that enzyme. The exemplary adapters shown in Figure 18B are designed to strongly favor directional ligation of the paired-end library fragments to each adapter: the adapters carry a degenerate 2-bp 3' overhang at their 3' ends and can ligate only to the ends of paired-end library fragments generated by MmeI (provided the 5' ends of the adapters are unphosphorylated; see below). In a ligation reaction containing a large molar excess of adapters (adapter:fragment ratio of 15:1), the adapters ligate to the paired-end library fragments in a way that maximizes use of the library fragments and minimizes the chance of forming library-fragment concatemers. The adapters themselves may be unphosphorylated to minimize adapter-dimer formation, but the ligation products must then be repaired by a fill-in reaction (step 6K).
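The degenerate overhang can be made concrete with a few lines of Python. This is a minimal sketch with hypothetical function names, not a description of the adapters in Figure 18B: it enumerates the 4 × 4 = 16 possible 2-base overhangs and shows which adapter overhang would pair with a given fragment overhang, on the assumption that two 3' overhangs anneal when one is the reverse complement of the other.

```python
from itertools import product

COMPLEMENT = str.maketrans("GATC", "CTAG")


def degenerate_overhangs(length=2, alphabet="GATC"):
    """List every overhang a degenerate adapter pool can contain."""
    return ["".join(bases) for bases in product(alphabet, repeat=length)]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def matching_adapter_overhang(fragment_overhang):
    """Adapter 3' overhang that pairs with a fragment 3' overhang."""
    return reverse_complement(fragment_overhang)


overhangs = degenerate_overhangs()
print(len(overhangs))                    # 16
print(matching_adapter_overhang("AG"))   # CT
```

Because the pool contains all 16 overhangs, every MmeI-generated end finds a compatible adapter, whatever sample sequence happens to lie at the cut site.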

[0249] Step 6K - Fill-in reaction (Figure 6K)

[0250] If the paired-end adapters ligated in step 6J were not phosphorylated, a gap will be present at their 3' junctions with the paired-end library DNA fragments. These two "gaps" or "nicks" can be repaired with a strand-displacing DNA polymerase: the polymerase recognizes the nick, displaces the nicked strand (to the free 3' end of each adapter), and extends in a manner that repairs the nick and forms full-length dsDNA. In a preferred embodiment, Bst DNA polymerase (large fragment) is used. Other strand-displacing DNA polymerases known in the art are also suitable for this step, such as phi29 DNA polymerase, DNA polymerase I (Klenow fragment), or Vent® DNA polymerase.

[0251] Step 6L - Amplification (Figure 6L)

[0252] The "adapted" paired-end DNA library may optionally be amplified. Amplification is preferably performed by PCR, but other nucleic acid amplification methods known in the art and/or described herein may also be used. Preferably, the oligonucleotides F-PCR and R-PCR shown in Figure 18B may be used as PCR primers.

[0253] Whether or not it is amplified (as described in the preceding paragraph), the "adapted" paired-end DNA library is subsequently sequenced. Preferably, individual molecules of the library are sequenced. If the chosen DNA sequencing method requires a large number of identical template molecules in each individual sequencing reaction, the individual library molecules may be clonally amplified. Clonal amplification is preferably performed by bead emulsion PCR according to the methods described in International Patent Application Nos. WO 2005/003375, WO 2004/069849, and WO 2005/073410, each of which is incorporated herein by reference in its entirety.

[0254] Seventh method

[0255] In yet another embodiment, paired-end sequencing may be performed by a method comprising some or all of the following steps; see Figures 21-25.

[0256] This embodiment provides a particularly advantageous and inventive method that offers an alternative to circularization by ligation and is suitable for carrying out some or all of the methods and variations described above. In addition, the presently described embodiment is particularly effective for generating paired-end distances of 10 Kb or more (i.e., a paired-end distance of about 20 Kb); it should also be appreciated, however, that the recombination-based strategy can be used to circularize fragments shorter than 10 Kb (i.e., paired-end distances of about 3 Kb or 8 Kb). The presently described embodiment uses a strategy based on intramolecular recombination to circularize nucleic acid molecules of the sequence length required for longer paired-end distances, and offers major advantages in the efficiency of circularizing nucleic acid molecules, especially large ones.

[0257] Some preferred embodiments include what is referred to as in vitro excision by a recombination reaction, which uses a Cre/Lox-type site-specific recombinase (hereinafter "SSR") system to circularize linear adapted target fragments, producing a circular nucleic acid comprising the target fragment and a second, excised linear fragment comprising a hybrid adapter sequence; an example of such a method is shown in Figure 21. For example, Figure 21 provides an exemplary overview of an SSR-based strategy for generating a library of sequenceable paired-end template nucleic acid molecules with a paired distance of 10 Kb or more. As described in detail below, Figure 21 illustrates fragmenting genomic DNA or other desired DNA, ligating adapters 2105 and 2107 to produce adapted fragment 2100, and then selecting for the desired length. The figure also illustrates the SSR recombination step that produces circular product 2150 and linear product 2155 from adapted fragment 2100. Circular product 2150 is mechanically sheared to produce linear paired-end template 2160, which is then amplified to produce population 2170 comprising many substantially identical copies of template 2160.
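At the sequence level, the outcome of the excision reaction can be modeled with a short Python sketch. It is an abstraction under stated assumptions and not the adapters of Figure 22: `LOXP` is the commonly cited 34-bp loxP site, the flanking and target sequences and the function name are hypothetical, and the crossover inside the spacer is not modeled because both sites are identical and the products are the same either way.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # 13-bp arm, 8-bp spacer, 13-bp arm


def cre_excise(linear):
    """Model excision between two loxP sites in the same orientation.

    Returns (circle, remainder): the circle carries one loxP site plus
    the sequence between the sites and is written from an arbitrary
    origin; the remainder is the linear product formed by joining the
    outer flanks, and it also carries one loxP site.
    """
    first = linear.find(LOXP)
    second = linear.find(LOXP, first + len(LOXP)) if first != -1 else -1
    if first == -1 or second == -1:
        raise ValueError("two loxP sites in the same orientation are required")
    circle = linear[first:second]
    remainder = linear[:first] + linear[second:]
    return circle, remainder


target = "ACGTACGTAC"                              # hypothetical target
adapted = "GGG" + LOXP + target + LOXP + "CCC"     # hypothetical adapted fragment
circle, remainder = cre_excise(adapted)
# circle    -> LOXP + target        (analogous to circular product 2150)
# remainder -> "GGG" + LOXP + "CCC" (analogous to linear product 2155)
```

In this model the junction in the circle places the loxP site between the two ends of the target, so a sheared piece that spans the site carries sequence from both ends.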

[0258] Those of skill in the relevant art will appreciate that although embodiments using the Cre/Lox SSR system are described herein, other members of the integrase family, such as Int/att and FLP/FRT, may also be used, and the disclosure of Cre/Lox should therefore not be considered limiting. In addition, although the method is generally described in terms of a single molecule, it should be understood that the method is performed on many molecules simultaneously in the same or similar reaction environments, such as the water-in-oil emulsion reactors described elsewhere in this specification, where the number of target molecules in each reaction environment may be about one molecule or 10, 100, 1000, 1,000,000 molecules, etc. For example, the water-in-oil emulsion strategy described elsewhere in this specification is used to suppress intermolecular events (i.e., concatemer formation, etc.) and to promote the intramolecular recombination needed to produce circularized product, as detailed further below.

[0259] Step 7A - Fragmentation

[0260] As in the various embodiments described above, the polynucleotide molecules of the original genomic or other target DNA sample are fragmented into molecules of greater than about 10,000 bases, greater than about 20,000 bases, greater than about 50,000 bases, greater than about 100,000 bases, greater than about 250,000 bases, greater than about 1 million bases, or greater than about 5 million bases. In some preferred embodiments, fragment lengths range from about 10 Kb to about 50 Kb, from about 10 Kb to about 100 Kb, or from about 10 Kb to more than 100 Kb. Fragmentation may be accomplished by any physical and/or biochemical method described elsewhere in this disclosure. In a preferred embodiment, the target DNA is randomly sheared by physical force, for example using a HydroShear® instrument (Genomic Solutions). It should be understood, however, that any method of producing the fragments described herein may be used, provided the chosen method can produce the required fragment lengths.

[0261] Step 7B - End polishing

[0262] In the presently described variation, the ends of each fragment may be polished by any method described elsewhere in this disclosure, such as the method described in step 6C above. As described there, blunt ends are preferred for the subsequent adapter ligation. Any frayed ends or overhangs may therefore optionally be blunted and made ready for ligation, enzymatically by "filling in" with a DNA polymerase and/or by "chewing back" with an exonuclease (e.g., mung bean nuclease). Advantageously, some DNA polymerases also have exonuclease activity. Optionally, after the blunting reaction, the 5' ends of the fragments may preferably be phosphorylated with a polynucleotide kinase. In a preferred embodiment, T4 DNA polymerase and T4 polynucleotide kinase (T4 PNK) are used for fill-in and phosphorylation, respectively. T4 DNA polymerase "fills in" the 3'-recessed ends (5' overhangs) of the DNA through its 5'→3' polymerase activity, while its single-strand 3'→5' exonuclease activity removes 3' overhangs. The kinase activity of T4 PNK adds a phosphate group to 5'-hydroxyl ends.

[0263] Step 7C - Adaptor ligation

[0264] Again as described above, double-stranded oligonucleotide adaptors are ligated to the ends of the polished target DNA fragments. In the presently described embodiment, the adaptors may include loxP adaptors, an example of which is shown in Figure 22. For example, Figure 22 provides an illustrative example of two double-stranded adaptor species, loxP-6F adaptor 2105 and loxP-6R adaptor 2107, each having a first, blunt end lacking a 5' phosphate group and a second end having a 3' overhang of three sequence positions and a phosphorylated 5' end. Those of ordinary skill will appreciate that the 3' overhang is not limited to three sequence positions and may be more or fewer than three, as conditions require.

[0265] To promote the circularized product, the first, blunt ends of adaptors 2105 and 2107 are ligated to the ends of the polished (i.e. blunted) target DNA fragments such that the loxP 2200 regions in the adaptors are in the same orientation, as detailed below. In addition, the second ends of the two adaptor species, which comprise the overhang, and the 5' phosphorylation of each adaptor provide specific advantages. The first advantage is suppression of the multimeric adaptor formation that produces adaptor concatemers as described above. In other words, only the blunt ends of adaptor 2105 and adaptor 2107 are ligatable to each other, which limits such adaptor-adaptor ligation events to forming dimers rather than long concatemers. Concatemers are harder to distinguish from adapted target molecules and in some cases consume a substantial proportion of the adaptor molecules, making them unavailable for ligation to target molecules. The second advantage is that the 5' phosphorylation and the 3' overhang each improve the efficiency of exonuclease degradation, so that removal of uncircularized molecules is improved, all as detailed below.

[0266] Step 7D - Size selection

[0267] Next, the adaptor-ligated nucleic acid fragments 2100 may be purified according to the desired fragment size. This optional size selection step may be performed by any size selection method known in the art and disclosed herein, such as electrophoresis and/or liquid chromatography. In one embodiment, the sheared DNA sample is size-selected by gel electrophoresis as described above. In that embodiment, the gel-based method produces size-fractionated DNA fragments whose lengths are distributed within some range of the desired length (e.g. within 25% of the desired length). For example, a targeted 20 Kb size fraction will yield a population of fragments with lengths of 20 Kb +/- 5 Kb (i.e. a fragment length range of 15 Kb-25 Kb). In the same or other embodiments, alternative size fractionation techniques may be applied, particularly where longer fragments are desired to increase the paired end distance. One such technique suited to size fractionation of larger molecules is known as "pulsed field gel electrophoresis" (hereinafter PFGE; see Schwartz DC, Cantor CR. Separation of yeast chromosome-sized DNAs by pulsed field gradient gel electrophoresis. Cell. 1984 May; 37(1): 67-75, which is incorporated herein by reference in its entirety for all purposes). PFGE can size-fractionate large molecules at a much greater resolution than is achieved with standard gel electrophoresis methods. For example, those of ordinary skill in the related art will appreciate that standard gel electrophoresis methods are generally ineffective at size-separating large molecules, particularly nucleic acid molecules with sequence lengths above about 20 Kb. PFGE methods, on the other hand, provide precise discrimination of the sizes of such large nucleic acid molecules.
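The size window given in the example of paragraph [0267] is simple arithmetic. A minimal Python sketch follows, assuming a symmetric tolerance expressed as a fraction of the target length; the function names `size_window` and `select_fragments` are illustrative only and are not part of the described method.

```python
def size_window(target_bp, tolerance=0.25):
    """Return the (low, high) fragment-length window around a target size."""
    delta = round(target_bp * tolerance)
    return target_bp - delta, target_bp + delta


def select_fragments(lengths_bp, target_bp, tolerance=0.25):
    """Keep only the fragment lengths that fall inside the window (inclusive)."""
    low, high = size_window(target_bp, tolerance)
    return [n for n in lengths_bp if low <= n <= high]


low, high = size_window(20_000)
print(low, high)  # 15000 25000
print(select_fragments([9_000, 15_000, 21_500, 26_000], 20_000))  # [15000, 21500]
```

A real gel cut is not a sharp filter, so the inclusive bounds here only approximate the excised band.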

[0268] In addition, in embodiments applying standard gel electrophoresis or PFGE methods, it is sometimes desirable to use a method known to those of ordinary skill in the art as "electroelution" to efficiently extract nucleic acid or protein molecules from polyacrylamide or agarose gels.

[0269] In some embodiments, it may be important to fill in the gaps left by the adaptor ligation step above, using methods described elsewhere in this specification (e.g. the method described in Step 6K).

[0270] Step 7E - Circularization by recombination

[0271] Next, the linear adapted nucleic acid fragments 2100 are exposed to a site-specific recombinase, such as Cre recombinase, which recognizes the 34 bp loxP regions 2206 of adaptors 2105 and 2107 that are ligated to the ends of, and flank, the target nucleic acid sequence. For adapted fragments comprising adaptor loxP regions 2206 in the same orientation (detailed below), Cre recombinase excises a short linear fragment comprising a hybrid loxP region (see Figure 21, linear product 2155) and circularizes the target nucleic acid, producing a circular molecule that comprises a second hybrid loxP region and the target nucleic acid (see Figure 21, circular product 2150). For example, Figures 21 and 23 illustrate the two recombination products generated by Cre recombinase, linear product 2155 and circular product 2150. Figure 22 further illustrates the composition of recombined adaptor 2110, with hybrid loxP region 2208, present in circularized product 2150. Those of ordinary skill will appreciate that Cre recombinase cuts within loxP region 2206 of both adaptors 2105 and 2107 and recombines them to form products whose loxP regions are hybrids of the 2206 regions of the two original adaptors 2105 and 2107. For example, Cre recombinase binds to loxP region 2206 of each of the 6F 2105 and 6R 2107 adaptors and cuts each at the same sequence position. The bound recombinase/nucleic acid complexes are located at each end of the adapted target nucleic acid fragment and react with each other to join the cut ends of the 6F 2105 and 6R 2107 adaptors, thereby circularizing the nucleic acid fragment. In this example, the recombinase joins the segment cut from the 6F 2105 adaptor that lacks the 8 bp directional sequence 2200 to the segment of the 6R 2107 adaptor that contains the 8 bp directional sequence 2200, producing circular product 2150. In addition, the 8 bp directional sequence 2200 element from the 6F 2105 adaptor is joined to the remainder of the 6R 2107 adaptor lacking the 8 bp directional sequence 2200 element, producing a short hybrid adaptor, which is linear product 2155 described above. The resulting hybrid adaptor in circular product 2150 is shown in Figure 22 as adaptor 2110, which comprises loxP region 2208. Region 2208 has substantially the same sequence composition as loxP region 2206, and the embodiment of adaptor 2110 in circular product 2150 also includes two associated instances of enrichment tag 2205 (one tag originating from each of adaptors 2105 and 2107). In some embodiments, the presence of two instances of enrichment tag 2205 improves the efficiency of the subsequent enrichment step. As shown in Figure 22, the enrichment tag may include biotin, although it should be understood that any type of enrichment tag (i.e. binding pair) described herein or generally known in the art may be used. Note also that adaptor 2110 further includes the blunt ends of the original adaptors 2105 and 2107, which are ligated to the target DNA fragment in circular product 2150.

[0272] Figures 22 and 23 provide an example of the importance of the directionality of the loxP sites for producing circularized product by the SSR method. In the example of Figure 22, the wild-type form of loxP region 2206 (indicated by the box around the sequence region) is incorporated in adaptors 2105 and 2107. However, it should be understood that other mutant forms may be used, provided SSR functionality is retained. In addition, those skilled in the related art will appreciate that in the described SSR system the loxP region has directional features, and that such features affect the products formed upon exposure to Cre recombinase. In the example of Figure 22, region 2206 of both the 6F adaptor 2105 and the 6R adaptor 2107 includes features typical of the Cre/lox system, namely a directional loxP sequence 2200 of 8 bp in length (directionality is indicated by the arrow associated with sequence 2200). In addition, region 2206 comprises palindromic sequence elements of about 13 bp flanking each side of directional sequence 2200.
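The 13 bp / 8 bp / 13 bp layout of the 34 bp loxP region described in paragraph [0272] can be checked directly on a sequence. The sketch below is illustrative only: the sequence used is the commonly cited wild-type loxP site and is an assumption here, since the sequence of Figure 22 is not reproduced in this text, and the helper names `revcomp` and `split_loxp` are hypothetical.

```python
# Commonly cited wild-type loxP site (assumed; not taken from Figure 22).
LOXP_WT = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_loxp(site):
    """Split a 34 bp site into (left arm, spacer, right arm) = 13 + 8 + 13 bp."""
    if len(site) != 34:
        raise ValueError("a loxP site is expected to be 34 bp long")
    return site[:13], site[13:21], site[21:]


left, spacer, right = split_loxp(LOXP_WT)
print(right == revcomp(left))     # True: the 13 bp arms are inverted repeats
print(spacer == revcomp(spacer))  # False: the 8 bp spacer is asymmetric
```

The asymmetric 8 bp spacer is what gives the site a direction: reading the site on the opposite strand yields the same arms but a different spacer.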

[0273] Figure 23 provides illustrative examples of the SSR products generated according to the relative orientation of loxP regions 2206. First, Figure 23A provides a representative example of an adapted fragment 2100' having two loxP regions 2206 positioned in opposite orientation, and the linear inversion product 2305 generated by Cre recombinase (indicated by the changed position of shaded region 2300). In contrast, Figure 23B provides a representative adapted fragment 2100" having two loxP regions 2206 positioned in the same orientation, and the products generated by Cre recombinase: a first, circular product 2150 comprising region 2208 (in recombined adaptor 2110 as described above), and a second, linear product 2155 excised from adapted fragment 2100 and comprising a second recombined region 2208. It should be understood that the recombination reaction of Figure 23B is "bidirectional", as indicated by the double arrows, where excision arrow 2334 indicates a greater magnitude of reaction direction than the integration direction indicated by integration arrow 2336. Those of ordinary skill in the related art will also appreciate that arrows 2334 and 2336 are given for illustrative purposes only and are not drawn to the exact scale of the actual magnitudes of directionality, which may depend at least in part on reaction conditions. Importantly, in a preferred embodiment, the reaction conditions are optimized to favor the excision direction and the formation of circular product.
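The dependence of the product on the relative orientation of the two loxP regions, described in paragraph [0273], can be illustrated at the sequence level. The sketch below is a simplified model under stated assumptions: it uses the commonly cited wild-type loxP sequence (not taken from Figure 22), requires exactly two sites, treats the reaction as running to completion in the excision direction, and ignores the exact cut position inside the spacer, which does not change the product sequence when both sites are identical. The names `find_sites` and `recombine` are hypothetical.

```python
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"  # assumed wild-type loxP site
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq):
    """Return sorted (start, strand) pairs for every loxP site in seq."""
    hits = []
    for strand, motif in (("+", LOXP), ("-", revcomp(LOXP))):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)


def recombine(seq):
    """Predict the products of recombination between two loxP sites."""
    sites = find_sites(seq)
    if len(sites) != 2:
        raise ValueError("expected exactly two loxP sites")
    (i, strand_1), (j, strand_2) = sites
    if strand_1 == strand_2:
        # Same orientation: the segment between the sites leaves as a circle
        # carrying one site; the flanks are joined around the other site.
        return {"event": "excision", "circular": seq[i:j], "linear": seq[:i] + seq[j:]}
    # Opposite orientation: the segment between the sites is inverted in place.
    n = len(LOXP)
    return {"event": "inversion", "linear": seq[:i + n] + revcomp(seq[i + n:j]) + seq[j:]}


target = "GATTACAGATTACA"
direct = "AA" + LOXP + target + LOXP + "TT"
inverted = "AA" + LOXP + target + revcomp(LOXP) + "TT"
print(recombine(direct)["event"])    # excision
print(recombine(inverted)["event"])  # inversion
```

In the same-orientation case the `circular` entry corresponds to circular product 2150 (target plus one recombined loxP region) and the `linear` entry to linear product 2155.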

[0274] Step 7F - Removal of non-circular nucleic acids

[0275] Subsequently, all linear nucleic acid molecules, including excised product 2155, inversion product 2305, adaptor dimers, unadapted target nucleic acid fragments and the like, may be removed by any method described elsewhere in this specification. For example, an exonuclease treatment strategy may be employed to effectively remove all linear nucleic acid molecule products or other residual linear fragments.

[0276] In some embodiments, it may be desirable to use more than one type of exonuclease to improve the efficiency of removing any unwanted linear nucleic acid molecules. For example, in some embodiments two or more exonucleases may be used, which may include but are not limited to Exonuclease 1 (also referred to as EXO 1) and the class of exonucleases known as ATP-dependent DNases, which digest linear double-stranded DNA (e.g. Plasmid-Safe™ ATP-Dependent DNase, available from Epicentre Biotechnologies, Madison WI).

[0277] Step 7G - Linearization

[0278] Then, circular nucleic acid product 2150 may be fragmented by any of the various methods described elsewhere in this specification to form linear nucleic acid molecules that comprise the end regions of the starting target nucleic acid with the adaptor region in the middle. In the presently described variation, it may be particularly advantageous to use one of the mechanical shearing methods, such as nebulization, which allows a preferred fragment length to be selected and promotes the formation of paired tags in which one or more of the paired tags has a longer sequence length.

[0279] In addition, it is important to note that the adaptor elements shown in Figure 22 lack MmeI or the other type IIS restriction sites described elsewhere in this specification, although it should be readily understood that such sites may also be included. Indeed, in some embodiments it is advantageous to incorporate an MmeI site in one of the adaptor species, so that when a nucleic acid fragment has been ligated to both adaptor species and circularized, the circular molecule can be cut with the MmeI enzyme, leaving a 20 bp tag at one end of the new linear fragment. The linear fragment is then fragmented again by mechanical means, as further detailed below and elsewhere in this specification, where the mechanical fragmentation selects for a specific fragment length much larger than the combination of the 20 bp tag and the 34 bp loxP region. The result is that the second tag of the pair is longer than the first tag, and the likelihood of fragmentation within the intervening region comprising adaptor 2110 is greatly reduced. The preferred length of the second tag of the pair may be based at least in part on the average or total read length capability of the sequencing method used to generate sequence data from the resulting paired end fragments.

[0280] In some embodiments, carrier DNA may also be added before the linearization step to prevent inadvertent loss, in subsequent purification steps, of valuable target DNA fragments that may be present in low quantity and/or low quality. In the embodiments that use a type IIS restriction site (e.g. MmeI), it may be advantageous to use MmeI carrier DNA, as described elsewhere in this specification.
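The 20 bp tag left by MmeI cleavage, as described in paragraph [0279], can be located computationally once an MmeI site is placed next to the target sequence. The following minimal sketch assumes the published MmeI specificity, recognition sequence TCCRAC (R = A or G) with the top-strand cut 20 nucleotides downstream of the site; it scans the top strand only and ignores the staggered bottom-strand cut. The function name `tags_after_mmei` is hypothetical.

```python
import re

MMEI_SITE = re.compile(r"TCC[AG]AC")  # TCCRAC, where R is A or G
TAG_LEN = 20                          # top-strand cut 20 nt past the site


def tags_after_mmei(seq):
    """Return (site_start, tag) for each top-strand MmeI site followed by a full tag."""
    found = []
    for match in MMEI_SITE.finditer(seq):
        tag = seq[match.end():match.end() + TAG_LEN]
        if len(tag) == TAG_LEN:
            found.append((match.start(), tag))
    return found


# Toy molecule: 2 nt of adaptor, an MmeI site, then target sequence.
molecule = "GG" + "TCCAAC" + "ACGTACGTACGTACGTACGT" + "TTTT"
print(tags_after_mmei(molecule))  # [(2, 'ACGTACGTACGTACGTACGT')]
```

A full treatment would also scan the reverse complement, since a site on the bottom strand directs cleavage in the opposite direction.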

[0281] It may also be advantageous, in the same or alternative embodiments, to use other types of carrier DNA better suited to particular applications for other purposes. One such purpose is analyzing the efficiency of a mechanical processing step (e.g. the linearization step above). In some embodiments, it is desirable to evaluate the efficiency of a mechanical fragmentation method, such as the nebulization method described herein, where paired end templates 2160 are not produced in sufficient quantity for an effective determination of that efficiency. It is therefore desirable to increase the amount of fragmentation product by adding some circular carrier DNA before the fragmentation step. However, once pooled in the sample, such carrier DNA products are difficult to distinguish from paired end templates 2160. In such embodiments, it is more advantageous to limit the amount of sequenceable carrier DNA after the mechanical analysis step has been performed. In other words, it is beneficial to use carrier DNA for the analysis of the mechanical processing step, but it is generally undesirable to spend the valuable resources of the sequencing step generating sequence information from carrier DNA of no value. One way of limiting the amount of sequenceable carrier DNA is to render it unamplifiable by PCR or other amplification methods. Thus, in embodiments where the pool of linearized products (e.g. paired end templates 2160) is further amplified for sequencing, the representation of the total carrier DNA population is markedly reduced in the amplified population of sequenceable templates, denoted population 2170. For example, as will be described in more detail below, circular carrier DNA such as pUC19 may be specifically treated with short-wave ultraviolet light, which effectively cross-links the strands by generating pyrimidine dimers and renders the DNA unamplifiable, so that it is substantially absent from the final sample and is not sequenced. The treated carrier DNA may be added to the sample containing circularized target DNA (i.e. circular product 2150) and linearized, so that the sample includes linearized representatives of both the target (i.e. paired end templates 2160) and the carrier DNA populations. In this example, the whole sample may be analyzed to determine the efficiency of linearization, for example using a LabChip DNA 7500 chip available from Agilent Technologies, Inc., where the carrier DNA enables a more accurate determination owing to the increased amount of nucleic acid. During subsequent amplification of the sample by any of the methods described herein, the copy number of the carrier DNA will not increase, so that the amplified sample has a markedly greater proportion of target DNA molecules.
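The effect described in paragraph [0281], where unamplifiable carrier DNA becomes a vanishing fraction of the amplified sample, can be illustrated with a short calculation. The sketch below assumes an idealized amplification in which every target molecule doubles in each cycle and the UV-treated carrier is not copied at all; real amplification efficiencies are lower, and the numbers and the function name `carrier_fraction` are illustrative only.

```python
def carrier_fraction(target_copies, carrier_copies, cycles):
    """Fraction of molecules that are carrier after idealized amplification.

    Assumes perfect doubling of target per cycle and no copying of carrier.
    """
    amplified_target = target_copies * 2 ** cycles
    return carrier_copies / (carrier_copies + amplified_target)


# Carrier starts at 99% of all molecules in the sample.
print(f"{carrier_fraction(1, 99, 0):.3f}")   # 0.990
print(f"{carrier_fraction(1, 99, 10):.3f}")  # 0.088
```

Under these assumptions, ten cycles are enough to turn a 99% carrier sample into one that is more than 90% target.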

[0282] Step 7H - Enrichment

[0283] In addition, Figure 22 shows an embodiment of enrichment tag 2205 incorporated in each adaptor species, which may include a biotin tag or another type of enrichment tag described elsewhere in this specification or generally known in the art. As described above, the enrichment tag, e.g. a biotin moiety, allows optional selection of adaptor-containing paired end fragments during ligation of the paired end adaptors, during the fill-in reaction (fragment repair), and during paired end library amplification, as well as optional immobilization of the paired end library fragments (after linearization of the circular nucleic acid). An additional advantage of the loxP adaptors 2105 and 2107 described herein is that adaptor-adaptor ligation events result only in adaptor dimers, i.e. formation of multimeric adaptor concatemers is prevented.

[0284] One aspect of the variation of method 7 of the present invention is consistent with other methods and variations described herein, for example Steps J-L of the sixth method (Steps 6J-6L) for adaptor ligation and amplification, and the subsequent sequencing of the products, which is also described in this application.

[0285] As noted above, the variation of method 7 provides significant advantages over other methods in its ability to effectively cover genome scaffolds with a minimal number of sequence reads, as shown in Figure 25. For example, Figure 25 illustrates that, in E. coli K12 genome scaffold assembly, long paired end reads of about 20 Kb provide a significant advantage over shorter paired end reads of about 3 Kb, and an even greater advantage over known shotgun-based methods. The seventh method also provides advantages over ligation-based methods in that it requires fewer processing steps, which consume fewer valuable resources such as technician labor, instrument time and usage, and reagent usage.

[0286] It is to be understood that the invention also contemplates and includes any combination of the corresponding steps of the seven methods above.

[0287] As can be observed from the disclosure above, there are similarities among methods 1, 2, 3, 4, 5 and 6. In particular, the similar steps of methods 2, 3, 4, 5 and 6 are especially alike and may be combined and interchanged between methods to produce equivalent or advantageous results.

[0288] Having described general methods of paired end sequencing, variations of the methods are described below.

[0289] In one variation, the hairpin adaptor may be replaced with an overhang adaptor (Figure 8). The overhang adaptor may be biotinylated and may have, for example, the following sequence:


[0291]       |  |||||||||||||  ||||||
[0292] 3' OH-G--TTTGGGAAAGCCA--AGGTTG-5' PO4 (SEQ ID NO: 29)

[0293] The six 3'-terminal nucleotides of the top strand (SEQ ID NO: 28), i.e. TCCAAC, pair with the complementary nucleotides of the bottom strand (SEQ ID NO: 29) to form the recognition site of the type IIS restriction endonuclease MmeI.
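The pairing stated in paragraph [0293] can be verified from the bottom strand alone. The short sketch below takes the last block of SEQ ID NO: 29 as written in the text (3'→5', AGGTTG), complements it base by base to obtain the paired top-strand bases in the 5'→3' direction, and checks the result against the MmeI recognition pattern TCCRAC, which is assumed here from the published specificity of the enzyme. The function name `complement` is illustrative.

```python
import re

_PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement(seq):
    """Base-by-base complement, keeping the written order of the strand."""
    return "".join(_PAIR[base] for base in seq)


bottom_3_to_5 = "AGGTTG"                  # last block of SEQ ID NO: 29, read 3'->5'
top_5_to_3 = complement(bottom_3_to_5)    # paired top-strand bases, read 5'->3'
print(top_5_to_3)                                         # TCCAAC
print(bool(re.fullmatch(r"TCC[AG]AC", top_5_to_3)))       # True
```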

[0294] This variation is performed in a manner similar to method 3. First, genomic DNA (Figure 8A) is fragmented and polished (Figure 8B), after which overhang adaptors are ligated to the fragment ends (Figure 8C). Dimers of the overhang adaptor may be removed by size fractionation chromatography (i.e. a spin column) or by charge-based chromatography. Higher-order concatemers of overhang adaptors cannot form because the 5' overhangs lack phosphate groups. After removal of the overhang adaptor dimers (Figure 8D), kinase treatment enables self-ligation of the fragments (Figure 8E). Self-ligation (i.e. circularization) is carried out and may be followed by exonuclease digestion to remove unligated, non-circular DNA. Because DNA fragments not ligated to overhang adaptors have blunt ends produced by polishing, they ligate less efficiently than fragments bearing the 5' overhangs (sticky ends) of two overhang adaptors, one ligated on each side. After circularization, digestion with MmeI removes the DNA distal to the overhang adaptors (see Figure 8F), leaving about 20 bases of starting genomic DNA on each side of the ligated overhang adaptors (Figure 8G). Fragments bearing overhang adaptors are purified using streptavidin beads, which bind the biotinylated adaptors (Figure 8H).

[0295] The resulting fragments may be sequenced by any effective method, such as the methods provided in this disclosure (e.g. Step 3H).

[0296] Nucleic acids produced by the methods of the invention may be sequenced using one or more primers complementary to the ends of the sequence. That is, under the sequencing scheme described in Step 3H, sequencing adaptor A and sequencing adaptor B are ligated to the fragment ends before sequencing. Because the terminal sequence of each fragment is known to be either sequencing adaptor A or sequencing adaptor B, sequencing primers complementary to sequencing adaptor A or B can be used to sequence the fragments. In addition, the sequence in the middle of each fragment, which contains the ligated adaptor, is known (see e.g. 703 in Figure 7). A primer complementary to this middle region may also be used to begin sequencing from the middle. Furthermore, a sequencing primer for the end region and a sequencing primer for the middle region may be hybridized at the same time to a fragment to be sequenced (see Figure 9). One primer is protected, while the other is unprotected. In Figure 9, the primer hybridized to the end is protected by a phosphate group. The first round of sequencing starts from the unprotected primer (Figure 9, middle primer). After the first round of sequencing, extension of the first primer may optionally be terminated, for example by incorporating the complementary dideoxynucleotide. Alternatively, extension of the first primer may be carried through to the end of the template strand, making termination unnecessary. The second, protected primer may then be deprotected and extended in a second round of sequencing to determine the sequence of the fragment end. This method makes possible two long paired end sequencing reads from a single template, which may be single-stranded.

[0297] In a second variation, fragmented starting DNA (Figure 10A) is ligated to adaptors having a 3' CC overhang and, optionally, an internal type IIS restriction endonuclease site. The ligated fragments cannot self-ligate or self-circularize because their ends are incompatible (not complementary). However, these fragments can be joined using a linker having 5' GG overhangs on both sides (Figure 10B). After ligation, the nucleic acid fragments may be purified from non-circular DNA by the standard gel and column chromatography discussed above, or by digestion with an exonuclease that cleaves uncircularized molecules. The resulting circular DNA (Figure 10D) may be cut with MmeI as in the other methods, after which the resulting DNA may be sequenced.

[0298] In another variation, the methods of the invention may be used to produce A/B-adapted ssDNA (Figure 11, step 1). Such a single-stranded fragment may be circularized by hybridization to an oligonucleotide comprising sequence complementary to the A/B adaptors (Figure 11, step 2) and ligated in the presence of ligase. In addition to facilitating ligation, the oligonucleotide may also serve as a primer for rolling circle amplification of the circularized ssDNA (Figure 11, step 3). The rolling circle amplified DNA may be cleaved as described in method 1, Steps 1K and L (Figures 1L and 1M). After amplification, standard library preparation and sequencing techniques may be applied to the product (Figure 11, step 4).

[0299] Some embodiments of the invention are based on an unexpected finding in paired end sequencing experiments on the genome of E. coli strain K12 in which the protocol included MmeI cleavage according to the methods described herein: the depth of read coverage varied greatly across the genome (Figure 20, "no carrier (-)"). Depth refers to the number of sequence reads that map to substantially the same genomic region. This variation in depth correlated with the density of MmeI sites across the genome (Figure 20). Unexpectedly and surprisingly, the inventors found that adding double-stranded DNA known to contain MmeI sites (labeled "(+)" in Figure 20), i.e. E. coli B strain DNA ("EcoliB Strain(+)"), salmon sperm DNA ("SalSprmDNA(+)") or a PCR amplification product known to contain MmeI sites ("AmpPosMmeI(+)"), greatly reduced and randomized the variation in depth of coverage across the genome. In contrast, adding double-stranded DNA lacking MmeI sites (labeled "(-)" in Figure 20), i.e. poly(dIdC) ("dIdC(-)") or a PCR amplification product known to contain no MmeI sites ("AmpNegMmeI(-)"), did not change the pattern of variation in depth of coverage across the genome relative to the "no carrier" control. Therefore, use of MmeI-positive carrier DNA provides a more uniform distribution of paired end reads across the genome, which is advantageous. The data listed in the table below further confirm these unexpected findings:

[0300] Table 1. Effect of MmeI carrier DNA on the depth distribution and length of paired-end reads

[0301]

Sample                 Depth Ave  Depth STDEV  Depth %CV  Length Ave  Length STDEV  Length %CV
Stratagene_SS_dsDNA    25.59      9.27         36.2%      2,219       618           27.8%
EcoliBStrain           21.99      [illegible]  37.8%      2,210       618           28.0%
AmpPos                 22.82      7.51         32.9%      2,199       618           28.1%
dIdC                   22.17      26.55        119.7%     2,397       651           27.2%
AmpNeg                 21.10      22.93        108.7%     2,363       639           27.0%
Negative               23.05      26.01        112.8%     2,385       654           27.4%

[0302] Table 1 shows coverage depth statistics for E. coli K12. The first three samples (rows) received MmeI-positive carrier DNA, and the bottom three samples received MmeI-negative carrier DNA. The column headings denote:

"Depth Ave" = average depth; "Depth STDEV" = standard deviation of depth; "Depth %CV" = standard deviation of depth divided by average depth (this quotient expresses the variation in depth normalized to the average depth); "Length Ave" = average distance between paired reads in the genome; "Length STDEV" = standard deviation of the distance between paired reads in the genome; "Length %CV" = standard deviation of length divided by average length.
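
The column definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce Table 1: the function names `coverage_depth` and `summarize` are hypothetical, reads are assumed to be given as half-open `(start, end)` intervals on a linear genome, and the population standard deviation is assumed because the source does not say which estimator was used.

```python
from statistics import mean, pstdev


def coverage_depth(read_intervals, genome_length):
    """Return per-position depth for half-open (start, end) read intervals."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in read_intervals:
        diff[start] += 1
        diff[end] -= 1
    depth, running = [], 0
    for delta in diff[:genome_length]:
        running += delta
        depth.append(running)
    return depth


def summarize(values):
    """Return (average, standard deviation, %CV) for a list of numbers."""
    avg = mean(values)
    sd = pstdev(values)  # assumption: population standard deviation
    return avg, sd, 100.0 * sd / avg


if __name__ == "__main__":
    depth = coverage_depth([(0, 4), (2, 6)], 6)
    print(depth, summarize(depth))
```

The same ratio applied to the AmpPos row of Table 1 (100 × 7.51 / 22.82) gives approximately 32.9, which matches the reported Depth %CV.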

[0303] Consistent with FIG. 20, Table 1 shows that adding MmeI-positive carrier DNA greatly reduces the variation in coverage depth across the E. coli K12 genome (see the Depth STDEV and Depth %CV values; smaller Depth STDEV and Depth %CV values are favorable).

[0304] This results in a more uniform distribution of paired-end reads across the genome. Such a uniform distribution is advantageous.

[0305] Table 2. Effect of paired-end sequencing with MmeI-positive carrier DNA on genome scaffolding of E. coli K12

[0306]

Figure CN102027130AD00281

[0308] Table 2 shows the effect of paired-end sequencing data obtained with MmeI-positive carrier DNA on the scaffolding of shotgun contigs. When 121 large contigs, obtained by shotgun sequencing of E. coli K12 genomic DNA on a GS20 sequencer (454 Life Sciences, Branford, CT, USA), were assembled with paired-end sequencing reads, the paired-end reads generated with MmeI-positive carrier DNA (columns "Stratagene SS dsDNA(+)", "E. coli B strain (+)" and "Amplification positive (+)") yielded fewer, and therefore larger, scaffolds (19-25) than paired-end reads generated with no carrier DNA or with carrier DNA lacking MmeI sites (48-56 scaffolds). Thus, the use of MmeI-positive carrier DNA improves the genome assembly performance obtained by paired-end sequencing according to the invention.
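
To make the scaffold counts concrete, the sketch below shows one simple way that paired-end links can merge contigs into scaffolds. It is an illustration only and not the assembler used in the experiments: the function `count_scaffolds` is hypothetical, each link is assumed to be a pair of contig indices joined by at least one read pair, and contig order, orientation, and minimum link support are ignored.

```python
def count_scaffolds(num_contigs, links):
    """Count connected groups of contigs joined by paired-end links."""
    parent = list(range(num_contigs))

    def find(x):
        # Follow parent pointers to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two groups
    return len({find(i) for i in range(num_contigs)})


if __name__ == "__main__":
    # Five contigs; read pairs link 0-1 and 1-2, so three scaffolds remain.
    print(count_scaffolds(5, [(0, 1), (1, 2)]))
```

Under this simplified model, more evenly distributed read pairs make it more likely that every pair of neighboring contigs is linked, which lowers the scaffold count.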

[0309] As noted above, some embodiments of the invention include the use of double-stranded "carrier DNA". In some embodiments, carrier DNA is used in a step that includes cleavage of DNA by the restriction endonuclease MmeI. In such embodiments, the carrier DNA contains one or more MmeI sites. Endonucleolytic cleavage by MmeI is most efficient when the number of moles of MmeI enzyme molecules is approximately equal to the number of moles of MmeI sites present in the DNA sample (catalog, New England Biolabs, Ipswich, MA, USA). In the methods of the invention, the number of MmeI sites may be difficult to estimate, both because the DNA concentration is low (typically a few nanograms to a few tens of nanograms), which makes reliable measurement difficult and time-consuming, and because the number of MmeI sites varies with the target DNA to be sequenced. Correctly calculating the amount of MmeI enzyme to add to the reaction (to reach a stoichiometric concentration) therefore becomes a problem. To overcome this difficulty and to satisfy the need to balance the number of MmeI sites against the number of MmeI enzyme molecules, some methods of the invention include adding an excess of carrier DNA (relative to the sample DNA). The amount of MmeI enzyme to add to the reaction can then be calculated from the known amount of carrier DNA, and the number of MmeI sites in the (circular) sample DNA becomes negligible. Measuring the DNA concentration of the sample DNA is therefore unnecessary, which speeds up the method and reduces its cost and time. The amount of carrier DNA may exceed the amount of sample DNA by several-fold to about 10-fold, several-fold to about 100-fold, several-fold to about 1000-fold, or more. In a preferred embodiment, 2 micrograms of sonicated double-stranded salmon sperm DNA are added to the sample DNA together with 2 units of MmeI and all required reagents (e.g., 1X NEBuffer 4 (New England Biolabs) and 50 μM S-adenosylmethionine (SAM)) in a volume of 100 microliters, and incubated at about 37 degrees Celsius for about 15 minutes. The skilled artisan will appreciate that the reaction temperature and duration may be adjusted within practical limits.
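
The reasoning about site stoichiometry can be illustrated with a short calculation. The sketch below is hedged in several ways: it assumes the MmeI recognition sequence TCCRAC (R = A or G) as given in supplier documentation rather than in this text, it assumes an average mass of 650 g/mol per base pair of double-stranded DNA, and the helper names and the `sites_per_kb` parameter are hypothetical. It estimates moles of sites only and does not convert to enzyme units.

```python
import re

# Assumed recognition sequence TCCRAC, searched on both strands.
# Lookaheads allow overlapping matches to be counted.
MMEI_TOP = re.compile(r"(?=TCC[AG]AC)")
MMEI_BOTTOM = re.compile(r"(?=GT[CT]GGA)")  # reverse complement of TCCRAC


def count_mmei_sites(seq):
    """Count candidate MmeI sites on both strands of a DNA sequence."""
    seq = seq.upper()
    return len(MMEI_TOP.findall(seq)) + len(MMEI_BOTTOM.findall(seq))


def pmol_sites(mass_ug, sites_per_kb, g_per_mol_per_bp=650.0):
    """Estimate picomoles of sites in a given mass of double-stranded DNA."""
    bp_pmol = mass_ug * 1e-6 / g_per_mol_per_bp * 1e12  # picomoles of base pairs
    return bp_pmol * sites_per_kb / 1000.0


if __name__ == "__main__":
    print(count_mmei_sites("AATCCGACTTGTTGGAAA"))
    # 2 ug of carrier DNA at an assumed density of one site per kb.
    print(round(pmol_sites(2.0, 1.0), 2))
```

With these assumptions, 2 μg of carrier DNA carrying about one site per kilobase holds roughly 3 pmol of sites, whereas a few tens of nanograms of sample DNA would contribute a far smaller amount, which is why the sample's contribution can be neglected.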

[0310] The use of an excess of carrier DNA containing MmeI sites in an MmeI restriction digest, in combination with an approximately stoichiometric amount of MmeI enzyme as described above, may optionally be incorporated into any method described in this disclosure that includes MmeI digestion, for example step 6H of the sixth method (FIG. 17H). The skilled artisan will also recognize that the strategy of adding "carrier DNA" containing MmeI sites is beneficial in any MmeI restriction digest, particularly in reactions where the amount of sample DNA is low and/or the number of MmeI sites in the sample DNA is unknown.

[0311] In further embodiments, carrier DNA can be used in mechanical manipulations of the sample, where it is preferred that the carrier DNA not interfere with other steps of the method. One such step is amplification of the DNA sample, in which the circular carrier DNA may be treated (i.e., by inducing DNA damage) using methods known to those of ordinary skill that render the DNA non-amplifiable but otherwise unaffected. For example, pUC19 carrier DNA may be irradiated with short-wavelength ultraviolet light for about 45 minutes (i.e., typically between 30 and 60 minutes) to produce so-called "pyrimidine dimers" in the DNA structure. The polymerases commonly used in amplification methods cannot "read through" the dimers on the template DNA, so the irradiated pUC DNA is not amplifiable. Those skilled in the art will also understand that any other method that damages DNA so that it cannot be amplified may be used. For example, the damage may be produced by endogenous or exogenous means. Some methods of producing DNA damage include, but are not limited to, UV damage (UV-B, UV-A), alkylation/methylation, X-ray damage, hydrolysis (i.e., depurination caused by thermal disruption), and oxidative damage.

[0312] As described above, in some embodiments treated circular carrier DNA is added to the circularized target DNA sample to improve the efficiency of the linearization step, in particular linearization by mechanical fragmentation (e.g., by nebulization). For example, between 1 and 4 μg of treated pUC carrier DNA may be added to the circularized target DNA sample, which is then nebulized at 30 psi for 2 minutes to produce a population of linear nucleic acid fragments whose members comprise a pair distance of about 20 kb. The entire nebulized sample is assayed on a LabChip 7500 assay chip from Agilent Technologies to determine whether nebulization produced the desired result.

[0313] Table 3: Results obtained using untreated carrier DNA

Figure CN102027130AD00291

[0315] Table 3 shows the relative percentage of carrier DNA present in the sample after amplification, which is proportional to the amount of untreated carrier DNA added to the sample before amplification. For example, adding 1 μg of untreated carrier DNA resulted in carrier DNA representing 6% of the nucleic acid molecules in the amplified sample, and adding 3 μg resulted in a representation of 20%.

[0316] Table 4: Results obtained using treated carrier DNA

Figure CN102027130AD00301

[0317] [0318] Table 4 shows the relative percentage of treated carrier DNA present in the sample after amplification, which is greatly reduced compared with the untreated carrier DNA in Table 3. For example, adding 1 μg of treated carrier DNA resulted in carrier DNA representing 0.02% of the nucleic acid molecules in the amplified sample, and adding 3 μg resulted in a representation of 0.06%.
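
The percentages quoted from Tables 3 and 4 can be compared directly. The short sketch below only restates those four quoted values (the full tables are images and are not reproduced here) and computes the fold reduction in carrier representation; the variable names are illustrative.

```python
# Percent of nucleic acid molecules that are carrier DNA after amplification,
# keyed by micrograms of carrier DNA added (values quoted in the text).
untreated_pct = {1: 6.0, 3: 20.0}
treated_pct = {1: 0.02, 3: 0.06}

# Fold reduction in carrier representation achieved by treating the carrier.
fold_reduction = {
    ug: untreated_pct[ug] / treated_pct[ug] for ug in untreated_pct
}

if __name__ == "__main__":
    for ug, fold in sorted(fold_reduction.items()):
        print(f"{ug} ug carrier: about {fold:.0f}-fold less carrier after treatment")
```

For both quoted amounts, the treated carrier is represented roughly 300-fold less than the untreated carrier.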

[0319] Ligation in water-in-oil emulsions

[0320] Some embodiments of the invention further include methods for circularizing nucleic acid molecules by ligation. Circularization of nucleic acid molecules is generally achieved by ligation at low nucleic acid concentration. Low concentrations favor the desired intramolecular ligation reaction (i.e., circularization), which follows first-order reaction kinetics, over intermolecular events, which follow second-order (or higher-order) kinetics (F. M. Ausubel et al. (eds.), 2001, Current Protocols in Molecular Biology, John Wiley & Sons Inc.). However, intermolecular events cannot be prevented even at high dilution, and excessive dilution of the nucleic acid is impractical. The occurrence of intermolecular ligation (concatemers, double circles, etc.) reduces the yield of the desired intramolecular circularization events. In some cases, intermolecular ligation products may be detrimental to downstream applications. Overall, conventional methods have at least two major drawbacks. First, the need to dilute the starting nucleic acid increases the reaction volume and the associated reagent costs. High dilution also makes it difficult to recover the reaction products efficiently. Second, a substantial number of intermolecular ligation events do occur, reducing the yield of the desired intramolecular ligation product.
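
The concentration argument can be made explicit with a toy rate model. The sketch below is a simplification under stated assumptions, not a model taken from the source: the intramolecular rate is taken as first order in DNA concentration, the intermolecular rate as second order, and both rate constants are hypothetical values chosen only to show the trend.

```python
def initial_intramolecular_fraction(k_intra_per_s, k_inter_per_M_s, dna_molar):
    """Fraction of initial ligation events that are intramolecular.

    Assumes rate_intra = k_intra * [DNA] (first order) and
    rate_inter = k_inter * [DNA]^2 (second order).
    """
    intra_rate = k_intra_per_s * dna_molar
    inter_rate = k_inter_per_M_s * dna_molar ** 2
    return intra_rate / (intra_rate + inter_rate)


if __name__ == "__main__":
    # Hypothetical rate constants, for illustration only.
    for conc in (1e-8, 1e-9, 1e-10, 1e-11):
        frac = initial_intramolecular_fraction(1.0, 1e9, conc)
        print(f"{conc:.0e} M: {frac:.3f} intramolecular")
```

In this model the fraction equals k_intra / (k_intra + k_inter × [DNA]), so dilution raises it toward 1 but never reaches 1, which mirrors the statement that intermolecular events cannot be fully prevented by dilution alone.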

[0321] The invention includes methods that largely eliminate the problems associated with the conventional circularization methods described above. For example, according to the invention, the ligation reaction need not be performed at high dilution (i.e., at low nucleic acid concentration). In one embodiment, individual linear double-stranded DNA molecules having compatible ligatable ends (e.g., blunt ends or staggered ("sticky") ends) are ligated in physically isolated reaction environments. An aqueous solution containing the DNA to be ligated and all reagents necessary for the ligation reaction (e.g., DNA ligase, ligase buffer, ATP, etc.) is emulsified in oil, preferably in the presence of a surfactant that stabilizes the emulsion. Suitable compositions and methods for preparing emulsions are discussed further below. The resulting water-in-oil emulsion contains microdroplets (microreactors), each containing zero, one, or more DNA molecules. The number of DNA molecules per microreactor can be adjusted by varying the DNA concentration and the droplet size. For the skilled artisan, calculating suitable conditions from the nucleic acid concentration, the size of the polynucleotide (length measured in bases), and the average droplet volume is a matter of routine optimization. Ideally, a microdroplet contains one ligatable DNA molecule. It will be understood, however, that within a population of microreactors the number of DNA molecules per microreactor will vary, in part because of variation in microreactor size and the random distribution of DNA molecules. Thus, some microreactors may contain no DNA molecules, some may contain one DNA molecule, and some may contain two or more DNA molecules. Those skilled in the art will recognize that yield and cost (reagent usage) can be balanced as needed by varying the average number of DNA molecules per microreactor.
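
The routine calculation mentioned above can be sketched as follows. This is a minimal illustration under explicit assumptions: droplets are treated as equal-sized spheres (the text notes that real droplet sizes vary), molecules are assumed to distribute randomly so that occupancy follows a Poisson distribution, double-stranded DNA is assumed to weigh about 650 g/mol per base pair, and all function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def droplet_volume_liters(diameter_um):
    """Volume of a spherical droplet of the given diameter, in liters."""
    radius_m = diameter_um * 1e-6 / 2.0
    return (4.0 / 3.0) * math.pi * radius_m ** 3 * 1000.0  # m^3 to liters


def mean_molecules_per_droplet(dna_ng_per_ul, length_bp, diameter_um,
                               g_per_mol_per_bp=650.0):
    """Average number of DNA molecules per droplet."""
    grams_per_liter = dna_ng_per_ul * 1e-3  # 1 ng/uL equals 1e-3 g/L
    molar = grams_per_liter / (length_bp * g_per_mol_per_bp)
    return molar * AVOGADRO * droplet_volume_liters(diameter_um)


def occupancy(mean_per_droplet):
    """Poisson fractions of droplets holding 0, exactly 1, and 2 or more molecules."""
    p0 = math.exp(-mean_per_droplet)
    p1 = mean_per_droplet * p0
    return p0, p1, 1.0 - p0 - p1


if __name__ == "__main__":
    lam = mean_molecules_per_droplet(0.001, 3000, 10.0)
    print(lam, occupancy(lam))
```

Lowering the mean occupancy reduces the fraction of droplets that hold two or more molecules (and so can still ligate intermolecularly) at the price of more empty droplets, which is the yield-versus-cost balance described in the text.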

[0322] The ligation mixture is preferably kept ice-cold (e.g., at 0-4 degrees Celsius) during assembly and until emulsification is complete. This prevents the ligation reaction from proceeding before the desired emulsion environment has formed, and thus prevents the formation of unwanted intermolecular bonds. The emulsified ligation reaction is then incubated at a temperature that permits ligation. The incubation time may range from a few minutes to 1 hour, several hours, overnight, or 24 hours or longer. After this incubation, but before, during, or after breaking the emulsion, the ligation reaction can be stopped to prevent unwanted intermolecular ligation in the pooled ligation reaction. The ligation reaction can be stopped by lowering the temperature to about 0-4 degrees Celsius (ice water), by heat inactivation of the ligase, by adding EDTA, by adding a ligase inhibitor, or by any combination of such methods.

[0323] The skilled artisan can readily apply the above methods of the invention to the circularization of single- or double-stranded RNA or single- or double-stranded DNA. For example, the ends of a linear single-stranded polynucleotide molecule can be brought into direct juxtaposition by annealing to a capping oligonucleotide (also known as a bridging oligonucleotide) that has portions complementary to each end of the linear single-stranded polynucleotide molecule, as described in Method 1, step 1K (see FIG. 1L and FIG. 11).

[0324] The emulsified ligation reaction can then be incubated at a suitable temperature. For example, for "sticky-end" ligation with T4 DNA ligase, a suitable incubation temperature is 16 degrees Celsius, although a wider temperature range is also acceptable. Conditions for ligating DNA and other molecules are generally known in the art. One advantage of performing circularization in an emulsion is that an extended reaction time is neutral or even beneficial to the success of the method. For example, where each microreactor contains only one DNA molecule, the incubation time can be extended until most of the DNA molecules are circularized. In contrast, with the conventional non-emulsion methods described above, prolonged incubation may lead to a higher proportion of intermolecular ligation products. Another advantage of the emulsion-based ligation method of the invention is that the reaction can proceed for a relatively long time without increasing the incidence of intermolecular ligation. This increase in incubation time allows a larger number of circularized products without increasing the risk of intermolecular ligation. Furthermore, because the molecules are separated physically rather than in a concentration-dependent manner, the reaction volume can be much lower for the same number of ligation events (i.e., the nucleic acid concentration in the aqueous phase can be much higher), which reduces reagent costs and makes samples easier to handle. The skilled artisan will understand that for ligation to occur in a given microdroplet, the droplet must contain sufficient reagents, including at least one ligase molecule.

[0325] Breaking the emulsion and isolating the circularized DNA

[0326] After ligation, the ligation reaction can be terminated and the emulsion "broken" (also known in the art as "demulsification"). Many methods of breaking emulsions exist (see, e.g., U.S. Pat. No. 5,989,892 and the references cited therein), and those skilled in the art can select an appropriate method. Demulsification may be followed by a nucleic acid isolation step, which may be performed by any suitable method of isolating nucleic acids. Once the nucleic acid has been isolated, unligated material can be removed by any method suited to the task, one of which is to subject the sample to exonuclease digestion. The particular exonuclease used may depend in part on the type of molecule under study (single- or double-stranded DNA or RNA) and on other considerations, such as the reaction temperature appropriate to the method. After exonuclease treatment, the circularized material can be purified by one of many methods known in the art, such as phenol/chloroform extraction or any commercially available purification kit suitable for the purpose.

[0327] With the conventional dilution-based circularization schemes described above, recovery of the desired circular product has been observed to decrease as the length of the linear input DNA molecule increases. The emulsion ligation method of the invention is particularly suitable for circularizing long polynucleotide molecules, for example molecules longer than about 500 bases, longer than about 1,000 bases, longer than about 2,000 bases, longer than about 5,000 bases, longer than about 10,000 bases, longer than about 20,000 bases, longer than about 50,000 bases, longer than about 100,000 bases, longer than about 250,000 bases, longer than about 1 million bases, or longer than about 5 million bases, or indeed molecules of any size deemed necessary in the experimental protocol of interest.

[0328] The emulsion ligation method described herein can be used for any kind of ligation reaction, whether or not it results in circularization. The emulsion ligation method can therefore be used in any ligation step of the various methods described herein, especially ligation reactions in which the input nucleic acid is to be circularized.

[0329] Emulsification

[0330] An emulsion is a heterogeneous system of two immiscible liquid phases in which one phase is dispersed in the other as droplets of microscopic or colloidal size. The emulsions of the invention must be capable of forming microcapsules (microreactors). Emulsions can be produced from any suitable combination of immiscible liquids. The emulsions of the invention have a hydrophilic phase (containing the biochemical components) as the phase present in the form of finely divided droplets (the dispersed, internal, or discontinuous phase) and a hydrophobic, immiscible liquid (an "oil") as the matrix in which these droplets are suspended (the non-dispersed, continuous, or external phase). Such emulsions are termed "water-in-oil" (W/O). This has the advantage that the entire aqueous phase containing the biochemical components is compartmentalized in discrete droplets (the internal phase). The external phase, being a hydrophobic oil, generally contains none of the biochemical components and is therefore inert.

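[0331] In some embodiments, the microreactors contain the reagents necessary for nucleic acid ligation. A large number of the microreactors may each contain exactly one polynucleotide molecule. In certain embodiments a thermostable water-in-oil emulsion may be required, for example when the ligase is heat-inactivated after the reaction or when ligation is performed at elevated temperature with a thermostable ligase (e.g., Taq DNA ligase). The emulsion may be formed by any suitable method known in the art. One method of producing an emulsion is described below, but any method of preparing an emulsion may be used. Such methods are known in the art and include adjuvant methods, counter-flow methods, cross-current methods, agitation, rotating-drum methods, and membrane methods. In addition, the size of the microcapsules can be adjusted by varying the flow rate and velocity of the components. For example, in dropwise addition, the droplet size and the total delivery time can be varied. In some embodiments, the microdroplets may be generated in a microfluidic device, for example as described by Link et al. (Angew. Chem. Int. Ed., 2006, 45, 2556-2560), which is hereby incorporated by reference in its entirety.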

[0332] At least some of the microreactors should be large enough to contain sufficient nucleic acid and other ligation reagents. However, at least some of the microreactors should be small enough that a portion of the microreactor population contains a single polynucleotide molecule capable of self-ligation. In some embodiments, the emulsion is thermostable. The droplets formed preferably range in diameter from about 100 nanometers to about 500 micrometers, more preferably from about 1 micrometer to about 100 micrometers. Advantageously, cross-flow fluid mixing, optionally combined with an electric field, allows control of droplet formation and uniformity of droplet size.

[0333] Various emulsions suitable for biological reactions are described in Griffiths and Tawfik, EMBO, 22, pp. 24-35 (2003); Ghadessy et al., Proc. Natl. Acad. Sci. USA 98, pp. 4552-4557 (2001); U.S. Pat. No. 6,489,103; and WO 02/22869, all of which are incorporated herein by reference in their entirety. In a preferred embodiment, the oil is a silicone oil.

[0334] Surfactants

[0335] The emulsions of the invention can be stabilized by adding one or more surface-active agents (emulsion stabilizers, surfactants). These surfactants, also called emulsifiers, act at the water/oil interface to prevent (or at least delay) separation of the phases. Many oils and many emulsifiers can be used to generate water-in-oil emulsions; a recent compilation lists over 16,000 surfactants, many of which are used as emulsifiers (Ash, M. and Ash, I. (1993) Handbook of Industrial Surfactants. Gower, Aldershot). Emulsion stabilizers for use in the methods of the invention include Atlox 4912, sorbitan monooleate (Span 80; ICI), polyoxyethylene sorbitan monooleate (Tween 80; ICI), and other recognized, commercially available suitable stabilizers.

[0336] In various embodiments, the surfactant is provided at a volume/volume concentration in the oil phase of the emulsion of 0.5-50%, preferably 10-45%, more preferably 30-40%.

[0337] In some embodiments, a chemically inert silicone-based surfactant, such as a silicone copolymer, is used. In one embodiment, the silicone copolymer used is a polysiloxane-polycetyl-polyethylene glycol copolymer (cetyl dimethicone copolyol), for example Abil® EM90 (Goldschmidt).

[0338] The chemically inert silicone-based surfactant may be provided as the sole surfactant in the emulsion composition, or as one of several surfactants. Thus, a mixture of different surfactants may be used.

[0339] In particular embodiments, one surfactant used is Dow Corning® 749 Fluid (used at 1-50%, preferably 10-45%, more preferably 25-35% weight/weight). In other particular embodiments, one surfactant used is Dow Corning 5225C Formulation Aid (used at 1-50%, preferably 10-45%, more preferably 35-45% weight/weight). In a preferred embodiment, the oil/surfactant mixture consists of 40% (weight/weight) Dow Corning 5225C Formulation Aid, 30% (weight/weight) Dow Corning® 749 Fluid, and 30% (weight/weight) silicone oil.
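
For bench use, the preferred weight/weight recipe can be turned into component masses for any batch size. The sketch below is a convenience calculation only; the helper name `component_masses` and the 10 g batch size are illustrative assumptions, while the three fractions are those of the preferred embodiment.

```python
# Preferred oil/surfactant mixture, as weight fractions of the total.
RECIPE_W_W = {
    "Dow Corning 5225C Formulation Aid": 0.40,
    "Dow Corning 749 Fluid": 0.30,
    "silicone oil": 0.30,
}


def component_masses(total_grams, recipe=RECIPE_W_W):
    """Return the mass in grams of each component for a batch of total_grams."""
    if abs(sum(recipe.values()) - 1.0) > 1e-9:
        raise ValueError("weight fractions must sum to 1")
    return {name: round(frac * total_grams, 6) for name, frac in recipe.items()}


if __name__ == "__main__":
    for name, grams in component_masses(10.0).items():
        print(f"{name}: {grams} g")
```

Because the recipe is specified by weight, the components should be weighed rather than pipetted by volume, since the oils and surfactants differ in density.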

[0340] The methods of the invention provide numerous benefits and advantages over existing methods. One advantage over the prior art is that the prepared fragments need not be cloned and amplified in a eukaryotic or prokaryotic host. This is particularly beneficial when the target sequence contains multiple repeat sequences that may be rearranged during propagation as an episome in a host cell.

[0341] Another advantage of the disclosed methods is that genome assembly can be facilitated by providing not only contig sequences but also the end sequences, and the orientation of those end sequences, of long contigs that may be longer than 100 bp, 300 bp, 500 bp, 1 kb, 5 kb, 10 kb, 100 kb, 1 Mb, or 10 Mb or more. This sequence and orientation information can be used to facilitate genome assembly and to close gaps.

[0342] In addition, paired-end reads provide a second level of confidence in genome assembly. For example, if paired-end sequencing and conventional contig sequencing agree on the DNA sequence in question, confidence in that sequence increases. Conversely, if the two sets of sequence data contradict each other, confidence decreases, and further analysis and/or sequencing may be needed to find the cause of the discrepancy.

[0343] The presence or absence of an open reading frame in the paired-end reads also gives an indication of where open reading frames are located. For example, if both sequenced ends of a contig contain an open reading frame, the entire contig is likely to be an open reading frame. This can be confirmed by standard sequencing techniques. Alternatively, with knowledge of the two ends, specific PCR primers can be designed from the two ends for amplification, and the amplified region can be sequenced to determine the presence of an open reading frame.
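
One simple way to ask whether an end read is compatible with an open reading frame is to look for frames free of stop codons. The sketch below is an assumption-laden illustration rather than a method from the source: it uses the standard stop codons TAA, TAG, and TGA, checks all six frames, expects reads containing only A, C, G, and T, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def open_frames(read):
    """List (strand, offset) frames of a read that contain no stop codon."""
    read = read.upper()
    frames = []
    for strand, seq in (("+", read), ("-", reverse_complement(read))):
        for offset in range(3):
            # Split into complete codons starting at this offset.
            codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
            if codons and not any(codon in STOP_CODONS for codon in codons):
                frames.append((strand, offset))
    return frames


if __name__ == "__main__":
    print(open_frames("ATGTAAGGG"))
```

Short reads are often free of stop codons in several frames by chance, so a result like this is only a hint that would need the confirmation by sequencing described in the text.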

[0344] The methods of the invention can also improve understanding of genome organization and structure. Because paired-end sequencing can span regions that are difficult to sequence, genome structure can be inferred even when those regions themselves cannot be sequenced. Regions that are difficult to sequence include, for example, repeat regions and regions of secondary structure. In such cases, the number and location of these difficult regions in the genome can be mapped even though the sequence of the regions is unknown.

[0345] The methods of the invention also allow haplotypes to be determined over extended distances in a genome. For example, specific primers can be prepared to amplify a region of the genome containing two SNPs linked over a long distance. Using the methods of the invention, both ends of the amplified region can be sequenced to determine the haplotype without sequencing the nucleic acid between the two SNPs. This method is particularly useful when the two SNPs span a region that sequences inefficiently. Such regions include long regions, regions with repeat sequences, and regions of secondary structure.
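
The haplotype idea can be illustrated with a small tally over read pairs. This is a sketch under simplifying assumptions: each read pair is assumed to come from a single amplified molecule, one SNP is assumed to lie at a known offset in each end read, and the function name and offsets are hypothetical.

```python
from collections import Counter


def call_haplotypes(read_pairs, snp1_offset, snp2_offset):
    """Count (allele at SNP 1, allele at SNP 2) combinations across read pairs."""
    counts = Counter()
    for left_read, right_read in read_pairs:
        # Skip pairs whose reads are too short to cover the SNP positions.
        if snp1_offset < len(left_read) and snp2_offset < len(right_read):
            counts[(left_read[snp1_offset], right_read[snp2_offset])] += 1
    return counts


if __name__ == "__main__":
    pairs = [("ACGT", "TTGA"), ("ACGT", "TTGA"), ("ATGT", "TTCA")]
    print(call_haplotypes(pairs, snp1_offset=1, snp2_offset=2))
```

Because both alleles in a pair are read from the same molecule, each tallied combination is a phased haplotype, and nothing between the two SNPs needs to be sequenced.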

[0346] The biotinylated-adapter approach provides additional advantages (FIGS. 7 and 22). FIG. 7A shows nucleic acids ligated to sequencing primers A and B in a form ready for sequencing. Some of these nucleic acids are contaminating nucleic acids (701) that do not contain the two ends of a single contig region. Nucleic acid fragments containing both ends of a contig are denoted 702. Because nucleic acid 702 is the only species that contains biotin, it can be purified using streptavidin beads (FIG. 7B). After purification, this species is ready for sequencing. Affinity purification greatly increases the fraction of sequence that yields useful information.

[0347] This is particularly useful when the contaminating DNA (701) is long, for example if each contaminating nucleic acid (701) in FIG. 7D is several kb in length. Sequencing these contaminants could consume a substantial portion of the reagents, labor, and computing power dedicated to the project. In this case, first purifying the appropriate fragments by affinity chromatography (FIG. 7E) can save considerable work and reagents.

[0348] The skilled artisan will immediately appreciate that EndoV cleavage of any double-stranded DNA containing inosines on opposite strands (see FIG. 14, with or without the hairpin) can generate a single-stranded overhang (sticky end), where the overhang can be virtually any nucleotide sequence. The invention also encompasses polynucleotide designs and methods substantially similar to FIG. 14 but without the hairpin. It will further be readily appreciated that, as described above, the methods and compositions of the invention shown in FIG. 14, with or without the hairpin, can be used in a variety of molecular biology and recombinant DNA techniques in which a unique endonuclease site needs to be introduced. Such techniques include, but are not limited to, construction of DNA and cDNA libraries, various subcloning strategies, and any method that benefits from a unique endonuclease site in a primer, adapter, or linker.

[0349] Paired-end nucleic acid constructs produced by any of the methods described herein can be sequenced by any sequencing method known in the art. Standard sequencing methods such as Sanger sequencing or Maxam-Gilbert sequencing are widely known in the art. Sequencing can also be performed, for example, using the automated sequencing method known as 454 Sequencing™, developed by 454 Life Sciences Corporation (Branford, CT, USA); see, for example, U.S. Patent Nos. 7,323,305 and 7,244,567 and U.S. Patent Application Serial Nos. 10/767,894 and 10/767,899, both filed January 28, 2004. Additional sequencing methods known in the art, such as any sequencing-by-synthesis or sequencing-by-ligation method (for a review see Metzker, Genome Res. 2005 Dec;15(12):1767-76, hereby incorporated by reference), are also included and can be used in the paired-end sequencing methods of the invention.

[0350] Throughout this disclosure, the terms "biotin", "avidin", and "streptavidin" are used to describe a variety of binding pairs. It is to be understood that these terms merely illustrate one way of using a binding pair. Accordingly, the terms biotin, avidin, or streptavidin may be replaced by any member of a binding pair. A binding pair may be any two molecules that bind specifically to each other and includes at least the following: FLAG/anti-FLAG antibody, biotin/avidin, biotin/streptavidin, receptor/ligand, antigen/antibody, polyHIS/nickel, protein A/antibody, and derivatives thereof. Other binding pairs are known and published in the literature.

[0351] All patents, patent applications, and references cited in any part of this disclosure are hereby incorporated by reference in their entirety.

[0352] The invention is further described below by the following non-limiting examples.

Examples

[0353] Example 1: Oligonucleotide design

[0354] The oligonucleotides used in the experiments were designed and synthesized as follows.

[0355] The capture element oligonucleotide shown in the upper part of FIG. 3A was designed to include UA3 adapters and sequencing keys. A NotI site was placed between the adapters. The complete construct (capture element) can be generated using nested oligonucleotides and PCR. The sequence of the final product was synthesized and cloned.

[0356] The Type IIS capture fragment oligonucleotide shown in the lower part of FIG. 3A is similar to the capture fragment described above, except that it includes a sequence representing a Type IIS restriction endonuclease site (e.g., MmeI) after the sequencing key sequence. These Type IIS sites allow any construct built with these capture elements to be cleaved by the corresponding Type IIS restriction endonuclease. As is known in the art, Type IIS restriction endonucleases cleave DNA at various distances from their recognition sites; in the case of MmeI, at a distance of 20/18 bases.
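The 20/18 cleavage geometry can be illustrated with a short Python sketch. It assumes the commonly cited MmeI recognition sequence TCCRAC (R = A or G), which is not stated in this text, and it only scans the top strand as given. The function names are hypothetical and serve illustration only.

```python
import re

# Assumed MmeI recognition site: TCCRAC (R = A or G). "20/18" means the enzyme
# cuts 20 nt past the site on the top strand and 18 nt past it on the bottom.
MMEI_SITE = re.compile(r"(?=(TCC[AG]AC))")  # lookahead so overlapping sites are found
SITE_LENGTH = 6
TOP_OFFSET = 20
BOTTOM_OFFSET = 18


def mmei_cuts(top_strand: str):
    """Return (site_start, top_cut, bottom_cut) for each top-strand MmeI site.

    Positions are 0-based indices into top_strand; a cut falls just before the
    given index. Sites too close to the 3' end to be cut are skipped.
    """
    seq = top_strand.upper()
    cuts = []
    for match in MMEI_SITE.finditer(seq):
        site_end = match.start() + SITE_LENGTH
        top_cut = site_end + TOP_OFFSET
        bottom_cut = site_end + BOTTOM_OFFSET
        if top_cut <= len(seq):
            cuts.append((match.start(), top_cut, bottom_cut))
    return cuts


def captured_tag(top_strand: str, site_start: int) -> str:
    """Top-strand bases between the recognition site and the top-strand cut."""
    site_end = site_start + SITE_LENGTH
    return top_strand[site_end:site_end + TOP_OFFSET]
```

Because the top-strand cut lies 2 nt beyond the bottom-strand cut, each cleaved end carries a 2-nt 3' overhang. A fuller version would also scan the reverse complement for sites in the opposite orientation.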

[0357] The short adapter capture fragment oligonucleotide was designed to contain SAD1 adapters and sequencing keys (FIG. 3B). A NotI site is likewise located between the adapters. This oligonucleotide was synthesized with an MmeI Type IIS restriction endonuclease site after the sequencing key sequence (see FIG. 3B, short adapter capture fragment (Type IIS)).

[0358] Example 2: Protocol for hairpin adapter paired-end sequencing

[0359] Using a standard HydroShear assembly (Genomic Solutions, Ann Arbor, MI, USA), E. coli K12 DNA (20 µg) in 100 µl was hydrodynamically sheared for 20 cycles at speed setting 10. The sheared DNA was methylated in a reaction containing 50 µl DNA (5 µg), 34.75 µl H2O, 10 µl methylase buffer, 0.25 µl 32 mM SAM, and 5 µl EcoRI methylase (40,000 units/ml; New England Biolabs (NEB), Ipswich, MA, USA). The reaction was incubated at 37°C for 30 minutes. After methylation, the sheared, methylated DNA was purified on a Qiagen MinElute PCR purification column according to the manufacturer's instructions. The purified DNA was eluted from the column with 10 µl EB buffer.

[0360] The sheared, methylated DNA was subjected to a polishing step to produce blunt-ended sheared material. The 10 µl of DNA was added to a reaction mixture containing 13 µl H2O, 5 µl 10X polishing buffer, 5 µl 1 mg/ml bovine serum albumin, 5 µl 10 mM ATP, 3 µl 10 mM dNTPs, 5 µl 10 U/µl T4 polynucleotide kinase, and 5 µl 3 U/µl T4 DNA polymerase. The reaction was incubated at 12°C for 15 minutes, after which the temperature was raised to 25°C for a further 15 minutes. The reaction was then purified on a Qiagen MinElute PCR purification column according to the manufacturer's instructions.

[0361] Hairpin adapters were ligated to the sheared, blunt-ended DNA fragments by combining 10 µl (5 µg) sheared DNA, 17.5 µl H2O, 50 µl 2X Quick Ligase buffer, 20 µl 10 µM hairpin adapter, and 2.5 µl Quick Ligase (T4 DNA ligase, NEB). The reaction was incubated at 25°C for 15 minutes, after which ligated fragments were selected by adding to the mixture 2 µl lambda exonuclease, 1 µl RecJ (30,000 units/ml, NEB), 1 µl T7 exonuclease (10,000 units/ml, NEB), and 1 µl exonuclease I (20,000 units/ml, NEB). The reaction was incubated at 37°C for 30 minutes, after which the sample was purified on a Qiagen MinElute PCR purification column. The treated DNA was then passed through an Invitrogen PureLink column according to the manufacturer's instructions and eluted from the column in a volume of 50 µl.

[0362] The ligated, exonuclease-treated DNA was digested with EcoRI. A reaction containing 50 µl DNA, 30 µl H2O, 10 µl EcoRI buffer, and 10 µl EcoRI (20,000 units/ml) was incubated overnight at 37°C. The cleavage products were purified on a Qiagen QiaQuick column according to the manufacturer's instructions. The cleavage products were then ligated to produce closed circular DNA in a reaction containing 50 µl DNA, 20 µl Buffer 4 (New England Biolabs), 2 µl 100 mM ATP, 123 µl H2O, and 5 µl ligase (as above). The ligation reaction was incubated at 25°C for 15 minutes, after which it was subjected to another round of exonuclease treatment by adding to the mixture 1 µl lambda exonuclease (5,000 units/ml, NEB), 0.5 µl RecJ (as above), 0.5 µl T7 exonuclease (as above), and 0.5 µl exonuclease I (as above). The exonuclease reaction was incubated at 37°C for 30 minutes, after which the sample was purified on a Qiagen MinElute PCR purification column.

[0363] The treated DNA was then digested with MmeI in a reaction mixture containing 10 µl DNA, 78.75 µl H2O, 10 µl Buffer 4 (New England Biolabs), 0.25 µl SAM, and 0.5 µl MmeI (2,000 units/ml, NEB). The digestion was carried out at 37°C for 60 minutes, and the reaction was then purified on a Qiagen QiaQuick column buffered with 3 M sodium acetate at a final concentration of 0.1%. The column was washed with 700 µl 8.0 M guanidine hydrochloride and the sample was applied to the column according to the manufacturer's instructions. The DNA was eluted with 30 µl EB buffer and diluted to a final volume of 100 µl.

[0364] Streptavidin magnetic beads (50 µl) (Dynal Dynabeads M270, Invitrogen, Carlsbad, CA, USA) were prepared as follows: the beads were washed with 2X bead binding buffer and suspended in 100 µl 2X bead binding buffer, after which the 100 µl DNA sample was added to the beads and mixed at room temperature for 20 minutes. The beads were washed twice in wash buffer. The SAD7 adapter set (A/B set, in which the single-stranded oligonucleotides SAD7Ftop and SAD7Fbot are annealed to form the A adapter, and the single-stranded oligonucleotides SAD7Rtop and SAD7Rbot are annealed to form the B adapter) was ligated to the DNA bound to the streptavidin beads. The oligonucleotides were SAD7Ftop: 5'-CCGCCCAGCATCGCCTCAGNN-3' (SEQ ID NO: 51); SAD7Fbot: 5'-CTGAGGCGATGCTGG-3' (SEQ ID NO: 52); SAD7Rtop: 5'-CCGCCCGAGCACCGCTCAGNN-3' (SEQ ID NO: 53); and SAD7Rbot: 5'-CTGAGCGGTGCTCGG-3' (SEQ ID NO: 54), where N is any of the four bases (A, G, T, or C). A ligation reaction mixture containing 15 µl H2O, 25 µl Quick Ligase buffer, 5 µl SAD7 adapter set, and 5 µl Quick Ligase (as above) was added to the bead-DNA mixture. The ligation reaction was incubated at 25°C for 15 minutes, and the beads were then washed twice with bead wash buffer.
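As a consistency check on the adapter oligonucleotides listed above, the following Python sketch locates where each bottom strand anneals on its top strand and reports the single-stranded overhangs. The helper names are hypothetical; only the four sequences come from the text.

```python
# Complement table; N pairs with N for the degenerate 3' positions.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def duplex_layout(top: str, bottom: str):
    """Locate where `bottom` anneals on `top` (both written 5' to 3').

    Returns (five_prime_overhang, duplex, three_prime_overhang) of the top
    strand, or None if the bottom strand is not fully complementary to it.
    """
    target = reverse_complement(bottom)
    start = top.find(target)
    if start < 0:
        return None
    end = start + len(target)
    return top[:start], top[start:end], top[end:]


# (top, bottom) pairs as given for SEQ ID NOs 51-54.
SAD7 = {
    "A": ("CCGCCCAGCATCGCCTCAGNN", "CTGAGGCGATGCTGG"),
    "B": ("CCGCCCGAGCACCGCTCAGNN", "CTGAGCGGTGCTCGG"),
}

for name, (top, bottom) in SAD7.items():
    print(name, duplex_layout(top, bottom))
```

For both adapters the bottom strand pairs with 15 bases of the top strand, leaving a 4-base 5' overhang (CCGC) and the degenerate 2-base 3' overhang (NN). The NN overhang is presumably what allows ligation to the 2-base 3' overhangs left by MmeI, although the text does not state this explicitly.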

[0365] A nucleotide fill-in reaction was performed by adding to the beads a mixture containing 40 µl H2O, 5 µl 10X fill-in buffer, 2 µl 10 mM dNTPs, and 3 µl fill-in polymerase (Bst DNA polymerase, 8,000 units/ml, NEB). After the reaction was incubated at 37°C for 20 minutes, the beads were washed twice in wash buffer. The beads were then suspended in 25 µl TE buffer.

[0366] The bead-bound DNA was then subjected to PCR in a reaction mixture containing 30 µl H2O, 5 µl 10X Advantage 2 buffer, 2 µl 10 mM dNTPs, 1 µl 100 µM forward primer (SAD7FPCR: 5'-Bio-CCGCCCAGCATCGCC-3' (SEQ ID NO: 55)), 1 µl 100 µM reverse primer (SAD7RPCR: 5'-CCGCCCGAGCACCGC-3' (SEQ ID NO: 56)), 10 µl bead-bound DNA, and 1 µl Advantage 2 polymerase mix (Clontech, Mountain View, CA, USA). PCR was performed using the following program: (a) 94°C for 4 minutes, (b) 94°C for 15 seconds, (c) 64°C for 15 seconds, with steps (b) and (c) repeated for 19 cycles, and (d) 68°C for 2 minutes, after which the reaction was held at 14°C.

[0367] The PCR product was purified on a Qiagen MinElute PCR purification column, and the purified product was then electrophoresed on a 1.5% agarose gel at 5 V/cm to check for the presence of the 120 bp product. The 120 bp fragment was excised from the gel and recovered using the Qiagen MinElute gel extraction protocol. The 120 bp fragment was eluted with 18 µl EB buffer. The double-stranded product was bound to streptavidin beads and washed twice with bead wash buffer. The single-stranded product was eluted with 125 mM NaOH and purified on a Qiagen MinElute PCR purification column. This material was then sequenced on a 454 Life Sciences Corporation automated sequencing system using standard 454 Life Sciences Corporation (Branford, CT, USA) sequencing methods.

[0368] Example 3: Protocol for non-hairpin adapter paired-end sequencing

[0369] E. coli K12 DNA (5 µg) in a volume of 100 µl was hydrodynamically sheared for 20 cycles at speed setting 11 using a standard assembly (HydroShear, as above). The sheared DNA was purified on a Qiagen MinElute PCR purification column according to the manufacturer's instructions and eluted with 23 µl EB buffer. The purified sheared DNA was blunt-end polished in a reaction mixture containing 23 µl DNA, 5 µl 10X polishing buffer, 5 µl 1 mg/ml bovine serum albumin, 5 µl 10 mM ATP, 3 µl 10 mM dNTPs, 5 µl 10 U/µl T4 polynucleotide kinase, and 5 µl 3 U/µl T4 DNA polymerase. The reaction was incubated at 12°C for 15 minutes, after which the temperature was raised to 25°C for a further 15 minutes. The reaction was then purified on a Qiagen MinElute PCR purification column according to the manufacturer's instructions. Ligation of the non-hairpin adapters was carried out using 2 µg of the sheared, purified DNA in a reaction mixture containing 25 µl 2X Quick Ligase buffer, 18.5 µl 10 µM non-hairpin adapter, and 2.5 µl Quick Ligase (as above). The ligation reaction was incubated at 25°C for 15 minutes, after which the sample was passed sequentially through a Sephacryl S-400 spin column and a Qiagen MinElute PCR purification column. The DNA was then eluted from the column with 10 µl EB buffer.

[0370] The purified, ligated DNA was then subjected to a kinase reaction in a mixture containing 13 µl H2O, 25 µl 2X buffer, 10 µl DNA, and 2 µl 10 U/µl T4 polynucleotide kinase. The reaction was incubated at 37°C for 60 minutes, after which the sample was electrophoresed on an agarose gel at 5 V/cm. The band between 1500 bp and 4000 bp was excised from the gel and recovered using the Qiagen MinElute gel extraction protocol.

[0371] The purified DNA was subjected to another round of ligation to produce circular DNA in a reaction mixture containing 18 µl DNA, 20 µl Buffer 4 (New England Biolabs), 2 µl ATP, 150 µl H2O, and 10 µl ligase (as above). The reaction was incubated at 25°C for 15 minutes, after which a mixture containing 2 µl lambda exonuclease (as above), 1 µl RecJ (as above), 1 µl T7 exonuclease (as above), and 1 µl exonuclease I (as above) was added and incubated at 37°C for 30 minutes. After the exonuclease reaction, the DNA was purified on a Qiagen MinElute PCR purification column and eluted with 20 µl EB buffer.

[0372] The purified, ligated DNA was then added to a mixture containing 68.6 µl H2O, 10 µl Buffer 4 (New England Biolabs), 0.2 µl SAM, and 1 µl MmeI restriction endonuclease (as above). The DNA was cleaved at 37°C for 30 minutes, after which it was purified on a Qiagen QiaQuick column pre-buffered with 3 M sodium acetate at a final concentration of 0.1% and washed with 700 µl 8.0 M guanidine hydrochloride. The purified DNA was then eluted with 30 µl EB buffer and the volume adjusted to 100 µl.

[0373] Streptavidin magnetic beads (50 µl) (as above) were washed with 2X bead binding buffer and suspended in 100 µl bead binding buffer. The beads were then mixed with the 100 µl DNA sample and allowed to bind at room temperature for 20 minutes. The beads were then washed twice in wash buffer and subjected to a ligation reaction with the SAD7 adapter set (A/B set) (as above). A mixture containing 15 µl H2O, 25 µl Quick Ligase buffer, 5 µl SAD7 adapters, and 5 µl Quick Ligase (as above) was added to the bead-bound DNA and incubated at 25°C for 15 minutes, after which the beads were washed twice in wash buffer.

[0374] The bead-bound DNA was subjected to a fill-in reaction in a mixture containing 40 µl H2O, 5 µl 10X fill-in buffer, 2 µl 10 mM dNTPs, and 3 µl fill-in polymerase (as above). The reaction was carried out at 37°C for 20 minutes, after which the beads were washed twice in wash buffer and suspended in 25 µl TE buffer. The bead-bound DNA was amplified in a reaction mixture containing 30 µl H2O, 5 µl 10X Advantage 2 buffer, 2 µl dNTPs, 0.5 µl 100 µM forward primer (as above), 0.5 µl 100 µM reverse primer (as above), 10 µl bead-bound DNA, and 1 µl Advantage 2 enzyme (as above). The PCR was carried out under the following conditions: (a) 94°C for 4 minutes, (b) 94°C for 15 seconds, (c) 64°C for 15 seconds, with steps (b) and (c) repeated for 24 cycles, and (d) 68°C for 2 minutes, after which the PCR reaction was held at 14°C. The PCR product was purified on a Qiagen MinElute PCR purification column and electrophoresed on a 1.5% agarose gel at 5 V/cm. The 120 bp product was excised from the gel and recovered using the Qiagen MinElute gel extraction protocol. The DNA was then eluted with 18 µl EB buffer.

[0375] The double-stranded DNA was bound to streptavidin beads, and the beads were washed twice with wash buffer. The single-stranded DNA was then eluted with 125 mM NaOH and purified on a Qiagen MinElute PCR purification column. The purified material was then subjected to standard 454 emulsion and sequencing protocols.

[0376] Using the methods described above, we obtained the following results:

[0377] E. coli contigs were generated from normal 454 sequencing of four 60x60 runs (approximately 1.3x10^6 reads): 303 contigs larger than 1000 bp were produced, with an average size of 16,858 bp and a maximum size of 94,060 bp. Table 5 includes additional results obtained using the methods described above.
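The contig figures quoted above (count, average size, and maximum size of contigs larger than 1000 bp) are simple summary statistics. A minimal Python sketch of how such a summary might be computed from a list of contig lengths follows; the function name and input format are assumptions, and no data from the experiment are reproduced here.

```python
def contig_summary(lengths, min_length=1000):
    """Summarize contigs strictly longer than min_length (lengths in bp)."""
    kept = [n for n in lengths if n > min_length]
    if not kept:
        return {"count": 0, "mean": 0.0, "max": 0}
    return {
        "count": len(kept),
        "mean": sum(kept) / len(kept),
        "max": max(kept),
    }


# Hypothetical lengths, for illustration only.
print(contig_summary([500, 1000, 2000, 4000]))
```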

[0378] Table 5: Results of the paired-end sequencing method

[0379] [Table 5 is reproduced as an image in the source: Figure CN102027130AD00381]

[0381] The analysis was performed by first BLASTing all paired reads against the E. coli K12 genome obtained from GenBank. Reads matching the reference genome with an expect value of less than 0.1 were retained. For all reads containing two independent BLAST hits separated by the internal linker sequence, the distance between the hits in the genome was analyzed, and reads were retained if the distance was less than 5,000 bp. These reads were then ordered by the genome positions of their first and second hits and examined to see whether overlaps occurred between neighboring ordered paired sequences. Each of these ordered contigs was then examined, in the same manner as above, for overlapping partners among the 454 sequencing contigs.
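The filtering and ordering steps of this analysis can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline actually used: the `Hit` and `PairedRead` types are hypothetical, BLAST results are assumed to have been parsed already, and the circularity of the E. coli genome is ignored when computing distances.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    position: int   # genome coordinate of the BLAST hit (bp)
    evalue: float   # BLAST expect value


class PairedRead(NamedTuple):
    read_id: str
    left: Optional[Hit]    # hit for the tag before the internal linker
    right: Optional[Hit]   # hit for the tag after the internal linker


MAX_EVALUE = 0.1       # reads are kept only if the expect value is below this
MAX_DISTANCE = 5000    # pairs are kept only if hits are closer than this (bp)


def filter_and_sort(reads):
    """Keep pairs with two good hits < 5,000 bp apart, ordered by hit position."""
    kept = []
    for read in reads:
        if read.left is None or read.right is None:
            continue  # both tags must have an independent hit
        if read.left.evalue >= MAX_EVALUE or read.right.evalue >= MAX_EVALUE:
            continue
        if abs(read.right.position - read.left.position) >= MAX_DISTANCE:
            continue
        kept.append(read)
    # Order by first-hit position, then second-hit position.
    return sorted(kept, key=lambda r: (r.left.position, r.right.position))
```

Neighboring entries in the sorted list can then be compared to look for overlapping pairs, which is the starting point for ordering contigs.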

[0382] Example 4

[0383] 1. DNA fragmentation

[0384] A 30 µg E. coli K12 DNA sample was sheared using the HydroShear large assembly to produce 15-30 kb fragments. The DNA fragments were purified on a MicroSpin S400 column.

[0385] 2. Fragment end polishing

[0386] The DNA fragment ends were polished with T4 DNA polymerase and T4 PNK in a microcentrifuge tube as follows. Two reactions were run for the 30 µg starting DNA sample.

[0387] 10X PNK buffer 10 µl

[0388] BSA (20 mg/ml dilution) 0.5 µl

[0389] ATP (100 mM) 1 µl

[0390] dNTPs (10 mM each) 4 µl

[0391] Sheared DNA (<15 µg) 75 µl

[0392] T4 DNA polymerase (3 U/µl) 5 µl

[0393] T4 PNK (10 U/µl) 5 µl

[0394] The reaction mixture was mixed well and incubated at 12°C for 15 minutes, followed immediately by incubation at 25°C for 15 minutes. The reactions were purified with the QIAEX II kit, and each reaction was eluted with 37 µl EB.
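The end-polishing reaction listed above can be checked arithmetically with a short Python sketch that totals the component volumes and derives final concentrations from the stock concentrations given. The function and the component labels are hypothetical; the numbers are those listed for this reaction.

```python
def final_concentrations(components):
    """components: list of (name, stock_concentration, volume_ul).

    Returns (total_volume_ul, {name: final_concentration}). Each final
    concentration is stock * volume / total and keeps the stock's unit.
    Components with stock_concentration None only add to the total volume.
    """
    total = sum(volume for _, _, volume in components)
    finals = {
        name: stock * volume / total
        for name, stock, volume in components
        if stock is not None
    }
    return total, finals


# End-polishing reaction for one sheared DNA sample (units in the labels).
END_POLISHING = [
    ("PNK buffer (X)", 10, 10.0),
    ("BSA (mg/ml)", 20, 0.5),
    ("ATP (mM)", 100, 1.0),
    ("each dNTP (mM)", 10, 4.0),
    ("sheared DNA", None, 75.0),
    ("T4 DNA polymerase (U/ul)", 3, 5.0),
    ("T4 PNK (U/ul)", 10, 5.0),
]

total, finals = final_concentrations(END_POLISHING)
print(f"total volume: {total} ul")
for name, value in finals.items():
    print(f"{name}: {value:.3f}")
```

The total comes to 100.5 µl, so the 10X buffer ends up at roughly 1X and ATP at roughly 1 mM. The same helper can be applied to the other reagent lists in this example.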

[0395] 3. LoxP adapter ligation

[0396] The loxP6 adapters were added to the polished DNA fragments as follows (duplicate reactions required).

[0397] Roche 2X Rapid Ligase buffer (#1) 50 µl

[0398] loxP6 adapters (20 µM each) 10 µl

[0399] Polished DNA 35 µl

[0400] Roche Rapid Ligase (#3) 5 µl

[0401] The reaction mixture was mixed well and incubated at 25°C for 15 minutes.

[0402] 4. Gel purification and size selection

[0403] The two loxP-ligated DNA samples were loaded onto a large 0.5% agarose gel using a preparative comb (multiple wells can be used if a sample comb is used), and the gel was run overnight at 35 V.

[0404] The next morning, DNA fragments in the desired range (e.g., 20-25 kb) were collected and purified using QIAEX II according to the manufacturer's instructions.

[0405] 5. Fill-in reaction

[0406] A fill-in reaction was performed to repair the nicks introduced by loxP6 adapter ligation.

[0407] LoxP-adapted DNA 38 µl

[0408] 10X Bst polymerase buffer 5 µl

[0409] dNTPs (10 mM each) 4 µl

[0410] Bst DNA polymerase 3 µl

[0411] The reaction mixture was mixed well, incubated at 50°C for 15 minutes, and then passed through a MicroSpin S400 column. The DNA concentration was then quantified.

[0412] 6. Excision reaction for circularization

[0413] Site-specific recombination was performed with 150-300 ng of DNA from the fill-in reaction above to produce circularized molecules.

[0414] Molecular biology grade water 39 µl

[0415] 10X Cre buffer 10 µl

[0416] Filled-in intact DNA (150 ng) 50 µl

[0417] Cre recombinase (12 U/µl) 1 µl

[0418] The reaction mixture was mixed well and incubated at 37°C for 45 minutes, then at 80°C for 10 minutes to inactivate the Cre recombinase. The reaction mixture was cooled to 10°C and taken immediately to the next step.

[0419] 7. Removal of linear molecules

[0420] Linear molecules were removed from the above reaction mixture by exonuclease treatment.

[0421] The exonuclease incubation was carried out immediately by adding the following reagents to the cooled excision reaction mixture above.

[0422] ATP (100 mM) 1.1 µl

[0423] DTT (100 mM) 1.1 µl

[0424] Plasmid-Safe ATP-dependent DNase (10 U/µl) 5 µl

[0425] Exonuclease I (20 U/µl) 3 µl

[0426] The reaction mixture was mixed well and incubated at 37°C for 30-60 minutes. The exonucleases were then immediately inactivated by incubation at 80°C for 20 minutes.

[0427] The remainder of the method below is a modified form of the 454 library preparation method.

[0428] 8. Nebulization of circularized molecules

[0429] The circularized molecules were fragmented by nebulization into fragments of less than 1 kb.

[0430] 1 µl 0.5 M EDTA and 1 µg pUC19 were added to the heat-inactivated reaction mixture above. The DNA was nebulized in nebulization buffer at 44 psi for 2 minutes. The nebulized DNA fragments were purified with the MinElute kit according to the manufacturer's instructions.

[0431] 9. Fragment end polishing

[0432] 10X PNK buffer 5 µl

[0433] BSA (1 mg/ml dilution) 5 µl

[0434] ATP (10 mM) 5 µl

[0435] dNTPs (10 mM each) 2 µl

[0436] Nebulized DNA 23 µl

[0437] T4 DNA polymerase (3 U/µl) 5 µl

[0438] PNK (10 U/µl) 5 µl

[0439] The reaction mixture was mixed well and incubated at 12°C for 15 minutes, followed immediately by incubation at 25°C for 15 minutes. The reaction was purified with QiaQuick and eluted with 50 µl EB.

[0440] 10. Library immobilization

[0441] The polished DNA fragments were bound to streptavidin-coated beads (e.g. Dynal M270 beads) as recommended by the manufacturer. The beads were washed 3 times with 500 μl TE, leaving only the beads.

[0442] 11. 454 PE adaptor ligation

[0443] The 454 paired end adaptors were ligated to the polished DNA fragments immobilized on the beads as follows:

[0444] Molecular biology grade water 15 μl

[0445] Roche Rapid Ligase Buffer (#1) 25 μl

[0446] Non-biotinylated 454 PE adaptors 5 μl

[0447] The reaction mixture was mixed thoroughly and added to the beads carrying the captured DNA. The reaction mixture was mixed by vortexing, and then the following was added:

[0448] Roche Rapid Ligase (#3) 5 μl

[0449] The reaction mixture was mixed thoroughly and incubated on a rotator at room temperature for 15 minutes. The beads were washed at least 3 times with 500 μl TE, leaving only the beads.

[0450] 12. Fill-in reaction

[0451] A fill-in reaction was performed to repair nicks and to fill in the 5' overhangs introduced by the 454 PE adaptors.

[0452] Molecular biology grade water 40 μl

[0453] 10X Bst DNA polymerase buffer 5 μl

[0454] dNTPs (10 mM each) 2 μl

[0455] Bst DNA polymerase 3 μl

[0456] The reaction mixture was added to the above DNA beads and incubated at 37°C for 15 minutes. The beads were then suspended in 20 μl EB.

[0457] 13. Library pre-amplification

[0458] The double-stranded paired end library was pre-amplified as follows:

[0459] Molecular biology grade water 28.5 μl

[0460] 10X HiFi buffer 5 μl

[0461] 50 mM MgCl2 2.5 μl

[0462] dNTPs (10 mM each) 2 μl

[0463] Forward/reverse primer pair (100 μM each) 1 μl

[0464] DNA on beads 10 μl

[0465] HiFi Taq DNA polymerase (5 U/μl) 1 μl
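As a check on the pre-amplification mix listed above, the following minimal Python sketch sums the listed volumes and applies the standard dilution relation C1·V1 = C2·V2 to estimate final concentrations. The dictionary keys and the helper name `final_conc` are illustrative and do not come from the source. Any magnesium already present in the 10X HiFi buffer is not stated in the source and is therefore not counted.

```python
# Volumes (in microliters) taken from the pre-amplification reagent list.
volumes_ul = {
    "water": 28.5,
    "10X HiFi buffer": 5.0,
    "50 mM MgCl2": 2.5,
    "dNTPs (10 mM each)": 2.0,
    "primer pair (100 uM each)": 1.0,
    "DNA on beads": 10.0,
    "HiFi Taq DNA polymerase (5 U/ul)": 1.0,
}

total_ul = sum(volumes_ul.values())


def final_conc(stock, volume_ul, total=total_ul):
    """Dilution C1*V1 = C2*V2; result is in the same unit as `stock`."""
    return stock * volume_ul / total


buffer_x = final_conc(10, volumes_ul["10X HiFi buffer"])           # in X
mgcl2_mM = final_conc(50, volumes_ul["50 mM MgCl2"])               # added MgCl2 only
dntp_mM = final_conc(10, volumes_ul["dNTPs (10 mM each)"])         # per nucleotide
primer_uM = final_conc(100, volumes_ul["primer pair (100 uM each)"])  # per primer

print(f"total volume: {total_ul} ul")
print(f"buffer: {buffer_x}X, added MgCl2: {mgcl2_mM} mM")
print(f"dNTPs: {dntp_mM} mM each, primers: {primer_uM} uM each")
```

Under these assumptions the listed volumes add up to a 50 μl reaction in which the 10X buffer is diluted to 1X.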

[0466] The following program was used on the thermal cycler:

[0467] 94°C for 3 minutes

[0468] 94°C for 30 seconds; 60°C for 20 seconds; 72°C for 45 seconds; 20 cycles

[0469] 72°C for 2 minutes

[0470] Hold at 10°C
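The cycling program above can be written down as data to estimate the programmed hold time. This is only a sketch: the variable names are illustrative, and ramp times between temperatures and the open-ended 10°C hold are not given in the source, so they are excluded.

```python
# Each step is (temperature in degrees C, hold time in seconds).
initial_denaturation = [(94, 180)]
cycle = [(94, 30), (60, 20), (72, 45)]
n_cycles = 20
final_extension = [(72, 120)]


def hold_seconds(steps):
    """Sum the programmed hold times of a list of steps."""
    return sum(seconds for _, seconds in steps)


total_seconds = (
    hold_seconds(initial_denaturation)
    + n_cycles * hold_seconds(cycle)
    + hold_seconds(final_extension)
)

print(f"programmed hold time: {total_seconds} s "
      f"(about {total_seconds / 60:.1f} min, excluding ramping)")
```

The actual run time on an instrument will be longer, depending on its ramp rates.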

[0471] 14. Library size selection

[0472] The desired library fragment size was selected by two rounds of SPRI bead cleanup, as follows.

[0473] 1) Molecular biology grade water was added to bring the above reaction mixture to 100 μl. 72 μl of SPRI beads were added to the sample. The beads were incubated according to the manufacturer's instructions and then washed. The DNA was eluted with 80 μl EB.

[0474] 2) 52 μl of SPRI beads were added to the 80 μl eluted sample, followed by incubation at room temperature for 5 minutes. The beads were bound to the MPC, and the unbound supernatant was collected.

[0475] 3) Buffer exchange was performed with a QiaQuick kit, followed by elution with 50 μl EB.
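The two SPRI rounds above differ in bead-to-sample ratio and in which fraction is kept. The short Python sketch below computes those ratios from the stated volumes; the class and field names are illustrative. It relies on the general behavior of SPRI beads, in which a lower bead ratio binds only longer fragments. The actual size cutoffs are not stated in the source and are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class SpriRound:
    sample_ul: float
    beads_ul: float
    keep: str  # fraction retained: "beads" (eluted DNA) or "supernatant"

    @property
    def ratio(self) -> float:
        """Bead volume divided by sample volume."""
        return self.beads_ul / self.sample_ul


rounds = [
    SpriRound(sample_ul=100, beads_ul=72, keep="beads"),       # round 1
    SpriRound(sample_ul=80, beads_ul=52, keep="supernatant"),  # round 2
]

for number, spri_round in enumerate(rounds, start=1):
    print(f"round {number}: {spri_round.ratio:.2f}x beads, keep {spri_round.keep}")
```

Read this way, round 1 discards fragments too short to bind at the higher ratio, and round 2 discards fragments long enough to bind at the lower ratio, leaving an intermediate size window in the supernatant.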

[0476] 15. Single-stranded library isolation

[0477] 1) The above size-selected DNA was captured with streptavidin beads. After washing, the bead-bound DNA was denatured with melting solution, and the unbound ssDNA was collected.

[0478] 2) The ssDNA was neutralized with sodium acetate and buffer-exchanged using a MinElute kit. The ssDNA was eluted with 15-20 μl TE.

[0479] The single-stranded paired end library members were then amplified in a standard 454 emulsion amplification reaction, and the amplified population of members was sequenced. Figure 24 includes a graph showing a pairwise distance distribution consistent with a target insert size of 24 Kb, and a longest detected pairwise distance of approximately 40 Kb.

[0480] Although advantageous embodiments of the present invention have been described in detail herein, it is to be understood that the invention defined by the above paragraphs is not limited to the particular details set forth in the above description, as many apparent variations thereof are possible without departing from the spirit or scope of the present invention. Modifications and variations of the methods described herein will be apparent to those skilled in the art and are encompassed by the appended claims.

Claims (12)

1. A method for obtaining a DNA construct comprising two end regions of a target nucleic acid in an in vitro reaction, the method comprising the steps of: - fragmenting a nucleic acid molecule to produce a target nucleic acid molecule; - ligating a recombination adaptor element to each end of the target nucleic acid molecule to produce an adapted target nucleic acid molecule; - exposing the adapted target nucleic acid to a site specific recombinase to produce a circular nucleic acid product and a linear nucleic acid product from the adapted target nucleic acid, wherein the circular nucleic acid product comprises the target nucleic acid molecule; and - fragmenting the circular nucleic acid product to produce a template nucleic acid molecule comprising a sequence region from each end of the target nucleic acid molecule.
2. The method of claim 1, wherein after the step of exposing the adapted target nucleic acid to the site specific recombinase, the method further comprises the step of removing non-circular molecules.
3. The method of claim 1, further comprising the steps of: - amplifying the template nucleic acid to produce a population comprising a plurality of substantially identical copies; and - sequencing the population to produce sequence data comprising the sequence composition of the template nucleic acid.
4. The method of claim 1, wherein the recombination adaptor elements comprise a first recombination adaptor element and a second recombination adaptor element, wherein both the first and second recombination adaptor elements comprise a directional element.
5. The method of claim 1, wherein the site specific recombinase comprises Cre recombinase.
6. The method of claim 1, wherein the target nucleic acid molecule comprises a length selected from: at least 3 Kb, at least 8 Kb, at least 10 Kb, at least 20 Kb, at least 50 Kb and at least 100 Kb.
7. The method of claim 1, wherein the large nucleic acid molecule comprises genomic DNA.
8. The method of claim 1, wherein the circular nucleic acid product comprises a first hybrid recombination adaptor and the linear nucleic acid product comprises a second hybrid recombination adaptor, wherein the first and second hybrid recombination adaptors comprise elements from the ligated recombination adaptors.
9. The method of claim 1, wherein the step of fragmenting the circular nucleic acid product comprises nebulization.
10. A method for obtaining a plurality of DNAs comprising two end regions of a target nucleic acid in an in vitro reaction, the method comprising the steps of: - fragmenting a large nucleic acid molecule to produce a plurality of target nucleic acid molecules; - ligating a recombination adaptor element to each end of the target nucleic acid molecules to produce a plurality of adapted target nucleic acid molecules; - exposing the adapted target nucleic acid molecules to a site specific recombinase to produce a plurality of circular nucleic acid products and a plurality of linear nucleic acid products from the adapted target nucleic acid molecules, wherein the circular nucleic acid products comprise the target nucleic acid molecules; and - fragmenting the circular nucleic acid products to produce a plurality of template nucleic acid molecules comprising a sequence region from each end of the target nucleic acid molecules.
11. A kit for carrying out the method of claim 1, the kit comprising: - a plurality of recombination adaptor elements; and - a site specific recombinase, which is preferably Cre recombinase.
12. A kit for carrying out the method of claim 1, the kit comprising: - a plurality of recombination adaptor elements; - a site specific recombinase, which is preferably Cre recombinase; - an exonuclease; and - a circular carrier DNA, which is preferably pUC19.
