CN104039438A - Treatment for stabilizing nucleic acid arrays - Google Patents

Info

Publication number
CN104039438A
CN104039438A CN201280065478.1A CN201280065478A CN104039438A CN 104039438 A CN104039438 A CN 104039438A CN 201280065478 A CN201280065478 A CN 201280065478A CN 104039438 A CN104039438 A CN 104039438A
Authority
CN
China
Prior art keywords
nucleic acid
adapter
target nucleic
probe
sequence
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201280065478.1A
Other languages
Chinese (zh)
Other versions
CN104039438B (en
Inventor
诺曼·里·伯恩斯
杰·威利斯·雪弗托
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Complete Genomics Inc
Original Assignee
Callida Genomics Inc

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N1/00 Sampling; Preparing specimens for investigation
    • G01N1/28 Preparing specimens for investigation including physical details of (bio-)chemical methods covered elsewhere, e.g. G01N33/50, C12Q
    • G01N1/30 Staining; Impregnating; Fixation; Dehydration; Multistep processes for preparing samples of tissue, cell or nucleic acid material and the like for analysis
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J19/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J19/0046 Sequential or parallel reactions, e.g. for the synthesis of polypeptides or polynucleotides; Apparatus and devices for combinatorial chemistry or for making molecular arrays
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00277 Apparatus
    • B01J2219/00497 Features relating to the solid phase supports
    • B01J2219/00527 Sheets
    • B01J2219/00529 DNA chips
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00583 Features relative to the processes being carried out
    • B01J2219/00603 Making arrays on substantially continuous surfaces
    • B01J2219/00605 Making arrays on substantially continuous surfaces the compounds being directly bound or immobilised to solid supports
    • B01J2219/00608 DNA chips
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00583 Features relative to the processes being carried out
    • B01J2219/00603 Making arrays on substantially continuous surfaces
    • B01J2219/00605 Making arrays on substantially continuous surfaces the compounds being directly bound or immobilised to solid supports
    • B01J2219/00632 Introduction of reactive groups to the surface
    • B01J2219/00637 Introduction of reactive groups to the surface by coating it with another layer
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00583 Features relative to the processes being carried out
    • B01J2219/00603 Making arrays on substantially continuous surfaces
    • B01J2219/00659 Two-dimensional arrays
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00718 Type of compounds synthesised
    • B01J2219/0072 Organic compounds
    • B01J2219/00722 Nucleotides
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6834 Enzymatic or biochemical coupling of nucleic acids to a solid phase
    • C12Q1/6837 Enzymatic or biochemical coupling of nucleic acids to a solid phase using probe arrays or probe chips

Abstract

The present invention is directed to the treatment of nucleic acid molecules that are attached to or associated with solid supports for biochemical analysis, including nucleic acid sequencing. After loading on the solid support, the nucleic acid molecules are treated with a composition comprising a condensing agent, a volume excluding agent, or both, and then treated with a composition comprising a protein.

Description

Treatment for stabilizing nucleic acid arrays
Cross-reference to related applications
This application claims priority to U.S. Provisional Application 61/554,789, filed November 2, 2011. The entire content of the priority application is incorporated herein for all purposes.
Background
Large-scale genomic sequence analysis is a key step toward understanding many different biological phenomena. The need for low-cost, high-throughput sequencing and resequencing has led to the development of new sequencing methods that analyze multiple nucleic acid targets in parallel.
Conventional sequencing methods are generally limited to determining a few tens of nucleotides before the signal decays significantly, which severely limits overall sequencing efficiency. Conventional sequencing methods are also limited by signal-to-noise ratios that make them unsuitable for single-molecule sequencing.
It would be advantageous to the field if methods and compositions could be designed to increase the efficiency of sequencing reactions and of assembling complete sequences from shorter read lengths. Many genome sequencing and other analytical methods have been developed that employ nucleic acid molecules (for example, single- or double-stranded DNA or RNA) attached to solid or semi-solid supports. For such analyses in particular, methods and compositions are needed that stabilize the attached nucleic acid molecules against chemical and mechanical degradation.
Summary of the invention
We have developed a method of treating such nucleic acid molecules after they have been contacted with and attached to (that is, loaded onto) a solid support, in order to stabilize them against chemical and mechanical degradation. In general, the method comprises condensing the attached nucleic acid molecules and coating them with a protein.
As described in detail herein, one such analytical method involves sequencing DNA that comprises DNA concatemers (also referred to as DNA nanoballs or DNBs) on high-occupancy, high-density nanoarrays, which are self-assembled by electrostatic adsorption of the DNBs onto photolithographically patterned solid-phase substrates. The compositions and methods of the invention can also be used in connection with a variety of biochemical analyses, including, for example, nucleic acid hybridization, enzymatic reactions (for example, using endonucleases [including restriction enzymes], exonucleases, kinases, phosphatases, ligases, etc.), nucleic acid synthesis, nucleic acid amplification (for example, by PCR, rolling circle replication, whole genome amplification, multiple displacement amplification, etc.), and any other form of biochemical analysis known in the art that uses nucleic acid molecules attached to a solid or semi-solid support.
Accordingly, the invention provides methods and compositions for treating nucleic acid arrays to improve the stability of the arrays during nucleic acid analysis, including but not limited to nucleic acid sequencing, nucleic acid hybridization assays, and enzyme-assisted nucleic acid assays.
One embodiment provides a method comprising: (a) providing a nucleic acid array that comprises (i) a support having a surface and (ii) nucleic acid molecules (for example, single- or double-stranded DNA or RNA, including but not limited to DNBs) attached to the surface; (b) condensing the nucleic acid molecules attached to the surface, thereby producing condensed nucleic acid molecules; and (c) coating the condensed nucleic acid molecules with a protein.
According to one such embodiment, condensing the nucleic acid molecules attached to the surface of the substrate comprises contacting the array with a composition (for example, an aqueous solution) that comprises a nucleic acid condensing agent, a volume excluding agent, or both. Useful nucleic acid condensing agents include, but are not limited to, alcohols such as isopropanol. Useful volume excluding agents include, but are not limited to, polyethylene glycol.
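The composition options in this embodiment (a condensing agent, a volume excluding agent, or both) can be sketched as a small data structure. This is only an illustration: the class name `CondensingComposition`, its fields, and the `components` helper are hypothetical and do not come from the patent, and no concentrations are modeled because none are given here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CondensingComposition:
    """Hypothetical record of what a condensing composition contains."""

    condensing_agent: Optional[str] = None        # e.g. "isopropanol"
    volume_excluding_agent: Optional[str] = None  # e.g. "polyethylene glycol"

    def __post_init__(self) -> None:
        # The text requires at least one of the two agents to be present.
        if self.condensing_agent is None and self.volume_excluding_agent is None:
            raise ValueError(
                "composition needs a condensing agent, "
                "a volume excluding agent, or both"
            )

    def components(self) -> List[str]:
        """Return the agents that are actually present, in a fixed order."""
        present = [self.condensing_agent, self.volume_excluding_agent]
        return [agent for agent in present if agent is not None]


if __name__ == "__main__":
    both = CondensingComposition("isopropanol", "polyethylene glycol")
    print(both.components())
```

An empty composition is rejected at construction time, which mirrors the "or both" wording: either agent alone is acceptable, neither is not.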
According to another such embodiment, coating the nucleic acid molecules attached to the surface of the substrate comprises coating them with a composition comprising a protein that binds the surface and does not interfere with nucleic acid analysis. Such proteins include, but are not limited to, serum albumins, for example bovine serum albumin or human serum albumin.
According to another embodiment, the invention provides a method comprising: (a) providing a DNB array that comprises (i) a support having a surface and (ii) DNBs non-covalently attached to the surface; (b) contacting the DNBs attached to the surface with an aqueous solution comprising a nucleic acid condensing agent and a volume excluding agent, thereby producing condensed DNBs; and (c) coating the condensed DNBs with a protein.
According to another embodiment of the invention, a kit is provided for improving the stability of a nucleic acid array during nucleic acid analysis, wherein the array comprises (i) a support having a surface and (ii) nucleic acid molecules attached to the surface. The kit comprises (a) a nucleic acid condensing composition that condenses the nucleic acid molecules attached to the surface, thereby producing condensed nucleic acid molecules; and (b) a coating composition comprising a protein that coats the condensed nucleic acid molecules. In one embodiment, the nucleic acid condensing composition comprises a nucleic acid condensing agent and a volume excluding agent.
Brief description of the drawings
Fig. 1 is a schematic illustration of one embodiment of a method for fragmenting nucleic acids.
Fig. 2 is a schematic illustration of embodiments of the invention that relate to long fragment read (LFR) technology. Fig. 2A illustrates a method for fragmenting nucleic acids by standard multiple displacement amplification (MDA). Fig. 2B illustrates a method for fragmenting nucleic acids by multiple displacement amplification using a 5' exonuclease. Fig. 2C is a schematic of an embodiment of the overall LFR process.
Fig. 3 is a schematic illustration of embodiments of a barcode adapter design for use in methods of the invention.
Fig. 4 is a schematic illustration of embodiments of the invention in which nucleic acids are fragmented by nick translation.
Fig. 5 is a schematic illustration of adapters that can be used in embodiments of the invention. Fig. 5A provides four different adapter sequences. Fig. 5B illustrates the different components that can be included in an adapter design of the invention.
Fig. 6 is a schematic illustration of embodiments of the invention for preparing circular nucleic acid templates that contain multiple adapters.
Fig. 7 is a schematic illustration of embodiments of the invention for controlling the orientation of adapters inserted into target nucleic acids.
Fig. 8 is a schematic illustration of exemplary embodiments of the different orientations in which adapters and target nucleic acid molecules can be ligated to each other.
Fig. 9 is a schematic illustration of one aspect of a method of the invention for assembling nucleic acid templates.
Fig. 10 is a schematic illustration of adapter components used to control the way adapters are inserted into target nucleic acids.
Fig. 11 is a schematic illustration of embodiments of an arm-by-arm ligation method for inserting adapters into target nucleic acids. Fig. 11A illustrates an exemplary embodiment of the arm-by-arm ligation method, and Fig. 11B illustrates exemplary components of the adapter arms used in the method.
Fig. 12 is a schematic illustration of the possible orientations of adapter insertion.
Fig. 13 is a schematic illustration of one embodiment of a nick translation ligation method.
Fig. 14 is a schematic illustration of one embodiment of a method for inserting multiple adapters.
Fig. 15 is a schematic illustration of one embodiment of a nick translation ligation method.
Fig. 16 is a schematic illustration of one embodiment of a nick translation ligation method.
Fig. 17 is a schematic illustration of embodiments of a nick translation ligation method that uses nick translation circle inversion (Fig. 17A) and nick translation circle inversion combined with uracil degradation (Fig. 17B).
Fig. 18 is a schematic illustration of one embodiment of a nick translation ligation method.
Fig. 19 is a schematic illustration of one embodiment of a method for inserting multiple adapters.
Fig. 20 is a schematic illustration of one embodiment of a method for inserting multiple adapters.
Fig. 21 is a schematic illustration of one embodiment of a method for inserting multiple adapters.
Fig. 22 is a schematic illustration of one embodiment of a method for inserting multiple adapters.
Fig. 23 is a schematic illustration of one embodiment of a combinatorial probe anchor ligation method.
Fig. 24 is a schematic illustration of one embodiment of a combinatorial probe anchor ligation method.
Fig. 25 is a schematic illustration of one embodiment of a combinatorial probe anchor ligation method.
Fig. 26 is a schematic illustration of one embodiment of a combinatorial probe anchor ligation method.
Fig. 27 is a graph of the fluorescence intensity levels achieved for each base at defined positions using a double combinatorial probe anchor ligation method.
Fig. 28 is a graph of the data fit scores obtained for interrogated positions using a combinatorial probe anchor ligation method.
Fig. 29 is a graph of the fluorescence intensity levels obtained for a single-base interrogation at different time points using a single and a double combinatorial probe anchor ligation method.
Fig. 30 is a graph of the data fit scores obtained for a single-base interrogation at different time points using a single combinatorial probe anchor ligation method.
Fig. 31 is a graph of the fluorescence intensity levels achieved at different positions using a variety of second anchor probes in double combinatorial probe anchor ligation methods, compared with a single combinatorial probe anchor ligation method.
Fig. 32 is a graph of the data fit scores obtained at different positions using a variety of second anchor probes in double combinatorial probe anchor ligation methods, compared with a single combinatorial probe anchor ligation method.
Fig. 33 is a graph of the fluorescence intensity levels achieved at different positions using a variety of double combinatorial probe anchor ligation methods, compared with a single combinatorial probe anchor ligation method.
Fig. 34 is a graph of the data fit scores obtained at different positions using first anchor probes of various lengths in a double combinatorial probe anchor ligation method.
Fig. 35 is a graph of the fluorescence intensity levels achieved for each base at a defined position using a double combinatorial probe anchor ligation method in the presence of a kinase at different temperatures.
Fig. 36 is a graph of the data fit scores obtained for a defined position using a double combinatorial probe anchor ligation method in the presence of a kinase at different temperatures.
Fig. 37 is a graph of the fluorescence intensity levels achieved for each base at a defined position using a double combinatorial probe anchor ligation method in the presence of a kinase with different kinase incubation times.
Fig. 38 is a graph of the data fit scores obtained for a defined position using a double combinatorial probe anchor ligation method in the presence of a kinase with different kinase incubation times.
Fig. 39 is a schematic illustration of some embodiments of the invention. Fig. 39A illustrates the steps of a sequencing method according to the invention. Fig. 39B illustrates a fragment of genomic DNA containing four adapters. Fig. 39C illustrates the rolling circle replication process that produces DNBs. Fig. 39D illustrates an embodiment of a DNB array according to the invention. Fig. 39E illustrates an embodiment of a sequencing method that uses labeled sequencing probes and two anchor probes.
Fig. 40 is a table of the oligonucleotides used for adapter construction and insertion.
Figs. 41A to 41C are tables of the markers used for quantitative PCR analysis of constructs of the invention.
Fig. 42 is a graph of data showing quantitative PCR analysis of intermediate constructs of the invention.
Fig. 43 is a graph of data analyzing errors in DNB sequencing results.
Fig. 44 provides graphs illustrating genome coverage analysis. Fig. 44A shows the cumulative coverage for each genome and for a simulation. Fig. 44B shows genome coverage sorted by GC content. Fig. 44C shows the efficiency of detecting Infinium SNPs and/or homozygous Infinium genotypes as a function of the coverage depth at different loci in NA07022.
Fig. 45 is a graph showing that the proportion of insertions and deletions whose size is a multiple of 3 is elevated in coding sequence, indicating that their deleterious impact is lower.
Fig. 46 is a graph of data illustrating concordance with the genotypes for NA07022 generated by the HapMap Project (release 24), with the highest-quality Infinium assay subset of those genotypes, and with genotyping by the Illumina Infinium 1M assay.
Fig. 47 shows concordance with the genotypes generated by the HapMap Project (release 24) and with the highest-quality Infinium assay subset of the HapMap genotypes or of the genotypes from Affy 500k genotyping.
Fig. 48 shows the percentage concordance of called variants with 1M Infinium SNPs, with the data sorted by quality score.
Fig. 49 is a graph illustrating how the proportion of novel variants changes with the quality score threshold.
Figs. 50A and 50B show tables summarizing the impact of variants in the coding regions of NA07022.
Fig. 51 is a schematic illustration of exemplary embodiments of nucleic acid template constructs of the invention.
Fig. 52 is a schematic illustration of read data formats according to the invention.
Detailed description of the invention
Unless otherwise indicated, the practice of the present invention may employ conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and immunology that are within the skill of the art. Such conventional techniques include polymer array synthesis, hybridization, ligation, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the examples below. However, other equivalent conventional procedures can of course also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Genome Analysis: A Laboratory Manual Series (Vols. I-IV), Using Antibodies: A Laboratory Manual, Cells: A Laboratory Manual, PCR Primer: A Laboratory Manual, and Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press); Stryer, L. (1995) Biochemistry (4th Ed.), Freeman, New York; Gait, "Oligonucleotide Synthesis: A Practical Approach," 1984, IRL Press, London; Nelson and Cox (2000), Lehninger, Principles of Biochemistry, 3rd Ed., W.H. Freeman Pub., New York, N.Y.; and Berg et al. (2002) Biochemistry, 5th Ed., W.H. Freeman Pub., New York, N.Y., all of which are incorporated herein by reference in their entirety for all purposes.
Note that, as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a polymerase" refers to one agent or to mixtures of such agents, reference to "the method" includes equivalent steps and methods known to those skilled in the art, and so forth.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. All publications mentioned herein are incorporated herein by reference for the purpose of describing and disclosing the devices, compositions, formulations, and methodologies that are described in those publications and that might be used in connection with the present invention.
Where a range of values is provided, it is understood that each intervening value between the upper and lower limits of that range, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and this is also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
In the following description, numerous specific details are set forth to provide a more thorough understanding of the present invention. However, it will be apparent to one of skill in the art that the present invention may be practiced without one or more of these specific details. In other instances, features and procedures well known to those skilled in the art have not been described in order to avoid obscuring the invention.
Although the present invention is described primarily with reference to specific embodiments, it is also envisioned that other embodiments will become apparent to those skilled in the art upon reading the present disclosure, and it is intended that such embodiments be included within the scope of the present methods.
I. Overview
The present invention relates to compositions and methods for stabilizing nucleic acid molecules on nucleic acid arrays. Briefly, after the nucleic acid molecules have been loaded onto the array, a post-arraying treatment stabilizes them against chemical and mechanical degradation during biochemical analysis, including but not limited to nucleic acid sequencing. According to one embodiment, the nucleic acid molecules are coated with a layer of partially denatured protein to improve the stability of the arrayed nucleic acid molecules and thereby the signal intensity and specificity obtained in biochemical analyses, for example sequencing reactions that involve fluorescent dyes. In addition, it has been observed that if the arrayed nucleic acid molecules are coated directly after initial loading, individual nucleic acid molecules on the array become smeared across the surface to some degree. A rinse step that condenses the nucleic acids before coating, together with subsequent wash steps, reduces the amount of smearing and the potential for physical interaction between adjacent nucleic acid molecules, thereby improving the quality of the data generated by the biochemical analysis.
One embodiment of the invention concerns the use of this post-loading treatment in the context of target nucleic acid sequencing, a process that includes extracting target nucleic acids from a sample and fragmenting them. The fragmented nucleic acids are used to produce target nucleic acid templates, which generally include one or more adapters. The target nucleic acid templates are subjected to amplification methods to form nucleic acid nanoballs, which are usually disposed on a surface. The nucleic acid nanoballs are sequenced, generally by ligation-based techniques, including combinatorial probe anchor ligation ("cPAL") methods, which are described in further detail below. cPAL and other sequencing methods can also be used to detect specific sequences, for example single nucleotide polymorphisms ("SNPs"), in nucleic acid constructs of the invention (which include nucleic acid nanoballs as well as linear and circular nucleic acid templates).
The methods and compositions of the invention have several features that significantly reduce sequencing costs and allow sequencing reactions to scale to efficient high-throughput levels. Because the sequencing substrates (also referred to herein as "DNA nanoballs" and "DNBs") are produced by rolling circle replication in a uniform-temperature, solution-phase reaction with high template concentrations (> 20 billion per ml), significant selection bottlenecks and non-clonal amplicons are avoided. This circumvents the stochastic inefficiencies of approaches that require precise titration of template concentrations for in situ clonal amplification in emulsion or bridge PCR. These features also allow DNBs for hundreds of genomes to be produced per day, automatically, in standard 96-well plates.
The arrays of the invention are suited to cheaper and more efficient imaging techniques. High-occupancy, high-density nanoarrays are self-assembled by electrostatic adsorption of solution-phase DNBs onto photolithographically patterned solid-phase substrates. Compared with arrays of randomly positioned DNA, such patterned arrays yield a high proportion of informative pixels. The several hundred reaction sites in a compact DNB (300 nm in diameter in some embodiments) produce bright signals for rapid imaging. This spot density, the resulting image acquisition efficiency, and the reduced reagent consumption give a high sequencing throughput per instrument, which is critical for making large-scale human genome sequencing feasible for research and clinical applications.
The "unchained" cPAL sequencing biochemistry of the invention enables inexpensive and accurate base calling. In general, apart from the present invention, two different sequencing chemistries are used in contemporary sequencing platforms: sequencing by synthesis (SBS) and sequencing by ligation (SBL). Both use "chained" reads, in which the substrate for cycle N+1 depends on the product of cycle N; consequently, errors may accumulate over multiple cycles, and data quality may be affected by errors (especially incomplete extensions) arising in previous cycles. These chained sequencing reactions therefore need to be driven to near completion with high concentrations of expensive, high-purity labeled substrate molecules and enzymes. The independent, unchained nature of cPAL, by contrast, avoids error accumulation and tolerates low-quality bases in otherwise high-quality reads, thereby reducing reagent costs.
Sequencing substrates according to the present invention are generated by a DNA engineering process based on a modified nick translation for directional adapter insertion, giving adapter ligation yields of over 90% (although lower yields are also acceptable) and chimeric rates as low as about 1%. DNA molecules with inserted adapters are further enriched by PCR. This recursive process can be carried out in batches of 96 samples (or more, as required) and extended by inserting additional adapters to read 120 bases or more per DNB. The current read length is comparable to that of other massively parallel sequencing technologies.
Sequencing data generated using the methods and compositions of the present invention are of sufficient quality and accuracy for whole-genome association studies, for the identification of potentially rare variants associated with disease or with therapeutic treatment, and for the identification of somatic mutations. The low cost of consumables and the efficient imaging make studies of hundreds of individuals possible. The high accuracy and completeness required for clinical diagnostic applications call for continued development of this and other technologies.
II. Preparing genomic nucleic acid fragments
As discussed further herein, the nucleic acid templates of the invention comprise target nucleic acids and adapters. To obtain target nucleic acids for constructing the nucleic acid templates of the invention, the present invention provides methods for obtaining genomic nucleic acids from a sample and for fragmenting those genomic nucleic acids to produce fragments for use in the subsequent methods of constructing the nucleic acid templates of the invention.
The tiered nucleic acid fragment library construction used in various embodiments of the invention is a distinctive feature that is used to resolve entire genomes (especially human genomes). As described in further detail below, in some embodiments 500 bp fragments are used to span most of the repeat elements in the genome, including Alu repeats, which constitute 10% of the genome. In other embodiments, longer fragments make it possible to sequence and analyze separately the two sets of parental chromosomes in a diploid sample. Analysis of these longer fragments allows heterozygote phasing over large intervals (potentially entire chromosomes), even across regions with high recombination rates.
IIA. Overview of preparing genomic nucleic acid fragments
In general, paired-end libraries prepared according to the present invention comprise target nucleic acid sequences (for example, genomic DNA, although other targets can also be used, as described herein) interspersed at certain distances with known synthetic DNA sequences (referred to as adapters). The adapters can serve as starting points for reading bases at a number of positions beyond each adapter-genomic DNA junction, and bases can optionally be read in both directions from an adapter.
Target nucleic acids can be obtained from a sample using methods known in the art. It will be appreciated that the sample may comprise any number of substances, including, but not limited to: bodily fluids (including, but not limited to, blood, urine, serum, lymph, saliva, anal and vaginal secretions, perspiration and semen of virtually any organism, with mammalian samples being preferred and human samples being particularly preferred); environmental samples (including, but not limited to, air, agricultural, water and soil samples); biological warfare agent samples; research samples (i.e., in the case of nucleic acids, the sample may be the product of an amplification reaction, including both target and signal amplification, such as a PCR amplification reaction as generally described in PCT/US99/01705); purified samples, such as purified genomic DNA, RNA, proteins, etc.; and raw samples (bacteria, virus, genomic DNA, etc.). As will be appreciated by those skilled in the art, virtually any experimental manipulation may have been performed on the sample. In one aspect, the nucleic acid constructs of the invention are formed from genomic DNA. In some embodiments, the genomic DNA is obtained from whole blood or from cell preparations from blood or cell cultures. In other embodiments, the target nucleic acids comprise exome DNA, i.e., a subset of whole genomic DNA enriched for transcribed sequences, which contains the set of exons in a genome. In other embodiments, the target nucleic acids comprise a transcriptome (i.e., the set of all mRNA, or "transcripts", produced in a cell or population of cells) or a methylome (i.e., the population of methylated sites and the pattern of methylation in a genome).
In an exemplary embodiment, genomic DNA is isolated from a target organism. "Target organism" means an organism of interest. As will be appreciated, although in some embodiments the target organism is a pathogen (for example, for the detection of bacterial or viral infections), the term includes any organism from which nucleic acids can be obtained, particularly mammals, including humans. Methods of obtaining nucleic acids from target organisms are well known in the art. Samples comprising human genomic DNA are useful in many embodiments. In some aspects, such as whole genome sequencing, about 20 to about 1,000,000 or more genome equivalents of DNA are preferably obtained to ensure that the population of target DNA fragments sufficiently covers the entire genome. The number of genome equivalents obtained may depend in part on the methods used to further prepare the genomic DNA fragments for use according to the invention. For example, the long fragment read methods described further below generally use about 20 to about 50 genome equivalents. Methods that use multiple displacement amplification (also described further below) generally use about 1000 to about 100,000 genome equivalents. Methods in which no amplification is performed before fragmentation use about 100,000 to about 1,000,000 genome equivalents.
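The input ranges quoted in the preceding paragraph can be collected into a small lookup, shown here as a minimal Python sketch. The dictionary keys and the function name are hypothetical labels chosen for illustration; only the numeric ranges come from the text, and they are approximate ("about") figures rather than hard limits.

```python
# Approximate genome-equivalent ranges quoted in the text.
# The method labels are hypothetical names used only for this illustration.
GENOME_EQUIVALENT_RANGES = {
    "long_fragment_read": (20, 50),
    "multiple_displacement_amplification": (1_000, 100_000),
    "no_amplification_before_fragmentation": (100_000, 1_000_000),
}


def within_typical_range(method: str, genome_equivalents: float) -> bool:
    """Return True if the input amount lies inside the quoted range for the method."""
    low, high = GENOME_EQUIVALENT_RANGES[method]
    return low <= genome_equivalents <= high


if __name__ == "__main__":
    for method, amount in [("long_fragment_read", 30),
                           ("long_fragment_read", 5_000),
                           ("multiple_displacement_amplification", 5_000)]:
        print(method, amount, within_typical_range(method, amount))
```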
Target genomic DNA is isolated using conventional techniques, for example those disclosed in Sambrook and Russell, Molecular Cloning: A Laboratory Manual, cited above. The target genomic DNA is then fractionated or fragmented to a desired size by conventional techniques, including enzymatic digestion, shearing and sonication, with shearing and sonication finding particular use in the present invention.
The fragment size of the target nucleic acid can vary with the source target nucleic acid and the library construction method used, but is generally 50 to 600 nucleotides in length. In another embodiment, the fragments are 300 to 600 nucleotides or 200 to 2000 nucleotides in length. In yet another embodiment, the fragments are 10 to 100, 50 to 100, 50 to 300, 100 to 200, 200 to 300, 50 to 400, 100 to 400, 200 to 400, 300 to 400, 400 to 500, 400 to 600, 500 to 600, 50 to 1000, 100 to 1000, 200 to 1000, 300 to 1000, 400 to 1000, 500 to 1000, 600 to 1000, 700 to 1000, 700 to 900, 700 to 800, 800 to 1000, 900 to 1000, 1500 to 2000, 1750 to 2000, or 50 to 2000 nucleotides in length.
In other embodiments, fragments of a particular size or within a particular size range are isolated. Such methods are well known in the art. For example, gel fractionation can be used to produce a population of fragments of a particular size within a range of base pairs (for example, 500 base pairs ± 50 base pairs).
In many cases, enzymatic digestion of the extracted DNA is not required, because the shear forces created during lysis and extraction generate fragments in the desired range. In other embodiments, shorter fragments (1 to 5 kb) can be generated by enzymatic digestion using restriction endonucleases. In another embodiment, about 10 to about 1,000,000 genome equivalents of DNA ensure that the population of fragments covers the entire genome. Libraries of nucleic acid templates generated from such a population of fragments will therefore comprise target nucleic acids whose sequences, once identified and assembled, provide most or all of the sequence of an entire genome.
In some cases, for example whenever only small amounts of sample DNA are available and there is a danger of loss through non-specific binding (for example, to container walls and the like), it is advantageous to provide carrier DNA (for example, unrelated circular synthetic double-stranded DNA) to be mixed and used with the sample DNA.
In one embodiment, the DNA is denatured after fragmentation to produce single-stranded fragments.
In one embodiment, after fragmentation (and in fact before or after any step outlined herein), an amplification step can be applied to the population of fragmented nucleic acids to ensure that a sufficiently large concentration of all the fragments is available for the subsequent steps of creating the modified nucleic acids of the invention and using those nucleic acids to obtain sequence information. Such amplification methods are well known in the art and include, but are not limited to: polymerase chain reaction (PCR), ligation chain reaction (sometimes referred to as oligonucleotide ligase amplification, OLA), cycling probe technology (CPT), strand displacement assay (SDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), rolling circle amplification (RCA) (for circularized fragments) and invasive cleavage technology.
In another embodiment, after fragmentation, the target nucleic acids are further modified to prepare them for the insertion of multiple adapters according to the methods of the invention. Such modifications can be necessary because the process of fragmentation may leave the target nucleic acids with termini that are not amenable to the procedures used to insert adapters, in particular the use of enzymes such as ligases and polymerases. As with all the steps outlined herein, this step is optional and can be combined with any other step.
In an exemplary embodiment, after physical fragmentation, the target nucleic acids typically have a combination of blunt and overhanging ends, as well as a combination of phosphate and hydroxyl chemistries at the termini. In this embodiment, the target nucleic acids are treated with several enzymes to create blunt ends with particular chemistries. In one embodiment, a polymerase and dNTPs are used to fill in any 5' single-stranded overhangs to create blunt ends. A polymerase with 3' exonuclease activity (generally, but not always, the same enzyme as the one acting on the 5' overhangs, such as T4 polymerase) is used to remove 3' overhangs. Suitable polymerases include, but are not limited to, T4 polymerase, Taq polymerases, E. coli DNA polymerase 1, Klenow fragment, reverse transcriptases, Φ29-related polymerases including wild-type Φ29 polymerase and derivatives of such polymerases, T7 DNA polymerase, T5 DNA polymerase and RNA polymerases. These techniques can be used to generate blunt ends, which are useful in a variety of applications.
In further optional embodiments, the chemistry of the termini is altered to prevent the target nucleic acids from ligating to each other. For example, in addition to a polymerase, a protein kinase can be used in the process of creating blunt ends, utilizing its 3' phosphatase activity to convert 3' phosphate groups to hydroxyl groups. Such kinases include, without limitation, commercially available kinases such as T4 kinase, as well as kinases that are not commercially available but have the desired activity.
Similarly, a phosphatase can be used to convert terminal phosphate groups to hydroxyl groups. Suitable phosphatases include, but are not limited to, alkaline phosphatase (including calf intestinal alkaline phosphatase (CIP)), Antarctic phosphatase, apyrase, inorganic pyrophosphatase (yeast), thermostable inorganic pyrophosphatase and the like, which are known in the art and are commercially available, for example from New England Biolabs.
As shown in FIG. 16, these modifications prevent the target nucleic acids from ligating to each other in later steps of the methods of the invention, thus ensuring that, in steps in which adapters (and/or adapter arms) are ligated to the termini of target nucleic acids, the target nucleic acids ligate to the adapters and not to other target nucleic acids. Target nucleic acids 1601 and 1602 are preferably ligated to adapters 1603 and 1604 in a desired orientation (as illustrated in the figure, the desired orientation is one in which ends with the same shape, circle or square, ligate to each other). Modifying the ends avoids the undesired configurations 1607, 1608, 1609 and 1610, in which target nucleic acids ligate to each other and adapters ligate to each other. In addition, as discussed in further detail below, the orientation of each adapter-target nucleic acid ligation can also be controlled by controlling the chemistry of the termini of both the adapters and the target nucleic acids. Control over the chemistry of the termini can be achieved by methods known in the art and described further herein.
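The effect of terminus chemistry on which ligation products can form may be illustrated with a minimal Python sketch. It relies on the general fact that a DNA ligase joins a 5' phosphate to a 3' hydroxyl. The `End` class, the `can_join` function and the particular end chemistries assigned to the target and the adapter are assumptions made for illustration; they are not the specific configurations of FIG. 16.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class End:
    """Simplified blunt duplex end: chemistry of its 5' and 3' termini."""
    five_prime_phosphate: bool
    three_prime_hydroxyl: bool


def can_join(a: End, b: End) -> bool:
    """True if a ligase could seal at least one strand between two blunt ends.

    Sealing a strand needs a 3' hydroxyl on one end and a 5' phosphate
    on the other.
    """
    return ((a.three_prime_hydroxyl and b.five_prime_phosphate)
            or (b.three_prime_hydroxyl and a.five_prime_phosphate))


# Hypothetical chemistries: a phosphatase-treated target (no 5' phosphate)
# and an adapter carrying a 5' phosphate but a blocked 3' end.
target = End(five_prime_phosphate=False, three_prime_hydroxyl=True)
adapter = End(five_prime_phosphate=True, three_prime_hydroxyl=False)

if __name__ == "__main__":
    print("target-adapter:", can_join(target, adapter))
    print("target-target:", can_join(target, target))
    print("adapter-adapter:", can_join(adapter, adapter))
```

Under these assumed chemistries only the target-adapter junction can be sealed, which is the outcome the end modifications are intended to enforce.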
As will be appreciated by those skilled in the art, and as for all the steps outlined herein, any combination of these steps and enzymes may be used. For example, some enzymatic fragmentation techniques, such as the use of restriction endonucleases, may render one or more of these enzymatic "end repair" steps superfluous.
The modifications described above prevent the formation of nucleic acid templates containing different fragments ligated in an unknown conformation, and thus reduce and/or eliminate the errors in sequence identification and assembly that can result from such undesired templates.
In some cases, hierarchical fragmentation methods are used in combination with any of the enzymatic or mechanical fragmentation methods described herein. Such methods are described in U.S. Application No. 11/451,692 and published PCT application WO 2006/138284, both of which are incorporated herein by reference in their entirety for all purposes and in particular for all teachings related to hierarchical fragmentation.
In some embodiments, a controlled random enzymatic ("CoRE") fragmentation method is used to prepare the fragments used in the invention. CoRE fragmentation is an enzymatic endpoint method. It has the advantages of enzymatic fragmentation (such as the ability to use it on small amounts and/or volumes of DNA) without many of its drawbacks (including sensitivity to variation in substrate or enzyme concentration and sensitivity to digestion time). Briefly, CoRE fragmentation involves a series of three enzymatic steps, which are schematically illustrated in FIG. 1. First, a nucleic acid 101 is subjected to enzymatic multiple displacement amplification (MDA) in the presence of dNTPs doped with dUTP or UTP in a defined ratio to dTTP. This results in the substitution of deoxyuracil ("dU") or uracil ("U") at a defined and controllable proportion of the T positions in both strands of the amplification product (103). The U moieties are then excised (104), usually by a combination of UDG, EndoVIII and T4PNK, to create single-base gaps with functional 5' phosphate and 3' hydroxyl ends (105). The single-base gaps are created at an average spacing defined by the frequency of U in the MDA product. Treatment of the gapped nucleic acid (105) with a polymerase results in nick translation until nicks on opposite strands converge, thereby creating double-strand breaks and producing a population of double-stranded fragments of relatively homogeneous size (107). Because the size distribution of the double-stranded fragments (107) is determined by the ratio of dTTP to dUTP or UTP used in the MDA reaction, rather than by the duration or degree of enzymatic treatment, the CoRE fragmentation method yields highly reproducible fragmentation.
In some cases, particularly when it is desired to isolate long fragments (for example, fragments of about 150 to about 750 kb), the invention provides methods in which cells are lysed and the intact nuclei are pelleted by a gentle centrifugation step. The genomic nucleic acid (usually genomic DNA) is then released by digestion with, for example, proteinase K and RNase for several hours. The resulting material is then dialyzed overnight or diluted directly to lower the concentration of remaining cellular waste. Because such methods of isolating nucleic acid do not involve many disruptive processes (such as ethanol precipitation, centrifugation and vortexing), the genomic nucleic acid remains largely intact, yielding a majority of fragments longer than 150 kb.
In some cases, in combination with any of the fragmentation methods described above, the invention further provides methods of dividing a population of genomic nucleic acid fragments into aliquots, which makes it possible to reconstruct diploid genomes, for example to identify maternal and paternal chromosomes or sequences. This is a significant advantage over the prior art.
In this embodiment, the genomic fragments are divided into aliquots such that the nucleic acids are diluted to a concentration of approximately 10% of a haploid genome per aliquot. At this level of dilution, approximately 95% of the base pairs in a particular aliquot are non-overlapping. This method of aliquoting, also referred to herein as a long fragment read (LFR) fragmentation method, can be used in particular embodiments on large-molecular-weight fragments isolated according to the methods described above and further herein. An example of an LFR method is illustrated in FIG. 2C. LFR begins with a short treatment of genomic nucleic acid, usually genomic DNA, with a 5' exonuclease to create 3' single-stranded overhangs. These single-stranded overhangs serve as initiation sites for multiple displacement amplification (MDA) (FIG. 2A). The 5' exonuclease-treated DNA is then diluted to sub-genome concentrations and dispersed across a number of aliquots, usually across a number of wells in a multiwell plate. The fragments in each well are amplified, usually using a standard MDA method (FIG. 2A) and/or an MDA method utilizing an exonuclease (FIG. 2B). In some cases, the amplification method introduces uracil moieties into the fragments, so that the CoRE method described above can be used to further fragment the fragments in each well after amplification. The MDA products can also be fragmented by sonication or enzymatic treatment. In general, after fragmentation of the MDA products, the ends of the resulting fragments are repaired, usually with T4 polymerase and T4 polynucleotide kinase. The fragments are then treated with alkaline phosphatase and tagged with an adapter. Generally, the tag adapter arm is designed in two segments. One segment is common to all wells and is ligated directly to the fragments by blunt-end ligation using methods described further herein. The second segment is unique to each well and contains a "barcode" sequence, so that when the contents of the wells are combined, the fragments from each well can be identified. FIG. 3 illustrates some exemplary barcode adapters that can be added to the fragments for this aspect of the invention.
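The two-segment tag adapter described above (a segment common to all wells plus a well-specific barcode) can be sketched in Python as follows. The sequences, well names, segment order and fixed barcode length are hypothetical values chosen for illustration only; they are not the adapters of FIG. 3.

```python
from collections import defaultdict

# Hypothetical sequences, for illustration only.
COMMON_SEGMENT = "ACGTAC"            # segment shared by all wells
WELL_BARCODES = {"A1": "AAGG", "A2": "CCTT", "B1": "GATC"}
BARCODE_LENGTH = 4


def tag_fragment(fragment: str, well: str) -> str:
    """Attach the tag: well barcode, then common segment, then the fragment."""
    return WELL_BARCODES[well] + COMMON_SEGMENT + fragment


def demultiplex(tagged_reads):
    """Group pooled reads by well using the barcode; drop unrecognized reads."""
    barcode_to_well = {bc: well for well, bc in WELL_BARCODES.items()}
    insert_start = BARCODE_LENGTH + len(COMMON_SEGMENT)
    by_well = defaultdict(list)
    for read in tagged_reads:
        well = barcode_to_well.get(read[:BARCODE_LENGTH])
        has_common = read[BARCODE_LENGTH:insert_start] == COMMON_SEGMENT
        if well is not None and has_common:
            by_well[well].append(read[insert_start:])
    return dict(by_well)


if __name__ == "__main__":
    pooled = [tag_fragment("TTTT", "A1"),
              tag_fragment("GGGG", "B1"),
              tag_fragment("CCCC", "A1")]
    print(demultiplex(pooled))
```

Because every fragment from a well carries the same barcode, fragments can be pooled for downstream processing and still be traced back to their well of origin.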
In some cases, an LFR method is used to analyze the genome of an individual cell. The process for isolating DNA in this case is similar to the methods described above, but is carried out in a smaller volume. Once the DNA is isolated, and before it is aliquoted into individual wells, the genomic DNA must be fragmented carefully to avoid loss of material, in particular loss of sequence from the ends of each fragment, because loss of such material results in gaps in the final genome assembly. In some cases, sequence loss is avoided by using an infrequent nicking enzyme, which creates initiation sites for a polymerase (for example, phi29 polymerase) at distances of about 100 kb from each other. As the polymerase creates a new DNA strand, it displaces the old strand, with the end result that there are overlapping sequences near the sites of polymerase initiation (FIG. 4) and very little sequence is lost.
In some cases, for example when only small amounts of sample DNA are available and there is a danger of losing DNA through non-specific binding, for example to container walls and the like, it is advantageous to provide carrier DNA (unrelated circular synthetic double-stranded DNA) to be mixed and used with the sample DNA. In one embodiment, the DNA is denatured after fragmentation to produce single-stranded fragments.
In one embodiment, after fragmentation (and in fact before or after any step outlined herein), an amplification step can be applied to the population of fragmented nucleic acids to ensure that a sufficiently large concentration of all the fragments is available for the subsequent steps of creating the modified nucleic acids of the invention and using those nucleic acids to obtain sequence information. Such amplification methods are well known in the art and include, but are not limited to, polymerase chain reaction (PCR), ligation chain reaction (sometimes referred to as oligonucleotide ligase amplification, OLA), cycling probe technology (CPT), strand displacement assay (SDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), rolling circle amplification (RCA) (for circularized fragments) and invasive cleavage technology.
In other embodiments, after fragmentation, the target nucleic acids are further modified to prepare them for the insertion of multiple adapters according to the methods of the invention. Such modifications can be necessary because the process of fragmentation may leave the target nucleic acids with termini that are not amenable to the procedures used to insert adapters, in particular the use of enzymes such as ligases and polymerases. As with all the steps outlined herein, this step is optional and can be combined with any other step. Methods of modifying the fragments to prepare them for directed ligation to other nucleic acid molecules include the use of enzymes, such as polymerases and phosphatases, to modify the ends of the fragments so that they can ligate to other nucleic acid molecules only in a desired orientation. Such methods are described further herein.
IIB. CoRE fragmentation
As discussed above, fragmentation methods for use in the present invention include both mechanical and enzymatic fragmentation methods, as well as combinations of enzymatic and mechanical fragmentation methods. Many mechanical and enzymatic fragmentation methods are known in the art.
In one aspect, the invention provides a fragmentation method referred to herein as controlled random enzymatic (CoRE) fragmentation. The CoRE fragmentation methods described herein can be used alone or in combination with other mechanical and enzymatic fragmentation methods known in the art. CoRE fragmentation involves a series of three enzymatic steps, which are schematically illustrated in FIG. 1. First, a nucleic acid 101 is subjected to an amplification method conducted in the presence of dNTPs doped with a proportion of dUTP or UTP. This results in the substitution of deoxyuracil ("dU") or uracil ("U") at a defined and controllable proportion of the T positions in both strands of the amplification product (103). A number of amplification methods can be used in this step of the invention, including, but not limited to, polymerase chain reaction (PCR), ligation chain reaction (sometimes referred to as oligonucleotide ligase amplification, OLA), cycling probe technology (CPT), strand displacement assay (SDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), rolling circle amplification (RCA) (for circularized fragments) and invasive cleavage technology. In some embodiments, multiple displacement amplification (MDA) in the presence of dNTPs doped with dUTP or UTP in a defined ratio to dTTP is used to create amplification products in which dUTP or UTP is substituted at certain positions on both strands (103).
After amplification and insertion of the uracil moieties, the uracils are excised (104), usually by a combination of UDG, EndoVIII and T4PNK, to create single-base gaps with functional 5' phosphate and 3' hydroxyl ends (105). The single-base gaps are created at an average spacing defined by the frequency of U in the MDA product. In other words, the higher the amount of dUTP, the shorter the resulting fragments. As will be appreciated by those skilled in the art, other techniques that result in the selective replacement of a nucleotide with a modified nucleotide, and that similarly lead to cleavage, can also be used, for example nucleotides that are susceptible to chemicals or to other enzymes.
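The inverse relationship between the amount of dUTP and the gap spacing can be made concrete with a minimal Python sketch. It assumes a deliberately simple model that is not stated in the source: each T position on a strand is replaced by U independently with a fixed probability, so the expected spacing between gaps on one strand is the reciprocal of (fraction of T bases) x (substitution probability). The function name, the 25% T content and the substitution rates are illustrative values only.

```python
def mean_gap_spacing(t_fraction: float, u_substitution_rate: float) -> float:
    """Expected distance, in bases, between U positions on one strand.

    Simplified model (an assumption, not taken from the source): every T
    position is independently replaced by U with probability
    u_substitution_rate.
    """
    if not 0 < t_fraction <= 1:
        raise ValueError("t_fraction must be in (0, 1]")
    if not 0 < u_substitution_rate <= 1:
        raise ValueError("u_substitution_rate must be in (0, 1]")
    return 1.0 / (t_fraction * u_substitution_rate)


if __name__ == "__main__":
    # Illustrative values: 25% T content, increasing substitution rates.
    for rate in (0.005, 0.01, 0.02):
        spacing = mean_gap_spacing(0.25, rate)
        print(f"substitution rate {rate}: about {spacing:.0f} bases between gaps")
```

Doubling the substitution rate halves the expected spacing in this model, which matches the statement that more dUTP gives shorter fragments. The final double-stranded fragment size also depends on how far nick translation proceeds before nicks on opposite strands meet, which this sketch does not attempt to model.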
Treatment of the gapped nucleic acid (105) with a polymerase that has exonuclease activity results in "translation" or "translocation" of the nicks along the length of the nucleic acid until nicks on opposite strands converge, thereby creating double-strand breaks and producing a population of double-stranded fragments of relatively homogeneous size (107). The exonuclease activity of the polymerase (for example, Taq polymerase) excises the short stretch of DNA strand that abuts the nick, while the polymerase activity fills in the nick and the subsequent nucleotides in that strand. In essence, the Taq polymerase moves along the strand, excising bases with its exonuclease activity and adding the same bases back, with the result that the nick is translocated along the strand until the enzyme reaches the end.
Because the size distribution of the double-stranded fragments (107) is determined by the ratio of dTTP to dUTP or UTP used in the MDA reaction, rather than by the duration or degree of enzymatic treatment, the CoRE fragmentation method yields highly reproducible fragmentation. CoRE fragmentation thus produces a population of double-stranded nucleic acid fragments that are all of similar size.
IIC. Long fragment read technology
The long fragment read (LFR) methods of the invention are based on the physical separation of long genomic DNA fragments across many different aliquots, such that for any given region of the genome the probability that both the maternal and the paternal component are present in the same aliquot is very low. By placing a unique identifier in each aliquot and analyzing many aliquots in the aggregate, long fragments of DNA can be assembled into a diploid genome, for example to provide the sequence of each parental chromosome, which is a significant advantage over the prior art. Although the discussion herein focuses on the use of LFR methods with DNB arrays and sequencing by ligation, it will be appreciated that these LFR methods can be used with a variety of other arrays and other sequencing methods to sequence a diploid genome as two separate haploid genomes. This can facilitate, for example, the identification of familial genetic diseases.
It will be appreciated that, by providing the ability to distinguish calls from the two sets of chromosomes in a diploid sample, LFR allows higher-confidence calling of variant and non-variant positions at low coverage. Additional applications of LFR include the resolution of extensive rearrangements in cancer genomes and full-length sequencing of alternatively spliced transcripts.
To achieve an appropriate separation of fragments, the DNA is generally diluted to a concentration of approximately 10% of a haploid genome per aliquot (FIG. 2C). At this concentration, 95% of the base pairs in an aliquot are non-overlapping. Such a dilution achieves a statistical separation in which maternal and paternal fragments usually land in different aliquots (FIG. 2C, second panel). It should be appreciated that the dilution factor can depend on the original size of the fragments. That is, using gentle techniques to isolate genomic DNA, fragments of roughly 100 kb can be obtained, which are then aliquoted. Techniques that yield larger fragments require fewer aliquots, whereas techniques that yield shorter fragments may require greater dilution.
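The 95% figure is consistent with a simple Poisson model, sketched below in Python. The sketch assumes (this is not stated in the source) that fragments land in an aliquot independently and uniformly across the genome, so that the number of fragments covering a given position is Poisson-distributed with a mean equal to the haploid genome equivalents per aliquot. The function name is hypothetical.

```python
import math


def fraction_non_overlapping(genome_equivalents_per_aliquot: float) -> float:
    """Among positions present in an aliquot, the fraction covered exactly once.

    Assumes Poisson-distributed coverage with mean equal to the haploid
    genome equivalents per aliquot (an illustrative modelling assumption).
    """
    lam = genome_equivalents_per_aliquot
    if lam <= 0:
        raise ValueError("genome equivalents per aliquot must be positive")
    p_exactly_one = lam * math.exp(-lam)
    p_at_least_one = 1.0 - math.exp(-lam)
    return p_exactly_one / p_at_least_one


if __name__ == "__main__":
    for lam in (0.05, 0.10, 0.50):
        print(f"{lam:.2f} genome equivalents per aliquot: "
              f"{fraction_non_overlapping(lam):.1%} non-overlapping")
```

At 0.10 genome equivalents per aliquot the model gives roughly 95%, and the fraction falls as more DNA is placed in each aliquot, which is why the dilution level matters for separating maternal and paternal fragments.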
In some embodiments, the fragments in each aliquot are amplified, and in other embodiments the fragments in each aliquot are further fragmented and then tagged with an adapter, so that the fragments from the same aliquot all comprise the same tag adapter; see, for example, US 2007/0072208, which is incorporated herein by reference in its entirety, and in particular for its discussion of further aliquoting and coverage.
In many embodiments, each aliquot is contained in a separate well of a multiwell plate (for example, a 384-well plate). It will be appreciated that, although the following discussion of LFR refers to multiwell plates, any number of different types of containers and systems can be used to hold the different aliquots generated in this method. Such containers and systems are known in the art, and those skilled in the art will readily appreciate which types of containers and systems are suitable for this aspect of the invention.
As discussed above, long fragments of genomic nucleic acid can be isolated from cells by a number of different methods. In one embodiment, the cells are lysed and the intact nuclei are pelleted by a gentle centrifugation step. The genomic DNA is then released by proteinase K and RNase digestion for several hours. In some embodiments, the material can then be treated to lower the concentration of remaining cellular waste. Such treatments are well known in the art and include, without limitation, dialysis for a period of time (i.e., 2-16 hours) and/or dilution. Because such methods of isolating nucleic acid do not involve many disruptive processes (such as ethanol precipitation, centrifugation and vortexing), the genomic nucleic acid remains largely intact, and most of the resulting fragments are longer than 150 kb. In some embodiments, the fragments are about 100 to about 750 kb in length. In other embodiments, the fragments are about 150 to about 600, about 200 to about 500, about 250 to about 400, or about 300 to about 350 kb in length.
An example of an LFR method is illustrated in FIG. 2. LFR usually begins with a short treatment of genomic nucleic acid, usually genomic DNA, with a 5' exonuclease to create 3' single-stranded overhangs. These single-stranded overhangs can serve as MDA initiation sites (FIG. 2). Use of the exonuclease also eliminates the need for a heat or alkaline denaturation step before amplification, without introducing bias into the population of fragments. In some embodiments, alkaline denaturation is combined with the 5' exonuclease treatment, which reduces bias to a greater degree than either treatment alone.
The DNA treated with the 5' exonuclease and, optionally, alkaline denaturation is then diluted to sub-genome concentrations and dispersed across a number of aliquots, usually across a number of wells in a multiwell plate. In some embodiments, a 10% genome equivalent is aliquoted into each well of a multiwell plate. If a 384-well plate is used, a 10% genome equivalent in each well results in a total of about 38 genomes per plate. In other embodiments, a 5-50% genome equivalent is aliquoted into each well. As noted above, the number of aliquots and genome equivalents can depend on the original fragment size.
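The plate totals quoted above follow from simple multiplication, shown here as a minimal Python sketch. The function name is hypothetical; the well count and the per-well fractions are the values given in the text.

```python
def plate_genome_equivalents(wells: int, fraction_per_well: float) -> float:
    """Total genome equivalents on a plate with the same amount in every well."""
    if wells <= 0 or fraction_per_well <= 0:
        raise ValueError("wells and fraction_per_well must be positive")
    return wells * fraction_per_well


if __name__ == "__main__":
    # 384-well plate at the 5%, 10% and 50% per-well loadings from the text.
    for fraction in (0.05, 0.10, 0.50):
        total = plate_genome_equivalents(384, fraction)
        print(f"{fraction:.0%} per well: about {total:.1f} genome equivalents per plate")
```

At 10% per well, a 384-well plate holds 384 x 0.10 = 38.4 genome equivalents, which the text rounds to 38.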
After separation across multiple wells, the fragments in each well are amplified, usually using an MDA method. In certain embodiments, the MDA reaction is a modified Phi29 polymerase-based amplification reaction. Although much of the discussion herein is in terms of an MDA reaction, those of skill in the art will appreciate that many different kinds of amplification reactions can be used in accordance with the present invention; such amplification reactions are well known in the art and are described generally in Maniatis et al., Molecular Cloning: A Laboratory Manual, 2d Edition, 1989, and Short Protocols in Molecular Biology, ed. Ausubel, et al. (incorporated herein by reference).
In some embodiments, the MDA reaction is designed to introduce uracils into the amplification products. In some embodiments, a standard MDA reaction utilizing random hexamers is used to amplify the fragments in each well. In many embodiments, random 8-mer primers are used rather than random hexamers in order to reduce amplification bias in the population of fragments. In further embodiments, several different enzymes can also be added to the MDA reaction to reduce amplification bias. For example, low concentrations of non-processive 5' exonucleases and/or single-stranded binding proteins can be used to create binding sites for the 8-mers. Chemical agents such as betaine, DMSO, or trehalose can also be used to reduce bias through similar mechanisms.
After amplification of the fragments in each well, in many embodiments the amplification products are subjected to another round of fragmentation. In some embodiments, the CoRE method described above is used to further fragment the fragments in each well following amplification. As discussed above, in order to use the CoRE method, the MDA reaction used to amplify the fragments in each well is designed to incorporate uracils into the MDA products. Fragmentation of the MDA products can also be achieved by sonication or enzymatic treatment. Enzymatic treatments that can be used in this embodiment include, without limitation, treatment with DNase I, T7 endonuclease I, micrococcal nuclease, and the like.
In some embodiments, when the CoRE method is used to fragment the MDA products, each well containing MDA products is treated with a mixture of uracil DNA glycosylase (UDG), DNA glycosylase-lyase endonuclease VIII, and T4 polynucleotide kinase to excise the uracil bases and create single-base gaps with functional 5' phosphate and 3' hydroxyl groups. Nick translation with a polymerase such as Taq polymerase then produces double-stranded blunt-end breaks, yielding ligatable fragments whose size range depends on the concentration of dUTP added to the MDA reaction. In some embodiments, the CoRE method used involves removal of the uracils by polymerization and strand displacement by phi29 (see Fig. 4).
In general, after fragmentation of the MDA products, the ends of the resulting fragments are repaired. Such repair can be necessary because many fragmentation techniques produce termini with overhanging ends, as well as termini with functional groups that are not useful in later ligation reactions, such as 3' and 5' hydroxyl groups and/or 3' and 5' phosphate groups. In many aspects of the present invention, it is useful to have fragments that are repaired to have blunt ends, and in some cases it can be desirable to alter the chemistry of the termini such that the correct orientation of phosphate and hydroxyl groups is not present, thus preventing polymerization of the target sequences. The chemistry of the termini can be controlled using methods known in the art. For example, in some circumstances a phosphatase is used to eliminate all phosphate groups, such that all ends contain hydroxyl groups. Each end can then be selectively altered to allow ligation between the desired components. The fragment ends can then be activated, which in some embodiments is achieved by treatment with alkaline phosphatase. In many embodiments, the fragments are then tagged with an adapter. In general, these tag adapters can be used to identify fragments that come from the same well in the LFR method.
Fig. 3 provides a schematic of some embodiments of adapter designs for use as tags in the LFR method. Generally, the adapter is designed in two segments. One segment is common to all wells and is ligated directly to the fragments by blunt-end ligation, using methods described further herein. In the embodiment shown in Fig. 3, the "common" adapter is added as two adapter arms: one arm is blunt-end ligated to the 5' end of the fragment and the other arm is blunt-end ligated to the 3' end of the fragment. The second segment of the tagging adapter is a "barcode" segment that is unique to each well. This barcode is generally a unique sequence of nucleotides, and each fragment in a particular well is given the same barcode. Thus, when the tagged fragments from all the wells are recombined and sequenced together, fragments from the same well can be identified through identification of the barcode adapter. In the embodiment illustrated in Fig. 3, the barcode is ligated to the 5' end of the common adapter arm. The common adapter and the barcode adapter can be ligated to the fragment sequentially or simultaneously. As described in further detail herein, the ends of the common adapter and the barcode adapter can be modified such that each adapter segment ligates in the correct orientation and to the proper molecule. Such modifications prevent "polymerization" of the adapter segments by ensuring that the fragments cannot ligate to one another and that the adapter segments can ligate only in the illustrated orientation.
In further embodiments, a three-segment design is used for the adapters that tag the fragments in each well. This embodiment is similar to the barcode adapter design described above, except that the barcode adapter segment is split into two segments (see Fig. 3). This design allows for a wider range of possible barcodes, because combinatorial barcode adapter segments are generated by ligating different barcode segments together to form the full-length barcode segment. This combinatorial design provides a larger repertoire of possible barcode adapters while reducing the number of full-length barcode adapters that need to be generated.
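By way of illustration only, the combinatorial effect of splitting the barcode into two segments can be sketched as follows. The segment sequences and the function name are hypothetical placeholders chosen for the example; they are not sequences from this disclosure.

```python
from itertools import product


def combine_barcode_segments(first_segments, second_segments):
    """Join every first segment to every second segment to form full-length barcodes."""
    return [first + second for first, second in product(first_segments, second_segments)]


# Hypothetical barcode half-segments (illustrative only)
first_half = ["ACGT", "TGCA", "GATC", "CTAG"]
second_half = ["AAGG", "CCTT", "GGAA"]

full_barcodes = combine_barcode_segments(first_half, second_half)

# 4 + 3 = 7 synthesized segments give 4 * 3 = 12 distinct full-length barcodes
segments_synthesized = len(first_half) + len(second_half)
barcodes_available = len(set(full_barcodes))
```

In this sketch the number of segments that must be produced grows as the sum of the two sets, while the number of distinguishable barcodes grows as their product.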
After the fragments in each well are tagged, all of the fragments are combined to form a single population. These fragments can then be used to generate nucleic acid templates of the invention, as described in more detail below. The nucleic acid templates generated from these tagged fragments can be assigned to a particular well by means of the barcode tag adapter attached to each fragment.
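By way of illustration only, assigning pooled, tagged fragments back to their wells of origin can be sketched as follows. The sketch assumes, purely for simplicity, that the barcode is a fixed-length prefix of each read and that barcodes are matched exactly; the well names, barcodes, and reads are hypothetical.

```python
from collections import defaultdict


def group_reads_by_well(tagged_reads, barcode_to_well, barcode_length):
    """Assign each read to a well using its leading barcode (exact match assumed)."""
    wells = defaultdict(list)
    for read in tagged_reads:
        barcode = read[:barcode_length]
        insert = read[barcode_length:]
        well = barcode_to_well.get(barcode)
        if well is not None:  # reads with unrecognized barcodes are skipped
            wells[well].append(insert)
    return dict(wells)


# Hypothetical barcodes and pooled reads (illustrative only)
barcode_to_well = {"ACGT": "A1", "TGCA": "A2"}
pooled_reads = ["ACGTGGGCCC", "TGCATTTAAA", "ACGTCCCGGG", "NNNNAAAAAA"]

reads_by_well = group_reads_by_well(pooled_reads, barcode_to_well, barcode_length=4)
```

A practical implementation would likely also tolerate sequencing errors in the barcode; that refinement is omitted here.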
In some cases, the LFR method is used to analyze the genome of a small number of cells, including an individual cell. The process for isolating DNA in this situation is similar to the methods described above but is carried out in a smaller volume. Once the DNA is isolated, and before it is aliquoted into individual wells, the genomic DNA must be fragmented carefully to avoid loss of material, particularly loss of the sequences at the ends of each fragment, since loss of such material results in gaps in the final genome assembly. In some cases, sequence loss is avoided through the use of an infrequent nicking enzyme, which creates starting sites for a polymerase, such as phi29 polymerase, at distances of approximately 100 kb from each other. As the polymerase creates the new DNA strand, it displaces the old strand, with the end result that there are overlapping sequences near the sites of polymerase initiation (Fig. 4), so that very little sequence is lost. In further embodiments, the DNA can then be diluted and aliquoted into multiple wells following the methods described above. In some embodiments, controlled use of a 5' exonuclease (before or during the MDA reaction) can promote multiple replications of the original DNA from the single cell and thus minimize the propagation of early errors through the copying of copies.
It will be appreciated that the LFR methods described herein can be used to sequence diploid genomes using any sequencing method known in the art. In further embodiments, the LFR methods described herein can be used on any number of sequencing platforms, including, for example and without limitation, GeneChip (Affymetrix), CodeLink Bioarray (Amersham), Expression Array System (Applied Biosystems), SurePrint microarrays (Agilent), Sentrix LD BeadChip or Sentrix Array Matrix (Illumina), and Verigene (Nanosphere).
In some embodiments, the LFR methods described herein do not include multiple levels or tiers of fragmentation/aliquoting, as described in U.S. Patent Application No. 11/451,692, filed June 13, 2006, which is incorporated herein by reference in its entirety for all purposes and in particular for all teachings related to methods of fragmenting and aliquoting nucleic acids. That is, some embodiments use only a single round of aliquoting and allow the aliquots to be repooled for a single array, rather than using a separate array for each aliquot.
III. Nucleic acid templates of the invention
The present invention provides nucleic acid templates comprising target nucleic acids and multiple interspersed adapters. The nucleic acid template constructs are assembled by inserting adapters at a multiplicity of sites throughout each target nucleic acid. The interspersed adapters permit sequence information to be acquired from multiple sites in the target nucleic acid consecutively or simultaneously.
The term "target nucleic acid" refers to a nucleic acid of interest. In one aspect, target nucleic acids of the invention are genomic nucleic acids, although other target nucleic acids can be used, including mRNA (and corresponding cDNAs, etc.). Target nucleic acids include naturally occurring, genetically altered, or synthetically prepared nucleic acids (such as genomic DNA from a mammalian disease model). Target nucleic acids can be obtained from virtually any source and can be prepared using methods known in the art. For example, target nucleic acids can be directly isolated without amplification, or isolated by amplification using methods known in the art, including without limitation polymerase chain reaction (PCR), strand displacement amplification (SDA), multiple displacement amplification (MDA), rolling circle amplification (RCA), rolling circle replication (RCR), and other amplification methods. Target nucleic acids can also be obtained through cloning, including but not limited to cloning into vehicles such as plasmids, yeast, and bacterial artificial chromosomes.
In some aspects, the target nucleic acids comprise mRNAs or cDNAs. In certain embodiments, the target DNA is created using transcripts isolated from a biological sample. Isolated mRNA can be reverse transcribed into cDNAs using conventional techniques, again as described in Genome Analysis: A Laboratory Manual Series (Vols. I-IV) or Molecular Cloning: A Laboratory Manual.
Target nucleic acids can be single-stranded or double-stranded, as specified, or can contain portions of both double-stranded and single-stranded sequence. Depending on the application, the nucleic acids can be DNA (including genomic DNA and cDNA), RNA (including mRNA and rRNA), or a hybrid, where the nucleic acid contains any combination of deoxyribo- and ribonucleotides and any combination of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine, etc.
"Nucleic acid", "oligonucleotide", "polynucleotide", or grammatical equivalents used herein mean at least two nucleotides covalently linked together. A nucleic acid of the present invention will generally contain phosphodiester bonds, although in some cases, as outlined below (for example in the construction of primers and probes such as label probes), nucleic acid analogs are included that may have alternate backbones, comprising, for example, phosphoramide (Beaucage et al., Tetrahedron 49(10):1925 (1993) and references therein; Letsinger, J. Org. Chem. 35:3800 (1970); Sprinzl et al., Eur. J. Biochem. 81:579 (1977); Letsinger et al., Nucl. Acids Res. 14:3487 (1986); Sawai et al., Chem. Lett. 805 (1984); Letsinger et al., J. Am. Chem. Soc. 110:4470 (1988); and Pauwels et al., Chemica Scripta 26:141 (1986)), phosphorothioate (Mag et al., Nucleic Acids Res. 19:1437 (1991); and U.S. Patent No. 5,644,048), phosphorodithioate (Briu et al., J. Am. Chem. Soc. 111:2321 (1989)), O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press), and peptide nucleic acid (also referred to herein as "PNA") backbones and linkages (see Egholm, J. Am. Chem. Soc. 114:1895 (1992); Meier et al., Chem. Int. Ed. Engl. 31:1008 (1992); Nielsen, Nature 365:566 (1993); Carlsson et al., Nature 380:207 (1996), all of which are incorporated herein by reference). Other nucleic acid analogs include those with bicyclic structures, including locked nucleic acids (also referred to herein as "LNA"), Koshkin et al., J. Am. Chem. Soc. 120:13252-3 (1998); positive backbones (Denpcy et al., Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic backbones (U.S. Patent Nos. 5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863; Kiedrowshi et al., Angew. Chem. Intl. Ed. English 30:423 (1991); Letsinger et al., J. Am. Chem. Soc. 110:4470 (1988); Letsinger et al., Nucleoside & Nucleotide 13:1597 (1994); Chapters 2 and 3, ASC Symposium Series 580, "Carbohydrate Modifications in Antisense Research", Ed. Y. S. Sanghui and P. Dan Cook; Mesmaeker et al., Bioorganic & Medicinal Chem. Lett. 4:395 (1994); Jeffs et al., J. Biomolecular NMR 34:17 (1994); Tetrahedron Lett. 37:743 (1996)); and non-ribose backbones, including those described in U.S. Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ASC Symposium Series 580, "Carbohydrate Modifications in Antisense Research", Ed. Y. S. Sanghui and P. Dan Cook. Nucleic acids containing one or more carbocyclic sugars are also included within the definition of nucleic acids (see Jenkins et al., Chem. Soc. Rev. (1995), pp. 169-176). Several nucleic acid analogs are described in Rawls, C & E News, Jun. 2, 1997, page 35. "Locked nucleic acids" (LNA™) are also included within the definition of nucleic acid analogs. LNAs are a class of nucleic acid analogs in which the ribose ring is "locked" by a methylene bridge connecting the 2'-O atom with the 4'-C atom. All of these references are expressly incorporated herein by reference for all purposes and in particular for all teachings related to nucleic acids. These modifications of the ribose-phosphate backbone can be made to increase the stability and half-life of such molecules in physiological environments. For example, PNA:DNA and LNA-DNA hybrids can exhibit higher stability and thus can be used in some embodiments.
The nucleic acid templates (also referred to herein as "nucleic acid constructs" and "library constructs") of the invention comprise target nucleic acids and adapters. As used herein, the term "adapter" refers to an oligonucleotide of known sequence. Adapters of use in the present invention can include a number of elements. The types and numbers of elements (also referred to herein as "features") included in an adapter depend on the intended use of the adapter. Adapters of use in the present invention will generally include, without limitation, sites for restriction endonuclease recognition and/or cutting (particularly Type IIs recognition sites, which, as described below, allow an endonuclease to bind a recognition site within the adapter and cut outside the adapter); sites for primer binding (for amplifying the nucleic acid constructs) or anchor primer (sometimes also referred to herein as "anchor probe") binding (for sequencing the target nucleic acids in the nucleic acid constructs); nickase sites; and the like. In some embodiments, adapters comprise a single recognition site for a restriction endonuclease, whereas in other embodiments adapters comprise two or more recognition sites for one or more restriction endonucleases. As outlined herein, the recognition sites are frequently (but not exclusively) found at the termini of the adapters, to allow cleavage of the double-stranded constructs at the farthest possible position from the end of the adapter.
In some embodiments, adapters of the invention are from about 10 to about 250 nucleotides in length, depending on the number and size of the features included in the adapters. In certain embodiments, adapters of the invention are about 50 nucleotides in length. In further embodiments, adapters of use in the present invention are from about 20 to about 225, from about 30 to about 200, from about 40 to about 175, from about 50 to about 150, from about 60 to about 125, from about 70 to about 100, and from about 80 to about 90 nucleotides in length.
In further embodiments, adapters can optionally include elements such that they can be ligated to a target nucleic acid as two "arms". One or both of these arms can comprise an intact recognition site for a restriction endonuclease, or both arms can comprise part of a recognition site for a restriction endonuclease. In the latter case, circularization of a construct comprising a target nucleic acid bounded at each terminus by an adapter arm reconstitutes the entire recognition site.
In still further embodiments, adapters of use in the invention comprise different anchor binding sites at their 5' and 3' ends. As described further herein, such anchor binding sites can be used in sequencing applications, including the combinatorial probe anchor ligation (cPAL) method of sequencing described herein and in U.S. Patent Application Nos. 60/992,485; 61/026,337; 61/035,914; 61/061,134; 61/116,193; 61/102,586; 12/265,593; 12/266,385; 11/938,106; 11/938,096; 11/982,467; 11/981,804; 11/981,797; 11/981,793; 11/981,767; 11/981,761; 11/981,730; 11/981,685; 11/981,661; 11/981,607; 11/981,605; 11/927,388; 11/927,356; 11/679,124; 11/541,225; 10/547,214; and 11/451,691, all of which are incorporated herein by reference, in particular for their disclosure relating to sequencing by ligation.
In one aspect, adapters of the invention are interspersed adapters. By "interspersed adapters" is meant herein oligonucleotides that are inserted at spaced locations within the interior region of a target nucleic acid. In one aspect, "interior" in reference to a target nucleic acid means a site internal to the target nucleic acid prior to processing, such as circularization and cleavage, that may introduce sequence inversions or like transformations, which disrupt the ordering of nucleotides within the target nucleic acid.
The nucleic acid template constructs of the invention contain multiple interspersed adapters inserted into a target nucleic acid in a particular orientation. As discussed further herein, the target nucleic acids are produced from nucleic acids isolated from one or more cells, including one to several million cells. These nucleic acids are then fragmented using mechanical or enzymatic methods.
The target nucleic acid that becomes part of a nucleic acid template construct of the invention can have interspersed adapters inserted at intervals within a contiguous region of the target nucleic acid at predetermined positions. The intervals may or may not be equal. In some aspects, the spacing between interspersed adapters is known only to an accuracy of one to a few nucleotides. In other aspects, the spacing of the adapters is known, and the orientation of each adapter relative to the other adapters in the library constructs is known. That is, in many embodiments the adapters are inserted at known distances, such that the target sequence on one terminus is contiguous in the naturally occurring genomic sequence with the target sequence on the other terminus. For example, for a Type IIs restriction endonuclease that cuts 16 bases from its recognition site, where 3 of those bases lie within the adapter, the endonuclease cuts 13 bases from the end of the adapter. Upon insertion of a second adapter, the target sequence "upstream" of the adapter and the target sequence "downstream" of the adapter are actually contiguous sequences in the original target sequence. These "paired" sequences extend the number of bases that can be read contiguously from the construct and are particularly useful for reading through repeat elements in the genome.
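By way of illustration only, the arithmetic in the example above can be sketched as follows. The function name and parameter names are hypothetical; the sketch simply subtracts the bases of the enzyme's reach that lie inside the adapter from its total reach.

```python
def cut_distance_from_adapter_end(cut_reach: int, bases_inside_adapter: int) -> int:
    """Bases of target sequence between the adapter end and the cut point.

    cut_reach: bases between the recognition site and the cut point.
    bases_inside_adapter: how many of those bases lie within the adapter.
    Both names are illustrative and not taken from the source.
    """
    if not 0 <= bases_inside_adapter <= cut_reach:
        raise ValueError("bases_inside_adapter must be between 0 and cut_reach")
    return cut_reach - bases_inside_adapter


# The example from the text: a 16-base reach with 3 bases inside the adapter
distance = cut_distance_from_adapter_end(16, 3)
```

With the values given in the text, the cut falls 13 bases into the target sequence, which fixes where the next adapter is inserted.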
Although the embodiments of the invention described herein are generally described in terms of circular nucleic acid template constructs, it will be appreciated that the nucleic acid template constructs can also be linear. Furthermore, nucleic acid template constructs of the invention can be single- or double-stranded, with the latter being preferred in some embodiments.
The present invention provides nucleic acid templates comprising target nucleic acids that contain one or more interspersed adapters. In a further embodiment, nucleic acid templates formed from a plurality of genomic fragments can be used to create a library of nucleic acid templates. Such libraries will in some embodiments encompass target nucleic acids that together cover all or part of an entire genome. That is, by using a sufficient number of starting genomes (e.g., cells), combined with random fragmentation, the resulting target nucleic acids of a particular size that are used to create the circular templates of the invention effectively "cover" the genome, although it will be appreciated that on occasion bias may be introduced inadvertently and prevent the entire genome from being represented.
The nucleic acid template constructs of the invention comprise multiple interspersed adapters, and in some aspects these interspersed adapters comprise one or more recognition sites for restriction endonucleases. In a further aspect, the adapters comprise recognition sites for Type IIs endonucleases. Type IIs endonucleases are generally commercially available and are well known in the art. Like their Type II counterparts, Type IIs endonucleases recognize specific sequences of nucleotide base pairs within a double-stranded polynucleotide sequence. Upon recognizing that sequence, the endonuclease cleaves the polynucleotide sequence, generally leaving an overhang of one strand of the sequence, or "sticky end". Type IIs endonucleases also generally cleave outside of their recognition sites; the distance can be anywhere from about 2 to 30 nucleotides away from the recognition site, depending on the particular endonuclease. Some Type IIs endonucleases are "exact cutters" that cut a known number of bases away from their recognition sites. In some embodiments, the Type IIs endonucleases used are not "exact cutters" but rather cut within a particular range (e.g., 6 to 8 nucleotides). Generally, the Type IIs restriction endonucleases of use in the present invention have cleavage sites that are separated from their recognition sites by at least six nucleotides (i.e., the number of nucleotides between the end of the recognition site and the closest cleavage point). Exemplary Type IIs restriction endonucleases include, but are not limited to, Eco57M I, Mme I, Acu I, Bpm I, BceA I, Bbv I, BciV I, BpuE I, BseM II, BseR I, Bsg I, BsmF I, BtgZ I, Eci I, EcoP15 I, Fok I, Hga I, Hph I, Mbo II, Mnl I, SfaN I, TspDT I, TspDW I, Taq II, and the like. In some exemplary embodiments, the Type IIs restriction endonucleases used in the present invention are Acu I, which has a cut length of about 16 bases and leaves a 2-base 3' overhang, and EcoP15, which has a cut length of about 25 bases and leaves a 2-base 5' overhang. As will be discussed further below, the inclusion of a Type IIs site in the adapters of the nucleic acid template constructs of the invention provides a tool for inserting multiple adapters into a target nucleic acid at defined locations.
It will be appreciated that adapters can also comprise other elements, including recognition sites for other (non-Type IIs) restriction endonucleases, primer binding sites for amplification, and binding sites for probes used in sequencing reactions ("anchor probes"), as described further herein.
In one aspect, adapters of use in the invention have sequences as shown in Fig. 5. As indicated in the schematic of one adapter in Fig. 5, adapters can comprise multiple functional features, including recognition sites for Type IIs restriction endonucleases (503 and 506), sites for nicking endonucleases (504), and sequences that can influence secondary characteristics, such as bases that disrupt hairpins (501 and 502). Adapters of use in the invention can additionally contain palindromic sequences, which, as discussed in more detail below, can promote intramolecular binding once nucleic acid templates comprising such adapters are used to generate concatemers.
IV. Preparing nucleic acid templates of the invention
IVA. Overview of the generation of circular templates
The present invention is directed to compositions and methods for nucleic acid identification and detection. As described herein, the identification and detection of nucleic acids finds use in a wide variety of applications, including a variety of sequencing and genotyping applications. The methods described herein allow the construction of circular nucleic acid templates for use in amplification reactions that utilize such circular templates to create concatemers of the monomeric circular templates, forming the "DNA nanoballs" described below, which find use in a variety of sequencing and genotyping applications. The circular or linear constructs of the invention comprise target nucleic acid sequences, generally fragments of genomic DNA (although, as described herein, other templates such as cDNA can be used), with interspersed exogenous nucleic acid adapters. The present invention provides methods for producing nucleic acid template constructs in which each subsequent adapter is added at a defined position and, optionally, in a defined orientation relative to one or more previously inserted adapters. These nucleic acid template constructs are generally circular nucleic acids (although in certain embodiments the constructs can be linear) that include target nucleic acids with multiple interspersed adapters. These adapters, as described below, are exogenous sequences used in sequencing and genotyping applications, and usually contain a restriction endonuclease site, particularly for enzymes such as Type IIs enzymes that cut outside of their recognition site. For ease of analysis, the reactions of the invention preferably utilize embodiments in which the adapters are inserted in particular orientations rather than randomly. The invention thus provides methods for making nucleic acid constructs that contain multiple adapters in particular orientations and separated by defined distances.
In nucleic acid template constructs comprising multiple adapters, at least one of the adapters is inserted between contiguous nucleotides of the target nucleic acid, such that reads taken from each end of these inserted (also referred to herein as "interspersed") adapters yield reads of contiguous bases. For example, reading 10 bases from each end of one interspersed adapter provides a read of 20 contiguous bases of the target nucleic acid.
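By way of illustration only, the way reads from both ends of an interspersed adapter combine into one contiguous target read can be sketched as follows. The target and adapter sequences are hypothetical placeholders, and the sketch assumes the adapter sequence occurs exactly once in the construct.

```python
def flanking_reads(construct: str, adapter: str, read_length: int):
    """Return the bases read outward from each end of an inserted adapter."""
    start = construct.index(adapter)  # assumes a single occurrence of the adapter
    end = start + len(adapter)
    upstream = construct[max(0, start - read_length):start]
    downstream = construct[end:end + read_length]
    return upstream, downstream


# Hypothetical 40-base target and 12-base adapter (illustrative only)
target = "ATGCATTAGCCGTAACGTTAGCATCGATTACGGATCCTAA"
adapter = "GGGGGGCCCCCC"
insert_at = 20

construct = target[:insert_at] + adapter + target[insert_at:]
upstream, downstream = flanking_reads(construct, adapter, read_length=10)

# Joining the two 10-base reads recovers 20 contiguous bases of the target
contiguous = upstream + downstream
```

Because the adapter sits between adjacent target nucleotides, the joined reads equal the corresponding 20-base window of the original target.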
Controlling the spacing and orientation of insertion of each subsequent adapter provides a number of advantages over random insertion of interspersed adapters. In particular, the methods described herein improve the efficiency of the adapter insertion process, thus reducing the need to introduce amplification steps as each subsequent adapter is inserted. In addition, controlling the spacing and orientation of each added adapter ensures that the restriction endonuclease recognition sites generally included in each adapter are positioned so that subsequent cleavage and ligation steps occur at the proper point in the nucleic acid construct, thus further increasing the efficiency of the process by reducing or eliminating the formation of nucleic acid templates that have adapters in an improper location or orientation. Control over the location and orientation of each subsequently added adapter can also be beneficial to certain uses of the resulting nucleic acid construct, because the adapters serve a variety of functions in sequencing applications, including serving as a reference point of known sequence that aids in identifying the relative spatial location of bases identified at particular positions within the target nucleic acid. Such uses of adapters in sequencing applications are described further herein.
Genomic nucleic acid, generally double-stranded DNA (601 in Fig. 6), is obtained from a plurality of cells, generally from about 10 up to 100 up to 1000 or more cells. The use of a plurality of cells gives the ultimate DNA nanoballs a level of redundancy sufficient for good sequencing coverage of the genome. As described herein, the genomic nucleic acid is fractionated into appropriate sizes using standard techniques, such as physical or enzymatic fractionation combined with size fractionation.
The 5' and 3' ends of the double-stranded fragments can optionally be adjusted, as described herein. For example, many techniques used to fractionate nucleic acids produce a combination of fragment ends of differing lengths and chemistries. The termini may contain overhangs, for instance, and for many purposes blunt-ended double-stranded fragments are preferred. This can be achieved using known techniques, such as a polymerase and dNTPs. Similarly, fractionation techniques may also yield a variety of termini, such as 3' and 5' hydroxyl groups and/or 3' and 5' phosphate groups. In some embodiments, as described below, it is desirable to alter these termini enzymatically. For example, to prevent the ligation of multiple fragments without adapters, it can be desirable to alter the chemistry of the termini such that the correct orientation of phosphate and hydroxyl groups is not present, thus preventing "polymerization" of the target sequences. The chemistry of the termini can be controlled using methods known in the art. For example, in some circumstances a phosphatase is used to eliminate all phosphate groups, such that all ends contain hydroxyl groups. Each end can then be selectively altered to allow ligation between the desired components.
In addition, amplification can optionally be performed as needed, using any of a wide variety of known techniques, to increase the number of genomic fragments for further manipulation, although in many embodiments amplification is not needed at this step.
After fractionation and optional end adjustment, a set of adapter "arms" is added to the ends of the genomic fragments. The two adapter arms, when ligated together, form the first adapter. For example, as depicted in Fig. 6, circularization (605) of a linear construct that has an adapter arm at each end ligates the two arms together to form the full adapter (606) as well as the circular construct (607). Thus, a first adapter arm (603) of the first adapter is added to one end of the genomic fragment, and a second adapter arm (604) of the first adapter is added to the other end of the genomic fragment. Generally, and as more fully described below, either or both of the adapter arms will include a recognition site for a Type IIs endonuclease, depending on the desired system. Alternatively, the adapter arms can each contain a partial recognition site that is reconstituted when the arms are ligated.
In order to ligate subsequent adapters in the desired position and orientation for sequencing, the present invention provides a method in which a Type IIs restriction endonuclease binds to a recognition site within the first adapter of the circular nucleic acid construct and then cleaves at a point outside the first adapter, within the genomic fragment (also referred to herein as the "target nucleic acid"). A second adapter can then be ligated at the point where cleavage occurred (again, usually by adding two adapter arms of the second adapter). To cleave the target nucleic acid at a known point, it may be desirable to block any other recognition sites for the same enzyme that happen to be present in the target nucleic acid, so that the only site the restriction endonuclease can bind is within the first adapter, thereby avoiding undesired cleavage of the construct. In general, the recognition site in the first adapter is first protected from inactivation, and any other unprotected recognition sites in the construct are then inactivated, usually by methylation. That is, a methylated recognition site will not bind the enzyme, and thus no cleavage will occur. Only the unmethylated recognition site within the adapter will bind the enzyme and subsequently be cleaved.
One method of protecting the recognition site in the first adapter from inactivation is to make the site single-stranded, because the methylase will not bind single-stranded DNA. Thus, one way to protect the recognition site in the first adapter is to amplify the linear genomic fragments ligated to the two first adapter arms using uracil-modified primers. The primers are complementary to the adapter arms and are modified with uracil, so that upon amplification (usually by PCR) the resulting linear constructs contain uracil embedded in the recognition site of one of the first adapter arms. Excising the uracil using known techniques renders that first adapter arm (or any fragment containing uracil) single-stranded. A sequence-specific methylase is then applied to the linear constructs; this enzyme methylates all double-stranded recognition sites for the same endonuclease whose site is contained in the first adapter. The sequence-specific methylase cannot methylate the single-stranded recognition site in the first adapter arm, so the recognition site in the first adapter arm is protected from inactivation by methylation. As described below, a restriction site that has been methylated will not be cleaved by the restriction endonuclease.
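The protection logic described above can be sketched as a small state model. This is only an illustrative sketch, not part of the described method: the class `Site`, the function names and the example site labels are all hypothetical, and the chemistry is reduced to two flags per recognition site.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str                      # hypothetical label for a recognition site
    single_stranded: bool = False  # True after uracil excision exposes the site
    methylated: bool = False       # True once the methylase has acted on it


def degrade_uracil(sites, protected_name):
    """Uracil excision leaves the named adapter site single-stranded."""
    for s in sites:
        if s.name == protected_name:
            s.single_stranded = True


def methylate(sites):
    """The sequence-specific methylase acts only on double-stranded sites."""
    for s in sites:
        if not s.single_stranded:
            s.methylated = True


def circularize(sites):
    """Circularization restores the double-stranded form of the adapter site."""
    for s in sites:
        s.single_stranded = False


def cleavable(sites):
    """Only unmethylated, double-stranded sites can be cut."""
    return [s.name for s in sites if not s.methylated and not s.single_stranded]


sites = [Site("adapter_arm_1"), Site("genomic_a"), Site("genomic_b")]
degrade_uracil(sites, "adapter_arm_1")
methylate(sites)
circularize(sites)
print(cleavable(sites))  # ['adapter_arm_1']
```

In this toy model the order of operations matters in the same way as in the text: methylating before the adapter site is made single-stranded would inactivate every site, including the one in the adapter.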
As described more fully below, in some cases a single adapter may contain two identical recognition sites, allowing cleavage both "upstream" and "downstream" of the same adapter. In this embodiment, as depicted in Fig. 7, the primers and uracil positions are chosen appropriately so that either the "upstream" or the "downstream" recognition site can be selectively protected from inactivation or inactivated. For example, in Fig. 7 the two different adapter arms (represented as rectangles) each contain a recognition site for a restriction endonuclease (represented by a circle in one adapter arm and by a triangle in the other). If the adapter arm with the recognition site represented by the circle needs to be protected using the uracil degradation method described above, the uracil-modified amplification primers are designed to introduce uracil into that recognition site. Upon uracil degradation, that adapter arm becomes single-stranded (represented by the half rectangle), thereby protecting the recognition site from inactivation.
After the recognition site in the first adapter arm has been protected from methylation, the linear construct is circularized, for example using a bridging oligonucleotide and T4 ligase. Circularization restores the double-stranded restriction endonuclease recognition site in the first adapter arm. In some embodiments the bridging oligonucleotide has a blocked end, so that the bridging oligonucleotide mediates circularization but the blocked end is not ligated, leaving a nick near the recognition site. This nick can be exploited further, as discussed below. Application of the restriction endonuclease produces a second linear construct that comprises the first adapter in the interior of the target nucleic acid and has ends comprising (depending on the enzyme) a two-base overhang.
A second set of adapter arms, for the second adapter, is ligated to the second linear construct. In some cases, when a nick is used, the nick in the first adapter is "translated" (or "shifted") using a polymerase with exonuclease activity in order to ensure that the adapters are ligated in the proper orientation. The exonuclease activity of the polymerase (for example Taq polymerase) excises the short DNA strand adjacent to the nick, while the polymerase activity "fills in" the nick and the subsequent nucleotides of that strand. In essence, Taq moves along the strand, excising bases with its exonuclease activity and adding back the same bases, with the result that the nick is displaced along the strand until the enzyme reaches the end of the strand.
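The displacement of the nick described above can be illustrated with a short simulation. This is a minimal sketch under simplifying assumptions: the strand is a plain string, the function name `nick_translate` and the example sequence are hypothetical, and each step removes one base downstream of the nick and re-adds the identical base on the upstream side.

```python
def nick_translate(strand, nick, steps=None):
    """Move a nick in the 5'->3' direction along `strand`.

    `nick` is the index of the first base downstream of the nick. Each step
    removes that base (exonuclease activity) and adds the identical base to
    the upstream side (polymerase activity), so the sequence is unchanged and
    only the nick position advances. With steps=None the nick runs to the end
    of the strand.
    """
    if not 0 < nick <= len(strand):
        raise ValueError("nick must lie within the strand")
    limit = len(strand) if steps is None else min(len(strand), nick + steps)
    upstream, downstream = strand[:nick], strand[nick:]
    while len(upstream) < limit:
        base = downstream[0]        # base excised by the exonuclease
        downstream = downstream[1:]
        upstream += base            # identical base added by the polymerase
    return upstream + downstream, len(upstream)


sequence, position = nick_translate("ACGTACGT", nick=3, steps=2)
print(sequence, position)  # ACGTACGT 5
```

The returned sequence is always identical to the input, which mirrors the point made in the text: nick translation relocates the nick without altering the bases of the strand.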
In addition, to create asymmetry in the template, one end of the construct is modified with a single base. For example, some polymerases (for example Taq) perform untemplated nucleotide addition, adding a single nucleotide to the 3' end of a blunt-ended DNA duplex and producing a 3' overhang. Those skilled in the art will appreciate that any base can be added, depending on the dNTP concentrations in solution. In particular embodiments, the polymerase used can add only a single nucleotide. For example, Taq polymerase can add a single G or A. Other polymerases can also be used to add other nucleotides and produce an overhang. In one embodiment, an excess of dGTP is used, resulting in the untemplated addition of a guanine to the 3' end of one strand. This "G-tail" on the 3' end of the second linear construct makes the ends asymmetric, so that the construct can be ligated to a second adapter arm bearing a C-tail, which allows the second adapter arm to anneal to the 3' end of the second linear construct. The adapter intended for ligation to the 5' end bears a C-tail positioned so that it ligates with the 5' G-tail. After ligation of the second adapter arms, the construct is circularized to produce a second circular construct comprising two adapters. The second adapter usually contains a recognition site for a Type IIs endonuclease; this recognition site may be the same as or different from the recognition site contained in the first adapter, the latter case having many applications.
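The reason single-base tailing blocks unwanted ligation can be shown by enumerating which pairs of ends carry complementary tails. The following is a hedged sketch: the names `tails_compatible` and `allowed_pairs`, and the dictionary keys, are illustrative only, and annealing is reduced to single-base complementarity.

```python
from itertools import combinations_with_replacement

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def tails_compatible(tail_a, tail_b):
    """Two single-base overhangs can anneal only if they are complementary."""
    return COMPLEMENT[tail_a] == tail_b


def allowed_pairs(tails):
    """Return every pair of molecule types whose overhangs can anneal.

    `tails` maps a molecule type to its single-base overhang. Pairs of a
    type with itself are included so that self-ligation is tested as well.
    """
    return [
        (a, b)
        for a, b in combinations_with_replacement(tails, 2)
        if tails_compatible(tails[a], tails[b])
    ]


# G-tailed linear construct and C-tailed adapter arm, as in the embodiment above.
print(allowed_pairs({"construct": "G", "adapter_arm": "C"}))
# [('construct', 'adapter_arm')]
```

Construct-to-construct and arm-to-arm pairs are absent from the result, which is the asymmetry the tailing step is meant to create.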
A third adapter can be inserted on the other side of the first adapter by cleaving with a restriction endonuclease that binds the recognition site in the second arm of the first adapter (the recognition site that was originally inactivated by methylation). To make this recognition site available, the circular construct is amplified using uracil-modified primers complementary to the second recognition site in the first adapter, producing a third linear construct in which the first adapter comprises uracil embedded in the second restriction recognition site. Degrading the uracil renders the first adapter single-stranded, which protects the recognition site in the adapter from methylation. A sequence-specific methylase is then applied to inactivate all unprotected recognition sites. Upon circularization, the recognition site in the first adapter is restored, and applying the restriction endonuclease cleaves the circle, producing in the third linear construct a site at which the third adapter can be inserted. Ligation of the third adapter arms to the third linear construct follows the same general procedure described above: the third linear construct is A- or G-tailed and the third adapter arms are T- or C-tailed, so that the adapter arms can anneal to the third linear construct and be ligated. The linear construct comprising the third adapter arms is then circularized to form a third circular construct. As with the second adapter, the third adapter will usually comprise a restriction endonuclease recognition site different from the recognition site contained in the first adapter.
A fourth adapter can be added using Type IIs restriction endonucleases that have recognition sites in the second and third adapters. Cleavage by these restriction endonucleases produces a fourth linear construct, which is then ligated to the fourth adapter arms. Circularization of the fourth linear construct ligated to the fourth adapter arms produces the nucleic acid template construct of the invention. As will be appreciated by those skilled in the art, additional adapters can also be added. The methods described herein thus allow two or more adapters to be added in an orientation-dependent, and sometimes distance-dependent, manner.
The present invention also provides methods for controlling the orientation in which each subsequently added adapter is inserted. Such "nick translation" methods provide a way to control how target nucleic acids and adapters ligate to each other. By preventing adapters from ligating to other adapters and target nucleic acid molecules from ligating to other target nucleic acid molecules (in essence, avoiding "polymerization" of adapters and of target nucleic acid molecules), these methods also prevent the formation of spurious nucleic acid constructs. Fig. 8 schematically illustrates examples of the different orientations in which adapters and target nucleic acid molecules can be ligated. Target nucleic acids 801 and 802 are preferably ligated to adapters 803 and 804 in the desired orientation (as shown in the figure, the desired orientation is the one in which ends with the same shape, circle or square, are ligated to each other). Modifying the ends of the molecules avoids the undesired configurations 807, 808, 809 and 810, in which target nucleic acids ligate to each other and adapters ligate to each other. In addition, as discussed in detail below, the orientation of each adapter-target nucleic acid ligation can be controlled by controlling the chemistry of the ends of the adapter and the target nucleic acid. The chemistry of the ends can be controlled using methods known in the art. For example, in some cases a phosphatase is used to remove all phosphate groups, so that all ends contain hydroxyl groups. Each end can then be selectively altered to allow ligation between the desired components. These and other methods of end modification and of controlling adapter insertion in the nick translation methods of the invention are described in more detail below.
In other embodiments, adapter orientation can be controlled using alternative methods that select for templates with adapters in the correct orientation, including selective hybridization, selective amplification, and combinations of adapter nicking and amplification. Such methods are described, for example, in WO2008/070375, filed November 2, 2007, and in U.S. Application Nos. 11/934,695, 11/934,697 and 11/934,703, each filed November 2, 2007, each of which is incorporated herein by reference in its entirety, and in particular for all teachings relating to selecting nucleic acid template constructs having adapters inserted in the desired orientation.
These nucleic acid template constructs ("monomers" comprising a target sequence interspersed with these adapters) can then be used to generate concatemers, which in turn can form nucleic acid nanoballs for downstream applications such as sequencing and detection of specific target sequences.
The present invention provides methods of forming nucleic acid template constructs that comprise multiple interspersed adapters inserted into a target nucleic acid. As discussed further herein, the methods of the invention allow each subsequent adapter to be inserted by using the recognition sites for Type IIs restriction endonucleases contained in the adapters. To insert multiple adapters in the desired order and/or orientation, it may be necessary to block the restriction endonuclease recognition sites contained in the target nucleic acid, so that only the recognition sites in the adapters are available for binding by the enzyme and subsequent cleavage. One advantage of such methods is that the same restriction endonuclease site can be used in each adapter, which simplifies production of the circular templates eventually used to generate concatemers. Adapter insertion can use a previously inserted adapter as a "stepping stone" for the next adapter, so that each new adapter is added by "walking" along the length of the fragment. Controlling which recognition sites are available to the restriction enzyme also avoids excision of certain sequences, which would yield only limited sequence representation (as could happen if sites within the target nucleic acid were accessible).
IVB. Adding the first adapter
As a first step in producing the nucleic acid templates of the invention, a first adapter is ligated to the target nucleic acid. The entire first adapter can be added to one end, or two portions of the first adapter, referred to herein as "adapter arms", can be ligated to the two ends of the target nucleic acid respectively. The first adapter arms are designed so that ligation reconstitutes the complete first adapter. As described in detail above, the first adapter usually comprises one or more recognition sites for a Type IIs restriction endonuclease. In some embodiments, the Type IIs restriction endonuclease recognition site is split between the two adapter arms, so that the site is available for binding by the restriction endonuclease only after the two adapter arms have been ligated.
Fig. 6 is a schematic representation of one aspect of a method for assembling adapter/target nucleic acid templates (also referred to herein as "target library constructs", "library constructs" and all grammatical equivalents). DNA, for example genomic DNA 601, is isolated using standard techniques as described above and fragmented into target nucleic acids 602. The fragmented target nucleic acids 602 are then repaired so that the 5' and 3' ends of each strand are flush or blunt-ended. Following this reaction, a polymerase lacking proofreading activity is used to add a single A to the 3' end of each strand of the fragmented target nucleic acids, so that each fragment is "A-tailed". A-tailing is usually accomplished by using a polymerase (for example Taq polymerase) and providing only adenosine nucleotides, so that the polymerase is forced to add one or more A's to the ends of the target nucleic acid in a template-sequence-independent manner.
In the exemplary method shown in Fig. 6, a first arm (603) and a second arm (604) of the first adapter are ligated to each target nucleic acid, producing a target nucleic acid with an adapter arm ligated to each end. In one embodiment, the adapter arms are "T-tailed" so as to be complementary to the A-tails of the target nucleic acid. This facilitates ligation of the adapter arms to the target nucleic acid by providing a way for the adapter arms first to anneal to the target nucleic acid, after which a ligase joins the adapter arms to the target nucleic acid.
In further embodiments, the invention provides ways of ligating adapters to each fragment that minimize the creation of intramolecular or intermolecular ligation artifacts. This is useful because random fragments of target nucleic acid that ligate to one another would create false proximal genomic relationships between target nucleic acid fragments and complicate the sequence alignment process. Using A- and T-tailing to attach the adapters to the DNA fragments prevents random intramolecular or intermolecular association of adapters and fragments, which reduces the artifacts that self-ligation (adapter-adapter or fragment-fragment ligation) would create.
As alternatives to A/T tailing (or G/C tailing), various other methods can be used to prevent the formation of ligation artifacts between target nucleic acids and adapters and to orient the adapter arms relative to the target nucleic acid. These include using complementary NN overhangs on the target nucleic acid and the adapter arms, or performing blunt-end ligation at an appropriate ratio of target nucleic acid to adapter so as to optimize the proportion of single-fragment nucleic acid/adapter arm ligation products.
After the linear construct comprising a target nucleic acid with an adapter arm on each end has been produced, the linear target nucleic acid is circularized (605) (a process discussed in more detail herein), yielding a circular construct 607 comprising the target nucleic acid and an adapter. Note that circularization brings the first and second arms of the first adapter together to form a contiguous first adapter (606) in the circular construct. In some embodiments, the circular construct 607 is amplified, for example by circle-dependent amplification using, for example, random hexamers and Φ29 or a helicase. Alternatively, the target nucleic acid/adapter structure can remain linear and be amplified by PCR primed from sites in the adapter arms. Amplification is preferably a controlled process that uses a high-fidelity, proofreading polymerase and yields a sequence-accurate library of amplified target nucleic acid/adapter constructs in which the genome, or the one or more portions of the genome, being queried is sufficiently represented.
IVC. Adding multiple adapters
Fig. 6 is a schematic representation of one aspect of a method for assembling adapter/target nucleic acid templates (also referred to herein as "target library constructs", "library constructs" and all grammatical equivalents). DNA, for example genomic DNA 601, is isolated using standard techniques and fragmented into target nucleic acids 602. In some embodiments (as described herein), the fragmented target nucleic acids 602 are then repaired so that the 5' and 3' ends of each strand are flush or blunt-ended.
In the exemplary method shown in Fig. 6, a first arm (603) and a second arm (604) of the first adapter are ligated to each target nucleic acid, producing a target nucleic acid with an adapter arm ligated to each end.
After the linear construct comprising a target nucleic acid with an adapter arm on each end has been produced, the linear target nucleic acid is circularized (605) (a process discussed in more detail herein), yielding a circular construct 607 comprising the target nucleic acid and an adapter. Note that circularization brings the first and second arms of the first adapter together to form a contiguous first adapter (606) in the circular construct. In some embodiments, the circular construct 607 is amplified, for example by circle-dependent amplification using, for example, random hexamers and Φ29 or a helicase. Alternatively, the target nucleic acid/adapter structure can remain linear and be amplified by PCR primed from sites in the adapter arms. Amplification is preferably a controlled process that uses a high-fidelity, proofreading polymerase and yields a sequence-accurate library of amplified target nucleic acid/adapter constructs in which the genome, or the one or more portions of the genome, being queried is sufficiently represented.
In a process similar to that used to add the first adapter, a second set of adapter arms (610) and (611) can be added to each end of the linear molecule and then ligated (612) to form the complete adapter (614) and a circular molecule (613). Likewise, a third adapter can be added to the other side of adapter (609) by using a Type IIs endonuclease that cleaves on the other side of adapter (609) and then ligating a third set of adapter arms (617) and (618) to each end of the linearized molecule. Finally, a fourth adapter is added by cleaving the circular construct once more and adding a fourth set of adapter arms to the linearized molecule. The embodiment depicted in Fig. 6 is a method in which Type IIs endonucleases with recognition sites in adapters (620) and (614) are used to cleave the circular construct. The recognition sites in adapters (620) and (614) may be the same or different. Similarly, the recognition sites in all of the adapters shown in Fig. 6 may be the same or different.
As generally depicted in Fig. 9, a circular construct comprising a first adapter may contain two Type IIs restriction endonuclease recognition sites within that adapter, positioned so that the target nucleic acid is cleaved (910) outside the recognition sequence (and outside the adapter). The arrows around structure 510 indicate the recognition sites and the sites of restriction. In process 911, the Type IIs restriction endonuclease EcoP15 is used to cleave the circular construct. Note that in the aspect shown in Fig. 9, a portion of each library construct that maps to a portion of the target nucleic acid is cut away from the construct (the portion of the target nucleic acid between the arrowheads in structure 910). Restriction of the library constructs with EcoP15 in process 911 produces a library of linear constructs containing the first adapter, in which the first adapter is in the interior of the ends of linear construct 912. The size of the resulting linear library constructs is determined by the distance between the endonuclease recognition site and the endonuclease restriction site, plus the size of the adapter. In process 913, the linear construct 912, like the fragmented target nucleic acid 904, is treated by conventional methods to become blunt or flush-ended, an A-tail comprising a single A is added to the 3' ends of the linear library constructs using a polymerase lacking proofreading activity, and the first and second arms of the second adapter are ligated to the ends of the linearized library constructs by A-T tailing and ligation 913. The resulting library construct has the structure seen at 914, in which the first adapter is in the interior of the ends of the linear construct and the target nucleic acid is flanked on one end by the first adapter and on the other end by either the first or the second arm of the second adapter.
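Because a Type IIs enzyme cuts at a fixed distance from its recognition site, the cut positions can be computed directly from the sequence. The sketch below is illustrative only. The function name `type_iis_cuts` and the toy sequence are hypothetical, only the forward strand is scanned, and the recognition sequence CAGCAG with 25/27 offsets is the commonly cited value for EcoP15I, which is treated here as an assumption because the text does not state it.

```python
def type_iis_cuts(seq, site, top_offset, bottom_offset):
    """Locate Type IIs cut positions downstream of each forward-strand site.

    Offsets are counted in bases from the 3' end of the recognition site.
    Each result is (top_cut, bottom_cut), given as the number of bases of
    `seq` lying upstream of the cut on the top and bottom strands.
    Simplification: sites on the reverse strand are not searched.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        end = start + len(site)
        top, bottom = end + top_offset, end + bottom_offset
        if max(top, bottom) <= len(seq):   # ignore cuts that fall off the end
            cuts.append((top, bottom))
        start = seq.find(site, start + 1)
    return cuts


# Hypothetical construct: 2 bases, the assumed EcoP15I site, then 30 bases of target.
construct = "TT" + "CAGCAG" + "A" * 30
cuts = type_iis_cuts(construct, "CAGCAG", top_offset=25, bottom_offset=27)
print(cuts)                       # [(33, 35)]
print(cuts[0][1] - cuts[0][0])    # 2, i.e. a two-base overhang
```

The fixed offsets are what make the size of the linear construct predictable: the retained target sequence is bounded by the recognition-to-restriction distance on each side of the adapter.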
In process 915, the double-stranded linear library constructs are treated so as to become single-stranded 916, and the single-stranded library constructs 916 are then ligated (917) to form single-stranded circles of target nucleic acid 918 interspersed with two adapters. The ligation/circularization process 917 is carried out under conditions that optimize intramolecular ligation. At certain concentrations and reaction conditions, local intramolecular ligation of the ends of each nucleic acid construct is favored over ligation between molecules.
In some embodiments, 2, 3, 4, 5, 6, 7, 8, 9 or 10 adapters are included in the nucleic acid templates of the invention, with each adapter independently selected so that the adapters are all the same, all different, or the same in sets (for example, two adapters with the same sequence, or two adapters with one shared sequence plus two adapters with a different shared sequence; as described herein, all combinations are possible). As discussed in more detail above, Fig. 6 is a schematic representation of a method of producing templates with four adapters. Figure 51 is a schematic representation of a six-adapter read structure, which increases the read length from 70 bases/DNB to 104 bases/DNB. In Figure 51, each arm of the DNB has two inserted adapters (Ad2+Ad3 and Ad4+Ad5), which support analysis of 13+13+26 bases per arm. All of the inserted adapters (Ad2-Ad5, inserted directionally) are introduced on automated instrumentation using the same IIS enzyme (for example AcuI; optionally using MmeI can increase the bases analyzed to 18+18+26 per arm, or 124 per DNB) by recursively applying the following steps: IIS cutting of the DNA circle, directional adapter ligation, PCR, USER enzyme digestion, selective methylation and DNA circularization. As described herein, any number of restriction endonucleases can be used, and they can be the same or different depending on the format of the system. The cycle time per adapter can be as low as 10 hours per batch of 96 libraries on an automated system, producing sufficient output to support multiple advanced sequencing instruments. In addition to extending cPAL read length, each directionally inserted adapter also significantly extends the read length of SBS or SBL.
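The read-length figures quoted above follow from simple addition over the two arms of the DNB. A brief check, using a hypothetical helper `bases_per_dnb` and assuming two arms per DNB as implied by the description of Figure 51:

```python
def bases_per_dnb(reads_per_arm, arms=2):
    """Total bases read per DNB, given the read segments available on each arm."""
    return arms * sum(reads_per_arm)


print(bases_per_dnb((13, 13, 26)))  # 104, the six-adapter structure read with AcuI
print(bases_per_dnb((18, 18, 26)))  # 124, the same structure read with MmeI
```

The two longer reach values per arm (18 instead of 13) account for the entire difference between the 104-base and 124-base figures.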
IVD. Controlling the orientation of ligation between target nucleic acid and adapter
In one aspect, as described above, the invention provides a method of ligating adapters to target nucleic acids in a desired orientation. Such control over orientation is useful because random fragments of target nucleic acid that ligate to one another would create false proximal genomic relationships between target nucleic acid fragments and complicate the sequence alignment process.
Several methods can be used to control the orientation of adapter insertion. As described above, the chemistry of the ends of the target nucleic acid and of the adapter can be altered so that ligation occurs only in the correct orientation. Alternatively, "nick translation methods", which also rely on the chemistry of the ends, can be used, as outlined below. Finally, methods involving amplification with specifically chosen primers can be used, as described below.
Figure 12 schematically illustrates the different orientations in which a second adapter can be added to a nucleic acid construct. Again, process 1200 begins with a circular library construct 1202 containing an inserted first adapter 1210. The first adapter 1210 has a specific orientation, in which the triangle represents the "outer strand" of the first adapter and the diamond represents the "inner strand" of the first adapter (Ad1 orientation 1210). The tail of arrow 1201 indicates the Type IIs restriction endonuclease site in the first adapter 1210, and the head of the arrow indicates the cleavage site. Process 1203 comprises cleaving with the Type IIs restriction endonuclease, ligating the first and second arms of the second adapter, and recircularizing. As can be seen from the resulting library constructs 1204 and 1206, the second adapter can be inserted in two different ways relative to the first adapter. In the desired orientation 1204, the oval is inserted into the outer strand with the triangle and the bow tie is inserted into the inner strand with the diamond (Ad2 orientation 1220). In the undesired orientation, the oval is inserted into the inner strand with the diamond and the bow tie is inserted into the outer strand with the triangle (Ad2 orientation 1230).
Although, for the sake of clarity, the following discussion and the schematic illustrations it refers to describe insertion of the second adapter relative to the first adapter, it will be appreciated that the processes discussed herein apply equally to adapters added after the second adapter, producing library constructs with three, four, five, six, seven, eight, nine, ten or more inserted adapters.
In one embodiment, A-tailing and T-tailing are used to attach the adapters to the nucleic acid fragments. For example, after the fragment ends have been repaired as described above, a polymerase lacking proofreading activity is used to add a single A to the 3' end of each strand of the fragmented target nucleic acid, so that each fragment is "A-tailed". A-tailing is usually accomplished by using a polymerase (for example Taq polymerase) and providing only adenosine nucleotides (or an excess of adenosine nucleotides), so that the polymerase is forced to add one or more A's to the ends of the target nucleic acid in a template-sequence-independent manner. In embodiments that use A-tailing, ligation to the adapter (or adapter arms) is accomplished by adding a "T-tail" to the 5' end of the adapter/adapter arm so that it is complementary to the A-tail of the target nucleic acid. This facilitates ligation of the adapter arms to the target nucleic acid by providing a way for the adapter arms first to anneal to the target nucleic acid, after which a ligase joins the adapter arms to the target nucleic acid.
Because aspects of the invention work best when the nucleic acid templates are of a desired size and comprise target nucleic acid derived from a single fragment, it is useful to ensure that the circularization reactions throughout the template production process occur intramolecularly. In other words, it is useful to ensure that target nucleic acids do not ligate to one another while they are being ligated to the first, second, third and subsequent adapters. Figure 10 shows one embodiment of controlling the circularization process. As shown in Figure 10, blocking oligonucleotides 1017 and 1027 are used to block binding regions 1012 and 1022, respectively. Blocking oligonucleotide 1017 is complementary to binding sequence 1016, and blocking oligonucleotide 1027 is complementary to binding sequence 1026. In the schematic of the 5' adapter arm and the 3' adapter arm, the underlined bases are dideoxycytosine (ddC) and the bolded bases are phosphorylated. Blocking oligonucleotides 1017 and 1027 are not covalently bound to the adapter arms and can be "melted off" after the adapter arms have been ligated to the library construct and before circularization. In addition, the dideoxy nucleotide (here ddC, or alternatively another non-ligatable nucleotide) prevents ligation of the blocker to the adapter. Additionally or alternatively, in some aspects the blocking oligonucleotide-adapter arm hybrid contains a gap of one or more bases between the adapter arm and the blocker to reduce the likelihood that the blocker ligates to the adapter. In some aspects, the blocker/binding region hybrid has a Tm of approximately 37 °C, so that the blocker sequence melts off easily before ligation (circularization) of the adapter arms.
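A rough feel for how short a blocker with a Tm near 37 °C would be can be obtained from the Wallace rule, Tm ≈ 2(A+T) + 4(G+C) in °C. This is a coarse approximation for short oligonucleotides that ignores salt and oligonucleotide concentration, and it is not the method used in the source. The function name `wallace_tm` and the blocker sequence below are hypothetical and are not taken from Figure 10.

```python
def wallace_tm(oligo):
    """Estimate melting temperature (deg C) of a short oligo by the Wallace rule.

    Tm = 2 * (number of A and T) + 4 * (number of G and C).
    Coarse approximation only; buffer composition is ignored.
    """
    oligo = oligo.upper()
    if set(oligo) - set("ACGT"):
        raise ValueError("oligo must contain only A, C, G and T")
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


# Hypothetical 13-base blocker sequence, chosen only to illustrate the estimate.
print(wallace_tm("ATGCATTAGCATG"))  # 36
```

Under this approximation, a blocker of roughly a dozen bases lands in the neighborhood of 37 °C, which is consistent with a sequence that stays annealed during arm ligation and melts off with mild warming.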
IVD(i). Controlling the orientation of ligation: arm-by-arm ligation
In one aspect, the directionality of adapter insertion can be controlled using an "arm-by-arm" ligation method, without modifying the ends of the target nucleic acid. In general, this is a two-step ligation process in which an adapter arm is added to the target nucleic acid and primer extension with strand displacement produces two double-stranded molecules, each with an adapter arm on one end; a second adapter arm can then be ligated to the end that lacks an adapter arm. This process prevents the formation of nucleic acid molecules with the same adapter arm on both ends. For example, as shown in Figure 11A, the arm-by-arm ligation process prevents the formation of nucleic acid molecules in which both ends are occupied by adapter A or by adapter B. In many embodiments it is preferred that each end of the target nucleic acid is ligated to a different adapter arm, so that the two arms can form a complete adapter when joined. This is particularly useful for minimizing the number of amplification steps needed after each adapter arm is added, because arm-by-arm ligation reduces the number of non-useful molecules produced in each ligation reaction.
Figure 11 shows one embodiment of the arm-by-arm ligation method. In this embodiment, one strand of the first adapter arm A is added to both strands of a dephosphorylated target nucleic acid. One end of this adapter arm is blocked (shown as a closed circle), generally by using alkaline phosphatase. Primer exchange can be used to replace the strand with the blocked end. Primer extension with strand displacement (which, in one exemplary embodiment, can be accomplished with phi29 or Pfu polymerase) primes from both ends and extends through the entire insert, producing two double-stranded nucleic acid molecules, each with adapter arm A on one end and a blunt end on the other. In an alternative embodiment, adapter arm A can be pre-hybridized with a primer upstream of the blocked strand to initiate primer extension without a primer exchange reaction. After the strand-displacing polymerase reaction, a second adapter arm B can be ligated, generally to the blunt end of the target nucleic acid rather than to the end carrying the adapter arm. This arm-by-arm ligation process can prevent the formation of target nucleic acids that carry the same adapter arm on both ends.
IVD (ii). Controlling the direction of ligation: nick translation methods
In one embodiment, the invention provides "nick translation methods" for constructing nucleic acid molecules. In one embodiment, nick translation methods are used to ligate nucleic acid molecules in a desired orientation. In another embodiment, nick translation methods are used to insert adapters in a desired orientation. These methods generally involve modifying one or both termini of one or both of the nucleic acid molecules to be ligated together. For example, when an adapter is ligated to a target nucleic acid, one or both termini of the target nucleic acid, the adapter, or both are modified. After such modification, the "translocation" or "translation" of a nick introduced into one strand of the construct makes it possible to control the final orientation of the ligated adapter-target nucleic acid construct. As described in more detail below, the "nick translation methods" described herein may also include primer extension or gap fill-in methods. Although the following discussion is framed in terms of controlling the ligation of adapters to target nucleic acids, it will be appreciated that these methods are not limited to that case and can also be used to control the ligation of any two nucleic acid molecules. For example, nick translation methods and any other controlled ligation method described herein can be used as part of genetic and/or DNA engineering methods, such as construction of new plasmids or other DNA vectors, gene or genome synthesis or modification, and construction of building blocks for nanotechnology constructs.
Figure 13 schematically illustrates a process of this "nick translation" type. Construct 1306 in Figure 13 is formed by the methods discussed herein and contains an interspersed adapter 1304 with a restriction endonuclease recognition site (tail of the arrow in Figure 13) and a cleavage site. In Figure 14 the library construct is not circularized but is a branched concatemer of alternating target nucleic acid fragments 1406 (containing restriction endonuclease recognition sites 1404) and adapters 1412; however, the nick translation type of process shown in Figure 13 can also be performed on a library construct of that configuration. The term "library construct" as used herein refers to a nucleic acid construct comprising one or more adapters and is interchangeable with the term "nucleic acid template".
The library construct with the first inserted adapter is digested with a restriction endonuclease (process 1301), in some aspects a Type IIs restriction endonuclease, that cuts the target nucleic acid to produce 3' nucleotide overhangs 1308. In Figure 11, two nucleotides (NN-3') 1308 are shown, but in other aspects the number of overhanging nucleotides varies, depending at least in part on the restriction endonuclease used. Library construct 1310 is linearized, with the first inserted adapter shown as 1304. The first inserted adapter 1304 is engineered either to comprise a nick 1312 at the boundary of the adapter fragment or to comprise the recognition site of a nicking endonuclease, which allows a nick 1314 to be introduced inside the adapter. In either case, library construct 1310 is treated (1303) with a polymerase 1316 that can extend the upper strand of library construct 1310 from nick 1312 or 1314 to the end of the strand, forming a construct with a 3' overhang at one end and a blunt end at the other. In process 1305 a second adapter 1318 is ligated to this library construct 1310; the second adapter 1318 has a degenerate nucleotide overhang at one end and a single 3' nucleotide (for example dT) overhang at the other, and the ligation forms library construct 1320. Library construct 1320 is then treated in process 1307 (for example with Taq polymerase) to add a 3' dA to its blunt end. Library construct 1322 can then be amplified by PCR, for example with uracil-containing primers. Alternatively, library construct 1322 can be circularized in process 1309, in which case CDA can be performed (for example step 1421 in Figure 14). Combining the processes discussed here with the nick translation type of process shown in Figure 13 makes it possible to select both the relative position and the relative orientation of subsequently added adapters with respect to any adapter previously inserted into the library construct.
To use a nick translation type of procedure, it may be useful to modify one or both termini of the target nucleic acid and/or the adapter as discussed above. In one exemplary embodiment, the first arm of an adapter that is intended to ligate to the 3' end of the target nucleic acid can be designed with a blocked 3' end, so that only the 5' end of the adapter arm is available to ligate to the 3' end of the target nucleic acid. Similarly, the second arm, which is intended to ligate to the 5' end of the target nucleic acid, can be designed with a blocked 5' end, so that only the 3' end of the second arm can ligate to the 5' end of the target nucleic acid. Methods for blocking one end of an adapter arm and/or a target nucleic acid are known in the art. For example, the target nucleic acid (also referred to herein as a "nucleic acid insert", "DNA insert" or "insert") is treated with enzymes that generate defined functional ends and remove the phosphates from both the 3' and 5' ends, as discussed above. Removing the phosphate groups leaves the target nucleic acid molecules unable to ligate to one another. The adapters in this embodiment are also designed to have one strand that can be ligated (for example by creating or retaining a 5' phosphate group) and a complementary strand whose 3' end is protected from ligation. Generally, this protection is achieved by inactivating the 3' end with a dideoxy nucleotide. Thus, when the modified target nucleic acid lacks phosphate groups at both ends and the modified adapter carries a phosphate group on one 5' end and a 3' block (for example a dideoxy nucleotide) on the complementary strand, the only ligation product that can form is target nucleic acid joined to the phosphorylated 5' end of the adapter. After this ligation step, the strand carrying the protected 3' end of the adapter can be exchanged for a strand with a functional 3' end. This exchange generally takes advantage of the fact that the 3'-protected strand is usually short and easily denatured. The exchange strand with the functional 3' end is longer and therefore binds the complementary strand more efficiently; in other embodiments, the strand with the functional end is also added at a higher concentration to further drive the reaction toward replacing the protected strand with the strand carrying the functional end. The strand with the functional 3' end is then primed by adding a DNA polymerase with nick translation activity, so that the polymerase exonucleolytically removes bases from the 5' end of the target nucleic acid and thereby exposes a functional 5' phosphate. This newly generated 5' phosphate can be ligated to the extension product by a ligase. (If no ligase is present during extension, two polymerase molecules will translate nicks from each end of the target nucleic acid until they meet, producing a broken molecule.) For example, as shown in Figure 2, the target nucleic acid (insert) is first end-repaired to form defined functional ends, preferably blunt ends. Next, to prevent the inserts from forming concatemers, the 5'-end phosphates are removed. The insert is then mixed with DNA ligase and DNA adapters. The DNA adapter consists of two oligonucleotides and has one blunt end and one sticky end when the two oligonucleotides are hybridized. The blunt-end side comprises a "top strand" with a protected/inactivated 3' end and a "bottom strand" with a functional 5'-end phosphate, so the adapters also cannot self-ligate. The only possible ligation combination is therefore one insert with a "bottom strand" blunt-end ligated to each of its ends. The 3'-protected "top strand" is then exchanged for an oligonucleotide with a functional 3' end, which can act as a primer in a polymerase extension reaction. Once polymerase and ligase are added, the second oligonucleotide can be built in through a nick translation and ligation reaction. As the polymerase extends into the insert, it introduces a nick with a functional 5' phosphate that can be recognized and sealed by DNA ligase. The resulting insert, which carries an adapter or adapter arm on each end of each strand, can then be subjected to PCR with adapter-specific primers.
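The end-chemistry argument above (dephosphorylated insert, adapter with a 5' phosphate on one strand and a blocked 3' end on the other) can be checked with a small model. This is a minimal sketch under a simplifying assumption: a ligase joins a strand only where a free 3' hydroxyl meets a 5' phosphate across a blunt junction. The class and function names are hypothetical and are not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BluntEnd:
    """The two strand termini exposed at one blunt duplex end."""
    name: str
    five_prime_phosphate: bool   # the 5' terminus carries a phosphate
    three_prime_hydroxyl: bool   # the 3' terminus is a free OH (not blocked)


def strand_joins(a: BluntEnd, b: BluntEnd) -> int:
    """Count the strands a ligase could seal when ends a and b abut."""
    joins = 0
    if a.three_prime_hydroxyl and b.five_prime_phosphate:
        joins += 1
    if b.three_prime_hydroxyl and a.five_prime_phosphate:
        joins += 1
    return joins


# Insert after phosphatase treatment: no 5' phosphate, free 3' OH.
insert_end = BluntEnd("insert", False, True)
# Adapter blunt end: bottom strand 5' phosphate, top strand 3' blocked (e.g. dideoxy).
adapter_end = BluntEnd("adapter", True, False)

pairs = [(insert_end, insert_end), (adapter_end, adapter_end), (insert_end, adapter_end)]
for a, b in pairs:
    print(f"{a.name} + {b.name}: {strand_joins(a, b)} strand(s) joined")
```

In this model, insert-insert and adapter-adapter junctions yield no joined strand, and the insert-adapter junction yields exactly one, which leaves the nick that the subsequent nick translation step acts on.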
Generally, in a nick translation reaction such as the one described above, active ligase is present in the mixture before the polymerase is added, or the ligase and polymerase are added to the mixture at the same time. In some embodiments it may be useful to use low-activity polymerase (slow nick translation) conditions. Both adding the ligase before or together with the polymerase and using low-activity conditions help ensure that the translating nick is sealed before it reaches the opposite end of the DNA fragment. In some embodiments this can be achieved by incubating Taq polymerase with T4 ligase at 37 °C, a temperature that usually gives low polymerase activity and high ligase activity. The reaction can then be incubated further at a higher temperature (for example 50-60 °C) to further ensure that nick translation-ligation goes to completion in most or all constructs in the reaction.
In other embodiments, the invention provides methods of forming nucleic acid template constructs that comprise multiple interspersed adapters. The methods of the invention include methods of inserting multiple adapters such that each subsequent adapter is inserted at a defined position relative to one or more previously added adapters. Certain methods of inserting multiple interspersed adapters are known in the art, for example as discussed in U.S. Patent Application Nos. 60/992,485, 61/026,337, 61/035,914, 61/061,134, 61/116,193, 61/102,586, 12/265,593, 12/266,385, 11/679,124, 11/981,761, 11/981,661, 11/981,605, 11/981,793 and 11/981,804, each of which is incorporated herein by reference in its entirety for all purposes, and in particular for all teachings relating to methods and compositions for producing nucleic acid templates comprising multiple interspersed adapters and to all methods of using such templates. Inserting known adapter sequences into a target sequence, so that a contiguous target sequence is interrupted by multiple interspersed adapters, provides the ability to sequence both "upstream" and "downstream" of each adapter and thereby increases the amount of sequence information that can be generated from each nucleic acid template. The invention provides further methods of inserting each subsequent adapter at a defined position relative to one or more previously added adapters.
Nick translation ligation is generally performed after the first strand has been ligated, by adding at least a polymerase to the reaction. In some embodiments the nick translation reaction can be performed as a one-step reaction in which all components are added at once, whereas in other embodiments the steps of the reaction are performed sequentially. The "one-step" approach to the nick translation reaction has several possible embodiments. For example, a single mixture containing the primer can be used, with Taq added at the start of the reaction. Use of a thermostable ligase makes it possible to perform primer exchange and nick translation ligation (and PCR, if needed) simply by raising the temperature. In another exemplary embodiment, the reaction mixture contains a minimal concentration of a non-processive nick-translating polymerase together with a weak 3' exonuclease that activates the 3'-blocked strand.
In other embodiments, T4 polynucleotide kinase (PNK) or alkaline phosphatase is used to alter the 3' ends of the adapters and/or target nucleic acids to prepare them for a nick translation process. For example, adapters can be inserted as part of a circularization reaction. End-repaired, alkaline phosphatase-treated target nucleic acids are ligated to adapters that, in this exemplary embodiment, are designed to form self-complementary hairpin-shaped units (Figure 16). The hairpins are designed to contain, at a given position, a modification that can be recognized and cleaved by an enzyme or a chemical. For example, if the hairpins contain deoxyuridines, the deoxyuridines can be recognized and cut by UDG/EndoVIII. After cutting, the two hairpins become single strands, each with a phosphate on its 3' end. These 3' phosphates can then be removed by T4 polynucleotide kinase (PNK) or alkaline phosphatase (SAP) to enable the nick translation methods described further herein. In an exemplary embodiment, such as the one illustrated in Figure 4A, the two hairpins are designed to be partly complementary to each other and can therefore form circular molecules by intramolecular hybridization. Finally, the circularized molecules are subjected to nick translation, in which the polymerase extends into the insert and introduces a nick with a functional 5'-end phosphate that can be recognized and sealed by DNA ligase.
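The cut-then-dephosphorylate sequence described above can be pictured with a toy model. This is a sketch under stated assumptions: each uracil is excised, the fragment upstream of each excised uracil is left with a 3' phosphate, and the original 3' terminus of the strand is taken to be a free hydroxyl. The strand sequence and all names are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Fragment:
    seq: str           # written 5' -> 3'
    three_prime: str   # "phosphate" or "OH"


def cut_at_uracil(strand: str) -> List[Fragment]:
    """Model UDG/EndoVIII treatment: remove each U and split the strand there."""
    pieces = strand.upper().split("U")
    fragments = []
    for i, piece in enumerate(pieces):
        if not piece:
            continue  # nothing remains between adjacent or terminal uracils
        is_last = i == len(pieces) - 1
        # Every fragment that ended at an excised uracil keeps a 3' phosphate.
        fragments.append(Fragment(piece, "OH" if is_last else "phosphate"))
    return fragments


def dephosphorylate(fragment: Fragment) -> Fragment:
    """Model PNK or phosphatase treatment: the 3' phosphate becomes a 3' OH."""
    return replace(fragment, three_prime="OH")


# Hypothetical uracil-containing adapter strand (not from the patent).
cut = cut_at_uracil("ACGTUGGCC")
print(cut)
print([dephosphorylate(f) for f in cut])
```

Only after the second step does the upstream fragment carry a 3' hydroxyl that a polymerase can extend, which is why the phosphatase or PNK treatment precedes nick translation.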
Instead of the hairpin structures described above, a pair of double-stranded adapters that are partly complementary to each other can be used for circularization. One adapter of the pair contains, on one strand, deoxyuridines that can be recognized and cut by UDG/EndoVIII. Other methods of nicking one strand can also be used, including but not limited to: nicking enzymes, incorporation of inosine-modified DNA that can be recognized by endonucleolytic enzymes, and incorporation of RNA modifications into the DNA that can be recognized by RNA endonucleases. The target nucleic acid and adapters can be prepared for controlled ligation as described above, for example by treating the target nucleic acid with alkaline phosphatase to produce blunt ends that cannot ligate to other target nucleic acids. Circularization is activated by denaturing the short 3'-protected strand of the adapter from the strand that is ligated to the target nucleic acid, leaving two partly complementary single-stranded ends, one at each end of the target nucleic acid insert. These ends are then joined by intramolecular hybridization and subjected to nick translation and ligation, forming a covalently closed circle. The circles are then treated with UDG/EndoVIII to prepare them for directional insertion of the next adapter.
In a further embodiment, shown in Figure 15, a linear target nucleic acid is treated with shrimp alkaline phosphatase (SAP) to remove the 5' phosphates. The target nucleic acid is then ligated to one arm of the adapter (arm A), which comprises a strand with a 5' phosphate and a shorter complementary strand with a protected 3' end. The ligation product is then subjected to nick translation. The nick generated in the circularization reaction lies on the top strand of the first adapter and serves as the primer for the polymerase used in the nick translation reaction. The polymerase extends the top strand to the nick at the adapter-insert junction, releasing one of the adapter A arms and producing a blunt end or an A or G overhang. The resulting polymerase-generated insert end is then ligated to the second adapter arm (arm B). By designing the first adapter so that a nick is generated in the circularization reaction, subsequent adapters can be added in a predetermined orientation. This strategy is applicable to all Type IIs restriction enzymes and to other enzymatic or non-enzymatic fragmentation methods, regardless of whether they produce digestion products with blunt ends, 3' overhangs or 5' overhangs. The subsequent primer exchange, extension, ligation and PCR are similar to those described for Figure 2. A no-amplification option can also be used to close the circle, which comprises melting off the blocked oligonucleotide and then circularizing the DNA by a nick translation ligation reaction.
Both proofreading polymerases (which have 3'-5' exonuclease activity), such as Pfu polymerase, and non-proofreading polymerases (which lack 3'-5' exonuclease activity), such as Taq, can be used in the nick translation and strand-displacement strand synthesis processes described herein. Proofreading polymerases can efficiently generate blunt ends during nick translation, but they have the disadvantage of also degrading unprotected 3' overhangs. The resulting nick translation product therefore has two blunt ends and cannot subsequently be ligated to adapters in a defined orientation. One solution is to protect the 3' end of the already ligated adapter (for example arm A in Figure 15) from degradation, for example with a dideoxyribonucleoside triphosphate (ddNTP) at the 3' end. However, ddNTP protection also prevents the 3' end from being extended later, which limits how far the adapters can be carried forward in a direct circularization procedure. Another potential solution is to protect the 3' end from polymerase degradation with a modification (for example a 3' phosphate) that can be removed before nick translation circularization (for example with alkaline phosphatase). A further approach is to use hairpin-shaped adapters in combination with a proofreading polymerase in the nick translation reaction. These adapters are protected from degradation, but they have the disadvantage of requiring an additional UDG/EndoVIII step. In addition, the inventors have found that one proofreading polymerase, Pfu polymerase, can efficiently generate blunt ends without degrading the unprotected 3' overhang, which suggests that it has a low 3'-5' exonuclease activity.
Non-proofreading polymerases, such as Taq polymerase, can generate both blunt ends and single-base overhangs during nick translation (in addition to blunt ends, Taq can generate non-templated A and G tails). An advantage of using polymerases without 3'-5' exonuclease activity in nick translation is that unprotected 3' overhangs remain intact. Subsequent adapters can therefore be ligated in a defined orientation without protecting the 3' overhang from degradation. A potential disadvantage of many non-proofreading polymerases is that they add single nucleotides to 3' ends in a template-independent process. This process is difficult to control and often produces a mixed population of 3' ends, resulting in a low adapter-to-insert ligation yield. In general, methods that use blunt-end ligation are more efficient than those that use single-base overhang ligation.
In one embodiment, after the first adapter has been ligated, a second adapter is added by a variation of the nick translation method rather than by forming a circle and then cutting with a Type IIs endonuclease whose recognition site lies in the first adapter (the latter being a step in some embodiments of producing the nucleic acid templates of the invention, for example the embodiments illustrated schematically in Figures 6 and 9). Exemplary embodiments of this variation are illustrated in Figure 17. Generally, as described in detail above and shown in Figures 6 and 9, these embodiments begin with addition of the first adapter to the target nucleic acid, followed by circularization. In the embodiment shown in Figure 17A, nick translation is performed with a polymerase that has 5'-3' exonuclease activity (for example Taq polymerase), which produces an inverted circle in which the first adapter lies in the interior of the target nucleic acid. This product can then be end-repaired and ligated to adapter 2 (using the methods described in detail above). One disadvantage of this embodiment is that the target nucleic acid may be longer than is needed for sequencing, and such longer templates may be prone to forming secondary structures in any nucleic acid concatemers produced from them (the production of concatemers from the nucleic acid templates of the invention is discussed in more detail below). When these concatemers are used in sequencing applications (for example the cPAL methods discussed below), such secondary structures may reduce the signal. One way to overcome this disadvantage is to shorten the target nucleic acid; Figure 17B depicts an exemplary embodiment of this approach. In this embodiment, the first adapter is modified with uracils by the methods described herein. After nick translation has inverted the circle comprising the first adapter, adapter arm C is added to both ends of the end-repaired molecule. The uracil-modified adapter 1 is treated to remove the uracils, creating gaps, and is also treated to generate activated 3' ends. Generally, the uracils are removed with a UDG/EndoVIII enzyme mixture, and PNK and/or alkaline phosphatase is used to remove the 3' phosphates and generate the activated 3' ends. The activated 3' ends of adapter 1 and the 3' ends of adapter arm C are recognized by a nick translation polymerase (a polymerase with 5'-3' exonuclease activity), yielding a product in which adapter 1 is surrounded by target nucleic acid that has been trimmed to about half of its original length. If adapter 1 is modified with additional nicking modifications (including but not limited to incorporation of inosine, RNA modifications and the like), this polymerase cutting procedure can be repeated to reduce the size of the target nucleic acid further.
In other embodiments, the nick translation methods shown in Figures 17A and B can be extended to insert multiple adapters. By modifying the adapters, nicks or gaps and functional 3' ends can be generated so that nick translation reactions are primed from multiple adapters simultaneously. A nucleic acid construct comprising a target nucleic acid and two adapters, each containing uracil modifications on one strand, is circularized. The circle is then treated with an enzyme mixture such as UDG/EndoVIII to remove the uracils and introduce gaps. These gaps can be nick-translated simultaneously to invert the circle, making the construct available for ligation to additional adapters. By adding multiple modifications to the same adapter, subsequent nicking/gapping and nick translation inversions can be carried out to introduce multiple adapters. In some embodiments, uracils can be added back at the same positions in the adapter, making the adapter suitable for further nick translation reactions. The uracils can be added back, for example, by incubating the nick translation reaction with uracil alone to "rebuild" the adapter modification and then adding unmodified nucleotides at a higher concentration to fill in the remainder of the construct.
In other embodiments, the target nucleic acid can be trimmed by controlling the speed of the nick translation enzyme. For example, the nick translation enzyme can be slowed by changing the temperature or limiting the reagents, which can result in two nicks in the circularized insert that have been shifted by nick translation from their original sites in the adapter. Similarly, use of a strand-displacing polymerase (for example phi29) shifts the nick and produces a branch point because a segment of the nucleic acid is displaced. These nicks or branch points can be recognized by a variety of enzymes (including but not limited to S1 endonuclease, Bal31, T7 endonuclease, mung bean endonuclease, and combinations of enzymes, for example a 5'-3' exonuclease such as T7 exonuclease together with S1 or mung bean endonuclease) that cut the strand opposite the nick, producing a linear product. This product can then be end-repaired (if needed) and ligated to the next adapter. The size of the remaining target nucleic acid is controlled by the speed of the nick translation reaction, for example by lowering the concentration of reagents such as dNTPs or by running the reaction at a less than optimal temperature. The size of the target nucleic acid can also be controlled by the incubation time of the nick translation reaction.
In other embodiments, nick translation methods can be used to form nucleic acid templates without passing through any circularization step. An exemplary embodiment of such methods is shown in Figure 18, in which a hairpin-shaped first adapter 1801 is ligated to target nucleic acid 1802 by the ligation methods described above, for example by treating the target nucleic acid with shrimp alkaline phosphatase to remove phosphate groups and thereby control which ends of the target nucleic acid are available for ligation to the first adapter. After the first adapter has been ligated, a controlled double-strand-specific 5'-3' exonuclease reaction is carried out to produce single-stranded 3' ends. In some embodiments the exonuclease reaction is performed with T7 exonuclease, although it will be appreciated that other double-strand-specific exonucleases can be used in these embodiments of the invention. In other embodiments, the exonuclease reaction produces single-stranded 3' ends of about 100 to about 3000 bases in length. In still other embodiments, the exonuclease reaction produces single-stranded 3' ends of about 150 to about 2500, about 200 to about 2000, about 250 to about 1500, about 300 to about 1000, about 350 to about 900, about 400 to about 800, about 450 to about 700, or about 500 to about 600 bases in length.
It will be appreciated that the nick translation processes described herein can be combined with any of the other methods of adding adapters described herein. For example, the arm-by-arm ligation process described above and illustrated in Figure 11A can be combined with a nick translation process to prepare constructs for PCR amplification.
In other embodiments, the adapter arm A used in the arm-by-arm ligation reaction can be designed for direct circularization without PCR, followed by nick translation ligation to seal the circle. In an exemplary embodiment, adapter arm A can be designed for direct circularization as depicted in Figure 11B. Segment 1101 is designed to be complementary to adapter arm B. The construct in Figure 11B allows direct primer extension by a strand-displacing polymerase (for example phi29) without a primer exchange reaction to remove the blocked end (the polymerase cannot extend past the 3' phosphate on segment 1102). This construct also provides a 3' overhang for circularization. Segment 1102 prevents adapter arm A from hybridizing to adapter arm B before circularization. In some embodiments, segment 1102 may not be needed to prevent hybridization to arm B (for example when adapter arm B is at a very high concentration), or segment 1102 may be part of the design of adapter arm B rather than adapter arm A.
After the single-stranded 3' ends have been produced, a second adapter 1803 is hybridized to the single-stranded 3' end of the target nucleic acid and joined to the first adapter by a nick translation ligation reaction (in one embodiment, the nick translation ligation is a "primer extension" or "gap fill-in" reaction). The second adapter has a 5' phosphate and a 3' block (indicated by vertical line 1804). In some embodiments the 3' block can be a removable block, for example a 3' phosphate, which in some exemplary embodiments can be removed with polynucleotide kinase (PNK) and/or shrimp alkaline phosphatase. In some embodiments the second adapter has degenerate bases at its 3' and/or 5' end. In some exemplary embodiments the second adapter has about 2-6 degenerate bases at the 5' end and 4-9 degenerate bases at the 3' end, although it will be appreciated that the invention encompasses any combination of numbers of degenerate bases at one or both ends of the second adapter. In the embodiment illustrated in Figure 18, the second adapter comprises 3 degenerate bases at its 5' end ("N3") and 7 degenerate bases at its 3' end ("N7"). In some embodiments the first adapter can be joined to the second adapter under reaction conditions that favor hybridization of the adapters to the target nucleic acid. In some exemplary embodiments, such reaction conditions can include temperatures from about 20 to about 40 °C. Polymerases that can be used under such reaction conditions include but are not limited to phi29, Klenow, T4 polymerase and Pol I.
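To give a sense of scale for the degenerate ends described above, the following sketch counts and enumerates the sequence variants implied by a run of N positions, assuming each N is one of the four standard bases. The adapter core sequence is hypothetical and is used only to show where the degenerate positions sit.

```python
from itertools import product

BASES = "ACGT"


def count_variants(n_degenerate: int) -> int:
    """Number of distinct sequences for a run of n fully degenerate positions."""
    return len(BASES) ** n_degenerate


def expand(template: str):
    """Expand every 'N' in the template into all four bases."""
    choices = [BASES if base == "N" else base for base in template]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical adapter core (an assumption, not a sequence from the patent).
core = "GATTACA"
five_prime_variants = expand("NNN" + core)

print(len(five_prime_variants))                # 64 variants for the N3 end
print(count_variants(7))                       # 16384 variants for the N7 end
print(count_variants(3) * count_variants(7))   # 1048576 for N3 and N7 combined
```

The far larger variant pool at the 3' end is consistent with the later statement that the number of degenerate bases, along with hybridization stringency, influences where the N7 end anneals.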
The ligation product 1805 is then denatured and/or further treated with a 5'-3' exonuclease, followed by a re-annealing step, to form two single-stranded nucleic acid molecules (indicated by "x2" in Figure 18). During re-annealing, the N7 portion of the second adapter can hybridize to a segment at a random distance from the first hybridized sequence motif, thereby forming a single-stranded loop 1806. In some embodiments, the N7 end of the second adapter may not hybridize until denaturation has produced long single-stranded nucleic acid regions 1807. In many embodiments the average distance between two captured genomic segments (which are generally about 20 to about 200 bases in length) is between about 0.5 and about 20 kb. This average distance depends in part on the number of degenerate bases ("Ns") in the adapter and on the stringency of the hybridization conditions. The re-annealing step can then be followed by another round of adapter hybridization and nick translation ligation. The final adapter (in Figure 18 the final adapter is shown as third adapter 1808, but it will be appreciated that the final adapter can be the fourth, fifth, sixth, seventh or higher adapter inserted by any of the methods described herein) is similar to the second adapter but in many embodiments lacks the degenerate bases at the 3' end. In other embodiments, the final adapter may comprise a binding site for an amplification primer (for example a PCR primer).
In further embodiments, an amplification reaction such as PCR (see 1809 in Figure 18) can be carried out, for example by using primer binding sites contained in the first and last adapters. In further embodiments the first and last adapters can be the two arms of the same adapter, and more than one adapter can be inserted before the last adapter is added. In further embodiments the amplification products can be used to form circular double-stranded nucleic acid molecules for further adapter insertion by any process described herein or known in the art.
IVD(iii). Controlled insertion of subsequent adapters: protection of restriction endonuclease recognition sites
In addition to controlling the orientation of adapters inserted into the target nucleic acid as described above, multiple adapters can also be inserted at specified positions in the target nucleic acid relative to previously inserted adapters. Such methods include embodiments in which certain restriction endonuclease recognition sites, particularly recognition sites contained in previously inserted adapters, are protected from inactivation. To ligate a subsequent adapter at the desired position and in the desired orientation, the invention provides methods in which a Type II restriction endonuclease binds a recognition site in the first adapter of a circular nucleic acid construct and then cleaves at a point outside the first adapter, within the genomic fragment (also called the "target nucleic acid" herein). A second adapter can then be ligated at the point where cleavage occurs (again, the second adapter is generally added as two adapter arms). To cleave the target nucleic acid at a known point, any other recognition sites for the same enzyme that may occur at random in the target nucleic acid must be blocked, so that the only site the restriction endonuclease can bind is in the first adapter, which avoids unwanted cleavage of the construct. In general, the recognition site in the first adapter is first protected from inactivation, and any other unprotected recognition sites in the construct are then inactivated, generally by methylation. As used herein, "inactivation" of a restriction endonuclease recognition site means that the site is rendered in some way unavailable for binding by the restriction endonuclease, which prevents the downstream cleavage step by that enzyme. For example, a methylated recognition site cannot be bound by the restriction endonuclease, and therefore no cleavage occurs. Once all unprotected recognition sites in the nucleic acid construct have been methylated, only the unmethylated recognition site in the adapter allows the enzyme to bind and subsequently cleave. Other methods of inactivating recognition sites include, but are not limited to, applying a methylase blocker to the recognition site, blocking the recognition site with a blocking oligonucleotide, blocking the recognition site with another blocking molecule such as a zinc finger protein, and nicking the recognition site to prevent methylation. Such methods of protecting desired recognition sites are described in U.S. Patent Application No. 12/265,593, filed November 5, 2008, and No. 12/266,385, filed November 6, 2008, both of which are incorporated herein by reference in their entirety for all purposes, and in particular for all teachings related to inserting multiple interspersed adapters into a target nucleic acid.
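The protect-then-inactivate logic can be sketched as a toy model. The code below locates every occurrence of a recognition motif in a linear representation of a construct, treats occurrences inside the adapter as protected, marks the rest as methylated, and reports which sites the enzyme could still bind. The motif `GATTACA`, the example sequences and all function names are hypothetical placeholders; real enzyme specificities, cut offsets and searching of both strands are deliberately left out.

```python
def find_sites(seq, motif):
    """Return start indices of every (possibly overlapping) occurrence of motif in seq."""
    hits, start = [], seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits

def bindable_sites(target, adapter, motif):
    """Toy model: adapter sites are protected; all other sites are methylated.

    The construct is written as target + adapter (a linear stand-in for the circle).
    Returns (bindable, methylated) lists of start indices in the construct.
    """
    construct = target + adapter
    adapter_start = len(target)
    bindable, methylated = [], []
    for pos in find_sites(construct, motif):
        if pos >= adapter_start:   # site lies wholly inside the adapter: protected
            bindable.append(pos)
        else:                      # unprotected site: inactivated by methylation
            methylated.append(pos)
    return bindable, methylated

if __name__ == "__main__":
    motif = "GATTACA"                     # hypothetical recognition motif
    target = "CCGATTACATTTTGGGATTACACC"   # two chance occurrences of the motif
    adapter = "AAGATTACAGG"               # one designed occurrence of the motif
    print(bindable_sites(target, adapter, motif))
```

In this example the two chance occurrences in the target are inactivated and only the adapter site remains available, so the position of the subsequent cut is determined by the adapter alone.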
It will be appreciated that the methods described above for controlling the orientation in which adapters and target nucleic acids are ligated to each other can also be used in combination with the methods described below for controlling the spacing of each subsequently added adapter.
In one aspect, the invention provides a method of protecting the recognition site in the first adapter from inactivation by rendering that recognition site single-stranded, so that a methylase that can only methylate double-stranded molecules cannot methylate the protected site. One way to render the recognition site in the first adapter single-stranded is to amplify the genomic fragment ligated to the two first-adapter arms with uracil-modified primers. The primers are complementary to the adapter arms and are modified with uracil, so that upon amplification (generally by PCR) the resulting linear construct contains uracils embedded in the recognition site of one adapter arm. In the PCR product generated by the primers, the uracils lie near the Type II restriction endonuclease recognition site in the first and/or second arm of the first adapter. Digestion of the uracils renders single-stranded the region of the adapter arm that contains the Type II recognition site to be protected. A sequence-specific methylase is then applied to the linear construct; this enzyme methylates all double-stranded recognition sites for the same endonuclease whose recognition site is contained in the first adapter. The sequence-specific methylase cannot methylate the single-stranded recognition site in the first adapter arm, so that site is protected from inactivation by methylation.
In some cases, as described more fully below, a single adapter can have two identical recognition sites to allow cleavage both "upstream" and "downstream" from the same adapter. In this embodiment, as illustrated in Figure 7, the primers and uracil positions are chosen appropriately so that either the "upstream" or the "downstream" recognition site is selectively protected from inactivation or is inactivated.
A third adapter can be inserted on the other side of the first adapter by cleaving with a restriction endonuclease that binds a recognition site in the second arm of the first adapter (a recognition site that was initially inactivated by methylation). To make this recognition site available, the circular construct is amplified with uracil-modified primers complementary to the second recognition site in the first adapter, producing a third linear construct in which the first adapter contains uracils embedded in the second restriction recognition site. Degradation of the uracils renders the first adapter single-stranded, which protects the recognition site in the adapter from methylation. A sequence-specific methylase is then applied to inactivate all unprotected recognition sites. Upon circularization the recognition site in the first adapter is reconstituted, and a restriction endonuclease is applied to cleave the circle, creating in the third linear construct a position at which the third adapter can be inserted. Ligation of the third adapter arms to the third linear construct follows the same general procedure described above: the third linear construct is A- or G-tailed, the third adapter arms are T- or C-tailed, and the adapter arms are annealed to the third linear construct and ligated. The linear construct comprising the third adapter arms is then circularized to form a third circular construct. As with the second adapter, the third adapter generally comprises a recognition site for a restriction endonuclease that differs from the recognition site contained in the first adapter.
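The tailing scheme relies on base complementarity between single-base overhangs. A minimal sketch, assuming one-base tails and standard Watson-Crick pairing only (the function name is illustrative and not part of this disclosure):

```python
# Watson-Crick partners for the four DNA bases
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def tails_compatible(construct_tail, arm_tail):
    """True if a one-base construct tail can pair with a one-base adapter-arm tail."""
    return PAIRS.get(construct_tail.upper()) == arm_tail.upper()

if __name__ == "__main__":
    # Check every combination of construct tail (A or G) and arm tail (T or C)
    for construct_tail in "AG":
        for arm_tail in "TC":
            print(construct_tail, arm_tail, tails_compatible(construct_tail, arm_tail))
```

Under this model an A-tailed construct pairs only with T-tailed arms and a G-tailed construct only with C-tailed arms.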
A fourth adapter can be added using Type II restriction endonucleases that have recognition sites in the second and third adapters. Cleavage with these restriction endonucleases produces a fourth linear construct, which is then ligated to fourth adapter arms. Circularization of the fourth linear construct ligated to the fourth adapter arms produces the nucleic acid template constructs of the invention.
In general, the methods of the invention provide a way to specifically protect a Type II endonuclease recognition site from inactivation, so that once all other unprotected recognition sites in the construct have been inactivated, adding the Type II endonuclease results in binding only at the protected site, which controls where the subsequent cleavage occurs in the construct. The method described above is one embodiment of how a desired recognition site can be protected from inactivation. It will be appreciated that the above method can be modified using techniques known in the art, and that such modified methods are also encompassed by the invention.
In one exemplary embodiment, the method of inserting each subsequent adapter uses a combination of methods to protect recognition sites from inactivation. In the embodiment illustrated in Figure 19, a second adapter is inserted at a desired position relative to a first adapter by a process that combines methylation with protection from methylation, using a combination of uracil degradation and a nickase. Figure 19 shows genomic DNA of interest 1902 having a Type II restriction endonuclease recognition site at 1904. The genomic DNA is fractionated or fragmented in process 1905 to produce fragment 1906 having Type II restriction endonuclease recognition site 1904. Adapter arms 1908 and 1910 are ligated to fragment 1906 in process 1907. In process 1911, fragment 1906 with the first and second adapter arms 1908 and 1910 (a library construct) is amplified by PCR using uracil-modified primers 1912 complementary to adapter arms 1908 and 1910. The PCR product generated by the primers carries uracils near the Type II restriction endonuclease recognition site. In process 1913 the uracils are specifically degraded, for example with uracil-DNA glycosylase (Krokan et al., (1997) Biochem. J. 325:1-16), leaving a PCR product that is single-stranded in the region of the Type II restriction endonuclease recognition site. As shown, incorporation and degradation of uracil can be used to render the Type II restriction endonuclease recognition site single-stranded; however, as described further herein, other methods can be used to render these regions single-stranded, including limited digestion with a 3' or 5' exonuclease.
In process 1915, a sequence-specific nickase is used to nick each double-stranded Type II restriction endonuclease recognition site to protect these sites from recognition by the Type II restriction endonuclease. However, the single-stranded Type II restriction endonuclease recognition site portions in the first and second adapter arms 1908 and 1910 are not nicked, and once the construct is circularized and ligated (1917), the Type II restriction endonuclease recognition site in the first and second adapter arms re-forms so that it is available for restriction. When selecting the nickase and the Type II restriction endonuclease for this process, it is preferred that the two enzymes recognize the same sequence or that one enzyme recognizes a subsequence (a sequence within the sequence) of the other. Alternatively, the nickase may recognize a different sequence, provided that its site is positioned within the adapter so that the nickase nicks within the Type II restriction endonuclease recognition site. The use of uracil or 3' or 5' degradation allows a single nickase to be used throughout the process; alternatively, more than one sequence-specific nickase can be employed. The circularized construct is then cleaved with the Type II restriction endonuclease in process 1919, in which the Type II restriction endonuclease recognition site is indicated at 1922, the construct is cleaved at 1920, and the nick is shown at 1918; the resulting linearized construct is available for ligation of a second set of adapter arms to the construct in process 1921.
Ligation process 1921 adds the first (1924) and second (1926) adapter arms of the second adapter to the linearized construct, and a second PCR amplification is performed in process 1923, again using uracil-modified primers 1928 complementary to adapter arms 1924 and 1926. As before, the PCR product generated by the primers carries uracils near the Type II restriction endonuclease recognition site. In process 1925 the uracils are specifically degraded, leaving a PCR product that is single-stranded at the Type II restriction endonuclease recognition sites in the first and second adapter arms 1924 and 1926 of the second adapter. Ligation process 1921 also repairs the nick 1918 in the Type II restriction site 1904 in target nucleic acid fragment 1906. In process 1927, a sequence-specific nickase is again used to nick bases in the double-stranded Type II restriction endonuclease recognition site in the target nucleic acid fragment (producing nick 1914 in Type II restriction endonuclease recognition site 1904) and in the Type II restriction endonuclease recognition site of the first adapter 1930, to protect these sites from recognition by the Type II restriction endonuclease.
The nicked construct is then circularized and ligated in process 1929, in which the Type II restriction endonuclease recognition sites in the first and second arms 1924 and 1926 of the second adapter re-form (1932). The process is repeated: the circularized construct is again cleaved with the Type II restriction endonuclease in process 1931 to produce another linearized construct (now with the first and second adapters added) available for ligation of a third pair of adapter arms 1936 and 1938 to the construct. The Type II restriction endonuclease recognition site is shown at 1922, the restriction site at 1920, the nick in the Type II restriction endonuclease recognition site of the target nucleic acid fragment at 1918, and the nick in the first adapter at 1934. The process can be repeated to add as many adapters as desired. As shown here, the first adapter added contains one Type II restriction endonuclease recognition site; in other aspects, however, the first adapter added can contain two Type II restriction endonuclease recognition sites to allow precise selection of the target nucleic acid size desired for the construct.
In one aspect, adapters can be designed to contain sequence-specific nickase sites that surround or partially overlap the Type II restriction endonuclease recognition site. By using a nickase, the Type II restriction endonuclease recognition site in each adapter can be selectively protected from methylation. In other embodiments the nickase may recognize another sequence or site but nick within the Type II restriction endonuclease recognition site. Nickases are endonucleases that recognize a specific recognition sequence in double-stranded DNA and cut one strand at a specific position relative to that recognition sequence, producing a single-strand break in the duplex DNA. Nickases include, but are not limited to, Nb.BsrDI, Nb.BsmI, Nt.BbvCI, Nb.BbvCI, Nb.BtsI and Nt.BstNBI. By combining sequence-specific nickases with Type II restriction endonucleases, all Type II restriction endonuclease recognition sites in the target nucleic acid and the Type II restriction endonuclease recognition sites in any previously inserted adapters can be protected from digestion (provided, of course, that the Type II restriction endonuclease is nick-sensitive, that is, it does not bind a recognition site that has been nicked).
Figure 20 is a schematic illustration of an embodiment of the methods of the invention in which the desired position of a second adapter relative to a first adapter is selected using methylation and sequence-specific nickases. Figure 20 shows genomic DNA of interest (target nucleic acid) 2002 having a Type II restriction endonuclease recognition site at 2004. The genomic DNA is fractionated or fragmented in process 2005 to produce fragment 2006 having Type II restriction endonuclease recognition site 2004. Adapter arms 2008 and 2010 are ligated to fragment 2006 in process 2007. Fragment 2006 with adapter arms 2008 and 2010 (a library construct) is circularized in process 2009 and amplified by circle dependent amplification in process 2011, yielding a highly branched concatemer of alternating target nucleic acid fragments 2006 (with the Type II restriction endonuclease recognition site at 2004) and first adapters 2012.
In process 2013, sequence-specific nickase 2030 is used to nick the nucleic acid in or near specific Type II restriction endonuclease recognition sites in the adapters of the library construct, which blocks methylation of these sites. Here, the Type II restriction endonuclease recognition sites in adapter arms 2012 and 2014 are nicked by sequence-specific nickase 2030. In process 2015, the un-nicked Type II restriction endonuclease recognition sites in the construct are methylated (here, methylation 2016 of Type II restriction endonuclease recognition site 2004) to protect these sites from recognition by the Type II restriction endonuclease. The Type II restriction endonuclease recognition sites in adapters 2012 and 2014, however, are not methylated because of the nicks.
In process 2017 the nicks in the library construct are repaired, producing a library construct in which the Type II restriction endonuclease recognition site in adapter 2012 is available for recognition and restriction 2018, whereas the Type II restriction endonuclease recognition site in genomic fragment 2004 is not. The methylated construct is then ligated to a second pair of adapter arms, circularized, and amplified by circle dependent amplification in process 2021, yielding a concatemer of alternating target nucleic acid fragments 2006 (with the Type II restriction endonuclease recognition site at 2004), first adapters 2012 and second adapters 2020. Next, in process 2023, sequence-specific nicking is performed again, this time with a sequence-specific nickase that specifically recognizes a site in the second adapter 2020, which blocks methylation of the Type II restriction endonuclease recognition site in the second adapter 2020 but not of the other Type II restriction endonuclease recognition sites in the construct (that is, Type II restriction endonuclease recognition site 2004 in the fragment and the Type II restriction endonuclease recognition site in the first adapter 2012). The process then continues with methylation 2015, and further adapter arms can be added if desired. Different sequence-specific nickase sites are used in each different adapter so that sequence-specific nicking can be performed throughout the process.
In the process illustrated in Figure 21, the desired position of a second adapter relative to a first adapter is selected using methylation and sequence-specific methylase blockers. Figure 21 shows genomic DNA of interest (target nucleic acid) 2102 having a Type II restriction endonuclease recognition site at 2104. The genomic DNA is fractionated or fragmented in process 2105 to produce fragment 2106 having Type II restriction endonuclease recognition site 2104. Adapter arms 2108 and 2110 are ligated to fragment 2106 in process 2107. Fragment 2106 with adapter arms 2108 and 2110 (a library construct) is circularized in process 2109 and amplified by circle dependent amplification in process 2111, yielding a highly branched concatemer of alternating target nucleic acid fragments 2106 (with the Type II restriction endonuclease recognition site at 2104) and first adapters 2112.
In process 2113, a sequence-specific methylase blocker 2130 (for example a zinc finger) is used to block methylation of specific Type II restriction endonuclease recognition sites in the library construct. Here, the Type II restriction endonuclease recognition sites in adapter arms 2112 and 2114 are blocked by methylase blocker 2130. When selecting the methylase blocker and the Type II restriction endonuclease for this process, the two entities need not recognize the same site sequence, nor must one entity recognize a subsequence of the other. The blocker sequence may lie upstream or downstream of the Type II restriction endonuclease recognition site, provided that the methylase blocker (for example a zinc finger, another nucleic acid binding protein or another entity) is configured so that it blocks that site. In process 2115, the unprotected Type II restriction endonuclease recognition sites in the construct are methylated (here, methylation 2116 of Type II restriction endonuclease recognition site 2104) to protect these sites from recognition by the Type II restriction endonuclease. The Type II restriction endonuclease recognition sites in adapters 2112 and 2114, however, are not methylated because of the methylase blocker.
In process 2117 the methylase blocker is released from the library construct, producing a library construct in which the Type II restriction endonuclease recognition site in adapter 2112 is available for recognition and restriction 2118, whereas the Type II restriction endonuclease recognition site in genomic fragment 2104 is not. The methylated construct is then ligated to a second pair of adapter arms, circularized, and amplified by circle dependent amplification in process 2121, yielding a concatemer of alternating target nucleic acid fragments 2106 (with the Type II restriction endonuclease recognition site at 2104), first adapters 2112 and second adapters 2120. Next, in process 2123, methylase blocking is performed again, this time with a methylase blocker that recognizes a site in the second adapter 2120 and specifically blocks methylation of the Type II restriction endonuclease recognition site in the second adapter 2120, but not of the other Type II restriction endonuclease recognition sites in the construct (that is, Type II restriction endonuclease recognition site 2104 in the fragment and the Type II restriction endonuclease recognition site in the first adapter 2112). The process then continues with methylation 2115, and further adapter arms can be added if desired. Different methylase blocker sites are used in each different adapter so that sequence-specific methylase blocking can be performed throughout the process. Although Figures 9 and 21 show insertion of a second adapter relative to a first adapter, it should be understood that the process applies to adapters added after the second adapter, producing library constructs with up to four, six, eight, ten or more inserted adapters.
In the process illustrated in Figure 22, the desired position of a second adapter relative to a first adapter is selected using methylation and uracil degradation. Figure 22 shows genomic DNA of interest 2202 having a Type II restriction endonuclease recognition site at 2204. The genomic DNA is fractionated or fragmented in process 2205 to produce fragment 2206 having Type II restriction endonuclease recognition site 2204. Adapter arms 2208 and 2210 are ligated to fragment 2206 in process 2207. In process 2211, fragment 2206 with the first and second adapter arms 2208 and 2210 (a library construct) is amplified by PCR using uracil-modified primers 2212 complementary to adapter arms 2208 and 2210. The PCR product generated by the primers carries uracils at or near the Type II restriction endonuclease recognition site. In process 2213 the uracils are specifically degraded, for example with uracil-DNA glycosylase (Krokan et al., (1997) Biochem. J. 325:1-16), leaving a PCR product that is single-stranded in the region of the Type II restriction endonuclease recognition site. As shown, incorporation and degradation of uracil can be used to render the Type II restriction endonuclease recognition site single-stranded; however, as described further herein, other methods can also be used to render these regions single-stranded, including limited digestion with a 3' or 5' exonuclease.
In process 2215, a sequence-specific methylase is used to methylate bases in each double-stranded Type II restriction endonuclease recognition site (here, methylation 2214 of Type II restriction endonuclease recognition site 2204) to protect these sites from recognition by the Type II restriction endonuclease. However, the single-stranded Type II restriction endonuclease recognition sites in the first and second adapter arms 2208 and 2210 are not methylated, and once the construct is circularized and ligated (2217), the Type II restriction endonuclease recognition site re-forms (2216) and is therefore available for restriction digestion. When selecting the methylase and the Type II restriction endonuclease for this process, the two enzymes must recognize the same sequence, or one enzyme must recognize a subsequence (a sequence within the sequence) of the other. The circularized construct is then cleaved with the Type II restriction endonuclease in process 2219, in which the Type II restriction endonuclease recognition site is shown at 2218 and the construct is cleaved at 2220, yielding a linearized construct available for ligation of a second set of adapter arms to the construct in process 2221.
Ligation process 2221 adds the first (2222) and second (2224) adapter arms of the second adapter to the linearized construct, and a second PCR amplification is performed in process 2223, again using uracil-modified primers 2226 complementary to adapter arms 2222 and 2224. As before, the PCR product generated by the primers carries uracils near the Type II restriction endonuclease recognition site. In process 2225 the uracils are specifically degraded, leaving a PCR product that is single-stranded in the region of the Type II restriction endonuclease recognition sites in the first and second adapter arms 2222 and 2224 of the second adapter. In process 2227, a sequence-specific methylase is again used to methylate bases in the double-stranded Type II restriction endonuclease recognition site of the target nucleic acid fragment (again, methylation 2214 of Type II restriction endonuclease recognition site 2204) and bases in the Type II restriction endonuclease recognition site of the first adapter 2228, to protect these sites from recognition by the Type II restriction endonuclease. The methylated construct is then circularized in process 2229, in which the Type II restriction endonuclease recognition sites in the first and second arms 2222 and 2224 of the second adapter re-form (2230). The process is repeated: the circularized construct is again cleaved with the Type II restriction endonuclease in process 2219 to produce another linearized construct (now with the first and second adapters added) for ligation of a third pair of adapter arms to the construct. The process can be repeated to add as many adapters as desired. As shown here, the first adapter added can contain one Type II restriction endonuclease recognition site; in other aspects, however, the first adapter added can contain two Type II restriction endonuclease recognition sites to allow precise selection of the target nucleic acid size desired for the construct.
In addition to the above methods for controlling the insertion of multiple interspersed adapters, constructs comprising adapters in a specific orientation can be further selected by enriching a population of constructs for those having adapters in the desired orientation. Such enrichment methods are described in U.S. Patent Application No. 60/864,992, filed 11/09/06; No. 11/943,703, filed 11/02/07; No. 11/943,697, filed 11/02/07; No. 11/943,695, filed 11/02/07; and PCT/US07/835540, filed 11/02/07, all of which are incorporated herein by reference in their entirety for all purposes, and in particular for all teachings related to methods and compositions for selecting adapters in a specific orientation.
V. Making DNBs
In one aspect, the nucleic acid templates of the invention are used to make nucleic acid nanoballs, also referred to herein as "DNA nanoballs", "DNBs" and "amplicons". Although the nucleic acid nanoballs of the invention can be formed from any nucleic acid molecule using the methods described herein, they are typically concatemers comprising multiple copies of a nucleic acid template of the invention. In general, this amplification process is carried out in solution in a single reaction chamber, which allows higher densities and lower reagent usage. In addition, because DNB production results in clonal amplification, this amplification method is generally not subject to the stochastic variation from limiting dilution that is inherent in other methods. DNB production methods according to the invention can generate more than 10 billion DNBs in a one-milliliter reaction volume, which is sufficient for sequencing an entire human genome.
In one aspect, rolling circle replication (RCR) is used to create concatemers of the invention. The RCR process has been used to prepare multiple continuous copies of the M13 genome (Blanco et al., (1989) J Biol Chem 264:8935-8940). In this method a nucleic acid is replicated by linear concatemerization. Guidance for selecting conditions and reagents for RCR reactions is available to those skilled in the art in many references, including U.S. Patent Nos. 5,426,180; 5,854,033; 6,143,495; and 5,871,921, all of which are incorporated herein by reference in their entirety for all purposes, and in particular for all teachings related to generating concatemers using RCR or other methods.
Generally, RCR reaction components include single-stranded DNA circles, one or more primers that anneal to the DNA circles, a DNA polymerase having strand displacement activity to extend the 3' ends of primers annealed to the DNA circles, nucleoside triphosphates, and a conventional polymerase reaction buffer. These components are combined under conditions that permit the primers to anneal to the DNA circles. The primers are extended by the DNA polymerase to form concatemers of DNA circle complements. In some embodiments, the nucleic acid templates of the invention are double-stranded circles that are denatured to form single-stranded circles that can be used in RCR reactions.
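The way rolling circle replication turns a single-stranded circle into a concatemer of its complement can be sketched in a few lines. In this toy model the polymerase copies the circular template base by base, wrapping around as many times as requested; the function names, the `start` parameter and the five-base example circle are illustrative assumptions, and no reaction kinetics are modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def rcr_concatemer(circle, copies, start=None):
    """Toy rolling circle replication on a single-stranded circular template.

    `circle` is the template written 5'->3'. `start` is the index of the first
    template base copied (default: the last base). The polymerase moves toward
    the template's 5' end and wraps around the circle, so the product, written
    5'->3', is the complement of template[start], template[start-1], ...
    """
    length = len(circle)
    if start is None:
        start = length - 1
    comp = circle.translate(COMPLEMENT)
    return "".join(comp[(start - i) % length] for i in range(copies * length))

if __name__ == "__main__":
    circle = "ATGCC"  # hypothetical five-base circle
    print(rcr_concatemer(circle, 3))
    print(reverse_complement(circle) * 3)
```

With the default start position the product is simply the reverse complement of the circle repeated head to tail; a different start position yields the same repeat unit rotated.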
In some embodiments, circular nucleic acids can be amplified by successive ligation of short oligonucleotides (for example 6-mers) from a mixture containing all possible sequences or, if the circles are synthetic, from a limited mixture of such short oligonucleotides having sequences selected for circle replication; this process is known as "circle dependent amplification" (CDA). "Circle dependent amplification" or "CDA" refers to multiple displacement amplification of a double-stranded circular template using primers that anneal to both strands of the circular template, generating products that represent both strands of the template and resulting in a cascade of multiple hybridization, primer extension and strand displacement events. This leads to an exponential increase in the number of primer binding sites, and consequently to an exponential increase in the amount of product generated over time. The primers used may be of random sequence (for example random hexamers) or may have a specific sequence that selects for amplification of a desired product. CDA results in the formation of a set of concatemeric double-stranded fragments.
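The contrast between linear concatemerization and the exponential growth described for CDA can be illustrated with a toy calculation. The sketch assumes that a linear process adds a fixed number of template copies per round, whereas an exponential process multiplies the product by a fixed factor per round; the rates used are arbitrary illustrative numbers, not measured values, and the function names are hypothetical.

```python
def linear_product(copies_per_round, rounds):
    """Toy linear model: a fixed number of template copies is added each round."""
    return copies_per_round * rounds

def exponential_product(fold_per_round, rounds, initial=1):
    """Toy exponential model: product is multiplied by a fixed factor each round."""
    return initial * fold_per_round ** rounds

if __name__ == "__main__":
    # Compare the two models over increasing numbers of rounds
    for rounds in (1, 5, 10):
        print(rounds, linear_product(2, rounds), exponential_product(2, rounds))
```

Even with the same per-round factor, the exponential model overtakes the linear one within a few rounds, which is the qualitative point made in the text.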
Concatemers can also be generated by ligation of target DNA in the presence of a bridging template DNA that is complementary to both the beginning and the end of the target molecule. A population of different target DNAs can be converted into concatemers by a mixture of corresponding bridging templates.
In some embodiments, a subset of a population of nucleic acid templates can be isolated on the basis of a particular feature, such as a desired number or type of adapters. This population can be isolated using conventional techniques (for example a conventional spin column or the like) or otherwise processed (for example by size selection) to form a population from which a population of concatemers can be generated using techniques such as RCR.
Methods of forming DNBs of the invention are described in published patent applications WO2007120208, WO2006073504, WO2007133831, and US2007099208, and in U.S. Patent Application Nos. 60/992,485; 61/026,337; 61/035,914; 61/061,134; 61/116,193; 61/102,586; 12/265,593; 12/266,385; 11/938,096; 11/981,804; 11/981,797; 11/981,793; 11/981,767; 11/981,761; 11/981,730 (filed October 31, 2007); 11/981,685; 11/981,661; 11/981,607; 11/981,605; 11/927,388; 11/927,356; 11/679,124; 11/541,225; 10/547,214; 11/451,692; and 11/451,691, all of which are incorporated herein by reference in their entirety for all purposes, and in particular for all teachings related to forming DNBs.
VI. Preparing DNB arrays
In one aspect, DNBs of the invention are disposed on a surface to form a random array of single molecules. DNBs can be fixed to the surface by a variety of techniques, including covalent and non-covalent attachment. In one embodiment, the surface may include capture probes that form complexes (e.g., duplexes) with a component of the polynucleotide molecule (e.g., an adapter oligonucleotide). In other embodiments, capture probes may comprise oligonucleotide clamps, or similar structures, that form triplexes with adapters, as described in Gryaznov et al., U.S. Pat. No. 5,473,060 (incorporated herein in its entirety).
Methods of forming DNB arrays of the invention are described in published patent applications WO2007120208, WO2006073504, WO2007133831, and US2007099208, and in U.S. Patent Application Nos. 60/992,485; 61/026,337; 61/035,914; 61/061,134; 61/116,193; 61/102,586; 12/265,593; 12/266,385; 11/938,096; 11/981,804; 11/981,797; 11/981,793; 11/981,767; 11/981,761; 11/981,730; 11/981,685; 11/981,661; 11/981,607; 11/981,605; 11/927,388; 11/927,356; 11/679,124; 11/541,225; 10/547,214; 11/451,692; and 11/451,691, all of which are incorporated herein by reference for all purposes, and in particular for all teachings related to forming DNB arrays.
In some embodiments, DNB arrays are prepared using patterned substrates with a two-dimensional array of spots. The spots are activated to capture and hold the DNBs, while DNBs do not remain in the areas between spots. In general, a DNB on a spot repels other DNBs, resulting in only one DNB per spot. Because DNBs are three-dimensional (i.e., they are not short linear pieces of DNA), arrays of the invention yield more DNA copies per square nanometer of binding surface than traditional DNA arrays. This three-dimensional quality further reduces the quantity of sequencing reagents required, resulting in brighter spots and more efficient imaging. Occupancy of DNB arrays generally exceeds 90%, but can range from 50% to 100%.
In other embodiments, the patterned surfaces are produced using standard silicon processing techniques. Compared with unpatterned arrays, such patterned arrays achieve a higher density of DNBs, which results in fewer pixels per base read, faster processing, and improved efficiency of reagent use. In a further embodiment, the patterned substrates are standard 25 mm × 75 mm (1 in × 3 in) microscope slides, each able to hold approximately 1 billion individual spots capable of binding DNBs. It will be appreciated that the invention encompasses slides of higher density. In these embodiments, because DNBs are disposed on a surface and then attach to the activated spots, a high-density DNB array essentially "self-assembles" from DNBs in solution, avoiding one of the most costly aspects of producing traditional patterned oligonucleotide or DNA arrays.
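The slide figures above imply an average spot spacing and density, which the following minimal sketch works out. It assumes, purely for illustration, that the spots lie on a square grid covering the entire 25 mm × 75 mm slide; the function names are hypothetical and not part of the described method.

```python
import math

def spot_pitch_um(width_mm, height_mm, n_spots):
    """Center-to-center pitch (in micrometers) of a square grid that
    places n_spots on a width_mm x height_mm rectangle.
    Assumption: the whole rectangle is patterned."""
    area_um2 = (width_mm * 1000.0) * (height_mm * 1000.0)
    return math.sqrt(area_um2 / n_spots)

def spots_per_mm2(width_mm, height_mm, n_spots):
    """Average number of spots per square millimeter."""
    return n_spots / (width_mm * height_mm)

# Figures taken from the text: a 25 mm x 75 mm slide, about 1 billion spots.
pitch = spot_pitch_um(25, 75, 1e9)
density = spots_per_mm2(25, 75, 1e9)
print(f"pitch is about {pitch:.2f} um, density is about {density:,.0f} spots per mm^2")
```

Under these assumptions the pitch comes out a little under 1.4 micrometers and the density a little over half a million spots per square millimeter. An actual slide would likely reserve unpatterned margins, so the real pitch would be somewhat smaller.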
In some embodiments, a surface may have reactive functionalities that react with complementary functionalities on the polynucleotide molecules to form a covalent linkage, e.g., by the same techniques used to attach cDNAs to microarrays; see, e.g., Smirnov et al. (2004), Genes, Chromosomes & Cancer, 40:72-77, and Beaucage (2001), Current Medicinal Chemistry, 8:1213-1244, both of which are incorporated herein by reference. DNBs may also be efficiently attached to hydrophobic surfaces, such as a clean glass surface that has a low concentration of various reactive functionalities (e.g., -OH groups). Attachment through covalent bonds formed between the polynucleotide molecules and reactive functionalities on the surface is also referred to herein as "chemical attachment".
In other embodiments, polynucleotide molecules can be adsorbed onto a surface. In this embodiment, the polynucleotides are immobilized through non-specific interactions with the surface, or through non-covalent interactions such as hydrogen bonding, van der Waals forces, and the like.
Attachment may also include wash steps of varying stringency to remove incompletely attached single molecules, or other reagents remaining from earlier preparation steps, whose presence is undesirable or which are non-specifically bound to the surface.
In one aspect, DNBs on a surface are confined to the area of a discrete region. Discrete regions may be incorporated into a surface using methods known in the art and described further herein. In exemplary embodiments, discrete regions contain reactive functionalities or capture probes that can be used to immobilize the polynucleotide molecules.
The discrete regions may have defined locations in a regular array, which may correspond to a rectilinear pattern, a hexagonal pattern, or the like. A regular array of such regions is advantageous for detection and data analysis of the signals collected from the array during an analysis. In addition, first- and/or second-stage amplicons confined to the restricted area of a discrete region provide a more concentrated or intense signal, particularly when fluorescent probes are used in analytical operations, thereby providing higher signal-to-noise ratios. In some embodiments, DNBs are randomly distributed on the discrete regions, so that a given region is equally likely to receive any of the different single molecules. In other words, the resulting arrays are not spatially addressable immediately upon fabrication, but may be made so by carrying out an identification, sequencing, and/or decoding operation. As such, the identities of the polynucleotide molecules of the invention disposed on a surface are discernible, but are not known at the time they are disposed on the surface. In some embodiments, the area of the discrete regions is selected, together with the attachment chemistries, the macromolecular structures employed, and the like, to correspond to the size of single molecules of the invention, so that when single molecules are applied to the surface, substantially every region is occupied by no more than one single molecule. In some embodiments, DNBs are disposed in a patterned manner on a surface comprising discrete regions, such that specific DNBs (identified, in exemplary embodiments, by tag adapters or other labels) are disposed on specific discrete regions or groups of discrete regions.
In some embodiments, the area of the discrete regions is less than 1 μm²; in some embodiments, the area of the discrete regions is in the range of 0.04 μm² to 1 μm²; and in some embodiments, the area of the discrete regions is in the range of 0.2 μm² to 1 μm². In embodiments in which the discrete regions are approximately circular or square, so that their sizes can be indicated by a single linear dimension, the size of such regions is in the range of 125 nm to 250 nm, or in the range of 200 nm to 500 nm. In some embodiments, the center-to-center distances between nearest-neighbor discrete regions are in the range of 0.25 μm to 20 μm; in some embodiments, such distances are in the range of 1 μm to 10 μm, or in the range of 50 nm to 1000 nm. Generally, discrete regions are designed such that a majority of them are optically resolvable. In some embodiments, regions may be arranged on a surface in virtually any pattern, provided that the regions have defined locations in the pattern.
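As a rough illustration of how the area ranges above relate to a single linear dimension, the following sketch converts a region area into the side of an equivalent square or the diameter of an equivalent circle. The helper names are hypothetical and the conversion is plain geometry, not a procedure taken from the text.

```python
import math

def square_side_nm(area_um2):
    """Side length (nm) of a square region with the given area (um^2)."""
    return math.sqrt(area_um2) * 1000.0

def circle_diameter_nm(area_um2):
    """Diameter (nm) of a circular region with the given area (um^2)."""
    return 2.0 * math.sqrt(area_um2 / math.pi) * 1000.0

# Areas named in the text: 0.04, 0.2 and 1 square micrometers.
for area in (0.04, 0.2, 1.0):
    print(f"{area} um^2: square side {square_side_nm(area):.0f} nm, "
          f"circle diameter {circle_diameter_nm(area):.0f} nm")
```

For example, a square region of 0.04 μm² has a side of about 200 nm, the lower end of the 200 nm to 500 nm size range quoted above.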
In other embodiments, molecules are directed to the discrete regions of a surface because the areas between the discrete regions (referred to herein as "inter-regional areas") are inert, in the sense that concatemers or other macromolecular structures do not bind to them. In some embodiments, such inter-regional areas may be treated with blocking agents (e.g., DNA unrelated to the concatemer DNA, other polymers, and the like).
A wide variety of supports may be used with the compositions and methods of the invention to form random arrays. In one aspect, the support is a rigid solid having a surface, preferably a substantially planar region, so that the single molecules to be interrogated are in the same plane. The latter feature permits efficient signal collection by, for example, detection optics. In another aspect, the support comprises beads, in which case the bead surface contains reactive functionalities or capture probes that can be used to immobilize polynucleotide molecules.
In one aspect, solid supports of the invention are nonporous, particularly when random arrays of single molecules are analyzed by hybridization reactions requiring small volumes. Suitable solid support materials include glass, polyacrylamide-coated glass, ceramics, silica, silicon, quartz, various plastics, and the like. In one aspect, the area of the planar region may be in the range of 0.5 to 4 cm². In one aspect, the solid support is glass or quartz, such as a microscope slide, having a uniformly silanized surface. This may be accomplished using conventional protocols, e.g., acid treatment followed by immersion in a solution of 3-glycidoxypropyltrimethoxysilane, N,N-diisopropylethylamine, and anhydrous xylene (8:1:24 v/v) at 80 °C, which forms an epoxysilanized surface (e.g., Beattie et al. (1995), Molecular Biotechnology, 4:213). Such a surface is readily treated to permit end-attachment of capture oligonucleotides, e.g., by providing the capture oligonucleotides with a 3' or 5' triethylene glycol phosphoryl spacer (see Beattie et al., cited above) prior to application to the surface. Further embodiments for functionalizing and further preparing surfaces for use in the invention are described, for example, in U.S. Patent Application Nos. 60/992,485; 61/026,337; 61/035,914; 61/061,134; 61/116,193; 61/102,586; 12/265,593; 12/266,385; 11/938,096; 11/981,804; 11/981,797; 11/981,793; 11/981,767; 11/981,761; 11/981,730; 11/981,685; 11/981,661; 11/981,607; 11/981,605; 11/927,388; 11/927,356; 11/679,124; 11/541,225; 10/547,214; 11/451,692; and 11/451,691, all of which are incorporated herein by reference in their entirety for all purposes, and in particular for all teachings related to preparing surfaces for forming arrays and all teachings related to forming arrays, especially DNB arrays.
In embodiments of the invention that require patterns of discrete regions, photolithography, electron beam lithography, nano-imprint lithography, and nano-printing may be used to generate such patterns on a wide variety of surfaces; see, e.g., Pirrung et al., U.S. Pat. No. 5,143,854; Fodor et al., U.S. Pat. No. 5,774,305; and Guo (2004), Journal of Physics D: Applied Physics, 37:R123-141, which are incorporated herein by reference.
In one aspect, surfaces containing a plurality of discrete regions are fabricated by photolithography. A commercial, optically flat quartz substrate is spin-coated with a 100-500 nm thick layer of photoresist. The photoresist is then baked onto the quartz substrate. Using a stepper, an image of a reticle bearing the pattern of regions to be activated is projected onto the surface of the photoresist. After exposure, the photoresist is developed, removing the areas of the projected pattern that were exposed to the UV source. This is accomplished by plasma etching, a dry developing technique capable of producing very fine detail. The substrate is then baked to strengthen the remaining photoresist. After baking, the quartz wafer is ready for functionalization. The wafer is then subjected to vapor deposition of 3-aminopropyldimethylethoxysilane. The density of the amino-functionalized monomer can be tightly controlled by varying the concentration of the monomer and the exposure time of the substrate. Only the areas of quartz exposed by the plasma etching process can react with and capture the monomer. The substrate is then baked again to cure the monolayer of amino-functionalized monomer onto the exposed quartz. After baking, the remaining photoresist may be removed with acetone. Because the attachment chemistries of the photoresist and the silane differ, the aminosilane-functionalized areas on the substrate remain intact through the acetone rinse. These areas can be further functionalized by reacting them with p-phenylene diisothiocyanate in a solution of pyridine and N,N-dimethylformamide. The substrate can then be reacted with amine-modified oligonucleotides. Alternatively, oligonucleotides can be prepared with a 5'-carboxy-modifier-C10 linker (Glen Research). This technique allows the oligonucleotide to be attached directly to the amine-modified support, thereby avoiding additional functionalization steps.
In another aspect, surfaces containing a plurality of discrete regions are fabricated by nano-imprint lithography (NIL). To produce a DNA array, a quartz substrate is spin-coated with a layer of resist, commonly called the transfer layer. A second type of resist, commonly called the imprint layer, is then applied over the transfer layer. The master imprint tool then makes an impression on the imprint layer. The overall thickness of the imprint layer is then reduced by plasma etching until the low areas of the imprint reach the transfer layer. Because the transfer layer is harder to remove than the imprint layer, it remains largely untouched. The imprint and transfer layers are then hardened by heating. The substrate is then placed in a plasma etcher until the low areas of the imprint reach the quartz. The substrate is then derivatized by vapor deposition as described above.
In another aspect, surfaces containing a plurality of discrete regions are fabricated by nano-printing. This process uses photo, imprint, or electron beam lithography to create a master mold, which is a negative image of the features required on the print head. Print heads are usually made of a soft, flexible polymer such as polydimethylsiloxane (PDMS). This material, or layers of materials having different properties, is spin-coated onto a quartz substrate. The mold is then used to emboss the features onto the top layer of resist material under controlled temperature and pressure conditions. The print head is then subjected to a plasma-based etching process to improve its aspect ratio and to eliminate distortion of the print head caused by relaxation of the embossed material over time. Random array substrates are manufactured by nano-printing by depositing a pattern of amine-modified oligonucleotides onto a homogeneously derivatized surface. These oligonucleotides serve as capture probes for the RCR products. One potential advantage of nano-printing is the ability to print interleaved patterns of different capture probes onto the random array support. This can be accomplished by successive printing with multiple print heads, each head having a different pattern, with all patterns fitting together to form the final structured support pattern. Such methods allow some positional encoding of DNA elements within the random array. For example, control concatemers containing a specific sequence can be bound at regular intervals throughout a random array.
In still another aspect, a high-density array of sub-micron capture oligonucleotide spots is prepared using a print head or imprint master made from a bundle, or a bundle of bundles, of about 10,000 to 100,000,000 optical fibers comprising a core and a cladding material. Drawing and fusing the fibers produces a unique material in which cores of about 50-1000 nm are separated by cladding material of similar size, or of a size 2-5 times smaller or larger. Differential etching (dissolving) of the cladding material yields a nano-printing head bearing a very large number of nano-sized posts. This print head may be used for depositing oligonucleotides or other biological compounds (proteins, oligopeptides, DNA, aptamers) or chemical compounds, such as silanes with various active groups. In one embodiment, the glass fiber tool is used as a patterned support on which oligonucleotides or other biological or chemical compounds are deposited. In this case, only the posts created by etching come into contact with the material to be deposited. Alternatively, a flat cut of the fused fiber bundle may be used to guide light through the cores, allowing light-induced chemistry to occur only at the tip surface of the cores and thus eliminating the need for etching. In both cases, the same support may then be used as a light-guiding/collection device for imaging the fluorescent labels used to tag oligonucleotides or other reactants. This device provides a large field of view with a large numerical aperture (potentially >1). Stamping or printing tools that deposit active material or oligonucleotides may be used to print 2 to 100 different oligonucleotides in an interleaved pattern. This process requires the print head to be positioned with a precision of about 50-500 nm. Such oligonucleotide arrays may be used for attaching 2 to 100 different DNA populations, such as DNA from different sources. They may also be used for parallel reading from spots below the optical resolution limit by using DNA-specific anchors or tags. Information can be accessed through DNA-specific tags (e.g., 16 specific anchors for 16 DNAs), and 2 bases can be read by a combination of 5-6 colors using 16 ligation cycles, or one ligation cycle and 16 decoding cycles. This way of making arrays is efficient if only limited information (e.g., a small number of cycles) is required per fragment, so that each cycle can provide more information or more cycles can be run per surface.
In one aspect, multiple arrays of the invention may be placed on a single surface. For example, patterned array substrates may be produced to match the standard 96- or 384-well plate format. A production format may be an 8 × 12 pattern of 6 mm × 6 mm arrays at 9 mm pitch, or a 16 × 24 pattern of 3.33 mm × 3.33 mm arrays at 4.5 mm pitch, on a single piece of glass, plastic, or other optically compatible material. In one example, each 6 mm × 6 mm array consists of 36 million square regions of 250-500 nm at 1 micron pitch. Hydrophobic or other surface or physical barriers may be used to prevent mixing of different reactions between unit arrays.
Other methods of forming arrays of molecules are known in the art and may be used to form DNB arrays.
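The array geometry above can be checked with simple arithmetic. The sketch below is a minimal illustration with hypothetical helper names; it reproduces the figure of 36 million regions for a 6 mm × 6 mm array at 1 micron pitch and confirms that each unit array fits within its plate pitch.

```python
def regions_per_array(array_side_mm, region_pitch_um):
    """Number of regions on a square array laid out on a square grid."""
    per_side = round(array_side_mm * 1000 / region_pitch_um)
    return per_side * per_side

def fits_plate_format(array_side_mm, plate_pitch_mm):
    """True if a unit array is narrower than the spacing between arrays,
    leaving room for a barrier between neighbors."""
    return array_side_mm < plate_pitch_mm

# 96-well style layout from the text: 8 x 12 arrays of 6 mm x 6 mm at 9 mm pitch.
per_array = regions_per_array(6, 1.0)
print(per_array)            # regions in one 6 mm x 6 mm array
print(8 * 12 * per_array)   # regions across all 96 unit arrays

# Both formats named in the text leave a gap between unit arrays.
print(fits_plate_format(6, 9), fits_plate_format(3.33, 4.5))
```

The text gives the 1 micron region pitch only for the 6 mm × 6 mm example, so the sketch does not assume a region count for the 384-well style layout.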
It will be appreciated that DNBs and/or nucleic acid templates of the invention can be placed at a wide range of densities on a surface comprising discrete regions to form an array. In some embodiments, each discrete region may comprise from about 1 to about 1000 molecules. In other embodiments, each discrete region may comprise from about 10 to about 900, about 20 to about 800, about 30 to about 700, about 40 to about 600, about 50 to about 500, about 60 to about 400, about 70 to about 300, about 80 to about 200, or about 90 to about 100 molecules.
In some embodiments, the density of the nucleic acid template and/or DNB arrays is at least 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 million molecules per square millimeter.
VII. Loading DNBs onto flow slides and post-loading treatment
As noted above, according to one embodiment of the invention, DNBs can be disposed or "loaded" onto a patterned surface to form a high-density DNB array.
According to one embodiment, DNB preparations are loaded into flow slides as described in Drmanac et al., Science 327:78-81, 2010. Briefly, slides are loaded by pipetting DNBs onto the slide. For example, 2- to 3-fold more DNBs than binding sites can be pipetted onto the slide. Loaded slides are incubated for 2 hours at 23 °C in a closed chamber, then rinsed to neutralize the pH and remove unbound DNBs.
To stabilize arrayed DNBs against chemical and mechanical degradation during sequencing, the DNBs can be treated after they have contacted or attached to the array (i.e., after loading onto the array). According to one embodiment, the DNBs are coated with a layer of partially denatured protein to improve the stability of the DNB array, and thereby the strength and specificity of the signals produced by cPAL sequencing reactions (described below). Various proteins, including but not limited to serum albumins (e.g., bovine serum albumin (BSA) and human serum albumin), have properties that contribute to a protective effect without interfering with detection, because they bind reversibly to the array substrate but do not interact strongly with nucleic acids. These properties depend on multiple physico-chemical properties of the stabilizing coating molecule, including charge properties such as isoelectric point, molecular weight, lack of reactivity with nucleic acids, and inability to intercalate into nucleic acids. Without such a coating, the strength and specificity of the DNB signals detected during cPAL sequencing may degrade in fewer than 30 complete interrogation cycles. With such a coating, we have used DNB arrays for more than 100 cycles, and routinely see little or no degradation beyond 70 cycles.
In addition, it has been observed that if the individual DNBs in an array are coated directly after initial loading, the DNBs spread out on the surface to some degree. It was found that adding, before coating, a rinse step that condenses the DNBs followed by a wash step reduces the amount of DNB spreading and intermixing, and improves the quality of the data generated from interrogating the DNBs.
Although described in the context of sequencing genomic DNA in the form of DNBs, post-loading treatments according to the invention can also be used to improve the stability and reduce the spreading of biomolecules, including but not limited to nucleic acids (single- and double-stranded DNA, RNA, etc.), that are attached to or associated with any type of solid support for a broad range of biochemical analyses, including, for example, nucleic acid hybridization, enzymatic reactions (e.g., using endonucleases [including restriction enzymes], exonucleases, kinases, phosphatases, ligases, etc.), nucleic acid synthesis, nucleic acid amplification (e.g., by polymerase chain reaction, rolling circle replication, whole genome amplification, multiple displacement amplification, etc.), and any other form of biochemical analysis known in the art.
For example, many biochemical analyses involve the action of various enzymes (e.g., kinases) on nucleic acid molecules bound to a substrate, and can benefit from the post-loading treatment described above. Where enzyme diffusion is a limiting factor in such analyses, the protein coating layer can be optimized. An optimal coating provides enough protection to retain the nucleic acid molecules throughout the analysis without substantially inhibiting diffusion of the enzyme through the protein layer. The performance characteristics of the adsorbed protein layer can be controlled by varying the protein concentration in the solution. The thickness and performance of the protein layer can also be controlled, for example, by adjusting the pH, by adding well-known exclusion molecules (e.g., PEG) and/or adsorption inhibitors (e.g., surfactants such as Tween™-80), and by controlling the total exposure time. In the specific case of DNA sequencing using combinatorial probe-anchor ligation (cPAL, described in detail below), particularly when 2, 3, 4, or more anchor probes are used, it is important for achieving high-quality extension by ligation that the oligonucleotides are completely phosphorylated or completely unphosphorylated and that residual kinase is completely removed before the ligation reaction. In one embodiment, the optimal BSA concentration is 0.05 mg/ml at pH 5.4, applied for 5 to 15 minutes.
VIII. Methods of using DNBs
DNBs prepared according to the methods described above offer an advantage in identifying sequences in target nucleic acids, because the adapters contained in the DNBs provide points of known sequence that, when combined with methods using anchor probes and sequencing probes, allow both spatial orientation and sequence to be determined. In addition, because multiple copies of the target sequence are present in a single DNB, DNBs avoid the cost and problems associated with the single-fluorophore detection relied upon by single-molecule sequencing systems.
Methods of using DNBs according to the invention include sequencing target nucleic acids and detecting specific sequences in target nucleic acids (e.g., detecting particular target sequences, such as a particular gene, and/or identifying and/or detecting SNPs). The methods described herein can also be used to detect nucleic acid rearrangements and copy number variation. Nucleic acid quantification, such as digital gene expression (i.e., analysis of an entire transcriptome, all of the mRNA present in a sample) and detection of the number of specific sequences or groups of sequences in a sample, can also be accomplished using the methods described herein. Although most of the discussion herein is directed to identifying sequences of DNBs, it will be appreciated that other, non-concatemeric nucleic acid constructs comprising adapters may also be used in the embodiments described herein.
VIIIA. Overview of cPAL sequencing
According to the invention, sequences of DNBs are generally identified using the method described below, referred to herein as combinatorial probe-anchor ligation ("cPAL"), and variations thereof. In brief, cPAL involves identifying the nucleotide at a particular detection position in a target nucleic acid by detecting a probe ligation product formed by ligation of at least one anchor probe and a sequencing probe, where the anchor probe hybridizes fully or partially to an adapter, and the sequencing probe contains a particular nucleotide at an "interrogation position" that corresponds to (e.g., will hybridize to) the detection position. The sequencing probe carries a unique identifying label. If the nucleotide at the interrogation position is complementary to the nucleotide at the detection position, ligation can occur, and the resulting ligation product contains the unique label, which can then be detected. Descriptions of different exemplary embodiments of cPAL methods are provided below. It will be appreciated that the following descriptions are not intended to be limiting, and that variations of the following embodiments are encompassed by the invention.
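The base-calling logic described above can be sketched in a few lines. This is a deliberately simplified toy model, not the patented chemistry: it assumes perfect, error-free ligation, ignores strand orientation and the degenerate positions of real probes, and uses a hypothetical assignment of one label per interrogation base. All names are illustrative.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Hypothetical label assignment: one distinguishable label per probe family,
# keyed by the base the probe carries at its interrogation position.
LABELS = {"A": "dye1", "C": "dye2", "G": "dye3", "T": "dye4"}

def ligated_label(target, position):
    """Label of the probe family that would ligate next to the anchor.

    target:   target bases adjacent to the adapter, index 0 nearest
              the ligation site.
    position: 1-based detection position counted from the ligation site.
    """
    detected = target[position - 1]
    for probe_base, label in LABELS.items():
        # Ligation is modeled as occurring only when the probe's
        # interrogation base is complementary to the detected base.
        if COMPLEMENT[probe_base] == detected:
            return label
    raise ValueError(f"no probe matches base {detected!r}")

def call_base(label):
    """Infer the target base from the label observed on the ligation product."""
    probe_base = next(b for b, l in LABELS.items() if l == label)
    return COMPLEMENT[probe_base]

def read_positions(target, positions):
    """Read each requested position independently, one reaction per position."""
    return "".join(call_base(ligated_label(target, p)) for p in positions)

print(read_positions("TTCAG", range(1, 6)))
```

Each position is read in its own reaction and does not depend on the result at any other position, which is the sense in which the base reads are independent.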
The cPAL methods of the invention have many of the advantages of sequencing-by-hybridization methods known in the art, including DNA array parallelism, independent and non-iterative base reading, and the capacity to read multiple bases per reaction. In addition, cPAL resolves two limitations of sequencing by hybridization: the inability to read simple repeats, and the need for intensive computation.
"Complementary" or "substantially complementary" refers to hybridization or base pairing or duplex formation between nucleotides or nucleic acids (e.g., between the two strands of a double-stranded DNA molecule, or between an oligonucleotide primer and a primer binding site on a single-stranded nucleic acid). Complementary nucleotides are generally A and T (or A and U), or C and G. Two single-stranded RNA or DNA molecules are said to be substantially complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand, usually at least about 90% to about 95%, and even about 98% to 100%.
"Hybridization" as used herein refers to the process in which two single-stranded polynucleotides bind non-covalently to form a stable double-stranded polynucleotide. The resulting (usually) double-stranded polynucleotide is a "hybrid" or "duplex". "Hybridization conditions" generally include salt concentrations of less than about 1 M, more commonly less than about 500 mM, and possibly less than about 200 mM. A "hybridization buffer" is a buffered salt solution such as 5× SSPE or another such buffer known in the art. Hybridization temperatures can be as low as 5 °C, but are generally greater than 22 °C, more typically greater than about 30 °C, and often in excess of 37 °C. Hybridizations are generally performed under stringent conditions, i.e., conditions under which a probe will hybridize to its target subsequence but not to other, non-complementary sequences. Stringent conditions are sequence-dependent and differ in different circumstances. For example, longer fragments may require higher hybridization temperatures for specific hybridization than short fragments. Because other factors, including base composition and length of the complementary strands, the presence of organic solvents, and the extent of base mismatching, may affect the stringency of hybridization, the combination of parameters is more important than the absolute measure of any single parameter. Generally, stringent conditions are selected to be about 5 °C lower than the Tm for the specific sequence at a defined ionic strength and pH. Exemplary stringent conditions include a salt concentration of at least 0.01 M to no more than 1 M sodium ion (or other salt), a pH of about 7.0 to about 8.3, and a temperature of at least 25 °C. For example, conditions of 5× SSPE (750 mM NaCl, 50 mM sodium phosphate, 5 mM EDTA, pH 7.4) and a temperature of 30 °C are suitable for allele-specific probe hybridizations. Other examples of stringent conditions are known in the art; see, e.g., Sambrook J et al. (2001), Molecular Cloning, A Laboratory Manual (3rd Ed., Cold Spring Harbor Laboratory Press).
The term "Tm" as used herein generally refers to the temperature at which half of a population of double-stranded nucleic acid molecules dissociates into single strands. Equations for calculating the Tm of nucleic acids are well known in the art. As indicated by standard references, when a nucleic acid is in aqueous solution with a cation concentration of 0.5 M or less and its (G+C) content is between 30% and 70%, a simple estimate of the Tm value may be calculated by the equation $T_m = 81.5 + 16.6\,\log_{10}[\mathrm{Na}^+] + 0.41\,(\%[\mathrm{G+C}]) - 675/n - 1.0\,m$, where n is the number of bases and m is the percentage of base pair mismatches (see, e.g., Sambrook J et al. (2001), Molecular Cloning, A Laboratory Manual (3rd Ed., Cold Spring Harbor Laboratory Press)). Other references include more sophisticated computations that take structural as well as sequence characteristics into account when calculating Tm (see also Anderson and Young (1985), Quantitative Filter Hybridization, Nucleic Acid Hybridization, and Allawi and SantaLucia (1997), Biochemistry 36:10581-94).
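As a worked illustration of the simple Tm estimate quoted above, the following sketch evaluates the formula for a given sequence. The function name and argument names are hypothetical; the range checks simply mirror the stated limits (cation concentration of 0.5 M or less, G+C content between 30% and 70%) and are not part of any standard library.

```python
import math


def estimate_tm(sequence, na_molar, mismatch_percent=0.0):
    """Estimate Tm in degrees C with the simple formula quoted in the text.

    sequence         -- one strand of the duplex (A/C/G/T)
    na_molar         -- monovalent cation concentration in mol/L
    mismatch_percent -- percentage of mismatched base pairs (m)
    """
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    gc_percent = 100.0 * sum(base in "GC" for base in seq) / n
    # The text restricts the formula to these ranges, so refuse anything else.
    if not 0.0 < na_molar <= 0.5:
        raise ValueError("formula assumes a cation concentration of 0.5 M or less")
    if not 30.0 <= gc_percent <= 70.0:
        raise ValueError("formula assumes 30-70% G+C content")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 675.0 / n
            - 1.0 * mismatch_percent)
```

For a hypothetical 100-base sequence with 50% G+C in 0.1 M sodium and no mismatches, the terms are 81.5 - 16.6 + 20.5 - 6.75, giving about 78.65 °C; each percent of mismatch lowers the estimate by one further degree.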
In one example of a cPAL method, referred to herein as "single cPAL" and illustrated in Figure 23, anchor probe 2302 hybridizes to a complementary region on adapter 2308 of DNB 2301. Anchor probe 2302 hybridizes to the adapter region directly adjacent to target nucleic acid 2309, but in some cases, as illustrated in Figure 24 and described further below, anchor probes can be designed to "reach into" the target nucleic acid by incorporating a desired number of degenerate bases at the terminus of the anchor probe. A pool of differentially labeled sequencing probes 2305 hybridizes to complementary regions of the target nucleic acid, and sequencing probes that hybridize adjacent to anchor probes are ligated to them, usually by application of a ligase, to form a probe ligation product. The sequencing probes are generally sets or pools of oligonucleotides comprising two parts: different nucleotides at the interrogation position, and all possible bases (or a universal base) at the other positions; thus, each probe represents each base type at a specific position. The sequencing probes are labeled with a detectable label that differentiates each sequencing probe from the sequencing probes with other nucleotides at that position. Thus, in the example illustrated in Figure 23, a sequencing probe 2310 that hybridizes adjacent to anchor probe 2302 and is ligated to that anchor probe identifies the base at the position in the target nucleic acid 5 bases from the adapter as a "G". Figure 23 depicts a situation in which the interrogation base is 5 bases from the ligation site, but as described more fully below, the interrogation base can also be closer to the ligation site, and in some cases is at the point of ligation. Once ligation has occurred, non-ligated anchor and sequencing probes are washed away, and the ligation product present on the array is detected using the label. Multiple cycles of anchor probe and sequencing probe hybridization and ligation can be used to identify a desired number of bases of the target nucleic acid on each side of each adapter in a DNB. Hybridization of the anchor probe and the sequencing probe may occur sequentially or simultaneously. The fidelity of the base call relies in part on the fidelity of the ligase, which generally will not ligate if there is a mismatch close to the ligation site.
The present invention also provides methods in which two or more anchor probes are used in each hybridization-ligation cycle. Figure 25 illustrates a further example, a "double cPAL with overhang" method, in which a first anchor probe 2502 and a second anchor probe 2505 each hybridize to complementary regions of an adapter. In the example illustrated in Figure 25, the first anchor probe 2502 is fully complementary to a first region of adapter 2511, and the second anchor probe 2505 is complementary to a second adapter region adjacent to the hybridization position of the first anchor probe. The second anchor probe also comprises degenerate bases at the terminus that is not adjacent to the first anchor probe. As a result, the second anchor probe is able to hybridize to a region of target nucleic acid 2512 adjacent to adapter 2511 (the "overhang" portion). The second anchor probe is generally too short to remain in a duplex hybridization state on its own, but upon ligation to the first anchor probe it forms a longer anchor probe that is stably hybridized for the subsequent steps. As discussed above for the "single cPAL" method, a pool of sequencing probes 2508 is hybridized 2509 to the adapter-anchor probe duplex and ligated to the terminal 5' or 3' base of the ligated anchor probes; the pool represents each base type at a detection position of the target nucleic acid, and each sequencing probe carries a detectable label that differentiates it from the sequencing probes with other nucleotides at that position. In the example illustrated in Figure 25, the sequencing probes are designed to interrogate the base that is five positions 5' of the ligation point between sequencing probe 2514 and the ligated anchor probes 2513. Because the second anchor probe 2505 has five degenerate bases at its 5' end, it reaches five bases into target nucleic acid 2512, allowing the sequencing probe to interrogate out to a full 10 bases from the junction between target nucleic acid 2512 and adapter 2511.
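The arithmetic behind the reach described for Figure 25 can be sketched as follows. This is an illustrative model only: the function name is hypothetical, and it assumes that a sequencing probe can interrogate up to five positions beyond the ligation point and that anchor configurations with and without a degenerate overhang are used in separate cycles.

```python
def readable_positions(overhang_lengths, probe_reach=5):
    """Target positions that can be interrogated (1 = base next to the adapter).

    overhang_lengths -- number of degenerate anchor bases extending into the
                        target for each anchor configuration that is used
    probe_reach      -- how many positions past the ligation point the
                        sequencing probes can interrogate (assumed value)
    """
    positions = set()
    for overhang in overhang_lengths:
        # The ligation point sits `overhang` bases into the target, so the
        # k-th interrogated base lies at position overhang + k.
        for k in range(1, probe_reach + 1):
            positions.add(overhang + k)
    return sorted(positions)
```

With no overhang, positions 1 to 5 are reachable; adding a configuration with a five-base degenerate overhang extends the reach to position 10 without moving any interrogated base further than five positions from its ligation point.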
In some variations of the double cPAL method described above, if the first anchor probe terminates closer to the end of the adapter, the second anchor probe will be proportionately more degenerate and will therefore have a greater potential not only to ligate to the end of the first anchor probe but also to ligate to other second anchor probes at multiple sites on the DNB. To prevent such ligation artifacts, the second anchor probes can be selectively activated to engage in ligation to a first anchor probe or to a sequencing probe. Such activation methods are described in further detail below, and include, for example, selectively modifying the termini of the anchor probes so that they are able to ligate only to a particular anchor probe or sequencing probe in a particular orientation with respect to the adapter.
As with the double cPAL method described above, it will be appreciated that cPAL methods using three or more anchor probes are also encompassed by the present invention.
In addition, sequencing reactions can be carried out at one or both termini of each adapter. For example, the sequencing reactions can be "unidirectional", with detection occurring either 3' or 5' of the adapter, or they can be "bidirectional", with bases detected at detection positions both 3' and 5' of the adapter. Bidirectional sequencing reactions can occur simultaneously, so that bases on both sides of the adapter are detected at the same time, or sequentially in any order.
Multiple cycles of cPAL (whether single, double, triple, etc.) will identify multiple bases in the regions of the target nucleic acid adjacent to the adapters. In brief, the cPAL methods are repeated to interrogate multiple adjacent bases within a target nucleic acid by cycling anchor probe hybridization and enzymatic ligation reactions with sequencing probe pools designed to detect nucleotides at varying positions removed from the junction between the adapter and the target nucleic acid. In any given cycle, the sequencing probes used are designed such that the identity of one or more bases at one or more positions is correlated with the identity of the label attached to that sequencing probe. Once the ligated sequencing probe (and hence the base at the interrogation position) has been detected, the ligated complex is stripped from the DNB and a new cycle of anchor probe and sequencing probe hybridization and ligation is conducted.
As will be appreciated, in addition to the cPAL methods described above, DNBs of the invention can be used in other sequencing methods, including other sequencing-by-ligation methods as well as other sequencing methods, including without limitation sequencing by hybridization, sequencing by synthesis (including sequencing by primer extension), chained sequencing by ligation of cleavable probes, and the like.
Methods similar to the sequencing methods described above can also be used to detect specific sequences in a target nucleic acid, including single nucleotide polymorphisms (SNPs). In such methods, sequencing probes that hybridize to a particular sequence, such as a sequence containing a SNP, are applied. Such sequencing probes can be differentially labeled to identify which SNP is present in the target nucleic acid. Anchor probes can also be used in combination with such sequencing probes to provide further stability and specificity.
VIIIB. Sequencing
In one aspect, the present invention provides methods for identifying the sequences of DNBs by sequencing by ligation. In one aspect, the present invention provides methods for identifying DNB sequences that use a combinatorial probe-anchor ligation (cPAL) method. Generally, cPAL involves identifying a nucleotide at a detection position in a target nucleic acid by detecting a probe ligation product formed by ligation of an anchor probe and a sequencing probe. Methods of the invention can be used to determine part or all of the sequence of the target nucleic acid contained in a DNB, and in many DNBs that together represent part or all of a genome.
In some aspects, the ligation reactions in cPAL methods according to the present invention are driven only to about 20% completion. "Driven to" a particular level of completion, as used herein, refers to the percentage of individual DNBs, or of monomers within DNBs, that should show a ligation event. Because each base call in the cPAL method is an independent event, it is not necessary for every base in each monomer of each DNB to support a ligation reaction in order for later bases along the sequence to be read in subsequent cycles of hybridization and ligation. As a result, cPAL methods of the present invention require dramatically lower amounts of reagents and time, which significantly decreases cost and increases efficiency. In some embodiments, the ligation reactions in cPAL methods according to the present invention are driven to 20%, 25%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 90% or 100% completion. In other embodiments, the ligation reactions are driven to about 10% to about 100% completion. In still other embodiments, the ligation reactions are driven to about 20%-95%, 30%-90%, 40%-85%, 50%-80% or 60%-75% completion. In some embodiments, the percent completion of the reaction is controlled by altering reagent concentration, temperature, and the length of time the reaction is allowed to run. In other embodiments, the percent completion of cPAL ligation reactions can be estimated by comparing the signals obtained from each DNB in a cPAL ligation reaction with the signals from labeled probes hybridized directly to the anchor probe hybridization sites in the adapters of the DNBs. The signal from labeled probes hybridized directly to the adapters provides an estimate of the number of DNBs with available hybridization sites, and this signal can serve as a baseline against which the signal from ligated probes in a subsequent cPAL reaction is compared to determine the percent completion of the ligation reaction. In some embodiments, the completion rate of the ligation reaction can be varied according to the end use of the information, with some applications requiring a higher level of completion than others.
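The baseline comparison described above amounts to a ratio of two signals. The sketch below is one hypothetical way to express it; the function name, the optional background subtraction, and the assumption that both signals are measured on the same linear intensity scale are illustrative choices rather than details taken from the text.

```python
def ligation_completion_percent(ligation_signal, baseline_signal, background=0.0):
    """Estimate percent completion of a ligation reaction for one DNB.

    ligation_signal -- intensity from ligated, labeled sequencing probes
    baseline_signal -- intensity from a labeled probe hybridized directly to
                       the anchor site (proxy for available hybridization sites)
    background      -- optional intensity to subtract from both readings
    """
    available = baseline_signal - background
    if available <= 0:
        raise ValueError("baseline signal must exceed background")
    fraction = (ligation_signal - background) / available
    # Clamp so that measurement noise cannot report <0% or >100%.
    return 100.0 * max(0.0, min(1.0, fraction))
```

For example, a ligation signal of 250 against a direct-hybridization baseline of 1000 would correspond to roughly 25% completion under these assumptions.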
As discussed further herein, each DNB comprises repeating monomeric units, each monomeric unit comprising one or more adapters and a target nucleic acid. The target nucleic acid comprises a plurality of detection positions. The term "detection position" refers to a position in a target nucleic acid for which sequence information is desired. As will be appreciated by those in the art, a target sequence generally has multiple detection positions for which sequence information is required, for example in the sequencing of complete genomes as described herein. In some cases, for example in SNP analysis, it may be desirable to read just a single SNP in a particular region.
The present invention provides sequencing methods that use a combination of anchor probes and sequencing probes. A "sequencing probe" as used herein means an oligonucleotide designed to provide the identity of a nucleotide at a particular detection position of a target nucleic acid. Sequencing probes hybridize to domains within target sequences; for example, a first sequencing probe may hybridize to a first target domain, and a second sequencing probe may hybridize to a second target domain. The terms "first target domain" and "second target domain", or grammatical equivalents herein, mean two portions of a target sequence within a nucleic acid under examination. The first target domain may be directly adjacent to the second target domain, or the first and second target domains may be separated by an intervening sequence, for example an adapter. The terms "first" and "second" are not meant to confer an orientation of the sequences with respect to the 5'-3' orientation of the target sequence. For example, assuming a 5'-3' orientation of the complementary target sequence, the first target domain may be located either 5' or 3' of the second domain. Sequencing probes can overlap; for example, a first sequencing probe can hybridize to the first 6 bases adjacent to one terminus of an adapter, and a second sequencing probe can hybridize to the 4th through 9th bases from the terminus of the adapter (for example, when an anchor probe has three degenerate bases). Alternatively, a first sequencing probe can hybridize to the 6 bases adjacent to the "upstream" terminus of an adapter and a second sequencing probe can hybridize to the 6 bases adjacent to the "downstream" terminus of the adapter.
Sequencing probes generally comprise a number of degenerate bases together with a specific nucleotide at a specific location within the probe, which queries the detection position (also referred to herein as the "interrogation position").
In general, pools of sequencing probes are used when degenerate bases are employed. That is, a probe having the sequence "NNNANN" is actually a set of probes containing all possible combinations of the four nucleotide bases at the five degenerate positions (i.e., 1024 sequences), with adenine fixed at the remaining position. (As noted herein, this terminology also applies to anchor probes: for example, an anchor probe with "three degenerate bases" is actually a set of anchor probes comprising the sequence corresponding to the anchor site plus all possible combinations at 3 positions, and is therefore a pool of 64 probes.)
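The pool sizes quoted above follow from enumerating every combination at the degenerate positions. A minimal sketch, assuming "N" denotes an equimolar mix of A, C, G and T; the function name and the "GCA" anchor-site bases in the usage note are hypothetical placeholders.

```python
from itertools import product


def expand_pool(pattern):
    """List every concrete sequence represented by a degenerate pattern.

    Each 'N' expands to A, C, G and T; any other character is kept fixed.
    """
    choices = ["ACGT" if ch == "N" else ch for ch in pattern.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

Expanding "NNNANN" yields 4^5 = 1024 sequences, all with A at the fixed position, and expanding a pattern with three degenerate positions such as "GCANNN" yields 4^3 = 64 sequences.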
In some embodiments, for each interrogation position, four differently labeled pools can be combined into a single pool and used in a sequencing step. Thus, in any particular sequencing step, 4 pools are used, each with a different specific base at the interrogation position and each with a different label corresponding to that base. That is, sequencing probes are generally labeled such that a particular nucleotide at a particular interrogation position is associated with a label that differs from the labels of sequencing probes having a different nucleotide at the same interrogation position. For example, four pools can be used in a single step: NNNANN-dye1, NNNTNN-dye2, NNNCNN-dye3 and NNNGNN-dye4, as long as the dyes are optically resolvable. In some embodiments, for example for SNP detection, it may only be necessary to include two pools, because the SNP call will be, for example, either a C or an A. Similarly, some SNPs have three possibilities. Alternatively, in some embodiments, if the reactions are performed sequentially rather than simultaneously, the same dye can be used in different steps: for example, the NNNANN-dye1 probe can be used alone in a reaction, a signal is either detected or not, and the probes are washed away; a second pool, NNNTNN-dye1, can then be introduced.
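The four-pool scheme above implies a simple decoding step: the dye that dominates the signal identifies the probe base at the interrogation position, and the target base is its complement. The sketch below is illustrative only; the dye keys, the `min_ratio` threshold for rejecting ambiguous signals, and the choice to report the target-strand base are assumptions, not part of the described method.

```python
# Pools from the example in the text: one fixed base per dye.
POOLS = {"dye1": "NNNANN", "dye2": "NNNTNN", "dye3": "NNNCNN", "dye4": "NNNGNN"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def call_base(signals, min_ratio=2.0):
    """Return the target base implied by per-dye intensities, or None.

    signals   -- dict mapping each dye name in POOLS to a measured intensity
    min_ratio -- hypothetical threshold: the brightest dye must exceed the
                 runner-up by this factor for a call to be made
    """
    ranked = sorted(signals.items(), key=lambda item: item[1], reverse=True)
    (best_dye, best), (_, second) = ranked[0], ranked[1]
    if best <= 0 or best < min_ratio * second:
        return None  # no signal, or too ambiguous to call
    # The only non-N character in the pool pattern is the probe's fixed base.
    probe_base = next(ch for ch in POOLS[best_dye] if ch != "N")
    # The probe hybridizes to the target, so the target carries the complement.
    return COMPLEMENT[probe_base]
```

With intensities of 900, 40, 55 and 30 for dye1 through dye4, the NNNANN pool has ligated and the target base is called as T; two comparably bright dyes produce no call.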
In any of the sequencing methods described herein, sequencing probes may have a wide range of lengths, including about 3 to about 25 bases. In other embodiments, sequencing probes may have lengths in the range of about 5 to about 20, about 6 to about 18, about 7 to about 16, about 8 to about 14, about 9 to about 12, or about 10 to about 11 bases.
Sequencing probes of the present invention are designed to be complementary, and generally perfectly complementary, to a sequence within the target sequence, so that hybridization occurs between a portion of the target sequence and the probes of the invention. In particular, it is important that the base at the interrogation position and the base at the detection position be perfectly complementary, and that the methods of the invention produce no signal unless this is the case.
In many embodiments, sequencing probes are perfectly complementary to the target sequence to which they hybridize; that is, the experiments are run under conditions that favor the formation of perfect base pairing, as is known in the art. As will be appreciated by those in the art, a sequencing probe that is perfectly complementary to a first domain of a target sequence may be only substantially complementary to a second domain of the same target sequence; that is, the present invention relies in many cases on the use of sets of probes, for example sets of hexamers, that are perfectly complementary to some target sequences and not to others.
In some embodiments, depending on the application, the complementarity between the sequencing probe and the target sequence need not be perfect; there may be any number of base pair mismatches, which will interfere with hybridization between the target sequence and the single-stranded nucleic acids of the present invention. However, if the number of mismatches is so great that no hybridization can occur under even the least stringent hybridization conditions, the sequence is not a complementary target sequence. Thus, "substantially complementary" herein means that the sequencing probes are sufficiently complementary to the target sequences to hybridize under normal reaction conditions. For most applications, however, the conditions are set to favor probe hybridization only if perfect complementarity exists. Alternatively, sufficient complementarity exists to allow the ligase reaction to occur; that is, there may be mismatches in some part of the sequence, but the base at the interrogation position should allow ligation only if there is perfect complementarity at that position.
In some cases, in addition to or instead of using degenerate bases in probes of the invention, universal bases that hybridize to more than one base can be used. For example, inosine can be used. Any combination of these systems and probe components can be employed.
Sequencing probes used in methods of the present invention are usually detectably labeled. "Label" or "labeled" herein means that a compound has at least one element, isotope or chemical compound attached to it to enable detection of the compound. In general, labels of use in the invention include without limitation isotopic labels (which may be radioactive or heavy isotopes), magnetic labels, electrical labels, thermal labels, colored and luminescent dyes, enzymes, and magnetic particles. Dyes of use in the invention may be chromophores, phosphors or fluorescent dyes, which because of their strong signals provide a good signal-to-noise ratio for decoding. Sequencing probes may also be labeled with quantum dots, fluorescent nanobeads or other constructs that comprise more than one molecule of the same fluorophore. Labels comprising multiple molecules of the same fluorophore generally provide a stronger signal and are less sensitive to quenching than labels comprising a single fluorophore molecule. Any discussion herein of a label comprising a fluorophore should be understood to apply to labels comprising single or multiple fluorophore molecules.
Many embodiments of the invention involve the use of fluorescent labels. Dyes suitable for use in the invention include, but are not limited to, fluorescent lanthanide complexes (including those of europium and terbium), fluorescein, rhodamine, tetramethylrhodamine, eosin, erythrosin, coumarin, methylcoumarin, pyrene, malachite green, stilbene, Lucifer Yellow, Cascade Blue, Texas Red, and the other dyes described in the 6th Edition of the Molecular Probes Handbook by Richard P. Haugland, which is hereby expressly incorporated by reference in its entirety for all purposes, and in particular for its teachings regarding labels of use in accordance with the present invention. Commercially available fluorescent dyes for use with any nucleotide for incorporation into nucleic acids include, but are not limited to: Cy3, Cy5 (Amersham Biosciences, Piscataway, New Jersey, USA), fluorescein, tetramethylrhodamine, Texas Red, Cascade Blue, BODIPY FL-14, BODIPY TR-14, Rhodamine Green, Oregon Green 488, BODIPY 630/650, BODIPY 650/665, Alexa Fluor 488, Alexa Fluor 532, Alexa Fluor 568, Alexa Fluor 594, Alexa Fluor 546 (Molecular Probes, Inc., Eugene, OR, USA), Quasar 570, Quasar 670, and Cal Red 610 (BioSearch Technologies, Novato, CA). Other fluorophores available for post-synthetic attachment include, among others, Alexa Fluor 350, Alexa Fluor 532, Alexa Fluor 546, Alexa Fluor 568, Alexa Fluor 594, Alexa Fluor 647, BODIPY 493/503, BODIPY FL, BODIPY R6G, BODIPY 530/550, BODIPY TMR, BODIPY 558/568, BODIPY 564/570, BODIPY 576/589, BODIPY 581/591, BODIPY 630/650, BODIPY 650/665, Cascade Blue, Cascade Yellow, Dansyl, lissamine rhodamine B, Marina Blue, Oregon Green 488, Oregon Green 514, Pacific Blue, rhodamine 6G, rhodamine green, rhodamine red, tetramethylrhodamine, Texas Red (available from Molecular Probes, Inc., Eugene, OR, USA), and Cy2, Cy3.5, Cy5.5 and Cy7 (Amersham Biosciences, Piscataway, NJ, USA, and others). In some embodiments, labels including fluorescein, Cy3, Texas Red, Cy5, Quasar 570, Quasar 670 and Cal Red 610 are used in methods of the present invention.
Labels can be attached to nucleic acids to form the labeled sequencing probes of the present invention using methods known in the art, and can be attached at a variety of locations on the nucleoside. For example, attachment can be at either or both termini of the nucleic acid, at an internal position, or both. In one embodiment, for example, the label may be attached to a ribose of the ribose-phosphate backbone at the 2' or 3' position (the latter for end-labeling) through an amide or amine linkage. Attachment may also be made via a phosphate of the ribose-phosphate backbone, or to the base of a nucleotide. Labels can be attached to one or both ends of a probe, or to any one of the nucleotides along the length of the probe.
Sequencing probes are structured differently depending on the interrogation position desired. For example, in the case of sequencing probes labeled with fluorophores, a single position within each sequencing probe is correlated with the identity of the fluorophore with which the probe is labeled. Generally, the fluorophore molecule is attached to the end of the sequencing probe opposite the end that is ligated to the anchor probe.
" the grappling probe " using in literary composition means and is designed to the oligonucleotides complementary with at least a portion (being called " anchored site " in literary composition) of adapter.As described herein, adapter can contain the multiple anchored sites with multiple grappling Probe Hybridizations.As what further discuss in literary composition, can be designed to and adapter hybridization for grappling probe of the present invention, thereby make end at least one end and adapter of grappling probe flush (" upstream " or " downstream " or the two).In other embodiments, grappling probe can be designed to the target nucleic acid of at least a portion (the first adapter site) of adapter and contiguous adapter at least one nucleotides (" give prominence to ") hybridize.As shown in figure 24, grappling probe 2402 comprises and the sequence of the part complementation of adapter.Grappling probe 2402 also comprises 4 degeneracy bases at an end.This degeneracy allows a part for grappling probe populations to mate wholly or in part with the target nucleic acid sequence of contiguous adapter, and allow grappling probe and adapter to hybridize and put in the target nucleic acid contiguous with adapter, and with the nucleotide identity of the contiguous target nucleic acid of adapter why.Grappling probe end base moves on to and in target nucleic acid, makes more close connection site, base site to be determined, thereby has kept the fidelity of ligase.In general, if probe and the target nucleic acid region complete complementary being hybrid with it, ligase linking probe more efficiently, but the informativeness of ligase connects the distance increase in site and declines along with leaving.Therefore,, in order to reduce and/or to prevent the mistake that between probe and target nucleic acid, incorrect pairing causes that checks order, keeping nucleotides to be detected may be useful with the distance being connected between site of order-checking and 
grappling probe.Make grappling probe put in target nucleic acid by design, can keep the informativeness of ligase, but still can identify the nucleotides being connected with each adapter of greater number.Although the embodiment that Figure 24 shows is the target nucleic acid area hybridization of order-checking probe and adapter one side, be appreciated that the Probe Hybridization that checks order also contains in the present invention to the embodiment of adapter opposite side.In Figure 24, " N " represents degeneracy base, and " B " represents the nucleotides of undetermined sequence.As will be appreciated, in some embodiments, can use universal base and nondegenerate base.
Anchor probes of the invention may comprise any sequence that allows the anchor probe to hybridize to a DNB, generally to an adapter of the DNB. Such anchor probes may comprise a sequence such that, when the anchor probe is hybridized to an adapter, the entire length of the anchor probe is contained within the adapter. In some embodiments, anchor probes may comprise a sequence that is complementary to at least a portion of an adapter and also comprise degenerate bases that are able to hybridize to regions of the target nucleic acid adjacent to the adapter. In some exemplary embodiments, anchor probes are hexamers that comprise 3 bases complementary to an adapter and 3 degenerate bases. In some exemplary embodiments, anchor probes are 8-mers that comprise 3 bases complementary to an adapter and 5 degenerate bases. In other embodiments, particularly when multiple anchor probes are used, a first anchor probe comprises a number of bases complementary to an adapter at one end and degenerate bases at the other end, whereas a second anchor probe consists entirely of degenerate bases and is designed to ligate to the end of the first anchor probe that comprises degenerate bases. It will be appreciated that these are exemplary embodiments, and that a wide range of combinations of known and degenerate bases can be used to produce anchor probes suitable for use in the present invention.
The present invention provides sequencing-by-ligation methods for identifying the sequences of DNBs. In certain aspects, the sequencing-by-ligation methods of the invention include providing different combinations of anchor probes and sequencing probes which, when hybridized to adjacent regions on a DNB, can be ligated to form probe ligation products. The probe ligation products are then detected, which provides the identity of one or more nucleotides in the target nucleic acid. "Ligation" as used herein means any method of joining two or more nucleotides to each other. Ligation can include chemical as well as enzymatic ligation. In general, the sequencing-by-ligation methods discussed herein use enzymatic ligation by ligases. The ligases used in the invention can be the same as or different from the ligases discussed above for creating the nucleic acid templates. Such ligases include without limitation DNA ligase I, DNA ligase II, DNA ligase III, DNA ligase IV, E. coli DNA ligase, T4 DNA ligase, T4 RNA ligase 1, T4 RNA ligase 2, T7 ligase, T3 DNA ligase, thermostable ligases (including without limitation Taq ligase), and the like. As discussed above, sequencing-by-ligation methods often rely on the fidelity of ligases to join only probes that are perfectly complementary to the nucleic acid to which they are hybridized. This fidelity decreases with increasing distance between a base at a particular position in a probe and the ligation point between the two probes. As a result, conventional sequencing-by-ligation methods can identify only a limited number of bases. As described further herein, the present invention increases the number of bases that can be identified by using multiple probe pools.
A variety of hybridization conditions can be used in the sequencing-by-ligation methods and the other sequencing methods discussed herein. These conditions include high, moderate, and low stringency conditions; see, for example, Maniatis et al., Molecular Cloning: A Laboratory Manual, 2d Edition, 1989, and Short Protocols in Molecular Biology, ed. Ausubel, et al., which are incorporated herein by reference. Stringent conditions are sequence-dependent and differ in different circumstances. Longer sequences hybridize specifically at higher temperatures. A more extensive guide to nucleic acid hybridization is found in Tijssen, Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes, "Overview of principles of hybridization and the strategy of nucleic acid assays," (1993). Generally, stringent conditions are selected to be about 5-10 °C lower than the thermal melting point (Tm) of the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target are hybridized to the target sequence at equilibrium (because the target sequences are present in excess, 50% of the probes are occupied at equilibrium at Tm). Stringent conditions can be those in which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other salts), at pH 7.0 to 8.3, and the temperature is at least about 30 °C for short probes (for example, 10 to 50 nucleotides) and at least about 60 °C for long probes (for example, greater than 50 nucleotides). Stringent conditions can also be achieved by adding helix-destabilizing agents such as formamide. As is known in the art, hybridization conditions may also vary when a non-ionic backbone, such as PNA, is used. In addition, cross-linking agents can be added after target binding to cross-link, that is, covalently attach, the two strands of the hybridization complex.
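As a rough illustration of picking a hybridization temperature relative to Tm, the sketch below estimates Tm with the Wallace rule (2 °C per A/T and 4 °C per G/C). That rule is a common approximation for short oligonucleotides and is an assumption here, not something stated in this disclosure. The sketch then reports the window lying 5-10 °C below that estimate. The function names and the example sequence are hypothetical.

```python
def wallace_tm(seq: str) -> float:
    """Estimate Tm in deg C with the Wallace rule (an assumption; short oligos only)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G, T")
    return 2.0 * at + 4.0 * gc


def stringent_window(tm: float) -> tuple:
    """Return (low, high) temperatures lying 10 and 5 deg C below Tm."""
    return (tm - 10.0, tm - 5.0)


tm = wallace_tm("ACGTACGTACGTAC")  # hypothetical 14-mer probe
print(tm, stringent_window(tm))
```

In practice the Tm also depends on salt, formamide, and probe concentration, which this estimate ignores.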
Although many of the sequencing methods are described herein in terms of the nucleic acid templates of the invention, it will be appreciated that, as described herein, these sequencing methods also encompass identifying the sequences of DNBs generated from such nucleic acid templates.
For any sequencing method known in the art and described herein that uses the nucleic acid templates of the invention, the invention provides methods for determining at least about 10 to about 200 bases in a target nucleic acid. In other embodiments, the invention provides methods for determining at least about 20 to about 180, about 30 to about 160, about 40 to about 140, about 50 to about 120, about 60 to about 100, and about 70 to about 80 bases in a target nucleic acid. In still other embodiments, sequencing methods are used to identify at least 5, 10, 15, 20, 25, 30, or more bases adjacent to one or both ends of each adapter in a nucleic acid template of the invention.
Any of the sequencing methods described herein and known in the art can be applied to nucleic acid templates and/or DNBs of the invention in solution, or to nucleic acid templates and/or DNBs disposed on a surface and/or in an array.
VIIIB (i). single cPAL
In one aspect, the invention provides methods for identifying the sequences of DNBs by using combinations of sequencing probes and anchor probes that hybridize to adjacent regions of a DNB and are ligated together, usually by a ligase. Such methods are generally referred to herein as cPAL (combinatorial probe-anchor ligation) methods. In one aspect, the cPAL methods of the invention produce probe ligation products comprising a single anchor probe and a single sequencing probe. cPAL methods that use a single anchor probe are referred to herein as "single cPAL".
Figure 23 shows one embodiment of single cPAL. A monomeric unit 2301 of a DNB comprises a target nucleic acid 2309 and an adapter 2308. An anchor probe 2302 hybridizes to a complementary region on adapter 2308. In the example shown in Figure 23, anchor probe 2302 hybridizes to the adapter region directly adjacent to target nucleic acid 2309, although, as discussed further herein, anchor probes can also be designed to reach into the target nucleic acid adjacent to the adapter by incorporating a desired number of degenerate bases at the terminus of the anchor probe. A pool of differentially labeled sequencing probes 2306 hybridizes to complementary regions in the target nucleic acid. A sequencing probe 2310 that hybridizes to the region of target nucleic acid 2309 adjacent to anchor probe 2302 is ligated to the anchor probe to form a probe ligation product. The efficiency of hybridization and ligation increases when the base at the interrogation position of the probe is complementary to the unknown base at the detection position of the target nucleic acid. This increased efficiency favors ligation of perfectly complementary sequencing probes, rather than probes containing a mismatch, to the anchor probe. As discussed above, ligation is generally accomplished enzymatically using a ligase, but other ligation methods suitable for the invention can also be used. In Figure 23, "N" represents a degenerate base and "B" represents nucleotides of undetermined sequence. It will be appreciated that in some embodiments universal bases can be used in place of degenerate bases.
As also discussed above, the sequencing probes can be oligonucleotides that represent each base type at a specific position and are labeled with a detectable label that distinguishes each sequencing probe from the sequencing probes having other nucleotides at that position. Thus, in the example shown in Figure 23, the sequencing probe 2310 that hybridizes adjacent to anchor probe 2302 and is ligated to that anchor probe identifies the base at the position five bases from the adapter in the target nucleic acid as "G". Multiple cycles of anchor probe and sequencing probe hybridization and ligation can be used to identify a desired number of bases in the target nucleic acid on each side of each adapter in a DNB.
It will be appreciated that, in any of the cPAL methods described herein, hybridization of the anchor probe and the sequencing probe can occur sequentially or simultaneously.
In the embodiment shown in Figure 23, sequencing probe 2310 hybridizes to a region "upstream" of the adapter, but it will be appreciated that sequencing probes can also hybridize "downstream" of the adapter. The terms "upstream" and "downstream" refer to the regions 5' and 3' of the adapter, depending on the orientation of the system. In general, "upstream" and "downstream" are relative terms and are not meant to be limiting; they are used for ease of understanding. As shown in Figure 6, a sequencing probe 607 can hybridize downstream of adapter 604 to identify a nucleotide four bases from the junction between the adapter and target nucleic acid 603. In other embodiments, sequencing probes can hybridize both upstream and downstream of the adapter to identify nucleotides at positions in the nucleic acid on both sides of the adapter. Such embodiments allow multiple points of data to be generated from each adapter in each hybridization-ligation-detection cycle of a single cPAL method.
In some embodiments, probes used in single cPAL methods may have from about 3 to about 20 bases corresponding to the adapter and from about 1 to about 20 degenerate bases (that is, they are a pool of anchor probes). Such anchor probes may also include universal bases, as well as combinations of degenerate and universal bases.
In some embodiments, anchor probes with degenerate bases may have about 1-5 mismatches with respect to the adapter sequence in order to increase the stability of perfect-match hybridization at the degenerate bases. Such a design provides an additional way to control the stability of the ligated anchor and sequencing probes so as to favor those probes that are perfectly matched to the target (unknown) sequence. In other embodiments, several bases in the degenerate portion of the anchor probe may be replaced with abasic sites (that is, sites that have no base on the sugar) or other nucleotide analogs to influence the stability of the hybridized probe, thereby favoring formation of perfect-match hybrids at the distal end of the degenerate portion of the anchor probe, which, as described herein, participates in the ligation reaction with the sequencing probe. Such modifications can be incorporated at internal bases, particularly in anchor probes that comprise a large number (that is, more than 5) of degenerate bases. In addition, as described further below, some of the degenerate or universal bases at the distal end of the anchor probe may be designed to be cleavable after hybridization (for example, by incorporation of uracil) to generate a ligation site for the sequencing probe or for a second anchor probe.
In other embodiments, hybridization of the anchor probes can be controlled by manipulating the reaction conditions, for example the stringency of hybridization. In an exemplary embodiment, the anchor hybridization process may start under conditions of high stringency (higher temperature, lower salt concentration, higher pH, higher formamide concentration, and the like), and these conditions may be relaxed gradually or stepwise. This may require successive hybridization cycles in which different pools of anchor probes are removed and then added back in subsequent cycles. Such methods provide a higher percentage of target nucleic acid occupied by perfectly complementary anchor probes, particularly anchor probes that are perfectly complementary at the distal positions that will be ligated to the sequencing probe. The hybridization time at each stringency condition can also be controlled to obtain a greater number of perfect-match hybrids.
VIIIB (ii). dual cPAL (and beyond)
In other embodiments, the invention provides cPAL methods that use two anchor probes ligated together in each hybridization-ligation cycle. See, for example, U.S. Patent Application Nos. 60/992,485; 61/026,337; 61/035,914; and 61/061,134, each of which is incorporated herein by reference in its entirety, in particular its examples and claims. Figure 25 shows an example of a "dual cPAL" method in which a first anchor probe 2502 and a second anchor probe 2505 hybridize to complementary regions of an adapter; that is, the first anchor probe hybridizes to a first anchor site and the second anchor probe hybridizes to a second anchor site. In the example shown in Figure 25, first anchor probe 2502 is fully complementary to a region of adapter 2511 (the first anchor site), and second anchor probe 2505 is complementary to the adapter region adjacent to the hybridization site of the first anchor probe (the second anchor site). In general, the first and second anchor sites are adjacent.
The second anchor probe may optionally also comprise degenerate bases at the terminus that is not adjacent to the first anchor probe, so that it hybridizes to a region of target nucleic acid 2512 adjacent to adapter 2511. This allows sequence information to be obtained for target nucleic acid bases farther from the adapter/target junction. Again, as outlined herein, a probe that is said to contain "degenerate bases" is in fact a set of probes having every possible combination of sequences at the degenerate positions. For example, if an anchor probe is 9 bases long with 6 known bases and 3 degenerate bases, the anchor probe is in fact a pool of 64 probes.
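The pool size quoted above follows from four possible bases at each degenerate position (4 × 4 × 4 = 64). A minimal sketch that enumerates such a pool is given below; the probe sequence and the function name are hypothetical and serve only to illustrate the counting.

```python
from itertools import product
from typing import List

BASES = "ACGT"


def expand_degenerate(probe: str) -> List[str]:
    """Expand every 'N' in a probe into A, C, G and T, returning the full pool."""
    choices = [BASES if ch == "N" else ch for ch in probe]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical 9-mer: 6 known bases followed by 3 degenerate bases.
pool = expand_degenerate("ACGTCANNN")
print(len(pool))  # 64
```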
The second anchor probe is generally too short to maintain a stable duplex on its own, but once ligated to the first anchor probe it forms a longer anchor probe that is stable in the subsequent steps of the method. In some embodiments, the second anchor probe has about 1 to about 5 bases that are complementary to the adapter and about 5 to about 10 bases of degenerate sequence. As discussed above for the "single cPAL" method, a pool of sequencing probes 2508, which represents each base type at the detection position of the target nucleic acid and is labeled with detectable labels that distinguish each sequencing probe from the sequencing probes having other nucleotides at that position, is hybridized 2509 to the adapter-anchor probe duplex and ligated to the terminal 5' or 3' base of the ligated anchor probes. In the example shown in Figure 25, the sequencing probes are designed to interrogate the base that is five positions 5' of the ligation point between sequencing probe 2514 and the ligated anchor probes 2513. Because second anchor probe 2505 has five degenerate bases at its 5' end, it reaches five bases into target nucleic acid 2512, allowing the sequencing probe to interrogate a position a full 10 bases from the junction between target nucleic acid 2512 and adapter 2511. In Figure 25, "N" represents a degenerate base and "B" represents nucleotides of undetermined sequence. It will be appreciated that in some embodiments universal bases can be used in place of degenerate bases.
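The position arithmetic in the Figure 23 and Figure 25 examples can be written out as a short sketch. The function and parameter names are hypothetical; the only inputs are how far the ligated anchor reaches into the target and which position past the ligation point the sequencing probe interrogates.

```python
def bases_from_junction(anchor_reach: int, interrogated_offset: int) -> int:
    """Distance of the interrogated base from the adapter/target junction.

    anchor_reach: number of bases the ligated anchor extends into the target
                  (its degenerate bases); 0 if the anchor stops at the junction.
    interrogated_offset: position queried by the sequencing probe, counted
                  from the ligation point (1 = immediately adjacent).
    """
    if anchor_reach < 0 or interrogated_offset < 1:
        raise ValueError("anchor_reach must be >= 0 and interrogated_offset >= 1")
    return anchor_reach + interrogated_offset


print(bases_from_junction(0, 5))  # single cPAL as in Figure 23: 5
print(bases_from_junction(5, 5))  # dual cPAL as in Figure 25: 10
```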
In some embodiments, the second anchor probe may have about 5-10 bases corresponding to the adapter and about 5-15 bases, generally degenerate, corresponding to the target nucleic acid. This second anchor probe may be hybridized first under optimal conditions to favor a high percentage of target being occupied with a perfect match at the few bases around the ligation point between the two anchor probes. The second anchor probe can be hybridized and ligated to the first anchor probe and/or the sequencing probe in a single step or sequentially. In some embodiments, the first and second anchor probes may have at their ligation point about 5 to about 50 complementary bases that are not complementary to the adapter, thus forming a "branched" hybrid. This design allows adapter-specific stabilization of the hybridized second anchor probe. In some embodiments, the second anchor probe is ligated to the sequencing probe before hybridization of the first anchor probe; in some embodiments, the second anchor probe is ligated to the first anchor probe before hybridization of the sequencing probe; in some embodiments, the first and second anchor probes and the sequencing probe hybridize simultaneously, and ligation between the first and second anchor probes and between the second anchor probe and the sequencing probe occurs simultaneously or substantially simultaneously; and in other embodiments, the ligation between the first and second anchor probes and the ligation between the second anchor probe and the sequencing probe occur sequentially, in any order. Stringent washing conditions can be used to remove unligated probes (for example, temperature, pH, salt, and a buffer with an optimal concentration of formamide can all be used, with optimal conditions and/or concentrations determined by methods known in the art). Such methods are particularly useful with second anchor probes that have a large number of degenerate bases, which hybridize beyond the corresponding junction between the anchor probe and the target nucleic acid.
In some embodiments, dual cPAL methods use ligation of two anchor probes in which one anchor probe is fully complementary to the adapter and the second anchor probe is fully degenerate (again, in fact a pool of probes). Figure 26 shows an example of such a dual cPAL method, in which first anchor probe 2602 hybridizes to adapter 2611 of DNB 2601. Second anchor probe 2605 is fully degenerate and can therefore hybridize to the unknown nucleotides in the target nucleic acid region adjacent to adapter 2611. The second anchor probe is designed to be too short to maintain a stable duplex on its own, but once it is ligated to the first anchor probe, the resulting longer ligated anchor probe construct provides the stability needed for the subsequent steps of the cPAL process. The fully degenerate second anchor probe may in some embodiments be from about 5 to about 20 bases long. For longer lengths (that is, more than 10 bases), the hybridization and ligation conditions can be altered to reduce the effective Tm of the degenerate anchor probe. The shorter second anchor probe will generally bind non-specifically to the target nucleic acid and the adapter, but its shorter length affects hybridization kinetics, so that in general only those second anchor probes that are perfectly complementary to the region adjacent to the adapter and the first anchor probe are stable enough for the ligase to join the first and second anchor probes, producing the longer ligated anchor probe construct. Non-specifically hybridized second anchor probes do not remain hybridized to the DNB long enough, or stably enough, to be subsequently ligated to any adjacently hybridized sequencing probe. In some embodiments, after ligation of the second and first anchor probes, any unligated anchor probes are removed, usually by a wash step. In Figure 26, "N" represents a degenerate base and "B" represents nucleotides of undetermined sequence. It will be appreciated that in some embodiments universal bases can be used in place of degenerate bases.
In other exemplary embodiments, the first anchor probe is a hexamer comprising 3 bases complementary to the adapter and 3 degenerate bases, whereas the second anchor probe comprises only degenerate bases, and the first and second anchor probes are designed such that only the end of the first anchor probe with the degenerate bases can ligate to the second anchor probe. In other exemplary embodiments, the first anchor probe is an 8-mer comprising 3 bases complementary to the adapter and 5 degenerate bases, and again the first and second anchor probes are designed such that only the end of the first anchor probe with the degenerate bases can ligate to the second anchor probe. It will be appreciated that these are exemplary embodiments and that many combinations of known and degenerate bases can be used in the design of the first and second (and in some embodiments the third and/or fourth) anchor probes.
In a variation of the dual cPAL examples described above, if the first anchor probe terminates closer to the end of the adapter, the second anchor probe will be proportionately more degenerate and will therefore be more likely to ligate not only to the end of the first anchor probe but also to other second anchor probes at multiple sites on the DNB. To prevent such ligation artifacts, the second anchor probes can be selectively activated to limit their ligation to the first anchor probe or to a sequencing probe. Such activation includes selectively modifying the termini of the anchor probes so that they can ligate only to a particular anchor probe or sequencing probe in a particular orientation relative to the adapter. For example, 5' and 3' phosphate groups can be introduced into the second anchor probe, so that the modified second anchor probe can ligate to the 3' end of a first anchor probe hybridized to an adapter, but two second anchor probes cannot ligate to each other (because the 3' ends are phosphorylated, which prevents enzymatic ligation). Once the first and second anchor probes are ligated, the 3' end of the second anchor probe can be activated by removing the 3' phosphate group (for example, with T4 polynucleotide kinase or with phosphatases such as shrimp alkaline phosphatase and calf intestinal phosphatase).
If it is desired that ligation occur between the 3' end of the second anchor probe and the 5' end of the first anchor probe, the first anchor probe can be designed and/or modified to be phosphorylated at its 5' end, and the second anchor probe can be designed and/or modified to have no 5' or 3' phosphate. Again, the second anchor probe will be able to ligate to the first anchor probe but not to other second anchor probes. After ligation of the first and second anchor probes, a 5' phosphate group can be produced on the free terminus of the second anchor probe (for example, by using T4 polynucleotide kinase) to make it available for ligation to sequencing probes in subsequent steps of the cPAL process.
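The end-modification logic of the two preceding paragraphs can be modeled with a deliberately simplified rule: a ligase joins a free 3' hydroxyl to a 5' phosphate, so a 3' phosphate blocks ligation at that end. The sketch below encodes only that rule. The class, field, and function names are hypothetical, and hybridization, orientation on the adapter, and ligase fidelity are all ignored.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Probe:
    name: str
    five_prime_phosphate: bool
    three_prime_phosphate: bool


def can_ligate(upstream: Probe, downstream: Probe) -> bool:
    """True if upstream's 3' end can be joined to downstream's 5' end.

    Simplified rule: the 3' end must carry no phosphate (free hydroxyl)
    and the 5' end must carry a phosphate.
    """
    return (not upstream.three_prime_phosphate) and downstream.five_prime_phosphate


first = Probe("first anchor", five_prime_phosphate=False, three_prime_phosphate=False)
second = Probe("second anchor", five_prime_phosphate=True, three_prime_phosphate=True)
sequencing = Probe("sequencing probe", five_prime_phosphate=True, three_prime_phosphate=False)

print(can_ligate(first, second))    # True: first 3'-OH joins second 5'-phosphate
print(can_ligate(second, second))   # False: blocked by the 3' phosphate

# Removing the 3' phosphate after the anchors are joined "activates" that end.
activated = replace(second, three_prime_phosphate=False)
print(can_ligate(activated, sequencing))  # True
```

The opposite orientation works the same way in this model: a second anchor probe with no phosphates can join its 3' end to a 5'-phosphorylated first anchor probe but cannot join to another copy of itself.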
In some embodiments, the two anchor probes are applied to the DNBs simultaneously. In some embodiments, the two anchor probes are applied to the DNBs sequentially, allowing one of the anchor probes to hybridize to the DNBs before the other. In some embodiments, the two anchor probes are ligated to each other before the second anchor probe is ligated to the sequencing probe. In some embodiments, the anchor probes and the sequencing probe are ligated in a single step. In embodiments in which the two anchor probes and the sequencing probe are ligated in a single step, the second anchor probe can be designed to be stable enough to maintain its position until all three probes (the two anchor probes and the sequencing probe) are in place for ligation. For example, a second anchor probe comprising five bases complementary to the adapter and five degenerate bases for hybridization to the region of the target nucleic acid adjacent to the adapter can be used. Such a second anchor probe may be stable enough to be maintained through low-stringency washing, so that a ligation step is not needed between the second anchor probe hybridization step and the sequencing probe hybridization step. In the subsequent ligation of the sequencing probe to the second anchor probe, the second anchor probe is also ligated to the first anchor probe, resulting in a duplex that is more stable than any of the anchor probes or the sequencing probe alone.
Similar to the dual cPAL methods described above, it will be appreciated that cPAL with three or more anchor probes is also encompassed by the invention. Such anchor probes can be designed, in accordance with the methods described herein and known in the art, to hybridize to regions of the adapter such that one terminus of one of the anchor probes is available for ligation to a sequencing probe hybridized adjacent to the terminal anchor probe. In an exemplary embodiment, three anchor probes are provided: two are complementary to different sequences within the adapter, and the third comprises degenerate bases to hybridize to sequences within the target nucleic acid. In other embodiments, one of the two anchor probes complementary to sequences within the adapter may also comprise one or more degenerate bases at its terminus, allowing that anchor probe to reach into the target nucleic acid for ligation with the third anchor probe. In other embodiments, one of the anchor probes may be fully or partially complementary to the adapter, and the second and third anchor probes are fully degenerate for hybridization to the target nucleic acid. In other embodiments, four or more fully degenerate anchor probes can be ligated sequentially to the three ligated anchor probes to extend reads further into the target nucleic acid sequence. In an exemplary embodiment, a first anchor probe comprising twelve bases complementary to the adapter may ligate with a second, hexameric anchor probe in which all six bases are degenerate. A third anchor probe, also a fully degenerate hexamer, can also ligate to the second anchor probe to reach further into the unknown sequence of the target nucleic acid. Fourth, fifth, sixth, and further anchor probes may also be added to extend even further into the unknown sequence. In still other embodiments, and in accordance with any of the cPAL methods described herein, one or more of the anchor probes may comprise one or more labels that serve to "tag" the anchor probe and/or to identify the particular anchor probe hybridized to an adapter of a DNB.
VIIIB (iii). detecting fluorescently labeled sequencing probes
As discussed above, sequencing probes used in accordance with the invention can be detectably labeled with a wide variety of labels. Although the following description is primarily directed to embodiments in which the sequencing probes are labeled with fluorophores, it will be appreciated that similar embodiments using sequencing probes comprising other kinds of labels are also encompassed by the invention.
Multiple cycles of cPAL (whether single, dual, triple, and so on) will identify multiple bases in the regions of the target nucleic acid adjacent to the adapters. In brief, the cPAL methods are repeated to interrogate multiple bases within a target nucleic acid by cycling the anchor probe hybridization and enzymatic ligation reactions with sequencing probe pools that are designed to detect nucleotides at varying positions away from the junction between the adapter and the target nucleic acid. In any given cycle, the sequencing probes are designed such that the identity of one or more bases at one or more positions is correlated with the identity of the label attached to that sequencing probe. Once the ligated sequencing probe (and hence the base at the interrogation position) is detected, the ligated complex is stripped off the DNB and a new cycle of anchor probe and sequencing probe hybridization and ligation is conducted.
In general, four fluorophores are used to identify the base at the interrogation position of a sequencing probe, and a single base is queried per hybridization-ligation-detection cycle. However, it will be appreciated that embodiments using 8, 16, 20, 24, or more fluorophores are also encompassed by the invention. Increasing the number of fluorophores increases the number of bases that can be identified during any one cycle.
In one exemplary embodiment, a set of 7-mer sequencing probe pools having the following structures is used:
3’-F1-NNNNNNAp
3’-F2-NNNNNNGp
3’-F3-NNNNNNCp
3’-F4-NNNNNNTp
where "p" represents a phosphate available for ligation and "N" represents degenerate bases. F1-F4 represent four different fluorophores, each of which is thus associated with a particular base. This exemplary set of probes allows detection of the base immediately adjacent to the adapter once the sequencing probe is ligated to an anchor probe hybridized to the adapter. To the extent that the ligase used to ligate the sequencing probe to the anchor probe discriminates for complementarity between the base at the interrogation position of the probe and the base at the detection position of the target nucleic acid, the fluorescent signal detected upon hybridization and ligation of the sequencing probe provides the identity of the base at the detection position of the target nucleic acid.
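The association between fluorophore and base in the probe set above can be sketched as a simple lookup. The mapping of F1-F4 to A, G, C, and T follows the four structures listed. Reporting the complement as the base at the detection position of the target is an assumption about the readout convention, based on the ligase favoring a complementary interrogation base. The names used are hypothetical.

```python
# Interrogation base carried by each labeled probe pool, as listed above.
PROBE_BASE_BY_FLUOROPHORE = {"F1": "A", "F2": "G", "F3": "C", "F4": "T"}

# Watson-Crick complements, used to infer the target base (assumed convention).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def probe_base(fluorophore: str) -> str:
    """Base at the interrogation position of the probe carrying this label."""
    return PROBE_BASE_BY_FLUOROPHORE[fluorophore]


def target_base(fluorophore: str) -> str:
    """Base at the detection position of the target, assuming a perfect match."""
    return COMPLEMENT[probe_base(fluorophore)]


for label in ("F1", "F2", "F3", "F4"):
    print(label, probe_base(label), target_base(label))
```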
In some embodiments, a set of sequencing probes comprises three differentially labeled sequencing probes, with a fourth, optional sequencing probe left unlabeled.
After a hybridization-ligation-detection cycle, the anchor probe-sequencing probe ligation products are stripped and a new cycle is begun. In some embodiments, accurate sequence information can be obtained as far as six or more bases from the ligation point between the anchor and sequencing probes and as far as twelve or more bases from the junction between the target nucleic acid and the adapter. The number of bases that can be identified can be increased using the methods described herein, including the use of anchor probes with degenerate ends that reach further into the target nucleic acid.
Image acquisition can be performed using methods known in the art, including commercial imaging software packages such as Metamorph (Molecular Devices, Sunnyvale, CA). Data extraction can be performed by a series of binaries written in, for example, C/C++, and base calling and read mapping can be performed by a series of Matlab and Perl scripts.
In an exemplary embodiment, DNBs disposed on a surface undergo a cycle of cPAL as described herein, in which the sequencing probes used are labeled with four different fluorophores, each corresponding to a particular base at the interrogation position of the probe. To determine the identity of the base of each DNB disposed on the surface, each field of view ("frame") is imaged at four different wavelengths corresponding to the four fluorescently labeled sequencing probes. All images from each cycle are saved in a cycle directory, in which the number of images is four times the number of frames (when four fluorophores are used). The cycle image data can then be saved in a directory structure organized for downstream processing.
In some embodiments, data extraction relies on two types of image data: bright-field images, which demarcate the positions of all DNBs on the surface, and the sets of fluorescence images acquired during each sequencing cycle. Data extraction software can be used to identify all objects in the bright-field images and then, for each such object, to compute an average fluorescence value for each sequencing cycle. For any given cycle there are four data points, corresponding to the four images taken at different wavelengths to query whether the base is A, G, C, or T. These raw data points (also referred to herein as "base calls") are consolidated, yielding a discontinuous sequencing read for each DNB.
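A minimal sketch of the consolidation step is shown below, assuming the mean fluorescence values for one DNB have already been extracted per cycle and per channel. Picking the brightest channel is an illustrative simplification, not necessarily the base-calling rule used in practice. The intensity values and function names are made up for the example.

```python
def call_base(mean_intensity: dict) -> str:
    """Return the base whose channel has the highest mean fluorescence."""
    return max(mean_intensity, key=mean_intensity.get)


def call_read(cycles: list) -> dict:
    """Consolidate per-cycle calls for one DNB.

    cycles: list of (position, {base: mean fluorescence}) tuples, where
    position is the interrogated position relative to the adapter.
    Positions need not be contiguous, so the read may have gaps.
    """
    return {position: call_base(channels) for position, channels in cycles}


# Hypothetical values for three cycles interrogating positions 1, 2 and 5.
cycles = [
    (1, {"A": 812.0, "C": 40.0, "G": 55.0, "T": 31.0}),
    (2, {"A": 22.0, "C": 35.0, "G": 640.0, "T": 48.0}),
    (5, {"A": 30.0, "C": 701.0, "G": 44.0, "T": 29.0}),
]
print(call_read(cycles))  # {1: 'A', 2: 'G', 5: 'C'}
```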
The identified bases can then be assembled to provide sequence information for the target nucleic acid and/or to identify the presence of particular sequences in the target nucleic acid. In some embodiments, the identified bases are assembled into a complete sequence through alignment of overlapping sequences obtained from multiple sequencing cycles performed on multiple DNBs. As used herein, the term "complete sequence" refers to the sequence of partial or whole genomes as well as partial or whole target nucleic acids. In other embodiments, assembly methods use algorithms that can "piece together" overlapping sequences to provide a complete sequence. In still other embodiments, reference tables are used to assist in assembling the identified sequences into a complete sequence. A reference table may be compiled from existing sequencing data for the organism of choice. For example, human genome data can be accessed through the National Center for Biotechnology Information at ftp.ncbi.nih.gov/refseq/release, or through the J. Craig Venter Institute at http://www.jcvi.org/researchhuref/. All or a subset of human genome information can be used to create a reference table for particular sequencing queries. In addition, specific reference tables can be constructed from empirical data derived from specific populations, including genetic sequences from humans of specific ethnicities, geographic heritage, or religious or culturally defined populations, because variation within the human genome may slant the reference data depending on the origin of the information it contains.
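To illustrate what piecing together overlapping sequences can mean in its simplest form, the sketch below merges two reads when a suffix of the first exactly matches a prefix of the second. This is a toy exact-overlap merge with hypothetical names and sequences, not the assembly algorithm of this disclosure. Real assemblers tolerate sequencing errors and handle many reads at once.

```python
from typing import Optional


def overlap_length(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of left that equals a prefix of right."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def merge_reads(left: str, right: str, min_len: int = 3) -> Optional[str]:
    """Merge two reads on their exact overlap, or return None if none exists."""
    k = overlap_length(left, right, min_len)
    return left + right[k:] if k else None


print(merge_reads("ACGTTGCA", "TGCAGGT"))  # ACGTTGCAGGT
print(merge_reads("AAAA", "CCCC"))         # None
```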
In any of the embodiments of the invention discussed herein, a population of nucleic acid templates and/or DNBs may comprise a number of target nucleic acids sufficient to substantially cover a whole genome or a whole target polynucleotide. As used herein, "substantially covers" means that the amount of nucleotides (that is, target sequences) analyzed contains an equivalent of at least two copies of the target polynucleotide, or in another aspect at least ten copies, or in another aspect at least twenty copies, or in another aspect at least 100 copies. Target polynucleotides may include DNA fragments (including genomic DNA fragments and cDNA fragments) and RNA fragments. Guidance for the step of reconstructing target polynucleotide sequences can be found in the following references, which are incorporated herein by reference: Lander et al., Genomics, 2:231-239 (1988); Vingron et al., J. Mol. Biol., 235:1-12 (1994); and similar references.
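The coverage definition above reduces to a ratio: total nucleotides analyzed divided by the length of the target polynucleotide, compared against a minimum number of copies. A short sketch under that reading follows. The function names and the example numbers are hypothetical, and counting every analyzed base toward coverage is a simplification.

```python
def fold_coverage(read_lengths: list, target_length: int) -> float:
    """Total analyzed bases expressed as copies of the target polynucleotide."""
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    return sum(read_lengths) / target_length


def substantially_covers(read_lengths: list, target_length: int,
                         min_copies: float = 2.0) -> bool:
    """True if the analyzed bases amount to at least min_copies of the target."""
    return fold_coverage(read_lengths, target_length) >= min_copies


reads = [70] * 100  # hypothetical: 100 reads of 70 bases each
print(fold_coverage(reads, 3500))                         # 2.0
print(substantially_covers(reads, 3500))                  # True
print(substantially_covers(reads, 3500, min_copies=10))   # False
```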
VIIIB (iv). sets of probes
As will be appreciated, different combinations of sequencing probes and anchor probes can be used in accordance with the various cPAL methods described above. The following descriptions of sets of probes (also referred to herein as "pools of probes") for use in the invention are exemplary embodiments, and it will be appreciated that the invention is not limited to these combinations.
In one aspect, sets of probes are designed to identify nucleotides at positions a specific distance from an adapter. For example, certain sets of probes can be used to identify bases up to 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more positions away from the adapter. As discussed above, anchor probes with degenerate bases at one terminus can be designed to reach into the target nucleic acid adjacent to an adapter, allowing sequencing probes to ligate farther from the adapter and thus provide the identity of bases farther from the adapter.
In an exemplary embodiment, a set of probes comprises at least two anchor probes designed to hybridize to adjacent regions of an adapter. In one embodiment, the first anchor probe is fully complementary to a region of the adapter, and the second anchor probe is complementary to the adjacent region of the adapter. In some embodiments, the second anchor probe comprises one or more degenerate nucleotides that extend into, and hybridize with nucleotides of, the target nucleic acid adjacent to the adapter. In an exemplary embodiment, the second anchor probe comprises at least 1-10 degenerate bases. In further exemplary embodiments, the second anchor probe comprises 2-9, 3-8, 4-7 or 5-6 degenerate bases. In still further exemplary embodiments, the second anchor probe comprises one or more degenerate bases at one or both ends and/or within an internal region of its sequence.
In other embodiments, a set of probes can also comprise one or more pools of sequencing probes for determining the bases at one or more detection positions in the target nucleic acid. In one embodiment, the probe set comprises enough different pools of sequencing probes to identify about 1 to about 20 positions in the target nucleic acid. In further exemplary embodiments, the probe set comprises enough pools of sequencing probes to identify about 2 to about 18, about 3 to about 16, about 4 to about 14, about 5 to about 12, about 6 to about 10, or about 7 to about 8 positions in the target nucleic acid.
In further exemplary embodiments, 10 pools of labeled or tagged probes are used in accordance with the invention. In other embodiments, a probe set can comprise two or more anchor probes with different sequences. In still other embodiments, a probe set comprises 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or more anchor probes with different sequences.
In a further exemplary embodiment, a set of probes is provided that comprises one or more pools of sequencing probes and three anchor probes. The first anchor probe is complementary to a first region of an adapter, the second anchor probe is complementary to a second region of the adapter, and the first and second regions are adjacent. The third anchor probe comprises three or more degenerate nucleotides and is able to hybridize to nucleotides in the target nucleic acid adjacent to the adapter. In some embodiments, the third anchor probe can also be complementary to a third region of the adapter, and this third region can be adjacent to the second region, such that the second anchor probe is flanked by the first and third anchor probes.
In some embodiments, sets of anchor and/or sequencing probes comprise the individual probes at different concentrations, and the concentrations depend in part on the degenerate bases that the anchor probes may contain. For example, probes with lower hybridization stability, such as probes with a greater number of A's and/or T's, can be present at relatively higher concentrations to offset their lower stability. In other embodiments, these differences in relative concentration are achieved by preparing smaller pools of probes independently and then mixing the independently prepared pools in the proper amounts.
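The concentration adjustment described above can be sketched as follows. This is an illustration only: the linear weighting by A/T fraction, the boost parameter and the example probe are assumptions introduced here, since the specification does not give a formula.

```python
from itertools import product


def expand_degenerate(probe: str) -> list:
    """Expand each degenerate position 'N' into all four bases."""
    choices = ["ACGT" if base == "N" else base for base in probe]
    return ["".join(combo) for combo in product(*choices)]


def at_fraction(sequence: str) -> float:
    """Fraction of A and T bases, used here as a crude proxy for low stability."""
    return sum(base in "AT" for base in sequence) / len(sequence)


def relative_concentrations(probes, boost: float = 1.0) -> dict:
    """Assign each probe a share of the pool that grows with its A/T content.

    The weight 1 + boost * (A/T fraction) is a hypothetical choice.
    """
    weights = {p: 1.0 + boost * at_fraction(p) for p in probes}
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}


pool = expand_degenerate("GCNN")  # hypothetical anchor end with two degenerate bases
shares = relative_concentrations(pool)
```

In practice the weights would be calibrated empirically; the sketch only shows that A/T-rich members of a degenerate pool receive a larger share than G/C-rich members.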
VIIIB (v). Improving the specificity and fidelity of ligation reactions
In some aspects, the ligation reactions used in the cPAL methods of the invention are modified to include elements that improve the fidelity of ligation of two nucleic acids hybridized adjacently on a target nucleic acid. In some embodiments, the method comprises adding an agent that generally increases the stability of double-stranded nucleic acids by preferentially binding to double-stranded nucleic acids (a "double-stranded binding moiety"). In some embodiments, an intercalator is added to the ligation reaction mixture. As used herein, an "intercalating agent" or "intercalator" refers to an agent that can insert between adjacent base pairs in a nucleic acid duplex, for example, an agent that binds preferentially to double-stranded nucleic acids relative to single-stranded nucleic acids. Similarly, as will be appreciated by those in the art, minor groove and major groove binding moieties can also be used.
In specific aspects, intercalators include, but are not limited to, ethidium bromide, dihydroethidium, ethidium homodimer-1, ethidium homodimer-2, acridine, propidium iodide, YOYO-1 or TOTO-1, proflavine, daunomycin, adriamycin, POPO-1, POPO-3, BOBO-1, BOBO-3, psoralen, actinomycin D, SYBR Green or thalidomide, and can be fluorescent or non-fluorescent. In a very specific aspect, the intercalator is ethidium bromide. Preferred ranges of ethidium bromide for use in the invention are from 0.1 ng/μl to about 20.0 ng/μl, more preferably from about 2.5 ng/μl to about 15.0 ng/μl, and still more preferably from about 5.0 ng/μl to about 10.0 ng/μl.
In another embodiment, the invention provides a method for determining the identity of a base at a position in a target nucleic acid, comprising: providing a library construct comprising a target nucleic acid and at least one adapter, wherein the target nucleic acid has a position to be interrogated; hybridizing an anchor probe to the adapter in the library construct; hybridizing a pool of sequencing probes to the target nucleic acid; ligating the sequencing probes to the anchor probe in the presence of a double-stranded binding moiety such as an intercalator, wherein sequencing probes that are complementary to the target nucleic acid ligate efficiently to the anchor probe; and determining which sequencing probe has ligated to the anchor probe, thereby determining the sequence of the target nucleic acid. In specific aspects, unligated sequencing probes are discarded before the sequence is determined. In preferred aspects, these steps are repeated until the desired number of bases has been determined.
In another embodiment, the invention provides a method for synthesizing nucleic acid library constructs, comprising: obtaining target nucleic acids; ligating a first adapter to the target nucleic acids to generate first library constructs, wherein the first adapter comprises a recognition site for a restriction endonuclease that binds within the adapter but cleaves within the target nucleic acid; amplifying the first library constructs; circularizing the first library constructs; digesting the library constructs with the restriction endonuclease that recognizes the recognition site in the first adapter; and ligating a second adapter to the library constructs to generate second library constructs, wherein one or more of these steps includes an intercalator in the reaction mixture. In specific aspects, these steps can be repeated until the desired number of interspersed adapters has been ligated to the target nucleic acid.
In another embodiment, the invention provides a method for improving the selectivity of combined polymerase and ligation reactions, comprising: hybridizing a nucleic acid to a primer; performing an extension reaction, in which a polymerase extends the primer along the hybridized nucleic acid, to generate a primer extension product; and ligating one end of the primer extension product to a double-stranded nucleic acid, wherein the extension and ligation reactions are carried out in the presence of an intercalator. In specific aspects, the double-stranded nucleic acid ligated to the primer extension product is the opposite end of the primer extension product. In other aspects, the primer extension product is ligated to a separate nucleic acid. In one specific aspect, the separate nucleic acid is an adapter. This method can be used to prepare nucleic acid libraries as described above.
As discussed in further detail herein, in some embodiments the targets on an array are hybridized with anchor probes, and excess anchor is then washed away. The array is then hybridized with a mixture of T4 DNA ligase and 9-mer fluorescent sequencing probes labeled at the 3' or 5' end. In the presence of T4 ligase, the 9-mer sequencing probes ligate to the anchor oligonucleotide, forming a stable hybrid in which the fluorophore is associated with the anchor probe and the target nucleic acid in a sequence-specific manner. A double-stranded binding moiety, for example ethidium bromide, is optionally included in this ligation reaction and can be present at various concentrations, including about 1 ng/μl to 10 ng/μl. Alternative intercalators include, but are not limited to, dihydroethidium, ethidium homodimer-1, ethidium homodimer-2, acridine, propidium iodide, YOYO-1 or TOTO-1, proflavine, daunomycin, adriamycin and thalidomide.
The signal intensity is affected by the concentration of intercalator present in the reaction. For example, increasing the concentration of ethidium bromide in the ligation reaction from 1 ng/μl to 10 ng/μl can decrease the overall signal intensity of all four fluorescent probes. The decrease in signal intensity may reflect destabilization of duplex DNA by ethidium bromide, and may explain the mechanism of action for the increased color purity: adding a mismatch to a duplex that is already destabilized causes greater destabilization than adding a mismatch to a stable duplex. The reduced signal intensity is not in itself detrimental and can be compensated for by appropriate sensitivity of the detection instrument.
VIIIB (vi). Other sequencing methods
In one aspect, the methods and compositions of the present invention are used in combination with techniques such as those described in WO2007120208, WO2006073504, WO2007133831 and US2007099208, and in U.S. Patent Application Nos. 60/992,485; 61/026,337; 61/035,914; 61/061,134; 61/116,193; 61/102,586; 12/265,593; 12/266,385; 11/938,096; 11/981,804; 11/981,797; 11/981,793; 11/981,767; 11/981,761; 11/981,730; 11/981,685; 11/981,661; 11/981,607; 11/981,605; 11/927,388; 11/927,356; 11/679,124; 11/541,225; 10/547,214; 11/451,692; and 11/451,691, all of which are incorporated herein by reference in their entirety for all purposes, and in particular for all teachings related to sequencing, especially sequencing of concatemers.
In another aspect, the sequences of DNBs are identified using methods known in the art, including, but not limited to, hybridization-based methods, such as those disclosed in Drmanac, U.S. Pat. Nos. 6,864,052; 6,309,824; and 6,401,267; and Drmanac et al., U.S. patent publication 2005/0191656; sequencing-by-synthesis methods, e.g., Nyren et al., U.S. Pat. No. 6,210,891; Ronaghi, U.S. Pat. No. 6,828,100; Ronaghi et al. (1998), Science, 281:363-365; Balasubramanian, U.S. Pat. No. 6,833,246; Quake, U.S. Pat. No. 6,911,345; Li et al., Proc. Natl. Acad. Sci., 100:414-419 (2003); Smith et al., PCT publication WO2006/074351; and ligation-based methods, e.g., Shendure et al. (2005), Science, 309:1728-1739; Macevicz, U.S. Pat. No. 6,306,597; each of which is incorporated herein by reference in its entirety for all purposes, and in particular for its teachings regarding the figures, legends and accompanying text describing the compositions, methods of using the compositions and methods of making the compositions, particularly with respect to sequencing.
In some embodiments, the nucleic acid templates of the invention, and the DNBs generated from those templates, are used in sequencing-by-synthesis methods. Sequencing-by-synthesis methods performed with the nucleic acid templates of the invention are more efficient than conventional sequencing-by-synthesis methods, which use nucleic acids that do not comprise multiple interspersed adapters. Rather than a single long read, the nucleic acid templates of the invention allow multiple shorter reads, each starting at one of the adapters in the template. Such short reads consume fewer labeled dNTPs and therefore save on reagent costs. In addition, sequencing-by-synthesis reactions can be performed on DNB arrays, which provide a high density of sequencing targets as well as multiple copies of the monomeric units. Such arrays provide detectable signals at the single-molecule level while providing an increased amount of sequence information, because most or all of the DNB monomeric units are extended without adversely affecting the sequencing process. The high density of the arrays also reduces reagent costs; in some embodiments, reagent costs are about 30 to about 40% lower than for conventional sequencing-by-synthesis methods. In some embodiments, when the interspersed adapters of the nucleic acid templates of the invention are inserted about 30 to about 100 bases apart, they provide a way to combine about two to about ten standard reads. In such embodiments, the newly synthesized strands do not need to be stripped for subsequent sequencing cycles, which allows a single DNB array to be used for about 100 to about 400 sequencing-by-synthesis cycles.
In some embodiments of the invention, unchained cPAL sequencing methods are extended to include two or more ligation events involving sequencing probes. For example, after detection of a first ligation product, which contains a first sequencing probe ligated to a construct containing one or more anchor probes, a second sequencing probe can be hybridized to the nucleic acid target at a position adjacent to the first ligation product and ligated to the first sequencing probe. The second sequencing probe can then be detected. It will be appreciated that such hybridization-ligation cycles can be carried out with multiple sequencing probes. The resulting ligation products can subsequently be removed from the target, and another round of cPAL sequencing can be performed as described herein. In such embodiments, an unchained cPAL sequencing method is partially combined with a chained method that uses one or more additional sequencing probes. It will be appreciated that each new sequencing probe can be detected using methods known in the art. For example, if the sequencing probes are labeled with fluorophores, the attached fluorophore can be cleaved off after each ligated sequencing probe is detected, so that the second sequencing probe added to the "chain" can be detected without interference from the label on the first sequencing probe.
VIIIC. Two-phase sequencing
One aspect of the present invention provides a "two-phase" sequencing method, also referred to herein as "shotgun sequencing". Such methods are described in U.S. Patent Application No. 12/325,922, filed December 1, 2008, which is incorporated herein by reference in its entirety for all purposes, and in particular for all teachings related to two-phase or shotgun sequencing.
Generally, two-phase sequencing methods for use in the present invention comprise the following steps: (a) sequencing the target nucleic acid to produce a primary target nucleic acid sequence that comprises one or more sequences of interest; (b) synthesizing a plurality of target-specific oligonucleotides, wherein each of the plurality of target-specific oligonucleotides corresponds to at least one of the sequences of interest; (c) providing a library of fragments of the target nucleic acid (or constructs that comprise such fragments and that may further comprise, for example, adapters and other sequences as described herein) that hybridize to the plurality of target-specific oligonucleotides; and (d) sequencing the library of fragments (or constructs that comprise such fragments) to produce a secondary target nucleic acid sequence. For example, in order to close gaps caused by missing sequence, or to resolve low-confidence base calls, in a primary sequence of genomic DNA (such as human genomic DNA), the number of target-specific oligonucleotides synthesized for these methods may be from about 10,000 to about 1,000,000. The invention therefore contemplates the use of at least about 10,000 target-specific oligonucleotides, or about 25,000, or about 50,000, or about 100,000, or about 20,000, or about 50,000, or about 100,000, or about 200,000 or more target-specific oligonucleotides.
In saying that the plurality of target-specific oligonucleotides "corresponds to" at least one of the sequences of interest, it is meant that such target-specific oligonucleotides are designed to hybridize to the target nucleic acid in proximity to, including but not limited to adjacent to, the sequence of interest, such that there is a high likelihood that a fragment of the target nucleic acid that hybridizes to such an oligonucleotide will include the sequence of interest. Such target-specific oligonucleotides are therefore useful in hybrid capture methods to produce a library of fragments enriched for the sequences of interest, as sequencing primers for sequencing the sequence of interest, as amplification primers for amplifying the sequence of interest, or for other purposes.
In shotgun sequencing and other sequencing methods according to the present invention, after assembly of the sequencing reads, one of ordinary skill in the art will readily recognize that the assembled sequence contains gaps, or that there is low confidence in one or more bases or stretches of bases at particular sites in the sequence. Sequences of interest that may include such gaps, low-confidence sequence, or simply a different sequence at a particular location (i.e., a change of one or more nucleotides in the target sequence) can also be identified by comparing the primary target nucleic acid sequence to a reference sequence.
According to one embodiment of such methods, sequencing the target nucleic acid to produce a primary target nucleic acid sequence comprises computerized input of sequence reads and computerized assembly of the sequence reads to produce the primary target nucleic acid sequence. In addition, design of the target-specific oligonucleotides can be computerized, and such computerized synthesis of the target-specific oligonucleotides can be integrated with the computerized input and assembly of the sequence reads and the design of the target-specific oligonucleotides. This is especially helpful because the number of target-specific oligonucleotides to be synthesized can be in the tens of thousands or hundreds of thousands for the genomes of higher organisms such as humans. The invention thus provides automated integration of the process of generating an oligonucleotide pool from the determined sequences and the regions identified for further processing. In some embodiments, a computer-driven program uses the identified regions, together with determined sequence near or adjacent to those regions, to design oligonucleotides for isolating and/or generating new fragments that cover these regions. The oligonucleotides can then be used as described herein to isolate fragments from the first sequencing library, from a precursor of the first sequencing library, from a different sequencing library generated from the same target nucleic acid, directly from the target nucleic acid, and the like. In other embodiments, this automated integration of identifying regions for further analysis and isolating or generating the second library defines the sequences of the oligonucleotides within the oligonucleotide pool and directs the synthesis of these oligonucleotides.
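The automated identification of regions and design of oligonucleotides described above can be sketched as follows. This is a minimal illustration under stated assumptions: the quality threshold, the choice of the immediately upstream flank as the oligonucleotide sequence, the function names and the example data are all hypothetical, and strand selection and hybridization properties are not modeled.

```python
def find_regions(calls: str, quals, min_qual: int = 20):
    """Return half-open (start, end) intervals of no-calls or low-confidence calls."""
    regions, start = [], None
    for i, (base, qual) in enumerate(zip(calls, quals)):
        bad = base == "N" or qual < min_qual
        if bad and start is None:
            start = i
        elif not bad and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(calls)))
    return regions


def design_oligos(calls: str, regions, oligo_len: int = 20) -> dict:
    """Use determined sequence adjacent to each region as a candidate oligonucleotide.

    A region gets None when the upstream flank is too short or contains no-calls.
    """
    designs = {}
    for start, end in regions:
        flank = calls[max(0, start - oligo_len):start]
        usable = len(flank) == oligo_len and "N" not in flank
        designs[(start, end)] = flank if usable else None
    return designs


# Hypothetical primary sequence with a 3-base gap and one low-confidence call.
calls = "ACGTTGCAAGGCTTAACCGGTCAT" + "NNN" + "GATTACAGATTACA"
quals = [40] * len(calls)
quals[30] = 5
regions = find_regions(calls, quals)
designs = design_oligos(calls, regions)
```

The 20-base default follows the about 20 to about 30 base oligonucleotide length recited elsewhere in this section; a fuller design step would also consider the downstream flank and more than one oligonucleotide per region.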
In some embodiments of the two-phase sequencing methods of the invention, a release step is performed after the hybrid capture process, and in other aspects of the technology, an amplification step is performed before the second sequencing process.
In other embodiments, some or all of the regions are identified in the identifying step by comparing the determined sequence with a reference sequence. In some aspects, the second shotgun sequencing library is isolated using a pool of oligonucleotides comprising oligonucleotides based on the reference sequence. Also, in some aspects, the pool of oligonucleotides comprises at least 1,000 oligonucleotides of different sequence; in other aspects, the pool of oligonucleotides comprises at least 10,000, 25,000, 50,000, 75,000, or 100,000 or more oligonucleotides of different sequence.
In some aspects of the invention, one or more of the sequencing processes used in the two-phase sequencing method is performed by sequencing by ligation; in other aspects, one or more of the sequencing processes is performed by sequencing by hybridization or sequencing by synthesis.
In some aspects of the invention, about 1 to about 30% of the complex target nucleic acid is identified as needing to be re-sequenced in the second phase of the method; in other aspects, about 1 to about 10% of the complex target nucleic acid is identified as needing to be re-sequenced in the second phase of the method. In some aspects, the coverage for the identified percentage of the complex target nucleic acid is about 25x to 100x.
In other aspects, 1 to about 10 target-specific selection oligonucleotides are defined and synthesized for each target nucleic acid region that is re-sequenced in the second phase of the method; in other aspects, about 3 to about 6 target-specific selection oligonucleotides are defined for each target nucleic acid region that is re-sequenced in the second phase of the method.
In still other aspects of the technology, the target-specific selection oligonucleotides are identified and synthesized by an automated process, in which the process that identifies regions of the complex nucleic acid that are missing nucleic acid sequence or that have low-confidence nucleic acid sequence, and that defines the sequences of the target-specific selection oligonucleotides, communicates with oligonucleotide synthesis software and hardware in order to synthesize the target-specific selection oligonucleotides. In other aspects of the technology, the target-specific selection oligonucleotides are about 20 to about 30 bases in length, and in some aspects are unmodified.
Not all regions of the complex target nucleic acid that are identified for further analysis may actually exist. One reason for a predicted lack of coverage in a region may be that a region expected to be present in the complex target nucleic acid is not actually present (e.g., the region may be deleted or rearranged in the target nucleic acid), and thus not every oligonucleotide produced from the pool will isolate a fragment for inclusion in the second shotgun sequencing library. In some embodiments, at least one oligonucleotide is designed and created for each region identified for further analysis. In other embodiments, an average of three or more oligonucleotides is provided for each region identified for further analysis. It is a feature of the invention that the pool of oligonucleotides can be used directly to create the second shotgun sequencing library by polymerase extension of the oligonucleotides using templates derived from the target nucleic acid. It is another feature of the invention that the pool of oligonucleotides can be used directly to create amplicons via circle-dependent replication. It is another feature of the invention that the methods can provide sequence information that identifies absent regions of interest, e.g., predicted regions that were identified for further analysis but that do not actually exist because of, for example, a deletion or rearrangement.
The embodiments of the two-phase sequencing method described above can be used in combination with any of the nucleic acid constructs and sequencing methods described herein and known in the art.
VIIID. SNP detection
In other embodiments, the methods and compositions discussed above can be used to detect specific sequences in nucleic acid constructs such as DNBs. In particular, cPAL methods that use sequencing probes and anchor probes can be used to detect polymorphisms or sequences associated with genetic mutations, including single nucleotide polymorphisms (SNPs). For example, to detect the presence of a SNP, two sets of differentially labeled sequencing probes can be used, such that detection of one probe but not the other indicates whether the polymorphism is present in the sample. Such sequencing probes can be used together with anchor probes in methods similar to the cPAL methods described above to further improve the specificity and efficiency of SNP detection.
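The two-probe readout described above can be sketched as a simple classification of two label signals. The detection threshold, the signal values and the returned category names are hypothetical and serve only to illustrate the logic of detecting one probe but not the other.

```python
def classify_snp(signal_allele_1: float, signal_allele_2: float,
                 threshold: float) -> str:
    """Classify a site from the signals of two differentially labeled probe sets."""
    seen_1 = signal_allele_1 >= threshold
    seen_2 = signal_allele_2 >= threshold
    if seen_1 and seen_2:
        return "both alleles detected"
    if seen_1:
        return "allele 1 only"
    if seen_2:
        return "allele 2 only"
    return "no call"


# Hypothetical background-corrected intensities with a threshold of 100.
result = classify_snp(850.0, 40.0, threshold=100.0)
```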
VIIIE. Long fragment read (LFR) methods
In accordance with any of the sequencing methods described above, the invention also provides long fragment read (LFR) methods, which provide longer read lengths for haplotype phasing.
In an exemplary embodiment of the LFR process, genomic DNA of approximately 100 kbp is used as the input sample, because the length of the input DNA affects the interval over which phasing can be performed. This high molecular weight genomic DNA is aliquoted into a 384-well plate such that each well receives approximately 0.1 haploid genomes (10% of a haploid genome). The DNA fragments in each well are amplified, and the amplified DNA is fragmented to about 500 bp. The DNA in each well is ligated to adapter arms containing a unique identifier, and the ligated DNA from all 384 wells is then pooled into a single tube. This pooled DNA serves as the input for the library construction and sequencing methods described in detail in the preceding sections. Together, the 384 wells contain approximately 40 fragments spanning each position in the genome, of which about 20 fragments come from the maternal chromosome and about 20 from the paternal chromosome. At a ratio of 0.1 genome equivalents per well, there is a 10% chance that fragments in a well will overlap, and a 50% chance that any such overlapping fragments derive from separate parental chromosomes. Thus, approximately 95% of the data derive from a single parental chromosome. The data from each well are then mapped, and reads that map close to one another are grouped by their unique identifier, enabling reconstruction of the approximately 100 kbp haploid fragments in each well. The unique identifiers can be identified by any number of methods, including, but not limited to, hybridization with uniquely labeled probes complementary to the identifier sequence and sequencing of the identifier sequence.
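The figures quoted above follow from simple arithmetic, which can be checked with the short sketch below. The function name is hypothetical; the 10% overlap probability and the 50% probability of separate parental origin are taken as stated in the text rather than derived.

```python
def lfr_expectations(wells: int = 384,
                     genome_equivalents_per_well: float = 0.1) -> dict:
    """Reproduce the expected fragment counts and parental purity for LFR."""
    total_equivalents = wells * genome_equivalents_per_well  # about 40 fragments per position
    per_parent = total_equivalents / 2                       # about 20 maternal, 20 paternal
    p_overlap = genome_equivalents_per_well                  # 10% chance of overlap, as stated
    p_mixed = p_overlap * 0.5                                # overlap from separate parents
    return {
        "fragments_per_position": total_equivalents,
        "fragments_per_parent": per_parent,
        "p_mixed_parent": p_mixed,
        "p_single_parent": 1.0 - p_mixed,
    }


expected = lfr_expectations()
```

With the default values, the sketch gives 38.4 fragments per position (about 40), 19.2 per parent (about 20), and 95% of the data from a single parental chromosome.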
The above method allows the parental chromosomes to be resolved independently. SNPs in the sample are used to distinguish 100 kbp fragments that derive from the maternal chromosome from those that derive from the paternal chromosome. The 40 starting genome equivalents discussed above yield, on average, a 100 kbp maternal fragment starting every 5 kbp and a 100 kbp paternal fragment starting every 5 kbp. Two consecutive maternal fragments therefore overlap each other by about 95 kbp on average. In the human genome there are generally 50 to 150 single nucleotide polymorphisms (SNPs) in 95 kbp, many of which are heterozygous in any given sample. These SNPs are used to distinguish maternal fragments from paternal fragments. By chaining overlapping fragments together, larger maternal and paternal fragments (up to whole chromosomes) can be built independently. This method increases the effective read length from about 35 bp to more than 100 kbp.
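One way the chaining of overlapping fragments through shared heterozygous SNPs might be sketched is shown below. This is a greedy illustration under stated assumptions, not the method of the specification: fragments are represented as hypothetical mappings from SNP position to observed allele, sites are assumed biallelic and error-free, and the min_shared parameter is introduced here.

```python
def same_haplotype(frag_a: dict, frag_b: dict, min_shared: int = 2):
    """Compare two fragments at shared SNP positions.

    Returns True (all alleles agree), False (all disagree), or None when the
    evidence is insufficient or conflicting.
    """
    shared = set(frag_a) & set(frag_b)
    if len(shared) < min_shared:
        return None
    matches = sum(frag_a[pos] == frag_b[pos] for pos in shared)
    if matches == len(shared):
        return True
    if matches == 0:
        return False
    return None


def phase(fragments, min_shared: int = 2):
    """Greedily assign fragments to two growing haplotypes, seeded by the first."""
    haplotypes = [dict(fragments[0]), {}]
    labels = [0]
    for frag in fragments[1:]:
        label = None
        for index in (0, 1):
            verdict = same_haplotype(haplotypes[index], frag, min_shared)
            if verdict is True:
                label = index
            elif verdict is False:
                label = 1 - index
            if label is not None:
                break
        labels.append(label)
        if label is not None:
            haplotypes[label].update(frag)
    return labels, haplotypes


# Hypothetical fragments: keys are SNP positions in kbp, values are alleles.
fragments = [
    {10: "A", 20: "C", 30: "G"},
    {20: "T", 30: "A", 40: "C"},
    {30: "G", 40: "T", 50: "A"},
    {40: "C", 50: "G", 60: "T"},
]
labels, haplotypes = phase(fragments)
```

In this example the third fragment shares only one SNP with the first haplotype, so it is placed by its disagreement with the second haplotype, which illustrates how overlapping fragments extend each parental haplotype step by step.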
VIIIF. Base calling, mapping and assembly
Data generated using any of the sequencing methods described herein can be analyzed and assembled using methods known in the art.
In some embodiments, four images, one for each color, are generated for each queried genomic position. The position of each spot in the images and the resulting intensities for each of the four colors are determined by adjusting for crosstalk between dyes and for background intensity. A quantitative model can be fit to the resulting four-dimensional dataset. A base is called for a given spot, with a quality score that reflects how well the four intensities fit the model.
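A much simplified sketch of calling a base from four color intensities is given below. It applies background subtraction only and omits the crosstalk correction and the quantitative model mentioned above; the score definition, the min_score cutoff and the example intensities are assumptions made for illustration.

```python
CHANNELS = "ACGT"  # hypothetical assignment of one color channel per base


def call_base(intensities, background, min_score: float = 0.3):
    """Call the brightest channel after background subtraction.

    The score is the margin between the two brightest channels divided by the
    total corrected signal; spots below min_score are reported as no-calls.
    """
    corrected = [max(0.0, raw - bg) for raw, bg in zip(intensities, background)]
    total = sum(corrected)
    if total == 0:
        return "N", 0.0
    ranked = sorted(range(4), key=lambda i: corrected[i], reverse=True)
    best, second = ranked[0], ranked[1]
    score = (corrected[best] - corrected[second]) / total
    if score < min_score:
        return "N", score
    return CHANNELS[best], score


base, score = call_base([120.0, 900.0, 150.0, 110.0], [100.0] * 4)
```

A spot with one dominant channel receives a score near 1, whereas a spot with two channels of similar intensity scores near 0 and is left uncalled.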
In other embodiments, the read data are encoded in a compact binary format that includes both the called base and a quality score. The quality score correlates with base accuracy. Analysis software, including sequence assembly software, can use the score to determine the contribution of evidence from individual called bases.
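The specification does not define the binary layout. Purely as a hypothetical example of a compact encoding, the sketch below packs each called base into two bits and its quality score into six bits of a single byte; no-calls are not represented in this sketch.

```python
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def pack_calls(bases: str, quals) -> bytes:
    """Pack each (base, quality) pair into one byte: 2 bits base, 6 bits quality."""
    packed = bytearray()
    for base, qual in zip(bases, quals):
        if not 0 <= qual <= 63:
            raise ValueError("quality must fit in six bits (0-63)")
        packed.append((BASE_TO_CODE[base] << 6) | qual)
    return bytes(packed)


def unpack_calls(data: bytes):
    """Recover the base string and the list of quality scores."""
    bases = "".join(CODE_TO_BASE[byte >> 6] for byte in data)
    quals = [byte & 0x3F for byte in data]
    return bases, quals


packed = pack_calls("ACGT", [0, 10, 40, 63])
```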
The structure of the DNBs generally results in "gapped" reads (Figure 51). Gap sizes vary (usually by +/- 1 base) because of the variability inherent in enzyme digestion. Because of the random-access nature of cPAL, reads occasionally contain an unread base (a "no-call") in an otherwise high-quality DNB. Reads are mate-paired as described in further detail herein.
Mapping software capable of aligning read data to a reference sequence can be used to map data generated by the sequencing methods described herein. Such mapping software is generally tolerant of small variations from the reference sequence (such as those caused by individual genomic variation), of read errors, and of unread bases. This property generally allows direct reconstruction of SNPs. To support assembly of larger variations, including large-scale structural changes or regions of dense variation, each arm of a DNB can be mapped separately, with mate-pairing constraints applied after alignment.
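The mapping behavior described above can be sketched with a brute-force example: each arm is mapped separately while tolerating mismatches and no-calls, and the mate-pairing constraint (an expected gap with +/- 1 base of variability) is applied afterwards. The function names, parameters and sequences are hypothetical, and practical mapping software would use an index rather than a linear scan.

```python
def map_read(read: str, reference: str, max_mismatches: int = 1):
    """Return (start, mismatches) for every placement within the mismatch limit.

    A no-call ('N') in the read matches any reference base.
    """
    hits = []
    for start in range(len(reference) - len(read) + 1):
        mismatches = 0
        for r, g in zip(read, reference[start:start + len(read)]):
            if r != "N" and r != g:
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        else:
            hits.append((start, mismatches))
    return hits


def map_arms(left_arm: str, right_arm: str, reference: str,
             expected_gap: int, gap_tolerance: int = 1,
             max_mismatches: int = 1):
    """Map both arms independently, then keep pairs whose gap fits the constraint."""
    pairs = []
    for left_start, _ in map_read(left_arm, reference, max_mismatches):
        left_end = left_start + len(left_arm)
        for right_start, _ in map_read(right_arm, reference, max_mismatches):
            gap = right_start - left_end
            if abs(gap - expected_gap) <= gap_tolerance:
                pairs.append((left_start, right_start, gap))
    return pairs


reference = "TTGACCGTAGGCTAACGTTCAGGATCCA"  # hypothetical reference
pairs = map_arms("GANCG", "AACGT", reference, expected_gap=5)
```

In this example each arm has more than one candidate placement when one mismatch is allowed, and only the pairing constraint leaves a single consistent placement.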
In some embodiments, assembly of the sequence reads can use software that supports the DNB read structure (mate-paired, gapped reads with no-call bases) to generate a haploid genome assembly, which in some embodiments can use sequence information generated by the LFR methods of the invention to phase heterozygous sites.
Methods of the invention can be used to reconstruct novel segments that are not present in a reference sequence. Algorithms that combine evidential (Bayesian) reasoning with de Bruijn graph-based algorithms can be used in some embodiments. In some embodiments, statistical models that are empirically calibrated to each dataset can be used, allowing all read data to be used without pre-filtering or data trimming. Large-scale structural variations (including, without limitation, deletions, translocations and the like) and copy number variations can also be detected by making use of mate-paired reads.
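The de Bruijn graph component mentioned above can be illustrated with the minimal sketch below. It builds the graph from reads and follows unambiguous edges to form a contig; the Bayesian evidence model is not shown, and the function names, the value of k and the reads are hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k: int):
    """Build a de Bruijn graph whose nodes are (k-1)-mers and edges are k-mers."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" in kmer:  # skip k-mers that contain a no-call
                continue
            graph[kmer[:-1]].append(kmer[1:])
    return graph


def walk_unique_path(graph, start: str) -> str:
    """Extend a contig while the current node has exactly one distinct successor."""
    contig, node, seen = start, start, {start}
    while True:
        successors = set(graph.get(node, ()))
        if len(successors) != 1:
            break  # dead end or branch point
        node = successors.pop()
        if node in seen:
            break  # stop rather than loop around a cycle
        seen.add(node)
        contig += node[-1]
    return contig


graph = build_de_bruijn(["ACGTTG", "GTTGCA", "TGCAAT"], k=4)
contig = walk_unique_path(graph, "ACG")
```

Because the contig is built from the reads alone, the same traversal can recover sequence that is absent from a reference; branch points, where the walk stops, are where an evidence-based model would be needed to choose between alternatives.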
IX. Exemplary embodiments
One aspect of the present invention provides a method for determining a sequence in a target nucleic acid. The method comprises the steps of: (a) providing a sequencing template that comprises a fragment of the target nucleic acid and an adapter, wherein the adapter comprises at least a first anchor site; (b) hybridizing an anchor probe to the anchor site, wherein the anchor probe comprises a region complementary to the adapter site and three or more degenerate bases for binding to the target nucleic acid sequence; (c) hybridizing a pool of sequencing probes in order to determine the sequence of one or more nucleotides at defined positions relative to the adapter, wherein the sequencing probes are detectably labeled to identify the presence of a particular base; (d) ligating the anchor probe and the sequencing probe; and (e) detecting the sequencing probe, thereby determining a sequence in the target nucleic acid.
In a further aspect, and in accordance with the above, the present invention provides a method for determining the identity of a first nucleotide at a detection position of a target sequence, wherein the target sequence comprises a plurality of detection positions. The method comprises the steps of: (a) providing a surface with a plurality of concatemers, wherein each concatemer comprises a plurality of monomers and each monomer comprises: (i) a first target domain of the target sequence, comprising a first set of target detection positions; and (ii) at least a first adapter comprising (1) a first anchor site and (2) an adjacent second anchor site; (b) hybridizing a first anchor probe to the first anchor site; (c) hybridizing a second anchor probe to the second anchor site, wherein the second anchor probe also hybridizes to sequence outside the second anchor site; (d) hybridizing at least a first sequencing probe to the first target domain, wherein the first sequencing probe comprises: (i) a first probe domain complementary to the target domain; (ii) a unique nucleotide at a first interrogation position; and (iii) a label; under conditions such that, if the unique nucleotide is complementary to the first nucleotide, the sequencing probe hybridizes to the concatemer; (e) ligating the anchor probes and the sequencing probe; and (f) identifying the first nucleotide.
In one embodiment, and in accordance with the above, the present invention provides a method for determining the identity of a first nucleotide at a detection position of a target sequence, in which a set of sequencing probes is contacted with a surface comprising a plurality of concatemers. In this embodiment, each sequencing probe comprises: (a) a first probe domain complementary to the target domain; (b) a unique nucleotide at a first interrogation position; and (c) a label, wherein each label of the set corresponds to the unique nucleotide.
In a further embodiment, and in accordance with the above, each monomer in a concatemer comprises a plurality of adapters.
In a further embodiment, and in accordance with the above, at least one adapter in a concatemer comprises at least one Type II endonuclease recognition site.
In a still further embodiment, and in accordance with the above, the steps of hybridizing the first anchor probe to the first anchor site, hybridizing the second anchor probe to the second anchor site, hybridizing at least a first sequencing probe to the first target domain, and ligating the anchor probes and the sequencing probe are repeated to identify a second nucleotide at a second detection position.
In a further embodiment, and in accordance with the above, the second anchor probe comprises a set of second anchor probes containing at least three degenerate bases, wherein the degenerate bases hybridize to sequence outside the second anchor site.
In a further embodiment, and in accordance with the above, the second anchor probe comprises at least one terminus that can be selectively activated for ligation.
In a still further embodiment, and in accordance with the above, the surface with the plurality of concatemers is a functionalized surface. In a further embodiment, the surface is functionalized with functional moieties selected from amines, silanes, and hydroxyls.
In a further embodiment, and in accordance with the above, the surface comprises a plurality of spatially discrete regions, and these regions comprise the immobilized concatemers.
In a further embodiment, and in accordance with the above, the concatemers are immobilized on the surface using capture probes.
In a further embodiment, and in accordance with the above, genomic nucleic acids are fragmented to form the target sequences.
In a further embodiment, and in accordance with the above, the target sequence is a genomic nucleic acid sequence.
In a further embodiment, and in accordance with the above, the genomic nucleic acid sequence is human.
In one aspect, and in accordance with the above, the present invention provides kits for use with sequencing templates, which can include the probe sets described herein. In general, kits of the invention can include anchor probe pairs, anchor probe pairs together with an additional anchor probe adjacent to the target nucleic acid in a template, and sequencing probes for determining the base at specific positions in a nucleic acid template. Such kits can further comprise adapters for generating the nucleic acid templates used in the present invention.
In one aspect, and in accordance with the above, the present invention provides a nucleic acid sequencing system comprising 10 pools of labeled or tagged probes, sets of anchor probes comprising 4 or more probes of different sequence, sets of anchor probes containing 3 or more degenerate bases, and ligase. In a further embodiment, the nucleic acid sequencing system further comprises a reagent for denaturing anchor probes, sequencing probes, and ligated sequencing and anchor probes from the nucleic acid template.
Examples
Example 1: Preparation of DNBs
The following is an exemplary protocol for preparing DNBs (also referred to herein as "replicons") from nucleic acid templates of the invention, wherein the nucleic acid templates comprise target nucleic acid interspersed with one or more adapters. Single-stranded linear nucleic acid templates are first amplified with a phosphorylated 5' primer and a biotinylated 3' primer, yielding double-stranded linear nucleic acid templates tagged with biotin.
First, streptavidin magnetic beads are prepared by resuspending MagPrep-Streptavidin beads (Novagen Part No. 70716-3) in 1x bead binding buffer (150 mM NaCl and 20 mM Tris, pH 7.5, in nuclease-free water) in nuclease-free microcentrifuge tubes. The tubes are placed in a magnetic tube rack, the magnetic particles are allowed to clear, and the supernatant is removed and discarded. The beads are then washed twice in 800 μl 1x bead binding buffer and resuspended in 80 μl 1x bead binding buffer. Amplified nucleic acid templates (also referred to herein as "library constructs") from the PCR reaction are brought to a volume of 60 μl, and 20 μl 4x bead binding buffer is added to the tube. The nucleic acid templates are then added to the tubes containing the MagPrep beads, mixed gently, and incubated at room temperature for 10 minutes, and the MagPrep beads are allowed to clear. The supernatant is removed and discarded. The MagPrep beads (now mixed with the amplified library constructs) are then washed twice in 800 μl 1x bead binding buffer. After washing, the MagPrep beads are resuspended in 80 μl 0.1 N NaOH, mixed gently, incubated at room temperature, and allowed to clear. The supernatant is removed and added to a fresh nuclease-free tube. 4 μl of 3 M sodium acetate (pH 5.2) is added to each supernatant and mixed gently.
Next, 420 μl of PBI buffer (supplied with QIAprep PCR Purification Kits) is added to each tube, and the samples are mixed, applied to QIAprep Miniprep columns (Qiagen Part No. 28106) placed in 2 ml collection tubes, and centrifuged at 14,000 rpm for 1 minute. The flow-through is discarded, 0.75 ml of PE buffer (supplied with QIAprep PCR Purification Kits) is added to each column, and the columns are centrifuged for an additional 1 minute. The flow-through is again discarded. The columns are transferred to fresh tubes and 50 μl of EB buffer (supplied with QIAprep PCR Purification Kits) is added. The columns are centrifuged at 14,000 rpm for 1 minute to elute the single-stranded nucleic acid templates. The quantity of each sample is then measured.
Circularization of single-stranded templates using CircLigase: First, 10 pmol of the single-stranded linear nucleic acid templates is transferred to a nuclease-free PCR tube. Nuclease-free water is added to bring the reaction volume to 30 μl, and the samples are kept on ice. Next, 4 μl 10x CircLigase Reaction Buffer (Epicentre Part No. CL4155K), 2 μl 1 mM ATP, 2 μl 50 mM MnCl2, and 2 μl CircLigase (100 U/μl) (collectively, 4x CircLigase Mix) are added to each tube, and the samples are incubated at 60 °C for 5 minutes. A further 10 μl of 4x CircLigase Mix is added to each tube, and the samples are incubated at 60 °C for 2 hours and at 80 °C for 20 minutes, then held at 4 °C. The quantity of each sample is then measured.
Removal of residual linear DNA from CircLigase reactions by exonuclease digestion: First, 30 μl of each CircLigase sample is added to a nuclease-free PCR tube, and then 3 μl water, 4 μl 10x Exonuclease Reaction Buffer (New England Biolabs Part No. B0293S), 1.5 μl Exonuclease I (20 U/μl, New England Biolabs Part No. M0293L), and 1.5 μl Exonuclease III (100 U/μl, New England Biolabs Part No. M0206L) are added to each sample. The samples are incubated at 37 °C for 45 minutes. Next, 75 mM EDTA (pH 8.0) is added to each sample, and the samples are incubated at 85 °C for 5 minutes and then cooled to 4 °C. The samples are then transferred to clean nuclease-free tubes. Next, 500 μl of PN buffer (supplied with QIAprep PCR Purification Kits) is added to each tube and mixed, and the samples are applied to QIAprep Miniprep columns (Qiagen Part No. 28106) placed in 2 ml collection tubes and centrifuged at 14,000 rpm for 1 minute. The flow-through is discarded, 0.75 ml of PE buffer (supplied with QIAprep PCR Purification Kits) is added to each column, and the columns are centrifuged for an additional 1 minute. The flow-through is again discarded. The columns are transferred to fresh collection tubes and 40 μl of EB buffer (supplied with QIAprep PCR Purification Kits) is added. The columns are centrifuged at 14,000 rpm for 1 minute to elute the single-stranded library constructs. The quantity of each sample is then measured.
Circle dependent replication for DNB preparation: The nucleic acid templates are subjected to circle dependent replication to produce DNBs comprising concatemers of target nucleic acid and adapter sequences. 40 fmol of exonuclease-treated single-stranded circles is added to nuclease-free PCR strip tubes, and water is added to a final volume of 10.0 μl. Next, 10 μl of 2x Primer Mix (7 μl water, 2 μl 10x phi29 Reaction Buffer (New England Biolabs Part No. B0269S), and 1 μl primer (2 μM)) is added to each tube, and the tubes are incubated at room temperature for 30 minutes. Next, 20 μl of phi29 Mix (14 μl water, 2 μl 10x phi29 Reaction Buffer (New England Biolabs Part No. B0269S), 3.2 μl dNTP mix (2.5 mM each of dATP, dCTP, dGTP, and dTTP), and 0.8 μl phi29 DNA polymerase (10 U/μl, New England Biolabs Part No. M0269S)) is added to each tube. The tubes are incubated at 30 °C for 120 minutes. The tubes are then removed, and 75 mM EDTA (pH 8.0) is added to each sample. The quantity of circle dependent replication product is then measured.
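By way of a worked check only, the final concentrations in the replication reaction described above can be computed from the listed volumes. The sketch below assumes that the dNTP quantity given as 3.2 is in μl (which makes the phi29 Mix total 20 μl) and that the final volume is the sum of the three additions before EDTA; the function and variable names are hypothetical.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Volumes in microlitres, taken from the protocol text.
template_ul = 10.0                      # 40 fmol circles brought to 10.0 ul
primer_mix_ul = 7 + 2 + 1               # water + 10x buffer + primer
phi29_mix_ul = 14 + 2 + 3.2 + 0.8       # water + 10x buffer + dNTP mix + polymerase
total_ul = template_ul + primer_mix_ul + phi29_mix_ul

dntp_each_mM = final_concentration(2.5, 3.2, total_ul)    # 2.5 mM each in the stock
primer_nM = final_concentration(2000.0, 1.0, total_ul)    # 2 uM stock = 2000 nM
buffer_x = final_concentration(10.0, 2 + 2, total_ul)     # 10x buffer, 2 ul added twice
template_nM = 40.0 / total_ul                             # fmol per ul equals nM

print(total_ul, dntp_each_mM, primer_nM, buffer_x, template_nM)
```

Under these assumptions the reaction volume is 40 μl with 1x phi29 Reaction Buffer, which is consistent with the two mixes being designed as 2x and 1x additions.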
Determining DNB quality: Once the quantity of DNBs has been determined, the quality of the DNBs is assessed by examining color purity. The DNBs are suspended in replicon dilution buffer (0.8x phi29 Reaction Buffer (New England Biolabs Part No. B0269S) and 10 mM EDTA, pH 8.0), and various dilutions are added to the lanes of a flowslide and incubated at 30 °C for 30 minutes. The flowslides are then washed with buffer, and a probe solution containing four different random 12-mer probes labeled with Cy5, Texas Red, FITC, or Cy3 is added to each lane. The flowslides are transferred to a heat block pre-heated to 30 °C and incubated at 30 °C for 30 minutes. The flowslides are then imaged using Imager 3.2.1.0 software. The quantity of circle dependent replication product is then measured.
Example 2: Single and double c-PAL
Fully degenerate second anchor probes of different lengths were tested in a two anchor probe detection system. The combinations used were: 1) standard one anchor ligation using an anchor that binds to the adapter adjacent to the target nucleic acid and a 9-mer sequencing probe, reading at position 4 from the adapter; 2) two anchor ligation using the same first anchor, a second anchor comprising a degenerate 5-mer, and a 9-mer sequencing probe, reading at position 9 from the adapter; 3) two anchor ligation using the same first anchor, a second anchor comprising a degenerate 6-mer, and a 9-mer sequencing probe, reading at position 10 from the adapter; and 4) two anchor ligation using the same first anchor, a second anchor comprising a degenerate 8-mer, and a 9-mer sequencing probe, reading at position 12 from the adapter. 1 μM of the first anchor probe and 6 μM of the degenerate second anchor probe were combined with T4 DNA ligase in ligase reaction buffer and applied to the surface of the reaction slide for 30 minutes, after which unreacted probes and reagents were washed from the slide. A second reaction mix containing ligase and fluorescent probes of the type 5' F1-NNNNNBNNN, 5' F1-NNBNNNNNN, 5' F1-NNNBNNNNN, or 5' F1-NNNNBNNNN was then introduced. F1 represents one of four fluorophores, N represents any one of the four bases A, G, C, or T introduced at random, and B represents the one of the four bases A, G, C, or T that is specifically associated with the fluorophore. After ligation for 1 hour, unreacted probes and reagents were washed from the slide, and the fluorescence associated with each DNA target was measured.
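The probe notation above can be made concrete with a short sketch. It assumes only what the text states: N is any of four bases and B is the single interrogated base tied to the fluorophore. The function names are hypothetical, and positions are counted from the 5' end of the pattern as written, which is not necessarily the distance from the adapter.

```python
def interrogated_offset(pattern):
    """1-based position of the interrogated base B, counted from the 5' end as written."""
    if pattern.count("B") != 1:
        raise ValueError("pattern must contain exactly one interrogated base B")
    return pattern.index("B") + 1


def pool_size(pattern):
    """Distinct sequences sharing one fluorophore: four choices at every N position."""
    return 4 ** pattern.count("N")


patterns = ["NNNNNBNNN", "NNBNNNNNN", "NNNBNNNNN", "NNNNBNNNN"]
for p in patterns:
    # A complete four-colour set has four such pools, one per identity of B.
    print(p, interrogated_offset(p), pool_size(p), 4 * pool_size(p))
```

Each 9-mer pattern therefore represents 4^8 sequences per fluorophore and 4^9 sequences across the four-colour set.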
Figure 27 shows the signal intensities associated with degenerate second anchor probes of different lengths in this system; signal intensity decreased as the length of the second anchor probe increased. As can be seen in Figure 28, the fit scores for these intensities also decreased with the length of the degenerate second anchor, but reasonable fit scores were still obtained through the read of base 10.
Figures 29 and 30 show the effect of time on the one anchor probe method and the two anchor probe method. The standard anchor and the degenerate 5-mer were each used with a 9-mer sequencing probe to read positions 4 and 9 from the adapter, respectively. Although the intensity levels differed more in the two anchor probe method, both the standard one anchor method and the two anchor probe method showed comparable fit scores at both times, each exceeding 0.8.
Effect of degenerate second anchor probe length on signal intensity and fit score: Different combinations of first and second anchor probes, in which the second anchor probe varied in length and composition, were used to compare the effect of the degenerate probe on signal intensity and fit score when identifying a base 5' of the adapter. The standard one anchor probe method was compared for signal intensity and fit score with two anchor probe methods that used either partially degenerate probes having some region of complementarity to the adapter or fully degenerate second anchor probes. Degenerate second anchor probes from 5-mers to 9-mers were used at the same concentration, and two of these (the 6-mer and the 7-mer) were also tested at 4x concentration. Second anchor probes comprising two nucleotides complementary to the adapter and different lengths of degenerate nucleotides at the 3' end were also tested at the first concentration. Each reaction used the same set of four sequencing probes to identify the nucleotide at the read position in the target nucleic acid.
The combinations used in the experiment were as follows:
Reaction 1: 1 μM of a 12-base first anchor probe
No second anchor probe
Read position: 2 nt from the adapter end
Reaction 2: 1 μM of a 12-base first anchor probe
20 μM of a second anchor probe of 5 degenerate bases
Read position: 7 nt from the adapter end
Reaction 3: 1 μM of a 12-base first anchor probe
20 μM of a second anchor probe of 6 degenerate bases
Read position: 8 nt from the adapter end
Reaction 4: 1 μM of a 12-base first anchor probe
20 μM of a second anchor probe of 7 degenerate bases
Read position: 9 nt from the adapter end
Reaction 5: 1 μM of a 12-base first anchor probe
20 μM of a second anchor probe of 8 degenerate bases
Read position: 10 nt from the adapter end
Reaction 6: 1 μM of a 12-base first anchor probe
20 μM of a second anchor probe of 9 degenerate bases
Read position: 11 nt from the adapter end
Reaction 7: 1 μM of a 12-base first anchor probe
80 μM of a second anchor probe of 6 degenerate bases
Read position: 8 nt from the adapter end
Reaction 8: 1 μM of a 12-base first anchor probe
80 μM of a second anchor probe of 7 degenerate bases
Read position: 9 nt from the adapter end
Reaction 9: 1 μM of a 12-base first anchor probe
20 μM of a 6 nt second anchor probe (4 degenerate bases, 2 known bases)
Read position: 6 nt from the adapter end
Reaction 10: 1 μM of a 12-base first anchor probe
20 μM of a 7 nt second anchor probe (5 degenerate bases, 2 known bases)
Read position: 7 nt from the adapter end
Reaction 11: 1 μM of a 12-base first anchor probe
20 μM of an 8 nt second anchor probe (6 degenerate bases, 2 known bases)
Read position: 8 nt from the adapter end
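The read positions listed above follow a simple pattern: each degenerate base in the second anchor probe shifts the read position one nucleotide further from the adapter end, while the known (adapter-complementary) bases do not. The sketch below, offered only as an illustration, encodes the eleven reactions and checks that rule; the `Reaction` record and the function name are hypothetical, and the rule itself is inferred from the listed values rather than stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    number: int
    degenerate_bases: int      # 0 when no second anchor probe is used
    known_bases: int           # adapter-complementary bases in the second anchor
    second_anchor_uM: float
    read_position_nt: int      # distance from the adapter end, as listed


REACTIONS = [
    Reaction(1, 0, 0, 0.0, 2),
    Reaction(2, 5, 0, 20.0, 7),
    Reaction(3, 6, 0, 20.0, 8),
    Reaction(4, 7, 0, 20.0, 9),
    Reaction(5, 8, 0, 20.0, 10),
    Reaction(6, 9, 0, 20.0, 11),
    Reaction(7, 6, 0, 80.0, 8),
    Reaction(8, 7, 0, 80.0, 9),
    Reaction(9, 4, 2, 20.0, 6),
    Reaction(10, 5, 2, 20.0, 7),
    Reaction(11, 6, 2, 20.0, 8),
]


def expected_read_position(degenerate_bases, one_anchor_position=2):
    """Inferred rule: one-anchor read position plus the number of degenerate bases."""
    return one_anchor_position + degenerate_bases


mismatches = [r.number for r in REACTIONS
              if expected_read_position(r.degenerate_bases) != r.read_position_nt]
print(mismatches)
```

All eleven listed read positions agree with this rule, including the partially degenerate probes of Reactions 9 to 11.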
Figures 31 and 32 show the results for the different combinations of anchor probes and sequencing probes. The figures indicate that a length of six (a 6-mer) is preferred for the degenerate second anchor probe, whether it is fully or partially degenerate. The signal intensity obtained with the fully degenerate 6-mer at the higher concentration was similar to that obtained with the partially degenerate 6-mer (Figure 31). All data had good fit scores (see Figure 32) except Reaction 6, which used the longest second anchor and also showed the lowest signal intensity of all the reactions performed (Figure 31).
Effect of first anchor probe length on signal intensity and fit score: Different combinations of first and second anchor probes, in which the first anchor probe varied in length, were used to compare the effect of first anchor probe length on signal intensity and fit score when identifying a base 3' of the adapter. The standard one anchor probe method was compared for signal intensity and fit score with two anchor probe methods that used either partially degenerate probes having some region of complementarity to the adapter or fully degenerate second anchor probes. Each reaction used the same set of four sequencing probes to identify the nucleotide at the read position in the target nucleic acid. The combinations used in the experiment were as follows:
Reaction 1: 1 μM of a 12-base first anchor probe
No second anchor probe
Read position: 5 nt from the adapter end
Reaction 2: 1 μM of a 12-base first anchor probe
20 μM of a second anchor probe of 5 degenerate bases
Read position: 10 nt from the adapter end
Reaction 3: 1 μM of a 10-base first anchor probe
20 μM of a 7 nt second anchor probe (5 degenerate bases, 2 known bases)
Read position: 10 nt from the adapter end
Reaction 4: 1 μM of a 13-base first anchor probe
20 μM of a second anchor probe of 7 degenerate bases
Read position: 12 nt from the adapter end
Reaction 5: 1 μM of a 12-base first anchor probe
20 μM of a second anchor probe of 7 degenerate bases
Read position: 12 nt from the adapter end
Reaction 6: 1 μM of an 11-base first anchor probe
20 μM of a second anchor probe of 7 degenerate bases
Read position: 12 nt from the adapter end
Reaction 7: 1 μM of a 10-base first anchor probe
20 μM of a second anchor probe of 7 degenerate bases
Read position: 12 nt from the adapter end
Reaction 8: 1 μM of a 9-base first anchor probe
80 μM of a second anchor probe of 7 degenerate bases
Read position: 12 nt from the adapter end
The observed signal intensities (Figure 33) and fit scores (Figure 34) show that the best intensities were obtained with the longer first anchor probes, which may be due in part to the higher melting temperature that the longer probes confer on the combined anchor probe.
Effect of kinase incubation on signal intensity and fit score using the two anchor primer method: The reactions described above were performed at different temperatures for 3 days in the presence of kinase at 1 unit/ml, using 1 μM of a 10-base first anchor probe, 20 μM of a 7-mer second anchor probe, and a sequencing probe of structure Fluor-NNNNBNNNN to read position 10 from the adapter. A reaction using a 15-mer first anchor probe and the sequencing probe served as a positive control. The results are shown in Figures 35 and 36. Although the kinase did affect signal intensity relative to the control, the range did not change from 4 °C to 37 °C, and the fit scores remained comparable to the control. The temperature at which kinase incubation did have an impact was 42 °C, at which the fit score for the data was poor.
The minimum time required for kinase incubation was then examined using the same probes and conditions as described above. As shown in Figures 37 and 38, kinase incubation for 5 minutes or longer produced effectively equivalent signal intensities and fit scores.
Example 3: Human genome sequencing using unchained base reads on self-assembling DNA
Three human genomes were sequenced, generating an average of 45- to 87-fold coverage per genome and identifying 3.2 to 4.5 million sequence variants per genome. Validation of one genome data set demonstrated a sequence accuracy of approximately 1 false variant per 100 kb.
Generation of template sequencing substrates
Sequencing substrates were generated by genomic DNA fragmentation followed by recursive cutting with Type II restriction enzymes and directional adapter insertion (Fig. 6 and Figure 39B). The 4-adapter library construction process is summarized in Fig. 6. This process resulted in: (i) high-yield adapter ligation and DNA circularization with minimal chimera formation, (ii) directional adapter insertion with minimal creation of structures containing undesired adapter topologies, (iii) iterative selection by PCR of constructs with the desired adapter topologies, (iv) efficient formation of strand-specific ssDNA circles, and (v) single-tube, solution-phase amplification of the ssDNA circles to generate discrete (non-entangled) DNA nanoballs (DNBs) at high concentration. Although the process involves many independent enzymatic steps, it is largely recursive and is amenable to automation for processing batches of 96 samples.
Genomic DNA ("gDNA") was fragmented by sonication to a mean length of 500 base pairs ("bp"), and fragments migrating within a 100 bp range (for example, about 400 to about 500 bp for NA19240) were isolated on a polyacrylamide gel and recovered using QiaQuick purification columns (Qiagen, Valencia, CA). Approximately 1 μg (about 3 pmol) of fragmented gDNA was treated with 10 units of FastAP (Fermentas, Burlington, ON, CA) at 37 °C for 60 minutes, purified with AMPure beads (Agencourt Bioscience, Beverly, MA), incubated with 40 units of T4 DNA polymerase (New England Biolabs (NEB), Ipswich, MA) at 12 °C for 1 hour, and purified again with AMPure (all steps were performed according to the manufacturers' protocols), thereby generating non-phosphorylated blunt ends. The end-repaired gDNA fragments were then ligated to synthetic adapter 1 (Ad1) arms (Figure 40) by the nick translation ligation procedure described herein, which produces efficient adapter-fragment ligation with minimal fragment-fragment and adapter-adapter ligation. Figure 40 provides a table of the oligonucleotides used for adapter construction and insertion according to the invention. All oligos were purchased from IDT. In Figure 40, "Position in Ad" indicates the position of the oligonucleotide (3 = 3', 5 = 5') and the strand (T = top, B = bottom) relative to the top strand of the inserted adapter, such that the resulting ssDNA circles contain the top strand of the adapter and the resulting DNBs contain the bottom strand of the adapter. Oligonucleotides are offset and shown 3'->5' or 5'->3' to emphasize their function and relative position in the adapter. Oligonucleotide termini are labeled with 5 or 3 to indicate orientation, and with P, dd, or B to indicate a 5' PO4, 3' dideoxy, or 5' biotin modification, respectively. The 14-base palindrome that promotes the formation of compact DNBs through intramolecular hybridization is underlined.
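The equivalence of approximately 1 μg and about 3 pmol for 500 bp fragments stated above can be reproduced with a short calculation. The sketch assumes an average mass of about 650 g/mol per base pair of double-stranded DNA, a commonly used approximation that is not given in the text; the function name is hypothetical.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one dsDNA base pair


def dsdna_pmol(mass_ug, length_bp):
    """Convert a mass of double-stranded DNA (ug) to picomoles of fragments."""
    grams = mass_ug * 1e-6
    moles = grams / (length_bp * AVG_BP_MASS_G_PER_MOL)
    return moles * 1e12


pmol = dsdna_pmol(1.0, 500)
print(round(pmol, 2))
```

With these assumptions, 1 μg of 500 bp fragments corresponds to roughly 3.1 pmol, in line with the "about 3 pmol" stated in the protocol.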
Approximately 1.5 pmol of end-repaired gDNA fragments was incubated at 14 °C for 120 minutes in a reaction containing 50 mM Tris-HCl (pH 7.8), 5% PEG 8000, 10 mM MgCl2, 1 mM rATP, a 10-fold molar excess of 5'-phosphorylated ("5' PO4") and 3' dideoxy-terminated ("3' dd") Ad1 arms (Figure 40), and 4,000 units of T4 DNA ligase (Enzymatics, Beverly, MA). T4 DNA ligation of the 5' PO4 Ad1 arm termini to the 3' OH gDNA termini produced a nicked intermediate structure, in which the nicks consisted of dideoxy (and therefore non-ligatable) 3' Ad1 arm termini and non-phosphorylated (and therefore non-ligatable) 5' gDNA termini. After AMPure purification to remove unincorporated Ad1 arms, the DNA was incubated at 60 °C for 15 minutes in a reaction containing 200 μM Ad1 PCR1 primers (Figure 40), 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 1 mM rATP, and 100 μM dNTPs, thereby replacing the 3' dideoxy-terminated Ad1 oligos with 3' OH-terminated Ad1 PCR1 primers. The reaction was then cooled to 37 °C and, after addition of 50 units of Taq DNA polymerase (NEB) and 2,000 units of T4 DNA ligase, incubated for a further 30 minutes at 37 °C, so that Taq-catalyzed nick translation from the Ad1 PCR1 primer 3' OH termini generated functional 5' PO4 gDNA termini and T4 DNA ligation sealed the resulting repaired nicks.
Approximately 700 pmol of AMPure-purified Ad1-ligated material was subjected to PCR (6 to 8 cycles of 95 °C for 30 seconds, 56 °C for 30 seconds, and 72 °C for 4 minutes) in an 800 μl reaction consisting of 40 units of PfuTurbo Cx (Stratagene, La Jolla, CA), 1X PfuTurbo Cx buffer, 3 mM MgSO4, 300 μM dNTPs, 5% DMSO, 1 M betaine, and 500 nM of each Ad1 PCR1 primer (Figure 40). This process selectively amplified the approximately 350 fmol of template containing both the left and right Ad1 arms, generating approximately 30 pmol of PCR product with dU moieties incorporated at specific positions in the Ad1 arms. Approximately 24 pmol of AMPure-purified product was treated with 10 units of a UDG/EndoVIII mixture (USER; NEB) at 37 °C for 60 minutes to generate Ad1 arms with complementary 3' overhangs and to render the AcuI site encoded in the right Ad1 arm partially single-stranded. This DNA was incubated at 37 °C for 12 hours in a reaction containing 10 mM Tris-HCl (pH 7.5), 50 mM NaCl, 1 mM EDTA, 50 μM S-adenosyl-L-methionine, and 50 units of Eco57I (Fermentas, Glen Burnie, MD), thereby methylating the left Ad1 arm AcuI site and the genomic AcuI sites. Approximately 18 pmol of AMPure-purified, methylated DNA was diluted to a concentration of 3 nM in a reaction consisting of 16.5 mM Tris-OAc (pH 7.8), 33 mM KOAc, 5 mM MgOAc, and 1 mM ATP, heated to 55 °C for 10 minutes, and cooled to 14 °C for 10 minutes, to favor intramolecular hybridization (circularization).
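Two of the quantities above lend themselves to a quick arithmetic check, sketched here for illustration only: the number of template doublings implied by amplifying approximately 350 fmol to approximately 30 pmol, and the volume implied by diluting approximately 18 pmol to 3 nM. The function names are hypothetical, and the doubling figure assumes ideal exponential amplification.

```python
import math


def implied_doublings(template_fmol, product_pmol):
    """Number of perfect doublings needed to go from template to product."""
    return math.log2(product_pmol * 1000.0 / template_fmol)


def dilution_volume_ml(amount_pmol, target_nM):
    """Volume needed to reach target_nM; pmol divided by nM gives millilitres."""
    return amount_pmol / target_nM


doublings = implied_doublings(350.0, 30.0)
volume_ml = dilution_volume_ml(18.0, 3.0)
print(round(doublings, 2), volume_ml)
```

Roughly 6.4 doublings are implied, which sits within the stated 6 to 8 PCR cycles, and the 3 nM circularization reaction corresponds to a volume of about 6 ml.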
The reaction was then incubated at 14 °C for 2 hours with 3,600 units of T4 DNA ligase (Enzymatics) in the presence of 180 nM non-phosphorylated bridge oligo (Figure 40) to generate monomeric dsDNA circles containing top-strand-nicked Ad1 and a double-stranded, unmethylated right Ad1 arm AcuI site. The Ad1 circles were concentrated by AMPure purification and incubated with 100 U of PlasmidSafe exonuclease (Epicentre, Madison, WI) at 37 °C for 60 minutes according to the manufacturer's instructions to eliminate residual linear DNA.
Approximately 12 pmol of Ad1 circles was digested with 30 units of AcuI (NEB) at 37 °C for 1 hour according to the manufacturer's instructions to generate linear dsDNA structures containing Ad1 flanked by two segments of insert DNA. After AMPure purification, approximately 5 pmol of linearized DNA was incubated at 60 °C for 1 hour in a reaction containing 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 0.163 mM dNTP, 0.66 mM dGTP, and 40 units of Taq DNA polymerase (NEB), thereby converting the 3' overhang proximal to the active (right arm) Ad1 AcuI site into a 3' G overhang by translation of the Ad1 top-strand nick. The resulting DNA was incubated at 14 °C for 2 hours in a reaction containing 50 mM Tris-HCl (pH 7.8), 5% PEG 8000, 10 mM MgCl2, 1 mM rATP, 4,000 units of T4 DNA ligase, and a 25-fold molar excess of asymmetric Ad2 arms (Figure 40), one arm being designed to ligate to the 3' G overhang and the other to the 3' NN overhang, thereby yielding directional (relative to Ad1) Ad2 arm ligation. Approximately 2 pmol of Ad2-ligated material was purified with AMPure beads, PCR-amplified with PfuTurbo Cx and dU-containing Ad2-specific primers (Figure 40), AMPure purified, treated with USER, circularized with T4 DNA ligase, concentrated with AMPure, and treated with PlasmidSafe, all as described above, to generate Ad1+2-containing dsDNA circles.
Approximately 1 pmol of Ad1+2 circles was PCR-amplified with dU-containing Ad1 PCR2 primers (Figure 40), AMPure purified, and digested with USER, all as described above, to generate fragments flanked by Ad1 arms with complementary 3' overhangs and to render the left Ad1 arm AcuI site partially single-stranded. The resulting fragments were methylated to inactivate the right Ad1 arm AcuI site and the genomic AcuI sites, AMPure purified, and circularized, all as described above, to generate dsDNA circles containing bottom-strand-nicked Ad1 and a double-stranded, unmethylated left Ad1 AcuI site. The circles were concentrated by AMPure purification, digested with AcuI, AMPure purified, G-tailed, and ligated to asymmetric Ad3 arms (Figure 40), all as described above, thereby yielding directional Ad3 arm ligation. The Ad3-ligated material was AMPure purified, PCR-amplified with dU-containing Ad3-specific primers (Figure 40), AMPure purified, digested with USER, circularized, and concentrated, all as described above, to generate Ad1+2+3-containing circles, in which Ad2 and Ad3 flank Ad1 and contain EcoP15 recognition sites at their distal termini.
Following the manufacturer's instructions, approximately 10 pmol of Ad1+2+3 circles was digested with 100 units of EcoP15 (NEB) at 37 °C for 4 hours to release fragments containing three adapters interspersed among four gDNA fragments. After AMPure purification, the digested DNA was end-repaired with T4 DNA polymerase as described above, AMPure purified as described above, incubated at 37 °C for 1 hour in a reaction containing 50 mM NaCl, 10 mM Tris-HCl (pH 7.9), 10 mM MgCl2, 0.5 mM dATP, and 16 units of Klenow exo- (NEB) to add 3' A overhangs, and ligated to T-tailed Ad4 arms as described above. The ligation reaction was run on a polyacrylamide gel, and the fragments containing Ad1+2+3+Ad4 arms were eluted from the gel and recovered by QiaQuick purification. Approximately 2 pmol of recovered DNA was amplified as described above with Pfu Turbo Cx (Stratagene) plus a 5'-biotinylated primer specific for one Ad4 arm and a 5' PO4 primer specific for the other Ad4 arm (Figure 40).
Following the manufacturer's instructions, approximately 25 pmol of biotinylated PCR product was captured on streptavidin-coated Dynal paramagnetic beads (Invitrogen, Carlsbad, CA), and the non-biotinylated strand (containing one 5' Ad4 arm and one 3' Ad4 arm) was recovered by denaturation with 0.1 N NaOH. After neutralization, strands containing Ad1+2+3 in the desired orientation relative to the Ad4 arms were purified by hybridization to a three-fold excess of an Ad1 top-strand-specific biotinylated capture oligomer, followed by capture on streptavidin beads and elution with 0.1 N NaOH, all according to the manufacturer's instructions. Approximately 3 pmol of recovered DNA was incubated with 200 units of CircLigase (Epicentre) at 60 °C for 1 hour, according to the manufacturer's instructions, to generate single-stranded (ss) DNA circles containing Ad1+2+3+4, and then incubated with 100 units of ExoI and 300 units of ExoIII (both from Epicentre) at 37 °C for 30 minutes, according to the manufacturer's instructions, to remove non-circularized DNA.
To assess representational bias during circle construction, genomic DNA and intermediates of the library construction process were assayed by quantitative PCR (QPCR) on the StepOne platform (Applied Biosystems, Foster City, CA) with a SYBR Green-based QPCR assay (Quanta Biosciences, Gaithersburg, MD) to determine the presence and concentration of a set of 96 dbSTS markers representing a range of locus GC contents (Figure 41). The markers shown in Figure 41 were selected from dbSTS to be less than 100 bp long, to use primers 20 bases long with a GC content of 45 to 55%, and to represent a range of locus GC contents. Start and end coordinates are from NCBI Build 36. Amplicon GC content is the GC content of the amplified PCR product, and 1 kb GC content is calculated from the 1 kb interval centered on the amplicon. Raw cycle threshold (Ct) values were collected for each marker in each sample. Next, the mean Ct of each sample was subtracted from its raw Ct values to generate a set of normalized Ct values, such that the mean normalized Ct of each sample is zero. Finally, the mean (of four replicates) normalized Ct of each marker in gDNA was subtracted from the corresponding normalized Ct values to generate a set of delta Ct values for each marker in each sample. This analysis showed that, relative to genomic DNA, the concentration of higher-GC-content markers increased at the expense of higher-AT-content markers in the Ad1, Ad2, and Ad3 circles (Figure 42). Typically, there was a 1.4 Ct (2.5-fold) difference in concentration between loci with a 1 kb GC content of 30 to 35% and loci with a 1 kb GC content of 50 to 55%. This bias is similar to the fragment-level and base-level biases observed in the mapped cPAL data.
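The two-step Ct normalization described above can be sketched as follows. This is a minimal illustration only: the function names, marker names, and Ct values are hypothetical and are not data from the study.

```python
from statistics import mean


def normalize_ct(raw_ct):
    """Subtract the sample mean so the normalized Ct values average to zero."""
    sample_mean = mean(raw_ct.values())
    return {marker: ct - sample_mean for marker, ct in raw_ct.items()}


def delta_ct(sample_raw, gdna_replicates_raw):
    """Normalized sample Ct minus the mean normalized gDNA Ct, per marker."""
    sample_norm = normalize_ct(sample_raw)
    gdna_norm = [normalize_ct(rep) for rep in gdna_replicates_raw]
    return {
        marker: sample_norm[marker] - mean(rep[marker] for rep in gdna_norm)
        for marker in sample_norm
    }


# Hypothetical Ct values for three markers; four identical gDNA replicates.
gdna = [{"m1": 24.0, "m2": 25.0, "m3": 26.0} for _ in range(4)]
ad1_circles = {"m1": 22.0, "m2": 24.0, "m3": 26.0}
deltas = delta_ct(ad1_circles, gdna)
```

In this sketch a negative delta Ct indicates a marker that is enriched in the intermediate relative to gDNA, and a positive delta Ct indicates a depleted marker.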
To assess library construct structure, 4Ad hybrid-captured, single-stranded library DNA was PCR-amplified with Taq DNA polymerase (NEB) and Ad4-specific PCR primers. These PCR products were cloned with the TopoTA cloning kit (Invitrogen), and colony PCR was used to generate PCR amplicons from 192 independent colonies. These PCR products were purified with AMPure beads, and sequence information was collected from both strands by Sanger dideoxy sequencing (MCLAB, South San Francisco, CA). The resulting traces were filtered for high-quality data, and clones containing a library insert with at least one good read were included in the analysis. Table 1 shows the Sanger sequencing data from library intermediates used to assess adapter structure. Of the 192 library clones, 147 contained at least one high-quality Sanger read. Of these 147 clones, 143 (>97%) contained all 4 adapters in the expected orientation and order. In addition, 3 (*) of the 4 clones with aberrant adapter structures are expected to be eliminated from the library during the PCR reaction used to generate DNBs, which implies that approximately 99% of DNBs are expected to have the correct adapter structure. Data are from NA07022.
Table 1
Table 2 shows the Sanger sequencing of library intermediates used to identify adapter mutations. Analysis of the library constructs of the 89 clones for which high-quality forward and reverse Sanger sequencing data were available revealed approximately 1 mutation per 1000 bp of adapter sequence. In addition, the library constructs of 5 (5.6%) of the 89 clones had a mutation within 10 bp of one of their 8 adapter termini; such mutations are expected to affect cPAL data quality. Most adapter mutations were probably introduced by errors in oligonucleotide synthesis. A much lower mutation rate is expected to result from the 32 cycles of high-fidelity PCR (32 × 1.3E-6, that is, less than 1 in 10,000 bp). Data are from NA07022.
Table 2
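The comparison between the observed adapter mutation rate and the rate expected from PCR can be checked with a few lines of arithmetic. The sketch below assumes that 1.3E-6 is an error rate per base per cycle; the text gives the product but does not state the units, so that reading is an assumption.

```python
PCR_CYCLES = 32
# Assumed meaning of 1.3E-6: polymerase errors per base per cycle.
ERROR_RATE_PER_BASE_PER_CYCLE = 1.3e-6

# Expected PCR-derived mutations per 10,000 bp after all cycles.
pcr_mutations_per_10kb = PCR_CYCLES * ERROR_RATE_PER_BASE_PER_CYCLE * 10_000

# Observed rate from the Sanger data: about 1 mutation per 1000 bp.
observed_per_10kb = 10_000 / 1_000

print(f"expected from PCR: {pcr_mutations_per_10kb:.3f} per 10 kb")
print(f"observed:          {observed_per_10kb:.0f} per 10 kb")
```

Under this assumption the PCR contribution is well below the observed rate, consistent with the statement that most adapter mutations probably arise during oligonucleotide synthesis.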
Generation of DNBs
The circles generated as described above were replicated with Phi29 polymerase. Using a controlled, synchronized synthesis, hundreds of tandem copies of the sequencing substrate were obtained in the form of palindrome-promoted coils of single-stranded DNA, referred to herein as DNA nanoballs (DNBs) (Figure 39C). 100 fmol of Ad1+2+3+4 ssDNA circles was incubated at 90 °C in a 400 μL reaction containing 50 mM Tris-HCl (pH 7.5), 10 mM (NH4)2SO4, 10 mM MgCl2, 4 mM DTT, and 100 nM Ad4 PCR 5B primer (Figure 40). The reaction was then adjusted to an 800 μL reaction containing the above components plus 800 μM of each dNTP and 320 units of Phi29 DNA polymerase (Enzymatics), and incubated at 30 °C for 30 minutes to generate DNBs. Short palindromes in the adapters (Figure 40) promote coiling of the ssDNA concatamers, through reversible intramolecular hybridization, into compact DNBs of approximately 300 nm, thereby avoiding entanglement with neighboring DNBs (also referred to herein as "replicons"). The combination of synchronized rolling circle replication (RCR) conditions and palindrome-driven DNB assembly generated more than 20 billion discrete DNBs per ml of RCR reaction. These compact structures remained stable for several months, with no evidence of degradation or entanglement.
Generation of random DNB arrays
DNBs were adsorbed onto photolithographically etched, surface-modified 25 × 75 mm silicon substrates bearing grid-patterned arrays of approximately 300 nm DNB-binding spots (Figure 39D). Compared with arrays formed on surfaces without such a pattern, the grid-patterned surface increased the DNA content per array and the image information density. These arrays are random arrays, because it is not known which sequence occupies each position of the array until the sequencing reaction is performed.
To prepare patterned substrates, a layer of silicon dioxide was grown on the surface of a standard silicon wafer (Silicon Quest International, Santa Clara, CA). A layer of titanium was deposited over the silicon dioxide, and this layer was patterned with fiducial markings by conventional photolithography and dry-etching techniques. A layer of hexamethyldisilazane (HMDS) (Gelest Inc., Morrisville, PA) was added to the substrate surface by vapor deposition, and a deep-UV, positive-tone photoresist material was coated onto the surface by centrifugal force. Next, the photoresist surface was exposed with the array pattern using a 248 nm lithography tool, and the resist was developed to produce arrays with discrete regions of exposed HMDS. The HMDS layer in the holes was removed by plasma etching, and aminosilane was vapor-deposited in the holes to provide attachment sites for DNBs. The array substrates were recoated with a layer of photoresist material and cut into 75 mm × 25 mm substrates, and all photoresist material was stripped from the individual substrates by ultrasonication. Next, a mixture of 50 μm polystyrene beads and polyurethane glue was applied to each diced substrate in a series of parallel lines, and a coverslip was pressed into the glue lines to form a six-lane gravity/capillary-driven flow slide. The aminosilane features patterned onto the substrate serve as binding sites for individual DNBs, whereas the HMDS inhibits DNB binding between features. DNBs were loaded into the flow lanes by pipetting 2- to 3-fold more DNBs than there are binding sites on the slide. Loaded slides were incubated at 23 °C for 2 hours in a closed chamber, then rinsed to neutralize the pH and remove unbound DNBs.
Sequencing reaction
Cell lines from two individuals previously characterized by the HapMap project, a Caucasian male of European descent (NA07022) and a Yoruban female (NA19240), were sequenced. In addition, lymphoblast DNA from a Personal Genome Project Caucasian male sample, PGP1 (NA20431), was sequenced. Automated cluster analysis of the four-dimensional intensity data produced raw base reads and associated raw base scores.
High-accuracy cPAL sequencing chemistry was used to independently read up to 10 bases adjacent to each of the eight anchor sites (Figure 39E), yielding mate-paired reads of 31 to 35 bases each (62 to 70 bases per DNB). cPAL is an unchained hybridization and ligation technology that extends conventional sequencing by ligation through the use of degenerate anchors, producing extended read lengths (e.g., 8 to 15 bases) adjacent to each of the eight inserted adapter sites (Figure 39E, right) with similar accuracy at all read positions (Figure 43). In Figure 43, DNB positions represent the 70 sequenced positions within one DNB. Up to 10 positions adjacent to each adapter end are read as described in section 4. Positions 1 to 5 from the adapter are shown as blue bars, and positions 6 to 10 from the adapter are shown in red. From left to right, the adapter and anchor read structure is: ad1 3' (1-5), ad2 5' (10-6), ad2 5' (5-1), ad2 3' (1-5), ad2 3' (6-10), ad4 5' (10-6), ad4 5' (5-1), ad4 3' (1-5), ad4 3' (6-10), ad3 5' (10-6), ad3 5' (5-1), ad3 3' (1-5), ad3 3' (6-10), ad1 5' (5-1). Discordance was determined by mapping reads to the reference sequence (taking the best match where multiple reasonable matches were found) and tallying the differences between the reads and the reference sequence at each position. Unchained base reading tolerates sporadic base-detection failures in otherwise good reads. The majority of errors occur in a small fraction of low-quality bases. Data are from NA07022. In general, approximately 10 bases adjacent to each adapter can be read using cPAL technology.
Unchained sequencing of target nucleic acids by combinatorial probe-anchor ligation (cPAL) involves detection of ligation products formed between an anchor oligonucleotide hybridized to part of an adapter sequence and a fluorescent degenerate sequencing probe that contains a specified nucleotide at an "interrogation position." If the nucleotide at the interrogation position is complementary to the nucleotide at the detection position within the target, ligation is favored, producing a stable probe-anchor ligation product that can be detected by fluorescence imaging.
Four fluorophores are used to identify the base at the interrogation position within a sequencing probe, and pools of four sequencing probes are used to query a single base position per hybridization-ligation-detection cycle. For example, to read position 4, 3' of the anchor, the following 9-mer sequencing probes are pooled, where "p" represents a phosphate available for ligation and "N" represents degenerate bases:
5’-pNNNANNNNN-Quasar670
5’-pNNNGNNNNN-Quasar570
5’-pNNNCNNNNN-CAL Fluor Red 610
5’-pNNNTNNNNN-fluorescein
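The pool above can be generated programmatically. The following is a minimal sketch; the function name is hypothetical, and the rule that the interrogation position is counted from the 5'-phosphate end of a 3'-of-anchor probe is inferred from the single position-4 example, so pools for other positions are an extrapolation rather than a statement of the probes actually synthesized.

```python
# Dye assignment copied from the example pool for position 4.
DYES = {
    "A": "Quasar670",
    "G": "Quasar570",
    "C": "CAL Fluor Red 610",
    "T": "fluorescein",
}


def probe_pool_3prime(position, length=9):
    """Return four probes querying `position` (1-based from the 5'-phosphate end)."""
    if not 1 <= position <= length:
        raise ValueError("position must lie within the probe")
    pool = []
    for base, dye in DYES.items():
        # Degenerate bases everywhere except the interrogation position.
        seq = "N" * (position - 1) + base + "N" * (length - position)
        pool.append(f"5'-p{seq}-{dye}")
    return pool


for probe in probe_pool_3prime(4):
    print(probe)
```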
A total of 40 probes were synthesized (Biosearch Technologies, Novato, CA) and HPLC-purified with a wide peak cut. These probes consist of five sets of four probes designed to query positions 1 to 5, 5' of the anchor, and five sets of four probes designed to query positions 1 to 5, 3' of the anchor. The probes were combined into 10 pools, and the pools were used in combinatorial ligation assays with a total of 16 anchors [4 adapters × 2 adapter ends × 2 anchors (standard and extended)], hence the name combinatorial probe-anchor ligation (cPAL).
To read positions 1 to 5 in the target sequence adjacent to the adapter, 1 μM anchor oligomer was pipetted onto the array and hybridized to the adapter region directly adjacent to the target sequence for 30 minutes at 28 °C. A mixture of 1000 U/ml T4 DNA ligase plus four fluorescent probes (at typical concentrations of 1.2 μM T, 0.4 μM A, 0.2 μM C, and 0.1 μM G) was then pipetted onto the array and incubated for 60 minutes at 28 °C. Unbound probe was removed by washing with 150 mM NaCl in Tris buffer (pH 8).
In general, T4 DNA ligase ligates probes with higher efficiency if they are perfectly complementary to the region of the target nucleic acid to which they hybridize, but the fidelity of the ligase decreases with distance from the ligation point. To minimize sequencing errors caused by incorrect pairing between the probe and the target nucleic acid, it is useful to limit the distance between the nucleotide to be detected and the ligation point of the sequencing probe and the anchor probe. By employing extended anchors that reach 5 bases into the unknown target sequence, T4 DNA ligase can be used to read positions 6 to 10 in the target sequence.
Creation of extended anchors involves ligation of two anchor oligomers designed to anneal next to each other on the target DNB. First-anchor oligomers are designed to terminate near the end of the adapter, and second-anchor oligomers, comprised in part of five degenerate positions that extend into the target sequence, are designed to ligate to the first anchor. In addition, the degenerate second-anchor oligomers are selectively modified to suppress inappropriate (e.g., self) ligation. For assembly of 3' extended anchors (whose 3' end is ligated to the sequencing probe), second-anchor oligomers are manufactured with 5' and 3' phosphate groups, such that the 5' end of the second anchor can ligate to the 3' end of the first anchor but the 3' end of the second anchor cannot participate in ligation, thereby blocking second-anchor ligation artifacts. Once the extended anchors are assembled, their 3' ends are activated by dephosphorylation with T4 polynucleotide kinase (Epicentre). Similarly, for assembly of 5' extended anchors (whose 5' end is ligated to the sequencing probe), first anchors are manufactured with a 5' phosphate and second anchors are manufactured with neither 5' nor 3' phosphates, such that the 3' end of the second anchor can ligate to the 5' end of the first anchor but the 5' end of the second anchor cannot participate in ligation, thereby blocking second-anchor ligation artifacts. Once the extended anchors are assembled, their 5' ends are activated by phosphorylation with T4 polynucleotide kinase (Epicentre).
First anchors (4 μM) are typically 10 to 12 bases long, and second anchors (24 μM) are 6 to 7 bases long, including the five degenerate bases. The use of high concentrations of the second anchor introduces negligible noise and minimal cost compared with the alternative of using high concentrations of labeled probes. Anchors were ligated with 200 U/ml T4 DNA ligase for 30 minutes at 28 °C and then washed three times before addition of 1 U/ml T4 polynucleotide kinase (Epicentre) for 10 minutes. Positions 6 to 10 were then sequenced as described above for reading positions 1 to 5.
After imaging, the hybridized anchor-probe conjugates were removed with 65% formamide, and the next cycle of the process was started by adding either a single-anchor hybridization mix or a two-anchor ligation mix. Removal of the probe-anchor product is an important feature of unchained base reading. Starting a new ligation cycle on clean DNA allows accurate detection at ligation yields of 20 to 30%, which can be achieved at low cost and high accuracy with low concentrations of probe and ligase.
Imaging
A Tecan (Durham, NC) MSP 9500 liquid handler was used for automated cPAL biochemistry, and a robotic arm was used to exchange slides between the liquid handler and the imaging station. The imaging station consists of a four-color epi-illumination fluorescence microscope built from off-the-shelf components, including an Olympus (Center Valley, PA) NA = 0.95 water-immersion objective and tube lens operating at 25-fold magnification; Semrock (Rochester, NY) dual-band fluorescence filters, FAM/Texas Red and CY3/CY5; a Wegu (Markham, Ontario, Canada) autofocus system; a Sutter (Novato, CA) 300 W xenon arc lamp coupled to a Lumatec (Deisenhofen, Germany) 380 liquid light guide; an Aerotech (Pittsburgh, PA) ALS130 X-Y stage stack; and two Hamamatsu (Bridgewater, NJ) 9100 1-megapixel EM-CCD cameras. Each slide was divided into 6,396 fields of view of 320 μm × 320 μm. These fields were organized into six groups of 1,066 fields, corresponding to the lanes created by the glue lines on the substrate. Four-color images were generated for each group (requiring one filter change) before moving to the next group. Images were taken in step-and-repeat mode at an effective rate of 7 frames per second. To maximize microscope utilization and to match the biochemistry cycle time to the imaging cycle time, six slides were processed in parallel with staggered biochemistry start times, such that the imaging of slide N was completed just as slide N+1 was completing its biochemistry cycle.
Other embodiments may include continuous imaging, whose throughput could be improved 30-fold through further camera improvements, reaching 250 Gb per instrument day and exceeding 1 Tb per instrument day.
Base calling
Each field of view contains 225 × 225 = 50,625 spots, or potential DNB features. The four images associated with a field were processed independently to extract DNB intensity information in the following steps: 1) background removal, 2) image registration, and 3) intensity extraction. First, the background was estimated with a morphological opening (erosion followed by dilation) operation. The resulting background image was then subtracted from the original image. Next, a flexible grid was registered to the image. In addition to correcting for rotation and translation, this grid allows (R-1)+(C-1) degrees of freedom for scale/pitch (here, R = C = 225), where R and C are the numbers of DNB rows and columns, respectively, so that each row or column of the grid can float slightly to best fit the DNB array. This process accommodates optical aberrations in the image as well as a fractional number of pixels per DNB. Finally, for each grid point, a radius of one pixel was considered; within that radius, the mean of the top 3 pixels was computed and returned as the extracted intensity value for that DNB.
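The final intensity-extraction step can be sketched as follows. This illustration assumes that the image has already been background-subtracted, that the grid point falls on an integer pixel, and that "a radius of one pixel" means the 3 × 3 neighborhood centered on that pixel; the function name and the toy image are hypothetical.

```python
def extract_intensity(image, row, col, radius=1, top_n=3):
    """Mean of the `top_n` brightest pixels within `radius` of (row, col)."""
    height, width = len(image), len(image[0])
    # Collect the square neighborhood, clipped at the image border.
    neighborhood = [
        image[r][c]
        for r in range(max(0, row - radius), min(height, row + radius + 1))
        for c in range(max(0, col - radius), min(width, col + radius + 1))
    ]
    brightest = sorted(neighborhood, reverse=True)[:top_n]
    return sum(brightest) / len(brightest)


# Toy 4 x 4 image with one bright spot centered near pixel (1, 1).
image = [
    [1, 2, 1, 0],
    [2, 9, 7, 1],
    [1, 8, 3, 0],
    [0, 1, 0, 0],
]
spot_intensity = extract_intensity(image, 1, 1)
```

Averaging only the brightest pixels, rather than the whole neighborhood, makes the extracted value less sensitive to a spot that is slightly off-center relative to its grid point.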
The data from each field were then subjected to base calling, which involved four major steps: 1) crosstalk correction, 2) normalization, 3) base calling, and 4) raw base score computation. First, crosstalk correction was applied to reduce optical (fixed) and biochemical (variable) crosstalk among the four channels. All parameters (fixed or variable) were estimated from the data in each field. A system of four lines intersecting at one point was fit to the four-dimensional intensity data by a constrained optimization algorithm. Sequential quadratic programming and a genetic algorithm were used for the optimization. The fitted model was then used to inverse-transform the data into canonical space. After crosstalk correction, each channel was normalized independently using the distribution of points on the corresponding channel. Next, the axis closest to each point was selected as its base call. Bases were called on all spots regardless of quality. Each spot then received a raw base score reflecting the confidence in that particular base call. The raw base score was computed as the geometric mean of several sub-scores, which capture the strength of the clusters, their relative positions and spread, and the position of the data point within its cluster.
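The base-calling step alone (selecting the axis closest to each point) can be sketched as follows. The sketch assumes that the intensities have already been crosstalk-corrected and normalized, and the channel-to-base order is a hypothetical choice; it does not reproduce the crosstalk fit or the raw base score.

```python
import math

# Assumed channel order; the actual channel-to-base mapping is not given here.
BASES = ("A", "C", "G", "T")


def call_base(intensities):
    """Return (base, distance) for the axis closest to a 4-D intensity point."""
    norm_sq = sum(x * x for x in intensities)
    best_base, best_dist = None, math.inf
    for i, base in enumerate(BASES):
        # Distance to the non-negative half of axis i: remove the component
        # along the axis; a negative component projects onto the origin.
        along = max(intensities[i], 0.0)
        dist = math.sqrt(max(norm_sq - along * along, 0.0))
        if dist < best_dist:
            best_base, best_dist = base, dist
    return best_base, best_dist


base, distance = call_base((0.10, 0.90, 0.20, 0.05))
```

For non-negative normalized intensities this is equivalent to choosing the channel with the largest value; the explicit distance is kept because a score of call confidence would plausibly depend on it.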
DNB mapping and sequence assembly
Reads were mapped to the human genome reference assembly using methods known in the art and as described in application No. 61/173,967, filed April 29, 2009, which is incorporated herein by reference in its entirety for all purposes and in particular for all teachings related to sequence assembly and to mapping sequences to a reference sequence. Assembly and mapping of the reads yielded approximately 124 to approximately 241 Gb of mapped sequence and whole-genome coverage of approximately 45- to 87-fold per genome.
The gapped read structure of the present invention requires some adjustments to standard informatic analyses. It is possible to represent each arm as a continuous string of bases if the lengths of the gaps between reads are fixed (e.g., at their most common values), positive gaps are filled with Ns, and base positions that are read more than once are used to make a consensus call. Such a string can be aligned to the reference sequence by dynamic programming with standard Smith-Waterman local alignment scoring, or with a modified scoring scheme that allows insertions or deletions only at the gap positions between reads. Methods for high-speed mapping of short reads that involve some form of indexing of the reference genome can also be used, although indexing approaches that rely on ungapped seeds longer than 10 bases limit the portion of the arm that can be compared with the index and/or require limits on the gap sizes allowed. In simulations, we found that missing the correct gap structure for even a small fraction (<1%) of arms can substantially increase variation-calling errors, because missing the correct alignment of these arms may lead to undue trust in false mappings made with the wrong gap structure. Accordingly, the present invention provides methods for efficient mapping of DNBs that can find nearly all correct mappings.
Mate-paired arm reads were aligned to the reference genome in a two-stage process. First, left and right arms were aligned independently using an index of the reference genome. This initial search finds all locations in the genome that match the arm with at most two single-base substitutions, but it may also find some locations with up to five mismatches. The number of mismatches in reported matches was further limited so that the probability of finding a match in a random sequence of the same length as the reference was < 4^-3. If a particular arm had more than 1000 alignments, no alignments were carried forward, and the arm was marked as "overflow." Second, for each location of the left arm identified in the first stage, the right arm was subjected to a local alignment process constrained to a genomic interval informed by the distribution of mate distances (here, 0 to 700 bases away). Up to four single-base mismatches were allowed in this process; the number of mismatches was further limited so that the probability of a random match across all pairings was < 4^-7. The same local search was performed for the left arm in the vicinity of right-arm matches.
At both stages, gapped arm reads were aligned by trying multiple combinations of gap values. The frequencies of gap values in each library were estimated by aligning a sample of arm reads from the library with loosely constrained gap values. For performance reasons, only a subset of gap values was used during high-volume alignment; the cumulative frequency of the neglected gap values was approximately 10^-3. Both stages can align arms containing positions that were not successfully sequenced (no-calls). The probability calculations above take into account the number of no-calls in the arm. Finally, if a mate pair had any consistent arm locations (that is, left and right arms on the same strand, in the proper order, and within the expected mate-distance distribution), only those locations were retained. Otherwise, all locations of the pair were retained. In either case, for performance reasons, at most 50 locations were reported for each arm; arms with more retained locations were marked as "overflow," and no locations were reported for them. The overall data yield, from imaged spots to mapped reads, was 40 to 50%, reflecting end-to-end losses from all process inefficiencies, including empty array spots, low-quality areas, abnormal DNBs, and non-human (e.g., EBV-derived) DNA.
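The final retention rule for mate-pair locations can be sketched as follows. This is an illustration under stated assumptions, not the mapping software itself: a location is represented as a hypothetical (chromosome, strand, start) tuple, the arm length of 35 bases is assumed, and "proper order" is interpreted as the left arm preceding the right arm on the plus strand and following it on the minus strand.

```python
MAX_MATE_GAP = 700   # from the text: mate distance of 0 to 700 bases
MAX_REPORTED = 50    # from the text: at most 50 locations reported per arm
ARM_LENGTH = 35      # assumed arm length, for illustration only


def is_consistent(left, right):
    """Same chromosome and strand, proper order, gap within the mate range."""
    l_chrom, l_strand, l_start = left
    r_chrom, r_strand, r_start = right
    if l_chrom != r_chrom or l_strand != r_strand:
        return False
    if l_strand == "+":
        gap = r_start - (l_start + ARM_LENGTH)
    else:
        gap = l_start - (r_start + ARM_LENGTH)
    return 0 <= gap <= MAX_MATE_GAP


def report(locations):
    """Apply the per-arm cap: too many locations are reported as overflow."""
    if len(locations) > MAX_REPORTED:
        return "overflow", []
    return "ok", locations


def retain_locations(left_locs, right_locs):
    """Keep only consistent locations if any exist; otherwise keep them all."""
    pairs = [(l, r) for l in left_locs for r in right_locs if is_consistent(l, r)]
    if pairs:
        left_kept = sorted({l for l, _ in pairs})
        right_kept = sorted({r for _, r in pairs})
    else:
        left_kept, right_kept = list(left_locs), list(right_locs)
    return report(left_kept), report(right_kept)


left = [("chr1", "+", 1000), ("chr2", "+", 500)]
right = [("chr1", "+", 1400), ("chr3", "-", 10)]
result = retain_locations(left, right)
```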
Genome sequences were assembled from the reads using methods known in the art and methods described herein. The assembled sequences were then compared with the reference sequence for confirmation.
Assembled genome data sets were routinely subjected to an identity QC protocol to confirm their sample of origin. The SNP genotypes obtained from the assembly were found to be highly concordant with those independently obtained from the original DNA sample, indicating that the data set was derived from the sample in question. In addition, mitochondrial genome coverage in each lane was sufficient to support lane-level mitochondrial genotyping (on average, 31-fold per lane). A 39-SNP mitochondrial genotype was compiled for each lane and compared with that of the overall data set, confirming that every lane derived from the same source.
Mapped coverage showed substantial deviation from Poisson expectation, but only a small fraction of bases had insufficient coverage (Figure 44). For each sample, the coverage of the least-covered 10% of the genome ranged from <13-fold to <22-fold. Much of this coverage bias is attributable to local GC content in NA07022, a bias that was significantly reduced by the improved PCR conditions used for NA19240 (Figure 44). Cumulative coverage for each genome is shown in Figure 44A. Distributions were normalized for ease of comparison. The distribution for Poisson-sampled reads is provided for comparison, as is the distribution for simulated 400 bp mate-paired DNB reads. In NA19240, only a few percent of the mappable genome was more than 3-fold underrepresented or more than 2-fold overrepresented. Figure 44B shows coverage, normalized to the average, plotted by the GC content of 501-base windows and reported as the cumulative fraction of the genome represented, for NA07022 and NA19240. NA20431 is similar to NA07022. The principal difference between these two libraries is the conditions used for PCR. NA19240 was amplified using the conditions described in the SOM above. By contrast, NA07022 was amplified using twice the amounts of DMSO and betaine used for NA19240, which caused high-GC-content regions of the genome to be overrepresented. Figure 44C shows the efficiency of detecting Infinium SNPs with heterozygous (triangles) or homozygous (circles) Infinium genotypes as a function of the actual coverage depth at the variant site in NA07022. A single allele was considered detected if it passed the calling threshold as a single-allele call (one alternate allele, one no-called allele).
Discordance with respect to the reference genome in uniquely mapped reads from NA07022 was 2.1% (approximately 1.4% to 3.3% per slide). However, considering only the highest-scoring 85% of base calls reduced the raw read discordance to 0.47% (including true variant positions).
A range of 2.91 to 4.04 million SNPs with respect to the reference genome was identified, of which 81 to 90% are reported in dbSNP, along with short deletions, insertions, and block substitutions (Figure 45; at each deletion/insertion size, the left bar is genome-wide and the right bar is coding). Deletions and insertions of up to 50 bp in size were detected using a local de novo assembly approach. As expected, deletions and insertions in coding regions tend to occur in lengths that are multiples of 3, indicating possible selection for variants that minimally affect coding regions (Figure 45).
As a first test of sequence accuracy, the SNP calls generated according to the methods described above were compared with the reported HapMap phase I/II SNP genotypes for NA07022. The methods of the present invention fully called 94% of these positions, with an overall concordance of 99.15% (Figure 46; the remaining 6% of positions were either half-called or not called).
In addition, 96% of the Infinium (Illumina, San Diego, CA) subset of HapMap SNPs was fully called, with an overall concordance of 99.88%, reflecting the higher reported accuracy of these genotypes. Similar concordance with available SNP genotypes was observed in NA19240 (with a call rate exceeding 98%) and in NA20431 (Figure 47). The table in Figure 47 shows concordance with genotypes generated by the HapMap Project (release 24) and with the highest-quality Infinium-assayed subset of HapMap genotypes, or with genotypes from Affy 500k genotyping (genotypes assayed in duplicate; only SNPs with identical calls were considered).
Because the whole-genome false positive rate cannot be accurately estimated from known SNP positions, a random subset of novel non-synonymous variants in NA07022 was tested, since this category is enriched for errors. Extrapolating error rates from targeted sequencing of 291 such positions, the false positive rate was estimated at approximately 1 variant per 100 kb, comprising <6.1 substitution, <3.0 short deletion, <3.9 short insertion, and <3.1 block variants per Mb (Table 3).
Table 3
Figure 48 shows the concordance of 1M Infinium SNPs with called variants, expressed by percentage of data sorted by variant quality score. The percentage of discordant positions can be reduced by applying a variant quality score threshold that filters out a percentage of the data. Note the different scales of the y-axes. Data are from NA07022.
Aberrant mate-pair gaps may indicate the presence of structural variants and rearrangements of varying length relative to the reference genome. A total of 2,126 clusters of such anomalous mate pairs were identified in NA07022. One such heterozygous 1500-base deletion was confirmed by PCR. More than half of the clusters are consistent in size with the addition or deletion of a single Alu repeat element.
Some applications of whole-genome sequencing may benefit from a maximal discovery rate, even at the cost of additional false positives, whereas for other applications a lower discovery rate combined with a lower false positive rate is preferable. The variant quality score was used to tune call rate and accuracy (Figure 48). In addition, the novelty rate (relative to dbSNP) is also a function of the variant quality score.
Figure 49 shows the variation of covariant weight score threshold and the ratio of the novelty variation measured value (not confirmed release129 by dbSNP) that changes.Variant mass fraction can be used for selecting the novel rate (novelty rate) of expecting and measures the balance between speed.Each point in figure is the number of the known and novel sudden change that detects under single variant mass fraction threshold value.Band line a little infers according to the highest scoring 20% of measuring from known mutations the novel rate obtaining.Note novel not direct representation of rate error rate, and variant mass fraction have different implications for different variant type.Data are from NA07022.
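The trade-off between the number of calls retained and the novelty rate can be illustrated by sweeping a score threshold over a set of calls. This is a minimal sketch under stated assumptions: each call is represented as a hypothetical `(score, is_known)` pair, where `is_known` means the variant is already in a reference database such as dbSNP. The scores shown are invented for illustration.

```python
def sweep_thresholds(calls, thresholds):
    """For each score threshold, count known and novel calls that pass.

    calls: list of (variant_quality_score, is_known) tuples, where is_known
    is True when the variant is already in a reference database.
    Returns a list of (threshold, n_known, n_novel, novelty_rate) tuples.
    """
    rows = []
    for t in thresholds:
        passing = [known for score, known in calls if score >= t]
        n_known = sum(passing)
        n_novel = len(passing) - n_known
        novelty = n_novel / len(passing) if passing else 0.0
        rows.append((t, n_known, n_novel, novelty))
    return rows


# Hypothetical calls: low-scoring calls are more often novel.
calls = [(10, False), (20, False), (30, True), (40, False),
         (50, True), (60, True), (70, True), (80, True)]
for row in sweep_thresholds(calls, [0, 35, 65]):
    print(row)
```

Raising the threshold in this toy data lowers both the number of calls and the novelty rate, which is the kind of balance the variant quality score is used to select.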
Processing the NA07022 data with the Trait-o-Matic automated annotation software produced 1,159 annotated variants, 14 of which have possible disease implications (Figure 50).
Once sites were identified for confirmation sequencing, PCR primer sequences flanking the target variants were designed with JCVI Primer Designer (http://sourceforge.net/projects/primerdesigner/, S1), a management and pipeline suite built on Primer3. The sites were amplified with Taq polymerase using synthetic oligomers [Integrated DNA Technologies, Inc. (IDT), Coralville, IA], and the PCR products were purified by SPRI (Agencourt). The purified PCR products were Sanger sequenced on both strands (MCLAB). The resulting traces were filtered to retain high-quality data, run through TraceTuner (http://sourceforge.net/projects/tracetuner/) to generate mixed base calls, and aligned to their expected read sequences using applications in the EMBOSS Software Suite (http://emboss.sourceforge.net/). For each site, the expected read sequence of each strand was generated by modifying the reference sequence according to the predicted variant, so that it expresses the combination of the two allele sequences. A site was considered confirmed if the corresponding trace aligned exactly to the expected read sequence at the variant position on at least one strand. Any discrepancy or difference between strands caused by background noise could be resolved by visual inspection of the traces.
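The construction of expected read sequences and the confirmation rule can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: it handles only single-base variants, writes a heterozygous site as one IUPAC mixed-base code, and assumes each base-called trace is already aligned end to end with the expected read. All function names are hypothetical.

```python
# IUPAC codes for one or two bases at a site.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}
COMPLEMENT = str.maketrans("ACGTRYSWKM", "TGCAYRSWMK")


def expected_read(reference, pos, allele1, allele2):
    """Forward-strand expected read: the reference with the predicted
    genotype at 0-based position pos written as a single IUPAC code."""
    code = IUPAC[frozenset(allele1 + allele2)]
    return reference[:pos] + code + reference[pos + 1:]


def reverse_complement(seq):
    """Reverse complement that also handles two-base IUPAC codes."""
    return seq.translate(COMPLEMENT)[::-1]


def is_confirmed(forward_call, reverse_call, reference, pos, allele1, allele2):
    """True if the base call at the variant site matches the expected read
    on at least one strand. Both calls must have the same length as the
    reference window (an assumption of this sketch)."""
    fwd = expected_read(reference, pos, allele1, allele2)
    rev = reverse_complement(fwd)
    rev_pos = len(reference) - 1 - pos
    return (forward_call[pos] == fwd[pos]
            or reverse_call[rev_pos] == rev[rev_pos])


# Hypothetical heterozygous T/C site at position 3 of a short window.
print(expected_read("ACGTACGT", 3, "T", "C"))
```

A real implementation would align traces with gaps and would also cover insertions and deletions, which this sketch leaves out.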
Analysis of coding SNPs
All SNP variants identified in NA07022 were analyzed with the Trait-o-Matic software. The software, run as a website, returns all non-synonymous SNP (nsSNP) variants found in HGMD, OMIM and SNPedia (cited nsSNPs), as well as all nsSNPs that are not specifically listed in those databases but occur in genes listed in OMIM (uncited nsSNPs). Trait-o-Matic analysis of the NA07022 genome returned 1,141 variants, comprising 605 cited nsSNPs and 536 uncited nsSNPs. Filtering out 320 variants with a BLOSUM100 score below 3 and 725 variants with a minor allele frequency (MAF) > 0.06 in the Caucasian/European (CEU) population (weighted average of HapMap and 1000 Genomes frequency data) left 55 cited nsSNPs and 41 uncited nsSNPs. A further 41 cited nsSNPs were removed because their phenotypic evidence was based only on association studies or because they are not disease-related (for example, olfactory receptors, blood group, eye color), and 38 uncited nsSNPs were removed because their functional impact is not significant. Figure 50 lists the remaining 14 cited nsSNPs (12 heterozygous sites and one compound heterozygous site) and three uncited nsSNPs (two nonsense mutations and one homozygous mutation) with potential phenotypic consequences, as well as two common variants in APOE.
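The filtering counts in this paragraph are internally consistent, as the following arithmetic sketch makes explicit. It reads the two filter counts as 320 variants (BLOSUM100 score below 3) and 725 variants (MAF > 0.06); all numbers come from the paragraph, and only the variable names are introduced here.

```python
# Counts as read from the paragraph above.
cited_returned, uncited_returned = 605, 536
returned = cited_returned + uncited_returned          # 1,141 variants

removed_blosum = 320   # BLOSUM100 score below 3
removed_maf = 725      # MAF > 0.06 in the CEU population
after_filters = returned - removed_blosum - removed_maf

cited_left, uncited_left = 55, 41
assert after_filters == cited_left + uncited_left     # 96 variants remain

final_cited = cited_left - 41      # association-only or not disease-related
final_uncited = uncited_left - 38  # no significant functional impact
print(returned, after_filters, final_cited, final_uncited)
```

The final values, 14 cited and 3 uncited nsSNPs, match the counts listed for Figure 50.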
Example 4: Post-loading treatment of DNB arrays
DNB loading
A DNB preparation containing 2 to 3 times more DNBs than there are DNB binding sites on the slide was pipetted into the lanes of the flow slide. The loaded slide was incubated in a closed chamber at 23 °C for 2 hours, then rinsed to neutralize the pH and remove unbound DNBs.
Post-loading treatment
After loading, the DNBs are first rinsed with a buffer solution, then treated in situ with a composition containing a nucleic acid concentrating agent (for example, an alcohol such as isopropyl alcohol) and a volume-excluding agent (for example, polyethylene glycol, PEG), rinsed again with Tris-citrate (pH 7.5) to remove the alcohol and PEG, and then subjected to the coating step. Isopropyl alcohol dehydrates and precipitates nucleic acids, which affects the concentration and surface activity of the DNBs. Other alcohols can also be used, including but not limited to ethanol, butanol and phenol. PEG further affects the surface activity of the DNBs through an excluded-volume effect that effectively concentrates the DNBs. Used separately, the alcohol and the volume-excluding agent each have a beneficial effect. However, we observed a synergistic effect when the two components are combined in a single rinse step: using the alcohol and the volume-excluding agent together provides higher stability than using either one alone or using the two components sequentially. For sequencing runs of 70 cycles, we combine the two components to achieve the quality and stability needed in the later cycles. The rinse protocol and buffer compositions were optimized according to the signal quality (intensity and signal-to-noise ratio) and stability obtainable from DNB arrays in 70-cycle sequencing runs. Specifically, after loading onto the slide, the substrate is rinsed with 210 mM potassium hydroxide, 100 mM citric acid (DNB rinse buffer) to remove excess DNBs. The substrate is then rinsed with 60% isopropyl alcohol, 5% PEG 4000 (w/v) (DNB Crash buffer) to concentrate the DNBs on the substrate surface. The substrate is then rinsed with Tris-citrate + 150 mM NaCl, 5% glycerol, 0.1% Tween-80 (Read buffer) to remove the DNB Crash buffer.
The DNB coating treatment follows. The substrate is rinsed with 0.5 mg/ml bovine serum albumin (BSA; New England Biolabs), 225 mM potassium, 100 mM citric acid, 4 mM dithiothreitol, 10 mM EDTA (protein wash buffer) and incubated for 15 minutes to allow the BSA to adsorb fully. Protein wash buffer containing human serum albumin provides a similar benefit. The substrate is then rinsed with 40% isopropyl alcohol, 15% PEG 4000 (Crash buffer). Finally, the substrate is rinsed with Read buffer, after which the flow slide is ready for sequencing.
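As a worked example of the two isopropyl alcohol/PEG rinse compositions given above, the amounts for a batch of a chosen volume can be computed as sketched below. The 50 mL batch size and the function name are hypothetical. The sketch also assumes that the isopropyl alcohol percentage is by volume and that both PEG percentages are weight/volume, although the text states (w/v) only for the first composition.

```python
def crash_buffer_recipe(total_ml, isopropanol_pct, peg_pct_wv):
    """Amounts needed for total_ml of an isopropyl alcohol/PEG rinse buffer.

    isopropanol_pct is treated as percent by volume (an assumption);
    peg_pct_wv is percent weight/volume, that is, grams per 100 mL.
    Returns (mL of isopropyl alcohol, g of PEG).
    """
    isopropanol_ml = total_ml * isopropanol_pct / 100
    peg_g = total_ml * peg_pct_wv / 100
    return isopropanol_ml, peg_g


# Hypothetical 50 mL batches of the two compositions described above.
print(crash_buffer_recipe(50, 60, 5))    # rinse before coating
print(crash_buffer_recipe(50, 40, 15))   # rinse after coating
```

The second composition uses less alcohol and three times as much PEG as the first, which is easy to see once both are expressed as amounts per batch.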
A typical cPAL sequencing run comprises approximately 70 cycles. The automated treatment coats the DNBs in a layer of partially denatured BSA that covers both the DNBs and the substrate surface to prevent chemical and mechanical degradation, and it greatly improves the stability of the DNBs in the array. Without this coating, DNB signal intensity and call specificity degrade completely in fewer than 30 sequencing cycles. With this coating, DNB arrays have successfully gone through more than 100 cycles of cPAL sequencing and show little or no degradation through 70 cycles.
It was observed that if the coating treatment is started directly after loading, individual DNBs in the array spread out on the surface to some extent. Adding a rinse step before coating, followed by a wash step that concentrates the DNBs, reduces the spread volume, prevents adjacent DNBs from contacting each other, and improves the quality of the data generated from the DNBs.
This specification provides a complete description of the methodologies, systems and/or structures of the technology described herein and of their uses in example aspects. Although various aspects of this technology have been described above with a certain degree of particularity, or with reference to one or more individual aspects, those skilled in the art could make numerous alterations to the disclosed aspects without departing from the spirit or scope of the technology. Since many changes can be made without departing from the technology described herein, the appropriate scope of the invention resides in the claims appended below. Other aspects are therefore contemplated. Furthermore, it should be understood that any operations may be performed in any order, unless explicitly stated otherwise or unless a specific order is required by the claim language. All matter contained in the above description and shown in the accompanying drawings is to be interpreted as illustrative of particular aspects only and not as limiting to the embodiments shown. Unless otherwise clear from the context or expressly stated, any concentration values provided herein are generally given in terms of admixture values or percentages, without regard to any conversion that occurs upon or after addition of the particular component of the mixture. To the extent not already expressly incorporated herein, all published references and patent documents referred to in this disclosure are incorporated herein by reference in their entirety for all purposes. Changes in detail or structure may be made without departing from the basic elements of the present technology as defined in the following claims.

Claims (16)

1. A method of treating a nucleic acid array to improve the stability of the array during nucleic acid analysis, the method comprising:
(a) providing a nucleic acid array comprising (i) a support having a surface and (ii) nucleic acid molecules attached to the surface;
(b) concentrating the nucleic acid molecules attached to the surface, thereby generating concentrated nucleic acid molecules; and
(c) coating the concentrated nucleic acid molecules with a protein.
2. The method of claim 1, wherein the nucleic acid analysis comprises nucleic acid sequencing, nucleic acid hybridization analysis, or enzyme-assisted nucleic acid analysis.
3. The method of claim 1, wherein the concentrating step comprises contacting the array with a composition containing a nucleic acid concentrating agent, a composition containing a volume-excluding agent, or a composition containing both a nucleic acid concentrating agent and a volume-excluding agent.
4. The method of claim 3, wherein the nucleic acid concentrating agent is an alcohol.
5. The method of claim 4, wherein the alcohol is isopropyl alcohol.
6. The method of claim 3, wherein the volume-excluding agent is polyethylene glycol.
7. The method of claim 1, comprising coating the nucleic acid molecules with a composition containing a protein, wherein the protein binds to the surface and does not interfere with the nucleic acid analysis.
8. The method of claim 7, wherein the protein is partially denatured.
9. The method of claim 7, wherein the protein is serum albumin.
10. The method of claim 9, wherein the protein is bovine serum albumin or human serum albumin.
11. The method of claim 1, wherein the nucleic acid is selected from DNA and RNA.
12. The method of claim 11, wherein the DNA is single-stranded DNA.
13. The method of claim 12, wherein the single-stranded DNA is a DNA nanoball.
14. The method of claim 1, wherein the analysis is sequencing.
15. A method comprising: (a) providing a DNB array comprising (i) a support having a surface and (ii) DNBs non-covalently attached to the surface; (b) contacting the DNBs attached to the surface with an aqueous solution containing a nucleic acid concentrating agent, an aqueous solution containing a volume-excluding agent, or an aqueous solution containing both a nucleic acid concentrating agent and a volume-excluding agent, thereby generating concentrated DNBs; and (c) coating the concentrated DNBs with a protein, thereby stabilizing the concentrated DNBs.
16. A kit for treating a nucleic acid array to improve the stability of the array during nucleic acid analysis, the kit comprising (a) a nucleic acid concentrating composition containing a nucleic acid concentrating agent and a volume-excluding agent, and (b) a nucleic acid coating composition containing a protein, wherein the protein binds to the surface of the array and does not interfere with the nucleic acid analysis.
CN201280065478.1A 2011-11-02 2012-10-31 Treatment for stabilizing nucleic acid arrays Active CN104039438B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201161554789P 2011-11-02 2011-11-02
US61/554,789 2011-11-02
PCT/US2012/062744 WO2013066975A1 (en) 2011-11-02 2012-10-31 Treatment for stabilizing nucleic acid arrays

Publications (2)

Publication Number Publication Date
CN104039438A true CN104039438A (en) 2014-09-10
CN104039438B CN104039438B (en) 2016-03-09

Family

ID=47178353

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201280065478.1A Active CN104039438B (en) 2011-11-02 2012-10-31 Treatment for stabilizing nucleic acid arrays

Country Status (6)

Country Link
US (2) US10837879B2 (en)
EP (1) EP2773452B1 (en)
CN (1) CN104039438B (en)
DK (1) DK2773452T3 (en)
HK (1) HK1201778A1 (en)
WO (1) WO2013066975A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106148482A (en) * 2015-03-24 2016-11-23 深圳华大基因研究院 Sequencing method suitable for small sequencers
CN107034267A (en) * 2016-02-03 2017-08-11 深圳华大基因研究院 Method and device for preparing a candidate sequencing probe set, and application thereof
CN108070642A (en) * 2016-11-17 2018-05-25 深圳华大基因研究院 Method for improving DNB paired-end sequencing quality, and DNB paired-end sequencing method and kit
CN108350491A (en) * 2015-11-18 2018-07-31 加利福尼亚太平洋生物科学股份有限公司 Loading nucleic acids onto substrates
CN109610007A (en) * 2018-10-12 2019-04-12 深圳市瀚海基因生物科技有限公司 Protein co-modified DNA chip and preparation method thereof
CN109689216A (en) * 2017-05-11 2019-04-26 伊鲁米那股份有限公司 Protective surface coatings for flow cells
CN110021349A (en) * 2017-07-31 2019-07-16 北京哲源科技有限责任公司 Method for encoding gene data
US10837879B2 (en) 2011-11-02 2020-11-17 Complete Genomics, Inc. Treatment for stabilizing nucleic acid arrays

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
LT3305918T (en) 2012-03-05 2020-09-25 President And Fellows Of Harvard College Methods for epigenetic sequencing
US20130296173A1 (en) * 2012-04-23 2013-11-07 Complete Genomics, Inc. Pre-anchor wash
EP3044329B1 (en) 2013-09-13 2020-04-22 Life Technologies Corporation Device preparation using condensed nucleic acid particles
US20150298091A1 (en) 2014-04-21 2015-10-22 President And Fellows Of Harvard College Systems and methods for barcoding nucleic acids
EP3207169B1 (en) * 2014-10-14 2020-07-01 MGI Tech Co., Ltd. Mate pair library construction
US20180023138A1 (en) * 2014-11-01 2018-01-25 Singular Bio, Inc. Assays for Single Molecule Detection and Use Thereof
CN107735497B (en) 2015-02-18 2021-08-20 卓异生物公司 Assays for single molecule detection and uses thereof
US11746367B2 (en) 2015-04-17 2023-09-05 President And Fellows Of Harvard College Barcoding systems and methods for gene sequencing and other applications
WO2017096322A1 (en) * 2015-12-03 2017-06-08 Accuragen Holdings Limited Methods and compositions for forming ligation products
CN108120754A (en) * 2016-11-28 2018-06-05 瀚生医电股份有限公司 Biosensor and method for forming probes on a solid surface of the biosensor
PL3663407T3 (en) 2017-08-01 2023-05-08 Mgi Tech Co., Ltd. Nucleic acid sequencing method
EP3696275A4 (en) * 2017-10-11 2021-05-26 MGI Tech Co., Ltd. Method for improving loading and stability of nucleic acid on solid support
CA3095292A1 (en) * 2018-04-02 2019-10-10 Progenity, Inc. Methods, systems, and compositions for counting nucleic acid molecules
WO2020034122A1 (en) * 2018-08-15 2020-02-20 深圳华大生命科学研究院 Gene chip and preparation method therefor
CN111411148B (en) * 2019-01-07 2023-05-23 天筛(上海)科技有限公司 One-tube ALDH2 genotyping kit and detection method thereof
US20210189483A1 (en) 2019-12-23 2021-06-24 Mgi Tech Co. Ltd. Controlled strand-displacement for paired end sequencing
EP4097230A1 (en) * 2020-01-31 2022-12-07 Edge Biosystems, Inc. Method for quantitating nucleic acid library
CA3173798A1 (en) 2020-03-03 2021-09-10 Pacific Biosciences Of California, Inc. Methods and compositions for sequencing double stranded nucleic acids
US20220195476A1 (en) * 2020-12-21 2022-06-23 Chen cheng yao Method and kit for regenerating reusable initiators for nucleic acid synthesis

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1366067A (en) * 2001-01-15 2002-08-28 上海博华基因芯片技术有限公司 Gene chip reagent kit for detecting hepatitis B and C and its preparing process and application
CN1514246A (en) * 2003-08-18 2004-07-21 成都夸常科技有限公司 Plane substrate packed polymer composition and its application
US20100028873A1 (en) * 2006-03-14 2010-02-04 Abdelmajid Belouchi Methods and means for nucleic acid sequencing

Family Cites Families (153)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2633708A (en) 1948-07-07 1953-04-07 American Steel Foundries Control for hydraulic presses
US3591408A (en) 1967-12-06 1971-07-06 Owens Corning Fiberglass Corp Process for coloring glass fibers and fabrics
JPS53135660A (en) 1977-04-30 1978-11-27 Olympus Optical Co Ltd Fluorescent photometric microscope using laser light
US4469863A (en) 1980-11-12 1984-09-04 Ts O Paul O P Nonionic nucleic acid alkyl and aryl phosphonates and processes for manufacture and use thereof
US4663656A (en) 1984-03-16 1987-05-05 Rca Corporation High-resolution CCD imagers using area-array CCD's for sensing spectral components of an optical line image
US4883750A (en) 1984-12-13 1989-11-28 Applied Biosystems, Inc. Detection of specific sequences in nucleic acids
US5034506A (en) 1985-03-15 1991-07-23 Anti-Gene Development Group Uncharged morpholino-based polymers having achiral intersubunit linkages
US5235033A (en) 1985-03-15 1993-08-10 Anti-Gene Development Group Alpha-morpholino ribonucleoside derivatives and polymers thereof
US4683195A (en) 1986-01-30 1987-07-28 Cetus Corporation Process for amplifying, detecting, and/or-cloning nucleic acid sequences
US4683202A (en) 1985-03-28 1987-07-28 Cetus Corporation Process for amplifying nucleic acid sequences
US4965188A (en) 1986-08-22 1990-10-23 Cetus Corporation Process for amplifying, detecting, and/or cloning nucleic acid sequences using a thermostable enzyme
US4800159A (en) 1986-02-07 1989-01-24 Cetus Corporation Process for amplifying, detecting, and/or cloning nucleic acid sequences
US5140221A (en) 1988-05-16 1992-08-18 Seiko Epson Corporation Rare gas cold cathode discharge tube and image input device
US5216141A (en) 1988-06-06 1993-06-01 Benner Steven A Oligonucleotide analogs containing sulfur linkages
US5143854A (en) 1989-06-07 1992-09-01 Affymax Technologies N.V. Large scale photolithographic solid phase synthesis of polypeptides and receptor binding screening thereof
CA2020958C (en) 1989-07-11 2005-01-11 Daniel L. Kacian Nucleic acid sequence amplification methods
US5091652A (en) 1990-01-12 1992-02-25 The Regents Of The University Of California Laser excited confocal microscope fluorescence scanner and method
US5386023A (en) 1990-07-27 1995-01-31 Isis Pharmaceuticals Backbone modified oligonucleotide analogs and preparation thereof through reductive coupling
US5602240A (en) 1990-07-27 1997-02-11 Ciba Geigy Ag. Backbone modified oligonucleotide analogs
US5210015A (en) 1990-08-06 1993-05-11 Hoffman-La Roche Inc. Homogeneous assay system using the nuclease activity of a nucleic acid polymerase
US5426180A (en) 1991-03-27 1995-06-20 Research Corporation Technologies, Inc. Methods of making single-stranded circular oligonucleotides
JP3082346B2 (en) 1991-09-12 2000-08-28 株式会社ニコン Fluorescence confocal microscope
WO1993008464A1 (en) 1991-10-21 1993-04-29 Holm Kennedy James W Method and device for biochemical sensing
US6569382B1 (en) 1991-11-07 2003-05-27 Nanogen, Inc. Methods apparatus for the electronic, homogeneous assembly and fabrication of devices
US5644048A (en) 1992-01-10 1997-07-01 Isis Pharmaceuticals, Inc. Process for preparing phosphorothioate oligonucleotides
US5424413A (en) 1992-01-22 1995-06-13 Gen-Probe Incorporated Branched nucleic acid probes
US5593826A (en) 1993-03-22 1997-01-14 Perkin-Elmer Corporation, Applied Biosystems, Inc. Enzymatic ligation of 3'amino-substituted oligonucleotides
ATE246702T1 (en) 1993-04-12 2003-08-15 Univ Northwestern METHOD FOR PREPARING OLIGONUCLEOTIDES
JP3292935B2 (en) 1993-05-18 2002-06-17 富士写真フイルム株式会社 Fluorescence spectroscopic image measurement device
US5552272A (en) 1993-06-10 1996-09-03 Biostar, Inc. Detection of an analyte by fluorescence using a thin film optical device
US5473060A (en) 1993-07-02 1995-12-05 Lynx Therapeutics, Inc. Oligonucleotide clamps having diagnostic applications
US5381224A (en) 1993-08-30 1995-01-10 A. E. Dixon Scanning laser imaging system
US6401267B1 (en) 1993-09-27 2002-06-11 Radoje Drmanac Methods and compositions for efficient nucleic acid sequencing
US7172864B1 (en) 1993-11-01 2007-02-06 Nanogen Methods for electronically-controlled enzymatic reactions
US5965452A (en) 1996-07-09 1999-10-12 Nanogen, Inc. Multiplexed active biologic array
US5578832A (en) 1994-09-02 1996-11-26 Affymetrix, Inc. Method and apparatus for imaging a sample on a device
SE9400522D0 (en) 1994-02-16 1994-02-16 Ulf Landegren Method and reagent for detecting specific nucleotide sequences
US5637684A (en) 1994-02-23 1997-06-10 Isis Pharmaceuticals, Inc. Phosphoramidate and phosphorothioamidate oligomeric compounds
EP0746865B1 (en) 1994-12-08 2003-03-26 Molecular Dynamics, Inc. Fluorescence imaging system employing a macro scanning objective
GB9506312D0 (en) 1995-03-28 1995-05-17 Medical Res Council Improvements in or relating to sample processing
US5750341A (en) 1995-04-17 1998-05-12 Lynx Therapeutics, Inc. DNA sequencing by parallel oligonucleotide extensions
US5774305A (en) 1995-06-07 1998-06-30 Seagate Technology, Inc. Head gimbal assembly to reduce slider distortion due to thermal stress
ATE199572T1 (en) 1995-11-21 2001-03-15 Univ Yale UNIMOLECULAR SEGMENT AMPLIFICATION AND DETERMINATION
US5854033A (en) 1995-11-21 1998-12-29 Yale University Rolling circle replication reporter systems
US5847400A (en) 1996-02-01 1998-12-08 Molecular Dynamics, Inc. Fluorescence imaging system having reduced background fluorescence
US5646411A (en) 1996-02-01 1997-07-08 Molecular Dynamics, Inc. Fluorescence imaging system compatible with macro and micro scanning objectives
US5880777A (en) 1996-04-15 1999-03-09 Massachusetts Institute Of Technology Low-light-level imaging and image processing
DE69735313T2 (en) 1996-06-04 2006-11-02 University Of Utah Research Foundation, Salt Lake City Fluorescence donor-acceptor pair
JP3255034B2 (en) 1996-08-09 2002-02-12 日本電気株式会社 Audio signal processing circuit
GB9620209D0 (en) 1996-09-27 1996-11-13 Cemu Bioteknik Ab Method of sequencing DNA
US7381525B1 (en) 1997-03-07 2008-06-03 Clinical Micro Sensors, Inc. AC/DC voltage apparatus for detection of nucleic acids
US6699719B2 (en) 1996-11-29 2004-03-02 Proteomic Systems, Inc. Biosensor arrays and methods
GB9626074D0 (en) 1996-12-16 1997-02-05 Cytocell Ltd Nucleic acids amplification assay
US6309824B1 (en) 1997-01-16 2001-10-30 Hyseq, Inc. Methods for analyzing a target nucleic acid using immobilized heterogeneous mixtures of oligonucleotide probes
AU724141B2 (en) 1997-03-31 2000-09-14 Battelle Memorial Institute Apparatus and method for ammonia removal from waste streams
US6008892A (en) 1997-05-23 1999-12-28 Molecular Dynamics, Inc. Optical substrate for enhanced detectability of fluorescence
US6326489B1 (en) * 1997-08-05 2001-12-04 Howard Hughes Medical Institute Surface-bound, bimolecular, double-stranded DNA arrays
US6322901B1 (en) 1997-11-13 2001-11-27 Massachusetts Institute Of Technology Highly luminescent color-selective nano-crystalline materials
US5990479A (en) 1997-11-25 1999-11-23 Regents Of The University Of California Organo Luminescent semiconductor nanocrystal probes for biological applications and process for making and using such probes
US6207392B1 (en) 1997-11-25 2001-03-27 The Regents Of The University Of California Semiconductor nanocrystal probes for biological applications and process for making and using such probes
GB9809918D0 (en) 1998-05-08 1998-07-08 Isis Innovation Microelectrode biosensor and method therefor
US6306589B1 (en) 1998-05-27 2001-10-23 Vysis, Inc. Biological assays for analyte detection
AU770993B2 (en) 1998-09-15 2004-03-11 Yale University Molecular cloning using rolling circle amplification
US6251303B1 (en) 1998-09-18 2001-06-26 Massachusetts Institute Of Technology Water-soluble fluorescent nanocrystals
US6426513B1 (en) 1998-09-18 2002-07-30 Massachusetts Institute Of Technology Water-soluble thiol-capped nanocrystals
US6113408A (en) 1998-10-21 2000-09-05 Lyall Assemblies, Inc. Non-arcing fluorescent lamp holder
US6403376B1 (en) 1998-11-16 2002-06-11 General Hospital Corporation Ultra rapid freezing for cell cryopreservation
EP1144684B1 (en) 1999-01-06 2009-08-19 Callida Genomics, Inc. Enhanced sequencing by hybridization using pools of probes
GB9901475D0 (en) 1999-01-22 1999-03-17 Pyrosequencing Ab A method of DNA sequencing
US6544732B1 (en) 1999-05-20 2003-04-08 Illumina, Inc. Encoding and decoding of array sensors utilizing nanocrystals
US6396995B1 (en) 1999-05-20 2002-05-28 Illumina, Inc. Method and apparatus for retaining and presenting at least one microsphere array to solutions and/or to optical imaging systems
US6818395B1 (en) 1999-06-28 2004-11-16 California Institute Of Technology Methods and apparatus for analyzing polynucleotide sequences
US20020045182A1 (en) 1999-07-16 2002-04-18 Lynx Therapeutics, Inc. Multiplexed differential displacement for nucleic acid determinations
US7244559B2 (en) 1999-09-16 2007-07-17 454 Life Sciences Corporation Method of sequencing a nucleic acid
WO2001023610A2 (en) 1999-09-29 2001-04-05 Solexa Ltd. Polynucleotide sequencing
US6878255B1 (en) 1999-11-05 2005-04-12 Arrowhead Center, Inc. Microfluidic devices with thick-film electrochemical detection
CA2424824A1 (en) 2000-10-04 2002-04-11 The Board Of Trustees Of The Leland Stanford Junior University Renaturation, reassociation, association and hybridization of nucleic acid molecules
US6649138B2 (en) 2000-10-13 2003-11-18 Quantum Dot Corporation Surface-modified semiconductive and metallic nanoparticles having enhanced dispersibility in aqueous media
US7348183B2 (en) 2000-10-16 2008-03-25 Board Of Trustees Of The University Of Arkansas Self-contained microelectrochemical bioassay platforms and methods
US6670131B2 (en) 2000-11-30 2003-12-30 Kabushiki Kaisha Toshiba Nucleic acid detection method and apparatus, and vessel for detecting nucleic acid
US6576291B2 (en) 2000-12-08 2003-06-10 Massachusetts Institute Of Technology Preparation of nanocrystallites
FR2818378B1 (en) 2000-12-14 2003-06-13 Commissariat Energie Atomique LOW BANDWIDTH BROADBAND FLUORESCENCE REINFORCING DEVICE AND BIOLOGICAL OR CHEMICAL OPTICAL SENSOR USING THE SAME
US6656685B2 (en) * 2001-01-29 2003-12-02 Ventana Medical Systems, Inc. Hybridization buffers using low molecular weight dextran sulfate and methods for their use
US20020102596A1 (en) 2001-01-31 2002-08-01 Davis Lloyd Mervyn Methods for detecting interaction of molecules with surface-bound reagents
JP2002340802A (en) 2001-03-15 2002-11-27 Yokogawa Electric Corp Fluorescence intensity intensifying chip
EP1384022A4 (en) 2001-04-06 2004-08-04 California Inst Of Techn Nucleic acid amplification utilizing microfluidic devices
CA2451789C (en) 2001-06-29 2012-03-27 Meso Scale Technologies, Llc. Assay plates, reader systems and methods for luminescence test measurements
ATE385337T1 (en) 2001-07-03 2008-02-15 Barco Nv METHOD AND DEVICE FOR REAL-TIME CORRECTION OF AN IMAGE
WO2003092043A2 (en) 2001-07-20 2003-11-06 Quantum Dot Corporation Luminescent nanoparticles and methods for their preparation
AU2002351187A1 (en) 2001-11-30 2003-06-17 Fluidigm Corporation Microfluidic device and methods of using same
US20030156121A1 (en) 2002-02-19 2003-08-21 Willis Donald Henry Compensation for adjacent pixel interdependence
US7487462B2 (en) 2002-02-21 2009-02-03 Xerox Corporation Methods and systems for indicating invisible contents of workspace
US6731831B2 (en) 2002-02-27 2004-05-04 Xiang Zheng Tu Optical switch array assembly for DNA probe synthesis and detection
US6815167B2 (en) 2002-04-25 2004-11-09 Geneohm Sciences Amplification of DNA to produce single-stranded product of defined sequence and length
CA2491023A1 (en) 2002-07-19 2004-01-29 Althea Technologies, Inc. Strategies for gene expression analysis
US7198901B1 (en) 2002-09-19 2007-04-03 Biocept, Inc. Reflective substrate and algorithms for 3D biochip
EP1551986B1 (en) 2002-09-30 2014-08-27 Affymetrix, Inc. Polynucleotide synthesis and labeling by kinetic sampling ligation
US20040086892A1 (en) 2002-11-06 2004-05-06 Crothers Donald M. Universal tag assay
WO2004059006A1 (en) 2002-12-25 2004-07-15 Casio Computer Co., Ltd. Optical dna sensor, dna reading apparatus, identification method of dna and manufacturing method of optical dna sensor
EP2365095A1 (en) 2003-02-26 2011-09-14 Callida Genomics, Inc. Random array DNA analysis by hybridization
US7025935B2 (en) 2003-04-11 2006-04-11 Illumina, Inc. Apparatus and methods for reformatting liquid samples
TWI329208B (en) 2003-06-03 2010-08-21 Oerlikon Trading Ag Optical substrate for enhanced detectability of fluorescence
US8298780B2 (en) 2003-09-22 2012-10-30 X-Body, Inc. Methods of detection of changes in cells
JP4328168B2 (en) 2003-10-02 2009-09-09 ソニー株式会社 Detection unit for interaction between substances using capillary phenomenon, method using the detection unit, and substrate for bioassay
US7216291B2 (en) 2003-10-21 2007-05-08 International Business Machines Corporation System and method to display table data residing in columns outside the viewable area of a window
EP2163653B1 (en) 2003-11-10 2013-03-27 Geneohm Sciences, Inc. Nucleic acid detection method having increased sensitivity
JP4065855B2 (en) 2004-01-21 2008-03-26 株式会社日立製作所 Biological and chemical sample inspection equipment
WO2005082098A2 (en) 2004-02-27 2005-09-09 President And Fellows Of Harvard College Polony fluorescent in situ sequencing beads
US7327349B2 (en) 2004-03-02 2008-02-05 Microsoft Corporation Advanced navigation techniques for portable devices
US8048437B2 (en) 2004-04-21 2011-11-01 Richard Nagler Medical device with surface coating comprising bioactive compound
EP1749108A2 (en) 2004-05-28 2007-02-07 Nanogen, Inc. Nanoscale electronic detection system and methods for their manufacture
US20060024711A1 (en) 2004-07-02 2006-02-02 Helicos Biosciences Corporation Methods for nucleic acid amplification and sequence determination
WO2006073504A2 (en) 2004-08-04 2006-07-13 President And Fellows Of Harvard College Wobble sequencing
TWI294460B (en) * 2004-12-23 2008-03-11 Ind Tech Res Inst Method for stabilizing nucleic acids
US7220549B2 (en) 2004-12-30 2007-05-22 Helicos Biosciences Corporation Stabilizing a nucleic acid for nucleic acid sequencing
WO2006074351A2 (en) 2005-01-05 2006-07-13 Agencourt Personal Genomics Reversible nucleotide terminators and uses thereof
US9040237B2 (en) 2005-03-04 2015-05-26 Intel Corporation Sensor arrays and nucleic acid sequencing applications
US20060270229A1 (en) 2005-05-27 2006-11-30 General Electric Company Anodized aluminum oxide nanoporous template and associated method of fabrication
CA2611671C (en) 2005-06-15 2013-10-08 Callida Genomics, Inc. Single molecule arrays for genetic and chemical analysis
US20090005256A1 (en) * 2005-07-29 2009-01-01 Bittker Joshua A Analysis of Encoded Chemical Libraries
JP4496376B2 (en) 2005-09-05 2010-07-07 国立大学法人東京工業大学 Disposable magnetic levitation blood pump
US7960104B2 (en) 2005-10-07 2011-06-14 Callida Genomics, Inc. Self-assembled single molecule arrays and uses thereof
WO2007120208A2 (en) 2005-11-14 2007-10-25 President And Fellows Of Harvard College Nanogrid rolling circle dna sequencing
US20070128610A1 (en) 2005-12-02 2007-06-07 Buzby Philip R Sample preparation method and apparatus for nucleic acid sequencing
JP5180845B2 (en) 2006-02-24 2013-04-10 カリダ・ジェノミックス・インコーポレイテッド High-throughput genomic sequencing on DNA arrays
SG10201405158QA (en) 2006-02-24 2014-10-30 Callida Genomics Inc High throughput genome sequencing on dna arrays
US20080194414A1 (en) 2006-04-24 2008-08-14 Albert Thomas J Enrichment and sequence analysis of genomic regions
WO2008070352A2 (en) 2006-10-27 2008-06-12 Complete Genomics, Inc. Efficient arrays of amplified polynucleotides
WO2008070375A2 (en) 2006-11-09 2008-06-12 Complete Genomics, Inc. Selection of dna adaptor orientation
US20090105961A1 (en) 2006-11-09 2009-04-23 Complete Genomics, Inc. Methods of nucleic acid identification in large-scale sequencing
US20080242560A1 (en) * 2006-11-21 2008-10-02 Gunderson Kevin L Methods for generating amplified nucleic acid arrays
US8951731B2 (en) 2007-10-15 2015-02-10 Complete Genomics, Inc. Sequence analysis using decorated nucleic acids
US8415099B2 (en) 2007-11-05 2013-04-09 Complete Genomics, Inc. Efficient base determination in sequencing reactions
US7897344B2 (en) 2007-11-06 2011-03-01 Complete Genomics, Inc. Methods and oligonucleotide designs for insertion of multiple adaptors into library constructs
US8298768B2 (en) 2007-11-29 2012-10-30 Complete Genomics, Inc. Efficient shotgun sequencing methods
US8518640B2 (en) 2007-10-29 2013-08-27 Complete Genomics, Inc. Nucleic acid sequencing and process
EP2215209B1 (en) 2007-10-30 2018-05-23 Complete Genomics, Inc. Apparatus for high throughput sequencing of nucleic acids
US7988918B2 (en) 2007-11-01 2011-08-02 Complete Genomics, Inc. Structures for enhanced detection of fluorescence
WO2009061840A1 (en) 2007-11-05 2009-05-14 Complete Genomics, Inc. Methods and oligonucleotide designs for insertion of multiple adaptors employing selective methylation
US9551026B2 (en) 2007-12-03 2017-01-24 Complete Genomics, Inc. Method for nucleic acid detection using voltage enhancement
US8592150B2 (en) 2007-12-05 2013-11-26 Complete Genomics, Inc. Methods and compositions for long fragment read sequencing
WO2009097368A2 (en) 2008-01-28 2009-08-06 Complete Genomics, Inc. Methods and compositions for efficient base calling in sequencing reactions
WO2009132028A1 (en) 2008-04-21 2009-10-29 Complete Genomics, Inc. Array structures for nucleic acid detection
EP2401395A1 (en) 2009-02-26 2012-01-04 Dako Denmark A/S Compositions and methods for rna hybridization applications
DK2511843T3 (en) 2009-04-29 2017-03-27 Complete Genomics Inc Method and system for determining variations in a sample polynucleotide sequence in terms of a reference polynucleotide sequence
US8536322B2 (en) * 2009-10-19 2013-09-17 Zhiqiang Han Method for nucleic acid isolation by solid phase reversible binding of nucleic acids
US9023769B2 (en) 2009-11-30 2015-05-05 Complete Genomics, Inc. cDNA library for nucleic acid sequencing
US10837879B2 (en) 2011-11-02 2020-11-17 Complete Genomics, Inc. Treatment for stabilizing nucleic acid arrays
US20130256228A1 (en) 2012-03-30 2013-10-03 Hydration Systems, Llc Use of novel draw solutes and combinations thereof to improve performance of a forward osmosis system and process
US20130296173A1 (en) 2012-04-23 2013-11-07 Complete Genomics, Inc. Pre-anchor wash
US9901705B2 (en) 2013-03-15 2018-02-27 Armour Technologies, Inc. Medical device curving apparatus, system, and method of use
MA43271A (en) 2015-06-17 2018-09-26 Dispersol Technologies Llc IMPROVED DEFERASIROX FORMULATIONS AND THEIR MANUFACTURING PROCESSES
WO2019151396A1 (en) 2018-02-01 2019-08-08 イーグル工業株式会社 Sliding parts

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1366067A (en) * 2001-01-15 2002-08-28 上海博华基因芯片技术有限公司 Gene chip reagent kit for detecting hepatitis B and C and its preparing process and application
CN1514246A (en) * 2003-08-18 2004-07-21 成都夸常科技有限公司 Plane substrate packed polymer composition and its application
US20100028873A1 (en) * 2006-03-14 2010-02-04 Abdelmajid Belouchi Methods and means for nucleic acid sequencing

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10837879B2 (en) 2011-11-02 2020-11-17 Complete Genomics, Inc. Treatment for stabilizing nucleic acid arrays
CN106148482A (en) * 2015-03-24 2016-11-23 深圳华大基因研究院 Sequencing method suitable for small sequencers
CN106148482B (en) * 2015-03-24 2019-12-03 深圳华大智造科技有限公司 Sequencing method suitable for small sequencers
CN108350491A (en) * 2015-11-18 2018-07-31 加利福尼亚太平洋生物科学股份有限公司 Loading nucleic acids onto substrates
US11642643B2 (en) 2015-11-18 2023-05-09 Pacific Biosciences Of California, Inc. Loading nucleic acids onto substrates
CN107034267A (en) * 2016-02-03 2017-08-11 深圳华大基因研究院 Method and device for preparing candidate sequencing probe set and application of candidate sequencing probe set
CN107034267B (en) * 2016-02-03 2021-06-08 深圳华大智造科技股份有限公司 Method and device for preparing candidate sequencing probe set and application of candidate sequencing probe set
CN108070642A (en) * 2016-11-17 2018-05-25 深圳华大基因研究院 Method for improving DNB double-end sequencing quality and DNB double-end sequencing method and kit
CN108070642B (en) * 2016-11-17 2020-10-27 深圳华大智造科技股份有限公司 Method for improving DNB double-end sequencing quality and DNB double-end sequencing method and kit
CN109689216A (en) * 2017-05-11 2019-04-26 伊鲁米那股份有限公司 Protective surface coatings for flow cells
US11667969B2 (en) 2017-05-11 2023-06-06 Illumina, Inc. Protective surface coatings for flow cells
CN110021349A (en) * 2017-07-31 2019-07-16 北京哲源科技有限责任公司 Method for encoding gene data
CN110021349B (en) * 2017-07-31 2021-02-02 北京哲源科技有限责任公司 Method for encoding gene data
CN109610007A (en) * 2018-10-12 2019-04-12 深圳市瀚海基因生物科技有限公司 Protein co-modified DNA chip and preparation method thereof
CN113913944A (en) * 2018-10-12 2022-01-11 深圳市真迈生物科技有限公司 Protein co-modified DNA chip and preparation method thereof

Also Published As

Publication number Publication date
EP2773452B1 (en) 2016-03-30
DK2773452T3 (en) 2016-07-18
US11835437B2 (en) 2023-12-05
WO2013066975A1 (en) 2013-05-10
US20210131924A1 (en) 2021-05-06
EP2773452A1 (en) 2014-09-10
CN104039438B (en) 2016-03-09
US20130178369A1 (en) 2013-07-11
US10837879B2 (en) 2020-11-17
HK1201778A1 (en) 2015-09-11

Similar Documents

Publication Publication Date Title
CN104039438B (en) Treatment for stabilizing nucleic acid arrays
CN101932729B (en) Efficient base determination in sequencing reactions
CN102459592B (en) Methods and compositions for long fragment read sequencing
US9023769B2 (en) cDNA library for nucleic acid sequencing
AU2020202992B2 (en) Methods for genome assembly and haplotype phasing
US20220290224A1 (en) Method for in situ determination of nucleic acid proximity
US8518640B2 (en) Nucleic acid sequencing and process
US20220315918A1 (en) Massively parallel contiguity mapping
US20130296173A1 (en) Pre-anchor wash
CN103917654B (en) Methods and systems for sequencing long nucleic acids
WO2020056381A9 (en) PROGRAMMABLE RNA-TEMPLATED SEQUENCING BY LIGATION (rSBL)
CN105531375A (en) Methods for targeted genomic analysis
CN104334739A (en) Genotyping by next-generation sequencing
JP2020501554A (en) Method for increasing the throughput of single molecule sequencing by linking short DNA fragments
US20220267826A1 (en) Methods and compositions for proximity ligation
JP2021523723A (en) Chemical composition and how to use it
CN103290106B (en) Efficient base determination in sequencing reactions
US20240084291A1 (en) Methods and compositions for sequencing library preparation
CN117222737A (en) Methods and compositions for sequencing library preparation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP01 Change in the name or title of a patent holder
CP01 Change in the name or title of a patent holder

Address after: California, USA

Patentee after: COMPLETE GENOMICS Inc.

Address before: California, USA

Patentee before: Complete Genomics, Inc.