CN108026568A - Nucleic acid analysis by joining barcoded polynucleotide probes - Google Patents
Nucleic acid analysis by joining barcoded polynucleotide probes
- Publication number
- CN108026568A (application number CN201680052075.1A)
- Authority
- CN
- China
- Prior art keywords
- sequence
- complementary
- target
- probe
- complementary probe
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6844—Nucleic acid amplification reactions
- C12Q1/6853—Nucleic acid amplification reactions using modified primers or templates
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- Engineering & Computer Science (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Immunology (AREA)
- Microbiology (AREA)
- Molecular Biology (AREA)
- Analytical Chemistry (AREA)
- Physics & Mathematics (AREA)
- Biotechnology (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The invention discloses compositions, methods, and kits for determining the presence, absence, amount, copy number, or other characteristics of one or more polynucleotide sequences in two or more samples, and the use of these compositions, methods, and kits in genotyping, assessment of copy number variation, expression analysis, determination of splice variants and fusions, and other genetic analyses.
Description
Related application
This application claims priority to U.S. Provisional Patent Application No. 62/215,679, filed September 8, 2015; U.S. Provisional Patent Application No. 62/289,303, filed January 31, 2016; U.S. Provisional Patent Application No. 62/317,879, filed April 4, 2016; and U.S. Provisional Patent Application No. 62/353,088, filed June 22, 2016. These U.S. provisional patent applications are incorporated herein by reference for all purposes.
Technical field
The present invention relates to compositions, methods, and kits for performing nucleic acid analysis on two or more samples simultaneously while preserving the identity of each sample.
Introduction
There is a need for genetic analysis methods that preserve the identity of each sample and facilitate multiplexed applications such as genotyping, copy number analysis, expression analysis, epigenetic analysis, and determination of the presence, absence, or amount of specific genes, SNPs, indels, transcripts, or gene loci. The present invention meets this need.
Summary of the invention
The present disclosure provides compositions, methods, and kits for nucleic acid analysis of target polynucleotides. The analysis may include determining the presence or absence of multiple target polynucleotides in two or more samples. In other aspects, the analysis can be used in two or more samples to genotype one or more alleles, analyze copy number variation, analyze epigenetic events (such as methylation), or analyze the expression of one or more RNA transcripts.
The method may include the following steps: providing two or more samples, each sample comprising one or more target polynucleotides, each target polynucleotide comprising a first target sequence and a second target sequence; providing a plurality of first complementary probes and second complementary probes, wherein (i) each first complementary probe has a sequence portion complementary to the first target sequence and a sequence portion non-complementary to the first target sequence, the non-complementary portion comprising an inquiry site barcode sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence and an adjacent sequence portion non-complementary to the second target sequence; incubating the plurality of first and second complementary probes with each individual sample under hybridization conditions so that the first and second complementary probes hybridize to their complementary target polynucleotides in the sample to form hybridization complexes; joining the first and second complementary probes that are hybridized to the first and second target sequences in the sample to form product polynucleotides; pooling the product polynucleotides formed from the individual samples; and determining the presence or absence of the target polynucleotides in one or more samples by analyzing the product polynucleotides or their complementary strands.
The first and second complementary probes may be complementary to the first and second target sequences, respectively, and, once hybridized, may be immediately adjacent to each other or separated by 1 to 500 nucleotides.
The first complementary probe may have sequence complementary to the target sequence flanking the inquiry site barcode on both its 3' and 5' sides. The adjacent universal sequence of the first complementary probe may be 5' of the complementary sequence portion, and the complementary sequence portion may be 5' of the non-complementary inquiry site barcode of the first complementary probe.
The non-complementary portions of the first and second complementary probes may include a universal sequence and may also include an additional sequence that serves to normalize the length of the product polynucleotides in a given assay. The universal sequences of the first and second complementary probes may be the same or different.
The universal sequence may include a primer binding sequence complementary to a primer sequence, and the primer sequence can be used to add one or more of (i) a sample index, (ii) an additional sequence for generating sequence data or another form of detection (such as a next-generation sequencing adapter, or a capture probe or sequence for capture on a solid surface), and (iii) other moieties (for example, moieties useful for next-generation sequencing ("NGS")).
The primer sequence may include a PCR primer sequence.
The non-complementary inquiry site barcode and the sample index may each be 10, 11, 12, 13, 14, 15, or 16 nucleotides in length, for example 12 or 15 nucleotides. The inquiry site barcode may be selected from SEQ ID NO:1 to SEQ ID NO:384. The sample index barcode may be selected from SEQ ID NO:1 to SEQ ID NO:73536.
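The stated length range can be checked mechanically. The following is a minimal sketch, assuming barcodes are written as plain DNA strings over A, C, G, and T; the helper name `is_valid_barcode` is hypothetical and is not part of the disclosure.

```python
def is_valid_barcode(seq: str, min_len: int = 10, max_len: int = 16) -> bool:
    """Return True if seq is a DNA string of 10 to 16 nucleotides (illustrative helper)."""
    # The 10-16 nt range comes from the text; the ACGT-only alphabet is an assumption.
    return min_len <= len(seq) <= max_len and set(seq) <= set("ACGT")
```

A 12-nucleotide string such as "ACGTACGTACGT" passes, while a 6-nucleotide string or one containing an ambiguous base is rejected.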
When the method is performed, the first and second complementary probe composition may be heated to a temperature of 70 °C to 100 °C before the hybridization step. The product polynucleotides may be enriched before the pooling step, for example by PCR amplification of the product polynucleotides.
The compositions and methods may be solution-based, and each of the first and second complementary probes may include an inosine that is separated from the 3' end and the 5' end of the probe by 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases.
The present disclosure provides compositions, methods, and kits useful for genotyping, determining copy number variation, and/or determining the presence, absence, or amount of a specific target polynucleotide.
Additional features and advantages of the disclosure are set forth in the description that follows. These and other features of the disclosure will become more fully apparent from the following description, or may be learned by practicing the principles described herein.
Brief description of the drawings
Figures 1A-E provide schematic diagrams of compositions and methods for nucleic acid analysis by joining barcoded polynucleotide probes. These figures are described in detail in Example 1.
Figure 2 shows the results of a study of the effect of inquiry site barcode placement in the first complementary probe. The figure shows cluster plots for multiple sites and two strategies for placing the inquiry barcode in the first complementary probe. The left plots (6mer) have a shorter inquiry site barcode (6 nucleotides) between the first target sequence and the universal sequence. The right plots (12mer) have a longer inquiry site barcode (12 nucleotides) within the first target sequence, so that the inquiry site barcode is flanked on both sides by complementary sequence. Read counts for allele A (x-axis) and allele B (y-axis) are shown, and each point corresponds to a unique sample (the same 96 samples were included in each treatment). AA animals lie along the x-axis, BB animals along the y-axis, and AB animals in the middle of the axes. The figure shows that in many cases genotype resolution is similar, while in other cases genotype resolution is better under one arrangement or the other. As shown in Figure 2A, the 12-mer inquiry site barcode, which carries information about both the allele and the site and is flanked on both sides by complementary sequence, gave clearer genotype clusters than the 6-mer inquiry site barcode, for which the allele and site information are carried in separate sequences and the barcode is not flanked on both sides by complementary sequence. In the 6-mer case, the inquiry site barcode lies immediately between the target sequence and the universal sequence. As shown in Figure 2B, the two designs produced similar genotype clusters. As shown in Figure 2C, the 6-mer inquiry site barcode produced slightly clearer genotype clusters.
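The cluster plots described here place each sample by its allele A and allele B read counts. As an illustration of how such counts could be turned into genotype calls, the following is a minimal sketch and not the disclosed method: the function name, the minimum read depth, and the allele-fraction cutoffs are all assumptions chosen for illustration.

```python
def call_genotype(a_reads: int, b_reads: int,
                  min_reads: int = 20, hom_cut: float = 0.15,
                  het_low: float = 0.35, het_high: float = 0.65) -> str:
    """Call AA, AB, or BB from per-allele read counts; "NC" means no call."""
    total = a_reads + b_reads
    if total < min_reads:
        return "NC"  # too few reads to place the sample in a cluster
    frac_b = b_reads / total
    if frac_b <= hom_cut:
        return "AA"  # sample lies along the allele A (x) axis
    if frac_b >= 1 - hom_cut:
        return "BB"  # sample lies along the allele B (y) axis
    if het_low <= frac_b <= het_high:
        return "AB"  # sample lies between the axes
    return "NC"      # ambiguous region between clusters
```

Fixed cutoffs are the simplest choice; a real pipeline might instead cluster the samples per site, since the figure notes that resolution varies from site to site.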
Figure 3 shows the results of a study in which deoxyinosine was used to mitigate the effect of a G:T mismatch on a probe triplet in a genotyping assay embodiment. A probe severely affected by a G:T mismatch was modified by placing a deoxyinosine at the 2nd through 10th 3' position of the affected form of the first complementary probe (none, iT2 to iT10). The sequence model (in which the LHS-T, the affected variant of the first complementary probe, is shown 5' to 3', and the target gDNA, or genomic DNA, is shown 3' to 5') shows the ten 3'-most positions of the first complementary probe, including the mismatch to a G nucleotide in the genomic DNA sequence. The 2nd 3' position shown (i) corresponds to "iT2". The underlined portion of the gDNA sequence is the portion to which the second complementary probe hybridizes. Solid gray bars are homozygous GG samples, and striped bars are homozygous AA samples. The y-axis is a logarithmic scale of the read count associated with the T form of the first complementary probe. Gray bars (homozygous GG samples) represent non-specific ligation caused by the stability of the G:T mismatch. Striped bars (homozygous AA samples) represent specific ligation. The results show that placing a deoxyinosine at the 2nd or 3rd 3' position of the modified form of the first complementary probe significantly reduces the read count from non-specific ligation. Deoxyinosine can similarly be used in a first complementary probe that has a 3' G and the potential for a G:T mismatch.
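The iT2 to iT10 probe series can be pictured as the same probe with one base swapped for inosine at successive positions counted from the 3' end. The following is a minimal sketch under stated assumptions: the probe is a 5' to 3' string, inosine is written as "I", position 1 is taken to be the 3'-terminal base, and the function name and example sequence are hypothetical.

```python
def place_inosine(probe: str, pos_from_3p: int) -> str:
    """Replace the base at the given position from the 3' end with inosine ("I")."""
    if not 1 <= pos_from_3p <= len(probe):
        raise ValueError("position lies outside the probe")
    i = len(probe) - pos_from_3p  # convert 3'-based position to a string index
    return probe[:i] + "I" + probe[i + 1:]


# Hypothetical probe sequence, used only to show the shape of the series.
example_probe = "ACGTACGTACGT"
variants = {f"iT{k}": place_inosine(example_probe, k) for k in range(2, 11)}
```

With this numbering, `variants["iT2"]` carries the inosine immediately 5' of the 3'-terminal base.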
Figure 4 shows the results of a study in which a small amount of target DNA was detected in background (noise) genomic DNA. Figure 4A shows the average relative concordance for the two best sites in each treatment, together with the numbers of signal and noise genomes (top) and the ng of signal and noise genome in each reaction (bottom). The results show that the relative concordance of the two sites remains very high as the number of signal genomes decreases. Even for 122 ng of input signal genome against a background equivalent to 250,000 noise genomes, the average relative concordance is 100%. This is the detection result at 0.05% contamination by the signal genome in a background of noise genomes of equivalent size. Figure 4B shows the average read count associated with a single site in each treatment, together with the numbers of signal and noise genomes (top) and the ng of signal and noise genome in each reaction (bottom). As the number of signal genomes decreases, the read count associated with the single site also decreases, largely independent of the amount of noise DNA present in the reaction.
Figure 5 shows the results of a study in which nucleic acid samples were or were not heated before a genotyping assay embodiment was performed. The cluster plots show a single site with reversible denaturation present in the workflow (heated; Figure 5A) or absent (not heated; Figure 5B). Read counts for allele A (x-axis) and allele B (y-axis) are shown, and each point corresponds to a unique sample (the same 96 samples were included in each treatment). AA animals lie along the x-axis, BB animals along the y-axis, and AB animals in the middle of the axes. The plot for the reaction with reversible denaturation (heated; Figure 5A) shows three readily distinguishable genotype clusters. The plot for the reaction lacking reversible denaturation (not heated; Figure 5B) does not show three readily distinguishable genotype clusters.
Figure 6 uses cluster plots of a single site under four probe set storage treatments to illustrate the effect of storage method on the results of a genotyping assay embodiment. From left to right are the plots for freshly prepared (Figure 6A), frozen (Figure 6B), dried (Figure 6C), and trehalose-dried (Figure 6D) probe sets. Read counts for allele A (x-axis) and allele B (y-axis) are shown, and each point corresponds to a unique sample (the same 96 samples were included in each treatment). AA animals lie along the x-axis, BB animals along the y-axis, and AB animals in the middle of the axes. Although the plots for fresh, frozen, and trehalose-dried probe sets are similar, the plot for probe sets dried without trehalose shows lower resolution of the three genotypes.
Figure 7 shows the use of a copy number analysis embodiment to determine copy number variation (CNV). Figure 7A shows, on the x-axis, the read count associated with the inquiry site barcode for the A allele of the target site for individual samples. Circles = BB samples, triangles = AB samples, squares = AA samples. Figure 7B shows the mean read count (bars) and standard deviation (whiskers) for BB, AB, and AA samples. Figure 7C shows that the copy number of the A allele is 0, 1, or 2.
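The relationship between A-allele read count and copy number (0, 1, or 2) can be illustrated by scaling each sample's read count against samples known to carry two copies. This is a minimal sketch, not the disclosed analysis: it assumes read counts are already comparable between samples (for example, after normalizing for sequencing depth), and the function name and reference values are hypothetical.

```python
from statistics import mean


def estimate_copy_number(a_reads: float, two_copy_reads: list) -> int:
    """Estimate A-allele copy number relative to reference samples with two copies."""
    baseline = mean(two_copy_reads)  # mean read count of known two-copy (AA) samples
    if baseline <= 0:
        raise ValueError("reference samples must have reads")
    # Scale so that the two-copy baseline maps to 2, then round to the nearest integer.
    return int(2 * a_reads / baseline + 0.5)
```

Under this scaling, a sample with about half the reference read count maps to one copy and a sample with no reads maps to zero.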
Figure 8 shows the use of a tetraploid genotyping embodiment in the detection and genotyping of tetraploid genomic DNA. The figure shows a cluster plot of a single site in simulated tetraploid genomic DNA samples. Read counts for allele A (x-axis), allele B (y-axis), and allele C (z-axis) are shown, and each point corresponds to a unique sample. In this case, allele A is the C base and allele B is the T base. Homozygous animals with the TTTT (solid circles) or CCCC (solid squares) genotype are plotted along the y-axis or x-axis, respectively. Heterozygous animals are shown as open squares, solid triangles, and open diamonds.
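In a tetraploid, the fraction of reads from one allele is expected to fall near 0, 1/4, 1/2, 3/4, or 1, which is why three heterozygous groups appear between the two homozygous groups. The following is a minimal sketch of assigning allele dosage from C and T read counts; it assumes unbiased allele representation, and the function name is hypothetical.

```python
from typing import Optional


def tetraploid_genotype(c_reads: int, t_reads: int) -> Optional[str]:
    """Return a four-letter genotype such as "CCTT", or None if there are no reads."""
    total = c_reads + t_reads
    if total == 0:
        return None
    # Round the T-allele fraction to the nearest quarter (0 to 4 copies of T).
    t_copies = min(4, int(4 * t_reads / total + 0.5))
    return "C" * (4 - t_copies) + "T" * t_copies
```

Real data would likely need a minimum read depth and site-specific cluster boundaries rather than simple rounding.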
Figure 9 shows the use of a genotyping embodiment for interrogating a multi-allelic site. The figure shows a cluster plot of a single multi-allelic site. The three alleles are substitutions. Read counts for allele A (x-axis), allele B (y-axis), and allele C (z-axis) are shown, and each point corresponds to a unique sample. In this case, allele A is the G base, allele B is the T base, and allele C is the C base. AA animals lie along the x-axis, BB animals along the y-axis, and CC animals along the z-axis. Heterozygous animals (TC, TG, CG) lie between two of the axes.
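For a site with three alleles, a diploid genotype can be read off by asking which alleles contribute a meaningful share of the reads. The following is a minimal sketch under assumptions: the 20% cutoff and the function name are illustrative only, and heterozygotes are reported with the two bases in alphabetical order (so "CT" rather than "TC").

```python
from typing import Dict, Optional


def call_multiallelic(counts: Dict[str, int], min_frac: float = 0.2) -> Optional[str]:
    """Call a diploid genotype from per-allele read counts, or None if unclear."""
    total = sum(counts.values())
    if total == 0:
        return None
    # Alleles whose share of the reads reaches the (assumed) cutoff.
    present = sorted(a for a, n in counts.items() if n / total >= min_frac)
    if len(present) == 1:
        return present[0] * 2      # homozygous: sample lies along one axis
    if len(present) == 2:
        return "".join(present)    # heterozygous: sample lies between two axes
    return None                    # zero or three alleles above cutoff: no call
```

For example, counts of G=98, T=1, C=1 give a homozygous G call.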
Figure 10 shows the results of a study in which an embodiment for interrogating deletions was used to check for the presence or absence of a particular sequence in nucleic acid samples. Cluster plots are shown for a site with a three-base deletion (Figure 10A) and a site with a 45 kb deletion (Figure 10B). Read counts for allele A (x-axis) and allele B (y-axis) are shown, and each point corresponds to a unique sample (the same 96 samples were included in each treatment). AA animals lie along the x-axis, BB animals along the y-axis, and AB animals in the middle of the axes. The resolution of the cluster plots for sites with deletions is similar to that of cluster plots for sites with single-base substitutions.
Figure 11A is a schematic diagram showing an exemplary sequence with self-complementarity indicated by lines.
Figure 11B is a schematic diagram of the same sequence shown in Figure 11A, with the variable barcode region indicated by a box.
Figure 12A is a schematic diagram showing a variant in which 7 base pairs (bp) at the 3' end are complementary within the index (7+0+1), with some additional matches that stabilize the dimer.
Figure 12B is a schematic diagram showing 7 base pairs (bp) at the 3' end that are partially complementary within the index (1+0+7). In this example, the 0 is a GT pairing, which can make the match equivalent to 9 base pairs (bp).
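The kind of 3'-end complementarity shown in Figure 12 can be screened for computationally. The following is a minimal sketch of one simple check, assuming plain A/C/G/T strings: it asks whether the last k bases of a sequence have an exact Watson-Crick complement elsewhere in the same sequence. The function names are hypothetical, and G:T wobble pairing, which the figure notes can add stability, is deliberately not modeled.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_3p_self_complement(seq: str, k: int = 7) -> bool:
    """True if the 3'-terminal k bases can pair exactly with a site within seq."""
    if len(seq) < k:
        return False
    tail = seq[-k:]
    # An antiparallel partner for the tail exists if its reverse complement occurs in seq.
    return reverse_complement(tail) in seq
```

A sequence flagged by this check could form a dimer with a second copy of itself through its 3' end, so candidate indexes that return True might be excluded from a design.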
Figure 13 is a schematic diagram showing a destabilizing site (proximal SNP) and a tag site (target SNP) and their relative positions in a polyploid target genome. The destabilizing site may be on either side of the tag/target SNP. Open arrows point to the corresponding sites in the target genome.
Figure 14 is a schematic diagram showing a destabilizing site (proximal SNP) and a tag site (target SNP) and their relative positions in a polyploid target genome. The destabilizing site may be on either side of the tag/target SNP. Figure 14A shows the case in which both the destabilizing site and the tag site are SNPs. Figure 14B shows the case in which the destabilizing site is an insertion and the tag site is a SNP. Figure 14C shows the case in which the destabilizing site is a deletion and the tag site is a SNP.
Figure 15 is a schematic diagram showing the probes used in a genotyping method for detecting the presence or absence of a target polynucleotide in a polyploid sample. Figure 15A shows the case in which no proximal SNP is present in the target DNA. The LHS and RHS hybridize to the target DNA, and the LHS is ligated to the RHS (the cloud represents ligation). Figure 15B shows the case in which a proximal SNP is present in the target DNA (arrow pointing to the cross). The proximal SNP destabilizes hybridization between the RHS and the target DNA, and no ligation occurs.
Figure 16 is a schematic diagram showing the probes used in a genotyping method for detecting the presence or absence of a target polynucleotide in a polyploid sample. Figure 16A shows the case in which no proximal SNP is present in the target DNA. The LHS and RHS hybridize to the target DNA, and the LHS is ligated to the RHS (the cloud represents ligation). Figure 16B shows the case in which a proximal SNP is present (arrow pointing to the cross). A blocking oligonucleotide complementary to the target DNA carrying the proximal SNP further prevents hybridization between the RHS and the target DNA, and no ligation occurs.
Figure 17 is a schematic diagram showing the probes used in a genotyping method for detecting the presence or absence of a target polynucleotide in a polyploid sample. Figure 17A shows the case in which no proximal SNP is present in the target DNA. A preliminary PCR amplification step is added, in which PCR primers designed from knowledge of the relative positions of the proximal SNP and the target/tag SNP amplify only the unique genome or subgenome of interest. The LHS and RHS then hybridize to the PCR amplicon of the target DNA, and the LHS and RHS are ligated (the cloud represents ligation). Figure 17B shows the case in which a proximal SNP is present in the target DNA (arrow pointing to the cross). The proximal SNP in the target DNA prevents the preliminary PCR amplification by interfering with binding of the PCR primer to the target DNA.
Figure 18 shows the effect of a preliminary PCR amplification step on sequence reads. Read counts for allele A (x-axis) and allele B (y-axis) are shown, and each point corresponds to a unique sample. Figure 18A shows the cluster plot obtained on genomic DNA without a preliminary PCR amplification step. Figure 18B shows the cluster plot obtained on PCR amplicons, with a preliminary PCR amplification step. The enrichment PCR amplification step improves the resolution of the cluster plot for the site.
Figure 19 shows the results of a study demonstrating that SplintR ligase can ligate probes hybridized to mRNA transcripts from the human HeLa cell line. The total reads for each sample (across all sites) are shown. When SplintR ligase was omitted from the reaction, the reads detected were close to zero (16 independent reactions in total). This data set excludes any first complementary probe that has no partner second complementary probe ligated to it, which substantially eliminates noise from spurious, aberrant first complementary probe ligation products.
Figure 20 shows the total reads for the mRNA transcript of the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene (arbitrarily designated site 745 in a 778-site panel) obtained by titration of SplintR ligase (milliliters of SplintR enzyme stock [25 units/μL] per 500 mL of ligation mixture). As the unit concentration of SplintR ligase decreases, the binned reads obtained across replicate assays also decrease. At zero units of RNA ligase, the site has no binned reads. The SplintR ligase reaction depends on the concentration of SplintR ligase.
Figure 21 shows the total reads for the mRNA transcript of the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene (arbitrarily designated site 745 in a 778-site panel) obtained by titration of input RNA and human genomic DNA. As the RNA concentration in the reaction decreases, the binned reads for site 745 also decrease. With zero input RNA but DNA still present, a small binned signal is detected. With neither input RNA nor input DNA, the binned signal is close to zero. The SplintR ligase reaction depends on RNA but has trace reactivity with DNA.
Detailed description
The present disclosure provides compositions, methods, and kits comprising a plurality of first complementary probes and second complementary probes. Each first complementary probe may include a sequence complementary to a first target sequence of interest. Each second complementary probe may include a sequence complementary to a second target sequence of interest. When the first and second complementary probes hybridize to their complementary first and second target sequences, the first probe and the second probe can be joined to form a product polynucleotide.
The disclosure also provides a plurality of samples, each of which may include one or more target sequences. Some samples include multiple target sequences, and some samples include no target sequence.
The present disclosure provides compositions, methods, and kits useful for determining the presence, absence, genotype, amount, or copy number of at least one target polynucleotide in one or more samples.
The present disclosure provides a method for determining the presence, absence, amount, or copy number of one or more target polynucleotides in a sample, the method comprising the following steps:
(a) providing a sample comprising one or more target polynucleotides, each target polynucleotide comprising a first target sequence and a second target sequence;
(b) providing a plurality of first complementary probes and second complementary probes, including a first complementary probe and a second complementary probe for each target polynucleotide, wherein (i) each first complementary probe has a sequence portion complementary to the first target sequence of the target polynucleotide and a sequence portion non-complementary to the first target sequence, the non-complementary portion comprising an inquiry site barcode sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence of the target polynucleotide and an adjacent sequence portion non-complementary to the second target sequence;
(c) incubating the plurality of first and second complementary probes with the sample under hybridization conditions so that the first and second complementary probes hybridize to their complementary target polynucleotides in the sample to form hybridization complexes;
(d) joining the first and second complementary probes that are hybridized to the first and second target sequences of the target polynucleotides in the sample to form product polynucleotides; and
(e) determining the presence, absence, amount, or copy number of each target polynucleotide in the sample by analyzing the product polynucleotides or their complementary strands.
The present disclosure provides a method for determining the presence, absence, amount, or copy number of one or more target polynucleotides in two or more samples, the method comprising the following steps:
(a) providing two or more samples, each sample comprising one or more target polynucleotides, each target polynucleotide comprising a first target sequence and a second target sequence;
(b) providing a plurality of first complementary probes and second complementary probes, including a first complementary probe and a second complementary probe for each target polynucleotide, wherein (i) each first complementary probe has a sequence portion complementary to the first target sequence of the target polynucleotide and a sequence portion non-complementary to the first target sequence, the non-complementary portion comprising an inquiry site barcode sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence of the target polynucleotide and an adjacent sequence portion non-complementary to the second target sequence;
(c) incubating the plurality of first and second complementary probes with each sample under hybridization conditions so that the first and second complementary probes hybridize to their complementary target polynucleotides in the sample to form hybridization complexes;
(d) joining the first and second complementary probes that are hybridized to the first and second target sequences of the target polynucleotides in the sample to form product polynucleotides;
(e) pooling the product polynucleotides formed from the samples; and
(f) determining the presence, absence, amount, or copy number of each target polynucleotide in one or more samples by analyzing the product polynucleotides or their complementary strands.
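Because product polynucleotides from many samples are analyzed together, the analysis step amounts to tallying reads by sample and by inquiry site barcode. The following is a minimal sketch under stated assumptions: each sequencing read has already been parsed into a (sample index, inquiry site barcode) pair, matching is exact with no error correction, and the function name and lookup tables are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_reads(pairs: Iterable[Tuple[str, str]],
                samples: Dict[str, str],
                barcodes: Dict[str, Tuple[str, str]]) -> Counter:
    """Tally reads per (sample, site, allele) from (sample index, barcode) pairs."""
    counts: Counter = Counter()
    for sample_index, site_barcode in pairs:
        sample = samples.get(sample_index)   # sample index sequence -> sample name
        call = barcodes.get(site_barcode)    # barcode sequence -> (site, allele)
        if sample is None or call is None:
            continue                         # skip reads with unrecognized sequences
        site, allele = call
        counts[(sample, site, allele)] += 1
    return counts
```

The resulting per-allele counts for each sample and site are the quantities plotted in the cluster figures.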
The first and second target sequences of each target polynucleotide may be immediately adjacent to each other. Alternatively, the first and second target sequences of each target polynucleotide may be separated by 1 to 500 nucleotides. For example, the first and second target sequences of each target polynucleotide may be separated by at least 1, at least 2, at least 3, at least 4, at least 5, or at least 10 nucleotides, or by 2 to 10, 5 to 15, 7 to 15, 10 to 12, 15 to 25, 25 to 40, 30 to 45, 40 to 16, 60 to 65, 60 to 75, 70 to 85, 80 to 95, 90 to 120, 110 to 150, 120 to 160, 130 to 170, 150 to 190, 170 to 210, 190 to 230, 200 to 230, 220 to 260, 230 to 270, 240 to 310, 300 to 340, 330 to 370, 360 to 400, 390 to 430, 410 to 450, 440 to 480, or 470 to 500 nucleotides.
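The spacing rule can be expressed as a simple check on target coordinates. The following is a minimal sketch, assuming the two target sequences are given as 0-based, half-open intervals on the same strand with the first target upstream of the second; the function names are hypothetical.

```python
def target_gap(first_end: int, second_start: int) -> int:
    """Number of nucleotides between the two target sequences (0 means immediately adjacent)."""
    gap = second_start - first_end  # half-open coordinates: first_end is one past the last base
    if gap < 0:
        raise ValueError("target sequences overlap")
    return gap


def gap_allowed(gap: int, max_gap: int = 500) -> bool:
    """True if the targets are immediately adjacent or separated by 1 to 500 nucleotides."""
    return 0 <= gap <= max_gap
```

For instance, a first target ending at coordinate 100 and a second target starting at 100 are immediately adjacent, while a start at 601 would exceed the 500-nucleotide limit.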
The adjacent sequence portion of the second complementary probe may include a universal sequence. The universal sequence of the second complementary probe may include a universal primer sequence complementary to a primer sequence, and the primer sequence can be used to add one or more of (i) a sample index, (ii) an additional sequence, (iii) an additional sequence for generating sequence data or another form of detection, and (iv) another moiety.
The adjacent universal sequence of the first complementary probe may include a universal primer sequence complementary to a primer sequence, and the primer sequence can be used to add one or more of (i) a sample index, (ii) an additional sequence, (iii) an additional sequence for generating sequence data or another form of detection, and (iv) another moiety.
The universal primer sequence may include a PCR primer sequence and/or a primer sequence for adding an additional sequence for generating sequence data or another form of detection. The additional sequence for generating sequence data or another form of detection may be an adapter for next-generation sequencing. The additional sequence for generating sequence data or another form of detection may be a capture sequence, optionally wherein the capture sequence is used for capture on a solid support. The universal primer sequence may be effective to add a moiety that facilitates generating sequence.
The sample index may be at least 10, 11, 12, 13, 14, 15, or 16 nucleotides in length. Preferably, the sample index is 12 to 15 nucleotides in length. The sample index sequence may be selected from SEQ ID NO:1 to SEQ ID NO:73536.
The universal sequence of first complementary probe and the second complementary probe can each include primer sequence, the primer sequence
The primer for composition sequence can be hybridized to.Primer sequence may include PCR primer sequence.
The first complementary probe may include, from 5' to 3': an adjacent universal sequence, a sequence portion complementary to the first target sequence, and an inquiry site bar code within the sequence portion complementary to the first target sequence.
The first complementary probe includes a sequence complementary to the first target sequence 5' to the inquiry site bar code and a sequence complementary to the first target sequence 3' to the inquiry site bar code.
The first complementary probe can include sequences complementary to the first target sequence both 3' and 5' of the inquiry site bar code.
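As a purely illustrative sketch (not part of the disclosed method), the arrangement described above, in which the inquiry site bar code sits between two sequences complementary to the first target sequence, can be modeled in Python. The function names, the split position, and the example sequences are hypothetical. The sketch assumes DNA written 5' to 3' and places the adjacent universal sequence at the 5' end of the probe.

```python
# Hypothetical helper names; DNA only, sequences written 5' to 3'.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_first_probe(universal: str, first_target: str, split: int, barcode: str) -> str:
    """Assemble an illustrative first complementary probe, 5' to 3'.

    first_target is the target strand (5' to 3'). split is an assumed index
    that divides it into two parts; the bar code is placed between the two
    probe segments that pair with those parts.
    """
    if not 0 < split < len(first_target):
        raise ValueError("split must fall inside the first target sequence")
    left, right = first_target[:split], first_target[split:]
    # The probe is antiparallel to the target, so the segment pairing with
    # the 3' part of the target comes first in the probe.
    return universal + reverse_complement(right) + barcode + reverse_complement(left)


probe = build_first_probe("CA", "AACCGG", 2, "ACT")
```

In this toy case the probe reads universal sequence, complement of the 3' part of the target, bar code, complement of the 5' part of the target, so the 3' end of the probe pairs with the 5' end of the first target sequence.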
The second complementary probe may include, from 5' to 3': a sequence portion complementary to the second target sequence of the target polynucleotide and an immediately adjacent sequence portion non-complementary to the second target sequence.
The length of the inquiry site bar code can be at least 10, 11, 12, 13, 14, 15, or 16 nucleotides. Preferably, the length of the inquiry site bar code is 12 or 15 nucleotides. The inquiry site bar code may be selected from SEQ ID NO: 1 to SEQ ID NO: 384.
The method may include a step before the incubation (or hybridization) step, in which the target polynucleotide is reversibly denatured. This step can be performed by heating as described herein.
The method includes an additional step of enriching the product polynucleotides. The enrichment step may include: (a) providing a set of PCR primers that includes a first primer complementary to the primer sequence in the first complementary probe and a second primer complementary to the PCR primer sequence in the second complementary probe, and (b) amplifying the product polynucleotides.
The method can be a solution-based method.
The first complementary probe may include an inosine (for example, deoxyinosine) located 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases from the 3' end of the probe.
The second complementary probe may include an inosine (for example, deoxyinosine) located 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases from the 5' end of the probe.
The 3' end of the first complementary probe can be complementary to one form of a single nucleotide polymorphism (SNP) or other genetic variation.
The step of joining the first complementary probe and the second complementary probe may include treating, with a ligase, the first complementary probe and the second complementary probe hybridized to the first target sequence and the second target sequence of the target polynucleotide (the hybridization complex) to form the product polynucleotide.
The disclosed method can be used for genotyping, wherein the method includes providing one or more variants of the first complementary probe, wherein the variants differ in the identity of one or more nucleotides at the 3' end of the first complementary probe, and wherein the determining includes quantifying the relative frequency of product polynucleotides, or their complementary strands, that contain the sequence of one of the variants of the first complementary probe compared with those that contain the sequence of the other variants of the first complementary probe, and correlating the frequency with a genotype.
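The quantification step described above can be sketched as follows. This is a minimal illustration only: it assumes a diploid sample, single-letter allele labels, that each probe variant carries its own inquiry site bar code, and an arbitrary 20% cutoff for calling an allele present. The function name, the mapping, and the cutoff are hypothetical and do not come from the disclosure.

```python
from collections import Counter


def call_genotype(barcodes, barcode_to_allele, min_fraction=0.2):
    """Estimate allele frequencies from observed bar codes and call a genotype.

    barcodes: bar code strings read from product polynucleotides.
    barcode_to_allele: hypothetical mapping from bar code to an allele label.
    min_fraction: assumed cutoff below which an allele is treated as absent.
    Returns (genotype, frequencies); genotype is None if nothing was counted.
    """
    counts = Counter(barcode_to_allele[b] for b in barcodes if b in barcode_to_allele)
    total = sum(counts.values())
    if total == 0:
        return None, {}
    freqs = {allele: n / total for allele, n in counts.items()}
    called = sorted(a for a, f in freqs.items() if f >= min_fraction)
    if len(called) == 1:
        # One allele above the cutoff: report a homozygous diploid call.
        return called[0] * 2, freqs
    return "".join(called), freqs


mapping = {"ACGTACGTACGT": "A", "TTGGCCAATTGG": "B"}
genotype, freqs = call_genotype(["ACGTACGTACGT"] * 48 + ["TTGGCCAATTGG"] * 52, mapping)
```

A real assay would need a cutoff calibrated against controls; the fixed fraction here only shows how relative frequency is tied to a genotype call.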
The disclosed method can be used to determine copy number variation of a target polynucleotide, wherein the determining includes comparing the amount of signal produced by the product polynucleotide or its complementary strand with a known reference signal amount or with the amount of signal produced by another product polynucleotide or its complementary strand.
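The comparison described above can be illustrated with a short sketch. It assumes that signal scales linearly with copy number and that the reference corresponds to a known number of copies (two by default). Both assumptions, and the function name, are introduced here for illustration and are not stated in the disclosure.

```python
def estimate_copy_number(target_signal, reference_signal, reference_copies=2):
    """Scale the target-to-reference signal ratio by the reference copy number.

    Assumes signal is proportional to copy number; reference_copies is the
    assumed copy number of the locus that produced reference_signal.
    """
    if reference_signal <= 0:
        raise ValueError("reference_signal must be positive")
    return (target_signal / reference_signal) * reference_copies


estimate = estimate_copy_number(1500, 1000)
```

Here a target that yields 1.5 times the signal of a two-copy reference is estimated at three copies.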
The disclosed method can be used to determine the presence of a target polynucleotide in expression analysis, wherein the target polynucleotide is an RNA transcript, and wherein the determining includes comparing the amount of signal produced by the product polynucleotide or its complementary strand with a known reference signal amount or with the amount of signal produced by another product polynucleotide or its complementary strand.
The disclosed method can be used for genotyping polyploid samples, the method further including reducing the generation of sequence data from uninformative polyploid genomes, which includes obtaining sample genome sequence data with target SNP/indel and proximal SNP/indel information, and designing the second complementary probe so that the stability of its hybridization to the target genome is disrupted by the proximal SNP/indel.
The disclosed method can be used for genotyping polyploid samples, the method further including reducing the generation of sequence data from uninformative polyploid genomes, which includes obtaining sample genome sequence data with target SNP/indel and proximal SNP/indel information, designing the second complementary probe so that the stability of its hybridization to the target genome is disrupted by the proximal SNP/indel, and adding a blocking oligonucleotide complementary to the target genome having the proximal SNP/indel to further prevent hybridization of the second complementary probe to that target genome.
The disclosed method can be used for genotyping polyploid samples, the method further including reducing the generation of sequence data from uninformative polyploid genomes, which includes obtaining sample genome sequence data with target SNP/indel and proximal SNP/indel information, and adding a PCR pre-amplification step to select the unique genome of interest.
The present disclosure provides a composition for determining the presence, absence, amount, or copy number of one or more target polynucleotides in a sample, including a plurality of first complementary probes and second complementary probes, the plurality including a first complementary probe and a second complementary probe for each target polynucleotide, wherein (i) each first complementary probe has a sequence portion complementary to a first target sequence of the target polynucleotide and a sequence portion non-complementary to the first target sequence, wherein the non-complementary portion includes an inquiry site bar code sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to a second target sequence of the target polynucleotide and an immediately adjacent sequence portion non-complementary to the second target sequence.
The present disclosure provides a composition for determining the presence, absence, amount, or copy number of one or more target polynucleotides in a sample, including a plurality of first complementary probes and second complementary probes, the plurality including a first complementary probe and a second complementary probe for each target polynucleotide, wherein (i) each first complementary probe has two sequence portions complementary to different parts of the first target sequence of the target polynucleotide and two sequence portions non-complementary to the first target sequence, wherein the non-complementary portions include an inquiry site bar code sequence and a universal sequence, and (ii) each second complementary probe has a sequence portion complementary to a second target sequence of the target polynucleotide and an immediately adjacent sequence portion that is non-complementary to the second target sequence and includes a universal sequence.
The first target sequence and the second target sequence of each target polynucleotide can be immediately adjacent to each other. Alternatively, the first target sequence and the second target sequence of each target polynucleotide can be separated by 1 to 500 nucleotides.
The universal sequence of the first complementary probe may include a universal primer sequence complementary to a primer sequence, where the primer sequence can add one or more of (i) a sample index, (ii) an additional sequence, (iii) an additional sequence for generating sequence data or another form of detection, and (iv) another part.
The universal sequence of the second complementary probe may include a universal primer sequence complementary to a primer sequence, where the primer sequence can add one or more of (i) a sample index, (ii) an additional sequence, (iii) an additional sequence for generating sequence data or another form of detection, and (iv) another part.
The universal primer sequence of the first complementary probe and/or the second complementary probe may include a PCR primer sequence and/or a primer sequence for adding an additional sequence that facilitates generating sequence data or another form of detection. The additional sequence for generating sequence data or another form of detection can be an adapter for new-generation sequencing. The additional sequence for generating sequence data or another form of detection can be a capture sequence, optionally wherein the capture sequence is used for capture on a solid support. The universal primer sequence can effectively add a portion that facilitates sequence generation. The universal primer sequence may include a primer sequence that provides for adding a sample index.
The length of the sample index can be at least 10, 11, 12, 13, 14, 15, or 16 nucleotides. Preferably, the length of the sample index is 12 to 15 nucleotides. The sample index sequence may be selected from SEQ ID NO: 1 to SEQ ID NO: 73536.
The universal sequences of the first complementary probe and the second complementary probe can each include a primer sequence that can hybridize to a primer used for sequence synthesis. The primer sequence may include a PCR primer sequence.
The first complementary probe may include, from 5' to 3': an adjacent universal sequence, a sequence portion complementary to the first target sequence, and an inquiry site bar code within the sequence portion complementary to the first target sequence.
The first complementary probe includes a sequence complementary to the first target sequence 5' to the inquiry site bar code and a sequence complementary to the first target sequence 3' to the inquiry site bar code.
The first complementary probe can include sequences complementary to the first target sequence both 3' and 5' of the inquiry site bar code.
The second complementary probe may include, from 5' to 3': a sequence portion complementary to the second target sequence of the target polynucleotide and an immediately adjacent sequence portion non-complementary to the second target sequence.
The length of the inquiry site bar code can be at least 10, 11, 12, 13, 14, 15, or 16 nucleotides. Preferably, the length of the inquiry site bar code is 12 or 15 nucleotides. The inquiry site bar code may be selected from SEQ ID NO: 1 to SEQ ID NO: 384.
The first complementary probe may include an inosine (for example, deoxyinosine) located 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases from the 3' end of the probe.
The second complementary probe may include an inosine (for example, deoxyinosine) located 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases from the 5' end of the probe.
The 3' end of the first complementary probe can be complementary to one form of a single nucleotide polymorphism (SNP) or other genetic variation.
The present disclosure provides a kit for determining the presence, absence, amount, copy number, or characteristics of one or more target polynucleotides in a sample, the kit including: (a) a plurality of first complementary probes and second complementary probes as disclosed herein; and (b) optionally, buffers and enzymes for ligation and enrichment.
The present disclosure provides a kit for determining the presence, absence, amount, copy number, or characteristics of one or more target polynucleotides in a sample, the kit including: (a) a plurality of first complementary probes and second complementary probes, the plurality including a first complementary probe and a second complementary probe for each target polynucleotide, wherein (i) each first complementary probe has a sequence portion complementary to a first target sequence of the target polynucleotide and a sequence portion non-complementary to the first target sequence, wherein the non-complementary portion includes an inquiry site bar code sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to a second target sequence of the target polynucleotide and an immediately adjacent sequence portion non-complementary to the second target sequence; and (b) optionally, buffers and enzymes for ligation and enrichment.
The present disclosure provides a kit for determining the presence, absence, amount, copy number, or characteristics of one or more target polynucleotides in a sample, the kit including: (a) a plurality of first complementary probes and second complementary probes, the plurality including a first complementary probe and a second complementary probe for each target polynucleotide, wherein (i) each first complementary probe has two sequence portions complementary to different parts of the first target sequence of the target polynucleotide and two sequence portions non-complementary to the first target sequence, wherein the non-complementary portions include an inquiry site bar code sequence and a universal sequence, and (ii) each second complementary probe has a sequence portion complementary to a second target sequence of the target polynucleotide and an immediately adjacent sequence portion non-complementary to the second target sequence; and (b) optionally, buffers and enzymes for ligation and enrichment.
The kit may also include at least one PCR primer, a polymerase, and/or a set of dNTPs for amplifying the extended target polynucleotide for the purpose of enrichment.
The kit may also include a ligase.
The kit may also include software for analyzing data.
The kit can be used to determine genotype, and/or the kit can be used to determine copy number, and/or the kit can be used to determine the expression of RNA transcripts.
Definitions
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art.
It should be understood that the disclosure in this specification includes all possible combinations of such particular features. For example, where a particular feature is disclosed in the context of a particular aspect or embodiment, that feature can generally also be used in combination with other particular aspects and embodiments of the invention, except where the context excludes that possibility. The invention disclosed herein includes embodiments that are not explicitly described and can, for example, make use of features that are not specifically disclosed herein but that provide functions identical, equivalent, or similar to features explicitly disclosed herein.
In describing and claiming the present invention, the following terms will be used in accordance with the definitions set out below.
As used in the specification and the appended claims, the singular forms "a", "an", and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a target polynucleotide" includes two or more such target polynucleotides, and reference to "a probe" includes two or more probes or a mixture of probes, and so on.
The term "adjacent" as used herein means that two sequences are substantially next to each other in a nucleic acid, although one or more intervening bases may be present between the two adjacent sequences.
The term "close to" (that is, immediately adjacent) as used herein means that two sequences are next to each other in a nucleic acid with no intervening bases between them.
The term "allele" means one of two or more alternative forms of a gene or gene site. If a diploid organism has two copies of the same allele, such as AA or aa, it is homozygous at that position. If the organism has one copy each of two different alleles, such as Aa, it is heterozygous at that position. An alternative nomenclature uses A and B to denote alleles. A homozygous diploid organism is AA or BB at the position. A heterozygous diploid organism is AB at the position. The term "allele" also applies where there are three or more possible alternative forms, and can be extended as known in the art, for example to alleles A, B, and C of a triallelic single nucleotide polymorphism.
The term "array" as used herein means an intentionally created collection of elements that can be prepared by synthetic or biosynthetic means. An array can take various forms, such as a library of soluble molecules, and can make use of one or more solid supports, such as glass slides, silica chips, microparticles, nanoparticles, or beads. As used herein, a "solid support" is any material to which a probe, target nucleotide, or product nucleotide can be attached, for example glass and modified or functionalized glass, plastics, polysaccharides, nylon, nitrocellulose, ceramics, resins, silica-based materials, carbon, metals, inorganic materials, and other polymers, such as a flow cell or other solid surface (such as a bead or microarray).
The term "at least" followed by a number is used herein to denote the start of a range beginning with that number (which, depending on the variable being defined, may be a range with an upper limit or a range with no upper limit). The term "at most" followed by a number is used herein to denote the end of a range ending with that number (which, depending on the variable being defined, may be a range with 1 or 0 as its lower limit or a range with no lower limit). When a range is given as "(a first number) to (a second number)" or "(a first number)-(a second number)", this means a range whose lower limit is the first number and whose upper limit is the second number. The terms "plural", "multiple", "plurality", and "multiplicity" as used herein denote two or more features.
The terms "bar code" and "index", used interchangeably herein, refer to a nucleotide sequence used to identify or "label" one or more specific target or product polynucleotides. A "bar code" is generally at least 5 nucleotides (nt) in length. In several embodiments, a bar code or a portion thereof may be present in the first complementary probe and/or the second complementary probe. As used herein, a bar code can serve as a sample bar code or an inquiry site bar code. In several embodiments, the same bar code sequence is present at two different positions in a polynucleotide, serving as a sample index bar code at one position and as an inquiry site bar code at the other. In several embodiments, different bar code sequences (which can have the same or different lengths) are present at two different positions in a polynucleotide, serving as a sample bar code at one position and as an inquiry site bar code at the other. A bar code can have a sequence identical to one present in the target polynucleotide or its complementary strand, it can be a sequence complementary to a sequence portion in the target polynucleotide or its complementary strand, it can be a sequence with no complementarity to the target polynucleotide or its complementary strand, or it can be any combination of these. In several embodiments, a single sequence serves as both the inquiry site bar code and the sample index. In several embodiments, one part of a single sequence serves as the inquiry site bar code and another part serves as the sample index.
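To make the identifying role of a bar code concrete, the sketch below matches an observed bar code against a list of known bar codes. Tolerating a single mismatch is an assumption added here for illustration, since sequencing reads can contain errors; the disclosure does not prescribe any matching rule. The function names and the example sequences are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_barcode(observed, known_barcodes, max_mismatches=1):
    """Return the single known bar code within max_mismatches, else None.

    An observed sequence that matches no known bar code, or matches more
    than one, is left unassigned rather than guessed.
    """
    hits = [
        b for b in known_barcodes
        if len(b) == len(observed) and hamming(observed, b) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


known = ["ACGTACGTACGT", "TTTTCCCCGGGG"]
assigned = assign_barcode("ACGTACGTACGA", known)
```

The same lookup could be applied once for the sample index and once for the inquiry site bar code, each against its own list.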
The term "base" means a nitrogen-containing heterocyclic moiety that can form Watson-Crick type hydrogen bonds with a complementary nucleobase or nucleobase analog (for example, a purine, a 7-deazapurine, or a pyrimidine). Typical bases are the naturally occurring bases adenine, cytosine, guanine, thymine, and uracil. Bases also include analogs of naturally occurring bases and universal bases, such as inosine, 3-nitropyrrole, and 5-nitroindole. Any universal base (a base that does not favor pairing with a particular base) can be used to practice the invention.
The term "base modification" as used herein refers to a polynucleotide that contains non-standard bases (that is, bases other than adenine, guanine, thymine, cytosine, and uracil). Such non-standard bases can serve a variety of purposes, such as stabilizing or destabilizing hybridization; promoting or inhibiting degradation; or acting as an attachment point for a detectable moiety, a quencher moiety, or another moiety. Many examples of modified bases (in addition to the modified bases described herein) and base analogs are known in the art.
The term "complementary polynucleotides" as used herein refers to polynucleotides that form base pairs with one another. Base pairs are usually formed by hydrogen bonds between nucleotide units in antiparallel polynucleotide strands. Complementary polynucleotide strands can base pair in the Watson-Crick manner (for example, A with T, A with U, C with G) or in any other manner that allows a duplex to form (including the wobble base pair formed between U and G). As those skilled in the art know, when RNA is used rather than DNA, uracil rather than thymine is considered complementary to adenine. In determining complementarity between a probe and a target gene, the degree of "complementarity" is expressed as a percentage between the probe sequence and the target gene sequence, or the complementary strand of the target gene sequence, that best matches it. In some embodiments, the degree of "complementarity" between the probe sequence and the target gene sequence or the complementary strand of the target gene sequence is not necessarily 100%. In one embodiment, the degree of "complementarity" is less than 100% but is sufficient for the probe sequence to hybridize to the target gene sequence or the complementary strand of the target gene sequence under the given conditions.
The term "complementary" as used herein refers to a polynucleotide or sequence that, when aligned antiparallel with another sequence, has complementary nucleotide bases at substantially all positions and contains no stretch of four or more contiguous non-complementary bases.
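The two measures mentioned above, the degree of complementarity as a percentage and the absence of a run of four or more contiguous non-complementary bases, can be sketched as follows. This is an illustration under simplifying assumptions: the probe and target region have equal length, both are written 5' to 3', the alignment is ungapped, and only Watson-Crick pairs (including A with U) are counted, so wobble pairs are scored as mismatches. All function names are hypothetical.

```python
# Watson-Crick pairs only; G-U wobble pairs are deliberately not counted here.
_PAIRS = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C"), ("A", "U"), ("U", "A")}


def _pairing_flags(probe: str, target: str):
    """Pair probe (5' to 3') against target read 3' to 5' (antiparallel)."""
    if not probe or len(probe) != len(target):
        raise ValueError("probe and target must be non-empty and equal in length")
    return [(p, t) in _PAIRS for p, t in zip(probe.upper(), target.upper()[::-1])]


def percent_complementarity(probe: str, target: str) -> float:
    """Percentage of positions that form a Watson-Crick pair."""
    flags = _pairing_flags(probe, target)
    return 100.0 * sum(flags) / len(flags)


def longest_mismatch_run(probe: str, target: str) -> int:
    """Length of the longest stretch of contiguous non-complementary bases."""
    longest = current = 0
    for paired in _pairing_flags(probe, target):
        current = 0 if paired else current + 1
        longest = max(longest, current)
    return longest


pct = percent_complementarity("AAAA", "TTTA")
run = longest_mismatch_run("AAAAAAAA", "TTCCCCTT")
```

In the second example four contiguous mismatches are found, so that pair would fall outside the definition of "complementary" given above even though half of its positions pair.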
The term "comprising" as used herein and its grammatical equivalents mean that, in addition to the features specifically identified, other features are optionally present. For example, a composition or device "comprising" (or "which comprises") components A, B, and C can contain only components A, B, and C, or can contain not only components A, B, and C but also one or more other components. The term "consisting essentially of" as used herein and its grammatical equivalents mean that, in addition to the features specifically identified, other features may be present that do not materially alter the claimed invention.
The term "contacting" as used herein can refer to combining two sequences under conditions that allow them to hybridize to each other (if they are sufficiently complementary). For example, the first complementary probe and the second complementary probe are contacted with a sample under conditions that allow the probes to hybridize to a target polynucleotide sequence in the sample (if they are sufficiently complementary).
The term "copy number variation" ("CNV") as used herein means a change in the DNA of a genome that results in a change in the copy number of one or more sections of DNA in the cell. CNVs generally correspond to relatively large regions of the genome that have been deleted (fewer than the normal number) or duplicated (more than the normal number) on certain chromosomes.
The term "corresponding" or "corresponding to" as used herein refers to a sequence that is homologous, substantially equivalent, or functionally equivalent to a specified sequence.
The term "determining" as used herein means inferring or ascertaining after reasoning, observation, and the like.
The term "DNA polymorphism" as used herein refers to the situation in which one of two or more different nucleotide sequences can be present at a specific site in DNA. Preferred polymorphic markers have at least two alleles, each present at a frequency higher than 1%, 2%, 3%, 4%, 5%, 6%, 7%, or more. In several cases, the allele occurs at a frequency greater than 10%, 15%, or 20% of a selected population. A polymorphic site can be as small as one base pair. A single nucleotide polymorphism (SNP) can be the substitution of one nucleotide by another nucleotide at the polymorphic site. A single nucleotide polymorphism can also be the deletion of one nucleotide or the insertion of one nucleotide at the polymorphic site. A biallelic polymorphism has two forms. A triallelic polymorphism has three forms. A single nucleotide polymorphism occurs at a polymorphic site occupied by a single nucleotide, which is the site of variation between allelic sequences. As those skilled in the art will appreciate, the term SNP is often used to include polymorphisms involving multiple nucleotides rather than a single base. Other polymorphisms include (small) deletions or insertions of several nucleotides, known as indels. "DNA polymorphism" can also refer to structural rearrangements, translocations, large insertions or deletions, inversions, and the like, and can include the addition to the genome of genetic material (which may or may not originate from the host).
The term "duplex" as used herein refers to a double-stranded nucleic acid molecule formed when complementary (or partially complementary) single-stranded nucleic acid molecules (for example, DNA, RNA, PNA) anneal to one another.
The term "first complementary probe" as used herein refers to a polynucleotide comprising a first sequence that is at least partially complementary to a first target sequence of a target polynucleotide. The first complementary probe can also include, among other sequences, an inquiry site bar code and/or a universal sequence (which can include a primer binding sequence). The inquiry site bar code can be specific for a particular allele, for a particular site, for an allele and site (in combination), or for alleles and sites. In some embodiments, the first complementary probe can also include a primer sequence used to generate a sample index. In several embodiments, the first complementary probe has a 5' phosphorylated nucleotide.
The term "first target sequence" refers to a portion of the target polynucleotide that is targeted for hybridization. The first target sequence may or may not be present in the sample.
The term "gene site" as used herein refers to the specific position or site of a gene, a base, or any sequence of interest on a chromosome or other type of nucleic acid.
The term "genotype" as used herein refers to the genetic makeup of an organism. It is used to refer to a single site, multiple sites, a site with two or more alleles, a variation in copy number or structure, or a monomorphic site.
The term "gap filling" as used herein refers to the situation in which the first complementary probe and the second complementary probe hybridize to the target sequence in a manner in which they are not immediately next to each other. When the first complementary probe and the second complementary probe hybridize to the first target sequence and the second target sequence, a gap may or may not be present between the first complementary probe and the second complementary probe. The "gap" can be 1, 2 to 10, 5 to 15, 7 to 15, 10 to 12, 15 to 25, 25 to 40, 30 to 45, 40 to 16, 60 to 65, 60 to 75, 70 to 85, 80 to 95, 90 to 120, 110 to 150, 120 to 160, 130 to 170, 150 to 190, 170 to 210, 190 to 230, 200 to 230, 220 to 260, 230 to 270, 240 to 310, 300 to 340, 330 to 370, 360 to 400, 390 to 430, 410 to 450, 440 to 480, 470 to 500, 2, 3, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280, 300, 320, 340, 360, 380, 400, 420, 440, 460, 480, 500, or more nucleotides. In several cases, the gap can be filled by using a polymerase and a ligase to extend one end of the first probe or the second probe with a single nucleotide or a combination of multiple nucleotides. Where the target polynucleotide is RNA, the gap can be filled by using a reverse transcriptase and a ligase to extend one end of the first complementary probe or the second complementary probe.
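Whether gap filling is needed follows directly from where the two target sequences lie on the target polynucleotide. The sketch below computes the gap from two intervals; the half-open (start, end) coordinate convention and the function names are assumptions made for illustration.

```python
def gap_length(first_target, second_target):
    """Number of nucleotides between two target sequences.

    Each argument is a half-open (start, end) interval on the target
    polynucleotide. A result of 0 means the sequences are immediately
    next to each other, so ligation alone could join the probes.
    """
    (_, end_a), (start_b, _) = sorted([first_target, second_target])
    if start_b < end_a:
        raise ValueError("target sequences overlap")
    return start_b - end_a


def needs_gap_filling(first_target, second_target):
    """True when at least one nucleotide separates the target sequences."""
    return gap_length(first_target, second_target) > 0


gap = gap_length((130, 150), (100, 125))
```

A gap of 5 here would call for extension by a polymerase (or a reverse transcriptase for an RNA target) before the ligase joins the probes, as described above.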
The term "hybridization" as used herein refers to the preferential binding, duplexing, or annealing of a nucleic acid molecule to a particular target polynucleotide, usually under stringent conditions. The term "stringent conditions" refers to conditions under which a probe will hybridize preferentially to its target polynucleotide and to a lesser extent to other sequences. "Stringent hybridization" in the context of nucleic acid hybridization is sequence-dependent and differs under different environmental parameters. The dependence of hybridization stringency on buffer composition, temperature, and probe length is well known to those skilled in the art (see, for example, Sambrook and Russell (2001), Molecular Cloning: A Laboratory Manual, 3rd edition, volumes 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor Press, NY). The degree of hybridization of a nucleotide sequence to a target sequence, also referred to as hybridization strength, is determined by methods known in the art. A preferred method is to determine the Tm of a given hybrid duplex.
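The disclosure refers to determining the Tm of a hybrid duplex and does not give a formula. As a rough illustration only, the sketch below applies the Wallace rule, a common rule of thumb for short oligonucleotides in which Tm is approximated as 2 °C for each A or T plus 4 °C for each G or C. It ignores salt, probe concentration, and mismatches, so it is not a substitute for measurement or for nearest-neighbor models. The function name is hypothetical.

```python
def wallace_tm(seq: str) -> int:
    """Approximate Tm in degrees Celsius with the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C); U is counted with A and T.
    Intended only as a rough estimate for short oligonucleotides.
    """
    seq = seq.upper()
    at = sum(seq.count(base) for base in "ATU")
    gc = sum(seq.count(base) for base in "GC")
    if at + gc != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T, U")
    return 2 * at + 4 * gc


tm = wallace_tm("ACGTACGTACGTACGT")
```

For the 16-nucleotide example, eight A/T and eight G/C bases give an estimate of 48 °C.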
The term "inquiry site" as used herein refers to a position in a nucleic acid that is being assessed, for example a SNP at a specific gene site whose presence, absence, or amount is assessed. In other embodiments, genetic variation is not assessed, and the presence, absence, or amount of a gene site is assessed. In other embodiments, the base composition at a specific position in a nucleic acid is assessed.
The term "inquiry site bar code" as used herein refers to a bar code that functions to identify a specific target polynucleotide and/or a variant thereof. An inquiry site bar code can be specific for a particular allele, for a particular site, for a particular allele and site, or for alleles and sites.
The term "label" refers to a moiety that, when attached directly or indirectly to a nucleotide or oligonucleotide, renders that nucleotide or oligonucleotide detectable by suitable detection means. Exemplary labels include bar codes, fluorophores, chromophores, radioisotopes, spin labels, enzyme labels, chemiluminescent labels, electrochemiluminescent compounds, magnetic labels, microspheres, metal sols, immunological labels, ligands, enzymes, and the like.
The term "site" as used herein means the position occupied by a specific gene or gene sequence on a chromosome or other nucleic acid structure. A site can be a sequence outside a gene. Examples of other nucleic acid structures include, but are not limited to, all types of RNA (mRNA, long non-coding RNA, small RNA, rRNA, and the like). Also included are all types of DNA, for example but not limited to plasmid, chromosomal, BAC, YAC, cosmid, mitochondrial, chloroplast, and plastid DNA, cDNA, and any other naturally occurring or artificial structure.
The term "mismatched nucleotide" as used herein refers to a nucleotide in a polynucleotide that is not complementary to the corresponding nucleotide in the corresponding probe or primer sequence when the sequences hybridize to each other. The complementary base of C is G, and the complementary base of A is T. For example, a "C" in the probe positioned opposite a "T" in the target polynucleotide is considered a mismatch.
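As a minimal sketch of the mismatch definition above, the following Python function lists the probe positions that are not complementary to the facing target bases. The function name `mismatch_positions`, the antiparallel 5'->3' alignment convention, and the example sequences are illustrative assumptions and are not part of the described method; only the A/T and C/G pairing comes from the text.

```python
# Watson-Crick partners for DNA, as stated in the definition above.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatch_positions(probe, target_region):
    """Return 0-based probe positions that are not complementary to the target.

    Both sequences are written 5'->3'. The probe pairs antiparallel with the
    target, so probe position i faces target position len - 1 - i.
    """
    if len(probe) != len(target_region):
        raise ValueError("probe and target region must be the same length")
    facing = target_region[::-1]  # target bases in the order the probe meets them
    return [i for i, (p, t) in enumerate(zip(probe, facing)) if COMPLEMENT[p] != t]


# Hypothetical example: the second probe base is a C facing a T in the target.
perfect = mismatch_positions("CAGGCAT", "ATGCCTG")
one_off = mismatch_positions("CCGGCAT", "ATGCCTG")
```

Here `perfect` is an empty list because the probe is fully complementary, while `one_off` reports the single position where a "C" in the probe faces a "T" in the target.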
As used herein, the term "modified polynucleotide" can be used for a nucleotide sequence comprising a universal base (e.g., deoxyinosine (also referred to herein as "inosine"), 3-nitropyrrole, or 5-nitroindole).
The term "next-generation sequencing" or "NGS" as used herein refers to high-throughput sequencing. NGS can also refer to third-generation, fourth-generation, and other generations of sequence data generation methods that are not high-throughput but have other characteristics that distinguish them from traditional Sanger sequencing.
As used herein, "nucleic acid" refers to a natural, synthetic, or artificial polynucleotide, such as DNA or RNA, that embodies a nucleotide sequence. A nucleic acid can be cleaved, cloned, replicated, amplified, or otherwise derivatized or manipulated. Exemplary DNA materials include genomic DNA (gDNA), mitochondrial DNA, and complementary DNA (cDNA). Exemplary RNA materials include messenger RNA (mRNA), transfer RNA (tRNA), microRNA (miRNA), small interfering RNA (siRNA), and ribosomal RNA (rRNA).
As used herein, the term "nucleic acid amplification" or "amplification" refers to any means of replicating at least a portion of at least one target nucleic acid, typically in a template-dependent manner, including but not limited to a variety of techniques for amplifying nucleic acid sequences linearly or exponentially. Non-limiting exemplary amplification methods include polymerase chain reaction (PCR), reverse transcriptase PCR, real-time PCR, nested PCR, multiplex PCR, quantitative PCR (Q-PCR), nucleic acid sequence-based amplification (NASBA), transcription-mediated amplification (TMA), ligase chain reaction (LCR), rolling circle amplification (RCA), strand displacement amplification (SDA), ligase detection reaction (LDR), multiplex ligation-dependent probe amplification (MLPA), ligation followed by Q-replicase amplification, primer extension, hyperbranched strand displacement amplification, multiple displacement amplification (MDA), nucleic acid strand-based amplification (NASBA), two-step multiplex amplification, digital amplification, and the like. Descriptions of such techniques can be found in: Ausbel et al.; PCR Primer: A Laboratory Manual, Diffenbach, Ed., Cold Spring Harbor Press (1995); The Electronic Protocol Book, Chang Bioscience (2002); The Nucleic Acid Protocols Handbook, R. Rapley, ed., Humana Press, Totowa, N.J. (2002); and Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press, NY, 1990.
As used herein, "nucleotide" refers to a monomeric unit of a polynucleotide, consisting of a heterocyclic base, a sugar, and one or more phosphate groups. The naturally occurring bases (guanine (G), adenine (A), cytosine (C), thymine (T), and uracil (U)) are typically derivatives of purine or pyrimidine, although it should be understood that naturally occurring and non-naturally occurring base analogs are also included. The naturally occurring sugars are the pentoses deoxyribose (which forms DNA) and ribose (which forms RNA), although it should be understood that naturally occurring and non-naturally occurring sugar analogs are also included. Nucleotides are typically linked by phosphate bonds to form nucleic acids or polynucleotides, although many other linkages are known in the art (e.g., phosphorothioates, boranophosphates, and the like).
The terms "polynucleotide" and "oligonucleotide" are used interchangeably herein and refer to a linear polymer of nucleotide monomers or modified forms thereof, including, for example, double-stranded and single-stranded deoxyribonucleic acid and ribonucleic acid. A polynucleotide can be composed entirely of deoxyribonucleic acid, ribonucleic acid, or analogs thereof, or can comprise blocks or mixtures of two or more different monomer types. When a polynucleotide is represented by a series of letters (such as "ATGCCTG"), it will be understood that, unless otherwise indicated, the nucleotides are in 5'->3' order from left to right, and that "A" denotes adenosine, "C" denotes cytidine, "G" denotes guanosine, "T" denotes thymidine, and "U" denotes uridine. When used alone, "polynucleotide" and "oligonucleotide" refer to sequences composed mainly or entirely of conventional DNA or RNA monomer units, i.e., deoxyribose or ribose sugar rings substituted with A, C, G, T, or U bases and linked by conventional phosphate backbone moieties. Polynucleotides generally comprise or consist of single-stranded polynucleotides of fewer than 100 nucleotides, although longer sequences of hundreds, thousands, or more bases are also contemplated. In some embodiments, a polynucleotide contains or comprises 2 to 100, 2 to 50, 2 to 25, 2 to 15, 5 to 50, 5 to 25, 5 to 15, 10 to 50, 10 to 25, 10 to 20, 10 to 15, 12 to 50, 12 to 25, or 12 to 20 nucleotides. Polynucleotides can be described by their length. For example, a sequence comprising 15 nucleotides can be referred to as a "15-mer".
A "primer" or "probe" is typically a nucleotide sequence comprising a region complementary to a sequence of at least 6 contiguous nucleotides of a target nucleic acid, although primers and probes can comprise fewer than 6 contiguous nucleotides. In some embodiments, a polynucleotide primer or probe is provided that comprises a sequence identical or complementary to 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, 12 or more, 13 or more, 14 or more, 15 or more, 16 or more, 17 or more, 18 or more, 19 or more, 20 or more, 21 or more, 22 or more, 23 or more, 24 or more, 25 or more, 26 or more, 27 or more, 28 or more, 29 or more, 30 or more, 31 or more, 32 or more, 33 or more, 34 or more, 35 or more, 36 or more, 37 or more, 38 or more, 39 or more, 40 or more, 41 or more, 42 or more, 43 or more, 44 or more, 45 or more, 46 or more, 47 or more, 48 or more, 49 or more, 50 or more, 60 or more, 70 or more, 80 or more, 90 or more, or up to 100 contiguous nucleotides of the target polynucleotide. When a primer or probe comprises a region that is "fully complementary" to a number of contiguous nucleotides of the target molecule, i.e., there are no mismatches along its length, the primer or probe can be said to be 100% complementary to the target molecule.
When a "probe" forms a duplex with the target polynucleotide, a signal can be generated directly or indirectly during or after hybridization of the probe.
The term "product polynucleotide" as used herein refers to the polynucleotide formed when a first complementary probe and a second complementary probe, complementary to a first target sequence and a second target sequence of a target polynucleotide (with or without a gap between the target sequences), are joined (e.g., by ligation) to form a single "product" polynucleotide.
The term "sample index barcode" or "sample index" as used herein refers to an identifying sequence used to designate a specific sample and to track information related to that sample, even when the sample (or its reaction products, or the lack thereof) is mixed with other samples (and/or their reaction products, or the lack thereof).
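To illustrate how a sample index lets pooled material be traced back to its sample, here is a minimal demultiplexing sketch in Python. It assumes, purely for illustration, that the index occupies the first `index_len` bases of each read and must match exactly; the function name `demultiplex`, the index sequences, and the sample names are hypothetical and do not come from this specification.

```python
from collections import defaultdict


def demultiplex(reads, sample_indexes, index_len):
    """Group pooled reads by the sample index found at the start of each read.

    reads: iterable of sequence strings (index assumed to be the leading bases).
    sample_indexes: dict mapping an index sequence to a sample name.
    Reads whose index is not recognised are collected under "unassigned".
    """
    by_sample = defaultdict(list)
    for read in reads:
        index, insert = read[:index_len], read[index_len:]
        by_sample[sample_indexes.get(index, "unassigned")].append(insert)
    return dict(by_sample)


# Hypothetical pooled reads from two indexed samples plus one unknown index.
indexes = {"ACGT": "sample_1", "TGCA": "sample_2"}
pooled = ["ACGTGGATTC", "TGCAGGATTC", "ACGTCCTAAG", "GGGGTTTTAA"]
grouped = demultiplex(pooled, indexes, index_len=4)
```

A real pipeline would usually tolerate sequencing errors in the index; exact matching is used here only to keep the idea visible.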
The term "first complementary probe or second complementary probe" as used herein refers to a polynucleotide having a first sequence or second sequence that is complementary to the first target sequence or second target sequence of the target polynucleotide. In some embodiments, the first or second complementary probe can also comprise a primer sequence used to add a sample index. In some embodiments, the universal sequences in the first complementary probe and the second complementary probe are deliberately the same or different. In some embodiments, the first or second complementary probe has a 5' phosphorylated nucleotide.
The term "first target sequence or second target sequence" refers to a portion of the target polynucleotide that is targeted for hybridization. In certain embodiments, the first or second target sequence may or may not be present in a sample.
The term "sequencing" as used herein refers to DNA sequencing, i.e., the process of determining the order of nucleotides within a DNA molecule. It includes any method or technique that can be used to determine the order of adenine, guanine, cytosine, and thymine in a DNA strand. Sequencing can also include RNA sequencing, in which the order of bases in RNA is determined.
As used herein, the term "single nucleotide polymorphism" or "snp" or "SNP" refers to a nucleic acid sequence variation within a population, typically one in which a single nucleotide (A, T, C, or G) differs between paired chromosomes. The most common SNPs have two alleles, but they can have more than two alleles. SNPs can also be present in RNA molecules. In an RNA molecule, a SNP can reflect a difference in RNA processing.
The term "target polynucleotide" as used herein refers to a sequence in a nucleic acid or polynucleotide that is the target of hybridization. A target polynucleotide may or may not be present in a sample. In some embodiments, the target polynucleotide comprises RNA or DNA that is partially or fully complementary to the first and second complementary probes of the invention. Target polynucleotides can generally be described using the four bases of DNA (A, T, G, and C) and the four bases of RNA (A, U, G, and C).
The term "target sequence" refers to a portion of the target polynucleotide that is targeted for hybridization. The target sequence can be a first target sequence and a second target sequence, and may or may not be present in a sample. As will be understood by those skilled in the art, a reference to a "target sequence" can also refer to the complementary strand of the target sequence.
The term "thermal melting point" or "Tm" as used herein refers to a specific sequence at a defined ionic strength and pH. Tm is the temperature at which 50% of the target sequence is hybridized to a perfectly matched probe. Tm also refers to the temperature at which half of the DNA strands are in the single-stranded (ssDNA) state. Tm depends on various parameters, such as the length of the hybridizing complementary strand sequences, their specific nucleotide sequence, their base composition, the concentration of the complementary strands, and other conditions of the solution.
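For orientation only, a rough Tm estimate for short oligonucleotides can be obtained with the Wallace rule, Tm ≈ 2 °C × (A + T) + 4 °C × (G + C). This rule of thumb is an outside approximation and is not the measurement method of this specification; it ignores the ionic strength, pH, and strand concentration on which, per the definition above, Tm depends. The function name `wallace_tm` is hypothetical.

```python
def wallace_tm(sequence):
    """Estimate Tm in degrees Celsius with the Wallace rule: 2*(A+T) + 4*(G+C).

    Intended only as a rough guide for short DNA oligonucleotides; it does not
    model salt, pH, or strand concentration.
    """
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G, and T")
    return 2 * at + 4 * gc


# The example sequence from the polynucleotide definition: 3 A/T and 4 G/C bases.
estimate = wallace_tm("ATGCCTG")
```

The sketch shows only the dependence on length and base composition: a GC-rich sequence scores higher than an AT-rich sequence of the same length.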
The term "universal base" as used herein refers to a base that helps prevent or reduce the frequency of molecule joining when the 3' end of the first complementary probe is not complementary to one or more target polymorphic nucleotides or nucleotides. Examples of universal bases include inosine, 3-nitropyrrole, and 5-nitroindole.
The term "universal sequence" as used herein refers to a sequence component of the first complementary probe or second complementary probe that can include a universal primer sequence.
The term "universal primer sequence" or "universal primer binding sequence" includes a primer sequence that is complementary to a primer sequence, such as a PCR primer sequence, and that is used to add one or more of (i) a sample index, (ii) an additional sequence, (iii) one or more sequences used to generate sequence data or for other forms of detection, and (iv) other moieties. As will be understood by those skilled in the art, the term "universal primer sequence" or "universal primer binding sequence" is used to refer to the primer sequence or its complementary strand.
PCR primer sequences are typically used in pairs, and the two members of a pair can differ in composition. Apart from the sample index, any two pairs of PCR primer sequences can have the same sequence. In some embodiments, the sequence of primer #1 is the same in the first pair and the second pair, the sequence of primer #2 differs from the sequence of primer #1, and, apart from the sample index, the sequence of primer #2 is the same in the first pair and the second pair. In some embodiments, a PCR primer comprises one or more universal sequences and/or one or more sample indexes and/or one or more sequences having other functions. A PCR reaction with one or more universal primer sequences can be used to add a sample index. The universal primer sequence in the first complementary probe and its complementary portion in the first PCR primer may or may not be of equal length or 100% complementary. The universal primer sequence in the second complementary probe and its complementary portion in the second PCR primer may or may not be of equal length or 100% complementary. A universal primer sequence can be used to add an adapter sequence that is used for attachment to a solid support. In some cases, the attachment to the solid support is for next-generation sequencing. In other cases, the attachment to the solid support is for array-based detection of product polynucleotides. In some cases, the universal primer sequence is used to add sequences or moieties that enable other forms of detection or the generation of sequence data.
As used herein, the term "PCR primer" or "PCR primer sequence" may or may not refer to the same sequence as a "universal primer sequence". As will be understood by those skilled in the art, the term "PCR primer" or "PCR primer sequence" can be used to refer to a PCR primer or its complementary strand.
This specification incorporates by reference all documents mentioned herein and all documents filed concurrently with this specification or previously in connection with this application (including, but not limited to, such documents that are open to public inspection with this specification).
Compositions and methods.
The present invention provides improved compositions and methods for determining the presence, absence, amount, copy number, or characteristics of one or more target polynucleotides in a sample or in multiple samples. A target polynucleotide can involve a polymorphism, such as a substitution, deletion, insertion, copy number variation, translocation, nucleotide modification (e.g., methylation), or any other change in the target polynucleotide or its state.
The methods of the present invention can be used in solution-based hybridization assays to identify the presence, absence, copy number, or amount (or a combination thereof) of a large number of target polynucleotides in one or more samples.
In some embodiments, multiple samples (e.g., 2-50000) are provided, which may or may not comprise different target polynucleotides. A plurality of first complementary probes and second complementary probes, each comprising a sequence complementary to a target sequence of interest, can be incubated with one or more samples under conditions that allow the first and second complementary probe sequences to hybridize to the complementary first and second target sequences. Exemplary first and second complementary probe sequences are about 50 to 200 nucleotides in length. In some embodiments, the first target sequence is located to the left of the inquiry site or polymorphic nucleotide. In some embodiments, these methods can be used to identify, for example, single-nucleotide or multi-nucleotide polymorphisms, deletions, insertions, translocations, covalent nucleotide modifications, and the like.
In certain embodiments, these methods can be used to determine the presence, absence, or amount of a specific target polynucleotide, for example to determine the presence, absence, or amount of a pathogen-associated or cancer-associated sequence in a sample (e.g., a biological sample).
In some exemplary methods for identifying the presence or absence of a target polynucleotide in a sample, a plurality of first complementary probes and second complementary probes can be incubated, under conditions that provide for hybridization of complementary sequences, with one or more samples whose target polynucleotide sequences may or may not contain a polymorphism.
In certain embodiments, if the first complementary polynucleotide probe and the second complementary polynucleotide probe hybridize to target polynucleotide sequences that are adjacent to each other, the complementary probes can be joined together to form a product polynucleotide.
In one embodiment, a polymorphic nucleotide is present at the 3' end of the first complementary probe. In one example of this embodiment, the polymorphic nucleotide is a SNP, and the two allelic forms are represented by two different first complementary probes that are identical except for the 3' nucleotide. (See Fig. 1E.)
In certain embodiments, if the target polynucleotide of interest is not present in a particular sample, or if the polymorphic nucleotide or allele targeted by the first probe or second probe is not present in the sample, the first probe and second probe will not hybridize to a nucleotide sequence in the sample and no product polynucleotide will be formed, which corresponds to a determination that the target polynucleotide of interest is absent from the sample.
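The joining logic described above can be sketched as a simplified model in Python. The sketch assumes perfect matching of both probes (including the 3' terminal nucleotide of the first probe) and strict adjacency with no gap; the function names, probe sequences, and target sequences are hypothetical, and real hybridization and joining are not all-or-nothing as modeled here. `str.find` locates only the first occurrence, so repeated sequences are not handled.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def join_if_adjacent(target, first_probe, second_probe):
    """Return the product polynucleotide, or None if no product would form.

    All sequences are written 5'->3'. The 3' end of the first probe must abut
    the 5' end of the second probe when both are hybridized to the target.
    """
    foot1 = reverse_complement(first_probe)   # first probe's footprint on the target
    foot2 = reverse_complement(second_probe)  # second probe's footprint on the target
    start1 = target.find(foot1)
    start2 = target.find(foot2)
    if start1 == -1 or start2 == -1:
        return None  # at least one probe does not hybridize
    if start2 + len(foot2) != start1:
        return None  # both hybridize, but not immediately next to each other
    return first_probe + second_probe


# Hypothetical SNP (A or G) at target position 8; the two first probes differ
# only in their 3' nucleotide (T pairs with A, C pairs with G).
target_a = "AACGTTGC" + "A" + "TCGGATCA"
target_g = "AACGTTGC" + "G" + "TCGGATCA"
probe_for_a = "TGATCCGAT"
probe_for_g = "TGATCCGAC"
second = "GCAACGTT"
```

With these inputs, a product forms only when the allele-specific first probe matches the allele present in the target; the mismatched probe yields `None`, which corresponds to the "no product polynucleotide" outcome described above.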
In one example, the first complementary probe and the second complementary probe comprise target-complementary sequences, and the first complementary probe also comprises a 3' terminal nucleotide that is complementary to the polymorphic nucleotide on the target polynucleotide. See Fig. 1E.
In another example, the first complementary probe and the second complementary probe comprise target-complementary sequences, and the second complementary probe also comprises a 5' terminal nucleotide that is complementary to the polymorphic nucleotide on the target polynucleotide.
Fig. 8 shows a variation of the method for determining the presence or absence of target polynucleotides from a tetraploid organism (in which each target polynucleotide comprises two copies (alleles)). Each strand at a given polymorphic site can be analyzed for the polymorphism.
In another example, a plurality of first complementary probes is provided, wherein the probes correspond to a variety of possible polymorphisms, polymorphic nucleotides, or alleles at a given site. In an example in which a single base is interrogated (substitution, insertion, or deletion), there can be nine first complementary probes for a given site, and for a multi-nucleotide polymorphism there can be more than nine first complementary probes for a given site. In some cases, there is a single second complementary probe for a given site.
Preferably, each target polynucleotide has at least 2 different first complementary probes. For example, each target polynucleotide can have at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 different first complementary probes. Preferably, each target polynucleotide has 1 second complementary probe. However, each target polynucleotide can have at least 2 different second complementary probes. For example, each target polynucleotide can have at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 different second complementary probes. Each different first complementary probe and/or second complementary probe can be specific to a particular allele.
In some embodiments, the probe comprises a detectable label or moiety. In some embodiments, when the probe is a capture probe, for example when the probe is used for capture on a solid surface such as a microarray or bead, the probe is unlabeled. In some embodiments, the label is a barcode. In some embodiments, the probe cannot be extended, for example by a polymerase. In some embodiments, the probe can be extended.
Sample.
In some embodiments, the amount of DNA or RNA in a sample used in the methods of the present invention is less than 100 μg, less than 80 μg, less than 60 μg, less than 40 μg, less than 20 μg, less than 10 μg, less than 5 μg, less than 4 μg, less than 3 μg, less than 2 μg, less than 1 μg, less than 500 ng, less than 400 ng, less than 300 ng, less than 200 ng, less than 100 ng, less than 50 ng, less than 40 ng, less than 30 ng, less than 20 ng, less than 10 ng, less than 5 ng, less than 1 ng, less than 0.1 ng, or less than 0.01 ng, or is 0.01 ng to 1000 ng, 5 ng to 500 ng, 5 ng to 250 ng, 10 ng to 125 ng, 10 ng to 100 ng, 5 ng to 50 ng, or 5 ng to 25 ng.
A sample can be derived from any animal, plant, microbial, viral, synthetic DNA, or synthetic RNA source. "Multiple samples" refers to two or more samples from the same or different sources. For example, each sample can be derived from a different animal or a different plant, or the samples can be derived from different microbial sources. In some exemplary embodiments, the multiple samples are 2, 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, or more samples, for example 2 to 10000 samples, 2 to 20 samples, 5 to 30 samples, 10 to 50 samples, 25 to 75 samples, 40 to 100 samples, 50 to 120 samples, 60 to 130 samples, 70 to 140 samples, 80 to 150 samples, 90 to 170 samples, 100 to 200 samples, 150 to 250 samples, 200 to 300 samples, 250 to 500 samples, 300 to 700 samples, 400 to 1000 samples, 500 to 1500 samples, 600 to 2000 samples, 700 to 3000 samples, 800 to 4000 samples, 900 to 5000 samples, 50 to 1000 samples, 100 to 1000 samples, 200 to 2000 samples, 300 to 3000 samples, 400 to 4000 samples, 500 to 5000 samples, 600 to 6000 samples, 700 to 7000 samples, 800 to 8000 samples, 9 to 9000 samples, or 1000 to 100000 samples.
DNA or RNA can be isolated from any source, for example a biological source such as blood or other tissue, bodily fluids, hair, nasal swabs, germplasm, plant material, and the like. Essentially any source of nucleic acid can be used. In some embodiments, the sample can contain contaminants or inhibitors that interfere with the method, such as salts or other components like PCR inhibitors. In such cases, the sample can be extracted or otherwise purified to reduce or eliminate the inhibitory components.
In some variations, polynucleotides can be isolated from a sample using a variety of methods, such as mechanical isolation (e.g., glass bead techniques), chemical extraction, column-based methods, or combinations thereof. Any of the numerous DNA extraction methods familiar to those skilled in the art can be used in the methods described herein.
The DNA in a nucleic acid sample can be double-stranded, single-stranded, or double-stranded DNA that has been denatured to single-stranded DNA. Denaturation of a double-stranded sequence yields two single-stranded sequences, one or both of which can be assayed (in a single reaction) using probes specific for each strand. Preferred nucleic acid samples comprise target polynucleotides of genomic DNA, cDNA, DNA fragments (e.g., restriction fragments), and the like.
Before being combined with the set of complementary probes, the sample can be treated to fragment the nucleic acid. This can occur by one or more of the following methods: physical fragmentation, using, for example, sonication; shearing, such as acoustic shearing, needle shearing, or point-sink shearing; nebulization; passage through a pressure cell; or heating; enzymatic cleavage, using, for example, DNase I, a restriction enzyme, a non-specific nuclease, or a transposase; or chemical cleavage, for example using heat and a divalent metal.
Before the incubation (or hybridization) step, one or more target polynucleotides in the one or more samples can be reversibly denatured. This can be achieved, for example, by a heating step, such as heating to at least 70 °C, 70 °C to 100 °C, 75 °C to 100 °C, 80 °C to 98 °C, 85 °C to 95 °C, 90 °C to 100 °C, or 95 °C to 100 °C. Preferably, the heating step heats to 95 °C to 100 °C. The heating step can last at least 30 seconds, at least 1 minute, 1-30 minutes, 2-25 minutes, 3-20 minutes, 4-15 minutes, or 5-10 minutes. Preferably, the heating step lasts 1-15 minutes.
In some embodiments, the nucleic acid in the sample is reversibly denatured after being combined with the complementary probes. Double-stranded DNA can be denatured to single-stranded DNA, for example by heating to about 98 °C for about 1 minute.
Double-stranded DNA is denatured to single-stranded DNA using standard conditions known to those skilled in the art (e.g., heating to about 98 °C for about 5 minutes). When practicing the methods disclosed herein, the sample, or the sample together with the first and second complementary probes, can be heated before hybridization to a temperature of 70 °C to 100 °C, 75 °C to 100 °C, 80 °C to 98 °C, 85 °C to 95 °C, 90 °C to 100 °C, 95 °C to 100 °C, 70 °C, 75 °C, 80 °C, 85 °C, 86 °C, 87 °C, 89 °C, 90 °C, 91 °C, 92 °C, 93 °C, 94 °C, 95 °C, 96 °C, 97 °C, 98 °C, 99 °C, or 100 °C.
Target.
In some embodiments, the target polynucleotide can be any nucleotide sequence whose presence, absence, amount, or characteristics need to be determined. In some embodiments, the target polynucleotide can be preselected by the person designing a given assay, and/or associated with a specific genotype or phenotype of interest, and/or selected for other reasons.
In some embodiments, the target polynucleotide is a nucleotide sequence that comprises a polymorphism, represents a polymorphism, or is associated with a polymorphism.
In some embodiments, an allele can be interrogated by targeting one or more nucleotide polymorphisms. In some cases, the polymorphism occurs at a single nucleotide position; for example, one allele can have a thymine at a given position while the alternative allele has, for example, a cytosine at the same position. In some embodiments, a nucleotide polymorphism can comprise a substitution, deletion, insertion, copy number variation, translocation, methylation or other nucleotide modification, and/or a variant DNA sequence. In some embodiments, the polymorphism can comprise two, three, four, or more contiguous nucleotides.
The compositions and methods disclosed herein can be used to identify single nucleotide polymorphisms (SNPs) in a target polynucleotide sequence. For example, for a genomic DNA sample from a diploid mammal carrying two copies of a given SNP, the SNP can be homozygous or heterozygous. In other examples, a triploid organism can have 3 different alleles at a given site. Polyploid cells and organisms contain more than two paired sets of chromosomes and show numerical changes across the whole chromosome set. Polyploidy is common in plants. For example, wheat occurs in diploid (two sets of chromosomes), tetraploid (four sets of chromosomes), and hexaploid (six sets of chromosomes) forms. See Example 8.
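The relationship between ploidy and the alleles observed at a site can be sketched as follows. This is a hypothetical helper written for illustration, not a genotyping procedure from this specification; the function name `classify_site` and its return labels are assumptions, and it only counts distinct alleles without modeling allele dosage.

```python
def classify_site(observed_alleles, ploidy=2):
    """Classify a site from the distinct alleles observed for one organism.

    observed_alleles: iterable of allele labels, e.g. ["T", "C"].
    ploidy: number of chromosome sets, which caps the number of distinct alleles.
    """
    alleles = set(observed_alleles)
    if not alleles:
        return "absent"
    if len(alleles) > ploidy:
        raise ValueError("more distinct alleles than chromosome sets")
    return "homozygous" if len(alleles) == 1 else "heterozygous"


# A diploid sample carries at most two alleles; a triploid can carry three.
diploid_call = classify_site(["T", "C"], ploidy=2)
triploid_call = classify_site(["A", "C", "T"], ploidy=3)
```

Three distinct alleles are accepted for a triploid but rejected for a diploid, mirroring the statement above that a triploid organism can have 3 different alleles at a given site.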
In some embodiments, in a method for identifying a polymorphism in a target polynucleotide sequence, the first complementary probe and the second complementary probe are incubated, under conditions that provide for hybridization of complementary sequences, with one or more samples whose target polynucleotide sequences may or may not contain the polymorphism. In some embodiments, an optional third probe is provided for a particular probe set. The third probe is generally similar to the first probe or second probe, but is directed to a different allele of the same sequence of interest. See Fig. 1E.
In certain embodiments, if a complementary polynucleotide probe includes a polymorphic nucleotide that is complementary to the polymorphic nucleotide in the target polynucleotide sequence, the complementary probes are joined to form a product polynucleotide. In certain embodiments, if the polymorphic nucleotide on the complementary polynucleotide probe does not hybridize to the polymorphic nucleotide on the target polynucleotide, the two complementary probes are generally not joined and no product polynucleotide is formed.
In some embodiments, the product polynucleotide (or a portion or portions of the product polynucleotide, its amplification product, or their complementary strands) is sequenced to determine the presence or absence of a polymorphism. In some cases, sample identity can also be determined by sequencing. In some other embodiments, the presence or absence of a polymorphism is determined using an array or another readout. In some embodiments, the capture probes or oligonucleotides provided on the array are designed to be substantially complementary to the extended portion of the primer, so that unextended primers do not bind to the capture probes. Alternatively, unreacted probes can be removed before addition to the array or before sequencing.
In some embodiments, the first complementary sequence and second complementary sequence of the first complementary probe and second complementary probe vary depending on one or more of many possible parameters, such as: (i) the melting temperature of the duplex formed with the target polynucleotide, (ii) Tm, (iii) the ionic strength of the hybridization solution, (iv) the complexity of the target polynucleotide, and the like.
In some embodiments, the sample comprises one or more different target polynucleotides. Specifically, the sample comprises at least 2 different target polynucleotides, at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100 or more, 120 or more, 140 or more, 160 or more, 180 or more, 200 or more, 220 or more, 240 or more, 260 or more, 280 or more, or 300 or more target polynucleotides, for example 2 to 5000 target polynucleotides, 2 to 20 target polynucleotides, 5 to 30 target polynucleotides, 10 to 50 target polynucleotides, 25 to 75 target polynucleotides, 40 to 100 target polynucleotides, 50 to 120 target polynucleotides, 60 to 130 target polynucleotides, 70 to 140 target polynucleotides, 80 to 150 target polynucleotides, 90 to 170 target polynucleotides, 100 to 200 target polynucleotides, 150 to 250 target polynucleotides, 200 to 300 target polynucleotides, 250 to 500 target polynucleotides, 300 to 700 target polynucleotides, 400 to 1000 target polynucleotides, 500 to 1500 target polynucleotides, 600 to 2000 target polynucleotides, 700 to 3000 target polynucleotides, 800 to 4000 target polynucleotides, 900 to 5000 target polynucleotides, 50 to 1000 target polynucleotides, 100 to 2000 target polynucleotides, 200 to 3000 target polynucleotides, 300 to 4000 target polynucleotides, 500 to 5000 target polynucleotides, or 100 to 10000 target polynucleotides.
The length of the target polynucleotide can vary. In several embodiments, the target polynucleotide is 10 nt to 100 nt, 10 nt to 200 nt, 10 nt to 300 nt, or 10 nt to 400 nt long. In several embodiments, the target polynucleotide is 20 nt to 30 nt, 20 nt to 40 nt, 20 nt to 50 nt, 20 nt to 60 nt, 20 nt to 70 nt, 20 nt to 80 nt, 20 nt to 90 nt, 20 nt to 100 nt, 20 nt to 110 nt, 20 nt to 120 nt, 20 nt to 130 nt, 20 nt to 140 nt, 20 nt to 150 nt, 20 nt to 160 nt, 20 nt to 170 nt, 20 nt to 180 nt, 20 nt to 190 nt, 20 nt to 200 nt, 20 nt to 210 nt, 20 nt to 220 nt, 20 nt to 230 nt, 20 nt to 240 nt, 20 nt to 250 nt, 20 nt to 260 nt, 20 nt to 270 nt, 20 nt to 280 nt, 20 nt to 290 nt, 20 nt to 300 nt, 20 nt to 310 nt, 20 nt to 320 nt, 20 nt to 330 nt, 20 nt to 340 nt, 20 nt to 350 nt, 20 nt to 360 nt, 20 nt to 370 nt, 20 nt to 380 nt, 20 nt to 390 nt, or 20 nt to 400 nt long.
In some cases, the length of the target sequence can vary according to the melting temperature ("Tm") of the sequence, the pH, the salt concentration, or the temperature of the incubation step.
The Tms of the multiple target polynucleotides assessed in a given assay are typically within 1 °C, 2 °C, 3 °C, 4 °C, 5 °C, 6 °C, 7 °C, 8 °C, 9 °C, or 10 °C of each other. In several embodiments, the Tms of the multiple target polynucleotides are within 1-3 °C, 2-5 °C, 2-4 °C, 3-6 °C, 3-5 °C, 4-7 °C, 4-6 °C, 5-8 °C, 5-7 °C, 6-9 °C, 6-8 °C, 7-10 °C, 7-9 °C, 8-10 °C, or 8-9 °C of each other.
Hybridization is carried out under a variety of conditions well known in the art. Stringent conditions are hybridization conditions under which a polynucleotide hybridizes preferentially to its target subsequence and to a lesser extent, or not at all, to other sequences in a mixed population.
In general, stringent hybridization conditions are selected to be about 5 °C below the thermal melting point (Tm) of the particular sequence at a defined ionic strength and pH. Very stringent conditions are selected to be equal to the Tm of the particular probe.
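The Tm-based rules above can be illustrated with a minimal Python sketch. It assumes the Wallace rule (2 °C per A/T, 4 °C per G/C) as a crude Tm estimate for short oligonucleotides; the source does not say how Tm is calculated, and the function names are hypothetical.

```python
def wallace_tm(seq: str) -> float:
    """Crude Tm estimate in deg C (Wallace rule; an assumption, not the source's method)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def stringent_temperature(tm: float, offset: float = 5.0) -> float:
    """Stringent hybridization temperature: about `offset` deg C below the Tm."""
    return tm - offset


def tms_within_window(tms, window: float) -> bool:
    """True if all Tms lie within `window` deg C of each other."""
    return max(tms) - min(tms) <= window
```

For example, `tms_within_window([58, 60, 61], 3)` checks whether three targets fall within a 3 °C window. A nearest-neighbor model would give more realistic Tm values than the Wallace rule.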
In practicing the methods of the present invention, many aspects of the hybridization reaction conditions can be varied, including but not limited to the temperature of the hybridization reaction, the length of the incubation, and the ionic strength of the hybridization buffer.
Joining-ligation
In certain embodiments, after the sample, the first complementary probe, and the second complementary probe are incubated under conditions that allow the first and second complementary probes to hybridize to complementary target polynucleotides in the sample, the first complementary probe and the second complementary probe can be joined. When the first and second complementary probes hybridize to target-specific sequences adjacent to each other, the corresponding 5'-phosphorylated and 3'-hydroxylated ends of the probe pair can be joined by any suitable means known in the art.
In certain embodiments, the first complementary probe and the second complementary probe can be joined non-covalently. In other cases, the first complementary probe and the second complementary probe can be joined covalently. In some cases, covalent joining can be achieved using a ligase (for example, a DNA ligase or Ligase-65 from Thermus aquaticus (T. aquaticus)). In such cases, the ligase and ligation buffer may be added to the solution containing the adjacent first and second complementary probes bound to the target polynucleotide in the sample. In alternative embodiments, the hybridization complex is added to the ligation solution. The temperature of the ligation reaction can be held constant for about 1 to 20 minutes, for example for about 1 minute, 2 minutes, 3 minutes, 4 minutes, 5 minutes, 6 minutes, 7 minutes, 8 minutes, 9 minutes, 10 minutes, 11 minutes, 12 minutes, 13 minutes, 14 minutes, 15 minutes, 16 minutes, 17 minutes, 18 minutes, 19 minutes, 20 minutes, or longer than 20 minutes. In several embodiments, the ligation reaction is performed at about 54 °C.
In embodiments in which RNA is analyzed, the target polynucleotide can be converted to cDNA before hybridization with the first and second complementary probes, or the RNA transcript can be used as the hybridization target of the first and second complementary probes. When an RNA transcript is used as the hybridization target, the first and second complementary probes can still comprise DNA in embodiments that use ligation to join the first and second complementary probes, because the joining step can be modified by methods known in the art to promote DNA ligation on an RNA template (see, for example, U.S. Patent 8,790,873). Exemplary ligases for ligating DNA on an RNA template include SplintR ligase, PBCV-1 DNA ligase, or Chlorella virus DNA ligase.
When ligation is complete, the temperature of the ligation reaction can be raised to about 94 °C and held for 1 minute to help inactivate the DNA ligase and denature the product polynucleotides. In some cases, the temperature can be raised to 90 °C, 91 °C, 92 °C, 93 °C, 94 °C, 95 °C, 96 °C, 97 °C, 98 °C, or 99 °C for about 1 minute, about 2 minutes, 3 minutes, 4 minutes, or 5 minutes. In some cases, the ligation mixture is then rapidly cooled to room temperature, about 4 °C, or about 0 °C.
Use of universal bases
As is well known to those skilled in the art, ligases can make errors, for example by ligating or "sealing" sequences when a mismatch (such as a G/T mismatch) is present between the two nucleic acid strands. When the first complementary polynucleotide probe and the second complementary polynucleotide probe hybridize to the target sequence, mismatches may be present and the sequences may not be 100% complementary. In several embodiments, a complementary probe is configured with a universal base, such as inosine (for example, deoxyinosine), near the interrogation site. A complementary probe containing inosine (for example, deoxyinosine) will form a base pair of relatively low stability with the complementary strand. In some embodiments, the universal base is substituted at the 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, or 10th position relative to the 3' nucleotide of the first complementary probe. Preferably, the universal base is substituted at the 2nd position relative to the 3' nucleotide of the first complementary probe. The universal base helps to reduce or prevent ligation, to the second complementary polynucleotide probe, of a first complementary probe having one or more 3' nucleotides that are not complementary to the target sequence. Alternatively, or in addition, a universal base can be substituted at the 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, or 10th position relative to the 5' nucleotide of the second complementary polynucleotide probe, to prevent or reduce ligation, to the first complementary polynucleotide probe, of a second complementary polynucleotide probe having one or more 5' nucleotides that are not complementary to the target sequence. In some embodiments, inosine (for example, deoxyinosine) is a universal base used to destabilize mismatches (mostly G/T mismatches), so that when a mismatch is present the ligase will not seal the first complementary polynucleotide probe to the second complementary polynucleotide probe. In other embodiments, inosine (for example, deoxyinosine) or another universal base is used to avoid a destabilizing mismatch in the body of the target sequence (where a known proximal SNP occurs), so that proper ligation can still be achieved; otherwise this ligation would be disrupted even though the 3' nucleotide of the first probe is complementary to the target sequence at the interrogation position. In certain embodiments, where two or more polymorphic positions are known to occur at positions complementary to the first complementary probe and/or the second complementary probe, universal bases can be used at those positions to avoid destabilizing mismatches.
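The positional substitution described above can be sketched as follows. This is a minimal illustration under two assumptions that the source does not state explicitly: probes are written 5' to 3', and position 1 is the terminal nucleotide at the named end (so "2nd position relative to the 3' nucleotide" is the penultimate base). The function name and the use of "I" for inosine are hypothetical conventions.

```python
def substitute_universal_base(probe: str, position: int,
                              end: str = "3prime", base: str = "I") -> str:
    """Replace the nucleotide at `position` (counted from the given end,
    terminal nucleotide = 1) with a universal base such as inosine ("I")."""
    if not 2 <= position <= 10:
        raise ValueError("position must be between 2 and 10")
    if position > len(probe):
        raise ValueError("position exceeds probe length")
    if end == "3prime":
        idx = len(probe) - position   # count back from the 3' terminus
    elif end == "5prime":
        idx = position - 1            # count forward from the 5' terminus
    else:
        raise ValueError("end must be '3prime' or '5prime'")
    return probe[:idx] + base + probe[idx + 1:]
```

With these conventions, `substitute_universal_base("ACGTACGT", 2)` modifies the first probe near its 3' end, and `end="5prime"` does the same for the 5' end of the second probe.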
Barcodes and sample indexes
In certain embodiments, the first complementary probe and/or the second complementary probe comprises a barcode that identifies the sample and/or the target sequence (locus and/or polymorphism or interrogation site).
In several embodiments, the first complementary probe is complementary to the first target sequence and comprises an interrogation site barcode that is not complementary to the target polynucleotide.
The interrogation site barcode helps to determine the presence, absence, or amount of a target polynucleotide (for example, a locus) and/or a variation (for example, a polymorphism) of the target polynucleotide. In several embodiments, all or part of the interrogation site barcode can be located in a portion of one or both of the first complementary probe and the second complementary probe that is not complementary to the first target sequence or the second target sequence. In several embodiments, the interrogation site barcode can identify both the locus and the allele (as one combined sequence or as separate parts of a single sequence). In several embodiments, the interrogation site barcode can comprise a portion that is not complementary to the target sequence and a portion that is complementary to the target sequence. In several embodiments, the interrogation site barcode identifies only an allele. In such cases, it is partially or completely non-complementary to the target sequence. In some embodiments, the interrogation site barcode is not complementary to the target polynucleotide sequence.
In such cases, the interrogation site barcode lies within the region of the probe that hybridizes to the first target sequence but is itself not complementary to the first target sequence, as shown in Fig. 1.
The interrogation site barcode is typically 5 or more nucleotides long. Exemplary interrogation site barcode sequences are 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 or more nucleotides long.
In several embodiments, the interrogation site barcode comprises at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, or at least 15 or more nucleotides.
In several embodiments, the first complementary probe comprises an interrogation site barcode, and when the first complementary probe is hybridized to the first target sequence of the target polynucleotide, the interrogation site barcode sequence does not hybridize to the target; however, the portions flanking the interrogation site barcode on its 3' and 5' sides are portions of the first complementary probe that are complementary to the first target sequence. See Fig. 1.
In several embodiments, the second complementary probe comprises a portion that is complementary to the second target sequence and an adjacent portion that is not complementary to the second target sequence. The non-complementary portions of the first complementary probe and the second complementary probe may include universal sequences. The universal sequences of the first complementary probe and the second complementary probe can be the same or different.
In certain embodiments, a primer sequence complementary to the second complementary probe is used to add a sample index to the product polynucleotide (or its reaction product) by PCR. However, in certain embodiments, the sample index can be added via the first complementary probe. In some exemplary embodiments, the sample index can be located in PCR primer 1 of the first complementary probe, so that the sample index is close to the barcode to be sequenced and the first target sequence and the second target sequence need not be sequenced.
In certain embodiments, the sample index is typically 5 or more nucleotides long. In some exemplary embodiments, the sample index is 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 or more nucleotides long.
Exemplary sample indexes of 12 and 15 nucleotides are shown in Table 1.
In several embodiments, the total number of unique sample indexes is about 16,128, based on 12-mer sequences. In some embodiments, the total number of unique sample indexes is about 50,000, based on 15-mer sequences. In other embodiments, the total number of unique sample indexes is about 66,000, based on 15-mer sequences.
In several embodiments, sample indexes are used to determine the identity of a sample by sequencing the sample index of each product polynucleotide.
Selection/enrichment
In the assays of the present invention, an enrichment step may be included before the analysis step. The enrichment step serves to increase the amount of product polynucleotides in the reaction mixture and the ratio of product polynucleotides to non-product polynucleotides. This can be achieved by selecting product polynucleotides and/or removing non-product polynucleotides. In several embodiments, enrichment is based on size, affinity, charge, or sequence, or is achieved by removing some or all of the non-product polynucleotides (for example by selection, separation, or resolution). The joining and enrichment steps can occur in the same or in different reaction mixtures.
In several embodiments, product polynucleotides can be selected based on the presence of a particular sequence (for example, a sample index or a sequence such as a complementary sequence). In some cases, the product polynucleotide can comprise a barcode designed for selection during the enrichment step.
In several embodiments, enrichment includes an amplification step. For example, a sample index can be incorporated into the product polynucleotide during the amplification step, using any amplification reaction known to those skilled in the relevant art, such as the polymerase chain reaction (PCR).
Labeling amplification products with a sample index is very useful when product polynucleotides from different samples are mixed and pooled into a library for sequencing or other analysis. A primer binding sequence (or site) can be coupled to the first complementary polynucleotide probe and/or the second complementary polynucleotide probe to facilitate amplification of the product polynucleotides, whether linear or exponential. The primer binding site is used to bind a primer in order to initiate primer extension or amplification. Primer binding sites are usually located in probe portions outside the first target sequence or the second target sequence. In several embodiments, the primer binding site is located in a sequence that is not complementary to the target polynucleotide.
In several embodiments, PCR is used to add sample indexes to product polynucleotides so that the product polynucleotides can be pooled for sequencing. A PCR primer can comprise a sequence complementary to a portion of the product polynucleotide or of the first complementary probe or the second complementary probe. For example, when a first PCR primer and a second PCR primer are used to direct PCR amplification of the product polynucleotide, the first PCR primer can comprise a sequence complementary to a sequence on the product polynucleotide, and the second PCR primer can comprise a sequence that is not complementary to a sequence on the product polynucleotide.
In some cases, two different sample indexes (for example, one in each PCR primer) are incorporated into the product polynucleotide, which helps to increase the number of samples that can be identified and analyzed in a single assay. In some cases, only one PCR primer comprises a sample index or barcode.
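To see why placing an index in each PCR primer increases the number of identifiable samples, the following minimal sketch enumerates index pairs. The function and parameter names are hypothetical, and the short index strings in the usage note are illustrative rather than taken from Table 1.

```python
from itertools import product


def dual_index_pairs(first_primer_indexes, second_primer_indexes):
    """Every (first-primer index, second-primer index) combination.

    With n indexes on one primer and m on the other, n * m samples
    can in principle be distinguished.
    """
    return list(product(first_primer_indexes, second_primer_indexes))
```

For instance, two indexes on the first primer and three on the second give six distinguishable combinations, compared with at most three when only one primer carries an index.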
In one exemplary embodiment, enrichment is performed by PCR amplification.
Polymerases include, but are not limited to, DNA and RNA polymerases, reverse transcriptases, and the like. Conditions that favor polymerization by different polymerases are well known to those skilled in the art.
Amplification is usually performed in an automated thermal cycler to facilitate incubation at the desired temperatures. In some embodiments, amplification comprises multiple cycles of sequentially annealing at least one primer having a complementary or substantially complementary sequence to at least one target nucleotide, synthesizing at least one nucleotide strand in a template-dependent manner using a polymerase, and denaturing the newly formed nucleic acid duplex to separate the strands. The cycle may or may not be repeated. Amplification may include thermal cycling or may be performed under isothermal conditions.
In several embodiments, amplification comprises denaturation at a temperature of about 90 °C to about 100 °C for about 1 minute to about 10 minutes, followed by cycles comprising annealing at a temperature of about 55 °C to about 75 °C for about 1 second to about 30 seconds, extension at a temperature of about 55 °C to about 75 °C for about 5 seconds to about 60 seconds, and denaturation at a temperature of about 90 °C to about 100 °C for about 1 second to about 30 seconds. Other times and configurations can also be used. For example, primer annealing and extension can be performed in the same step at a single temperature.
In several embodiments, the cycle is performed at least 5 times, at least 10 times, at least 15 times, at least 20 times, at least 25 times, at least 30 times, at least 35 times, at least 40 times, or at least 45 times. The specific cycle times and temperatures will depend on the particular nucleic acid sequence being amplified and can be readily determined by one of ordinary skill in the art.
In several embodiments, PCR or another DNA amplification process can be used to add, to the product polynucleotide, linkers or adapter sequences that facilitate the annealing of PCR primers or processes involved in sequence generation. This contrasts with methods traditionally used in the art, in which adapters are ligated to the polynucleotides to be sequenced. Linkers and adapters can serve as components in physical, chemical, or enzymatic processes.
After enrichment and/or amplification, samples can be pooled. In such embodiments, product polynucleotides from the various samples are mixed together to obtain a product polynucleotide library for analysis and/or sequencing.
Where multiple samples are sequenced together, each sample can be amplified separately, with the sample index included in the first PCR primer and/or the second PCR primer, and with one or more sample indexes specific to the sample. Every product polynucleotide from a given sample has the same sample index. In some cases where both PCR primers comprise a sample index, the barcodes can be the same or different. In such cases, the sequences of the product polynucleotides from one or more (usually multiple) samples are determined simultaneously.
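Because every product polynucleotide from a given sample carries the same sample index, pooled reads can be sorted back to their samples by that index. The sketch below is a minimal illustration that assumes the index occupies the first `index_len` bases of each read and is matched exactly; the read layout, the function name, and the "unassigned" bin are assumptions, not details from the source.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample, index_len):
    """Group pooled reads by sample using an exact-match leading index.

    reads: iterable of read strings (index first, then the remaining sequence)
    index_to_sample: dict mapping index sequence -> sample name
    Returns a dict of sample name -> list of reads with the index trimmed off.
    """
    bins = defaultdict(list)
    for read in reads:
        index = read[:index_len]
        sample = index_to_sample.get(index, "unassigned")
        bins[sample].append(read[index_len:])
    return dict(bins)
```

A production pipeline would usually also tolerate sequencing errors in the index, which is one reason the indexes are designed to be well separated from one another.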
In several embodiments, the present invention provides compositions, methods, and kits for determining target polynucleotide copy number. Copy number variation ("CNV") is associated with gene regulation and human disease. CNV can be assessed using a first complementary probe and a second complementary probe for each potential CNV locus and for one or more additional loci. The probes may include barcodes as described above. The relative amount of each sequence can be measured using, for example, next-generation sequencing, where the sequence reads of the CNV locus (target polynucleotide) relative to those of a single-copy target polynucleotide can be used to estimate the copy number of the CNV locus (target polynucleotide). CNV is determined by comparing an unknown sample with a sample having a known CN and/or CNV, or with a known reference amount. For example, if a sample has two copies of a target polynucleotide sequence, the total number of sequence reads, when normalized to a control value, will indicate two copies of the target polynucleotide sequence, whereas a sample with four copies of the target polynucleotide sequence will yield 4 times the number of sequence reads relative to the normalization sample. A sample lacking all copies will yield no sequence reads.
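The read-count normalization described above can be sketched in a few lines. This is a simplified illustration that assumes read counts scale linearly with copy number and that a control sample of known copy number is available; the function and parameter names are hypothetical.

```python
def estimate_copy_number(target_reads: float, reference_reads: float,
                         control_target_reads: float,
                         control_reference_reads: float,
                         control_copies: float = 2.0) -> float:
    """Estimate the copy number of a CNV locus in an unknown sample.

    Reads at the CNV locus are first normalized to reads at a reference
    locus within each sample, then scaled by a control sample whose copy
    number at the CNV locus is known.
    """
    sample_ratio = target_reads / reference_reads
    control_ratio = control_target_reads / control_reference_reads
    return control_copies * sample_ratio / control_ratio
```

For example, if a two-copy control gives equal read counts at the CNV and reference loci, a sample with twice as many CNV-locus reads per reference read is estimated at four copies, and a sample with no CNV-locus reads at zero copies.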
In several embodiments, this CNV detection is extended to determining the amount of target polynucleotide present in a sample.
In several embodiments, the first complementary probe and the second complementary probe are separated by one or more nucleotides when hybridized to the sequence of interest. The gap can be a single nucleotide or more than one nucleotide. When hybridized to the sample nucleic acid, the 3' end of the first probe can be extended. In such cases, the sample nucleic acid serves as a template to direct the type of modification, for example through the base-pairing interactions that occur during polymerase-based extension of the first probe, which incorporates one or more nucleotides in a gap-filling step. Once the gap-filling step is complete, the 3' end of the first complementary probe can be joined, for example by enzymatic ligation, to the adjacent 5' end of the second complementary probe, as described above. The resulting polynucleotide product is then analyzed as described above for embodiments without a gap-filling step.
In variations in which product polynucleotides are sequenced by next-generation sequencing technology, PCR primers can be used to generate sequences used in conjunction with a particular sequencing technology, for example to add adapter sequences that facilitate the binding of product polynucleotides to surface-bound DNA oligonucleotides in an Illumina NGS flow cell.
In variations in which product polynucleotides are analyzed using an array readout, PCR primers can also be used to generate sequences used in conjunction with a particular array, for example to add linker sequences that facilitate the binding of product polynucleotides to DNA oligonucleotides (capture probes) on an array (such as microarrays from Affymetrix, Inc., Santa Clara, California, or microarrays from Illumina, Inc., San Diego, California).
In several embodiments, probe, target nucleotide or product nucleotide are attached to solid support.
Improved sample index primers
As described herein, some embodiments add a sample index to product polynucleotides by incorporating a sample index sequence into a primer sequence (for example, a PCR primer). Although many different sample index sequences are possible (that is, 4^15 different 15mer sequences), creating an optimized set requires not only that the index sequences be distinguishable from one another (for example, ensuring that a sample index is not miscalled even in the event of a base sequencing error), but also, desirably, that compatibility issues be resolved and that the set be optimized within the overall assay. In embodiments in which the sample index is added by incorporating it into a primer oligonucleotide (for example, for PCR), the aspects of compatibility and optimization may include such amplification. Other considerations related to the overall assay include the number of samples to be processed and the potential flexibility of the sample index primers. For example, 15mer sample indexes can be designed so that the first 12 bases can be used where the number of samples to be resolved is small and the full indexing capacity of the available 15mer pool is not needed; the 15mers can then be treated as 12-mers to further optimize the overall assay (for example, in sequencing detection embodiments, only the first 12 bases need to be sequenced to identify the sample index, saving time and reagents). A method for identifying usable sequences includes multiple steps outlined in this disclosure. In certain embodiments, one such step can identify and remove, from the previously identified set of possible sequences, those useless sequences that would otherwise interfere with assay performance. It is also important to identify and remove sequences that may only occasionally be problematic, because such sequences may pass initial empirical testing yet perform suboptimally under some assay conditions.
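Two of the requirements above, that indexes stay distinguishable despite a sequencing error and that 15mers remain usable when only their first 12 bases are read, can be checked with a minimal sketch. Hamming distance is used here as the separation measure and a minimum distance of 3 as the threshold; both are illustrative assumptions, since the source does not state its distance criterion, and the function names are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(indexes, prefix_len=None) -> int:
    """Smallest Hamming distance between any two indexes.

    If prefix_len is given, only the first prefix_len bases are compared.
    """
    seqs = [s[:prefix_len] for s in indexes]
    return min(hamming(a, b) for a, b in combinations(seqs, 2))


def usable_as_prefix_set(indexes, prefix_len=12, min_distance=3) -> bool:
    """True if the indexes remain well separated when read as prefixes."""
    return min_pairwise_distance(indexes, prefix_len) >= min_distance
```

Two 15mers that differ only in their last three bases are distinct as 15mers but collide as 12-mers, so a set containing them would fail `usable_as_prefix_set`. For scale, the full space of 15mers is `4 ** 15`, or 1,073,741,824 sequences.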
In certain embodiments, 73536 indexes can be used in 384-well microtiter plate format, which is enough for 169 sample plates. In other embodiments, the first 16128 15mer indexes of the 65280 indexes can also be used as 12-mers in 384-well microtiter plate format, which is enough for 42 sample plates. These indexes are optimized not only as a whole set but also sample plate by sample plate (for example, 1-384, 385-768, 769-1152, and so on). Although a small number of these indexes may encounter unexpected problems in actual testing and may need to be replaced, replacing a few indexes with other indexes on a per-plate basis is unlikely to affect any optimization (for example, as long as 99% of the sequences are retained, or 3-4 sequences are replaced in a 384-sample plate, no reduction in the degree of optimization should be expected).
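The plate arithmetic above can be reproduced with a short sketch. It assumes one index per well and 384 wells per plate, and the function names are hypothetical.

```python
def full_plates(n_indexes: int, wells_per_plate: int = 384) -> int:
    """Number of complete plates that n_indexes can fill, one index per well."""
    return n_indexes // wells_per_plate


def plate_blocks(n_indexes: int, wells_per_plate: int = 384):
    """1-based (first, last) index numbers assigned to each plate."""
    return [(start, min(start + wells_per_plate - 1, n_indexes))
            for start in range(1, n_indexes + 1, wells_per_plate)]
```

Here `full_plates(16128)` reproduces the 42 plates stated for the 12-mer set, and `plate_blocks(1152)` gives the blocks 1-384, 385-768, and 769-1152 used as examples in the text.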
Many factors have an important influence on the selection of an optimized set of sample indexes, including maximizing orthogonality, maximizing specificity, and ensuring compatibility with other assay components. It is desirable to maximize orthogonality within a set of sample indexes not only relative to the sample index sequences themselves but also within particular assay steps. For example, for sample indexes added to product polynucleotides in a PCR step, maximum orthogonality takes into account not only the sample index sequences themselves but also the sequences of the PCR primers. There may also be other sequences for which orthogonality should be maximized. For example, if the PCR step is used to add not only the sample index sequence but also adapter sequences for subsequent use in the assay (for example, next-generation sequencing flow cell adapter sequences), it is desirable to maximize orthogonality relative to the sample index, the primer sequence, and the flow cell adapter sequence. Maximizing specificity is also an important consideration, and it is also important to avoid homopolymers (for example, avoiding the same base at 3 consecutive positions in a sample index) and to normalize GC content within a desired range (for example, 40% to 60%, 42% to 58%, 44% to 56%, and so on, as desired or required for a particular embodiment). Other assay components also need to be considered during optimization, such as nucleotide sequences used for detection in the assay, for example next-generation sequencing library construction sequences.
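The homopolymer and GC-content criteria above lend themselves to a simple filter. The sketch below is a minimal illustration using the example thresholds from the text (no run of 3 identical bases, GC content of 40% to 60%); the function names are hypothetical, and the orthogonality checks against primer and adapter sequences are not covered here.

```python
import re


def has_homopolymer(seq: str, run: int = 3) -> bool:
    """True if `seq` contains `run` or more identical consecutive bases."""
    return re.search(r"(.)\1{%d,}" % (run - 1), seq) is not None


def gc_fraction(seq: str) -> float:
    """Fraction of bases in `seq` that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def passes_filters(seq: str, gc_min: float = 0.40, gc_max: float = 0.60,
                   run: int = 3) -> bool:
    """True if `seq` has no homopolymer run and its GC content is in range."""
    seq = seq.upper()
    return (not has_homopolymer(seq, run)
            and gc_min <= gc_fraction(seq) <= gc_max)
```

A candidate list can then be reduced with `[s for s in candidates if passes_filters(s)]`; tightening the range to 42% to 58% or 44% to 56% only requires changing `gc_min` and `gc_max`.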
As is known to those skilled in the art, matching oligonucleotide design characteristics to the chemical environment or assay conditions is important for specificity. Furthermore, specificity in such assays is known to be affected by a variety of determinants. Several non-limiting examples of assay elements that can be varied to alter specificity include the concentration of a solvent such as DMSO, ion concentrations (monovalent ions such as K+ or Na+, or divalent ions such as Mg++), the concentration of oligonucleotides, the interaction time, and the assay temperature and/or the temperature of different components in the assay.
In certain embodiments, as a non-limiting example, temperature is considered as the specificity determinant, because the general relationship is that lower temperatures are usually associated with lower specificity and higher temperatures with higher specificity. Therefore, the different temperature ranges described herein should be regarded as representing ranges of higher and lower specificity rather than strict temperatures.
In an exemplary embodiment, PCR reactions are typically run with annealing and extension at 60 °C. Primers designed to operate at this temperature typically show lower amplification efficiency when run at a higher temperature such as 65 °C, which reduces product yield. Primers designed for optimal performance at 65 °C give good yield when run at 65 °C; however, the design characteristics of primers designed for 65 °C differ slightly from those of primers designed for 60 °C. Specifically, the primers are designed so that they bind the target sequence more stably. Those skilled in the art know that many design criteria can be used to predict binding, or that combinations of different designs can be determined empirically. These criteria usually relate to sequence composition (GC content), free energy (ΔG) values, and the length or number of matched base pairs between the two complementary strands. A similar pattern applies to undesired off-target effects. Sequence motifs that can lead to undesired products, such as primer-dimer products, may be more problematic in reactions at lower temperature or lower specificity. Of particular interest are sequence motifs in which several bases at the 3' end of an oligonucleotide are fully or nearly fully complementary to a region of another oligonucleotide in the assay or of the oligonucleotide itself (Figures 11A and 11B and Figures 12A and 12B). The ΔG value or the length of the complementary portion at which a dimer product becomes problematic inherently varies with temperature (or other specificity determinants). Lower temperatures allow non-specific amplification of dimers with lower complementarity, which is highly correlated with shorter regions of 3'-end complementarity. Thus, by running the assay at 65 °C rather than 60 °C, more complementary bases are required for non-specific amplification to occur. Furthermore, it has been determined that even if the run of complementary bases is not exactly at the 3' end, a sufficiently long self-complementary sequence between a portion of the sample index primer and the 3' end of the sample index primer can cause the primer to dimerize.
In certain embodiments, under the assay reaction conditions used, a motif with 7 bp of perfectly matched 3' complementarity, or 9 bp with one mismatch, can be tolerated in many cases but is unacceptable in others, depending on the additional complementarity across the whole dimer molecule. This is therefore a useful motif for screening other candidate sequences, because it identifies many possible oligonucleotides that perform poorly under most assay conditions, and it also identifies sequences that may work under one set of conditions but fail readily under slightly less specific conditions. These sequences can arise in particular, but not exclusively, from 3' complementarity that spans different "regions" of an oligonucleotide, for example where some but not all of the complementarity is due to a variable region of the oligonucleotide. One example provided herein is the "barcode" portion (Figures 11A and 11B and Figures 12A and 12B).
Running the assay under conditions that tolerate longer complementary regions before failure (e.g., 7 bp, or 9 bp including 1 mismatch), such as at 65°C, is highly desirable compared with conditions under which failure occurs with shorter 3' complementary regions (e.g., 5 bp or 6 bp), such as at 60°C. Tolerance of longer motifs greatly reduces the number of oligonucleotide sequences that must be removed from the pool of useful sequences.
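The 3' complementarity screen described above can be sketched in a few lines. The following is a minimal illustration, not the procedure used in the disclosure: it only counts how many bases at the 3' end of one oligonucleotide can pair with any region of another, optionally allowing mismatches, and it ignores ΔG and temperature entirely. The function names and the example sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def longest_3prime_complement(primer, other, max_mismatches=0):
    """Length of the longest 3'-terminal stretch of `primer` that can
    base-pair with some region of `other`, allowing at most
    `max_mismatches` mismatched positions within the stretch.

    Pass the same sequence twice to check self-complementarity.
    """
    primer = primer.upper()
    # A 3' tail pairs antiparallel with `other`, so the tail must appear
    # (approximately) as a substring of the reverse complement of `other`.
    target = reverse_complement(other)
    best = 0
    for length in range(1, len(primer) + 1):
        tail = primer[-length:]
        for start in range(len(target) - length + 1):
            window = target[start:start + length]
            mismatches = sum(1 for a, b in zip(tail, window) if a != b)
            if mismatches <= max_mismatches:
                best = length
                break
    return best


if __name__ == "__main__":
    # Hypothetical sequences chosen only to exercise the function.
    primer = "ACGTACGGATCCA"
    other = "TTTTTGGATCCTTTT"
    print(longest_3prime_complement(primer, other))
    print(longest_3prime_complement(primer, other, max_mismatches=1))
```

A candidate pool could then be filtered by comparing these lengths with whatever limits the chosen assay temperature tolerates, for example the 7 bp perfect-match and 9 bp single-mismatch motifs mentioned above; the appropriate cut-offs remain an empirical choice.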
Likewise, the assay can be run at an annealing/extension temperature of 70°C, which would further limit off-target effects but impose additional constraints on design.
The set of 15-mer sample index barcodes of this disclosure incorporates a number of specific design elements that together produce an optimal set of indexes, and it includes various subsets suited to the given reaction conditions and to other, similar conditions. The process used to identify the set is not limited to the specific set disclosed herein, because the same process can be applied, with slight modifications to the present disclosure, to develop similar but non-identical sets for assay conditions of higher or lower specificity.
Genotyping methods for detecting target polynucleotides in polyploid samples
As described herein, genotyping methods (and the associated data analysis) are used to detect the presence or absence of a target polynucleotide in a polyploid sample. In certain embodiments, the target polynucleotide can be the result of a SNP or a deletion/insertion event (indel). Under complex genetic conditions, such as in polyploid samples, the presence of uninformative genomes increases the number of sequence reads required per locus and per sample when generating sequence data for the genome of interest. Described herein are ploidy reduction strategies that reduce the sequence data generated from uninformative genomes. An exemplary ploidy reduction concept, which uses a marker SNP adjacent to a subgenome-specific HSV SNP, is described herein (see Figure 13). The ploidy reduction methods described herein can be used alone or in combination with the probes described herein.
In an exemplary first method, public and proprietary wheat genome sequence data are obtained that include target SNP/indel and proximal SNP/indel information. Based on knowledge of the position of the proximal SNP/indel relative to the target/marker SNP/indel, probes are designed to be selective for the genome of interest, and a proximal SNP/indel destabilization strategy is used to reduce ploidy and thus the complexity of the organism. Targets that have been genotyped on Axiom and show diploid clustering are selected. It is ensured that no proximal SNP/indel is present within 9 bases on either side of the target marker (see Figures 14A-C).
In one embodiment, one form of the first complementary probe (LHS) is designed to be complementary to the target sequence with the SNP/indel, and the other form of the first complementary probe (LHS') is designed to be complementary to the target sequence without the SNP/indel. The second complementary probe (RHS) is immediately adjacent to the 3' sequence of both forms of the first complementary probe. A proximal SNP located near the target SNP has a destabilizing effect that prevents ligation. As a result, selection for the genome of interest is accomplished (i.e., a target genome with the proximal SNP will produce few sequence reads). Incorporating the proximal SNP into the probe design allows loci that would otherwise produce no reads to function fully (see Figures 15A and 15B).
In an exemplary second method, public and proprietary wheat genome sequence data are obtained that include target SNP/indel and proximal SNP/indel information. Based on knowledge of the position of the proximal SNP/indel relative to the target marker SNP/indel, a blocking oligonucleotide complementary to the target genome carrying the proximal SNP/indel is added to prevent the RHS from hybridizing to that target genome.
In one embodiment, one form of the first complementary probe (LHS) is designed to be complementary to the target sequence with the SNP/indel, and the other form of the first complementary probe (LHS') is designed to be complementary to the target sequence without the SNP/indel. The second complementary probe (RHS) is immediately adjacent to the 3' sequence of both forms of the first complementary probe. A blocking/competing oligonucleotide complementary to the target sequence containing the proximal SNP/indel is added. The blocking oligonucleotide prevents the RHS from hybridizing to the target DNA. As a result, selection for the genome of interest is accomplished (i.e., a target genome with the proximal SNP will produce few or no sequence reads). By adding the blocking oligonucleotide, loci at which the proximal SNP results in no reads are accommodated in the probe design (see Figures 16A and 16B). This method is suitable when the proximal SNP lies between bases 1 and 10 of the target marker. It cannot destabilize secondary polymorphisms.
In an exemplary third method, public and proprietary wheat genome sequence data are obtained that include target SNP/indel and proximal SNP/indel information. Based on knowledge of the position of the proximal SNP/indel relative to the target marker SNP/indel, PCR primers are designed and a PCR pre-amplification step is added that selectively amplifies the unique genome or subgenome of interest. In this method, one or both of the PCR primers can be complementary to the target genomic sequence carrying the proximal SNP/indel. This PCR pre-amplification step can be run as a parallel workflow (i.e., the sample is split into two portions).
In one embodiment, one form of the first complementary probe (LHS) is designed to be complementary to the target sequence with the SNP/indel, and the other form of the first complementary probe (LHS') is designed to be complementary to the target sequence without the SNP/indel. The second complementary probe (RHS) is immediately adjacent to the 3' sequence of both forms of the first complementary probe. A PCR pre-amplification step is added, using PCR primers designed to selectively amplify the unique genome or subgenome of interest. The proximal SNP/indel destabilizes hybridization of the PCR primers. Selection for the desired unique genome of interest is thus accomplished (i.e., unwanted genomes containing the proximal SNP are eliminated from the subsequent workflow). Combinations of multiple PCR primer sets and probe sets can be used to accommodate various combinations of loci and genomes (see Figures 17A and 17B).
In certain embodiments, a 600X average coverage method can be used for a small number of selected markers. This method requires a parallel workflow (i.e., the sample is split into two portions). In one embodiment, the samples are separated according to markers associated with a neighboring proximal SNP or single-base indel, and for the affected markers the coverage of the fraction carrying them is raised to 600X, in contrast to the 200X sequencing coverage used in the other methods (i.e., compensation is made with additional sequencing time and expense rather than in the upstream Eureka portion). This approach has been used in other contexts, such as deep sequencing with RNA-Seq to aid detection of rare transcripts, and so would be the Eureka equivalent in a different setting.
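The coverage trade-off described above can be expressed as simple arithmetic. The sketch below rests on an assumption that the source does not state: that reads are spread evenly across the subgenomes, so only the share belonging to the informative subgenome counts toward the coverage that matters. The function name and the three-subgenome example are hypothetical.

```python
def total_coverage_needed(informative_coverage, total_genomes, informative_genomes=1):
    """Total sequencing coverage required so that the informative
    genome(s) alone reach `informative_coverage`.

    Assumes reads are distributed evenly across all genomes
    (an illustrative simplification, not a claim from the source).
    """
    if not 1 <= informative_genomes <= total_genomes:
        raise ValueError("informative_genomes must be between 1 and total_genomes")
    return informative_coverage * total_genomes / informative_genomes


if __name__ == "__main__":
    # Hexaploid wheat has three subgenomes; suppose only one is informative.
    print(total_coverage_needed(200, total_genomes=3))
```

Under that assumption, 200X of informative coverage across three subgenomes works out to 600X in total, which is consistent with the figures quoted above, although the source does not say how the 600X value was chosen.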
The choice of method depends on the characteristics of the three wheat genomes at the proximal SNP position relative to the target SNP. The selective blocking method may be compatible with a single probe set and workflow format.
Methods for detecting target RNA without conversion to cDNA.
RNA analysis is often subject to bias because the RNA must be converted to cDNA before analysis. The methods described herein involve direct detection of target RNA without converting it to cDNA. Detection of target RNA includes, but is not limited to, interrogating exon boundaries, which can detect alternative splicing and splice variants of mRNA transcripts; detecting fusion genes (portions of at least two separate genes); and more general expression analysis for detecting mRNA transcripts. In certain embodiments, the method for detecting target RNA uses next-generation sequencing and can detect thousands of loci in hundreds to thousands of RNA samples simultaneously. The method for detecting target RNA is based on ligation-dependent PCR amplification and uses interrogation-site probes together with sample index barcodes added during PCR amplification. In the exemplary embodiment, the utility of the method is demonstrated by performing a highly multiplexed reaction in which a commercially available DNA ligase ligates DNA probes hybridized to an RNA template. The ligation products are PCR-amplified. Next-generation sequencing data are generated from the resulting PCR products. Each read is assigned to a sample (based on the sample index) and a locus. Examining the sequencing data generated from the PCR products reveals splice variants or fusions of mRNA transcripts and the expression of mRNA transcripts. Example 13 and Figures 19-21 show the results obtained with a 778-plex probe set designed to interrogate RNA from housekeeping genes and from selected exons of human genes used for cancer fusion detection. The expected exon junctions are present in the housekeeping genes. As described herein, the methods for detecting and interrogating RNA targets (and the associated data analysis) are used in targeted studies such as expression analysis, allele-specific expression analysis, alternative splicing analysis, and fusion gene detection. This method of directly detecting RNA is a simplified assay that also eliminates the bias introduced by converting RNA to cDNA.
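The read-assignment step described above, in which each read is attributed to a sample through its sample index and to a locus through its interrogation-site barcode, can be sketched as a pair of table lookups. This is a minimal illustration under stated assumptions: the two barcodes have already been extracted from each read, and only exact matches are accepted. The function name, barcode sequences, and sample and locus labels are hypothetical.

```python
from collections import Counter


def assign_reads(reads, sample_indexes, site_barcodes):
    """Count reads per (sample, locus) pair.

    reads          -- iterable of (sample_index_seq, site_barcode_seq) tuples
    sample_indexes -- dict mapping sample index sequence -> sample name
    site_barcodes  -- dict mapping interrogation-site barcode -> locus name

    Reads whose index or barcode is not in the tables are skipped.
    """
    counts = Counter()
    for index_seq, barcode_seq in reads:
        sample = sample_indexes.get(index_seq)
        locus = site_barcodes.get(barcode_seq)
        if sample is None or locus is None:
            continue  # unrecognized index or barcode: leave unassigned
        counts[(sample, locus)] += 1
    return counts


if __name__ == "__main__":
    # Made-up 12-mers used purely for illustration.
    sample_indexes = {"AACCGGTTAACC": "sample_1", "TTGGCCAATTGG": "sample_2"}
    site_barcodes = {"ACGTACGTACGT": "junction_A", "TGCATGCATGCA": "junction_B"}
    reads = [
        ("AACCGGTTAACC", "ACGTACGTACGT"),
        ("AACCGGTTAACC", "ACGTACGTACGT"),
        ("TTGGCCAATTGG", "TGCATGCATGCA"),
        ("GGGGGGGGGGGG", "ACGTACGTACGT"),  # unknown sample index
    ]
    for key, n in sorted(assign_reads(reads, sample_indexes, site_barcodes).items()):
        print(key, n)
```

A production pipeline would normally also tolerate sequencing errors in the barcodes; exact matching is used here only to keep the sketch short.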
Sequencing-based detection.
In one exemplary application of the method, sequence determination is performed using next-generation sequencing (e.g., Illumina sequencing). The product polynucleotide can be sequenced directly, or a copy of the product polynucleotide or a complementary strand generated in the assay can be sequenced. When this method is performed, the first complementary probe and/or the second complementary probe can include a universal primer sequence. In such cases, adapters can be added to the product polynucleotide (or reaction product) by PCR or by another method of copying and/or amplifying the product polynucleotide; the adapters are used to attach the product polynucleotide to an Illumina sequencing flow cell. Flow cell adapters can also be added to the product polynucleotide by other techniques known in the art (e.g., ligation).
In several embodiments, an Illumina flow cell with eight or more lanes is used as the solid support. Each lane can hold more than 300 million amplified clusters and can therefore be used for high-throughput analysis. In other embodiments, other flow cells that accommodate a different number of amplified clusters are used.
Sequencing technologies that can be used in the disclosed methods include next-generation sequencing technologies such as ion semiconductor sequencing (e.g., Ion Torrent sequencing), pyrosequencing (e.g., 454 sequencing), sequencing by ligation (e.g., SOLiD sequencing), sequencing by synthesis (e.g., Illumina sequencing), and single-molecule real-time sequencing (e.g., Pacific Biosciences).
Array-based detection.
In several embodiments, the product polynucleotides described herein are detected using an array (e.g., in hybridization-array-based analysis of the product polynucleotides). Exemplary arrays include chips or planar substrates, bead arrays, liquid-phase arrays, "zip-code" arrays, microarrays, and the like. Materials suitable for constructing arrays, such as nitrocellulose, glass, silicon wafers, and optical fibers, are known to those skilled in the art.
Kit.
The present disclosure provides kits for performing any of the methods disclosed herein.
In several embodiments, the invention provides kits for determining the presence, absence, or characteristics of one or more target polynucleotides in a sample and/or for determining genotype. In several embodiments, the kit includes a plurality of first complementary probes and second complementary probes, each first complementary probe having a sequence portion complementary to a first target sequence and a sequence portion non-complementary to the first target sequence, wherein the non-complementary portion includes an interrogation-site barcode sequence and an adjacent universal sequence, and each second complementary probe having a sequence portion complementary to a second target sequence and an adjacent sequence portion non-complementary to the second target sequence; and buffers and enzymes for ligation and enrichment.
The first complementary probe can have sequence complementary to the first target sequence both 5' of its non-complementary interrogation-site barcode and 3' of that barcode.
In several embodiments, the kit includes at least one PCR primer, a polymerase, and a set of dNTPs for enrichment/amplification.
In several embodiments, the kit includes a ligase.
In several embodiments, the kit includes a license to use the software required to analyze the sequence data.
In several embodiments, the kit includes instructions for use.
The first complementary probe and the second complementary probe can be provided in dried form (e.g., lyophilized). If provided in dried form, the probes can be dried with a preservative (e.g., trehalose).
Composition.
The present disclosure provides compositions for performing any of the methods disclosed herein.
In several embodiments, a composition is provided for detecting the presence, absence, amount, or characteristics of one or more targets in one or more samples. The composition includes a plurality of first complementary probes and second complementary probes, wherein (i) each first complementary probe has two sequence portions complementary to different parts of a first target sequence and two sequence portions non-complementary to the first target sequence, the non-complementary portions including an interrogation-site barcode sequence and a universal sequence, and (ii) each second complementary probe has a sequence portion complementary to a second target sequence and an adjacent sequence portion that is non-complementary to the second target sequence and includes a universal sequence. The first complementary probe includes two portions complementary to the target sequence, located both 3' and 5' of the interrogation-site barcode. The composition can be solution-based, attached to a solid support, or both.
In several embodiments, one part of the complementary portion of the first complementary probe lies 5' of the non-complementary interrogation-site barcode sequence, and another part lies 3' of the non-complementary interrogation-site barcode sequence. The non-complementary interrogation-site barcode sequence can be said to be "anchored" to the target by the 5' and 3' complementary sequences of the first complementary probe. The non-complementary interrogation-site barcode sequence can be about 10 to 16 nucleotides long, for example 10, 11, 12, 13, 14, 15, or 16 nucleotides.
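The layout of the first complementary probe described above can be pictured as four concatenated segments. The sketch below is only an illustration of that layout: it assumes the 5'-to-3' order universal sequence, target-complementary segment, barcode, target-complementary segment, and it enforces the 10 to 16 nucleotide barcode length. The class name, field names, and all sequences are hypothetical and are not taken from the sequence listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirstProbe:
    """Segments of a first complementary probe, written 5' to 3'."""
    universal: str   # universal (primer) sequence, not complementary to target
    target_5p: str   # target-complementary segment 5' of the barcode
    barcode: str     # non-complementary interrogation-site barcode
    target_3p: str   # target-complementary segment 3' of the barcode

    def __post_init__(self):
        for name in ("universal", "target_5p", "barcode", "target_3p"):
            segment = getattr(self, name)
            if not segment or set(segment.upper()) - set("ACGT"):
                raise ValueError(f"{name} must be a non-empty A/C/G/T string")
        if not 10 <= len(self.barcode) <= 16:
            raise ValueError("barcode must be 10 to 16 nucleotides long")

    def sequence(self):
        """Full probe sequence, 5' to 3'."""
        return self.universal + self.target_5p + self.barcode + self.target_3p


if __name__ == "__main__":
    probe = FirstProbe(
        universal="ACACGACGCTCTTCC",
        target_5p="GATTACAGATTACA",
        barcode="ACGTACGTACGT",   # 12 nt, within the 10-16 nt range
        target_3p="TTGACCA",
    )
    print(probe.sequence())
    print(len(probe.sequence()))
```

Keeping the segments separate makes it straightforward to swap barcodes between loci while leaving the anchoring sequences unchanged.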
Bioinformatics
The sequence of the product polynucleotide is determined by direct sequencing or by sequencing the complementary sequence. The methods described herein can be used to generate sequencing data that can be analyzed by mathematical algorithms to determine the presence or absence of particular SNPs, indels, and other mutations, whether a particular locus is heterozygous or homozygous, whether a particular transcript is present, the copy number of a particular target polynucleotide, and/or other characteristics of the target polynucleotide.
In many cases, the genotype of a sample can be determined by analyzing the number of reads assigned (by comparing interrogation-site barcodes) to each allele at the locus, and determining, for each sample, whether the ratio of the number of reads assigned to the A allele to the number of reads assigned to the B allele indicates that the genotype of the sample is AA, AB, BB, or cannot be determined.
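The ratio-based genotype call described above can be sketched as a simple threshold rule on the fraction of reads assigned to the A allele. This is a minimal illustration only; the source does not specify any cut-offs, so the minimum read count and the fraction thresholds below are hypothetical defaults, as is the function name.

```python
def call_genotype(reads_a, reads_b, min_reads=20,
                  hom_cutoff=0.9, het_low=0.3, het_high=0.7):
    """Call AA, AB, BB, or 'no call' from allele read counts.

    reads_a, reads_b -- reads assigned to the A and B alleles at one locus
    min_reads        -- below this total, no call is made (assumed value)
    hom_cutoff       -- A fraction at or above this gives AA; B likewise
    het_low/het_high -- A fraction within this range gives AB
    """
    total = reads_a + reads_b
    if total < min_reads:
        return "no call"
    frac_a = reads_a / total
    if frac_a >= hom_cutoff:
        return "AA"
    if (reads_b / total) >= hom_cutoff:
        return "BB"
    if het_low <= frac_a <= het_high:
        return "AB"
    return "no call"  # ambiguous ratio between the het and hom ranges


if __name__ == "__main__":
    for a, b in [(98, 2), (50, 50), (1, 99), (3, 2), (80, 20)]:
        print(a, b, call_genotype(a, b))
```

In practice, calls are often made by clustering many samples per locus rather than by fixed thresholds; the fixed cut-offs here simply make the logic of the ratio test explicit.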
Utility.
The compositions, methods, and kits described herein are used to analyze the presence, absence, amount, or characteristics of a plurality of target polynucleotides in a large number of samples in a single assay.
In general, multiple sets of first complementary probes and second complementary probes are provided in a single assay to assess the presence, absence, amount, or characteristics (e.g., polymorphisms) of multiple sequences in that assay. In several embodiments, multiple polymorphisms are determined in multiple samples in a single assay. In several embodiments, the compositions, methods, and kits described herein are suitable for genotyping and can involve next-generation sequencing (NGS) technology to generate genotypes for a large number of samples and loci simultaneously in a single assay.
Clauses of the invention
1. A method for determining the presence, absence, or amount of one or more target polynucleotides in two or more samples, the method comprising the steps of:
(a) providing two or more samples, each sample comprising one or more target polynucleotides, each target polynucleotide comprising a first target sequence and a second target sequence;
(b) providing a plurality of first complementary probes and second complementary probes, wherein (i) each first complementary probe has a sequence portion complementary to the first target sequence and a sequence portion non-complementary to the first target sequence, the non-complementary portion comprising an interrogation-site barcode sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence and an adjacent sequence portion non-complementary to the second target sequence;
(c) incubating the plurality of first complementary probes and second complementary probes with each individual sample under hybridization conditions such that the first complementary probes and second complementary probes hybridize to their complementary target polynucleotides in the sample to form hybridization complexes;
(d) joining the first complementary probe and the second complementary probe hybridized to the first target sequence and the second target sequence in the sample to form a product polynucleotide;
(e) pooling the product polynucleotides formed from the individual samples; and
(f) determining the presence or absence of the target polynucleotide in one or more samples by analyzing the product polynucleotides or their complementary strands.
2. The method of clause 1, wherein the first complementary probe and the second complementary probe are complementary to first and second target sequences that are immediately adjacent to each other.
3. The method of clause 1, wherein the first complementary probe and the second complementary probe are complementary to first and second target sequences that are adjacent and separated by 1 to 500 nucleotides.
4. The method of any one of clauses 1-3, wherein the adjacent non-complementary portion of the second complementary probe comprises a universal sequence.
5. The method of any one of clauses 1-4, wherein the adjacent universal sequence of the first complementary probe comprises a universal primer sequence complementary to a primer sequence that can be used to add one or more of (i) a sample index, (ii) an additional sequence for generating sequence data or another form of detection, (iii) an additional sequence, or (iv) another moiety.
6. The method of clause 4, wherein the adjacent universal sequence of the second complementary probe comprises a universal primer sequence complementary to a primer sequence that can be used to add one or more of (i) a sample index, (ii) an additional sequence for generating sequence data or another form of detection, (iii) an additional sequence, and (iv) another moiety.
7. The method of any one of clauses 1-6, wherein sequence complementary to the first target sequence is present both 5' of the non-complementary interrogation-site barcode of the first complementary probe and 3' of that barcode.
8. The method of any one of clauses 1-6, wherein sequence complementary to the target sequence is present at both the 3' end and the 5' end of the interrogation-site barcode.
9. The method of any one of clauses 1-8, wherein the adjacent universal sequence of the first complementary probe is 5' of the complementary sequence, which in turn is 5' of the non-complementary interrogation-site barcode of the first complementary probe.
10. The method of any one of clauses 5-9, wherein the universal primer sequence comprises a PCR primer sequence and a primer sequence for adding an additional sequence useful for generating sequence data or another form of detection.
11. The method of clause 10, wherein the additional sequence for generating sequence data or another form of detection is an adapter for next-generation sequencing.
12. The method of clause 10, wherein the additional sequence for generating sequence data or another form of detection is a capture sequence for capture on a solid surface.
13. The method of any one of clauses 5-12, wherein the primer sequence is effective for adding a moiety useful for generating sequence.
14. The method of any one of clauses 1-13, wherein the non-complementary interrogation-site barcode is 10, 11, 12, 13, 14, 15, or 16 nucleotides long.
15. The method of any one of clauses 5-14, wherein the sample index is 10, 11, 12, 13, 14, 15, or 16 nucleotides long.
16. The method of clause 14, wherein the interrogation-site barcode is 12 or 15 nucleotides long.
17. The method of clause 15, wherein the sample index is 12 or 15 nucleotides long.
18. The method of any one of clauses 1-17, wherein the interrogation-site barcode or sample index sequence is selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 384.
19. The method of any one of clauses 1-18, wherein the incubation is preceded by a step comprising heating at a temperature of 70°C to 100°C.
20. The method of any one of clauses 1-19, further comprising an enrichment step in which the product polynucleotides are enriched.
21. The method of clause 20, wherein the enrichment comprises: (a) providing a set of PCR primer sequences comprising a first primer complementary to the primer sequence in the first complementary probe and a second primer complementary to the PCR primer sequence in the second complementary probe, and (b) amplifying the product polynucleotides.
22. The method of any one of clauses 1-21, wherein the method is a solution-based method.
23. The method of any one of clauses 1-22, wherein the first complementary probe comprises an inosine located 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases from the 3' end of the probe.
24. The method of any one of clauses 1-23, wherein the second complementary probe comprises an inosine located 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases from the 5' end of the probe.
25. The method of any one of clauses 1-24, wherein the first complementary probe and the second complementary probe are complementary to the first target sequence and the second target sequence, and the 3' end of the first complementary probe is complementary to one form of a single nucleotide polymorphism (SNP) or other genetic variation.
26. The method of any one of clauses 1-25, wherein the joining comprises treating with a ligase the first complementary probe and the second complementary probe hybridized to the first target sequence and the second target sequence of the target polynucleotide (the hybridization complex) to form the product polynucleotide.
27. The method of any one of clauses 1-26, for use in genotyping, comprising providing one or more variants of the first complementary probe, wherein the variants differ in the identity of one or more nucleotides at the 3' end of the first complementary probe, and wherein the determining comprises quantifying the relative frequency of one or more variants of the first complementary probe compared with other variants of the first complementary probe and correlating that frequency with a genotype.
28. The method of any one of clauses 1-27, for use in determining copy number variation of a target polynucleotide, wherein the determining comprises comparing the amount of signal produced by a product polynucleotide or its complementary strand with a known reference signal amount or with the amount of signal produced by another product polynucleotide or its complementary strand.
29. A composition for determining the presence, absence, or amount of one or more target polynucleotides in a sample, comprising: a plurality of first complementary probes and second complementary probes, wherein (i) each first complementary probe has two sequence portions complementary to different parts of a first target sequence and two sequence portions non-complementary to the first target sequence, the non-complementary portions comprising an interrogation-site barcode sequence and a universal sequence, and (ii) each second complementary probe has a sequence portion complementary to a second target sequence and an adjacent sequence portion that is non-complementary to the second target sequence and comprises a universal sequence.
30. The composition of clause 29, wherein the first complementary probe comprises sequence complementary to the first target sequence both 5' of the non-complementary interrogation-site barcode of the first complementary probe and 3' of that barcode.
31. The composition of clause 29, wherein the first complementary probe comprises sequence complementary to the target sequence at both the 3' end and the 5' end of the interrogation-site barcode.
32. The composition of any one of clauses 29-31, wherein the universal sequences of the first complementary probe and the second complementary probe each comprise a primer sequence that can hybridize to a primer for synthesizing sequence.
33. The composition of clause 32, wherein the primer sequence comprises a PCR primer sequence.
34. The composition of any one of clauses 29-33, wherein the universal sequence comprises a primer sequence that can be used to add one or more of (a) a sample index, (b) an additional sequence, (d) an additional sequence for generating sequence data or another form of detection, and (e) another moiety.
35. The composition of any one of clauses 29-34, wherein the adjacent universal sequence of the first complementary probe is 5' of the complementary sequence, which in turn is 5' of the non-complementary interrogation-site barcode of the first complementary probe.
36. The composition of clause 34, wherein the universal sequence is a PCR primer sequence.
37. The composition of clause 34, wherein the additional sequence for generating sequence data or another form of detection is an adapter for next-generation sequencing.
38. The composition of clause 34, wherein the additional sequence for generating sequence data or another form of detection is a capture sequence.
39. The composition of any one of clauses 29-38, wherein the universal sequence comprises a primer sequence that provides for adding a sample index.
40. The composition of any one of clauses 29-39, wherein the interrogation-site barcode is 10, 11, 12, 13, 14, 15, or 16 nucleotides long.
41. The composition of clauses 39 and 40, wherein the sample index is 10, 11, 12, 13, 14, 15, or 16 nucleotides long.
42. The composition of clause 40, wherein the interrogation-site barcode is 12 or 15 nucleotides long.
43. The composition of clause 41, wherein the sample index is 12 or 15 nucleotides long.
44. The composition of any one of clauses 39-43, wherein the interrogation-site barcode or sample index sequence is selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 384.
45. The composition of any one of clauses 29-44, wherein the first complementary probe comprises an inosine located 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases from the 3' end of the probe.
46. The composition of any one of clauses 29-45, wherein the second complementary probe comprises an inosine located 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases from the 5' end of the probe.
47. A kit for determining the presence, absence, amount, or characteristics of one or more target polynucleotides in a sample, the kit comprising:
(a) a plurality of first complementary probes and second complementary probes, wherein (i) each first complementary probe has a sequence portion complementary to a first target sequence and a sequence portion non-complementary to the first target sequence, the non-complementary portion comprising an interrogation-site barcode sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to a second target sequence and an adjacent sequence portion non-complementary to the second target sequence; and
(b) buffers and enzymes for ligation and enrichment.
48. The kit of clause 47, further comprising at least one PCR primer, a polymerase, and a set of dNTPs for amplifying the extended target polynucleotide for the purpose of enrichment.
49. The kit of clause 47 or clause 48, further comprising a ligase.
50. The kit of any one of clauses 47-49, further comprising the software required to interpret the data.
51. The kit of any one of clauses 47-50, for determining genotype.
52. The kit of any one of clauses 47-51, for determining copy number.
Examples
The following examples are provided to illustrate various embodiments of the invention and are not intended to limit the invention in any way. The examples and methods described herein represent preferred embodiments, are exemplary, and are not intended to limit the scope of the invention. Modifications and other uses that fall within the spirit of the invention as defined by the scope of the claims will occur to those skilled in the art.
Example 1. Nucleic acid analysis by joining bar-code-labeled polynucleotide probes.
Nucleic acid analysis by joining bar-code-labeled polynucleotide probes is performed by providing two complementary probes that hybridize to two parts of a target polynucleotide, the first target sequence and the second target sequence (Fig. 1A). The first complementary probe and the second complementary probe may be immediately adjacent or may be separated by 1 to 500 or more nucleotides.
The first complementary probe includes a short inquiry site bar code (thin line in Fig. 1A), which further distinguishes one first complementary probe from another form of the first complementary probe (Fig. 1E) or from other first complementary probes. This inquiry site bar code allows locus information and allele information (or locus information only, or allele information only) to be determined from a short, uniformly sized reporter sequence. For assays read out by next-generation sequencing, adding a sequence portion complementary to the target on the 5' side of the inquiry site bar code also places the bar code at a position where the sequence data are of high quality. The inquiry site bar code may contain information on the locus only, the allele only, a combination of locus and allele, or the locus and the allele as separate sequences. Using the inquiry site bar code, a reporter sequence of defined size, arrangement and nucleotide composition is associated with the gene locus.
In the first complementary probe, the sequence complementary to the first target sequence (Fig. 1A; thick line) is split into two parts by the inquiry site bar code. The inquiry site bar code is not complementary to the first target sequence.
The first complementary probe may also include a universal sequence (Figs. 1A-D; dashed line). This universal sequence may be referred to as "universal primer 1", which describes its common function as a PCR primer site. It will be appreciated, however, that the universal sequence need not have this function and may have other functions, including but not limited to one or more of other forms of amplification and capture. The universal sequence may also be used to facilitate the addition of one or more other sequences or other moieties.
The second complementary probe has a sequence complementary to the second target sequence (Fig. 1A; thick line) and a universal sequence (Fig. 1A; dashed line). This universal sequence may be referred to as "universal primer 2", which describes its common function as a PCR primer site. The universal sequences in the first complementary probe and the second complementary probe may or may not be the same sequence (and may or may not be complementary to each other).
The first complementary probe and/or the second complementary probe may or may not also include a sequence used for size adjustment.
The universal sequences are not complementary to the target sequence.
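The probe layout described above can be sketched in code. The following is a minimal illustration only, assuming a 5'-to-3' arrangement of universal sequence, first complementary part, non-complementary inquiry site bar code, and second complementary part; the function names and every sequence are hypothetical and are not taken from the sequence listing.

```python
# Illustrative sketch only: all sequences and names below are made up.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def build_first_probe(universal_1: str, target_region: str,
                      barcode: str, split: int) -> str:
    """Assemble a first complementary probe, written 5'->3'.

    target_region is the first target sequence as it appears on the target
    strand (5'->3'). The probe's complementary sequence is its reverse
    complement, interrupted after `split` bases by the non-complementary
    bar code, so complementary sequence flanks the bar code on both sides.
    """
    comp = reverse_complement(target_region)
    if not 0 < split < len(comp):
        raise ValueError("bar code must sit inside the complementary sequence")
    return universal_1 + comp[:split] + barcode + comp[split:]


# Hypothetical inputs (assumptions, chosen only to keep the example short).
probe = build_first_probe(
    universal_1="ACACAC",
    target_region="AAACCCGGG",
    barcode="GATTACAGATTA",  # a 12-nt bar code, one of the claimed lengths
    split=3,
)
print(probe)
```

Removing the universal sequence and the bar code from the printed probe leaves exactly the reverse complement of the target region, which is the property the text relies on.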
After the first complementary probe and the second complementary probe have hybridized to the first target sequence and the second target sequence, which may or may not be present in the sample, the first complementary probe may or may not be extended so that it lies immediately adjacent to the second complementary probe. There may be no gap between the first complementary probe and the second complementary probe, or there may be a gap of one or more bases between them, which can be filled in a gap-filling step.
The adjacent first complementary probe and second complementary probe are joined (chevron in Fig. 1A) to produce a product polynucleotide, which extends from universal primer 1 at the 5' end to universal primer 2 at the 3' end, as shown in Fig. 1B.
This product polynucleotide can then serve as the template for an amplification reaction or another form of enrichment (Fig. 1B). In this example, enrichment is performed by PCR. Although other configurations may be used, as illustrated in this example PCR primer 2 has a portion that is the complement of universal primer 2 (from the second complementary probe), a portion that is a sample index sequence, and a portion that is an adapter sequence (medium line). Using the product polynucleotide as the template, DNA synthesis is initiated from PCR primer 2 (closed arrow in Fig. 1B).
This amplification product then serves as the template for further amplification (in this example) from PCR primer 1 (Fig. 1C). Although other configurations may be used, as illustrated in this example PCR primer 1 has a portion that is the complement of universal primer 1 (dashed line, from the first complementary probe) and a portion that is an adapter sequence (medium line). Using the product of the first round of amplification as the template, DNA synthesis is initiated from PCR primer 1 (closed arrow). In some embodiments, PCR primer 1 may also have a portion that is a sample index sequence, similar to PCR primer 2 shown in Fig. 1B.
In alternative embodiments, the sample index (or a part of it) is added with PCR primer 1. In other embodiments, sample indexes are added with both PCR primer 1 and PCR primer 2. Through PCR primer 2 and/or PCR primer 1, a sample identification sequence (sample index) or other sample-identifying moiety is attached to each product polynucleotide. When the sample index is added in PCR primer 1, it lies close to the inquiry site bar code, which facilitates sequencing of both the inquiry site bar code and the sample index while minimizing the total number of bases that need to be sequenced (for example, there is no need to sequence at least part of the first target sequence and the second target sequence, which would otherwise lie between the inquiry site bar code and the sample index if the sample index were added in PCR primer 2).
Two rounds of DNA synthesis (Figs. 1B and 1C) yield the double-stranded amplicon shown in Fig. 1D. In this example, more copies of this amplicon are generated by further rounds of amplification in which DNA synthesis is primed by PCR primer 2 and PCR primer 1.
In addition, in this example, sequence data are generated for a portion of each amplicon (or for the whole amplicon). The presence or absence of a portion of each amplicon may also be determined without generating sequence data. Each sequence generated is compared with a database and assigned to the appropriate sample and to the appropriate allele and/or locus. Various factors (including but not limited to sequencing errors, polymerase errors or non-specific binding) may cause misassignment. The tabulated read counts are analyzed to determine the presence, absence, amount or copy number of a target sequence, SNP or gene locus.
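The database comparison described above can be illustrated with a small lookup. This is a sketch under stated assumptions, not the method used to produce the reported data: it assumes each read begins with the sample index followed directly by the inquiry site bar code, tolerates at most one mismatch per element, and uses toy 6-nt sequences; all names, sequences and the read layout are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def closest(seq: str, known: dict, max_mismatches: int = 1):
    """Return the unique known key within max_mismatches of seq, else None."""
    if seq in known:
        return seq
    hits = [k for k in known
            if len(k) == len(seq) and hamming(k, seq) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


def assign_read(read: str, sample_db: dict, barcode_db: dict,
                index_len: int = 6, barcode_len: int = 6):
    """Assign one read to (sample, locus, allele), or None if unassignable.

    Assumed (hypothetical) layout: sample index, then inquiry site bar code.
    """
    index = closest(read[:index_len], sample_db)
    barcode = closest(read[index_len:index_len + barcode_len], barcode_db)
    if index is None or barcode is None:
        return None
    locus, allele = barcode_db[barcode]
    return sample_db[index], locus, allele


# Toy databases (assumptions for illustration only).
sample_db = {"AAAAAA": "sample1", "CCCCCC": "sample2"}
barcode_db = {"ACGTAC": ("locus1", "A"), "TGCATG": ("locus1", "B")}

print(assign_read("AAAAAAACGTAC", sample_db, barcode_db))
print(assign_read("AAAAATTGCATG", sample_db, barcode_db))  # one index mismatch
print(assign_read("GGGGGGACGTAC", sample_db, barcode_db))  # unknown index
```

Allowing a mismatch rescues reads with a single sequencing error but can also contribute to misassignment when database entries are too similar, which is one reason the spacing between bar codes matters.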
In another example (Fig. 1E), there are two or more forms of the first complementary probe. Each form has a different sequence at its 3' end (shown as A and B). This difference may be one or more bases. In the extreme case (for example, detection of large deletions or complex indels), the two forms of the first complementary probe are complementary to entirely different forms of the first target sequence. Two or more forms of the first complementary probe may fall anywhere between these two extremes while retaining the other elements of the first complementary probe. Although other uses exist, including but not limited to checking for contamination or for the fraction of an admixture, quasispecies or heteroresistance, multiple forms of the first complementary probe are generally used to generate classical genotype information. In that case, for each sample and locus, the number of reads assigned to the A allele is compared with the number of reads assigned to the B allele. For each locus, considering the ratio of reads assigned to the A allele to reads assigned to the B allele (and to mismatches), samples whose reads are assigned mainly to the A allele are AA, samples whose reads are assigned mainly to the B allele are BB, and samples with a substantial number of reads assigned to both alleles are AB. The A and B designations are used only to distinguish the alleles and do not refer to any ordering of the nucleotide sequences associated with the A or B allele.
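The comparison of A-allele and B-allele read counts can be sketched as a simple ratio rule. The cutoffs below (a minimum of 20 reads and an 85% fraction for a homozygous call) are arbitrary assumptions chosen for illustration; the text does not specify thresholds, and in practice calls are read from the cluster plots for each locus.

```python
def call_genotype(a_reads: int, b_reads: int,
                  min_total: int = 20, hom_cutoff: float = 0.85) -> str:
    """Call AA, AB or BB from read counts assigned to the A and B alleles.

    min_total and hom_cutoff are hypothetical values, not from the source.
    """
    total = a_reads + b_reads
    if total < min_total:
        return "no call"          # too few reads to judge
    frac_a = a_reads / total
    if frac_a >= hom_cutoff:
        return "AA"               # reads assigned mainly to the A allele
    if frac_a <= 1 - hom_cutoff:
        return "BB"               # reads assigned mainly to the B allele
    return "AB"                   # substantial reads for both alleles


for counts in [(980, 20), (15, 985), (510, 490), (3, 2)]:
    print(counts, call_genotype(*counts))
```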
Example 2. Inquiry site bar code flanked on both sides by complementary sequence.
When nucleic acid analysis is performed by an assay that joins bar-code-labeled polynucleotide probes, there are several positions in the first complementary probe at which the inquiry site bar code can be placed. The inquiry site bar code may be placed within the universal sequence, between the universal sequence and the target-specific sequence (common in prior-art methods, such as that disclosed in US patent 8,460,866), or within the target-specific sequence, as illustrated herein. When the inquiry site bar code is placed within the target-specific sequence and is not complementary to the target polynucleotide, it is flanked on both sides by complementary sequence portions. When allele and locus information (or, in some cases, locus information only) is encoded in the inquiry site bar code, the advantage is that the degree of sequence difference used to detect target polynucleotides in a multiplex reaction can be controlled. In that case, detection does not depend on there being sufficient difference between the target polynucleotide sequences. Results obtained with an assay using a probe assembly (PC) comprising 130 probe triplets (two forms of the first complementary probe and one form of the second complementary probe), with a 6-mer inquiry site bar code placed between the target-specific sequence and the universal sequence, were compared with results obtained using a PC comprising 130 probe triplets for the same target polynucleotides and variants, with a 12-mer inquiry site bar code placed within the target-specific sequence and not complementary to it (so that, in the first complementary probe, there are complementary sequence portions on both sides of the inquiry site bar code). Each probe in the PC was at 50 pM. In the 12-mer probe design, the complementary portion was extended by several bases at the 5' end (separated by 12 bases from the remainder of the complementary region). In the 6-mer and 12-mer designs, the complementary region 3' of the 6-mer or 12-mer inquiry site bar code is identical in size and composition. In addition, in the 12-mer design, the 12-base non-complementary inquiry site bar code carries the information for the combination of allele and locus. In the 6-mer design, the 6-base non-complementary inquiry site bar code carries the allele information, and reads are assigned to a locus from the sequence of the target sequence (which is likewise contained in the data).
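Because the bar codes are designed rather than dictated by the target, the degree of difference between them can be checked directly. The sketch below computes the smallest pairwise Hamming distance in a set of candidate 12-mers; the three sequences are hypothetical and the choice of Hamming distance as the measure of difference is an assumption made for illustration.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length bar codes."""
    if len(a) != len(b):
        raise ValueError("bar codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes) -> int:
    """Smallest Hamming distance between any two bar codes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


# Hypothetical 12-mer bar codes (not from the sequence listing).
candidates = ["ACGTACGTACGT", "TGCATGCATGCA", "ACGTACGTTGCA"]
print(min_pairwise_distance(candidates))
```

A larger minimum distance means more sequencing errors are needed before one bar code can be mistaken for another, independent of how similar the underlying target sequences are.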
Bovine genomic DNA (50 ng/µL) from individual samples was placed in the wells of a multiwell plate, heated to 98 °C and held for 15 minutes. A portion of each sample was then transferred to a fresh sample plate and mixed with the PC (12-mer). These reactions were then held at 98 °C for 1 minute to denature and incubated at 60 °C for 20 hours to hybridize. After hybridization, 3.2 µL of each reaction was added to a waiting plate containing 12.8 µL of NEB 1X Taq DNA ligase buffer and Taq DNA ligase. The new sample plate was sealed, the reactions were mixed and centrifuged, and the plate was held at 54 °C for 15 minutes and at 98 °C for 10 minutes, then cooled to 4 °C and held at that temperature. One µL of this completed ligation reaction was used in a PCR reaction (total volume 12.5 µL) containing Promega GoTaq Hotstart Taq, 1X buffer, dNTPs, and 0.3 µM PCR primer 1 and PCR primer 2. A total of 32 cycles were run at 65 °C/95 °C for 20 seconds/15 seconds; all reactions were then pooled, processed on a single Zymo-100 silica column and eluted with 150 µL TE 8.0. The pooled library was then quantified using a Bioanalyzer 2100 trace and diluted to the appropriate fraction of an Illumina Next500 flow cell, and sequence data were generated.
The sample index sequence and the inquiry site bar code contained in each read (together with other sequences as needed) were compared with a database, and the number of reads for each sample x locus x allele was tabulated (including misassigned reads). For each locus, the number of A-allele reads (X axis) was plotted against the number of B-allele reads (Y axis) for each sample. This plot of A allele against B allele is referred to as a cluster plot, as depicted in Figs. 2A-C.
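The tabulation step can be sketched with a counter keyed on (sample, locus, allele). This is only an illustration of the bookkeeping, assuming reads have already been assigned; the sample and locus names are hypothetical, and the function returns the (A reads, B reads) coordinates that would be plotted on a cluster plot rather than drawing the plot itself.

```python
from collections import Counter


def tabulate(assignments):
    """Count reads per (sample, locus, allele) tuple."""
    return Counter(assignments)


def cluster_points(counts, locus):
    """Return {sample: (A reads, B reads)} for one locus.

    These pairs are the X and Y coordinates of a cluster plot.
    """
    samples = sorted({s for (s, l, _allele) in counts if l == locus})
    return {s: (counts[(s, locus, "A")], counts[(s, locus, "B")])
            for s in samples}


# Hypothetical assigned reads (illustration only).
assigned = (
    [("s1", "L1", "A")] * 3
    + [("s1", "L1", "B")] * 1
    + [("s2", "L1", "B")] * 4
    + [("s1", "L2", "A")] * 2
)
counts = tabulate(assigned)
print(cluster_points(counts, "L1"))
```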
The standard protocol above was also performed using a PC comprising the 6-mer probes and an aliquot of DNA that had been heated to 98 °C and held for 15 minutes.
In general, the results show that a PC in which the first complementary probe has the inquiry site bar code between the universal sequence and the target sequence and a PC in which the first complementary probe has a non-complementary inquiry site bar code within the target sequence (so that there are complementary sequence portions on both sides of the bar code) have a similar ability to yield genotypes. Although locus-by-locus differences were observed, the position of the inquiry site bar code and the type of locus and allele information it carries (where the bar code is placed, and given that the information is not complementary to the target sequence) did not change the character of the genotyping cluster plots. Some of the differences between the results obtained with the PCs of the 6-mer and 12-mer designs may reflect differences in effective probe concentration (full-length material) caused by inhomogeneity from the manufacturer; other differences arise because the 12-mer design is better able to ensure that similar target sequences are not mistaken for one another.
Example 3. Resolving G:T mismatches in a genotyping assay.
It is known to those skilled in the art that ligation-based assays have lower specificity when a T:G hybridization error ("mismatch") occurs between the 3' end of the SNP interrogation sequence and the target sequence. This situation arises when a G/A SNP is detected. For a genotyping assay, it can be a problem when the first form of the first complementary probe (for example, first complementary probe #1) has a C at its 3' end and the second form of the first complementary probe (for example, first complementary probe #1') has a T. The first form of the first complementary probe is correctly ligated on a target polynucleotide that has a G at the SNP site and incorrectly ligated on a target polynucleotide that has an A at the SNP site. The second form of the first complementary probe is correctly ligated on a target polynucleotide that has an A at the SNP site and incorrectly ligated on a target polynucleotide that has a G at the SNP site.
Those skilled in the art will appreciate that a similar T:G hybridization error may occur when a C/T SNP is detected. The partial hydrogen bonding between the G:T "mismatched" nucleotides is stable enough to allow the ligase to join (inefficiently) the mismatched first complementary probe to the second complementary probe. This yields non-specific product polynucleotide 0-25% of the time. To minimize this signal, the universal base deoxyinosine is used near the interrogating 3' position in the affected first complementary probe. Including a deoxyinosine at the 2nd, 3rd, 4th, 5th, 6th, 7th, 8th or 9th position from the 3' end of the affected form of the first complementary probe destabilizes the G:T mismatch and thereby reduces the likelihood (and frequency) of producing non-specific product polynucleotide. When the deoxyinosine is at the 2nd 3' position of the affected form of the first complementary probe, the G:T mismatch is destabilized enough that misligation is minimized and the frequency of non-specific product polynucleotide is low. A deoxyinosine in the unaffected form of the first complementary probe does not destabilize hybridization to an extent that affects genotype resolution, and the probe is ligated (to a large degree) specifically to give the specific product polynucleotide. As the deoxyinosine is moved toward the 5' side of the first complementary probe, its ability to reduce the stability of the G:T mismatch declines. When the inosine is at the 10th position from the 3' end, the next-generation sequencing reads arising from non-specific product polynucleotide are comparable to those obtained with the unmodified affected form of the first complementary probe. The ideal position for a deoxyinosine to reduce mismatch ligation is the 2nd, 3rd or 4th base from the 3' end.
In this example, a deoxyinosine was placed (substituting for the existing base) at 3' positions 2 to 10 of the affected form of the first complementary probe (T at the 3' end). This gave 10 versions of the affected form of the first complementary probe. Using probe buffer, probe assemblies were made at 50 pM, each containing one inosine arrangement of the affected form of the first complementary probe, the unaffected form of the first complementary probe and the second complementary probe (for the target polynucleotide). A single bovine gDNA sample (50 ng/µL) was heated at 98 °C for 20 minutes to fragment it, and 5 µL was then dispensed into wells. Four wells were then filled with each probe mixture. The NGG reactions were then heated to 98 °C and held for 1 minute, then cooled to 60 °C and held for 20 hours. After hybridization, 3.2 µL of each reaction was added to a waiting plate containing 12.8 µL of NEB 1X Taq DNA ligase buffer and Taq DNA ligase. The new sample plate was sealed, mixed and centrifuged, then held at 54 °C for 15 minutes and at 98 °C for 10 seconds, cooled to 4 °C and held at that temperature. One µL of this completed ligation reaction was used in a PCR reaction (total volume 12.5 µL) containing Promega GoTaq Hotstart Taq, 1X buffer, dNTPs, and 0.3 µM first universal primer and second universal primer. A total of 32 cycles were run at 65 °C/95 °C for 20 s/15 s; all reactions were then pooled, processed on a single Zymo-100 silica column and eluted with 150 µL TE 8.0. The pooled library was then quantified using a Bioanalyzer 2100 trace and diluted to the appropriate fraction of an Illumina Next500 flow cell, and sequence data were generated. The sample index sequence and the allele and locus bar code sequence contained in each read were compared with a database, and the number of reads for each sample x locus x allele was tabulated (including non-specific reads). The results are shown in Fig. 3.
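The set of probe versions tested in this example can be enumerated programmatically. The sketch below substitutes an inosine (written as "I") at positions 2 to 10 counted from the 3' end, where position 1 is the 3'-terminal base, and labels the versions "none" and "iT2" to "iT10" following the Fig. 3 labels; the probe sequence itself is a hypothetical placeholder, not the probe used in the experiment.

```python
def inosine_variants(probe: str, positions=range(2, 11)) -> dict:
    """Return {label: sequence} for the unmodified probe and each variant.

    Position 1 is the 3'-terminal base of the probe (written 5'->3'), so
    position 2 is the base immediately 5' of the interrogating base.
    """
    variants = {"none": probe}
    for pos in positions:
        idx = len(probe) - pos            # 0-based index of the replaced base
        variants[f"iT{pos}"] = probe[:idx] + "I" + probe[idx + 1:]
    return variants


# Hypothetical affected-form probe ending in a 3' T (illustration only).
variants = inosine_variants("ACGTACGTACGTACT")
for label, seq in variants.items():
    print(f"{label:>5}  {seq}")
```

This yields the unmodified probe plus nine inosine-containing versions, ten in total, each retaining the 3' T.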
In Fig. 3, the sequence model (LHS-T, the affected form of the first complementary probe, is shown 5' to 3', and the target gDNA, or genomic DNA, is shown 3' to 5') shows the ten 3'-most positions of the first complementary probe containing the 3' T nucleotide (none, iT2 to iT10), with the T depicted as mismatched to the G nucleotide in the genomic DNA sequence. The 2nd 3' position shown (i) corresponds to "iT2". The underlined part of the gDNA sequence is the part to which the second complementary probe hybridizes. Solid gray bars are homozygous GG samples, and striped bars represent homozygous AA samples. The Y axis is a logarithmic scale of the number of reads associated with the T form of the first complementary probe. In the gray bars (homozygous GG samples), there is non-specific ligation caused by the stability of the G:T mismatch. In the striped bars (homozygous AA samples), the reads come from specific ligation. Placing the deoxyinosine at the 2nd or 3rd 3' position of the affected form of the first complementary probe significantly reduced the number of reads from non-specific ligation. Similarly, deoxyinosine can be used in a first complementary probe that has a 3' G and the potential for a G:T mismatch.
Example 4. Method for detecting target polynucleotides in a large excess of background DNA.
The detection method and associated data analysis are used to detect target polynucleotides present in samples that contain DNA from multiple species and a large excess of non-target DNA. The detection method (and associated data analysis) yields genotype information (SNPs or other variations) and information on the amount of target present in the sample. The effectiveness of one detection method was demonstrated in a model experiment in which E. coli genomic DNA (background or "noise" DNA, which is not detected) was mixed with variable amounts of target genomic DNA by titrating the target (or signal) genome (a single bovine sample) into background E. coli genomic DNA. The noise was set at 0, 125 or 250 ng per reaction, and the signal was serially diluted with TE buffer, pH 8.3, from 250, 125 and 62.5 ng per reaction down to as little as 0.12207 ng (12 concentrations in total). The tubes were heated to 98 °C and held for 15 minutes to fragment the DNA. Each signal tube was used as a source from which 5 µL samples were transferred into each of the 8 reaction wells of one column of a 96-well PCR plate. A probe assembly (PC) was formed from a set of 135 probe triplets for genotyping bovine genomic DNA (two forms of the first complementary probe and one form of the second complementary probe). The PC was mixed with noise E. coli genomic DNA at 0, 125 or 250 ng per reaction. These PC + E. coli mixtures were distributed across the 96-well plate (250 ng per reaction in three rows, 125 ng per reaction in three rows, and 0 ng per reaction in two rows). These reactions were then heated at 98 °C for 1 minute to denature and incubated at 60 °C for 20 hours to hybridize. After hybridization, 3.2 µL of each reaction was added to a waiting plate containing 12.8 µL of NEB 1X Taq DNA ligase buffer and Taq DNA ligase. The new sample plate was sealed, the reactions were mixed and centrifuged, and the plate was held at 54 °C for 15 minutes and at 98 °C for 10 seconds, then cooled to 4 °C and held at that temperature. One µL of this completed ligation reaction was used in a PCR reaction (total volume 12.5 µL) containing Promega GoTaq Hotstart Taq, 1X buffer, dNTPs, and 0.3 µM first universal PCR primer and second universal PCR primer. A total of 32 cycles were run at 65 °C/95 °C for 20 seconds/15 seconds; all reactions were then pooled, processed on a single Zymo-100 silica column and eluted with 150 µL TE 8.0. The pooled library was then quantified using a Bioanalyzer 2100 trace and diluted to the appropriate fraction of an Illumina Next500 flow cell, and sequence data were generated.
The sample index sequence and the allele and locus bar code sequence (inquiry site bar code) contained in each read were compared with a database, the number of reads for each sample x locus x allele was tabulated (including misassigned reads), and genotypes were determined. Data relating to DNA and bovine genome equivalents are shown in Figs. 4A and 4B and show that the method can be used to detect polynucleotide targets in a diploid eukaryotic genome, even when the target signal genome makes up less than 0.1% of the total DNA. The method can be extrapolated to detecting microbial genomes against a background of eukaryotic genomes, detecting genomic fragments in complex food, environmental or other samples, detecting low-abundance RNA in cells, detecting the adventitious presence of contaminating seed, and other applications.
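The titration figures quoted in this example can be checked with a few lines of arithmetic. The sketch below assumes the 12 signal concentrations form a two-fold dilution series starting at 250 ng per reaction, which is consistent with the stated values 250, 125, 62.5 and 0.12207 ng, and then computes the fraction of total DNA contributed by the signal genome; the function names are hypothetical.

```python
def twofold_series(start_ng: float, steps: int) -> list:
    """Two-fold serial dilution: start, start/2, start/4, ... (steps values)."""
    return [start_ng / 2 ** k for k in range(steps)]


def signal_fraction(signal_ng: float, noise_ng: float) -> float:
    """Fraction of total DNA mass contributed by the signal genome."""
    return signal_ng / (signal_ng + noise_ng)


series = twofold_series(250.0, 12)
lowest = series[-1]                      # 250 / 2**11 ng
print([round(x, 5) for x in series])

# Signal as a percentage of total DNA at the lowest signal input.
for noise in (125.0, 250.0):
    pct = 100 * signal_fraction(lowest, noise)
    print(f"noise {noise:5.1f} ng -> signal is {pct:.4f}% of total DNA")
```

At the lowest signal input the bovine DNA is below 0.1% of the total with either 125 ng or 250 ng of E. coli background, in line with the figure stated in the text.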
Example 5. Reversible denaturation before probe hybridization improves cluster plot resolution.
The genotyping method and associated data analysis use double-stranded or single-stranded nucleic acid (NA) as the target polynucleotide. The first complementary probe and the second complementary probe need access to single-stranded NA in order to hybridize to the target polynucleotide. Experimental results show that, to make double-stranded or even single-stranded NA accessible, the sample must be heated to an elevated temperature. Exemplary temperatures are in the range of 70 °C to 100 °C, with heating times of 1 second to 15 minutes. This reversible denaturation step improves the detection of target polynucleotides (especially target polynucleotides present in the sample as double strands).
Experiments were performed using a method similar to that described in Example 2, with a probe assembly (PC) comprising 135 probe triplets for bovine SNPs (for genotyping) and 96 bovine genomic DNA samples. In the first experiment, the DNA was heated to 98 °C and held for 15 minutes, the PC was added and mixed with the sample, the sample and PC were then heated (reversible denaturation) to 98 °C and held for 1 minute, and finally the hybridization step was performed at 60 °C for 20 hours. In the second experiment, no heating was performed before the 20-hour hybridization step: the sample was not heated to 98 °C and held for 15 minutes, the PC and sample were not heated to 98 °C and held for 1 minute, and only the hybridization at 60 °C for 20 hours was performed. After hybridization, the ligation, PCR and sample pooling steps were performed as described in Example 2.
The pooled library was then quantified using a Bioanalyzer 2100 trace and diluted to the appropriate fraction of an Illumina Next500 flow cell, and sequence data were generated. The sample index sequence and the allele and locus bar code sequence contained in each read were compared with a database, and the number of reads for each sample x locus x allele was tabulated (including misassigned reads). For each locus, the number of A-allele reads (X axis) was plotted against the number of B-allele reads (Y axis) for each sample. The results are shown in Figs. 5A and 5B.
Example 6. Genotyping assay using a dried probe assembly.
The genotyping assay described herein involves nucleic acid and probes mixed at a high salt concentration. The first complementary probes and the second complementary probes are in solution, with each individual probe at 50 pM in the "probe assembly" or "PC" (probes, TE and hybridization buffer). This example illustrates an improved way of setting up genotyping assay reactions. It is desirable to place the probe assembly in the reaction wells, dry it, seal the sample plate and store it at room temperature for the long term (that is, for years). A set of 135 probe triplets for genotyping bovine genomic DNA (two forms of the first complementary probe and one form of the second complementary probe) was dried in reaction wells. A single PC was made at a working concentration of 50 pM, and 3 µL of PC was placed in six columns of wells of a 384-well plate. Another PC was made containing the same 135 probe triplets, TE and buffer, plus 0.4 mM trehalose. Trehalose is a preservative that can benefit dried polynucleotides, and it fixes the dried PC to the bottom of the reaction well. Similarly, 3 µL of the trehalose-containing PC was added to six columns of wells of a 384-well plate. One sample plate of each PC type (with or without trehalose) was dried completely by placing it overnight in a laminar flow hood (in which sterile, dust-free air passes over the plate). One sample plate without trehalose was sealed and stored frozen at -20 °C. The dried sample plates were sealed and stored at room temperature.
After one month, fresh PC was prepared and 3 µL was added to six columns of wells of a 384-well plate. A set of 96 bovine genomic DNA samples (concentration 50 ng/µL, volume 35 µL) was heated to 98 °C and held for 15 minutes. The gDNA was then added to the 4 sample plates: 5 µL was added to the two wet sample plates (one fresh, the other stored at -20 °C for one month), and 8 µL was added to the two dried sample plates (stored at room temperature for one month). The volume in all wells was 8 µL. The sample plates were sealed, briefly centrifuged and left at room temperature for 2 hours to ensure complete rehydration of the dried probes. These reactions were then heated at 98 °C for 1 minute to denature and held at 60 °C for 20 hours to hybridize. After hybridization, 3.2 µL of each reaction was added to a plate containing 12.8 µL of NEB 1X Taq DNA ligase buffer and Taq DNA ligase. The new sample plate was sealed, mixed and centrifuged, then held at 54 °C for 15 minutes and at 98 °C for 10 seconds, cooled to 4 °C and held at that temperature. One µL of this completed ligation reaction was used in a PCR reaction (total volume 12.5 µL) containing Promega GoTaq Hotstart Taq, 1X buffer, dNTPs, and 0.3 µM first universal PCR primer and second universal PCR primer. A total of 32 cycles were run at 65 °C/95 °C for 20 seconds/15 seconds; all reactions were then pooled, processed on a single Zymo-100 silica column and eluted with 150 µL TE 8.0. The pooled library was then quantified using a Bioanalyzer 2100 trace and diluted to the appropriate fraction of an Illumina Next500 flow cell, and sequence data were generated. The sample index sequence and the allele and locus bar code sequence contained in each read were compared with a database, and the number of reads for each sample x locus x allele was tabulated (including misassigned reads). For each locus, the number of A-allele reads (X axis) was plotted against the number of B-allele reads (Y axis) for each sample. The results are shown in Figs. 6A-D.
In general, the ability to generate genotypes was similar for freshly prepared PC, PC stored frozen, and PC dried with trehalose and stored at room temperature.
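The tabulation step used in this and the following examples (matching each read's sample index and site barcode against a database, then counting reads per sample x site x allele) can be sketched as follows. This is a minimal illustration only: the function names, the dictionary-based databases, and the representation of a read as a (sample index, site barcode) pair are assumptions, as is the choice to count reads with no database match separately.

```python
from collections import Counter

def tabulate_reads(reads, sample_index_db, site_barcode_db):
    """Count reads per (sample, site, allele).

    reads: iterable of (sample_index_seq, site_barcode_seq) pairs (assumed format).
    sample_index_db: dict mapping sample index sequence -> sample name.
    site_barcode_db: dict mapping site barcode sequence -> (site, allele).
    """
    counts = Counter()
    unmatched = 0  # reads whose barcodes match no database entry
    for sample_index_seq, site_barcode_seq in reads:
        sample = sample_index_db.get(sample_index_seq)
        site_allele = site_barcode_db.get(site_barcode_seq)
        if sample is None or site_allele is None:
            unmatched += 1
            continue
        site, allele = site_allele
        counts[(sample, site, allele)] += 1
    return counts, unmatched

def allele_pair(counts, sample, site):
    """Return the (A reads, B reads) point plotted for one sample at one site."""
    return counts[(sample, site, "A")], counts[(sample, site, "B")]
```

Each (A reads, B reads) pair returned by `allele_pair` corresponds to one point on the cluster plot for that site.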
Example 7. Method for determining copy number variation.
The copy number analysis method can be used to determine copy number variation (CNV), in which zero copies of an allele are distinguished from one or two copies of the same allele. To demonstrate this, the next-generation sequencing reads generated by the copy number assay (96 bovine samples, with a normalized amount of input DNA in all samples) were compared with a database of inquiry site barcodes and sample index sequences applicable to the probe assembly. The number of reads created by each sample for a single allele of a single site was tabulated (including misassigned reads) and analyzed. As shown in Figs. 7A-C, BB homozygous animals had zero or near-zero reads carrying the inquiry site barcode of the A allele (at that site). AB heterozygous animals had about 200 reads carrying the inquiry site barcode of the A allele (at that site). Finally, AA homozygous animals had about 400 reads carrying the inquiry site barcode of the A allele. In such a CNV assay, the input nucleic acid must be kept consistent across the samples tested, or additional markers must be used to adjust the number of reads produced in each sample (and/or at each interrogated locus).
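The relationship between read count and copy number described above (near zero, about 200, and about 400 reads for zero, one, and two copies) can be illustrated with a short sketch. It assumes that reads are scaled by each sample's total read count to compensate for unequal input, and that a reads-per-copy value (about 200 in this example) has been established from reference samples. The function name, parameters, and the scaling scheme are illustrative assumptions, not the normalization used in the experiment.

```python
def estimate_copy_number(allele_reads, sample_total_reads,
                         reference_total_reads, reads_per_copy):
    """Estimate allele copy number from a read count.

    allele_reads: reads carrying the site barcode of the allele of interest.
    sample_total_reads: all assigned reads for this sample.
    reference_total_reads: total reads of a typical sample (assumed scaling target).
    reads_per_copy: reads expected for one allele copy at the reference depth.
    """
    if sample_total_reads <= 0 or reads_per_copy <= 0:
        raise ValueError("totals and reads_per_copy must be positive")
    # Scale the raw count to the reference sequencing depth.
    scaled = allele_reads * reference_total_reads / sample_total_reads
    return round(scaled / reads_per_copy)
```

For example, with `reads_per_copy=200`, a sample sequenced at twice the reference depth that yields 400 A-allele reads is scaled to 200 reads and assigned one copy.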
Example 8. Assessing a tetraploid genome using genotyping.
In a tetraploid organism, four copies of an allele may be present, one on each chromosome. To mimic a tetraploid organism, DNA from two different diploid animals (of the same species) was mixed, producing a sample with four copies of any given allele. Probe triplets for multiple target polynucleotides (two forms of the first complementary probe and one form of the second complementary probe) were added, and the method was performed as described in Example 2, except that a cluster plot for five genotypes was provided.
The sample index sequence and inquiry site barcode sequence contained in each read were compared with a database, and the number of reads formed by each sample x site x allele was tabulated (including misassigned reads). For the site, the number of reads of the A allele (X axis) was plotted against the number of reads of the B allele (Y axis) for each sample. This experiment demonstrates the ability to distinguish five genotype groups and generate genotypes when the sample is, or contains, a tetraploid genome. The results are shown in Fig. 8.
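The five genotype groups correspond to zero, one, two, three, or four copies of the B allele. As a simple stand-in for cluster-plot analysis, the sketch below assigns a sample to the dosage class whose expected B-allele read fraction (0, 0.25, 0.5, 0.75, or 1) is nearest to the observed fraction. This nearest-fraction rule, the function name, and the minimum-read threshold are assumptions made for illustration; the experiment itself used cluster plots.

```python
def call_tetraploid_genotype(a_reads, b_reads, min_reads=20):
    """Return a genotype string such as 'AABB', or None if coverage is too low."""
    total = a_reads + b_reads
    if total < min_reads:
        return None  # too few reads to make a call (threshold is an assumption)
    b_fraction = b_reads / total
    # Choose the B-allele dosage (0..4) whose expected fraction k/4 is closest.
    b_copies = min(range(5), key=lambda k: abs(b_fraction - k / 4))
    return "A" * (4 - b_copies) + "B" * b_copies
```

In practice, allele-specific probe efficiency can shift the observed fractions away from these ideal values, which is one reason cluster plots built from many samples are used.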
Example 9. Genotyping interrogation of multi-allelic sites.
Genotyping interrogation can genotype multi-allelic SNPs simply by adding a third, fourth, or further form of the first complementary probe for tri-allelic, tetra-allelic, or higher-order allelic sites. To detect the possible genotypes at, for example, a tri-allelic SNP position, the probe assembly (PC) includes three nearly identical first complementary probes and a single second complementary probe. Each of the three first complementary probes has a different 3' terminal nucleotide, complementary to one of the three different base substitutions (SNP) that may be present in diploid genomic DNA. Each of the three first complementary probes also has a unique inquiry site barcode that identifies the allele and site, so that the allele and site are detected through the exact form of the first complementary probe for the exact target polynucleotide and variant. The method is performed as described in Example 2, except that for this site the database contains three inquiry barcodes (one for each variant).
The sample index sequence and inquiry site barcode sequence contained in each read were compared with a database, and the number of reads formed by each sample x site x allele was tabulated (including misassigned reads). For the site, the number of reads of the A allele (X axis), the B allele (Y axis), and the C allele (Z axis) was plotted for each sample. This plot of the A, B, and C alleles is referred to as a cluster plot. In this case, allele A is a G base, allele B is a T base, and allele C is a C base. AA animals lie along the x axis, BB animals along the y axis, and CC animals along the z axis. Heterozygous animals (TC, TG, CG) lie between two of the axes. This experiment demonstrates the ability to generate genotypes at multi-allelic sites. The results are shown in Fig. 9.
Example 10. Genotyping method for detecting genetic variation (loci) other than single base substitutions.
The genotyping method (and associated data analysis) determines whether a target polynucleotide is present or absent in a sample. In some cases, the target polynucleotide can be the result of a deletion/insertion event. In this experiment, one form of the first complementary probe was designed to be complementary to the target sequence containing the deletion, and the other form of the first complementary probe was designed to be complementary to the target sequence without the deletion. The second complementary probe lies immediately 3' of both forms of the first complementary probe. The workflow then continues as in Example 2.
For each site, the number of reads of the A allele (X axis) was plotted against the number of reads of the B allele (Y axis) for each sample. In this case, the A allele and B allele are the insertion and the deletion (or vice versa). This experiment demonstrates the ability to determine genotypes at a specific locus for variants that are not single base substitutions. The results are shown in Fig. 10.
Example 11. Method for selecting and optimizing sample index barcodes.
The goal was to generate a total of 96000 complementary probes having a 15mer sample index barcode between the universal primer sequence and the adapter sequence (Fig. 1C and Fig. 1D). These 15mer sample index barcodes also have a reduced read length of 12 nucleotides (nt), suitable for processing a smaller number of different samples and thereby saving, for example, sequencing cost and time, because only the first 12 nucleotides need to be sequenced to identify a specific sample index. The theoretical maximum number of different 15mer sequences is 1073741824 (with 4 nucleotides, a 15mer sequence has 4^15 = 1073741824 different sequences). Subtracting homopolymers from this number leaves 139839696 15mers (http://oris.org/searchQ=berserker+A123620)
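The theoretical maximum follows directly from 4^15. How many sequences survive the homopolymer filter depends on the run length that counts as a homopolymer, which the text does not state. The sketch below assumes that any run of three or more identical bases is excluded (an assumption) and counts the surviving sequences with a small dynamic program; the function name and parameters are illustrative. Under this assumption, the recurrence yields 139839696 at length 14.

```python
def count_without_runs(length, max_run=2, alphabet=4):
    """Count sequences of the given length with no run longer than max_run.

    ends[r] holds the number of sequences whose final run has length r + 1.
    """
    if length < 1 or max_run < 1:
        raise ValueError("length and max_run must be at least 1")
    ends = [alphabet] + [0] * (max_run - 1)
    for _ in range(length - 1):
        total = sum(ends)
        # Appending a different base starts a new run of length 1;
        # appending the same base extends the run, up to max_run.
        ends = [total * (alphabet - 1)] + ends[:-1]
    return sum(ends)

total_15mers = 4 ** 15  # all possible 15mers over A, C, G, T
```

Setting `max_run=1` instead excludes every repeated adjacent base, a much stricter filter, which shows how strongly the surviving count depends on the chosen definition.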
These 15mer sequences were further narrowed by a selection process. Selection criteria included maximizing orthogonality among the barcodes and among the barcodes plus flanking sequences (universal primer sequence and adapter sequence); maximizing barcode specificity, so that the barcodes contain no homopolymers and have a GC content of about 40-60%; and ensuring compatibility with commercial indexes (such as Illumina TruSeq and Nextera indexes) while avoiding subsequences prone to high instrument error rates. After the selection process, the barcodes were randomized and the barcode candidate vector was shuffled.
These barcodes are stored in a full read-length set and a reduced read-length set. The sets are preloaded with the commercial indexes (with adapter extensions). Barcodes are enumerated from the largest to the smallest edit distance "d". Each barcode in the candidate vector is checked for GC content and for trimers affected by phasing errors, and is searched for collisions in the full or reduced read-length set at the given "d". If the barcode collides with neither set, it is added to both sets.
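The selection loop described above can be sketched as a greedy filter. This is a minimal illustration under stated assumptions: edit distance is taken to be Levenshtein distance, a homopolymer is taken to be a run of three or more identical bases, the check for trimers affected by phasing errors is omitted, and all function and parameter names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                        # deletion
                               current[j - 1] + 1,                     # insertion
                               previous[j - 1] + (char_a != char_b)))  # substitution
        previous = current
    return previous[-1]

def gc_fraction(seq):
    return sum(base in "GC" for base in seq) / len(seq)

def has_homopolymer(seq, run=3):
    return any(base * run in seq for base in "ACGT")

def select_barcodes(candidates, min_distance, short_len=12, preloaded=()):
    """Greedily keep candidates that pass the filters and collide with nothing kept."""
    full_set = list(preloaded)                       # full-length (15 nt) set
    short_set = [s[:short_len] for s in preloaded]   # reduced read-length set
    chosen = []
    for barcode in candidates:
        if not 0.4 <= gc_fraction(barcode) <= 0.6 or has_homopolymer(barcode):
            continue
        short = barcode[:short_len]
        if any(edit_distance(barcode, other) < min_distance for other in full_set):
            continue
        if any(edit_distance(short, other) < min_distance for other in short_set):
            continue
        full_set.append(barcode)
        short_set.append(short)
        chosen.append(barcode)
    return chosen
```

Running the loop at decreasing values of `min_distance`, as the text describes, would admit additional barcodes at each step while keeping the most orthogonal ones first. The pairwise comparison shown here is quadratic in the number of barcodes, so a production design at the 96K scale would need indexing or pruning.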
Index plate ordering: index plates are grouped by performance metric so that, for example, a subset of all sample plates has higher orthogonality/specificity. For each (384-well) index plate to be generated, the best barcode subset is selected from the unassigned barcode set (for example, by 15/12 nt read edit distance). The subset is assigned to the plate, and a performance metric is calculated for each plate. The performance metric is based on the number of sequencing reads. Examples of performance metrics follow.
Some barcode sequences were found to result in lower numbers of sequencing reads.
A motif in the barcode was found to cause interaction between the barcode and the adapter sequence. The motif was found to comprise a sequence of about 7 bases (CTAGCCTCC) and can cause self-complementarity between the 3' end of the complementary probe and an internal sequence. Variants of this 7 bp motif were also found. Examples are shown in Figs. 11A and 11B and Figs. 12A and 12B.
A computer program was built to replace these problematic sequences, for example by progressively optimizing the sequences as far as possible while reaching the full range of 96K barcodes. Because this specific motif appeared to affect performance more than problematic trimers and edit distance did, all of these factors were taken into account in the design/binning workflow. However, with all replacements made, and with each plate locally optimized under the same global edit-distance criterion, 84096 sample indexes were generated. The first 16128 of these indexes can also be used as 12-mers, for experiments and applications that process a large number of samples (for example, processing ten 384-well microtiter plates, each well containing one sample).
Example 12. Genotyping method for detecting target polynucleotides in polyploid samples.
This example describes a genotyping method (and associated data analysis) for detecting the presence or absence of a target polynucleotide in polyploid wheat samples. A ploidy reduction strategy is used to reduce the generation of sequence data from uninformative genomes. In many cases, the target polynucleotide can be the result of a SNP or a deletion/insertion event (indel).
In the first method, public and proprietary wheat genome sequence data were obtained, including target SNP/indel and proximal SNP/indel information. Based on knowledge of the position of the proximal SNP/indel relative to the target/marker SNP/indel, probes were designed to be selective for the genome of interest, using a proximal SNP destabilization strategy to reduce the ploidy arising from biological complexity. Targets that had been genotyped on Axiom and showed diploid clustering were selected. In addition, it was ensured that no proximal SNP was present within 9 bases on either side of the target marker (see Figs. 14A-C).
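The final design rule above, that no proximal SNP lies within 9 bases on either side of the target marker, can be expressed as a simple position filter. The sketch assumes that variant positions are given as integer coordinates on the same reference sequence; the function and parameter names are illustrative.

```python
def has_clear_flanks(target_pos, proximal_positions, flank=9):
    """Return True if no proximal variant lies within `flank` bases of the target.

    target_pos: coordinate of the target marker SNP/indel.
    proximal_positions: coordinates of known proximal SNPs/indels.
    """
    return all(abs(pos - target_pos) > flank
               for pos in proximal_positions
               if pos != target_pos)  # ignore the target marker itself

def filter_targets(targets, proximal_positions, flank=9):
    """Keep only the target positions whose flanks are free of proximal variants."""
    return [t for t in targets if has_clear_flanks(t, proximal_positions, flank)]
```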
One form of the first complementary probe was designed to be complementary to the target sequence with the SNP (LHS), and the other form of the first complementary probe was designed to be complementary to the target sequence without the SNP or indel (LHS'). The second complementary probe (RHS) lies immediately 3' of both forms of the first complementary probe. Selection of the genome of interest is thereby accomplished (that is, a target genome with the proximal SNP will produce a low number of sequence reads). Incorporating the proximal SNP into the probe design allows sites that would otherwise produce no reads to function fully (see Figs. 15A and 15B). The workflow then continues as in Example 2.
In the second method, public and proprietary wheat genome sequence data were obtained, including target SNP/indel and proximal SNP/indel information. Based on knowledge of the position of the proximal SNP/indel relative to the target marker SNP/indel, a blocking oligonucleotide complementary to the target genomic sequence carrying the proximal SNP/indel was added to prevent the RHS from hybridizing to that target genome.
One form of the first complementary probe was designed to be complementary to the target sequence with the SNP/indel (LHS), and the other form of the first complementary probe was designed to be complementary to the target sequence without the SNP/indel (LHS'). The second complementary probe (RHS) lies immediately 3' of both forms of the first complementary probe. A blocking/competitor oligonucleotide complementary to the target sequence containing the proximal SNP in the target genome was added. The blocking oligonucleotide prevents the RHS from hybridizing to the target genome. Selection of the genome of interest is thereby accomplished (that is, a target genome with the proximal SNP will produce few or no sequence reads). By adding the blocking oligonucleotide, proximal SNPs are accommodated in the probe design for sites that would otherwise produce no reads (see Figs. 16A and 16B). This method is applicable when the proximal SNP lies between bases 1 and 10 from the target marker. It cannot destabilize secondary polymorphisms. The workflow then continues as in Example 2.
In the third method, public and proprietary wheat genome sequence data were obtained, including target SNP/indel and proximal SNP/indel information. Based on knowledge of the position of the proximal SNP/indel relative to the target marker SNP/indel, PCR primers were designed to selectively amplify only the genome or subgenome of interest. This method adds an upfront PCR amplification step. One or both of the PCR primers complementary to the target genomic sequence can hybridize to the proximal SNP in the target genomic sequence. This upfront PCR amplification step can take the form of a parallel workflow (that is, the sample is split into two parts).
One form of the first complementary probe was designed to be complementary to the target sequence with the SNP/indel (LHS), and the other form of the first complementary probe was designed to be complementary to the target sequence without the SNP/indel (LHS'). The second complementary probe (RHS) lies immediately 3' of both forms of the first complementary probe. An upfront PCR amplification step was added, using PCR primers designed to selectively amplify only the genome or subgenome of interest. The proximal SNP/indel can destabilize hybridization of the PCR primers. Selection of the desired genome of interest is thereby accomplished (that is, unwanted genomes containing the proximal SNP are eliminated from the downstream workflow). Combinations of multiple PCR primer sets and probe sets can be used to accommodate various combinations of sites and genomes (see Figs. 17A and 17B). The workflow then continues as in Example 2.
For each site, the number of reads of the A allele (X axis) was plotted against the number of reads of the B allele (Y axis) for each sample (Axiom cluster plot analysis). This experiment demonstrates the ability to use knowledge of proximal SNPs/indels to select the genome of interest, to identify the target SNPs/indels to be selected, and to design probes. The results are shown in Figs. 18A and 18B.
In another method, 600X average coverage can be used for a small number of selected markers. This method requires a parallel workflow (that is, the sample is split into two parts). In many cases, samples are separated according to the markers associated with a nearby proximal SNP or single-base indel. For the affected markers, in contrast to the sequencing at 200X coverage used in the other methods, the coverage of the fraction with affected markers is raised to 600X (that is, compensation is made with additional sequencing time and expense, rather than in the upfront Eureka portion). This approach has been used in other contexts, such as deep sequencing with RNA-Seq to aid detection of rare transcripts, so in a different context this would be the Eureka equivalent.
Example 13. Method for detecting target RNA without conversion to cDNA.
This example describes a method for detecting target RNA without conversion to cDNA. In one exemplary method, multiplex ligation-mediated PCR is performed on RNA targets using a new commercially available ligase (Splintr, available from NEB), which can ligate adjacent DNA probes hybridized to an RNA strand. This RNA-based multiplex ligation-mediated PCR can perform many assays without converting RNA to cDNA. The method described herein has the advantage of eliminating RNA-to-cDNA conversion bias. It has potential uses in interrogating RNA and is advantageous in the following applications: detection of strand-specific allele use, copy number determination of RNA and mRNA transcripts, analysis of alternative splicing and splice variants, and detection of fusions. The method described in this example is a direct extension of the DNA-based multiplex ligation-mediated PCR detection method described herein, but uses RNA rather than DNA as the target.
A set of 778 sites at known exon-exon boundaries in multiple human mRNA transcripts was selected. Probes were designed to interrogate these mRNA transcripts. A fusion usually joins the 5' end of one gene to the 3' end of another gene. The breakpoint in each gene occurs at different positions in the DNA, but is most commonly within an intron, so that the spliced RNA usually has the breakpoint at an exon boundary. Probes were designed to cover the ends of the exons flanking introns that contain known fusion breakpoints. As positive controls, probes were designed at the ends of exons flanking introns of the Beta Actin and GAPDH genes, which contain no known fusions. As negative controls, probes were designed at the ends of introns of the Beta Actin and GAPDH genes; these are amplified only in the presence of DNA.
Ligation-mediated PCR requires a pair of two DNA probes: a first complementary probe, and a second complementary probe phosphorylated at its 5' end. A DNA- or RNA-specific ligase can join the 3' OH group of the first complementary probe to the 5' phosphate group of the second complementary probe when the two DNA probes are hybridized to a target DNA or RNA template, respectively. The first complementary probe is designed to hybridize at an exon boundary, and the second complementary probe is designed to hybridize to the exon immediately adjacent to the exon to which the first complementary probe hybridizes. For example, if the first complementary probe is designed to hybridize to exon II, the corresponding second complementary probe is designed to hybridize to exon III. In this way, the pair of first and second complementary probes can detect only RNA transcripts containing a properly spliced exon II to exon III event. Other second complementary probes can be designed so that, for example, where the second complementary probe hybridizes to exon IV, the assay detects the exon II/exon IV splice variant. The DNA probes are between 20 and 50 bases long, so that the calculated annealing temperature is between 68 °C and 74 °C. Each first complementary probe has a common PCR primer site at its 5' end, and each second complementary probe has a different common PCR primer site at its 3' end. The common PCR primer sites support PCR amplification of the ligation products using sample-specific indexes. All probes were blended into a single mixture at a concentration of 50 pM in high-salt buffer (750 mM KCl, 30 mM Tris-HCl pH 8.5, 0.5 mM EDTA pH 8.0).
Commercially available human cell line RNA (HeLa), high-salt probe buffer, and an RNA protection reagent were mixed in a small 8 μL reaction in a well of a 384-well plate, which was sealed with heat-sealed foil. The hybridization reaction was heated in a PCR instrument to 95 °C and held for 1 minute, then cooled to 60 °C and held for 20 hours to promote hybridization. Before ligation, the hybridization reaction was cooled to 54 °C and then placed on wet ice.
A ligation mixture containing Splintr enzyme (units/reaction) and its 1X reaction buffer was dispensed at 32 μL per reaction and cooled to wet-ice temperature. 8 μL of the hybridization reaction was then added to the ligation mixture and mixed thoroughly. The whole mixture was first heated to 54 °C and held for 15 minutes, then heated to 92 °C and held for 15 seconds to de-hybridize any unligated probes, and cooled to 4 °C or frozen.
The PCR mixture contained, in standard PCR reaction buffer, the common PCR primer for the first complementary probe (carrying an Illumina sequencing flow cell binding sequence) and the common PCR primer for the second complementary probe (uniquely indexed with a sample index near the end that carries the other half of the Illumina sequencing flow cell binding sequence). The PCR primers in the mixture amplify any ligation product of a first complementary probe and a second complementary probe. The sample-indexed PCR reaction products were pooled and purified on a silica column to remove excess salt, enzyme, small probes, and primers. The pooled library was qualified and quantified to confirm that it met size requirements. The mark of a successful reaction is a shift in product size from 150 bp (noise artifact) to 210 bp (signal), the latter being the PCR amplification product of a successful ligation of the first and second complementary probes.
Sequencing of the PCR amplification products reveals which first and second complementary probes were joined, for example exon II/exon III, or possibly an exon II/exon IV splice variant. Duplicate reads can be counted (binned), and these counts can be used to infer the relative copy number of RNA transcripts.
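The counting (binning) step can be sketched as follows. The sketch assumes each read has already been resolved to the pair of exons targeted by its first and second complementary probes, and expresses each junction relative to a control junction. Both the input format and the use of a control junction for normalization are assumptions made for illustration, and the names are hypothetical.

```python
from collections import Counter

def count_junctions(read_probe_pairs):
    """Bin reads by (first probe exon, second probe exon) pair."""
    return Counter(read_probe_pairs)

def relative_abundance(junction_counts, control_junction):
    """Express each junction's read count relative to a control junction."""
    control = junction_counts[control_junction]
    if control == 0:
        raise ValueError("control junction has no reads")
    return {junction: n / control for junction, n in junction_counts.items()}
```

A ratio near zero for an expected junction, alongside reads for an alternative pairing such as exon II/exon IV, would point to a splice variant rather than the canonical transcript.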
To demonstrate that ligase-specific products were produced by the hybridization and ligation reactions, a set of 96 reactions was performed, and the total number of reads across all sites in each sample is shown (Fig. 19). Positive read counts show that ligase-specific products were generated from the samples analyzed. When ligase was omitted from the reaction, the reads detected were near zero (16 independent reactions in total) (Fig. 19, points in the ellipse at lower right). This data set excludes any first complementary probe not ligated to its partner second complementary probe, substantially eliminating the noise of spurious, aberrant first complementary probe ligation products. In addition, titration studies were performed in which the input ligase concentration (units/reaction) and the input RNA were titrated down to zero (Fig. 20 and Fig. 21). The total number of reads for the mRNA transcript of the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene (arbitrarily assigned as site 745 in the 778-site set) obtained in the ligase titration study shows that the ligation reaction depends on the input ligase concentration (Fig. 20). The total number of reads for the same GAPDH transcript (site 745) obtained in the input RNA titration study shows that the ligation reaction depends on the amount of input RNA (Fig. 21). In Fig. 21, a human DNA sample was also included to demonstrate that the splicing signal depends on RNA rather than DNA. In these studies, only the GAPDH gene transcript (site 745, exon VII) was examined by binned read counts. Fusion products are not expected to be present in this HeLa-based data set.
Although various examples and other information are provided above to explain aspects within the scope of the claims, no limitation of the claims should be implied based on particular features or arrangements in such examples, and one of ordinary skill can use these examples to derive various embodiments. Furthermore, although some subject matter may be described in language specific to examples of structural features, conditions, or uses, it is to be understood that the subject matter defined in the claims is not necessarily so limited.
Claims (32)
1. A method for determining the presence, absence, amount, or copy number of one or more target polynucleotides in two or more samples, the method comprising the steps of:
(a) providing two or more samples, each sample comprising one or more target polynucleotides, each target polynucleotide comprising a first target sequence and a second target sequence;
(b) providing a plurality of first complementary probes and second complementary probes, the plurality including a first complementary probe and a second complementary probe for each target polynucleotide, wherein (i) each first complementary probe has a sequence portion complementary to the first target sequence of the target polynucleotide and a sequence portion not complementary to the first target sequence, wherein the non-complementary portion comprises an inquiry site barcode sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence of the target polynucleotide and an adjacent sequence portion not complementary to the second target sequence;
(c) incubating the plurality of first complementary probes and second complementary probes with each sample under hybridization conditions, such that the first complementary probes and second complementary probes hybridize to their complementary target polynucleotides in the sample to form hybridization complexes;
(d) joining the first complementary probe and the second complementary probe hybridized to the first target sequence and the second target sequence of a target polynucleotide in the sample to form a product polynucleotide;
(e) pooling the product polynucleotides formed from the samples; and
(f) determining the presence, absence, amount, or copy number of each target polynucleotide in one or more of the samples by analyzing the product polynucleotides or their complementary strands.
2. The method according to claim 1, wherein the first target sequence and the second target sequence of each target polynucleotide are immediately adjacent to each other.
3. The method according to claim 1, wherein the first target sequence and the second target sequence of each target polynucleotide are separated by 1 to 500 nucleotides.
4. The method according to any one of claims 1-3, wherein the adjacent sequence portion of the second complementary probe comprises a universal sequence.
5. The method according to claim 4, wherein the universal sequence of the second complementary probe comprises a universal primer sequence complementary to a primer sequence, the primer sequence being usable to add one or more of (i) a sample index, (ii) an additional sequence, (iii) an additional sequence for generating sequence data or another form of detection, and (iv) another moiety.
6. The method according to any one of claims 1-5, wherein the adjacent universal sequence of the first complementary probe comprises a universal primer sequence complementary to a primer sequence, the primer sequence being usable to add one or more of (i) a sample index, (ii) an additional sequence, (iii) an additional sequence for generating sequence data or another form of detection, and (iv) another moiety.
7. The method according to claim 5 or claim 6, wherein the universal primer sequence comprises a PCR primer sequence and/or a primer sequence for adding the additional sequence for generating sequence data or another form of detection.
8. The method according to any one of claims 5-7, wherein the additional sequence for generating sequence data or another form of detection is an adapter for next-generation sequencing.
9. The method according to any one of claims 5-7, wherein the additional sequence for generating sequence data or another form of detection is a capture sequence, optionally wherein the capture sequence is for capture on a solid support.
10. The method according to any one of claims 5-9, wherein the universal primer sequence is effective to add a moiety that facilitates sequence generation.
11. The method according to any one of claims 5-10, wherein the sample index is 10, 11, 12, 13, 14, 15, or 16 nucleotides in length.
12. The method according to claim 11, wherein the sample index is 12 or 15 nucleotides in length.
13. The method according to any one of claims 4-12, wherein the universal sequences of the first complementary probe and the second complementary probe each comprise a primer sequence capable of hybridizing to a primer for sequence synthesis.
14. The method according to claim 13, wherein the primer sequence comprises a PCR primer sequence.
15. The method according to any one of claims 1-14, wherein the first complementary probe comprises the inquiry site barcode 5' of the sequence complementary to the first target sequence, and the sequence complementary to the first target sequence 3' of the inquiry site barcode.
16. The method according to any one of claims 1-15, wherein the first complementary probe comprises, from 5' to 3': the adjacent universal sequence, the inquiry site barcode of the sequence portion not complementary to the first target sequence, and the sequence portion complementary to the first target sequence.
17. The method according to any one of claims 1-16, wherein the non-complementary inquiry site barcode is 10, 11, 12, 13, 14, 15, or 16 nucleotides in length.
18. The method according to claim 17, wherein the inquiry site barcode is 12 or 15 nucleotides in length.
19. The method according to any one of claims 1-18, wherein the step preceding the incubation comprises heating at a temperature of 70 °C to 100 °C.
20. The method according to any one of claims 1-19, further comprising enriching the product polynucleotides before the compilation step.
21. The method according to claim 20, wherein the enrichment comprises: (a) providing a set of PCR primer sequences, the PCR primer sequences comprising a first primer complementary to the primer sequence in the first complementary probe and a second primer complementary to the PCR primer sequence on the second complementary probe, and (b) amplifying the product polynucleotides.
22. The method according to any one of claims 1-21, wherein the method is a solution-based method.
23. The method according to any one of claims 1-22, wherein the first complementary probe comprises an inosine, the inosine being separated from the 3' end of the probe by 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases.
24. The method according to any one of claims 1-23, wherein the second complementary probe comprises an inosine, the inosine being separated from the 5' end of the probe by 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases.
25. The method according to any one of claims 1-24, wherein the 3' end of the first complementary probe is complementary to one form of a single nucleotide polymorphism (SNP) or other genetic variation.
26. The method according to any one of claims 1-25, wherein the step of joining the first complementary probe and the second complementary probe comprises treating, with a ligase, the first complementary probe and the second complementary probe hybridized to the first target sequence and the second target sequence of the target polynucleotide (the hybridization complex) to form a product polynucleotide.
27. The method according to any one of claims 1-26, wherein the method is used in genotyping, wherein the method comprises providing one or more variants of the first complementary probe, wherein the variants differ in the identity of one or more nucleotides at the 3' end of the first complementary probe, and wherein the determination comprises quantifying the relative frequency of product polynucleotides, or their complementary strands, that comprise the sequence of one variant of the first complementary probe compared with those that comprise the sequences of the other variants of the first complementary probe, and correlating the frequency with a genotype.
28. The method according to any one of claims 1-26, wherein the method is used to determine copy number variation of the target polynucleotide, and wherein the determination comprises comparing the amount of signal produced by a product polynucleotide or its complementary strand with a known reference signal amount or with the amount of signal produced by another product polynucleotide or its complementary strand.
29. The method according to any one of claims 1-26, wherein the method is used to determine the presence of a target polynucleotide in expression analysis, wherein the target polynucleotide is an RNA transcript, and wherein the determination comprises comparing the amount of signal produced by a product polynucleotide or its complementary strand with a known reference signal amount or with the amount of signal produced by another product polynucleotide or its complementary strand.
30. The method according to any one of claims 1-26, wherein the method is used to genotype a polyploid sample, the method further comprising reducing the sequence data generated from non-informative polyploid genomes, which comprises obtaining genome sequence data for the sample that include target SNP/indel and proximal SNP/indel information, and designing the second complementary probe so that the stability of its hybridization to the target genome is disrupted by the proximal SNP/indel.
31. The method according to any one of claims 1-26, wherein the method is used to genotype a polyploid sample, the method further comprising reducing the sequence data generated from non-informative polyploid genomes, which comprises obtaining genome sequence data for the sample that include target SNP/indel and proximal SNP/indel information, designing the second complementary probe so that the stability of its hybridization to the target genome is disrupted by the proximal SNP/indel, and adding a blocking oligonucleotide complementary to the target genome having the proximal SNP/indel to further prevent hybridization of the second complementary probe to the target genome.
32. The method according to any one of claims 1-26, wherein the method is used to genotype a polyploid sample, the method further comprising reducing the sequence data generated from non-informative polyploid genomes, which comprises obtaining genome sequence data for the sample that include target SNP/indel and proximal SNP/indel information, and adding a preliminary PCR amplification step to select the unique genome to be studied.
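The quantification recited in claims 27 and 28 can be illustrated with a short Python sketch. It tallies product reads by probe variant, converts the tallies to relative frequencies, and compares two signal amounts as a ratio. This is only a sketch under stated assumptions, not the claimed method itself: the function names, the use of a 12-nucleotide barcode to tell the probe variants apart, the example barcode sequences, and the 0.2 calling threshold for a diploid sample are all hypothetical and are not taken from the claims.

```python
from collections import Counter


def variant_frequencies(reads, variant_barcodes):
    """Relative frequency of product reads assigned to each probe variant.

    Assumption: each variant of the first complementary probe carries a
    distinct barcode, and a read is assigned to the first variant whose
    barcode it contains.
    """
    counts = Counter()
    for read in reads:
        for allele, barcode in variant_barcodes.items():
            if barcode in read:
                counts[allele] += 1
                break  # assign each read to at most one variant
    total = sum(counts.values())
    return {allele: (counts[allele] / total if total else 0.0)
            for allele in variant_barcodes}


def call_genotype(freqs, min_fraction=0.2):
    """Map variant frequencies to a diploid genotype call.

    The threshold is a hypothetical value chosen for illustration only.
    Returns None when zero or more than two variants pass the threshold.
    """
    present = sorted(a for a, f in freqs.items() if f >= min_fraction)
    if len(present) == 1:
        return f"{present[0]}/{present[0]}"
    if len(present) == 2:
        return f"{present[0]}/{present[1]}"
    return None


def signal_ratio(target_signal, reference_signal):
    """Ratio of a product signal to a reference signal (copy number or expression)."""
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return target_signal / reference_signal


if __name__ == "__main__":
    # Hypothetical 12-nt barcodes marking two variants of the first probe.
    barcodes = {"A": "ACGTACGTACGT", "G": "TGCATGCATGCA"}
    reads = (["NN" + barcodes["A"] + "TT"] * 5
             + ["NN" + barcodes["G"] + "TT"] * 5)
    freqs = variant_frequencies(reads, barcodes)
    print(freqs, call_genotype(freqs))
    print(signal_ratio(200.0, 100.0))
```

A ratio near 1 against a reference of known copy number would suggest an unchanged copy number, while a real analysis would also need normalization and replicate handling that this sketch leaves out.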
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201662289303P | 2016-01-31 | 2016-01-31 | |
US62/289,303 | 2016-01-31 | ||
US201662317879P | 2016-04-04 | 2016-04-04 | |
US62/317,879 | 2016-04-04 | ||
US201662353088P | 2016-06-22 | 2016-06-22 | |
US62/353,088 | 2016-06-22 | ||
PCT/US2016/060991 WO2017044993A2 (en) | 2015-09-08 | 2016-11-08 | Nucleic acid analysis by joining barcoded polynucleotide probes |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108026568A true CN108026568A (en) | 2018-05-11 |
Family
ID=62083370
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201680052075.1A Pending CN108026568A (en) | 2016-01-31 | 2016-11-08 | Nucleic acid analysis by joining barcoded polynucleotide probes |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP3347497A4 (en) |
CN (1) | CN108026568A (en) |
WO (1) | WO2017044993A2 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110408717A (en) * | 2019-07-23 | 2019-11-05 | 四川省农业科学院生物技术核技术研究所 | The specific amplification primer of Ganoderma mitochondria rns gene and its application |
CN111100935A (en) * | 2018-10-26 | 2020-05-05 | 厦门大学 | Method for detecting drug-resistant gene of bacteria |
CN113518829A (en) * | 2018-12-31 | 2021-10-19 | Htg分子诊断有限公司 | Method for detecting DNA and RNA in same sample |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110832076B (en) * | 2017-06-27 | 2024-05-17 | 国立大学法人东京大学 | Probes and methods for detecting transcripts resulting from fusion gene and/or exon skipping |
CN112074610A (en) * | 2018-02-22 | 2020-12-11 | 10X基因组学有限公司 | Conjugation-mediated nucleic acid analysis |
US11639928B2 (en) | 2018-02-22 | 2023-05-02 | 10X Genomics, Inc. | Methods and systems for characterizing analytes from individual cells or cell populations |
CN116323969A (en) * | 2020-10-01 | 2023-06-23 | 谷歌有限责任公司 | Linked double bar code insertion construction |
EP4294945A1 (en) * | 2021-02-17 | 2023-12-27 | Act Genomics (IP) Limited | Dna fragment joining detecting method and kit thereof |
WO2022182682A1 (en) | 2021-02-23 | 2022-09-01 | 10X Genomics, Inc. | Probe-based analysis of nucleic acids and proteins |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101395280A (en) * | 2006-03-01 | 2009-03-25 | 凯津公司 | High throughput sequence-based detection of snps using ligation assays |
CN104830993A (en) * | 2015-06-08 | 2015-08-12 | 中国海洋大学 | High-throughput typing technique universal to various molecular markers |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5912124A (en) * | 1996-06-14 | 1999-06-15 | Sarnoff Corporation | Padlock probe detection |
WO2005001113A2 (en) * | 2003-06-27 | 2005-01-06 | Thomas Jefferson University | Methods for detecting nucleic acid variations |
US8808991B2 (en) * | 2003-09-02 | 2014-08-19 | Keygene N.V. | Ola-based methods for the detection of target nucleic avid sequences |
US7604937B2 (en) * | 2004-03-24 | 2009-10-20 | Applied Biosystems, Llc | Encoding and decoding reactions for determining target polynucleotides |
AU2006213907A1 (en) * | 2005-02-09 | 2006-08-17 | Stratagene California | Key probe compositions and methods for polynucleotide detection |
WO2013106807A1 (en) * | 2012-01-13 | 2013-07-18 | Curry John D | Scalable characterization of nucleic acids by parallel sequencing |
2016
- 2016-11-08 CN CN201680052075.1A patent/CN108026568A/en active Pending
- 2016-11-08 EP EP16845310.8A patent/EP3347497A4/en active Pending
- 2016-11-08 WO PCT/US2016/060991 patent/WO2017044993A2/en active Application Filing
Non-Patent Citations (4)
Title |
---|
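L. KVASTAD et al.: "Single cell analysis of cancer cells using an improved RT-MLPA method has potential for cancer diagnosis and monitoring", 《SCIENTIFIC REPORTS》 * |
姬艳丽 et al.: "Application of high-throughput MLPA genotyping in gene identification of a family with the Del phenotype", 《中国输血杂志》 * |
张莉 et al.: "Combined use of MLPA and sequencing to detect thalassemia gene defects", 《实用预防医学》 * |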
胡福泉: "《现代基因操作技术》", 31 October 2000, 人民军医出版社 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111100935A (en) * | 2018-10-26 | 2020-05-05 | 厦门大学 | Method for detecting drug-resistant gene of bacteria |
CN111100935B (en) * | 2018-10-26 | 2023-03-31 | 厦门大学 | Method for detecting drug resistance gene of bacteria |
CN113518829A (en) * | 2018-12-31 | 2021-10-19 | Htg分子诊断有限公司 | Method for detecting DNA and RNA in same sample |
CN110408717A (en) * | 2019-07-23 | 2019-11-05 | 四川省农业科学院生物技术核技术研究所 | The specific amplification primer of Ganoderma mitochondria rns gene and its application |
Also Published As
Publication number | Publication date |
---|---|
WO2017044993A2 (en) | 2017-03-16 |
EP3347497A2 (en) | 2018-07-18 |
EP3347497A4 (en) | 2019-01-23 |
WO2017044993A3 (en) | 2017-04-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108026568A (en) | Foranalysis of nucleic acids is carried out by the polynucleotide probes for combining bar shaped code labeling | |
US20220049296A1 (en) | Nucleic acid analysis by joining barcoded polynucleotide probes | |
ES2873850T3 (en) | Next Generation Sequencing Libraries | |
US20190024141A1 (en) | Direct Capture, Amplification and Sequencing of Target DNA Using Immobilized Primers | |
US8986958B2 (en) | Methods for generating target specific probes for solution based capture | |
EP2395098B1 (en) | Base specific cleavage of methylation-specific amplification products in combination with mass analysis | |
JP6925424B2 (en) | A method of increasing the throughput of a single molecule sequence by ligating short DNA fragments | |
US20110003301A1 (en) | Methods for detecting genetic variations in dna samples | |
US20120003657A1 (en) | Targeted sequencing library preparation by genomic dna circularization | |
EP3129505B1 (en) | Methods for clonal replication and amplification of nucleic acid molecules for genomic and therapeutic applications | |
CN105934523A (en) | Multiplex detection of nucleic acids | |
CN108611398A (en) | Genotyping is carried out by new-generation sequencing | |
CN107109401A (en) | It is enriched with using the polynucleotides of CRISPR cas systems | |
CN106574286A (en) | Selective amplification of nucleic acid sequences | |
JP6234463B2 (en) | Nucleic acid multiplex analysis method | |
US20220364169A1 (en) | Sequencing method for genomic rearrangement detection | |
CN107760772A (en) | For the method for nucleic acid match end sequencing, composition, system, instrument and kit | |
WO2017181670A1 (en) | Method for enriching target nucleic acid sequence from nucleic acid sample | |
US20200299764A1 (en) | System and method for transposase-mediated amplicon sequencing | |
US20070148636A1 (en) | Method, compositions and kits for preparation of nucleic acids | |
US10036053B2 (en) | Determination of variants produced upon replication or transcription of nucleic acid sequences | |
KR102237248B1 (en) | SNP marker set for individual identification and population genetic analysis of Pinus densiflora and their use | |
Ladas | Hybridization enrichment of subgenomic targets for next generation sequencing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||