CN108026568A - Nucleic acid analysis by joining barcoded polynucleotide probes - Google Patents



Publication number
CN108026568A
Authority
CN
China
Prior art keywords
sequence
complementary
target
probe
complementary probe
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201680052075.1A
Other languages
Chinese (zh)
Inventor
H·科史恩斯基
J·D·卡尔瑞
R·欧卡尔拉汉
A·麦克伊
D·菲兹帕特里克
P·H·迪金森
A·C·施韦兹尔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Affymetrix Inc
Original Assignee
Affymetrix Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Affymetrix Inc
Publication of CN108026568A
Legal status: Pending


Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844: Nucleic acid amplification reactions
    • C12Q1/6853: Nucleic acid amplification reactions using modified primers or templates

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Disclosed are compositions, methods, and kits for determining the presence, absence, amount, copy number, or other characteristics of one or more polynucleotide sequences in two or more samples, and the use of the compositions, methods, and kits in genotyping, assessment of copy number variation, expression analysis, determination of splice variants and fusions, and other genetic analyses.

Description

Nucleic acid analysis by joining barcoded polynucleotide probes
Related applications
This application claims priority to U.S. Provisional Patent Application No. 62/215,679, filed September 8, 2015; U.S. Provisional Patent Application No. 62/289,303, filed January 31, 2016; U.S. Provisional Patent Application No. 62/317,879, filed April 4, 2016; and U.S. Provisional Patent Application No. 62/353,088, filed June 22, 2016. These U.S. provisional patent applications are incorporated herein by reference for all purposes.
Technical field
The present invention relates to compositions, methods, and kits for performing nucleic acid analysis on two or more samples while preserving the identity of each sample.
Introduction
There is a need for genetic analysis methods that preserve the identity of each sample and support multiplexed applications such as genotyping, copy number analysis, expression analysis, epigenetic analysis, and determining the presence, absence, or amount of specific genes, SNPs, indels, transcripts, or loci. The present invention meets this need.
Summary of the invention
The present disclosure provides compositions, methods, and kits for performing nucleic acid analysis on target polynucleotides. The analysis can include determining the presence or absence of multiple target polynucleotides in two or more samples. In other aspects, the analysis can be used, in two or more samples, to genotype one or more alleles, analyze copy number variation, analyze epigenetic events (such as methylation), or analyze the expression of one or more RNA transcripts.
The method can include the following steps: providing two or more samples, each sample including one or more target polynucleotides, each target polynucleotide including a first target sequence and a second target sequence; providing a plurality of first complementary probes and second complementary probes, in which (i) each first complementary probe has a sequence portion complementary to the first target sequence and a sequence portion not complementary to the first target sequence, the non-complementary portion including an interrogation site barcode sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence and an adjacent sequence portion not complementary to the second target sequence; incubating the plurality of first and second complementary probes with each individual sample under hybridization conditions so that the first and second complementary probes hybridize to their complementary target polynucleotides in the sample to form hybridization complexes; joining the first and second complementary probes hybridized to the first and second target sequences in the sample to form product polynucleotides; collecting (pooling) the product polynucleotides formed from the individual samples; and determining the presence or absence of the target polynucleotides in one or more samples by analyzing the product polynucleotides or their complementary strands.
The first and second complementary probes can be complementary to the first and second target sequences, which can be immediately adjacent to each other or separated by 1 to 500 nucleotides.
The first complementary probe can have sequences complementary to the target sequence on both the 3' side and the 5' side of the interrogation site barcode. The adjacent universal sequence of the first complementary probe can be 5' of the complementary sequence portion, and the complementary sequence portion can be 5' of the non-complementary interrogation site barcode of the first complementary probe.
The non-complementary portions of the first and second complementary probes can include a universal sequence and can also include an additional sequence that effectively normalizes the length of the product polynucleotides in a given assay. The universal sequences of the first and second complementary probes can be the same or different.
The universal sequence can include a primer binding sequence complementary to a primer sequence, and the primer sequence can be used to add one or more of (i) a sample index, (ii) an additional sequence for generating sequence data or another form of detection (for example, an adapter for next-generation sequencing, or a capture probe or sequence for capture on a solid surface), and (iii) other moieties (for example, moieties useful for next-generation sequencing ("NGS")).
The primer sequence can include a PCR primer sequence.
The non-complementary interrogation site barcode and the sample index can each be 10, 11, 12, 13, 14, 15, or 16 nucleotides in length, for example 12 or 15 nucleotides. The interrogation site barcode can be selected from SEQ ID NO: 1 to SEQ ID NO: 384. The sample index barcode can be selected from SEQ ID NO: 1 to SEQ ID NO: 73536.
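To put these barcode lengths in perspective, the number of possible sequences at each length can be compared with the sizes of the two sequence listings. The following is a minimal sketch, not part of the disclosure: the function name `barcode_space` is hypothetical, and the calculation assumes that all four bases are allowed at every position with no design filtering.

```python
def barcode_space(length: int) -> int:
    """Number of distinct DNA sequences of a given length (4 bases per position)."""
    return 4 ** length

# Listing sizes as stated in the text.
SITE_BARCODES = 384       # interrogation site barcodes
SAMPLE_INDEXES = 73536    # sample index barcodes

for n in range(10, 17):
    space = barcode_space(n)
    # Fraction of the theoretical sequence space used by the sample index listing.
    print(n, space, SAMPLE_INDEXES / space)
```

Even at 10 nucleotides the theoretical space exceeds one million sequences, so both listings occupy only a small fraction of it, which leaves room to choose barcodes that differ from one another at several positions.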
When performing the method, the composition of first and second complementary probes can be heated to a temperature of 70 °C to 100 °C before the hybridization step. The product polynucleotides can be enriched before the pooling step, for example by PCR amplification of the product polynucleotides.
The compositions and methods can be solution-based, and each of the first and second complementary probes can include an inosine that is separated from the 3' end and the 5' end of the probe by 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases.
The present disclosure provides compositions, methods, and kits that can be used for genotyping, determining copy number variation, and/or determining the presence, absence, or amount of a specific target polynucleotide.
Additional features and advantages of the disclosure are described in the following specification. These and other features of the disclosure will become more fully apparent from the following specification, or can be learned by practicing the principles described herein.
Brief description of the drawings
Figs. 1A-E provide schematic diagrams of compositions and methods for nucleic acid analysis by joining barcoded polynucleotide probes. The figures are described in detail in Example 1.
Fig. 2 shows the results of a study on the effect of the placement of the interrogation site barcode in the first complementary probe. The figure shows cluster plots for multiple sites and two strategies for placing the interrogation barcode in the first complementary probe. The left plots (6mer) have a shorter interrogation site barcode (6 nucleotides) located between the first target sequence and the universal sequence. The right plots (12mer) have a longer interrogation site barcode (12 nucleotides) located within the first target sequence, so that the barcode is flanked on both sides by complementary sequence. Read counts for allele A (x-axis) and allele B (y-axis) are shown, with each point corresponding to a unique sample (96 identical samples were included in each treatment). AA animals lie along the x-axis, BB animals lie along the y-axis, and AB animals occupy the center of the axes. The figure shows that in many cases the genotype resolution is similar, while in other cases it is better under one arrangement or the other. As shown in Fig. 2A, a first complementary probe with a 12-mer interrogation site barcode that carries information on both the allele and the site and is flanked on both sides by complementary sequence gave clearer genotype clusters than a first complementary probe with a 6-mer interrogation site barcode that carries the allele and site information in separate sequences and is not flanked by complementary sequence. In the latter case, the interrogation site barcode is immediately adjacent to the target sequence and the universal sequence. As shown in Fig. 2B, the two groups produced similar genotype clusters. As shown in Fig. 2C, the 6-mer interrogation site barcode gave slightly clearer genotype clusters.
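The cluster plots described for Fig. 2 separate samples by their allele A and allele B read counts. A minimal sketch of how such counts could be turned into a genotype call is given below. The disclosure does not specify a calling algorithm, so the function name `call_genotype`, the allele fraction thresholds, and the minimum read count are all hypothetical choices made for illustration.

```python
def call_genotype(reads_a: int, reads_b: int, min_reads: int = 20,
                  low: float = 0.2, high: float = 0.8) -> str:
    """Call AA, AB, or BB from allele read counts using assumed thresholds."""
    total = reads_a + reads_b
    if total < min_reads:
        return "no call"          # too few reads to place the sample in a cluster
    frac_b = reads_b / total      # fraction of reads supporting allele B
    if frac_b <= low:
        return "AA"               # sample lies along the x-axis
    if frac_b >= high:
        return "BB"               # sample lies along the y-axis
    return "AB"                   # sample lies near the diagonal

for counts in [(950, 30), (20, 900), (480, 510), (3, 2)]:
    print(counts, call_genotype(*counts))
```

Fixed thresholds are the simplest option; tighter clusters, such as those described for the 12-mer barcode in Fig. 2A, would tolerate a wider range of threshold choices than diffuse ones.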
Fig. 3 shows the results of a study on using deoxyinosine to mitigate the effect of G:T mismatches on probe triplets in a genotyping assay embodiment. Probes severely affected by G:T mismatch were modified by placing deoxyinosine at the 2nd through 10th 3' positions of the affected form of the first complementary probe (none, iT2 to iT10). The sequence model (in which the LHS-T (first complementary probe) of the affected variant is shown in the 5' to 3' direction, and the target gDNA, or genomic DNA, is shown in the 3' to 5' direction) shows the ten 3'-most positions of the first complementary probe, including the mismatch to the G nucleotide in the genomic DNA sequence. The 2nd 3' position shown (i) corresponds to "iT2". The underlined portion of the gDNA sequence is the portion to which the second complementary probe hybridizes. Solid gray bars are homozygous GG samples, and striped bars are homozygous AA samples. The y-axis is a logarithmic scale of the read counts associated with the T form of the first complementary probe. Gray bars (homozygous GG samples) represent non-specific ligation caused by the stability of the G:T mismatch. Striped bars (homozygous AA samples) represent specific ligation. The results show that placing deoxyinosine at the 2nd or 3rd 3' position of the modified first complementary probe significantly reduces the read counts from non-specific ligation. Similarly, deoxyinosine can be used in a first complementary probe that has a 3' G and the potential for a G:T mismatch.
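The iT2 to iT10 probe variants described for Fig. 3 can be enumerated programmatically. The sketch below assumes that deoxyinosine replaces the base at the chosen position rather than being inserted, and writes it as the letter `I`; the function name `place_inosine` and the 10-nucleotide example sequence are hypothetical and are not taken from the disclosure.

```python
def place_inosine(probe: str, pos_from_3prime: int) -> str:
    """Replace the base at the given 3' position (1 = 3'-terminal base) with 'I'."""
    if not 1 <= pos_from_3prime <= len(probe):
        raise ValueError("position lies outside the probe")
    i = len(probe) - pos_from_3prime          # convert to a 0-based index
    return probe[:i] + "I" + probe[i + 1:]

probe_3prime_end = "ACGTACGTAC"               # hypothetical ten 3'-most bases, 5' to 3'
variants = {f"iT{n}": place_inosine(probe_3prime_end, n) for n in range(2, 11)}
for name, seq in variants.items():
    print(name, seq)
```

In this naming, `iT2` modifies the base immediately 5' of the 3'-terminal base, which is one of the two placements the study reports as most effective at reducing non-specific ligation.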
Fig. 4 shows the results of a study in which a small amount of target DNA was detected in background (noise) genomic DNA. Fig. 4A shows the average relative concordance for the two best sites in each treatment, with the numbers of signal and noise genomes (top) and the ng of signal and noise genome in each reaction (bottom). The results show that as the number of signal genomes decreases, the relative concordance at the two sites remains high. Even for 122 input signal genomes in a background equivalent to 250,000 noise genomes, the average relative concordance was 100%. This corresponds to detection at a contamination level of 0.05% signal genome in a background of noise genome of equivalent size. Fig. 4B shows the average read count associated with a single site in each treatment, with the numbers of signal and noise genomes (top) and the ng of signal and noise genome in each reaction (bottom). As the number of signal genomes decreases, the read count associated with the single site also decreases, largely independently of the amount of noise DNA present in the reaction.
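The 0.05% figure quoted for Fig. 4 can be checked with simple arithmetic. The sketch below reads the value 122 as a count of signal genomes, which is an assumption based on its consistency with the stated percentage; the function name `contamination_percent` is hypothetical.

```python
def contamination_percent(signal_genomes: int, noise_genomes: int) -> float:
    """Signal genomes expressed as a percentage of the noise genome background."""
    return 100.0 * signal_genomes / noise_genomes

level = contamination_percent(122, 250_000)
print(round(level, 4), round(level, 2))
```

Under this reading the ratio is 0.0488%, which rounds to the 0.05% stated in the text.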
Fig. 5 shows the results of a study in which the nucleic acid sample was or was not heated before performing a genotyping assay embodiment. Cluster plots are shown for a single site with reversible denaturation present in the workflow (heated; Fig. 5A) or absent (not heated; Fig. 5B). Read counts for allele A (x-axis) and allele B (y-axis) are shown, with each point corresponding to a unique sample (96 identical samples were included in each treatment). AA animals lie along the x-axis, BB animals lie along the y-axis, and AB animals occupy the center of the axes. The plot for the reaction with reversible denaturation (heated; Fig. 5A) shows three easily distinguished genotype clusters. The plot for the reaction without reversible denaturation (not heated; Fig. 5B) does not show three easily distinguished genotype clusters.
Fig. 6 uses cluster plots of a single site and four probe set storage treatments to illustrate the effect of various storage methods on the reaction results of a genotyping assay embodiment. From left to right are the plots for freshly prepared (Fig. 6A), frozen (Fig. 6B), dried (Fig. 6C), and trehalose-dried (Fig. 6D) probe sets. Read counts for allele A (x-axis) and allele B (y-axis) are shown, with each point corresponding to a unique sample (96 identical samples were included in each treatment). AA animals lie along the x-axis, BB animals lie along the y-axis, and AB animals occupy the center of the axes. The plots obtained with fresh, frozen, and trehalose-dried probe sets are similar, but the plot obtained with probe sets dried without trehalose shows lower resolution of the three genotypes.
Fig. 7 shows the use of a copy number analysis embodiment to determine copy number variation (CNV). Fig. 7A shows, on the x-axis, the read counts associated with the interrogation site barcode for the A allele of the target site for individual samples. Circles = BB samples, triangles = AB samples, squares = AA samples. Fig. 7B shows the mean read counts (bars) and standard deviations (whiskers) for the BB, AB, and AA samples. Fig. 7C shows that the copy number of the A allele is 0, 1, or 2.
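The step from read counts (Fig. 7A and 7B) to copy numbers of 0, 1, or 2 (Fig. 7C) can be sketched as a normalization against samples known to carry two copies. The disclosure does not give the calculation, so the following is only one possible approach: it assumes that read counts scale linearly with copy number, and the function name `estimate_copy_number` and all read counts are hypothetical.

```python
from statistics import mean

def estimate_copy_number(reads_a: float, two_copy_reads: list[float]) -> int:
    """Estimate allele A copy number relative to reference samples with two copies."""
    reference = mean(two_copy_reads)          # mean reads in known AA (two-copy) samples
    if reference <= 0:
        raise ValueError("reference samples must have reads")
    return round(2 * reads_a / reference)     # linear scaling, rounded to an integer

aa_reference = [1000, 980, 1020]              # hypothetical AA sample read counts
for reads in (0, 510, 990):
    print(reads, estimate_copy_number(reads, aa_reference))
```

The standard deviations described for Fig. 7B indicate how far an individual sample can drift from its group mean, which in turn determines whether rounding to the nearest integer is reliable.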
Fig. 8 shows the use of a tetraploid genotyping embodiment in the detection and genotyping of tetraploid genomic DNA. The figure shows a cluster plot of a single site in simulated tetraploid genomic DNA samples. Read counts for allele A (x-axis), allele B (y-axis), and allele C (z-axis) are shown, with each point corresponding to a unique sample. In this case, allele A is a C base and allele B is a T base. Homozygous animals with the TTTT (solid circles) or CCCC (solid squares) genotype are plotted along the y-axis or x-axis, respectively. Heterozygous animals are shown as open squares, solid triangles, and open diamonds.
Fig. 9 shows the use of a genotyping embodiment to interrogate a multi-allelic site. The figure shows a cluster plot of a single multi-allelic site. The three alleles are substitutions. Read counts for allele A (x-axis), allele B (y-axis), and allele C (z-axis) are shown, with each point corresponding to a unique sample. In this case, allele A is a G base, allele B is a T base, and allele C is a C base. AA animals lie along the x-axis, BB animals lie along the y-axis, and CC animals lie along the z-axis. Heterozygous animals (TC, TG, CG) lie between any two axes.
Fig. 10 shows the results of a study using an embodiment for interrogating deletions to test for the presence or absence of a specific sequence in a nucleic acid sample. Cluster plots are shown for a site with a three-base deletion (Fig. 10A) and a site with a 45 kb deletion (Fig. 10B). Read counts for allele A (x-axis) and allele B (y-axis) are shown, with each point corresponding to a unique sample (96 identical samples were included in each treatment). AA animals lie along the x-axis, BB animals lie along the y-axis, and AB animals occupy the center of the axes. The resolution of the cluster plots for sites with deletions is similar to that of cluster plots for sites with single-base substitutions.
Fig. 11A is a schematic diagram showing an exemplary sequence with self-complementarity indicated by lines.
Fig. 11B is a schematic diagram of the same sequence shown in Fig. 11A, with the variable barcode region indicated by a box.
Fig. 12A is a schematic diagram showing a variant in which 7 base pairs (bp) at the 3' end are complementary within the index (7+0+1), with some additional matches that stabilize the dimer.
Fig. 12B is a schematic diagram showing 7 base pairs (bp) of partial complementarity at the 3' end within the index (1+0+7). In this example, the 0 is a GT pairing, which can make the match equivalent to 9 base pairs (bp).
Fig. 13 is a schematic diagram showing a destabilizing site (proximal SNP) and a tag site (target SNP) and their relative positions in a polyploid target genome. The destabilizing site can be on either side of the tag/target SNP. Open arrows point to the corresponding sites in the target genome.
Fig. 14 is a schematic diagram showing a destabilizing site (proximal SNP) and a tag site (target SNP) and their relative positions in a polyploid target genome. The destabilizing site can be on either side of the tag/target SNP. Fig. 14A shows the situation in which both the destabilizing site and the tag site are SNPs. Fig. 14B shows the situation in which the destabilizing site is an insertion and the tag site is a SNP. Fig. 14C shows the situation in which the destabilizing site is a deletion and the tag site is a SNP.
Fig. 15 is a schematic diagram showing the probes used in a genotyping method for detecting the presence or absence of a target polynucleotide in a polyploid sample. Fig. 15A shows the situation in which no proximal SNP is present in the target DNA. The LHS and RHS hybridize to the target DNA, and the LHS is ligated to the RHS (the cloud represents ligation). Fig. 15B shows the situation in which a proximal SNP is present in the target DNA (the arrow points to the cross). The proximal SNP destabilizes hybridization between the RHS and the target DNA, and no ligation occurs.
Fig. 16 is a schematic diagram showing the probes used in a genotyping method for detecting the presence or absence of a target polynucleotide in a polyploid sample. Fig. 16A shows the situation in which no proximal SNP is present in the target DNA. The LHS and RHS hybridize to the target DNA, and the LHS is ligated to the RHS (the cloud represents ligation). Fig. 16B shows the situation in which a proximal SNP is present (the arrow points to the cross). A blocking oligonucleotide complementary to the target DNA carrying the proximal SNP further prevents hybridization between the RHS and the target DNA, and no ligation occurs.
Fig. 17 is a schematic diagram showing the probes used in a genotyping method for detecting the presence or absence of a target polynucleotide in a polyploid sample. Fig. 17A shows the situation in which no proximal SNP is present in the target DNA. A preliminary PCR amplification step is added, in which PCR primers designed using knowledge of the relative positions of the proximal SNP and the target/tag SNP amplify only the unique genome or subgenome of interest. The LHS and RHS then hybridize to the PCR amplicon of the target DNA, and the LHS and RHS are ligated (the cloud represents ligation). Fig. 17B shows the situation in which a proximal SNP is present in the target DNA (the arrow points to the cross). The proximal SNP in the target DNA prevents the preliminary PCR amplification by interfering with binding of the PCR primer to the target DNA.
Fig. 18 shows the effect of a preliminary PCR amplification step on sequence reads. Read counts for allele A (x-axis) and allele B (y-axis) are shown, with each point corresponding to a unique sample. Fig. 18A shows the cluster plot obtained on genomic DNA without the preliminary PCR amplification step. Fig. 18B shows the cluster plot obtained on PCR amplicons, with the preliminary PCR amplification step included. The enrichment PCR amplification step improves the resolution of the site cluster plot.
Fig. 19 shows results demonstrating that SplintR ligase can ligate probes hybridized to mRNA transcripts from a human HeLa cell line. The total reads (across all sites) for each sample are shown. When SplintR ligase was omitted from the reaction, the detected reads were close to zero (16 independent reactions in total). This data set excludes any first complementary probe that lacks a partner second complementary probe ligated to it, which substantially eliminates noise from spurious first complementary probe ligation products.
Fig. 20 shows the total reads for the mRNA transcript of the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene (arbitrarily designated site 745 in a 778-site panel) obtained in a titration of SplintR ligase (milliliters of SplintR enzyme stock [25 units/μL] per 500 mL of ligation mixture). As the SplintR ligase unit concentration decreases, the binned reads obtained in replicate assays also decrease. With zero units of RNA ligase, there are no binned reads for the site. The SplintR ligase reaction depends on the concentration of SplintR ligase.
Fig. 21 shows the total reads for the mRNA transcript of the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene (arbitrarily designated site 745 in a 778-site panel) obtained in a titration of input RNA and human genomic DNA. As the RNA concentration in the reaction decreases, the binned reads for site 745 also decrease. With zero input RNA but DNA still present, a small binned signal is detected. With neither input RNA nor input DNA, the binned signal is close to zero. The SplintR ligase reaction depends on RNA but has trace reactivity with DNA.
Detailed description
The present disclosure provides compositions, methods, and kits that include a plurality of first complementary probes and second complementary probes. Each first complementary probe can include a sequence complementary to a first target sequence of interest. Each second complementary probe can include a sequence complementary to a second target sequence of interest. When the first and second complementary probes hybridize to the complementary first and second target sequences, the first and second probes can be joined to form a product polynucleotide.
The present disclosure also provides a plurality of samples, each of which can include one or more target sequences. Some samples include multiple target sequences, and some samples include no target sequence.
The present disclosure provides compositions, methods, and kits that can be used to determine the presence, absence, genotype, amount, or copy number of at least one target polynucleotide in one or more samples.
The present disclosure provides a method for determining the presence, absence, amount, or copy number of one or more target polynucleotides in a sample, the method including the following steps: (a) providing a sample including one or more target polynucleotides, each target polynucleotide including a first target sequence and a second target sequence; (b) providing a plurality of first complementary probes and second complementary probes, including a first complementary probe and a second complementary probe for each target polynucleotide, in which (i) each first complementary probe has a sequence portion complementary to the first target sequence of the target polynucleotide and a sequence portion not complementary to the first target sequence, the non-complementary portion including an interrogation site barcode sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence of the target polynucleotide and an adjacent sequence portion not complementary to the second target sequence; (c) incubating the plurality of first and second complementary probes with the sample under hybridization conditions so that the first and second complementary probes hybridize to their complementary target polynucleotides in the sample to form hybridization complexes; (d) joining the first and second complementary probes hybridized to the first and second target sequences of the target polynucleotides in the sample to form product polynucleotides; and (e) determining the presence, absence, amount, or copy number of each target polynucleotide in the sample by analyzing the product polynucleotides or their complementary strands.
The present disclosure provides a method for determining the presence, absence, amount, or copy number of one or more target polynucleotides in two or more samples, the method including the following steps: (a) providing two or more samples, each sample including one or more target polynucleotides, each target polynucleotide including a first target sequence and a second target sequence; (b) providing a plurality of first complementary probes and second complementary probes, including a first complementary probe and a second complementary probe for each target polynucleotide, in which (i) each first complementary probe has a sequence portion complementary to the first target sequence of the target polynucleotide and a sequence portion not complementary to the first target sequence, the non-complementary portion including an interrogation site barcode sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence of the target polynucleotide and an adjacent sequence portion not complementary to the second target sequence; (c) incubating the plurality of first and second complementary probes with each sample under hybridization conditions so that the first and second complementary probes hybridize to their complementary target polynucleotides in the sample to form hybridization complexes; (d) joining the first and second complementary probes hybridized to the first and second target sequences of the target polynucleotides in the sample to form product polynucleotides; (e) collecting (pooling) the product polynucleotides formed from the samples; and (f) determining the presence, absence, amount, or copy number of each target polynucleotide in one or more samples by analyzing the product polynucleotides or their complementary strands.
The first and second target sequences of each target polynucleotide can be immediately adjacent to each other. Alternatively, the first and second target sequences of each target polynucleotide can be separated by 1 to 500 nucleotides. For example, the first and second target sequences of each target polynucleotide can be separated by at least 1, at least 2, at least 3, at least 4, at least 5, or at least 10 nucleotides, or by 2 to 10, 5 to 15, 7 to 15, 10 to 12, 15 to 25, 25 to 40, 30 to 45, 40 to 16, 60 to 65, 60 to 75, 70 to 85, 80 to 95, 90 to 120, 110 to 150, 120 to 160, 130 to 170, 150 to 190, 170 to 210, 190 to 230, 200 to 230, 220 to 260, 230 to 270, 240 to 310, 300 to 340, 330 to 370, 360 to 400, 390 to 430, 410 to 450, 440 to 480, or 470 to 500 nucleotides.
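When designing probe pairs, the spacing between the two target sequences can be computed from their coordinates and checked against the range given above. The following is a minimal sketch; the function names `target_gap` and `spacing_class` are hypothetical, and the coordinates are assumed to be 0-based and half-open on a single strand of the target polynucleotide.

```python
def target_gap(upstream_end: int, downstream_start: int) -> int:
    """Nucleotides between two target sequences (0-based, half-open coordinates)."""
    gap = downstream_start - upstream_end
    if gap < 0:
        raise ValueError("target sequences overlap")
    return gap

def spacing_class(gap: int) -> str:
    """Classify the gap against the spacing options described in the text."""
    if gap == 0:
        return "immediately adjacent"
    if 1 <= gap <= 500:
        return "separated by 1 to 500 nucleotides"
    return "outside the described range"

# Hypothetical coordinates: one target ends at 120, the other starts at 120, 135, or 700.
for start in (120, 135, 700):
    print(start, spacing_class(target_gap(120, start)))
```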
Second complementary probe may include universal sequence close to Sequence.The general sequence of second complementary probe Row may include universal primer sequence with primer sequence complementation, and the primer sequence is attached available for (i) sample index, (ii) is increased Sequence, (iii) is added to be used for one in the appended sequence and (iv) another part of formation sequence data or another form of detection Person or more persons.
The adjacent universal sequence of first complementary probe may include the universal primer sequence with primer sequence complementation, described Primer sequence can be used for increasing (i) sample index, (ii) appended sequence, (iii) for formation sequence data or another form Detection appended sequence and one or more of (iv) another part.
The universal primer sequence may comprise a PCR primer sequence and/or a primer sequence for adding an additional sequence used for generating sequence data or another form of detection. The additional sequence for generating sequence data or another form of detection may be an adapter for next-generation sequencing. The additional sequence for generating sequence data or another form of detection may be a capture sequence, optionally wherein the capture sequence is used for capture on a solid support. The universal primer sequence can effectively add a moiety that facilitates the generation of sequence.
The sample index may be at least 10, 11, 12, 13, 14, 15 or 16 nucleotides in length. Preferably, the sample index is 12 to 15 nucleotides in length. The sample index sequence may be selected from SEQ ID NO:1 to SEQ ID NO:73536.
The universal sequences of the first complementary probe and the second complementary probe may each comprise a primer sequence that can hybridize to a primer used for synthesizing sequence. The primer sequence may comprise a PCR primer sequence.
The first complementary probe may comprise, from 5' to 3': the adjacent universal sequence, the sequence portion complementary to the first target sequence, and the inquiry site barcode within the sequence portion complementary to the first target sequence.
The first complementary probe comprises a sequence complementary to the first target sequence 5' to the inquiry site barcode and a sequence complementary to the first target sequence 3' to the inquiry site barcode.
The first complementary probe may comprise sequence complementary to the first target sequence both 3' and 5' of the inquiry site barcode.
The second complementary probe may comprise, from 5' to 3': the sequence portion complementary to the second target sequence of the target polynucleotide and the immediately adjacent sequence portion not complementary to the second target sequence.
The inquiry site barcode may be at least 10, 11, 12, 13, 14, 15 or 16 nucleotides in length. Preferably, the inquiry site barcode is 12 or 15 nucleotides in length. The inquiry site barcode may be selected from SEQ ID NO:1 to SEQ ID NO:384.
The method may comprise a step before the incubation (or hybridization) step, the step comprising reversibly denaturing the target polynucleotides. This step can be performed by heating as described herein.
The method includes an additional step of enriching the product polynucleotides. The enriching step may comprise: (a) providing a set of PCR primer sequences comprising a first primer complementary to the primer sequence in the first complementary probe and a second primer complementary to the PCR primer sequence in the second complementary probe, and (b) amplifying the product polynucleotides.
The method may be a solution-based method.
The first complementary probe may comprise an inosine (for example, deoxyinosine) located 2, 3, 4, 5, 6, 7, 8, 9, 10 or more bases from the 3' end of the probe.
The second complementary probe may comprise an inosine (for example, deoxyinosine) located 2, 3, 4, 5, 6, 7, 8, 9, 10 or more bases from the 5' end of the probe.
The 3' end of the first complementary probe may be complementary to one form of a single nucleotide polymorphism (SNP) or other genetic variation.
The step of joining the first complementary probe and the second complementary probe may comprise treating with a ligase the first complementary probe and second complementary probe hybridized to the first target sequence and second target sequence of the target polynucleotide (the hybridization complex) to form the product polynucleotide.
The disclosed method can be used in genotyping, wherein the method comprises providing one or more variants of the first complementary probe, the variants differing in the identity of one or more nucleotides at the 3' end of the first complementary probe, and wherein the determining comprises quantifying the relative frequency of product polynucleotides or their complementary strands that comprise the sequence of the one or more variants of the first complementary probe, compared with those that comprise the sequence of the other variants of the first complementary probe, and correlating the frequency with a genotype.
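As an illustration of correlating relative frequency with genotype, the sketch below calls a biallelic genotype from product counts for two probe variants. It is an assumption-laden example rather than the disclosed procedure: the allele labels "A" and "B", the function name `call_genotype` and the thresholds `min_total`, `het_low` and `het_high` are hypothetical and would need to be tuned for a real assay.

```python
def call_genotype(variant_counts, min_total=20, het_low=0.3, het_high=0.7):
    """Call a biallelic genotype from product counts of two probe variants.

    variant_counts -- dict with product counts for the variants labelled
                      "A" and "B" (hypothetical labels)
    Returns (call, frequency_of_A); frequency is None for a no-call.
    """
    a = variant_counts.get("A", 0)
    b = variant_counts.get("B", 0)
    total = a + b
    # Too few products to estimate a relative frequency reliably.
    if total < min_total:
        return "no_call", None
    freq_a = a / total
    if freq_a >= het_high:
        return "AA", freq_a
    if freq_a <= het_low:
        return "BB", freq_a
    return "AB", freq_a
```

For example, counts of 48 products for variant A and 52 for variant B give a relative frequency of 0.48 for A, which falls between the two thresholds and is called heterozygous.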
The disclosed method can be used to determine copy number variation of a target polynucleotide, wherein the determining comprises comparing the amount of signal produced by a product polynucleotide or its complementary strand with a known reference signal amount or with the amount of signal produced by another product polynucleotide or its complementary strand.
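The comparison against a reference signal can be sketched as a simple ratio. This is only an illustration under stated assumptions: signal is taken to scale linearly with copy number, the reference is assumed to be present at a known number of copies (two by default), and the names `estimate_copy_number`, `classify_copy_number` and the `tolerance` value are hypothetical.

```python
def estimate_copy_number(target_signal, reference_signal, reference_copies=2):
    """Estimate copy number by scaling the target/reference signal ratio.

    Assumes signal is proportional to copy number and that the reference
    is present at `reference_copies` copies.
    """
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return reference_copies * target_signal / reference_signal

def classify_copy_number(estimate, reference_copies=2, tolerance=0.5):
    """Label an estimate as a gain, a loss or neutral (hypothetical tolerance)."""
    if estimate > reference_copies + tolerance:
        return "gain"
    if estimate < reference_copies - tolerance:
        return "loss"
    return "neutral"
```

The reference signal can equally come from another product polynucleotide measured in the same sample, which is the second option the paragraph above describes.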
The disclosed method can be used to determine the presence of a target polynucleotide in expression analysis, wherein the target polynucleotide is an RNA transcript, and wherein the determining comprises comparing the amount of signal produced by a product polynucleotide or its complementary strand with a known reference signal amount or with the amount of signal produced by another product polynucleotide or its complementary strand.
The disclosed method can be used to genotype a polyploid sample, the method further comprising reducing the generation of sequence data from uninformative polyploid genomes, which comprises obtaining sample genomic sequence data containing information on the target SNP/indel and a proximal SNP/indel, and designing the second complementary probe so that the stability of its hybridization to the target genome is disrupted by the proximal SNP/indel.
The disclosed method can be used to genotype a polyploid sample, the method further comprising reducing the generation of sequence data from uninformative polyploid genomes, which comprises obtaining sample genomic sequence data containing information on the target SNP/indel and a proximal SNP/indel, designing the second complementary probe so that the stability of its hybridization to the target genome is disrupted by the proximal SNP/indel, and adding a blocking oligonucleotide complementary to the target genome having the proximal SNP/indel to further prevent hybridization of the second complementary probe to that target genome.
The disclosed method can be used to genotype a polyploid sample, the method further comprising reducing the generation of sequence data from uninformative polyploid genomes, which comprises obtaining sample genomic sequence data containing information on the target SNP/indel and a proximal SNP/indel, and adding a preliminary PCR amplification step to select the unique genome of interest.
The present disclosure provides a composition for determining the presence, absence, amount or copy number of one or more target polynucleotides in a sample, comprising a plurality of first complementary probes and second complementary probes, the plurality including a first complementary probe and a second complementary probe for each target polynucleotide, wherein (i) each first complementary probe has a sequence portion complementary to the first target sequence of the target polynucleotide and a sequence portion not complementary to the first target sequence, the non-complementary portion comprising an inquiry site barcode sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence of the target polynucleotide and an immediately adjacent sequence portion not complementary to the second target sequence.
The present disclosure provides a composition for determining the presence, absence, amount or copy number of one or more target polynucleotides in a sample, comprising a plurality of first complementary probes and second complementary probes, the plurality including a first complementary probe and a second complementary probe for each target polynucleotide, wherein (i) each first complementary probe has two sequence portions complementary to different parts of the first target sequence of the target polynucleotide and two sequence portions not complementary to the first target sequence, the non-complementary portions comprising an inquiry site barcode sequence and a universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence of the target polynucleotide and an immediately adjacent sequence portion that is not complementary to the second target sequence and comprises a universal sequence.
The first target sequence and the second target sequence of each target polynucleotide may be immediately adjacent to each other. Alternatively, the first target sequence and the second target sequence of each target polynucleotide may be separated by 1 to 500 nucleotides.
The universal sequence of the first complementary probe may comprise a universal primer sequence complementary to a primer sequence, and the primer sequence can add one or more of (i) a sample index, (ii) an additional sequence, (iii) an additional sequence for generating sequence data or another form of detection, and (iv) another moiety.
The universal sequence of the second complementary probe may comprise a universal primer sequence complementary to a primer sequence, and the primer sequence can add one or more of (i) a sample index, (ii) an additional sequence, (iii) an additional sequence for generating sequence data or another form of detection, and (iv) another moiety.
The universal primer sequence of the first complementary probe and/or the second complementary probe may comprise a PCR primer sequence and/or a primer sequence for adding an additional sequence that facilitates generating sequence data or another form of detection. The additional sequence for generating sequence data or another form of detection may be an adapter for next-generation sequencing. The additional sequence for generating sequence data or another form of detection may be a capture sequence, optionally wherein the capture sequence is used for capture on a solid support. The universal primer sequence can effectively add a moiety that facilitates the generation of sequence. The universal primer sequence may comprise a primer sequence that provides for adding a sample index.
The sample index may be at least 10, 11, 12, 13, 14, 15 or 16 nucleotides in length. Preferably, the sample index is 12 to 15 nucleotides in length. The sample index sequence may be selected from SEQ ID NO:1 to SEQ ID NO:73536.
The universal sequences of the first complementary probe and the second complementary probe may each comprise a primer sequence that can hybridize to a primer used for synthesizing sequence. The primer sequence may comprise a PCR primer sequence.
The first complementary probe may comprise, from 5' to 3': the adjacent universal sequence, the sequence portion complementary to the first target sequence, and the inquiry site barcode within the sequence portion complementary to the first target sequence.
The first complementary probe comprises a sequence complementary to the first target sequence 5' to the inquiry site barcode and a sequence complementary to the first target sequence 3' to the inquiry site barcode.
The first complementary probe may comprise sequence complementary to the first target sequence both 3' and 5' of the inquiry site barcode.
The second complementary probe may comprise, from 5' to 3': the sequence portion complementary to the second target sequence of the target polynucleotide and the immediately adjacent sequence portion not complementary to the second target sequence.
The inquiry site barcode may be at least 10, 11, 12, 13, 14, 15 or 16 nucleotides in length. Preferably, the inquiry site barcode is 12 or 15 nucleotides in length. The inquiry site barcode may be selected from SEQ ID NO:1 to SEQ ID NO:384.
The first complementary probe may comprise an inosine (for example, deoxyinosine) located 2, 3, 4, 5, 6, 7, 8, 9, 10 or more bases from the 3' end of the probe.
The second complementary probe may comprise an inosine (for example, deoxyinosine) located 2, 3, 4, 5, 6, 7, 8, 9, 10 or more bases from the 5' end of the probe.
The 3' end of the first complementary probe may be complementary to one form of a single nucleotide polymorphism (SNP) or other genetic variation.
The present disclosure provides a kit for determining the presence, absence, amount, copy number or identity of one or more target polynucleotides in a sample, the kit comprising: (a) a plurality of first complementary probes and second complementary probes as disclosed herein; and (b) optionally, buffers and enzymes for ligation and enrichment.
The present disclosure provides a kit for determining the presence, absence, amount, copy number or identity of one or more target polynucleotides in a sample, the kit comprising: (a) a plurality of first complementary probes and second complementary probes, the plurality including a first complementary probe and a second complementary probe for each target polynucleotide, wherein (i) each first complementary probe has a sequence portion complementary to the first target sequence of the target polynucleotide and a sequence portion not complementary to the first target sequence, the non-complementary portion comprising an inquiry site barcode sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence of the target polynucleotide and an immediately adjacent sequence portion not complementary to the second target sequence; and (b) optionally, buffers and enzymes for ligation and enrichment.
The present disclosure provides a kit for determining the presence, absence, amount, copy number or identity of one or more target polynucleotides in a sample, the kit comprising: (a) a plurality of first complementary probes and second complementary probes, the plurality including a first complementary probe and a second complementary probe for each target polynucleotide, wherein (i) each first complementary probe has two sequence portions complementary to different parts of the first target sequence of the target polynucleotide and two sequence portions not complementary to the first target sequence, the non-complementary portions comprising an inquiry site barcode sequence and a universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence of the target polynucleotide and an immediately adjacent sequence portion not complementary to the second target sequence; and (b) optionally, buffers and enzymes for ligation and enrichment.
The kit may also comprise at least one PCR primer, a polymerase and/or a set of dNTPs for amplifying the extended target polynucleotide for the purpose of enrichment.
The kit may also comprise a ligase.
The kit may also comprise software for interpreting the data.
The kit may be used to determine genotype, and/or to determine copy number, and/or to determine the expression of RNA transcripts.
Definitions
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art.
It should be understood that the disclosure in this specification includes all possible combinations of such particular features. For example, where a particular feature is disclosed in the context of a particular aspect or embodiment, that feature can generally also be used in combination with other particular aspects and embodiments of the invention, except where the context excludes that possibility. The invention disclosed herein includes embodiments that are not explicitly described, and can for example make use of features that are not specifically disclosed herein but that provide functions the same as, equivalent to or similar to features explicitly disclosed herein.
In describing and claiming the present invention, the following terms will be used in accordance with the definitions set out below.
As used in this specification and the appended claims, the singular forms "a", "an" and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, a reference to "a target polynucleotide" includes two or more such target polynucleotides, and a reference to "a probe" includes two or more probes or a mixture of probes, and so on.
The term "adjacent" as used herein means two sequences in a nucleic acid that are substantially next to each other, although one or more intervening bases may be present between the two adjacent sequences.
The term "immediately adjacent" as used herein means two sequences in a nucleic acid that are next to each other with no intervening bases between them.
The term "allele" means one of two or more alternative forms of a gene or gene site. If a diploid organism has two copies of the same allele, for example AA or aa, it is homozygous at that position. If the organism has one copy each of two different alleles, for example Aa, it is heterozygous at that position. An alternative nomenclature uses A and B to denote the alleles: a homozygous diploid organism is AA or BB at the position, and a heterozygous diploid organism is AB at the position. The term "allele" also applies where three or more alternative forms are possible and can be extended as known in the art, for example to alleles A, B and C of a triallelic single nucleotide polymorphism.
The term "array" as used herein means an intentionally created collection of elements that can be prepared synthetically or biosynthetically. Arrays can take various forms, such as libraries of soluble molecules, and can make use of one or more solid supports, such as glass slides, silica chips, microparticles, nanoparticles or beads. As used herein, a "solid support" is any material to which a probe, target nucleotide or product nucleotide can be attached, for example glass and modified or functionalized glass, plastics, polysaccharides, nylon, nitrocellulose, ceramics, resins, silica-based materials, carbon, metals, inorganic materials and other polymers, such as a flow cell or other solid surface (for example a bead or microarray).
The term "at least" followed by a number is used herein to denote the start of a range beginning with that number (depending on the variable being defined, the range may or may not have an upper limit). The term "at most" followed by a number is used herein to denote the end of a range ending with that number (depending on the variable being defined, the range may have 1 or 0 as its lower limit or may have no lower limit). When a range is given as "(a first number) to (a second number)", this means a range whose lower limit is the first number and whose upper limit is the second number. The terms "plural", "multiple", "plurality" and "multiplicity" as used herein denote two or more features.
The terms "barcode" and "index", which are used interchangeably herein, refer to a nucleotide sequence used to identify or "tag" one or more specific target or product polynucleotides. A barcode is generally at least 5 nucleotides (nt) in length. In some embodiments, a barcode or a part of it may occur in the first complementary probe and/or the second complementary probe. As used herein, a barcode may serve as a sample barcode or as an inquiry site barcode. In some embodiments, the same barcode sequence occurs at two different positions in a polynucleotide, serving as a sample index barcode at one position and as an inquiry site barcode at the other. In some embodiments, different barcode sequences (which may have the same or different lengths) occur at two different positions in a polynucleotide, one serving as a sample barcode and the other as an inquiry site barcode. A barcode may have a sequence identical to one present in the target polynucleotide or its complementary strand, a sequence complementary to a sequence portion of the target polynucleotide or its complementary strand, a sequence with no complementarity to the target polynucleotide or its complementary strand, or any combination of these. In some embodiments, a single sequence serves as both inquiry site barcode and sample index. In some embodiments, one part of a single sequence serves as the inquiry site barcode and another part serves as the sample index.
The term "base" means a nitrogen-containing heterocyclic moiety capable of forming Watson-Crick type hydrogen bonds with a complementary nucleobase or nucleobase analogue (for example, a purine, a 7-deazapurine or a pyrimidine). Typical bases are the naturally occurring bases adenine, cytosine, guanine, thymine and uracil. Bases also include analogues of naturally occurring bases and universal bases, such as inosine, 3-nitropyrrole and 5-nitroindole. Any universal base (that is, a base that does not favour pairing with any particular base) can be used to practise the invention.
The term "base modification" as used herein refers to a polynucleotide comprising a non-standard base (that is, a base other than adenine, guanine, thymine, cytosine and uracil). Such non-standard bases can serve a variety of purposes, for example stabilizing or destabilizing hybridization, promoting or inhibiting degradation, or acting as an attachment point for a detectable moiety, a quencher moiety or another moiety. Many examples of modified bases (other than the modified bases described in the present invention) and base analogues are known in the art.
The term "complementary polynucleotides" as used herein refers to polynucleotides that form base pairs with each other. Base pairs are usually formed by hydrogen bonds between nucleotide units in antiparallel polynucleotide strands. Complementary polynucleotide strands can base pair in the Watson-Crick manner (for example, A with T, A with U, C with G) or in any other manner that allows a duplex to form (including the wobble base pair formed between U and G). As those skilled in the art know, when RNA is used rather than DNA, uracil rather than thymine is considered complementary to adenine. In determining the complementarity between a probe and a target gene, the degree of complementarity is expressed as a percentage between the probe sequence and the target gene sequence, or the complementary strand of the target gene sequence, with which it best matches. In some embodiments, the degree of complementarity between the probe sequence and the target gene sequence or the complementary strand of the target gene sequence need not be 100%. In one embodiment, the degree of complementarity is less than 100% but sufficient for the probe sequence to hybridize with the target gene sequence or the complementary strand of the target gene sequence under given conditions.
The term "complementary" as used herein refers to a polynucleotide or sequence that, when aligned antiparallel with another sequence, has complementary nucleotide bases at substantially all positions and contains no sequence portion with four or more contiguous non-complementary bases.
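The two criteria in these definitions, a percentage of complementary positions and the absence of a run of four or more contiguous non-complementary bases, can be checked mechanically. The sketch below is illustrative only: it considers Watson-Crick DNA pairs and ignores wobble pairs and universal bases, it requires sequences of equal length, and the `min_percent` cut-off is a hypothetical stand-in for "substantially all positions", which the text does not quantify.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complementarity(probe, target):
    """Return (percent complementary, longest run of non-complementary bases).

    Both sequences are written 5'->3' and must have equal, non-zero length;
    the target is reversed so the strands are compared antiparallel.
    """
    if not probe or len(probe) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = [COMPLEMENT.get(p) == t for p, t in zip(probe, reversed(target))]
    percent = 100.0 * sum(matches) / len(matches)
    longest = run = 0
    for is_match in matches:
        run = 0 if is_match else run + 1
        longest = max(longest, run)
    return percent, longest

def is_complementary(probe, target, min_percent=90.0, max_run=3):
    """Apply both criteria; min_percent is an assumed, adjustable cut-off."""
    percent, longest = complementarity(probe, target)
    return percent >= min_percent and longest <= max_run
```

A probe can therefore fall short of 100% complementarity and still satisfy the definition, provided its mismatches are few and not clustered into a run of four or more.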
The term "comprising" as used herein, and its grammatical equivalents, means that other features may optionally be present in addition to the features specifically identified. For example, a composition or device "comprising" (or "which comprises") components A, B and C may contain only components A, B and C, or may contain components A, B and C together with one or more other components. The term "consisting essentially of" as used herein, and its grammatical equivalents, means that other features which do not materially alter the claimed invention may be present in addition to the features specifically identified.
The term "contacting" as used herein can refer to bringing two sequences together under conditions that allow them to hybridize to each other (if they are sufficiently complementary). For example, the first complementary probe and the second complementary probe are contacted with a sample under conditions that allow the probes to hybridize to a target polynucleotide sequence in the sample (if they are sufficiently complementary).
The term "copy number variation" ("CNV") as used herein means an alteration of the DNA of a genome that changes the number of copies of one or more sections of DNA in a cell. CNVs generally correspond to relatively large regions of the genome that have been deleted (fewer than the normal number) or duplicated (more than the normal number) on certain chromosomes.
The term "corresponding" or "corresponding to" as used herein refers to a sequence that is homologous to, substantially equivalent to, or functionally equivalent to a specified sequence.
The term "determining" as used herein means concluding or ascertaining after reasoning, observation and the like.
The term "DNA polymorphism" as used herein refers to the situation in which one of two or more different nucleotide sequences may be present at a specific site in DNA. Preferred polymorphic markers have at least two alleles, each occurring at a frequency higher than 1%, 2%, 3%, 4%, 5%, 6%, 7% or more. In some cases, an allele occurs at a frequency greater than 10%, 15% or 20% of a selected population. A polymorphic site can be as small as one base pair. A single nucleotide polymorphism (SNP) can be the substitution of one nucleotide by another at the polymorphic site. A single nucleotide polymorphism can also be the deletion or insertion of one nucleotide at the polymorphic site. A biallelic polymorphism has two forms. A triallelic polymorphism has three forms. A single nucleotide polymorphism occurs at a polymorphic site occupied by a single nucleotide, which is the site of variation between allelic sequences. As those skilled in the art will understand, a SNP is often a polymorphism involving more than a single base. Other polymorphisms include (small) deletions or insertions of several nucleotides, known as indels. "DNA polymorphism" can also refer to structural rearrangements, translocations, large insertions or deletions, inversions and the like, and can include the addition to the genome of genetic material, which may or may not originate from the host.
The term "duplex" as used herein refers to a double-stranded nucleic acid molecule formed when complementary (or partially complementary) single-stranded nucleic acid molecules (for example, DNA, RNA or PNA) anneal to one another.
The term "first complementary probe" as used herein refers to a polynucleotide comprising a first sequence that is at least partially complementary to the first target sequence of a target polynucleotide. The first complementary probe may also comprise, among other sequences, an inquiry site barcode and/or a universal sequence (which may comprise a primer binding sequence); the inquiry site barcode may be a barcode for a specific allele, for a specific site, or for an allele and a site in combination. In some embodiments, the first complementary probe may also comprise a primer sequence used for generating a sample index. In some embodiments, the first complementary probe has a 5' phosphorylated nucleotide.
The term "first target sequence" refers to a part of the target polynucleotide that is targeted for hybridization. The first target sequence may or may not be present in a sample.
The term "gene site" as used herein refers to the specific position or site of a gene, base or any sequence of interest on a chromosome or other type of nucleic acid.
The term "genotype" as used herein refers to the genetic makeup of an organism. It can refer to a single site, multiple sites, a site with two or more alleles, a variation in copy number or structure, or a monomorphic site.
The term "gap filling" as used herein refers to the case in which the first complementary probe and the second complementary probe hybridize to the target sequence in positions that are not immediately next to each other. When the first complementary probe and the second complementary probe hybridize to the first target sequence and the second target sequence, a gap may or may not be present between the first complementary probe and the second complementary probe. The "gap" may be 1, 2 to 10, 5 to 15, 7 to 15, 10 to 12, 15 to 25, 25 to 40, 30 to 45, 40 to 16, 60 to 65, 60 to 75, 70 to 85, 80 to 95, 90 to 120, 110 to 150, 120 to 160, 130 to 170, 150 to 190, 170 to 210, 190 to 230, 200 to 230, 220 to 260, 230 to 270, 240 to 310, 300 to 340, 330 to 370, 360 to 400, 390 to 430, 410 to 450, 440 to 480, 470 to 500, 2, 3, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280, 300, 320, 340, 360, 380, 400, 420, 440, 460, 480, 500 or more nucleotides. In some cases, the gap can be filled by using a polymerase and a ligase to extend one end of the first probe or the second probe with single or multiple nucleotides. Where the target polynucleotide is RNA, the gap can be filled by using a reverse transcriptase and a ligase to extend one end of the first complementary probe or the second complementary probe.
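To make the gap concrete, the sketch below locates the two target sequences on a target strand and reports the gap length together with the nucleotides an extension step would have to add. It is a simplified illustration under several assumptions: the target sequences are found by exact string matching, each occurs once, all sequences are DNA written 5' to 3' in target-strand orientation, and the names `reverse_complement` and `gap_between` are hypothetical.

```python
_COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT_TABLE)[::-1]

def gap_between(target, first_target_seq, second_target_seq):
    """Return (gap length, fill-in sequence) for two target sequences.

    The fill-in sequence is the reverse complement of the target bases
    lying between the two target sequences, i.e. what a polymerase would
    add to the probe strand. Returns None if either sequence is absent.
    """
    i1 = target.find(first_target_seq)
    i2 = target.find(second_target_seq)
    if i1 < 0 or i2 < 0:
        return None
    # Order the two matched regions by position along the target strand.
    regions = sorted([(i1, i1 + len(first_target_seq)),
                      (i2, i2 + len(second_target_seq))])
    (_, end_upstream), (start_downstream, _) = regions
    if start_downstream < end_upstream:
        raise ValueError("target sequences overlap")
    gap_region = target[end_upstream:start_downstream]
    return len(gap_region), reverse_complement(gap_region)
```

A gap length of zero corresponds to target sequences that are immediately adjacent, in which case the hybridized probes can be joined without any extension.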
The term "hybridization" as used herein refers to a nucleic acid molecule preferentially binding, duplexing or annealing to a specific target polynucleotide, usually under stringent conditions. The term "stringent conditions" refers to conditions under which a probe will hybridize preferentially to its target polynucleotide and to a lesser extent to other sequences. "Stringent hybridization" in the context of nucleic acid hybridization is sequence dependent and differs under different environmental parameters. The dependence of hybridization stringency on buffer composition, temperature and probe length is well known to those skilled in the art (see, for example, Sambrook and Russell (2001) Molecular Cloning: A Laboratory Manual, 3rd edition, volumes 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor Press, NY). The degree of hybridization of a nucleotide sequence to a target sequence, also called the hybridization strength, is determined by methods known in the art. A preferred method is to determine the Tm of the given hybrid duplex.
The term "inquiry site" as used herein refers to a position in a nucleic acid that is being assessed, for example a SNP at a specific gene site that is assessed for presence, absence or amount. In other embodiments, genetic variation is not assessed, and the presence, absence or amount of a gene site is assessed. In other embodiments, the base composition at a specific position in a nucleic acid is assessed.
The term "inquiry site barcode" as used herein refers to a barcode that serves to identify a specific target polynucleotide and/or a variant of it. An inquiry site barcode may be specific to a particular allele, to a particular site, or to a particular allele and site in combination.
The term "label" refers to a moiety that, when attached directly or indirectly to a nucleotide or oligonucleotide, allows that nucleotide or oligonucleotide to be detected by suitable detection means. Exemplary labels include bar codes, fluorophores, chromophores, radioisotopes, spin labels, enzyme labels, chemiluminescent labels, electrochemiluminescent compounds, magnetic labels, microspheres, colloidal metals, immunolabels, ligands, enzymes, and the like.
The term "site" as used herein means the position occupied by a specific gene or gene sequence on a chromosome or other nucleic acid structure. A site can be a sequence outside a gene. Examples of other nucleic acid structures include, but are not limited to, all types of RNA (mRNA, long non-coding RNA, small RNA, rRNA, and the like). All types of DNA are also included, such as, but not limited to, plasmids, chromosomes, BACs, YACs, cosmids, mitochondrial, chloroplast, and plastid DNA, cDNA, and any other naturally occurring or artificial structure.
The term "mismatched nucleotide" as used herein refers to a nucleotide in a polynucleotide that is not complementary to the corresponding nucleotide in the corresponding probe or primer sequence when the sequences hybridize to each other. The complementary base of C is G, and the complementary base of A is T. In other words, a "C" in a probe that is positioned opposite a "T" in the target polynucleotide is considered a mismatch.
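The complementarity rule in this definition can be sketched in a few lines. The example below is only an illustration under stated assumptions: both sequences are written 5' to 3', the probe pairs antiparallel with a target site of equal length, and only the four DNA bases are handled. The function names are hypothetical.

```python
# Watson-Crick pairs for DNA, as stated in the definition (C:G and A:T).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the perfectly complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mismatch_positions(probe: str, target_site: str) -> list[int]:
    """List probe positions (0-based, 5' to 3') that do not pair with the target.

    Assumes the probe hybridizes antiparallel to target_site and that both
    strings have the same length.
    """
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have equal length")
    expected = reverse_complement(target_site)
    return [i for i, (p, e) in enumerate(zip(probe, expected)) if p != e]
```

With the target site "TTGCCTG", the probe "CAGGCAC" places its 3'-terminal "C" opposite the 5'-terminal "T" of the target, which the function reports as a mismatch at probe position 6.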
As used herein, the term "modified polynucleotide" can be used for a nucleotide sequence comprising a universal base (for example, deoxyinosine (also referred to herein as "inosine"), 3-nitropyrrole, or 5-nitroindole).
The term "next-generation sequencing" or "NGS" as used herein refers to high-throughput sequencing. NGS can also refer to third-generation, fourth-generation, and other-generation sequence data generation methods that are not high-throughput but have other characteristics that distinguish them from traditional Sanger sequencing.
As used herein, "nucleic acid" refers to a natural, synthetic, or artificial polynucleotide, such as DNA or RNA, that embodies a nucleotide sequence. A nucleic acid can be cleaved, cloned, replicated, amplified, or otherwise derivatized or manipulated. Exemplary DNA materials include genomic DNA (gDNA), mitochondrial DNA, and complementary DNA (cDNA). Exemplary RNA materials include messenger RNA (mRNA), transfer RNA (tRNA), microRNA (miRNA), small interfering RNA (siRNA), and ribosomal RNA (rRNA).
As used herein, the term "nucleic acid amplification" or "amplification" refers to any means of replicating at least a portion of at least one target nucleotide, typically in a template-dependent manner, including but not limited to a wide range of techniques for amplifying nucleic acid sequences linearly or exponentially. Non-limiting exemplary amplification methods include polymerase chain reaction (PCR), reverse transcriptase PCR, real-time PCR, nested PCR, multiplex PCR, quantitative PCR (Q-PCR), nucleic acid sequence-based amplification (NASBA), transcription-mediated amplification (TMA), ligase chain reaction (LCR), rolling circle amplification (RCA), strand displacement amplification (SDA), ligase detection reaction (LDR), multiplex ligation-dependent probe amplification (MLPA), ligation followed by Q-replicase amplification, primer extension, hyperbranched strand displacement amplification, multiple displacement amplification (MDA), nucleic acid strand-based amplification (NASBA), two-step multiplex amplification, digital amplification, and the like. Descriptions of such techniques can be found in: Ausbel et al.; PCR Primer: A Laboratory Manual, Diffenbach, Ed., Cold Spring Harbor Press (1995); The Electronic Protocol Book, Chang Bioscience (2002); The Nucleic Acid Protocols Handbook, R. Rapley, ed., Humana Press, Totowa, N.J. (2002); and Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press, NY, 1990.
As used herein, "nucleotide" refers to a monomeric unit of a polynucleotide consisting of a heterocyclic base, a sugar, and one or more phosphate groups. The naturally occurring bases (guanine (G), adenine (A), cytosine (C), thymine (T), and uracil (U)) are typically derivatives of purine or pyrimidine, although it should be understood that naturally occurring and non-naturally occurring base analogs are also included. The naturally occurring sugar is the pentose (five-carbon sugar) deoxyribose (which forms DNA) or ribose (which forms RNA), although it should be understood that naturally occurring and non-naturally occurring sugar analogs are also included. Nucleotides are usually linked by phosphate bonds to form nucleic acids or polynucleotides, although many other linkages are known in the art (for example, phosphorothioates, boranophosphates, and the like).
The terms "polynucleotide" and "oligonucleotide" are used interchangeably herein and refer to a linear polymer of nucleotide monomers or modified forms thereof, including, for example, double-stranded and single-stranded deoxyribonucleotides, ribonucleotides, and the like. A polynucleotide can consist entirely of deoxyribonucleotides, ribonucleotides, or analogs thereof, or can comprise blocks or mixtures of two or more different monomer types. When a polynucleotide is represented by a series of letters (such as "ATGCCTG"), it will be understood that, unless otherwise indicated, the nucleotides are in 5'->3' order from left to right and that "A" denotes adenosine, "C" denotes cytidine, "G" denotes guanosine, "T" denotes thymidine, and "U" denotes uridine. When used alone, "polynucleotide" and "oligonucleotide" refer to sequences composed mainly or entirely of conventional DNA or RNA monomer units, that is, deoxyribose or ribose sugar rings substituted with an A, C, G, T, or U base and connected by conventional phosphate backbone moieties. Polynucleotides generally comprise or consist of single-stranded polynucleotides of fewer than 100 nucleotides, although longer sequences of hundreds or thousands of bases or more are also contemplated. In some embodiments, a polynucleotide contains or comprises 2 to 100, 2 to 50, 2 to 25, 2 to 15, 5 to 50, 5 to 25, 5 to 15, 10 to 50, 10 to 25, 10 to 20, 10 to 15, 12 to 50, 12 to 25, or 12 to 20 nucleotides. Polynucleotides can be referred to by their length. For example, a sequence of 15 nucleotides can be called a "15-mer".
A "primer" or "probe" is typically a nucleotide sequence that includes a region complementary to a sequence of at least 6 contiguous nucleotides of a target nucleic acid, although primers and probes can include fewer than 6 contiguous nucleotides. In some embodiments, a polynucleotide primer or probe is provided that includes a sequence identical or complementary to 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, 12 or more, 13 or more, 14 or more, 15 or more, 16 or more, 17 or more, 18 or more, 19 or more, 20 or more, 21 or more, 22 or more, 23 or more, 24 or more, 25 or more, 26 or more, 27 or more, 28 or more, 29 or more, 30 or more, 31 or more, 32 or more, 33 or more, 34 or more, 35 or more, 36 or more, 37 or more, 38 or more, 39 or more, 40 or more, 41 or more, 42 or more, 43 or more, 44 or more, 45 or more, 46 or more, 47 or more, 48 or more, 49 or more, 50 or more, 60 or more, 70 or more, 80 or more, 90 or more, or up to 100 contiguous nucleotides of a target polynucleotide. When a primer or probe includes a region that is "fully complementary" to a number of contiguous nucleotides of a target molecule, that is, when there are no mismatches along its length, the primer or probe can be said to be 100% complementary to the target molecule.
When a "probe" forms a duplex with a target polynucleotide, a signal can be generated directly or indirectly during or after probe hybridization.
The term "product polynucleotide" as used herein refers to the polynucleotide formed when the first complementary probe and the second complementary probe, which are complementary to the first target sequence and the second target sequence of a target polynucleotide (with or without a gap between the target sequences), are joined (for example, by ligation) to form a single "product" polynucleotide.
The term "sample index bar code" or "sample index" as used herein refers to a tag sequence used to designate a specific sample and to track information associated with that sample, even when the sample (or its reaction products, or the lack thereof) is mixed with other samples (and/or their reaction products, or the lack thereof).
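To make the role of a sample index concrete, the sketch below assigns pooled reads back to their samples. It is a minimal illustration, not the method of this disclosure: it assumes (hypothetically) that the index occupies the first `index_length` bases of each read and that only exact index matches are accepted. The sample names and index sequences are invented for the example.

```python
from collections import defaultdict


def demultiplex(reads: list[str], sample_indexes: dict[str, str],
                index_length: int) -> dict[str, list[str]]:
    """Group pooled reads by sample, using an index at the start of each read.

    sample_indexes maps an index sequence to a sample name. Reads whose index
    is not in the table are collected under the key "unassigned".
    """
    by_sample: dict[str, list[str]] = defaultdict(list)
    for read in reads:
        index, insert = read[:index_length], read[index_length:]
        sample = sample_indexes.get(index, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical usage: two samples pooled into one run.
pooled = ["AAGGACGT", "CCTTTTGA", "AAGGGGGG", "NNNNACGT"]
table = {"AAGG": "sample_1", "CCTT": "sample_2"}
assigned = demultiplex(pooled, table, index_length=4)
```

Real pipelines usually tolerate a small number of index errors; exact matching is used here only to keep the sketch short.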
The term "first complementary probe or second complementary probe" as used herein refers to a polynucleotide having a first sequence or a second sequence complementary to the first target sequence or the second target sequence of a target polynucleotide. In some embodiments, the first complementary probe or the second complementary probe can also include a primer sequence used to generate a sample index. In some embodiments, the universal sequences in the first complementary probe and the second complementary probe are deliberately the same or different. In some embodiments, the first complementary probe or the second complementary probe has a 5' phosphorylated nucleotide.
The term "first target sequence or second target sequence" refers to a portion of a target polynucleotide that is targeted for hybridization. In certain embodiments, the first target sequence or the second target sequence may or may not be present in a sample.
The term "sequencing" as used herein refers to DNA sequencing, that is, the process of determining the order of nucleotides within a DNA molecule. It includes any method or technique that can be used to determine the order of adenine, guanine, cytosine, and thymine in a DNA strand. Sequencing may also include RNA sequencing, in which the order of bases in RNA is determined.
The term "single nucleotide polymorphism" or "snp" or "SNP" as used herein refers to a nucleic acid sequence variation in which, typically, a single nucleotide (A, T, C, or G) differs within a population or between paired chromosomes. Most commonly a SNP has two alleles, but it can have more than two. SNPs can also be present in RNA molecules, where a SNP can reflect differences in RNA processing.
The term "target polynucleotide" as used herein refers to a sequence in a nucleic acid or polynucleotide that serves as a hybridization target. A target polynucleotide may or may not be present in a sample. In some embodiments, the target polynucleotide includes RNA or DNA that is partially or fully complementary to the first complementary probe and the second complementary probe of the invention. A target polynucleotide can generally be described using the four bases of DNA (A, T, G, and C) and the four bases of RNA (A, U, G, and C).
The term "target sequence" refers to a portion of a target polynucleotide that is targeted for hybridization. A target sequence can be a first target sequence or a second target sequence and may or may not be present in a sample. As those skilled in the art will understand, a reference to a "target sequence" can also refer to the complementary strand of the target sequence.
The term "thermal melting point" or "Tm" as used herein refers to a specific sequence at a defined ionic strength and pH. Tm is the temperature at which 50% of the target sequence is hybridized to a perfectly matched probe. Tm also refers to the temperature at which half of the DNA strands are in the single-stranded (ssDNA) state. Tm depends on various parameters, such as the length of the hybridized complementary strand sequences, their specific nucleotide sequences, base composition, the concentration of the complementary strands, and other conditions of the solution.
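The dependence of Tm on length and base composition can be illustrated with the Wallace rule, a widely used rule of thumb for short oligonucleotides: Tm is approximately 2 °C for each A or T plus 4 °C for each G or C. This rule is swapped in here purely for illustration and is not the measurement described in this disclosure; it ignores ionic strength, pH, and strand concentration, all of which the definition above names as relevant.

```python
def wallace_tm(seq: str) -> float:
    """Estimate Tm in degrees Celsius with the Wallace rule: 2*(A+T) + 4*(G+C).

    A crude approximation intended for short DNA oligonucleotides; it does not
    model salt, pH, or strand concentration.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence may contain only A, C, G, and T")
    return 2.0 * at + 4.0 * gc
```

For the example sequence "ATGCCTG" used earlier in this section (three A/T and four G/C bases), the estimate is 22 °C. Nearest-neighbor models are the usual choice when salt and concentration effects matter.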
The term "universal base" as used herein refers to a base that helps prevent or reduce the frequency with which molecules are joined when the 3' end of the first complementary probe is not complementary to one or more target polymorphic nucleotides or nucleotides. Examples of universal bases include inosine, 3-nitropyrrole, and 5-nitroindole.
The term "universal sequence" as used herein refers to a sequence component of the first complementary probe or the second complementary probe that may include a universal primer sequence.
The term "universal primer sequence" or "universal primer binding sequence" includes a primer sequence that is complementary to a primer sequence, such as a PCR primer sequence, and that is used to add one or more of (i) a sample index, (ii) additional sequence, (iii) one or more sequences used for generating sequence data or for other forms of detection, and (iv) other moieties. As those skilled in the art will understand, the term "universal primer sequence" or "universal primer binding sequence" is used in reference to a primer sequence or its complementary strand.
PCR primer sequences are usually used in pairs, and the two components of a pair can differ in sequence. Apart from the sample index, any two pairs of PCR primer sequences can have the same sequence. In some embodiments, the sequence of primer #1 is the same in the first pair and the second pair, the sequence of primer #2 differs from the sequence of primer #1, and, apart from the sample index, the sequence of primer #2 is the same in the first pair and the second pair. In some embodiments, a PCR primer includes one or more universal sequences and/or one or more sample indexes and/or one or more sequences having other functions. A PCR reaction using one or more universal primer sequences can be used to add a sample index. The universal primer sequence in the first complementary probe and its complementary portion in the first PCR primer may or may not have the same length or be 100% complementary. The universal primer sequence in the second complementary probe and its complementary portion in the second PCR primer may or may not have the same length or be 100% complementary. Universal primer sequences can be used to add an adapter sequence, which is used for attachment to a solid support. In some cases, binding to the solid support is used for next-generation sequencing. In other cases, binding to the solid support is used for array-based detection of the product polynucleotide. In some cases, universal primer sequences are used to add sequences or moieties that enable other forms of detection or the generation of sequence data.
As used herein, the term "PCR primer" or "PCR primer sequence" may or may not refer to the same sequence as a "universal primer sequence". As those skilled in the art will understand, the term "PCR primer" or "PCR primer sequence" can be used in reference to a PCR primer or its complementary strand.
All documents mentioned herein, and all documents filed concurrently with or previously in connection with this application (including, but not limited to, such documents that are open to public inspection with this specification), are incorporated herein by reference.
Composition and method.
The invention provides improved compositions and methods for determining the presence, absence, content, copy number, or characteristics of one or more target polynucleotides in a sample or in multiple samples. A target polynucleotide can be associated with a polymorphism, such as a substitution, deletion, insertion, copy number variation, translocation, nucleotide modification (such as methylation), or any other change in the target polynucleotide or its state.
The methods of the invention can be used in solution-based hybridization assays to identify the presence, absence, copy number, or content (or combinations thereof) of a large number of target polynucleotides in one or more samples.
In some embodiments, multiple samples (for example, 2 to 50000) are provided, which may include different target polynucleotides. Multiple first complementary probes and second complementary probes, each containing a sequence complementary to a target sequence of interest, can be incubated with the one or more samples under conditions that allow the first and second complementary probe sequences to hybridize to the complementary first and second target sequences. Exemplary first and second complementary probe sequences are about 50 to 200 nucleotides in length. In some embodiments, the first target sequence is located to the left of the inquiry site or polymorphic nucleotide. In some embodiments, these methods can be used to identify, for example, single-nucleotide or multi-nucleotide polymorphisms, deletions, insertions, translocations, covalent nucleotide modifications, and the like.
In certain embodiments, these methods can be used to determine the presence, absence, or content of a specific target polynucleotide, for example to determine the presence, absence, or content of a pathogen or cancer-associated sequence in a sample (such as a biological sample).
In certain illustrative methods for identifying the presence or absence of a target polynucleotide in a sample, multiple first complementary probes and second complementary probes can be incubated, under conditions that provide for hybridization of complementary sequences, with one or more samples whose target polynucleotide sequences may or may not contain a polymorphism.
In certain embodiments, if the first complementary polynucleotide probe and the second complementary polynucleotide probe hybridize to target polynucleotide sequences adjacent to each other, the complementary probes can be joined together to form a product polynucleotide.
In one embodiment, a polymorphic nucleotide is present at the 3' end of the first complementary probe. In one example of this embodiment, the polymorphic nucleotide is a SNP, and the two forms of the allele are represented by two different first complementary probes that are identical except for the 3' nucleotide. (See Fig. 1E.)
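The allele-specific design described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the probe design procedure of this disclosure: sequences are handled on the probe strand (the strand complementary to the target, written 5' to 3'), universal sequences and bar codes are omitted, and the function names, arm length, and example sequence are hypothetical.

```python
def allele_specific_first_probes(probe_strand: str, snp_index: int,
                                 alleles: tuple[str, ...],
                                 arm_length: int) -> dict[str, str]:
    """Build one first complementary probe per allele.

    Each probe ends (3' end) at the polymorphic position, so the probes are
    identical except for their 3'-terminal nucleotide.
    """
    start = snp_index - arm_length + 1
    if start < 0:
        raise ValueError("arm_length extends past the start of the sequence")
    shared = probe_strand[start:snp_index]  # portion common to all alleles
    return {allele: shared + allele for allele in alleles}


def second_probe(probe_strand: str, snp_index: int, arm_length: int) -> str:
    """Build the single second probe, which starts right after the SNP."""
    return probe_strand[snp_index + 1:snp_index + 1 + arm_length]


# Hypothetical site: "N" marks the polymorphic position (index 8).
site = "ACGTACGTNGGATCC"
firsts = allele_specific_first_probes(site, 8, ("C", "T"), arm_length=5)
second = second_probe(site, 8, arm_length=5)
```

Here the two first probes are "ACGTC" and "ACGTT", and both pair with the same second probe, "GGATC". Real probes would be far longer than the five-nucleotide arms used to keep the example readable.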
In certain embodiments, if the target polynucleotide of interest is not present in a particular sample, or if the polymorphic nucleotide or allele targeted by the first probe or the second probe is not present in the sample, the first probe and the second probe will not hybridize to a nucleotide sequence in the sample and no product polynucleotide will be formed, which corresponds to a determination that the target polynucleotide of interest is not present in the sample.
In one example, the first complementary probe and the second complementary probe include target-complementary sequences, and the first complementary probe also includes a 3' terminal nucleotide complementary to the polymorphic nucleotide on the target polynucleotide. See Fig. 1E.
In another example, the first complementary probe and the second complementary probe include target-complementary sequences, and the second complementary probe also includes a 5' terminal nucleotide complementary to the polymorphic nucleotide on the target polynucleotide.
Fig. 8 shows a variation of the method for determining the presence or absence of a target polynucleotide from a tetraploid organism (in which each target polynucleotide comprises two copies (alleles)). Each strand at a given polymorphic site can be analyzed for the polymorphism.
In another example, multiple first complementary probes are provided, in which the probes correspond to the multiple possible polymorphisms, polymorphic nucleotides, or alleles at a given site. In examples in which a single base is interrogated (substitution, insertion, or deletion), there can be nine first complementary probes for a given site, and for a polynucleotide polymorphism there can be more than nine first complementary probes for a given site. In some cases, there is a single second complementary probe for a given site.
Preferably, each target polynucleotide has at least 2 different first complementary probes. For example, each target polynucleotide can have at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 different first complementary probes. Preferably, each target polynucleotide has 1 second complementary probe. However, each target polynucleotide can have at least 2 different second complementary probes. For example, each target polynucleotide can have at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 different second complementary probes. Each different first complementary probe and/or second complementary probe can be specific to a particular allele.
In some embodiments, a probe includes a detectable label or moiety. In some embodiments, when the probe is a capture probe, for example when the probe is used for capture on a solid surface such as a microarray or bead, the probe is unlabeled. In some embodiments, the label is a bar code. In some embodiments, the probe cannot be extended by, for example, a polymerase. In some embodiments, the probe can be extended.
Sample.
In some embodiments, the amount of DNA or RNA in a sample used in the methods of the invention is less than 100 μg, less than 80 μg, less than 60 μg, less than 40 μg, less than 20 μg, less than 10 μg, less than 5 μg, less than 4 μg, less than 3 μg, less than 2 μg, less than 1 μg, less than 500 ng, less than 400 ng, less than 300 ng, less than 200 ng, less than 100 ng, less than 50 ng, less than 40 ng, less than 30 ng, less than 20 ng, less than 10 ng, less than 5 ng, less than 1 ng, less than 0.1 ng, less than 0.01 ng, or is 0.01 ng to 1000 ng, 5 ng to 500 ng, 5 ng to 250 ng, 10 ng to 125 ng, 10 ng to 100 ng, 5 ng to 50 ng, or 5 ng to 25 ng.
A sample can be derived from any animal, plant, microbial, viral, synthetic DNA, or synthetic RNA source. "Multiple samples" refers to two or more samples from the same or different sources. For example, each sample can be derived from a different animal or a different plant, or the samples can be derived from different microbial sources. In some exemplary embodiments, the multiple samples are 2, 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, or more samples, for example, 2 to 10000 samples, 2 to 20 samples, 5 to 30 samples, 10 to 50 samples, 25 to 75 samples, 40 to 100 samples, 50 to 120 samples, 60 to 130 samples, 70 to 140 samples, 80 to 150 samples, 90 to 170 samples, 100 to 200 samples, 150 to 250 samples, 200 to 300 samples, 250 to 500 samples, 300 to 700 samples, 400 to 1000 samples, 500 to 1500 samples, 600 to 2000 samples, 700 to 3000 samples, 800 to 4000 samples, 900 to 5000 samples, 50 to 1000 samples, 100 to 1000 samples, 200 to 2000 samples, 300 to 3000 samples, 400 to 4000 samples, 500 to 5000 samples, 600 to 6000 samples, 700 to 7000 samples, 800 to 8000 samples, 9 to 9000 samples, or 1000 to 100000 samples.
DNA or RNA can be isolated from any source, for example a biological source such as blood or other tissue, body fluids, hair, nasal swabs, germplasm, plant material, and the like. Essentially any nucleic acid source can be used. In some embodiments, a sample may contain contaminants or inhibitors that interfere with the method, such as salts or other components, for example PCR inhibitors. In such cases, the sample can be extracted or otherwise purified to reduce or eliminate the inhibitory components.
In some variations, polynucleotides can be isolated from a sample using a variety of methods, such as mechanical isolation (for example, glass bead techniques), chemical extraction, column-based methods, or combinations thereof. Any of the many DNA extraction methods familiar to those skilled in the art can be used in the methods described herein.
The DNA in a nucleic acid sample can be double-stranded, single-stranded, or double-stranded DNA denatured to single-stranded DNA. Denaturation of a double-stranded sequence yields two single-stranded sequences, either or both of which can be assayed (in a single reaction) using probes specific to each strand. Preferred nucleic acid samples include target polynucleotides on genomic DNA, cDNA, DNA fragments (for example, restriction fragments), and the like.
Before being combined with the complementary probe set, a sample can be treated to fragment the nucleic acid. This can occur by one or more of the following methods: physical fragmentation, using for example sonication; shearing, such as acoustic shearing; needle shearing; point-sink shearing; nebulization; passage through a pressure cell; or heating; enzymatic cleavage, using for example DNase I, another restriction enzyme, a non-specific nuclease, or a transposase; or chemical cleavage, for example using heat and a divalent metal.
Before the incubation (or hybridization) step, the one or more target polynucleotides in the one or more samples can be reversibly denatured. This can be achieved, for example, by a heating step, such as heating to at least 70 °C, 70 °C to 100 °C, 75 °C to 100 °C, 80 °C to 98 °C, 85 °C to 95 °C, 90 °C to 100 °C, or 95 °C to 100 °C. Preferably, the heating step heats to 95 °C to 100 °C. The heating step can last at least 30 seconds, at least 1 minute, 1-30 minutes, 2-25 minutes, 3-20 minutes, 4-15 minutes, or 5-10 minutes. Preferably, the heating step lasts 1-15 minutes.
In some embodiments, the nucleic acid in the sample is reversibly denatured after being combined with the complementary probes. Double-stranded DNA can be denatured to single-stranded DNA, for example by heating to about 98 °C for about 1 minute.
Double-stranded DNA is denatured to single-stranded DNA using standard conditions known to those skilled in the art (for example, heating to about 98 °C for about 5 minutes). When practicing the methods disclosed herein, the sample, or the sample together with the first complementary probe and the second complementary probe, can be heated before hybridization to a temperature of 70 °C to 100 °C, 75 °C to 100 °C, 80 °C to 98 °C, 85 °C to 95 °C, 90 °C to 100 °C, 95 °C to 100 °C, 70 °C, 75 °C, 80 °C, 85 °C, 86 °C, 87 °C, 89 °C, 90 °C, 91 °C, 92 °C, 93 °C, 94 °C, 95 °C, 96 °C, 97 °C, 98 °C, 99 °C, or 100 °C.
Target
In some embodiments, a target polynucleotide can be any nucleotide sequence whose presence, absence, content, or characteristics are to be determined. In some embodiments, a target polynucleotide can be preselected by the person designing a given assay, and/or associated with a particular genotype or phenotype of interest, and/or selected for other reasons.
In some embodiments, a target polynucleotide is a nucleotide sequence that contains a polymorphism, represents a polymorphism, or is associated with a polymorphism.
In some embodiments, alleles can be interrogated by targeting one or more nucleotide polymorphisms. In some cases, the polymorphism occurs at a single nucleotide position; for example, one allele can have a thymine at a given position while an alternative allele has, for example, a cytosine at the same position. In some embodiments, a nucleotide polymorphism can include a substitution, deletion, insertion, copy number variation, translocation, methylation or another nucleotide modification, and/or a variant DNA sequence. In some embodiments, a polymorphism can include two, three, four, or more contiguous nucleotides.
The compositions and methods disclosed herein can be used to identify single nucleotide polymorphisms (SNPs) in target polynucleotide sequences. For example, in a genomic DNA sample from a diploid mammal, which has two copies of a given SNP, the SNP can be homozygous or heterozygous. In other examples, a triploid organism has 3 different alleles at a given site. Polyploid cells and organisms contain more than two sets of paired chromosomes and show a numerical change in the whole chromosome set. Polyploidy is common in plants. For example, wheat has diploid (two sets of chromosomes), tetraploid (four sets of chromosomes), and hexaploid (six sets of chromosomes) forms. See Example 8.
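For the diploid case mentioned above, a genotype call can be sketched from the number of product polynucleotides observed for each allele. The example is a toy illustration only: the thresholds `min_fraction` and `min_depth` are hypothetical values chosen for the sketch and are not taken from this disclosure, and real genotype callers use statistical models rather than fixed cutoffs.

```python
def call_diploid_genotype(count_a: int, count_b: int,
                          min_fraction: float = 0.2,
                          min_depth: int = 10) -> str:
    """Classify a biallelic SNP in a diploid sample from two allele counts.

    count_a and count_b are the numbers of observations (for example, sequence
    reads) supporting allele A and allele B. Thresholds are illustrative.
    """
    total = count_a + count_b
    if total < min_depth:
        return "no call"  # too few observations to decide
    fraction_a = count_a / total
    if fraction_a >= 1.0 - min_fraction:
        return "homozygous A"
    if fraction_a <= min_fraction:
        return "homozygous B"
    return "heterozygous"
```

Polyploid samples such as tetraploid or hexaploid wheat would need more allele-dosage classes than the three used here.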
In some embodiments, in a method for identifying a polymorphism in a target polynucleotide sequence, the first complementary probe and the second complementary probe are incubated, under conditions that provide for hybridization of complementary sequences, with one or more samples whose target polynucleotide sequences may or may not contain the polymorphism. In some embodiments, an optional third probe is provided for a particular probe set. The third probe is generally similar to the first probe or the second probe but relates to a different allele of the same sequence of interest. See Fig. 1E.
In certain embodiments, if a complementary polynucleotide probe includes a polymorphic nucleotide that is complementary to the polymorphic nucleotide in the target polynucleotide sequence, the complementary probes are joined to form a product polynucleotide. In certain embodiments, if the polymorphic nucleotide on the complementary polynucleotide probe does not hybridize to the polymorphic nucleotide on the target polynucleotide, the two complementary probes are generally not joined and no product polynucleotide is formed.
In some embodiments, the product polynucleotide (or a portion or portions of the product polynucleotide, its amplification product, or their complementary strands) is sequenced to determine the presence or absence of the polymorphism. In some cases, sample identity can also be determined by sequencing. In some other embodiments, the presence or absence of the polymorphism is determined using an array or another readout. In some embodiments, the capture probes or oligonucleotides provided on the array are designed to be substantially complementary to the extended portion of the primer, so that unextended primers do not bind to the capture probes. Alternatively, unreacted probes can be removed before addition to the array or before sequencing.
In some embodiments, the variation in the first and second complementary sequences of the first complementary probe and the second complementary probe depends on one or more of many possible parameters, such as: (i) the melting temperature of the duplex formed with the target polynucleotide, (ii) Tm, (iii) the ionic strength of the hybridization solution, (iv) the complexity of the target polynucleotide, and the like.
In some embodiments, a sample includes one or more different target polynucleotides. Specifically, a sample includes at least 2 different target polynucleotides, at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100 or more, 120 or more, 140 or more, 160 or more, 180 or more, 200 or more, 220 or more, 240 or more, 260 or more, 280 or more, or 300 or more target polynucleotides, for example 2 to 5000 target polynucleotides, 2 to 20 target polynucleotides, 5 to 30 target polynucleotides, 10 to 50 target polynucleotides, 25 to 75 target polynucleotides, 40 to 100 target polynucleotides, 50 to 120 target polynucleotides, 60 to 130 target polynucleotides, 70 to 140 target polynucleotides, 80 to 150 target polynucleotides, 90 to 170 target polynucleotides, 100 to 200 target polynucleotides, 150 to 250 target polynucleotides, 200 to 300 target polynucleotides, 250 to 500 target polynucleotides, 300 to 700 target polynucleotides, 400 to 1000 target polynucleotides, 500 to 1500 target polynucleotides, 600 to 2000 target polynucleotides, 700 to 3000 target polynucleotides, 800 to 4000 target polynucleotides, 900 to 5000 target polynucleotides, 50 to 1000 target polynucleotides, 100 to 2000 target polynucleotides, 200 to 3000 target polynucleotides, 300 to 4000 target polynucleotides, 500 to 5000 target polynucleotides, or 100 to 10000 target polynucleotides.
The length of the target polynucleotide can vary. In some embodiments, the target polynucleotide is 10 nt to 100 nt, 10 nt to 200 nt, 10 nt to 300 nt, or 10 nt to 400 nt. In some embodiments, the target polynucleotide is 20 nt to 30 nt, 20 nt to 40 nt, 20 nt to 50 nt, 20 nt to 60 nt, 20 nt to 70 nt, 20 nt to 80 nt, 20 nt to 90 nt, 20 nt to 100 nt, 20 nt to 110 nt, 20 nt to 120 nt, 20 nt to 130 nt, 20 nt to 140 nt, 20 nt to 150 nt, 20 nt to 160 nt, 20 nt to 170 nt, 20 nt to 180 nt, 20 nt to 190 nt, 20 nt to 200 nt, 20 nt to 210 nt, 20 nt to 220 nt, 20 nt to 230 nt, 20 nt to 240 nt, 20 nt to 250 nt, 20 nt to 260 nt, 20 nt to 270 nt, 20 nt to 280 nt, 20 nt to 290 nt, 20 nt to 300 nt, 20 nt to 310 nt, 20 nt to 320 nt, 20 nt to 330 nt, 20 nt to 340 nt, 20 nt to 350 nt, 20 nt to 360 nt, 20 nt to 370 nt, 20 nt to 380 nt, 20 nt to 390 nt, or 20 nt to 400 nt.
In some cases, the length of the target sequence can vary depending on the melting temperature ("Tm") of the sequence, the pH, the salt concentration, or the temperature of the incubation step.
The Tm values of the multiple target polynucleotides assessed in a given assay are typically within 1°C, 2°C, 3°C, 4°C, 5°C, 6°C, 7°C, 8°C, 9°C, or 10°C of one another. In some embodiments, the Tm values of the multiple target polynucleotides are within 1-3°C, 2-5°C, 2-4°C, 3-6°C, 3-5°C, 4-7°C, 4-6°C, 5-8°C, 5-7°C, 6-9°C, 6-8°C, 7-10°C, 7-9°C, 8-10°C, or 8-9°C of one another.
Hybridization is carried out under a variety of conditions well known in the art. Stringent conditions are hybridization conditions under which a polynucleotide hybridizes preferentially to its target subsequence and, optionally, to a lesser extent or not at all to other sequences in a mixed population.
Generally, stringent hybridization conditions are selected to be about 5°C lower than the thermal melting point (Tm) of the specific sequence at a defined ionic strength and pH. Very stringent conditions are selected to be equal to the Tm of the particular probe.
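The two Tm rules stated above (target Tm values lying within a fixed window of one another, and a stringent hybridization temperature about 5°C below the Tm) can be illustrated with a minimal sketch. The function names and the example Tm values are hypothetical and are not part of the disclosure; the Tm values themselves are assumed to have been obtained elsewhere.

```python
def tms_within_window(tms_celsius, window_celsius):
    """Return True if all Tm values lie within window_celsius of one another."""
    return max(tms_celsius) - min(tms_celsius) <= window_celsius


def stringent_temperature(tm_celsius, offset_celsius=5.0):
    """Hybridization temperature chosen offset_celsius below the Tm.

    An offset of about 5 corresponds to the stringent condition described in
    the text; an offset of 0 corresponds to the very stringent condition.
    """
    return tm_celsius - offset_celsius


# Hypothetical Tm values (in °C) for three target polynucleotides.
example_tms = [60.0, 61.5, 62.0]
print(tms_within_window(example_tms, 3))   # spread is 2.0, so within a 3 °C window
print(stringent_temperature(65.0))         # 5 °C below a Tm of 65 °C
```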
In practicing the methods of the invention, many aspects of the hybridization reaction conditions can be varied, including but not limited to the temperature of the hybridization reaction, the length of the incubation, and the ionic strength of the hybridization buffer.
Joining-ligation
In certain embodiments, after the sample, the first complementary probe, and the second complementary probe have been incubated under conditions that allow the first complementary probe and the second complementary probe to hybridize to complementary target polynucleotides in the sample, the first complementary probe and the second complementary probe can be joined. When the first complementary probe and the second complementary probe hybridize to target-specific sequences adjacent to each other, the respective 5'-phosphorylated and 3'-hydroxylated ends of the probe pair can be joined by any suitable means known in the art.
In certain embodiments, the first complementary probe and the second complementary probe can be joined non-covalently. In other cases, the first complementary probe and the second complementary probe can be joined covalently. In some cases, covalent joining can be achieved using a ligase (e.g., DNA ligase or Ligase-65 from Thermus aquaticus (T. aquaticus)). In such cases, the ligase and ligation buffer can be added to the solution comprising the adjacent first complementary probe and second complementary probe bound to the target polynucleotide in the sample. In alternative embodiments, the hybridization complex is added to the ligation solution. The temperature of the ligation reaction can be held constant for about 1 to 20 minutes, for example for about 1 minute, 2 minutes, 3 minutes, 4 minutes, 5 minutes, 6 minutes, 7 minutes, 8 minutes, 9 minutes, 10 minutes, 11 minutes, 12 minutes, 13 minutes, 14 minutes, 15 minutes, 16 minutes, 17 minutes, 18 minutes, 19 minutes, 20 minutes, or longer than 20 minutes. In some embodiments, the ligation reaction is performed at about 54°C.
In embodiments in which RNA is analyzed, the target polynucleotide can be converted to cDNA before hybridization with the first complementary probe and the second complementary probe, or the RNA transcript can be used as the hybridization target of the first complementary probe and the second complementary probe. When an RNA transcript is used as the hybridization target, the first complementary probe and the second complementary probe can still comprise DNA in embodiments that use ligation to join the first complementary probe and the second complementary probe, because the joining step can be modified by methods known in the art to promote DNA ligation on an RNA template (see, e.g., U.S. Patent No. 8,790,873). Exemplary ligases for ligating DNA on an RNA template include SplintR PBCV-1 DNA ligase or Chlorella virus DNA ligase.
When ligation is complete, the temperature of the ligation reaction can be raised to about 94°C and held for 1 minute to help inactivate the DNA ligase and denature the product polynucleotide. In some cases, the temperature can be raised to 90°C, 91°C, 92°C, 93°C, 94°C, 95°C, 96°C, 97°C, 98°C, or 99°C for about 1 minute, about 2 minutes, 3 minutes, 4 minutes, or 5 minutes. In some cases, the ligation mixture is then rapidly cooled to room temperature, about 4°C, or about 0°C.
Use of universal bases
As is well known to those skilled in the art, ligases can make errors, for example by joining or "sealing" sequences in which a mismatch (such as a G/T mismatch) is present between the two nucleic acid strands. When the first complementary polynucleotide probe and the second complementary polynucleotide probe hybridize to the target sequence, mismatches may be present, and the sequences may not be 100% complementary. In some embodiments, a complementary probe is designed to have a universal base, such as inosine (e.g., deoxyinosine), near the interrogation site. A complementary probe comprising inosine (e.g., deoxyinosine) will form a base pair of lower stability with the complementary strand. In some embodiments, the universal base is substituted at the 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, or 10th position relative to the 3' nucleotide of the first complementary probe. Preferably, the universal base is substituted at the 2nd position relative to the 3' nucleotide of the first complementary probe. The universal base helps reduce or prevent joining, to the second complementary polynucleotide probe, of a first complementary probe having one or more 3' nucleotides that are not complementary to the target sequence. Alternatively or additionally, the universal base can be substituted at the 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, or 10th position relative to the 5' nucleotide of the second complementary polynucleotide probe, to prevent or reduce joining, to the first complementary polynucleotide probe, of a second complementary polynucleotide probe having one or more 5' nucleotides that are not complementary to the target sequence. In some embodiments, inosine (e.g., deoxyinosine) is a universal base used to destabilize mismatches (mostly G/T mismatches), so that in the presence of a mismatch the ligase will not seal the first complementary polynucleotide probe and the second complementary polynucleotide probe. In other embodiments, inosine (e.g., deoxyinosine) or another universal base is used to avoid destabilizing mismatches in the body of the target sequence (where a known proximal SNP occurs), so that appropriate ligation can still be achieved; otherwise such ligation would be disrupted even though the 3' nucleotide of the first probe is complementary to the target sequence at the interrogation position. In certain embodiments, where two or more polymorphisms are known to occur at positions complementary to the first complementary probe and/or the second complementary probe, universal bases can be used at those positions to avoid destabilizing mismatches.
Barcodes and sample indexes
In certain embodiments, the first complementary probe and/or the second complementary probe comprises a barcode that identifies the sample and/or the target sequence (locus and/or polymorphism or interrogation site).
In some embodiments, the first complementary probe is complementary to the first target sequence and comprises an interrogation site barcode that is non-complementary to the target polynucleotide.
The interrogation site barcode facilitates determining the presence, absence, or amount of a target polynucleotide (e.g., a locus) and/or a variation (e.g., a polymorphism) of the target polynucleotide. In some embodiments, the entire interrogation site barcode, or a portion of the interrogation site barcode, can be located in a portion of one or both of the first complementary probe and the second complementary probe that is non-complementary to the first target sequence or the second target sequence. In some embodiments, the interrogation site barcode can identify both the locus and the allele (as one composite sequence or as separate parts of a single sequence). In some embodiments, the interrogation site barcode can comprise a portion of sequence that is non-complementary to the target sequence and a portion that is complementary to the target sequence. In some embodiments, the interrogation site barcode can identify only one allele. In such cases, it is partially or completely non-complementary to the target sequence. In some embodiments, the interrogation site barcode is non-complementary to the target polynucleotide sequence.
In such cases, the interrogation site barcode is positioned within the first target sequence such that it is non-complementary to the first target sequence, as shown in Fig. 1.
The interrogation site barcode is typically 5 or more nucleotides in length. Exemplary interrogation site barcode sequences are 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 or more nucleotides in length.
In some embodiments, the interrogation site barcode comprises at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, or at least 15 or more nucleotides.
In some embodiments, the first complementary probe comprises an interrogation site barcode, and when the first complementary probe is complementary to the first target sequence of the target polynucleotide, the interrogation site barcode sequence does not hybridize to the target; however, the portions flanking the interrogation site barcode on its 3' and 5' sides are portions of the first complementary probe that are complementary to the first target sequence. See Fig. 1.
In some embodiments, the second complementary probe comprises a portion of sequence that is complementary to the second target sequence and an adjacent portion of sequence that is non-complementary to the second target sequence. The non-complementary portions of the first complementary probe and the second complementary probe can include universal sequences. The universal sequences of the first complementary probe and the second complementary probe can be the same or different.
In certain embodiments, a primer sequence complementary to the second complementary probe is used to add the sample index to the product polynucleotide (or its reaction product) by PCR. However, in certain embodiments, the sample index can be added through the first complementary probe. In some exemplary embodiments, the sample index can be located in PCR primer 1 of the first complementary probe such that the sample index is close to the barcode to be sequenced, without the first target sequence and the second target sequence having to be sequenced.
In certain embodiments, the sample index is typically 5 or more nucleotides in length. In some exemplary embodiments, the sample index is 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 or more nucleotides in length.
Exemplary sample indexes comprising 12 and 15 nucleotides are shown in Table 1.
In some embodiments, the total number of unique sample indexes is about 16128, based on 12-mer sequences. In some embodiments, the total number of unique sample indexes is about 50,000, based on 15-mer sequences. In other embodiments, the total number of unique sample indexes is about 66000, based on 15-mer sequences.
In some embodiments, the sample index is used to determine the identity of the sample by sequencing the sample index of each product polynucleotide.
Selection/enrichment
In the assays of the invention, an enrichment step can be included before the analysis step. The enrichment step serves to increase the amount of product polynucleotides in the reaction mixture and the ratio of product polynucleotides to non-product polynucleotides. This can be achieved by selecting product polynucleotides and/or removing non-product polynucleotides. In some embodiments, the enrichment step is based on size, affinity, charge, or sequence, or is achieved by removing some or all of the non-product polynucleotides (e.g., by selection, separation, or resolution). The joining and enrichment steps can occur in the same or in different reaction mixtures.
In some embodiments, product polynucleotides can be selected based on the presence of a particular sequence (e.g., a sample index or a sequence such as a complementary sequence). In some cases, the product polynucleotide can comprise a barcode designed for selection during the enrichment step.
In some embodiments, enrichment includes an amplification step. For example, the sample index can be incorporated into the product polynucleotide during the amplification step, using any amplification reaction known to those skilled in the relevant art, such as the polymerase chain reaction (PCR).
Labeling amplification products with a sample index is very useful when product polynucleotides from different samples are mixed and pooled into a library for sequencing or other analysis. Primer binding sequences (or sites) can be incorporated into the first complementary polynucleotide probe and/or the second complementary polynucleotide probe to facilitate amplification of the product polynucleotide, whether linear or exponential. A primer binding site serves to bind a primer in order to initiate primer extension or amplification. Primer binding sites are typically located in portions of the probe outside the first target sequence or the second target sequence. In some embodiments, the primer binding site is located in a sequence that is non-complementary to the target polynucleotide.
In some embodiments, PCR is used to add the sample index to the product polynucleotides so that the product polynucleotides can be pooled for sequencing. A PCR primer can comprise a sequence complementary to a portion of the product polynucleotide or of the first complementary probe or the second complementary probe. For example, when a first PCR primer and a second PCR primer are used to prime PCR amplification of the product polynucleotide, the first PCR primer can comprise a sequence complementary to a sequence on the product polynucleotide, and the second PCR primer can comprise a sequence non-complementary to a sequence on the product polynucleotide.
In some cases, two different sample indexes (e.g., one in each PCR primer) are incorporated into the product polynucleotide, which helps increase the number of samples that can be identified and analyzed in a single assay. In some cases, only one PCR primer comprises a sample index or barcode.
In one exemplary embodiment, enrichment is performed using PCR amplification.
Polymerases include, but are not limited to, DNA and RNA polymerases, reverse transcriptases, and the like. Conditions that favor polymerization by different polymerases are well known to those skilled in the art.
Amplification is typically performed in an automated thermal cycler to facilitate incubation times at the desired temperatures. In some embodiments, amplification comprises multiple cycles of sequentially annealing at least one primer having a complementary or substantially complementary sequence to at least one target polynucleotide, synthesizing at least one nucleotide strand in a template-dependent manner using a polymerase, and denaturing the newly formed nucleic acid duplex to separate the strands. The cycle may or may not be repeated. Amplification can include thermal cycling or can be performed under isothermal conditions.
In some embodiments, amplification comprises denaturation at a temperature of about 90°C to about 100°C for about 1 minute to about 10 minutes, followed by cycles comprising annealing at a temperature of about 55°C to about 75°C for about 1 second to about 30 seconds, extension at a temperature of about 55°C to about 75°C for about 5 seconds to about 60 seconds, and denaturation at a temperature of about 90°C to about 100°C for about 1 second to about 30 seconds. Other times and configurations can also be used. For example, primer annealing and extension can be performed in the same step at a single temperature.
In some embodiments, the cycle is performed at least 5 times, at least 10 times, at least 15 times, at least 20 times, at least 25 times, at least 30 times, at least 35 times, at least 40 times, or at least 45 times. The specific cycle times and temperatures will depend on the particular nucleic acid sequence being amplified and can be readily determined by one of ordinary skill in the art.
In some embodiments, PCR or another DNA amplification process can be used to add, to the product polynucleotide, linkers or adapter sequences that facilitate annealing of PCR primers or processes involving sequence generation. This contrasts with methods traditionally used in the art, in which adapters are ligated to the polynucleotides to be sequenced. Linkers and adapters can be used as components in physical, chemical, or enzymatic processes.
After enrichment and/or amplification, samples can be pooled. In such embodiments, product polynucleotides from various samples are mixed together to obtain a pool of product polynucleotides for analysis and/or sequencing.
Where multiple samples are sequenced together, each sample can be amplified separately, with the sample index included in the first PCR primer and/or the second PCR primer, where one or more sample indexes are specific to the sample. Each product polynucleotide from a given sample has the same sample index. In some cases in which both PCR primers comprise a sample index, the barcodes can be the same or different. In such cases, the sequences of the product polynucleotides from one or more (usually multiple) samples are determined simultaneously.
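Assigning pooled reads back to their samples by sample index can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed analysis pipeline: it assumes the sample index occupies the first bases of each read and that only exact matches are assigned. The function name, the read layout, and the example sequences are all hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample, index_length):
    """Group reads by the sample index assumed to sit at the start of each read.

    Returns (reads_by_sample, unassigned_reads). Only exact index matches are
    assigned; reads with an unknown index are collected separately.
    """
    reads_by_sample = defaultdict(list)
    unassigned = []
    for read in reads:
        index = read[:index_length]
        sample = index_to_sample.get(index)
        if sample is None:
            unassigned.append(read)
        else:
            # Keep only the part of the read that follows the sample index.
            reads_by_sample[sample].append(read[index_length:])
    return dict(reads_by_sample), unassigned


# Hypothetical 4-base indexes, shortened for readability.
index_to_sample = {"ACGT": "sample_1", "TGCA": "sample_2"}
reads = ["ACGTGGGG", "TGCACCCC", "ACGTTTTT", "NNNNAAAA"]
assigned, unassigned = demultiplex(reads, index_to_sample, index_length=4)
print(assigned)
print(unassigned)
```

Exact matching is the simplest policy; a set of indexes designed to tolerate sequencing errors would allow a nearest-index assignment instead.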
In some embodiments, the invention provides compositions, methods, and kits for determining target polynucleotide copy number. Copy number variation ("CNV") is associated with gene regulation and human disease. CNV can be assessed using a first complementary probe and a second complementary probe for each potential CNV locus as well as for one or more additional loci. The probes can include barcodes as described above. The relative amount of each sequence can be determined using, for example, next-generation sequencing, where the reads of the CNV locus (target polynucleotide) relative to those of a single-copy target polynucleotide can be used to estimate the copy number of the CNV locus (target polynucleotide). CNV is determined by comparing a sample having a known CN and/or CNV with an unknown sample, or by comparison with a known reference amount. For example, if a sample has two copies of the target polynucleotide sequence, the total number of sequence reads, when normalized to a control value, will indicate two copies of the target polynucleotide sequence, whereas a sample having four copies of the target polynucleotide sequence will yield 4 times the number of sequence reads relative to the normalization sample. A sample lacking all copies will produce no sequence reads.
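The read-ratio estimate described above can be written as a short sketch. It assumes that read counts are proportional to copy number and that the CNV locus and the reference locus are detected with equal efficiency; neither assumption is guaranteed by the text, and the function name and example counts are hypothetical.

```python
def estimate_copy_number(target_reads, reference_reads, reference_copies=1):
    """Estimate copy number of a CNV locus from read counts.

    target_reads:     reads assigned to the CNV locus
    reference_reads:  reads assigned to a reference locus of known copy number
    reference_copies: known copy number of the reference locus (1 = single copy)
    """
    if reference_reads <= 0:
        raise ValueError("reference_reads must be positive")
    return reference_copies * target_reads / reference_reads


# Hypothetical counts: 4x the reads of a single-copy reference suggests 4 copies.
print(estimate_copy_number(400, 100))
# A locus with no reads is estimated as absent.
print(estimate_copy_number(0, 100))
```

In practice the ratio would usually be compared against samples of known copy number, as the text describes, rather than read directly as an integer.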
In some embodiments, this CNV detection is extended to determining the amount of target polynucleotide present in a sample.
In some embodiments, the first complementary probe and the second complementary probe are separated by one or more nucleotides when hybridized to the sequence of interest. The gap can be a single nucleotide or more than one nucleotide. When the probes are hybridized to the sample nucleic acid, the 3' end of the first probe can be extended. In such cases, the sample nucleic acid serves as a template to direct the type of modification, for example through the base-pairing interactions that occur during polymerase-based extension of the first probe to incorporate one or more nucleotides in a gap-filling step. Once the gap-filling step is complete, the 3' end of the first complementary probe can be joined to the adjacent 5' end of the second complementary probe, for example by enzymatic ligation, as described above. The resulting polynucleotide product is then analyzed as described above for embodiments without a gap-filling step.
In variations in which the product polynucleotides are sequenced by next-generation sequencing technology, PCR primers can be used to generate sequences used in conjunction with a particular sequencing technology, for example to add adapter sequences that promote binding of the product polynucleotide to surface-bound DNA oligonucleotides in an Illumina NGS flow cell.
In variations in which the product polynucleotides are analyzed using an array readout, PCR primers can also be used to generate sequences used in conjunction with a particular array, for example to add linker sequences that promote binding of the product polynucleotide to DNA oligonucleotides (capture probes) on an array (such as microarrays from Affymetrix, Inc., Santa Clara, California, or microarrays from Illumina, Inc., San Diego, California).
In some embodiments, the probe, target nucleotide, or product nucleotide is attached to a solid support.
Improved sample index primers
As described herein, some embodiments add a sample index to the product polynucleotide by incorporating the sample index sequence into a primer sequence (e.g., a PCR primer). Although many possible different sample index sequences can be used (i.e., 4^15 different 15mer sequences), creating an optimized set requires not only that the index sequences be distinguishable from one another (e.g., ensuring that sample indexes are not miscalled even in the event of a base sequencing error), but also, desirably, that compatibility issues be addressed and that the set be optimized within the overall assay. In embodiments in which the sample index is added by incorporating it into an oligonucleotide used as a primer (e.g., for PCR), the aspects of compatibility and optimization can include such amplification. Other considerations related to the overall assay include the number of samples to be processed and the potential flexibility of the sample index primers. For example, 15mer sample indexes can be designed so that the first 12 bases can be used in cases where the number of samples to be processed is small and the full indexing capacity of the available 15mer pool is not needed, so that the 15mers can be treated as 12-mers to further optimize the overall assay (e.g., in sequencing detection embodiments, only the first 12 bases need to be sequenced to identify the sample index, saving time and reagents). Methods of identifying usable sequences include multiple steps outlined in this disclosure. In certain embodiments, one such step can be identifying and removing, from the previously identified set of possible sequences, those unusable sequences that would otherwise interfere with assay performance. In addition, it is also important to identify and remove sequences that may only occasionally be problematic, because such sequences may pass initial empirical testing yet perform suboptimally under certain assay conditions.
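Two of the properties discussed above, distinguishability under sequencing error and usability of the first 12 bases of a 15mer as a 12-mer index, can be checked with a minimum pairwise Hamming distance. The sketch below is illustrative only; the function names and example sequences are hypothetical, and the disclosure does not state that Hamming distance is the metric used. As a general coding-theory fact, a set with minimum Hamming distance d can detect up to d - 1 substitution errors and correct up to (d - 1) // 2.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(indexes, prefix_length=None):
    """Smallest Hamming distance between any two indexes.

    If prefix_length is given, only the first prefix_length bases are compared,
    which models reading a 15mer index as a shorter (e.g., 12-mer) index.
    """
    trimmed = [s[:prefix_length] for s in indexes] if prefix_length else list(indexes)
    return min(hamming(a, b) for a, b in combinations(trimmed, 2))


# Size of the unrestricted 15mer space mentioned in the text.
print(4 ** 15)

# Hypothetical indexes, shortened to 5 bases for readability.
example = ["AAAAC", "AAAAG", "CCCCC"]
print(min_pairwise_distance(example))                   # full-length comparison
print(min_pairwise_distance(example, prefix_length=4))  # prefix-only comparison
```

In this toy example the first two indexes differ only in their last base, so they become indistinguishable when only the prefix is read; a set intended for use as both 15mers and 12-mers would need an acceptable minimum distance at both lengths.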
In certain embodiments, 73536 indexes can be used in 384-well microtiter plate format, sufficient for 169 sample plates. In other embodiments, the first 16128 15mer indexes of the 65280 indexes can also be used as 12-mers in 384-well microtiter plate format, sufficient for 42 sample plates. These indexes are optimized not only as a whole set but also sample plate by sample plate (e.g., 1-384, 385-768, 769-1152, etc.). Although a small number of these indexes may encounter unexpected problems in actual testing and may need to be replaced, replacing a small number of indexes with other indexes on a per-sample-plate basis is unlikely to affect any optimization (e.g., as long as 99% of the sequences are retained, or 3-4 sequences are substituted in a 384-sample plate, the degree of optimization should not be reduced).
Many factors have an important influence on selecting an optimized sample index set, including maximizing orthogonality, maximizing specificity, and ensuring compatibility with other assay components. It is desirable to maximize orthogonality within a set of sample indexes not only with respect to the sample index sequences themselves but also within particular assay steps. For example, for a sample index added to the product polynucleotide in a PCR step, maximal orthogonality takes into account not only the sample index sequences themselves but also the sequences of the PCR primers. There may also be other sequences against which orthogonality should be maximized. For example, if the PCR step is used not only to add the sample index sequence but also to add adapter sequences for subsequent use in the assay (e.g., next-generation sequencing flow cell adapter sequences), then it is desirable to maximize orthogonality with respect to the sample index, the primer sequence, and the flow cell adapter sequence. Maximizing specificity is also an important consideration, and avoiding homopolymers (e.g., avoiding use of the same base at 3 consecutive positions in the sample index) and normalizing the GC content to within a desired range (e.g., 40% to 60%, 42% to 58%, 44% to 56%, etc., as desired or required by a particular embodiment) are also important. Other assay components also need to be considered in the optimization process, such as the nucleic acid sequences used for detection in the assay, for example next-generation sequencing library construction sequences.
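The two sequence-composition rules named above (no run of 3 identical consecutive bases, and GC content within a chosen range such as 40% to 60%) lend themselves to a simple filter. The following is a minimal sketch with hypothetical function names and example sequences; it covers only these two rules and does not model orthogonality against primers or adapters.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_homopolymer(seq, run_length=3):
    """True if seq contains run_length or more identical consecutive bases."""
    run = 1
    for previous, current in zip(seq, seq[1:]):
        run = run + 1 if current == previous else 1
        if run >= run_length:
            return True
    return False


def passes_filters(seq, gc_min=0.40, gc_max=0.60, run_length=3):
    """Apply the GC-range and homopolymer rules to one candidate index."""
    return gc_min <= gc_fraction(seq) <= gc_max and not has_homopolymer(seq, run_length)


# Hypothetical candidates: the first passes, the second fails both rules.
candidates = ["ACGTACGTACGT", "AAATTTAAATTT"]
print([seq for seq in candidates if passes_filters(seq)])
```

The GC bounds are parameters so that a narrower range (for example 0.44 to 0.56) can be applied where an embodiment requires it.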
As is known to those skilled in the art, matching oligonucleotide design characteristics to the chemical environment or assay conditions is important for specificity. Further, specificity in such assays is known to be influenced by multiple determinants. Several non-limiting examples of assay elements that can be varied to alter specificity include the concentration of solvents such as DMSO, ion concentrations (monovalent ions such as K+ or Na+, or divalent ions such as Mg++), the concentration of oligonucleotides, the time of interaction, and the assay temperature and/or the temperature of different components in the assay.
In certain embodiments, as a non-limiting example, attention is focused on temperature as a specificity determinant, because the general correlation is that lower temperatures are usually associated with lower specificity and higher temperatures with higher specificity. Therefore, the different temperature ranges described herein should be regarded as representing ranges of higher and lower specificity rather than strict temperatures.
In exemplary embodiments, PCR reactions are typically run with annealing and extension at 60°C. Primers designed to operate at this temperature typically show lower amplification efficiency when run at a higher temperature such as 65°C, which reduces product yield. Primers designed for optimal performance at 65°C give good yields when run at 65°C; however, the design characteristics of primers designed for 65°C differ slightly from those of primers designed for 60°C. Specifically, the primers are designed so that they bind more stably to the target sequence. As those skilled in the art know, many design criteria can be used to predict binding, or combinations of different designs can be determined empirically. These criteria typically relate to sequence composition (GC content), free energy (ΔG) values, and the length or number of matched base pairs between two complementary strands. A similar pattern applies to undesired off-target effects. Sequence motifs that can give rise to undesired products, such as primer-dimer products, may be more complex in reactions at lower temperature or lower specificity. Of particular interest are sequence motifs in which several bases at the 3' end of an oligonucleotide are perfectly or nearly perfectly complementary to a region of another oligonucleotide in the assay or of the oligonucleotide itself (Figures 11A and 11B and Figures 12A and 12B). The ΔG value of the problematic dimer product, or the length of the complementary portion, varies inherently with temperature (or other specificity determinants). Lower temperatures allow non-specific amplification of dimers with lower complementarity, which is highly correlated with shorter regions of 3'-end complementarity. Therefore, when the assay is run at 65°C rather than 60°C, more complementary bases are required for non-specific amplification to occur. Further, it has been determined that even when the complementary bases are not precisely at the 3' end, a sufficiently long self-complementary sequence between a portion of the sample index primer and the 3' end of the sample index primer can cause the primer to dimerize.
In certain embodiments, under the assay reaction conditions used, motifs with 7 bp of perfectly matched 3' complementarity, or 9 bp with one mismatch, can be tolerated in some cases but are unacceptable in others, depending on the additional complementarity across the whole dimer molecule. This is therefore a useful motif for identifying other usable sequences, because it identifies many possible oligonucleotides that perform poorly under most assay conditions, and it also identifies sequences that may function under one set of conditions but readily fail under conditions of slightly lower specificity. Such sequences can arise particularly, but not exclusively, from 3' complementarity spanning different "regions" within the oligonucleotide, for example where part but not all of the complementarity is due to a variable region in the oligonucleotide. One example provided herein is the "barcode" portion (Figures 11A and 11B and Figures 12A and 12B).
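The 3' complementarity motif described above (7 bp perfectly matched, or 9 bp with one mismatch) can be screened for with a simple string comparison. The sketch below is an illustration under stated assumptions: it checks only whether the 3' end of one oligonucleotide can pair with some region of another (or of itself), it does not compute ΔG or the additional complementarity across the whole dimer, and the function names and example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_three_prime_complementarity(oligo, other, length, max_mismatches=0):
    """True if the last `length` bases of oligo can pair with a region of other.

    Pairing is antiparallel, so the region of `other` must match the reverse
    complement of the 3' tail of `oligo`, allowing up to max_mismatches.
    """
    if len(oligo) < length:
        return False
    site = reverse_complement(oligo[-length:])
    for start in range(len(other) - length + 1):
        window = other[start:start + length]
        mismatches = sum(a != b for a, b in zip(site, window))
        if mismatches <= max_mismatches:
            return True
    return False


def flag_dimer_risk(oligo, other):
    """Flag the motif from the text: 7 bp perfect, or 9 bp with one mismatch."""
    return (has_three_prime_complementarity(oligo, other, 7, 0)
            or has_three_prime_complementarity(oligo, other, 9, 1))


# Hypothetical primers: the 3' tail GACCTGA pairs with TCAGGTC in the second oligo.
primer = "TTTTTTTTGACCTGA"
print(flag_dimer_risk(primer, "AAAATCAGGTCAAAA"))
print(flag_dimer_risk(primer, "AAAAAAAAAAAAAAA"))
```

Passing the same sequence as both arguments screens for self-dimer formation. A flagged pair is a candidate for closer evaluation rather than automatic rejection, since the text notes that such motifs are tolerated in some cases.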
Running the assay under conditions (for example, at 65 °C) that tolerate longer complementary regions before failing (for example, 7 bp, or 9 bp including 1 mismatch) is highly desirable compared with conditions (for example, at 60 °C) that fail when shorter 3' complementary regions (for example, 5 bp or 6 bp) occur. Tolerance of longer motifs greatly reduces the number of oligonucleotide sequences that must be removed from the pool of useful sequences.
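The motif screen described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the thresholds (7 bp perfect match, 9 bp with one mismatch) are taken from the text, mismatches are allowed at any position, and ΔG is not considered, although the text notes that overall dimer stability also matters.

```python
# Hypothetical sketch: flag oligonucleotide pairs whose 3' end is complementary
# to a region of another oligonucleotide (or of the same oligonucleotide).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_3prime_complement(oligo: str, other: str, length: int,
                          max_mismatches: int = 0) -> bool:
    """True if the last `length` bases of `oligo` can pair with any window of
    `other`, allowing up to `max_mismatches` mismatched positions."""
    if len(oligo) < length or len(other) < length:
        return False
    # The 3' tail pairs antiparallel, so compare its reverse complement
    # against every window of the other strand.
    tail_rc = reverse_complement(oligo[-length:])
    other = other.upper()
    for start in range(len(other) - length + 1):
        window = other[start:start + length]
        mismatches = sum(a != b for a, b in zip(tail_rc, window))
        if mismatches <= max_mismatches:
            return True
    return False


def flag_dimer_risk(oligo: str, other: str) -> bool:
    """Flag the motifs named in the text: 7 bp perfect or 9 bp with 1 mismatch."""
    return (has_3prime_complement(oligo, other, 7, 0)
            or has_3prime_complement(oligo, other, 9, 1))
```

Passing the same sequence as both arguments screens for self-complementarity. A flagged pair is a candidate for removal or closer inspection, not a definitive failure, since the text states that such motifs are tolerated in many cases.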
Likewise, the assay can be run at an annealing/extension temperature of 70 °C, which further limits off-target effects but places additional constraints on the design.
The set of 15-mer sample index bar codes of the present disclosure incorporates a number of specific design considerations that together yield an optimal set of indexes, as well as various subsets, for the given reaction conditions and for other conditions of similar specificity. The process used to identify the set is also not limited to the specific set disclosed herein, because similar but not identical sets can be developed with the same process, through minor modifications of the present disclosure, for assay conditions of higher or lower specificity.
Genotyping methods for detecting target polynucleotides in polyploid samples
As described herein, genotyping methods (and the associated data analysis) are used to detect the presence or absence of a target polynucleotide in a polyploid sample. In certain embodiments, the target polynucleotide can be the result of a SNP or a deletion/insertion event (indel). In complex genetic situations, such as polyploid samples, the presence of uninformative genomes increases the number of sequence reads needed per site and per sample when sequence data are generated for a genome of interest. Ploidy reduction strategies are described herein to reduce the generation of sequence data from uninformative genomes. An exemplary ploidy reduction concept that uses a marker SNP close to a subgenome-specific HSV SNP is described herein (see Figure 13). The ploidy reduction methods described herein can be used alone or in combination with the probes described herein.
In an exemplary first method, public and proprietary wheat genome sequence data are obtained that include information on the target SNP/indel and on proximal SNPs/indels. Based on knowledge of the position of the proximal SNP/indel relative to the target/marker SNP/indel, probes are designed to be selective for the genome of interest, using a proximal SNP/indel destabilization strategy to reduce ploidy (organismal complexity). Targets that have been genotyped on Axiom and show diploid clustering are selected. It is ensured that no proximal SNP/indel is present within 9 bases on either side of the target marker (see Figures 14A-C).
In one embodiment, one form of the first complementary probe is designed to be complementary to the target sequence with the SNP/indel (LHS), and the other form of the first complementary probe is designed to be complementary to the target sequence without the SNP/indel (LHS'). The second complementary probe (RHS) lies immediately 3' of the sequence of both forms of the first complementary probe. A proximal SNP present near the target SNP produces a destabilizing effect that prevents ligation. As a result, selection for the genome of interest is achieved (that is, a target genome having the proximal SNP will yield few sequence reads). Incorporating the proximal SNP into the probe design allows sites that would otherwise yield no reads to function fully (see Figures 15A and 15B).
In an exemplary second method, public and proprietary wheat genome sequence data are obtained that include information on the target SNP/indel and on proximal SNPs/indels. Based on knowledge of the relative positions of the proximal SNP/indel and the target marker SNP/indel, a blocking oligonucleotide complementary to the target genome having the proximal SNP/indel is added to prevent the RHS from hybridizing to that target genome.
In one embodiment, one form of the first complementary probe is designed to be complementary to the target sequence with the SNP/indel (LHS), and the other form of the first complementary probe is designed to be complementary to the target sequence without the SNP/indel (LHS'). The second complementary probe (RHS) lies immediately 3' of the sequence of both forms of the first complementary probe. A blocking/competing oligonucleotide complementary to the target sequence that includes the proximal SNP/indel is added. The blocking oligonucleotide prevents the RHS from hybridizing to the target DNA. As a result, selection for the genome of interest is achieved (that is, a target genome having the proximal SNP will yield few or no sequence reads). By adding the blocking oligonucleotide, sites at which the proximal SNP would otherwise result in no reads are accommodated in the probe design (see Figures 16A and 16B). This method is suitable when the proximal SNP lies between bases 1 and 10 from the target marker. It cannot destabilize secondary polymorphisms.
In an exemplary third method, public and proprietary wheat genome sequence data are obtained that include information on the target SNP/indel and on proximal SNPs/indels. Based on knowledge of the relative positions of the proximal SNP/indel and the target marker SNP/indel, PCR primers are designed and a pre-PCR amplification step is added that selectively amplifies the unique genome or subgenome of interest. In this method, one or both of the PCR primers can be complementary to the target genome sequence having the proximal SNP/indel. This unique pre-PCR amplification step can be run as a parallel workflow (that is, the sample is split into two portions).
In one embodiment, one form of the first complementary probe is designed to be complementary to the target sequence with the SNP/indel (LHS), and the other form of the first complementary probe is designed to be complementary to the target sequence without the SNP/indel (LHS'). The second complementary probe (RHS) lies immediately 3' of the sequence of both forms of the first complementary probe. A pre-PCR amplification step is added that uses PCR primers designed to selectively amplify the unique genome or subgenome of interest. The proximal SNP/indel destabilizes hybridization of the PCR primers. Selection for the desired unique genome of interest is thereby achieved (that is, unwanted genomes containing the proximal SNP are eliminated from the subsequent workflow). Combinations of multiple PCR primer sets and probe sets are used to accommodate the various combinations of sites and genomes (see Figures 17A and 17B).
In certain embodiments, a 600X mean coverage method can be used for a small number of selected markers. This method requires a parallel workflow (that is, the sample is split into two portions). In one embodiment, the sample is separated according to markers associated with a neighboring proximal SNP or single-base indel, and for the affected markers, in contrast to sequencing to the 200X coverage used in the other methods, the coverage of the separated portion containing the affected markers is increased to 600X (that is, the compensation is made with additional sequencing time and expense rather than in the upfront Eureka portion). This approach has been used in other situations, such as deep sequencing with RNA-Seq to aid detection of rare transcripts, so in a different context this would be the Eureka equivalent.
The choice of method depends on the characteristics of the three wheat genomes at the proximal SNP position relative to the target SNP. The selective blocking method may be compatible with a single probe set and workflow format.
Method for detecting target RNA without conversion to cDNA.
RNA analysis is often biased because the RNA must be converted to cDNA before analysis. The methods described herein involve detecting target RNA directly, without converting it to cDNA. Detection of target RNA includes, but is not limited to, interrogating exon boundaries, which can detect alternative splicing and splice variants of mRNA transcripts; detecting fusion genes (portions of at least two separate genes); and more general expression analysis for detecting mRNA transcripts. In certain embodiments, the method for detecting target RNA uses next-generation sequencing and can simultaneously detect thousands of sites in hundreds of thousands of RNA samples. The method for detecting target RNA is based on ligation-dependent PCR amplification and uses inquiry site probes together with sample index bar codes added during PCR amplification. In an exemplary embodiment, the utility of the method is demonstrated by performing a highly multiplexed reaction in which a commercially available DNA ligase ligates DNA probes hybridized to an RNA template. The ligation product is PCR amplified. Next-generation sequencing data are generated from the resulting PCR products. Each read is assigned a sample (based on the sample index) and a site. Examining the sequencing data generated from the PCR products reveals splice variants or fusions of mRNA transcripts and the expression of mRNA transcripts. Example 13 and Figures 19-21 show results obtained with a 778-plex probe set designed to interrogate RNA from housekeeping genes and from exons of human genes selected for cancer fusion detection. The expected exon junctions are present in the housekeeping genes. As described herein, the methods for detecting and interrogating RNA targets (and the associated data analysis) are used in targeted studies such as expression analysis, allele-specific expression analysis, alternative splicing analysis, and fusion detection. This method of detecting RNA directly is a simplified assay that also eliminates the bias introduced by converting RNA to cDNA.
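The step in which each sequencing read is assigned a sample (from the sample index) and a site (from the inquiry site bar code) can be sketched as follows. This is a minimal illustration, assuming that each read has already been parsed into its two bar code sequences and that exact matching is used; the function name, the bar code sequences, and the site labels are all hypothetical and are not taken from the disclosure.

```python
from collections import Counter


def assign_reads(reads, sample_indexes, site_barcodes):
    """Count reads per (sample, site) pair.

    reads          -- iterable of (sample_index_seq, site_barcode_seq) tuples
    sample_indexes -- dict mapping sample index sequence to sample name
    site_barcodes  -- dict mapping inquiry site bar code sequence to site name
    Returns (Counter keyed by (sample, site), number of unassigned reads).
    """
    counts = Counter()
    unassigned = 0
    for sample_seq, site_seq in reads:
        sample = sample_indexes.get(sample_seq)
        site = site_barcodes.get(site_seq)
        if sample is None or site is None:
            # Either bar code is unknown; exact matching only in this sketch.
            unassigned += 1
            continue
        counts[(sample, site)] += 1
    return counts, unassigned


# Hypothetical bar codes and labels, for illustration only.
samples = {"AACCGGTTAACCGGT": "sample_1", "TTGGCCAATTGGCCA": "sample_2"}
sites = {"ACACACGTGTGT": "geneX_exon2_exon3", "TGTGTGCACACA": "geneY_geneZ_fusion"}
reads = [
    ("AACCGGTTAACCGGT", "ACACACGTGTGT"),
    ("AACCGGTTAACCGGT", "ACACACGTGTGT"),
    ("TTGGCCAATTGGCCA", "TGTGTGCACACA"),
    ("TTGGCCAATTGGCCA", "GGGGGGGGGGGG"),  # unknown site bar code
]
counts, unassigned = assign_reads(reads, samples, sites)
```

The resulting per-sample, per-site counts are the kind of table from which junction presence or relative expression could then be assessed; tolerance for sequencing errors in the bar codes is deliberately left out of this sketch.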
Detection based on sequencing.
In one exemplary application of the method, sequencing is performed using next-generation sequencing (for example, Illumina sequencing). The product polynucleotides can be sequenced directly, or a strand complementary to the product polynucleotides, copied or generated in the assay, can be sequenced. When the method is performed, the first complementary probe and/or the second complementary probe can include a universal primer sequence. In such cases, adapters can be added to the product polynucleotides (or reaction products) by PCR or by other methods of copying and/or amplifying the product polynucleotides; the adapters are used to attach the product polynucleotides to an Illumina sequencing flow cell. Flow cell adapters can also be added to the product polynucleotides by other techniques known in the art (for example, ligation).
In several embodiments, an Illumina flow cell with eight or more lanes is used as the solid support. Each lane can hold more than 300 million amplified clusters and can therefore be used for high-throughput analysis. In other embodiments, other flow cells that hold a different number of amplified clusters are used.
Sequencing technologies that can be used in the disclosed methods include next-generation sequencing technologies such as ion semiconductor sequencing (for example, Ion Torrent sequencing), pyrosequencing (for example, 454 sequencing), sequencing by ligation (for example, SOLiD sequencing), sequencing by synthesis (for example, Illumina sequencing), and single-molecule real-time sequencing (for example, Pacific Biosciences).
Detection based on array.
In several embodiments, the product polynucleotides described herein are detected using arrays (for example, in hybridization array-based analysis of the product polynucleotides). Exemplary arrays include chips or planar substrates, bead arrays, liquid-phase arrays, "zip code" arrays, microarrays, and the like. Materials suitable for constructing arrays, such as nitrocellulose, glass, silicon wafers, and optical fibers, are known to those skilled in the art.
Kit.
The present disclosure provides kits for performing any of the methods disclosed herein.
In several embodiments, the present invention provides kits for determining the presence, absence, or characteristics of one or more target polynucleotides in a sample and/or for determining genotype. In several embodiments, the kit includes a plurality of first complementary probes and second complementary probes, each first complementary probe having a sequence portion complementary to a first target sequence and a sequence portion non-complementary to the first target sequence, wherein the non-complementary portion includes an inquiry site bar code sequence and an adjacent universal sequence, and each second complementary probe having a sequence portion complementary to a second target sequence and an adjacent sequence portion non-complementary to the second target sequence; and buffers and enzymes for ligation and enrichment.
The first complementary probe can have sequence complementary to the first target sequence located 5' of the non-complementary inquiry site bar code of the first complementary probe and sequence complementary to the first target sequence located 3' of the non-complementary inquiry site bar code.
In several embodiments, the kit includes at least one PCR primer, a polymerase, and a set of dNTPs for enrichment/amplification.
In several embodiments, the kit includes a ligase.
In several embodiments, the kit includes a license to use the software required to interpret the sequence data.
In several embodiments, the kit includes instructions for use.
The first complementary probe and the second complementary probe can be provided in dry form (for example, lyophilized). If provided in dry form, the probes can be dried with a preservative (for example, trehalose).
Composition.
The present disclosure provides compositions for performing any of the methods disclosed herein.
In several embodiments, a composition is provided for detecting the presence, absence, amount, or characteristics of one or more targets in one or more samples. The composition includes a plurality of first complementary probes and second complementary probes, wherein (i) each first complementary probe has two sequence portions complementary to different parts of the first target sequence and two sequence portions non-complementary to the first target sequence, wherein the non-complementary portions include an inquiry site bar code sequence and a universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence and an adjacent sequence portion that is non-complementary to the second target sequence and includes a universal sequence. The first complementary probe includes two portions complementary to the target sequence, located both 3' and 5' of the inquiry site bar code. The composition can be solution-based, attached to a solid support, or both.
In several embodiments, one part of the complementary portion of the first complementary probe is 5' of the non-complementary inquiry site bar code sequence, and another part of the first complementary probe is 3' of the non-complementary inquiry site bar code sequence. The non-complementary inquiry site bar code sequence can be described as being "anchored" to the target by the 5' and 3' complementary sequences of the first complementary probe. The length of the non-complementary inquiry site bar code sequence can be about 10 to 16 nucleotides, for example 10, 11, 12, 13, 14, 15, or 16 nucleotides.
Bioinformatics
The sequence of the product polynucleotides is determined by direct sequencing or by sequencing the complementary sequence. The methods described herein can be used to generate sequencing data that can be analyzed by mathematical algorithms to determine the presence or absence of specific SNPs, indels, and other mutations; whether a specific site is heterozygous or homozygous; whether a specific transcript is present; the copy number of a specific target polynucleotide; and/or other features of the target polynucleotide.
In many cases, the genotype of a sample can be determined by analyzing the number of reads assigned (by comparing inquiry site bar codes) to each allele at the site, and determining whether, in each sample, the ratio of the number of reads assigned to the A allele to the number of reads assigned to the B allele indicates that the genotype of the sample is AA, AB, or BB, or that the genotype cannot be determined.
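The read-ratio genotype call described above can be illustrated with a short Python sketch. The thresholds below (minimum read depth, homozygous cutoff, heterozygous window) are hypothetical values chosen only for illustration; the disclosure does not specify them, and a real analysis would likely set them empirically or use cluster-based calling.

```python
def call_genotype(a_reads: int, b_reads: int,
                  min_reads: int = 20,
                  hom_cutoff: float = 0.9,
                  het_low: float = 0.35,
                  het_high: float = 0.65) -> str:
    """Call AA, AB, BB, or 'no call' for one sample at one site.

    a_reads / b_reads are the numbers of reads assigned to the A and B
    alleles through their inquiry site bar codes. All thresholds are
    assumptions made for this sketch.
    """
    total = a_reads + b_reads
    if total < min_reads:
        return "no call"          # too few reads to decide
    frac_a = a_reads / total      # fraction of reads supporting allele A
    if frac_a >= hom_cutoff:
        return "AA"
    if frac_a <= 1.0 - hom_cutoff:
        return "BB"
    if het_low <= frac_a <= het_high:
        return "AB"
    return "no call"              # ratio falls between the expected clusters
```

For example, `call_genotype(48, 52)` falls inside the heterozygous window, whereas a sample with only a handful of reads, or with an intermediate ratio such as 80:20, is left uncalled rather than forced into a genotype.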
Utility.
The compositions, methods, and kits described herein are used to analyze the presence, absence, amount, or characteristics of multiple target polynucleotides in a large number of samples in a single assay.
Typically, multiple sets of first complementary probes and second complementary probes are provided in a single assay to assess the presence, absence, amount, or characteristics (for example, polymorphisms) of multiple sequences. In several embodiments, multiple polymorphisms are determined in multiple samples in a single assay. In several embodiments, the compositions, methods, and kits described herein are suitable for genotyping and can involve next-generation sequencing (NGS) technology to generate genotypes for a large number of samples and sites simultaneously in a single assay.
Clauses of the invention
1. A method for determining the presence, absence, or amount of one or more target polynucleotides in two or more samples, the method comprising the steps of:
(a) providing two or more samples, each sample comprising one or more target polynucleotides, each target polynucleotide comprising a first target sequence and a second target sequence;
(b) providing a plurality of first complementary probes and second complementary probes, wherein (i) each first complementary probe has a sequence portion complementary to the first target sequence and a sequence portion non-complementary to the first target sequence, wherein the non-complementary portion comprises an inquiry site bar code sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence and an adjacent sequence portion non-complementary to the second target sequence;
(c) incubating the plurality of first complementary probes and second complementary probes with each individual sample under hybridization conditions such that the first complementary probes and second complementary probes hybridize to their complementary target polynucleotides in the sample to form hybridization complexes;
(d) joining the first complementary probe and the second complementary probe hybridized to the first target sequence and the second target sequence in the sample to form a product polynucleotide;
(e) pooling the product polynucleotides formed from the individual samples; and
(f) determining the presence or absence of the target polynucleotide in one or more samples by analyzing the product polynucleotides or their complementary strands.
2. The method of clause 1, wherein the first complementary probe and the second complementary probe are complementary to a first target sequence and a second target sequence that are immediately adjacent to each other.
3. The method of clause 1, wherein the first complementary probe and the second complementary probe are complementary to a first target sequence and a second target sequence that are adjacent and separated by 1 to 500 nucleotides.
4. The method of any one of clauses 1-3, wherein the adjacent non-complementary portion of the second complementary probe comprises a universal sequence.
5. The method of any one of clauses 1-4, wherein the adjacent universal sequence of the first complementary probe comprises a universal primer sequence complementary to a primer sequence, the primer sequence being usable to add one or more of (i) a sample index, (ii) an additional sequence for generating sequence data or for another form of detection, (iii) an additional sequence, or (iv) another moiety.
6. The method of clause 4, wherein the adjacent universal sequence of the second complementary probe comprises a universal primer sequence complementary to a primer sequence, the primer sequence being usable to add one or more of (i) a sample index, (ii) an additional sequence for generating sequence data or for another form of detection, (iii) an additional sequence, and (iv) another moiety.
7. The method of any one of clauses 1-6, wherein there is sequence complementary to the first target sequence located 5' of the non-complementary inquiry site bar code of the first complementary probe and sequence complementary to the first target sequence located 3' of the non-complementary inquiry site bar code of the first complementary probe.
8. The method of any one of clauses 1-6, wherein there is sequence complementary to the target sequence at both the 3' end and the 5' end of the inquiry site bar code.
9. The method of any one of clauses 1-8, wherein the adjacent universal sequence of the first complementary probe is 5' of the complementary sequence, and the complementary sequence is 5' of the non-complementary inquiry site bar code of the first complementary probe.
10. The method of any one of clauses 5-9, wherein the universal primer sequence comprises a PCR primer sequence and a primer sequence for adding an additional sequence that facilitates generating sequence data or another form of detection.
11. The method of clause 10, wherein the additional sequence for generating sequence data or for another form of detection is an adapter for next-generation sequencing.
12. The method of clause 10, wherein the additional sequence for generating sequence data or for another form of detection is a capture sequence for capture on a solid surface.
13. The method of any one of clauses 5-12, wherein the primer sequence is effective for adding a moiety that facilitates sequence generation.
14. The method of any one of clauses 1-13, wherein the non-complementary inquiry site bar code is 10, 11, 12, 13, 14, 15, or 16 nucleotides in length.
15. The method of any one of clauses 5-14, wherein the sample index is 10, 11, 12, 13, 14, 15, or 16 nucleotides in length.
16. The method of clause 14, wherein the inquiry site bar code is 12 or 15 nucleotides in length.
17. The method of clause 15, wherein the sample index is 12 or 15 nucleotides in length.
18. The method of any one of clauses 1-17, wherein the inquiry site bar code or sample index sequence is selected from the group consisting of SEQ ID NO:1 to SEQ ID NO:384.
19. The method of any one of clauses 1-18, wherein there is a step prior to the incubating, the step comprising heating at a temperature of 70 °C to 100 °C.
20. The method of any one of clauses 1-19, further comprising an enriching step in which the product polynucleotides are enriched.
21. The method of clause 20, wherein the enriching comprises: (a) providing a set of PCR primer sequences comprising a first primer complementary to the primer sequence in the first complementary probe and a second primer complementary to the PCR primer sequence in the second complementary probe, and (b) amplifying the product polynucleotides.
22. The method of any one of clauses 1-21, wherein the method is a solution-based method.
23. The method of any one of clauses 1-22, wherein the first complementary probe comprises an inosine located 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases from the 3' end of the probe.
24. The method of any one of clauses 1-23, wherein the second complementary probe comprises an inosine located 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases from the 5' end of the probe.
25. The method of any one of clauses 1-24, wherein the first complementary probe and the second complementary probe are complementary to the first target sequence and the second target sequence, and the 3' end of the first complementary probe is complementary to one form of a single nucleotide polymorphism (SNP) or other genetic variation.
26. The method of any one of clauses 1-25, wherein the joining comprises treating the first complementary probe and the second complementary probe hybridized to the first target sequence and the second target sequence of the target polynucleotide (the hybridization complex) with a ligase to form the product polynucleotide.
27. The method of any one of clauses 1-26 for use in genotyping, comprising providing one or more variants of the first complementary probe, wherein the variants differ in the identity of one or more nucleotides at the 3' end of the first complementary probe, and wherein the determining comprises quantifying the relative frequency of one or more variants of the first complementary probe compared with other variants of the first complementary probe and correlating the frequency with a genotype.
28. The method of any one of clauses 1-27 for determining copy number variation of the target polynucleotide, wherein the determining comprises comparing the amount of signal generated from the product polynucleotide or its complementary strand with a known reference signal amount or with the amount of signal generated from another product polynucleotide or its complementary strand.
29. A composition for determining the presence, absence, or amount of one or more target polynucleotides in a sample, comprising: a plurality of first complementary probes and second complementary probes, wherein (i) each first complementary probe has two sequence portions complementary to different parts of the first target sequence and two sequence portions non-complementary to the first target sequence, wherein the non-complementary portions comprise an inquiry site bar code sequence and a universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence and an adjacent sequence portion that is non-complementary to the second target sequence and comprises a universal sequence.
30. The composition of clause 29, wherein the first complementary probe comprises sequence complementary to the first target sequence located 5' of the non-complementary inquiry site bar code of the first complementary probe and sequence complementary to the first target sequence located 3' of the non-complementary inquiry site bar code of the first complementary probe.
31. The composition of clause 29, wherein the first complementary probe comprises sequence complementary to the target sequence at both the 3' end and the 5' end of the inquiry site bar code.
32. The composition of any one of clauses 29-31, wherein the universal sequences of the first complementary probe and the second complementary probe each comprise a primer sequence capable of hybridizing to a primer for synthesizing sequence.
33. The composition of clause 32, wherein the primer sequence comprises a PCR primer sequence.
34. The composition of any one of clauses 29-33, wherein the universal sequence comprises a primer sequence capable of adding one or more of (a) a sample index, (b) an additional sequence, (d) an additional sequence for generating sequence data or for another form of detection, and (e) another moiety.
35. The composition of any one of clauses 29-34, wherein the adjacent universal sequence of the first complementary probe is 5' of the complementary sequence, and the complementary sequence is 5' of the non-complementary inquiry site bar code of the first complementary probe.
36. The composition of clause 34, wherein the universal sequence is a PCR primer sequence.
37. The composition of clause 34, wherein the additional sequence for generating sequence data or for another form of detection is an adapter for next-generation sequencing.
38. The composition of clause 34, wherein the additional sequence for generating sequence data or for another form of detection is a capture sequence.
39. The composition of any one of clauses 29-38, wherein the universal sequence comprises a primer sequence that provides for adding a sample index.
40. The composition of any one of clauses 29-39, wherein the inquiry site bar code is 10, 11, 12, 13, 14, 15, or 16 nucleotides in length.
41. The composition of clause 39 and clause 40, wherein the sample index is 10, 11, 12, 13, 14, 15, or 16 nucleotides in length.
42. The composition of clause 40, wherein the inquiry site bar code is 12 or 15 nucleotides in length.
43. The composition of clause 41, wherein the sample index is 12 or 15 nucleotides in length.
44. The composition of any one of clauses 39-43, wherein the inquiry site bar code or sample index sequence is selected from the group consisting of SEQ ID NO:1 to SEQ ID NO:384.
45. The composition of any one of clauses 29-44, wherein the first complementary probe comprises an inosine located 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases from the 3' end of the probe.
46. The composition of any one of clauses 29-45, wherein the second complementary probe comprises an inosine located 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases from the 5' end of the probe.
47. A kit for determining the presence, absence, amount, or characteristics of one or more target polynucleotides in a sample, the kit comprising:
(a) a plurality of first complementary probes and second complementary probes, wherein (i) each first complementary probe has a sequence portion complementary to the first target sequence and a sequence portion non-complementary to the first target sequence, wherein the non-complementary portion comprises an inquiry site bar code sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence and an adjacent sequence portion non-complementary to the second target sequence; and
(b) buffers and enzymes for ligation and enrichment.
48. The kit of clause 47, further comprising at least one PCR primer, a polymerase, and a set of dNTPs for amplifying the extended target polynucleotide for the purpose of enrichment.
49. The kit of clause 47 or clause 48, further comprising a ligase.
50. The kit of any one of clauses 47-49, further comprising the software required to interpret the data.
51. The kit of any one of clauses 47-50, for determining genotype.
52. The kit of any one of clauses 47-51, for determining copy number.
Example
The purpose for providing following instance is to show various embodiments of the present invention, it is not intended that limits this in any way Invention.The example and method described herein of offer represent preferred embodiment, they are exemplary, and are not intended to be limiting The scope of the present invention.To those skilled in the art, spirit of the invention defined in the scope of claim is met Modification or other purposes are feasible.
Example 1. Analysis of nucleic acids by joining barcoded polynucleotide probes.
The method of analyzing nucleic acids by joining barcoded polynucleotide probes is carried out by providing two complementary probes that hybridize to two portions (a first target sequence and a second target sequence) of a target polynucleotide (Figure 1A). The first complementary probe and the second complementary probe may be immediately adjacent or may be separated by 1 to 500 or more nucleotides.
The first complementary probe includes a short inquiry site barcode (thin line in Figure 1A) that further distinguishes one form of the first complementary probe from another form of the first complementary probe (Figure 1E) or from other first complementary probes. This inquiry site barcode allows site information and allele information (or site information only, or allele information only) to be determined from a short reporter sequence of uniform size. For assay results produced by next-generation sequencing, adding a partial sequence complementary to the target 5' of the inquiry site barcode also places the inquiry site barcode at a position with high-quality sequence data. The inquiry site barcode may carry information on the site only, the allele only, a combination of site and allele, or site and allele as separate sequences. Use of the inquiry site barcode makes the sequence reporting on a gene locus consistent in size, arrangement, and nucleotide composition.
In the first complementary probe, the sequence complementary to the first target sequence (Figure 1A; thick line) is split into two parts by the inquiry site barcode. The inquiry site barcode is not complementary to the first target sequence.
The first complementary probe may also include a universal sequence (Figures 1A-D; dashed line). This universal sequence may be referred to as "universal primer 1", which describes its common function as a PCR primer site. It should be understood, however, that the universal sequence need not have this function and may have other functions, including but not limited to one or more of other forms of amplification and capture. The universal sequence may also be used to facilitate the addition of one or more other sequences or other moieties.
The second complementary probe has a sequence complementary to the second target sequence (Figure 1A; thick line) and a universal sequence (Figure 1A; dashed line). This universal sequence may be referred to as "universal primer 2", which describes its common function as a PCR primer site. The universal sequences in the first complementary probe and the second complementary probe may or may not be the same sequence (and may or may not be complementary to each other).
The first complementary probe and/or the second complementary probe may or may not also include a sequence for size adjustment.
The universal sequences are not complementary to the target sequence.
After the first complementary probe and the second complementary probe hybridize to the first target sequence and the second target sequence, which may or may not be present in the sample, the first complementary probe may or may not be extended so that it is immediately adjacent to the second complementary probe. There may be no gap between the first and second complementary probes, or there may be a gap of one or more bases between them, which can be filled in a gap-filling step.
The adjacent first complementary probe and second complementary probe are joined (chevron in Figure 1A) to produce a product polynucleotide (extending from universal primer 1 at the 5' end to universal primer 2 at the 3' end, as shown in Figure 1B).
This product polynucleotide can then serve as the template for an amplification reaction or another form of enrichment (Figure 1B). In this example, enrichment is performed by PCR. Although other configurations are possible, as illustrated in this example PCR primer 2 has a portion that is the complement of universal primer 2 (from the second complementary probe), a portion that is a sample index sequence, and a portion that is an adapter sequence (medium-weight line). With the product polynucleotide as template, DNA synthesis is initiated from PCR primer 2 (closed arrow in Figure 1B).
This amplification product then serves as the template for further amplification (in this example) by PCR primer 1 (Figure 1C). Although other configurations are possible, as illustrated in this example PCR primer 1 has a portion that is the complement of universal primer 1 (dashed line, from the first complementary probe) and a portion that is an adapter sequence (medium-weight line). With the product of the first round of amplification as template, DNA synthesis is initiated from PCR primer 1 (closed arrow). In some embodiments, PCR primer 1 may also have a portion that is a sample index sequence, similar to PCR primer 2 shown in Figure 1B.
In an alternative embodiment, the sample index (or a portion of it) is added with PCR primer 1. In other embodiments, the sample index is added in both PCR primer 1 and PCR primer 2. Through PCR primer 2 and/or PCR primer 1, a sample index sequence (sample index) or other sample-identifying moiety is attached to each product polynucleotide. When the sample index is added in PCR primer 1, it lies close to the inquiry site barcode, which facilitates sequencing of both the inquiry site barcode and the sample index while minimizing the total number of bases that must be sequenced (for example, there is no need to sequence, at least in part, the first target sequence and the second target sequence, which would lie between the inquiry site barcode and the sample index if the sample index were added in PCR primer 2).
Two rounds of DNA synthesis (Figure 1B and Figure 1C) yield the double-stranded amplicon shown in Figure 1D. In this example, more copies of this amplicon are generated by further rounds of amplification in which DNA synthesis is primed by PCR primer 2 and PCR primer 1.
In addition, in this example, sequence data are generated for a portion of each amplicon (or for the entire amplicon). There may or may not be portions of each amplicon for which no sequence data are generated. Each sequence generated is compared with a database and assigned to the appropriate sample and allele and/or site. Various factors (including but not limited to sequencing errors, polymerase errors, or non-specific binding) may cause misassignment. The tabulated read counts are analyzed to determine the presence, absence, amount, or copy number of target sequences, SNPs, or gene loci.
In another example (Figure 1E), there are two or more forms of the first complementary probe. Each form has a different sequence at its 3' end (shown as A and B). This differing sequence may be one or more bases. In the extreme case (for example, detecting a large deletion or a complex indel), the two forms of the first complementary probe are complementary to entirely different forms of the first target sequence. Two or more forms of the first complementary probe may fall anywhere between these two extremes while retaining the other elements of the first complementary probe. Although other uses exist, including but not limited to checking for contamination, partial admixture, quasispecies, or heteroresistance, multiple forms of the first complementary probe are typically used to generate classical genotype information. In that case, for each sample and site, the read count assigned to the A allele is compared with the read count assigned to the B allele. For each site, based on the ratio of the read count assigned to the A allele to the read count assigned to the B allele (and to mismatches), samples with reads assigned predominantly to the A allele are AA, samples with reads assigned predominantly to the B allele are BB, and samples with substantial numbers of reads assigned to both alleles are AB. The A and B nomenclature serves only to distinguish the alleles and implies nothing about the nucleotide sequence associated with the A or B allele.
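The AA/AB/BB assignment described above can be sketched as a simple ratio test on the two read counts. The following is a minimal illustration only: the function name `call_genotype`, the minimum read total, and the 0.85 cutoff are assumptions chosen for the sketch and are not values given in this disclosure.

```python
def call_genotype(a_reads, b_reads, min_total=20, hom_cutoff=0.85):
    """Assign AA, AB, or BB from read counts for the A and B alleles.

    min_total and hom_cutoff are hypothetical thresholds for illustration.
    """
    total = a_reads + b_reads
    if total < min_total:
        # Too few reads to compare the two alleles meaningfully.
        return "no call"
    a_fraction = a_reads / total
    if a_fraction >= hom_cutoff:
        return "AA"  # reads assigned predominantly to the A allele
    if a_fraction <= 1 - hom_cutoff:
        return "BB"  # reads assigned predominantly to the B allele
    return "AB"      # substantial reads assigned to both alleles


if __name__ == "__main__":
    for a, b in [(400, 5), (3, 380), (210, 190), (4, 3)]:
        print(a, b, call_genotype(a, b))
```

In practice the cutoffs would be set per site from the observed clusters rather than fixed in advance.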
Example 2. Inquiry site barcode flanked on both sides by complementary sequence.
When nucleic acid analysis is performed by an assay that joins barcoded polynucleotide probes, there are multiple positions in the first complementary probe at which the inquiry site barcode can be placed. The inquiry site barcode can be placed within the universal sequence, between the universal sequence and the target-specific sequence (as is common in prior-art methods, such as those disclosed in US Patent 8,460,866), or within the target-specific sequence, as illustrated herein. When the inquiry site barcode is placed within the target-specific sequence and is not complementary to the target polynucleotide, it is flanked on both sides by complementary sequence. When allele and site information (or, in some cases, site information only) is encoded in the inquiry site barcode, the advantage is that the degree of sequence difference used to detect target polynucleotides in a multiplex reaction can be controlled. In that case, detection does not depend on sufficient difference between the target polynucleotide sequences themselves. Results were compared between two assays. The first used a probe assembly (PC) containing 130 probe triplets (two forms of the first complementary probe and one form of the second complementary probe) with a 6-mer inquiry site barcode placed between the target-specific sequence and the universal sequence. The second used a PC containing 130 probe triplets for the same target polynucleotides and variants with a 12-mer inquiry site barcode placed within the target-specific sequence and not complementary to it (so that, in the first complementary probe, the inquiry site barcode is flanked on both sides by complementary sequence). Each probe in the PC was at 50 pM. In the 12-mer probe design, several bases of complementary sequence were added at the 5' end (separated by 12 bases from the remainder of the complementary region). In the 6-mer and 12-mer designs, the complementary region 3' of the inquiry site barcode is identical in size and composition. In the 12-mer design, the 12-base non-complementary inquiry site barcode carries the combined allele and site information. In the 6-mer design, the 6-base non-complementary inquiry site barcode carries the allele information, and the site is assigned from the sequence of the target sequence (which is likewise contained in the data).
Bovine genomic DNA (50 ng/μL) from individual samples was placed in the wells of a multiwell plate and heated to 98 °C for 15 minutes. A portion of each sample was then transferred to a fresh sample plate and mixed with the PC (12-mer). These reactions were then held at 98 °C for 1 minute to denature and incubated at 60 °C for 20 hours to hybridize. After hybridization, 3.2 μL of each reaction was added to a waiting plate containing 12.8 μL of NEB 1X Taq DNA ligase buffer and Taq DNA ligase. The new sample plate was sealed, mixed, and centrifuged, then held at 54 °C for 15 minutes and at 98 °C for 10 minutes, cooled to 4 °C, and held at that temperature. 1 μL of this completed ligation reaction was used in a Promega GoTaq Hotstart Taq PCR (12.5 μL total volume) containing 1X buffer, dNTPs, and 0.3 μM PCR primer 1 and PCR primer 2. A total of 32 cycles of 65 °C/95 °C for 20 seconds/15 seconds were completed, after which all reactions were pooled, processed on a single Zymo-100 silica column, and eluted with 150 μL TE 8.0. The pooled library was then quantified using a Bioanalyzer 2100 trace and diluted to the appropriate fraction for an Illumina Next500 flow cell, and sequence data were generated.
The sample index sequence and inquiry site barcode contained in each read (together with other sequences as needed) were compared with the database, and the number of reads for each sample x site x allele was tabulated (including misassigned reads). For each site, the read count for the A allele (X axis) was plotted against the read count for the B allele (Y axis) for each sample. This plot of A allele versus B allele is referred to as a cluster plot, as shown in Figures 2A-C.
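The tabulation step described above can be sketched as a lookup of each read's sample index and inquiry site barcode against a database, followed by counting per sample x site x allele. This is an illustrative sketch only: the index and barcode sequences, the dictionary layout, and the name `tabulate_reads` are hypothetical, and exact-match lookup is an assumed simplification of whatever comparison the actual analysis uses.

```python
from collections import Counter

# Hypothetical database contents, for illustration only.
SAMPLE_INDEXES = {"ACGTAC": "sample_1", "TGCATG": "sample_2"}
SITE_BARCODES = {
    "AAACCCGGGTTT": ("site_1", "A"),
    "AAACCCGGGTTA": ("site_1", "B"),
}


def tabulate_reads(reads):
    """Count reads per (sample, site, allele).

    Each read is a (sample_index, barcode) pair already extracted from
    the sequence. Reads whose index or barcode is not in the database
    are counted separately as unassigned.
    """
    counts = Counter()
    unassigned = 0
    for sample_index, barcode in reads:
        sample = SAMPLE_INDEXES.get(sample_index)
        site_allele = SITE_BARCODES.get(barcode)
        if sample is None or site_allele is None:
            unassigned += 1
            continue
        site, allele = site_allele
        counts[(sample, site, allele)] += 1
    return counts, unassigned


if __name__ == "__main__":
    demo = [
        ("ACGTAC", "AAACCCGGGTTT"),
        ("ACGTAC", "AAACCCGGGTTT"),
        ("ACGTAC", "AAACCCGGGTTA"),
        ("TGCATG", "AAACCCGGGTTA"),
        ("NNNNNN", "AAACCCGGGTTA"),
    ]
    table, missed = tabulate_reads(demo)
    print(dict(table), missed)
```

The A and B counts for one site then give the X and Y coordinates of each sample on the cluster plot.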
The standard protocol above was also performed using the PC containing the 6-mer probes and an aliquot of DNA that had been heated to 98 °C for 15 minutes.
In general, the results show that a PC in which the first complementary probe has the inquiry site barcode between the universal sequence and the target sequence and a PC in which the first complementary probe has a non-complementary inquiry barcode within the target sequence (so that the inquiry barcode is flanked on both sides by complementary sequence) have similar ability to yield genotypes. Although site-by-site differences were observed, the position of the inquiry site barcode and the type of site and allele information it carries (where the inquiry site barcode is placed and the information is not complementary to the target sequence) did not change the character of the genotyping cluster plots. Some of the differences between the results obtained with the PCs containing the 6-mer and 12-mer designs are probably due to differences in effective probe concentration (full-length material) caused by manufacturer non-uniformity; other differences are due to the greater ability of the 12-mer design to ensure that similar target sequences are not mistaken for one another.
Example 3. Discriminating G:T mismatches in a genotyping assay.
Those skilled in the art know that ligation-based assays have lower specificity when a T:G hybridization error ("mismatch") occurs between the 3' end of the SNP interrogation sequence and the target sequence. This situation arises when detecting G/A SNPs. For a genotyping assay, this can be a problem when the first form of the first complementary probe (for example, first complementary probe #1) has a C at its 3' end and the second form of the first complementary probe (for example, first complementary probe #1') has a T. The first form of the first complementary probe is correctly ligated on a target polynucleotide having G at the SNP site and incorrectly ligated on a target polynucleotide having A at the SNP site. The second form of the first complementary probe is correctly ligated on a target polynucleotide having A at the SNP site and incorrectly ligated on a target polynucleotide having G at the SNP site. Those skilled in the art will appreciate that a similar T:G hybridization error may exist when detecting C/T SNPs. The partial hydrogen bonding between the G:T "mismatched" nucleotides is stable enough to allow ligase to join (inefficiently) the mismatched first complementary probe to the second complementary probe. This results in non-specific detection of the target polynucleotide 0-25% of the time. To minimize this signal, the universal base deoxyinosine is used near the interrogating 3' position of the affected first complementary probe. Including deoxyinosine at the 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, or 9th position from the 3' end of the affected form of the first complementary probe destabilizes the G:T mismatch, reducing the likelihood (and frequency) of producing non-specific product polynucleotides. When the deoxyinosine is at the 2nd position from the 3' end of the affected form of the first complementary probe, the G:T mismatch is destabilized sufficiently that incorrect ligation is minimized and the frequency of non-specific product polynucleotides is low. Deoxyinosine in the unaffected form of the first complementary probe does not destabilize hybridization enough to affect genotype resolution, and the ligation reaction proceeds in a (largely) specific manner to give the specific product polynucleotide. As the deoxyinosine is moved toward the 5' side of the first complementary probe, its ability to reduce the stability of the G:T mismatch declines. When the inosine is at the 10th position from the 3' end, the next-generation sequencing reads produced from non-specific product polynucleotides are comparable to those obtained with the affected form of the first complementary probe without inosine. The ideal position of the deoxyinosine for reducing mismatch ligation is the 2nd, 3rd, or 4th base from the 3' end.
In this example, deoxyinosine (inosine substituted for the base) was present at positions 2 to 10 from the 3' end of the affected form of the first complementary probe (which has a T at its 3' end). Together with the form without inosine, this gave 10 forms of the affected first complementary probe. Probe assemblies, each containing one inosine arrangement of the affected form of the first complementary probe, the unaffected form of the first complementary probe, and the second complementary probe (for the target polynucleotide), were made up at 50 pM in probe buffer. A single bovine gDNA sample (50 ng/μL) was heated at 98 °C for 20 min to fragment it, and 5 μL was then dispensed into wells. Four wells were then filled with each probe mixture. The NGG reactions were then heated to 98 °C for 1 minute, then cooled to 60 °C and held for 20 hours. After hybridization, 3.2 μL of each reaction was added to a waiting plate containing 12.8 μL of NEB 1X Taq DNA ligase buffer and Taq DNA ligase. The new sample plate was sealed, mixed, and centrifuged, then held at 54 °C for 15 minutes and at 98 °C for 10 seconds, cooled to 4 °C, and held at that temperature. 1 μL of this completed ligation reaction was used in a Promega GoTaq Hotstart Taq PCR (12.5 μL total volume) containing 1X buffer, dNTPs, and 0.3 μM first universal primer and second universal primer. A total of 32 cycles of 65 °C/95 °C for 20 s/15 s were completed, after which all reactions were pooled, processed on a single Zymo-100 silica column, and eluted with 150 μL TE 8.0. The pooled library was then quantified using a Bioanalyzer 2100 trace and diluted to the appropriate fraction for an Illumina Next500 flow cell, and sequence data were generated. The sample index sequence and the allele and site barcode sequence contained in each read were compared with the database, and the number of reads for each sample x site x allele was tabulated (including non-specific reads). The results are shown in Figure 3.
In Figure 3, the ligation schematic (LHS-T, the affected form of the first complementary probe, shown 5' to 3', with the target gDNA, or genomic DNA, shown 3' to 5') shows the ten 3'-most positions of the first complementary probe containing the 3' T nucleotide (none, iT2 to iT10), illustrated mismatched against the G nucleotide of the genomic DNA sequence. The 2nd position from the 3' end, shown as (i), corresponds to "iT2". The underlined portion of the gDNA sequence is the portion to which the second complementary probe hybridizes. Solid grey bars are homozygous GG samples, and striped bars represent homozygous AA samples. The Y axis is a logarithmic scale of the read count associated with the T form of the first complementary probe. In the grey bars (homozygous GG samples), there is non-specific ligation caused by the stability of the G:T mismatch. In the striped bars (homozygous AA samples), the reads come from specific ligation. Placing deoxyinosine at the 2nd or 3rd position from the 3' end of the affected form of the first complementary probe significantly reduces the read count from non-specific ligation. Similarly, deoxyinosine can be used in a first complementary probe that has a 3' G and the potential for a G:T mismatch.
Example 4. Method for detecting target polynucleotides in a large excess of background DNA.
Target polynucleotides present in samples containing DNA from multiple species and a large excess of non-target polynucleotide DNA are detected using the detection method and associated data analysis. The detection method (and associated data analysis) generates genotype information (SNPs or other variants) and information on the amount of target present in the sample. The effectiveness of one detection method was demonstrated in a model experiment in which E. coli genomic DNA (background or "noise" DNA that is not detected) was mixed with variable amounts of target genomic DNA by titrating the target (or signal) genome (a single bovine sample) into background E. coli genomic DNA. Noise was set at 0, 125, or 250 ng per reaction, and signal was serially diluted in TE buffer, pH 8.3, from 250, 125, 62.5 ng per reaction down to as low as 0.12207 ng (12 concentrations in total). The tubes were heated to 98 °C for 15 minutes to fragment the DNA. Each signal tube served as a source from which 5 μL of sample was transferred to each of the 8EG reaction wells in a column of a 96-well PCR plate. A probe assembly (PC) was formed from a set of 135 probe triplets (two forms of the first complementary probe and one form of the second complementary probe) for genotyping bovine genomic DNA. The PC was mixed with noise E. coli genomic DNA at 0, 125, or 250 ng per reaction. These PC + E. coli mixtures were distributed on the 96-well plate (250 ng per reaction in three rows, 125 ng per reaction in three rows, and 0 ng per reaction in two rows). These reactions were then heated at 98 °C for 1 minute to denature and incubated at 60 °C for 20 hours to hybridize. After hybridization, 3.2 μL of each reaction was added to a waiting plate containing 12.8 μL of NEB 1X Taq DNA ligase buffer and Taq DNA ligase. The new sample plate was sealed, mixed, and centrifuged, then held at 54 °C for 15 minutes and at 98 °C for 10 seconds, cooled to 4 °C, and held at that temperature. 1 μL of this completed ligation reaction was used in a Promega GoTaq Hotstart Taq PCR (12.5 μL total volume) containing 1X buffer, dNTPs, and 0.3 μM first universal PCR primer and second universal PCR primer. A total of 32 cycles of 65 °C/95 °C for 20 seconds/15 seconds were completed, after which all reactions were pooled, processed on a single Zymo-100 silica column, and eluted with 150 μL TE 8.0. The pooled library was then quantified using a Bioanalyzer 2100 trace and diluted to the appropriate fraction for an Illumina Next500 flow cell, and sequence data were generated.
The sample index sequence and the allele and site barcode sequence (inquiry site barcode) contained in each read were compared with the database, the number of reads for each sample x site x allele was tabulated (including misassigned reads), and genotypes were determined. Figures 4A and 4B show the data in relation to DNA amount and bovine genome equivalents, and show that the method can be used to detect polynucleotide targets in a diploid eukaryotic genome even when the target signal genome accounts for less than 0.1% of total DNA. The method can be extrapolated to detecting microbial genomes against a background of eukaryotic genomes, detecting genomic fragments in complex food, environmental, or other samples, detecting low-abundance RNA in cells, detecting adventitious contaminating seed, and other applications.
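The arithmetic behind the titration in this example can be checked with a short sketch. It assumes a two-fold serial dilution, which is what the stated endpoints (250 ng down to 0.12207 ng in 12 concentrations) imply; the function names are hypothetical.

```python
def dilution_series(start_ng=250.0, steps=12):
    """Two-fold serial dilution of signal DNA, in ng per reaction."""
    return [start_ng / 2 ** k for k in range(steps)]


def signal_fraction(signal_ng, noise_ng):
    """Fraction of total DNA contributed by the signal genome."""
    total = signal_ng + noise_ng
    return signal_ng / total if total else 0.0


if __name__ == "__main__":
    series = dilution_series()
    print([round(x, 5) for x in series])
    for noise in (125.0, 250.0):
        # Lowest signal amount against each non-zero noise level.
        print(noise, f"{signal_fraction(series[-1], noise):.4%}")
```

At the lowest signal amount, the signal genome is just under 0.1% of total DNA against 125 ng of noise and about 0.05% against 250 ng, consistent with the statement that detection holds below 0.1% of total DNA.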
Example 5. Reversible denaturation before probe hybridization improves cluster plot resolution.
The genotyping method and associated data analysis use double-stranded or single-stranded nucleic acid (NA) as the target polynucleotide. The first complementary probe and the second complementary probe need access to single-stranded NA in order to hybridize to the target polynucleotide. Experimental results show that, to obtain single-stranded NA from double-stranded NA, the sample must be heated to an elevated temperature. Exemplary temperatures are in the range of 70 °C to 100 °C, with heating times of 1 second to 15 minutes. This reversible denaturation step improves detection of target polynucleotides (especially target polynucleotides present in the sample in double-stranded form).
Experiments were performed using a method similar to that described in Example 2, with a probe assembly (PC) containing 135 probe triplets for bovine SNPs (for genotyping) and 96 bovine genomic DNA samples. In the first experiment, the DNA was heated to 98 °C for 15 minutes, the PC was added and mixed with the sample, the sample and PC were then heated (reversible denaturation) to 98 °C for 1 minute, and finally a 20-hour hybridization step was performed at 60 °C. In the second experiment, no heating was performed before the 20-hour hybridization step: the sample was not heated to 98 °C for 15 minutes, the PC and sample were not heated to 98 °C for 1 minute, and hybridization was performed at 60 °C for 20 hours. After hybridization, the ligation, PCR, and sample pooling steps were performed as described in Example 2.
The pooled library was then quantified using a Bioanalyzer 2100 trace and diluted to the appropriate fraction for an Illumina Next500 flow cell, and sequence data were generated. The sample index sequence and the allele and site barcode sequence contained in each read were compared with the database, and the number of reads for each sample x site x allele was tabulated (including misassigned reads). For each site, the read count for the A allele (X axis) was plotted against the read count for the B allele (Y axis) for each sample. The results are shown in Figures 5A and 5B.
Example 6. Genotyping assay using dried probe assemblies.
The genotyping assay described herein involves nucleic acids and a probe blend mixed at high salt concentration. The first complementary probe and the second complementary probe are in solution, with each individual probe at a concentration of 50 pM in the "probe assembly" or "PC" (probes, TE, and hybridization buffer). This example illustrates an improved method of setting up genotyping assay reactions. It is desirable to place the probe assembly in the reaction wells, dry it, seal the sample plate, and store it at room temperature long-term (that is, for several years). A set of 135 probe triplets (two forms of the first complementary probe and one form of the second complementary probe) for genotyping bovine genomic DNA was dried in reaction wells. A single PC was made at a working concentration of 50 pM, and 3 μL of PC was placed in six columns of wells of a 384-well plate. Another PC was made containing the same 135 probe triplets, TE, and buffer plus 0.4 mM trehalose. Trehalose is a preservative that can benefit dried polynucleotides, and it fixes the dried PC to the bottom of the reaction well. Likewise, 3 μL of the trehalose-containing PC was added to six columns of wells of a 384-well plate. One sample plate of each PC type (with or without trehalose) was dried completely by placing it overnight in a laminar flow hood (in which sterile, dust-free air passes over the sample plate). One sample plate without trehalose was sealed and stored frozen at -20 °C. The dried sample plates were sealed and stored at room temperature.
After one month, fresh PC was prepared and 3 μL of PC was added to six columns of wells of a 384-well plate. A set of 96 bovine genomic DNA samples (50 ng/μL, 35 μL volume) was heated to 98 °C for 15 minutes. gDNA was then added to the 4 sample plates: 5 μL to the two wet sample plates (one fresh, the other stored at -20 °C for one month) and 8 μL to the two dried sample plates (stored at room temperature for one month). The volume in all wells was 8 μL. The sample plates were sealed, briefly centrifuged, and left at room temperature for 2 hours to ensure complete rehydration of the dried probes. These reactions were then heated at 98 °C for 1 minute to denature and held at 60 °C for 20 hours to hybridize. After hybridization, 3.2 μL of each reaction was added to a plate containing 12.8 μL of NEB 1X Taq DNA ligase buffer and Taq DNA ligase. The new sample plate was sealed, mixed, and centrifuged, then held at 54 °C for 15 minutes and at 98 °C for 10 seconds, cooled to 4 °C, and held at that temperature. 1 μL of this completed ligation reaction was used in a Promega GoTaq Hotstart Taq PCR (12.5 μL total volume) containing 1X buffer, dNTPs, and 0.3 μM first universal PCR primer and second universal PCR primer. A total of 32 cycles of 65 °C/95 °C for 20 seconds/15 seconds were completed, after which all reactions were pooled, processed on a single Zymo-100 silica column, and eluted with 150 μL TE 8.0. The pooled library was then quantified using a Bioanalyzer 2100 trace and diluted to the appropriate fraction for an Illumina Next500 flow cell, and sequence data were generated. The sample index sequence and the allele and site barcode sequence contained in each read were compared with the database, and the number of reads for each sample x site x allele was tabulated (including misassigned reads). For each site, the read count for the A allele (X axis) was plotted against the read count for the B allele (Y axis) for each sample. The results are shown in Figures 6A-D.
In general, freshly prepared PC, PC stored frozen, and PC dried with trehalose and stored at room temperature have similar ability to generate genotypes.
Example 7. Method for determining copy number variation.
The copy number analysis method can be used to determine copy number variation (CNV), in which zero copies of an allele are distinguished from one or two copies of the same allele. To demonstrate this, the next-generation sequencing reads generated by the copy number assay (96 bovine samples with a normalized amount of input DNA in all samples) were compared with a database of the inquiry site barcodes and sample index sequences for the probe assembly. The number of reads generated for each sample and a single allele of a single site was tabulated (including misassigned reads) and analyzed. As shown in Figures 7A-C, animals homozygous BB have zero or near-zero reads with the inquiry site barcode for the A allele (at that site). Animals heterozygous AB have about 200 reads with the inquiry site barcode for the A allele (at that site). Finally, homozygous AA animals have about 400 reads with the inquiry site barcode for the A allele. In such a CNV assay, the input nucleic acid must be consistent across the samples tested, or additional markers must be used to adjust the read count produced in each sample (and/or for each gene locus interrogated).
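The relationship described above (about 0, 200, and 400 reads for zero, one, and two copies of the A allele) can be sketched as a nearest-multiple assignment. This is an illustration under stated assumptions: the figure of 200 reads per copy is taken from this example and would differ between runs and sites, input DNA is assumed to be normalized across samples, and the name `estimate_copies` is hypothetical.

```python
def estimate_copies(a_reads, reads_per_copy=200.0, max_copies=2):
    """Estimate A-allele copy number from the A-allele read count.

    Assumes normalized input DNA, so read count scales with copy number.
    reads_per_copy is an assumed calibration value for illustration.
    """
    # Round to the nearest whole number of copies (half rounds up).
    copies = int(a_reads / reads_per_copy + 0.5)
    return max(0, min(max_copies, copies))


if __name__ == "__main__":
    for reads in (2, 190, 410):
        print(reads, estimate_copies(reads))
```

Where input amounts vary, the read count would first be scaled using the additional markers mentioned above before applying this step.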
Example 8. Evaluating a tetraploid genome using genotyping.
In a tetraploid organism, four copies of an allele may be present, one copy on each chromosome. To mimic a tetraploid organism, DNA from two different diploid animals (of the same species) was mixed, producing a sample with four copies of any given allele. Probe triplets for a variety of target polynucleotides (two forms of the first complementary probe and one form of the second complementary probe) were added, and the method was performed as described in Example 2, except that a cluster plot for five genotypes was provided.
The sample index sequence and query site barcode sequence contained in each read were compared against a database, and the number of reads generated by each sample x site x allele (including misassigned reads) was tabulated. For the site, the number of reads of the A allele (X axis) was plotted against the number of reads of the B allele (Y axis) for each sample. This experiment demonstrates the ability to distinguish five genotype groups and to generate genotypes when the sample is, or contains, a tetraploid genome. The results are shown in Figure 8.
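One simple way to picture the five genotype groups is to assign each sample to the nearest expected B-allele fraction (0, 1/4, 1/2, 3/4, 1). This is only a sketch under that assumption; the text does not specify how clusters were called, and the names `call_tetraploid`, `GENOTYPES`, and `min_reads` are hypothetical.

```python
GENOTYPES = ["AAAA", "AAAB", "AABB", "ABBB", "BBBB"]


def call_tetraploid(a_reads, b_reads, min_reads=20):
    """Assign one of five dosage classes from A and B allele read counts."""
    total = a_reads + b_reads
    if total < min_reads:
        return None  # too few reads to call (assumed threshold)
    b_fraction = b_reads / total
    # Pick the B-allele dosage k (0..4) whose expected fraction k/4 is closest.
    dosage = min(range(5), key=lambda k: abs(b_fraction - k / 4))
    return GENOTYPES[dosage]


calls = [call_tetraploid(a, b) for a, b in [(400, 0), (300, 100), (200, 200), (90, 310), (0, 400)]]
```

In practice cluster positions drift from the ideal fractions, so a real caller would fit cluster centers from the data rather than use fixed values.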
Example 9. Genotyping query of multi-allelic sites.
The genotyping query can genotype multi-allelic SNPs that are tri-allelic, tetra-allelic, or of higher allele number simply by adding a third, fourth, or further form of the first complementary probe. To detect the possible genotypes at an example tri-allelic SNP position, the probe assembly (PC) includes three nearly identical first complementary probes and a single second complementary probe. Each of the three first complementary probes has a different 3' terminal nucleotide, complementary to one of the three different bases (SNP) that may be present in the diploid genomic DNA. Each of the three first complementary probes also has a unique query site barcode that identifies the allele and the site, and therefore the exact target polynucleotide and the exact variant form detected by that first complementary probe. The method was performed as described in Example 2, except that the database contained three query barcodes for the site (one query barcode per variant).
The sample index sequence and query site barcode sequence contained in each read were compared against a database, and the number of reads generated by each sample x site x allele (including misassigned reads) was tabulated. For the site, the number of reads of the A allele (X axis), the number of reads of the B allele (Y axis), and the number of reads of the C allele (Z axis) were plotted for each sample. This plot of the A allele against the B allele and the C allele is called a cluster plot. In this case, allele A is a G base, allele B is a T base, and allele C is a C base. AA animals lie along the x axis, BB animals along the y axis, and CC animals along the z axis. Heterozygous animals (TC, TG, CG) lie between two of the axes. This experiment demonstrates the ability to generate genotypes for multi-allelic sites. The results are shown in Figure 9.
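The three-axis cluster positions described above can be approximated by asking which alleles carry a meaningful share of a sample's reads. The sketch below assumes a diploid sample and uses a fixed fraction threshold; `call_triallelic`, `min_fraction`, and `min_reads` are hypothetical names and values, not part of the described method.

```python
def call_triallelic(counts, min_fraction=0.2, min_reads=20):
    """Call a diploid genotype from per-allele read counts, e.g. {"G": 190, "T": 200, "C": 3}."""
    total = sum(counts.values())
    if total < min_reads:
        return None  # not enough reads to call
    present = sorted(base for base, n in counts.items() if n / total >= min_fraction)
    if len(present) == 1:
        return present[0] * 2        # homozygote, lies along one axis
    if len(present) == 2:
        return "".join(present)      # heterozygote, lies between two axes
    return None                      # three alleles present: ambiguous for a diploid


homozygote = call_triallelic({"G": 380, "T": 4, "C": 2})
heterozygote = call_triallelic({"G": 190, "T": 200, "C": 3})
```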
Example 10. Genotyping method for detecting genetic variation (at a genetic locus) other than single-base substitution.
The genotyping method (and associated data analysis) determines whether a target polynucleotide is present in a sample. In some cases, the target polynucleotide may be the result of a deletion/insertion event. In this experiment, one form of the first complementary probe was designed to be complementary to the target sequence containing the deletion, and the other form of the first complementary probe was designed to be complementary to the target sequence without the deletion. The second complementary probe lies immediately 3' of the sequence of both forms of the first complementary probe. The workflow then continued as in Example 2.
For each site, the number of reads of the A allele (X axis) was plotted against the number of reads of the B allele (Y axis) for each sample. In this case, the A allele and the B allele are the insertion and the deletion (or vice versa). This experiment demonstrates the ability to determine the genotype at a specific genetic locus for variants that are not single-base substitutions. The results are shown in Figure 10.
Example 11. Method for selecting and optimizing sample index barcodes.
The goal was to generate a total of 96000 complementary probes with a 15-mer sample index barcode between the universal primer sequence and the adapter sequence (Figures 1C and 1D). These 15-mer sample index barcodes also have a reduced read length of 12 nucleotides (nt), which is suitable for processing a smaller number of different samples and saves, for example, sequencing cost and time, because only the first 12 nucleotides need to be sequenced to identify a particular sample index. The theoretical maximum number of different 15-mer sequences is 1073741824 (with 4 nucleotides, a 15-mer has 4^15 = 1073741824 different sequences). Subtracting homopolymers from this number leaves 139839696 15-mers (http://oris.org/searchQ=berserker+A123620)
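The sequence-space arithmetic above can be checked with a short sketch. The helper `count_without_long_runs` and its `max_run` parameter are hypothetical: the text does not state which run length is treated as a homopolymer, so the filtered count depends on that assumption and is not claimed to reproduce the figure quoted above.

```python
def count_without_long_runs(length, max_run, alphabet=4):
    """Count sequences of the given length with no run of identical bases longer than max_run."""
    if length == 0:
        return 1
    # ways[r] = number of sequences ending in a run of exactly r + 1 identical bases
    ways = [alphabet] + [0] * (max_run - 1)
    for _ in range(length - 1):
        total = sum(ways)
        # Start a new run with any other base, or extend each existing run by one.
        ways = [(alphabet - 1) * total] + ways[:-1]
    return sum(ways)


all_15mers = 4 ** 15                                   # 1073741824
all_12mers = 4 ** 12                                   # 16777216
no_triple_runs_15 = count_without_long_runs(15, max_run=2)
```

The same function with `max_run=1` counts sequences in which no two adjacent bases are identical, which is a stricter reading of "free of homopolymers".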
These 15-mer sequences were further narrowed down by a selection process. The selection criteria included maximizing orthogonality within the barcodes and between the barcodes and their flanking sequences (universal primer sequence and adapter sequence); maximizing barcode specificity, so that the barcodes are free of homopolymers and have a GC content of about 40-60%; and ensuring compatibility with commercial indexes (such as Illumina TruSeq and Nextera), avoiding subsequences that tend to cause high error rates on the instrument. After the selection process, the barcodes were randomized and the barcode candidate vector was shuffled.
The barcodes are stored in full and reduced read length sets. These sets were preloaded with the commercial indexes (with adapter extensions). The barcodes were ordered from the largest to the smallest edit distance "d". Each barcode in the candidate vector was checked for GC content and for trimers subject to phasing errors, and the full and shortened read sets were searched for collisions at the given "d". If the barcode did not collide with either set, it was added to both sets.
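The screening loop described above can be sketched as a greedy filter. This is an illustration under stated assumptions, not the program used in the example: the homopolymer rule (no run longer than two bases), the use of Levenshtein distance for "d", the 12 nt prefix as the reduced read, and the names `select_barcodes`, `edit_distance`, and `longest_run` are all choices made for this sketch. The check for trimers subject to phasing errors is omitted.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def longest_run(seq):
    """Length of the longest run of identical characters."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def select_barcodes(candidates, d, short_len=12, preloaded=()):
    """Keep candidates that pass composition checks and do not collide at distance d."""
    full = list(preloaded)                       # e.g. commercial indexes loaded first
    short = [seq[:short_len] for seq in preloaded]
    for bc in candidates:
        gc = (bc.count("G") + bc.count("C")) / len(bc)
        if not 0.4 <= gc <= 0.6 or longest_run(bc) > 2:
            continue
        prefix = bc[:short_len]
        collides = any(edit_distance(bc, kept) < d for kept in full) or any(
            edit_distance(prefix, kept) < d for kept in short
        )
        if not collides:
            full.append(bc)      # added to both sets, as in the text
            short.append(prefix)
    return full, short


candidates = ["ACGTACGTACGTACG", "ACGTACGTACGTACC", "AAAAACGTACGTACG", "TGCATGCATGCATGC"]
full_set, short_set = select_barcodes(candidates, d=3)
```

Because the filter is greedy, the result depends on candidate order, which is one reason to shuffle the candidate vector before selection.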
Index plate ordering: index plates were grouped by performance metric, so that, for example, a subset of all sample plates has higher orthogonality/specificity. For each (384-well) index plate to be generated, the best barcode subset was selected from the unassigned barcode set (for example, based on the 15/12 nt read edit distance). The subset was assigned to each plate, and a performance metric was calculated for each plate. The performance metric is based on sequencing read counts. Examples of performance metrics follow.
Some barcode sequences were found to cause lower sequencing read counts.
A motif in the barcodes was found to cause interaction between the barcode and the adapter sequence. The motif was found to comprise a sequence of about 7 bases (CTAGCCTCC) and can cause self-complementarity between the 3' end of the complementary probe and an internal sequence. Variants of this 7 bp motif were also found. Examples are shown in Figures 11A and 11B and Figures 12A and 12B.
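Self-complementarity between the 3' end of a probe and an internal sequence, as described above, can be screened for by searching the upstream sequence for the reverse complement of the 3'-terminal k-mer. The sketch below is one possible check, not the one used in the example; the function names, the window size `k=7`, and the test sequence (built around the motif quoted above purely for illustration) are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_self_complement(probe, k=7):
    """Return upstream positions that could base-pair with the 3'-terminal k-mer."""
    tail = probe[-k:]
    target = reverse_complement(tail)
    upstream = probe[:-k]
    return [i for i in range(len(upstream) - k + 1) if upstream[i:i + k] == target]


# Hypothetical probe: an internal stretch complementary to a 3' end ending in CTAGCCTCC.
probe = "GGAGGCTAG" + "TTTTT" + "CTAGCCTCC"
hits = three_prime_self_complement(probe)
```

A non-empty result flags a probe whose 3' end could fold back on itself or prime on another copy of the probe.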
A computer program was built to replace these problematic sequences, for example by optimizing the sequences step by step as far as possible, in order to reach the full range of 96K barcodes. Because this particular motif appeared to affect performance more than problematic trimers and edit distance did, all of these factors were taken into account in the design/binning workflow. However, with all replacements made, and with each plate locally optimized under the same global edit distance criterion, 84096 sample indexes were generated. The first 16128 of these indexes can also be used as 12-mers, for experiments and applications that process a larger number of samples (for example, processing 10 384-well microtiter plates, each well containing one sample).
Example 12. Genotyping method for detecting target polynucleotides in polyploid samples.
This example describes a genotyping method (and associated data analysis) for detecting the presence or absence of target polynucleotides in polyploid wheat samples. A ploidy reduction strategy is used to reduce the generation of sequence data from uninformative genomes. In some cases, the target polynucleotide may be the result of a SNP or of a deletion/insertion event (indel).
In the first method, public and proprietary wheat genome sequence data containing target SNP/indel and proximal SNP/indel information were obtained. Based on knowledge of the position of the proximal SNP/indel relative to the target/marker SNP/indel, probes were designed to be selective for the genome of interest, using a proximal SNP destabilization strategy to reduce the ploidy arising from biological complexity. Targets that had been genotyped on Axiom and showed diploid clustering were selected. It was further ensured that no proximal SNP was present within 9 bases on either side of the target marker (see Figures 14A-C).
One form of the first complementary probe was designed to be complementary to the target sequence with the SNP (LHS), and the other form of the first complementary probe was designed to be complementary to the target sequence without the SNP or indel (LHS'). The second complementary probe (RHS) lies immediately 3' of the sequence of both forms of the first complementary probe. Selection for the genome of interest was thereby accomplished (that is, a target genome with the proximal SNP will produce low sequence counts). Incorporating the proximal SNP into the probe design allows sites that previously produced no reads to function fully (see Figures 15A and 15B). The workflow then continued as in Example 2.
In the second method, public and proprietary wheat genome sequence data containing target SNP/indel and proximal SNP/indel information were obtained. Based on knowledge of the position of the proximal SNP/indel relative to the target marker SNP/indel, a blocking oligonucleotide complementary to the target genome sequence with the proximal SNP/indel was added to prevent the RHS from hybridizing to that target genome.
One form of the first complementary probe was designed to be complementary to the target sequence with the SNP/indel (LHS), and the other form of the first complementary probe was designed to be complementary to the target sequence without the SNP/indel (LHS'). The second complementary probe (RHS) lies immediately 3' of the sequence of both forms of the first complementary probe. A blocking/competing oligonucleotide complementary to the target sequence containing the proximal SNP in the target genome was added. The blocking oligonucleotide prevents the RHS from hybridizing to the target genome. Selection for the genome of interest was thereby accomplished (that is, a target genome with the proximal SNP will produce few or no sequence reads). By adding the blocking oligonucleotide, the proximal SNP is incorporated into the probe design for sites that otherwise produced no reads (see Figures 16A and 16B). This method is applicable when the proximal SNP lies between bases 1 and 10 of the target marker. It cannot destabilize a secondary polymorphism. The workflow then continued as in Example 2.
In the third method, public and proprietary wheat genome sequence data containing target SNP/indel and proximal SNP/indel information were obtained. Based on knowledge of the position of the proximal SNP/indel relative to the target marker SNP/indel, PCR primers were designed to selectively amplify the unique genome or subgenome of interest. In this method, an upstream PCR amplification step is added. One or both of the PCR primers complementary to the target genome sequence may hybridize to the proximal SNP in the target genome sequence. This upstream PCR amplification step may take the form of a parallel workflow (that is, the sample is split into two parts).
One form of the first complementary probe was designed to be complementary to the target sequence with the SNP/indel (LHS), and the other form of the first complementary probe was designed to be complementary to the target sequence without the SNP/indel (LHS'). The second complementary probe (RHS) lies immediately 3' of the sequence of both forms of the first complementary probe. An upstream PCR amplification step was added, using PCR primers designed to selectively amplify the unique genome or subgenome of interest. The proximal SNP/indel may destabilize hybridization of a PCR primer. Selection for the desired genome of interest was thereby accomplished (that is, unwanted genomes containing the proximal SNP were eliminated from the downstream workflow). Combinations of multiple PCR primer sets and probe sets can be used to accommodate various combinations of sites and genomes (see Figures 17A and 17B). The workflow then continued as in Example 2.
For each site, the number of reads of the A allele (X axis) was plotted against the number of reads of the B allele (Y axis) for each sample (Axiom cluster plot analysis). This experiment demonstrates the ability to use knowledge of proximal SNPs/indels to select the genome of interest, to identify the target SNPs/indels to be selected, and to design probes. The results are shown in Figures 18A and 18B.
In another method, 600X average coverage can be used for a small number of selected markers. This method requires a parallel workflow (that is, the sample is split into two parts). In some cases, samples are separated according to the markers associated with a nearby proximal SNP or single-base indel, and for the affected markers, in contrast to the 200X coverage sequencing used in the other methods, the coverage of the fraction with affected markers is raised to 600X (that is, compensation is made with additional sequencing time and expense rather than in the upstream Eureka portion). This approach has been used in other settings, such as deep sequencing with RNA-Seq to help detect rare transcripts, so in a different context this would be the Eureka equivalent.
Example 13. Method for detecting target RNA without conversion to cDNA.
This example describes a method for detecting target RNA without conversion to cDNA. In one exemplary method, multiplex ligation-mediated PCR is performed on RNA targets using a new commercially available ligase (SplintR, available from NEB) that can ligate adjacent DNA probes hybridized to an RNA strand. This RNA-based multiplex ligation-mediated PCR can perform many assays without converting RNA to cDNA. The method described here has the advantage of eliminating RNA-to-cDNA conversion bias. The method has potential uses in querying RNA and is advantageous in the following applications: detection of strand-specific allele usage, copy number determination of RNA and mRNA transcripts, analysis of alternative splicing and splice variants, and detection of fusion genes. The method described in this example is a direct extension of the DNA-based multiplex ligation-mediated PCR detection method described herein, but uses RNA rather than DNA as the target.
A set of 778 sites at known exon-exon boundaries in multiple human mRNA transcripts was selected. Probes were designed to query these mRNA transcripts. Fusion genes typically join the 5' end of one gene to the 3' end of another gene. The breakpoint in each gene occurs at various positions in the DNA but most commonly within an intron, so that the spliced RNA usually has the breakpoint at an exon boundary. Probes were designed to cover the ends of the exons flanking introns that contain known fusion breakpoints. As a positive control, probes were designed at the ends of exons flanking introns of the Beta Actin and GAPDH genes, which contain no known fusions. As a negative control, probes were designed at the ends of introns of the Beta Actin and GAPDH genes, which are amplified only in the presence of DNA.
Ligation-mediated PCR requires a pair of DNA probes: a first complementary probe and a second complementary probe that is phosphorylated at its 5' end. A DNA-specific or RNA-specific ligase can join the 3' OH group of the first complementary probe to the 5' phosphate group of the second complementary probe when the two DNA probes are hybridized to a target DNA or RNA template, respectively. The first complementary probe is designed to hybridize at an exon boundary, and the second complementary probe is designed to hybridize at the exon adjacent to the exon to which the first complementary probe hybridizes. For example, if the first complementary probe is designed to hybridize to exon II, the corresponding second complementary probe would be designed to hybridize to exon III. In this way, the first complementary probe and second complementary probe pair can detect only RNA transcripts containing a properly spliced exon II to exon III event. Other second complementary probes can be designed; for example, where the second complementary probe is designed to hybridize to exon IV, the assay will detect an exon II/exon IV splice variant. The DNA probes are between 20 and 50 bases long, so that the calculated annealing temperature is between 68 °C and 74 °C. Each first complementary probe has a common PCR primer site at its 5' end, and each second complementary probe has a different common PCR primer site at its 3' end. The common PCR primer sites support PCR amplification of the ligation products using sample-specific indexes. All probes were blended into a single blend at a concentration of 50 pM in a high-salt buffer (750 mM KCl, 30 mM Tris-HCl pH 8.5, 0.5 mM EDTA pH 8.0).
Commercially available human cell line RNA (HeLa), the high-salt probe buffer, and an RNA protection reagent were mixed in a small 8 μL reaction contained in a well of a 384-well plate and sealed with heat-sealing foil. The hybridization reaction was heated to 95 °C in a PCR instrument and held for 1 minute, then cooled to 60 °C and held for 20 hours to promote hybridization. Before ligation, the hybridization reaction was cooled to 54 °C and then placed on wet ice.
A ligation mixture containing the SplintR enzyme (units/reaction) and its 1X reaction buffer was dispensed at 32 μL per reaction and cooled to wet-ice temperature. The 8 μL hybridization reaction was then added to the ligation mixture and mixed thoroughly. The whole mixture was first heated to 54 °C and held for 15 minutes, then heated to 92 °C and held for 15 seconds to dehybridize any unligated probes, and then cooled to 4 °C or frozen.
The PCR mixture contains, in a standard PCR reaction buffer, the common PCR primer for the first complementary probe (carrying an Illumina sequencing flow cell binding sequence) and the common PCR primer for the second complementary probe (uniquely indexed (sample index) near the end carrying the other Illumina flow cell binding sequence). The PCR primers in this mixture amplify any ligation product of the first complementary probe and the second complementary probe. The sample-indexed PCR reaction products were pooled and purified on a silica column to remove excess salts, enzyme, small probes, and primers. The pooled library was assessed qualitatively and quantitatively to confirm that it met size requirements. The sign of a successful reaction is that the size of the product obtained by PCR amplification of a successful ligation product of the first complementary probe and the second complementary probe shifts from 150 bp (noise artifact) to 210 bp (signal).
Sequencing of the PCR amplification products reveals which first complementary probe and second complementary probe were combined, for example exon II/exon III, or possibly an exon II/exon IV splice variant. Duplicate reads can be counted (binned), and these counts can be used to infer the relative copy number of RNA transcripts.
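The binning of reads by probe combination can be sketched as a simple count followed by conversion to within-sample fractions. The function `junction_fractions` and the exon labels are hypothetical, and the fractions are only a relative measure within one sample; no cross-sample normalization is implied.

```python
from collections import Counter


def junction_fractions(read_pairs):
    """Bin reads by (first probe exon, second probe exon) and return counts and fractions."""
    counts = Counter(read_pairs)
    total = sum(counts.values())
    fractions = {pair: n / total for pair, n in counts.items()} if total else {}
    return counts, fractions


# Hypothetical reads, each already decoded to the exons its two probes target.
reads = [("exon II", "exon III")] * 9 + [("exon II", "exon IV")] * 1
counts, fractions = junction_fractions(reads)
```

Here the exon II/exon IV bin would represent the splice variant and the exon II/exon III bin the properly spliced transcript.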
To demonstrate that ligase-specific products are produced by the hybridization and ligation reactions, a set of 96 reactions was performed, and the total read count across all sites in each sample is shown (Figure 19). Normal read counts indicate that ligase-specific products were generated from the samples analyzed. When ligase was omitted from the reaction, the detected reads were close to zero (16 independent reactions in total) (Figure 19, points in the ellipse at the lower right). This data set excludes any first complementary probe that was not ligated to its second complementary probe partner, which substantially eliminates noise from spurious first complementary probe ligation products. In addition, titration studies were performed in which the input ligase concentration (units/reaction) and the input RNA were titrated to zero (Figures 20 and 21). The total read count obtained in the ligase titration study for the mRNA transcript of the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene (arbitrarily designated site 745 of the 778-site set) shows that the ligation reaction depends on the input ligase concentration (Figure 20). The total read count obtained in the input RNA titration study for the GAPDH mRNA transcript (site 745 of the 778-site set) shows that the ligation reaction depends on the amount of input RNA (Figure 21). In Figure 21, a human DNA sample was also included to demonstrate that the splice-junction signal depends on RNA rather than DNA. In these studies, only the GAPDH gene transcript (site 745, exon VII) was examined, using binned read counts. Fusion gene products are not expected to be present in this HeLa-based data set.
Although various examples and other information are provided above to explain aspects within the scope of the claims, no limitation of the claims should be implied from particular features or arrangements in such examples, since one of ordinary skill would be able to use these examples to derive a wide variety of embodiments. Furthermore, although some subject matter may have been described in language specific to examples of structural features, conditions, or uses, it is to be understood that the subject matter defined in the claims is not necessarily so limited.

Claims (32)

1. A method for determining the presence, absence, amount, or copy number of one or more target polynucleotides in two or more samples, the method comprising the steps of:
(a) providing two or more samples, each sample comprising one or more target polynucleotides, each target polynucleotide comprising a first target sequence and a second target sequence;
(b) providing a plurality of first complementary probes and second complementary probes, the plurality including a first complementary probe and a second complementary probe for each target polynucleotide, wherein (i) each first complementary probe has a sequence portion complementary to the first target sequence of the target polynucleotide and a sequence portion non-complementary to the first target sequence, wherein the non-complementary portion comprises a query site barcode sequence and an adjacent universal sequence, and (ii) each second complementary probe has a sequence portion complementary to the second target sequence of the target polynucleotide and an adjoining sequence portion non-complementary to the second target sequence;
(c) incubating the plurality of first complementary probes and second complementary probes with each sample under hybridization conditions, such that the first complementary probes and second complementary probes hybridize to their complementary target polynucleotides in the sample to form hybridization complexes;
(d) joining the first complementary probe and the second complementary probe that are hybridized in the sample to the first target sequence and the second target sequence of a target polynucleotide to form a product polynucleotide;
(e) collecting the product polynucleotides formed from the samples; and
(f) determining the presence, absence, amount, or copy number of each target polynucleotide in one or more samples by analyzing the product polynucleotides or their complementary strands.
2. The method according to claim 1, wherein the first target sequence and the second target sequence of each target polynucleotide are immediately adjacent to each other.
3. The method according to claim 1, wherein the first target sequence and the second target sequence of each target polynucleotide are separated by 1 to 500 nucleotides.
4. The method according to any one of claims 1-3, wherein the adjoining sequence portion of the second complementary probe comprises a universal sequence.
5. The method according to claim 4, wherein the universal sequence of the second complementary probe comprises a universal primer sequence complementary to a primer sequence, the primer sequence being usable to add one or more of (i) a sample index, (ii) an additional sequence, (iii) an additional sequence for generating sequence data or for another form of detection, and (iv) another moiety.
6. The method according to any one of claims 1-5, wherein the adjacent universal sequence of the first complementary probe comprises a universal primer sequence complementary to a primer sequence, the primer sequence being usable to add one or more of (i) a sample index, (ii) an additional sequence, (iii) an additional sequence for generating sequence data or for another form of detection, and (iv) another moiety.
7. The method according to claim 5 or claim 6, wherein the universal primer sequence comprises a PCR primer sequence and/or a primer sequence for adding the additional sequence for generating sequence data or for another form of detection.
8. The method according to any one of claims 5-7, wherein the additional sequence for generating sequence data or for another form of detection is an adapter for next-generation sequencing.
9. The method according to any one of claims 5-7, wherein the additional sequence for generating sequence data or for another form of detection is a capture sequence, optionally wherein the capture sequence is used for capture on a solid support.
10. The method according to any one of claims 5-9, wherein the universal primer sequence is effective to add a moiety that facilitates sequence generation.
11. The method according to any one of claims 5-10, wherein the sample index is 10, 11, 12, 13, 14, 15, or 16 nucleotides in length.
12. The method according to claim 11, wherein the sample index is 12 or 15 nucleotides in length.
13. The method according to any one of claims 4-12, wherein the universal sequences of the first complementary probe and the second complementary probe each comprise a primer sequence capable of hybridizing to a primer for sequence synthesis.
14. The method according to claim 13, wherein the primer sequence comprises a PCR primer sequence.
15. The method according to any one of claims 1-14, wherein the first complementary probe comprises the query site barcode 5' of the sequence complementary to the first target sequence, and the sequence complementary to the first target sequence 3' of the query site barcode.
16. The method according to any one of claims 1-15, wherein the first complementary probe comprises, from 5' to 3': the adjacent universal sequence, the query site barcode that is non-complementary to the first target sequence, and the sequence portion complementary to the first target sequence.
17. The method according to any one of claims 1-16, wherein the non-complementary query site barcode is 10, 11, 12, 13, 14, 15, or 16 nucleotides in length.
18. The method according to claim 17, wherein the query site barcode is 12 or 15 nucleotides in length.
19. The method according to any one of claims 1-18, further comprising, before the incubating step, heating at a temperature of 70 °C to 100 °C.
20. The method according to any one of claims 1-19, further comprising enriching the product polynucleotides before the collecting step.
21. The method according to claim 20, wherein the enriching comprises: (a) providing a set of PCR primer sequences comprising a first primer complementary to the primer sequence in the first complementary probe and a second primer complementary to the PCR primer sequence on the second complementary probe, and (b) amplifying the product polynucleotides.
22. The method according to any one of claims 1-21, wherein the method is a solution-based method.
23. The method according to any one of claims 1-22, wherein the first complementary probe comprises an inosine located 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases from the 3' end of the probe.
24. The method according to any one of claims 1-23, wherein the second complementary probe comprises an inosine located 2, 3, 4, 5, 6, 7, 8, 9, 10, or more bases from the 5' end of the probe.
25. The method according to any one of claims 1-24, wherein the 3' end of the first complementary probe is complementary to one form of a single nucleotide polymorphism (SNP) or other genetic variation.
26. The method according to any one of claims 1-25, wherein the step of joining the first complementary probe and the second complementary probe comprises treating the first complementary probe and the second complementary probe hybridized to the first target sequence and the second target sequence of the target polynucleotide (the hybridization complex) with a ligase to form the product polynucleotide.
27. according to the method any one of claim 1-26, wherein the method is used in Genotyping, wherein described Method includes providing one or more variants of first complementary probe, wherein the variant is in the described first complementary spy It is different in terms of the homogeneity of one or more of nucleotide at the 3 ' end of pin, and wherein described measure includes quantifying The relative frequency of product polynucleotides or its complementary strand, the product polynucleotides or its complementary strand include mutual with described first The sequence for mending other variants of probe made one or more of changes of first complementary probe compared The sequence of allosome, and the frequency is associated with genotype.
28. according to the method any one of claim 1-26, wherein the method is used for the institute for measuring target polynucleotide State copy number variation, and wherein described measure include will by the semaphore that product polynucleotides or its complementary strand produce with Known reference signal amount is compared by the semaphore that another product polynucleotides or its complementary strand produce.
29. The method according to any one of claims 1-26, wherein the method is used for determining the presence of a target polynucleotide in an expression analysis, wherein the target polynucleotide is an RNA transcript, and wherein the determining comprises comparing the amount of signal produced by the product polynucleotide or its complementary strand with a known reference amount of signal or with the amount of signal produced by another product polynucleotide or its complementary strand.
30. The method according to any one of claims 1-26, wherein the method is used for genotyping a polyploid sample, the method further comprising reducing the generation of sequence data from non-informative polyploid genomes, which comprises obtaining sample genome sequence data having target SNP/indel and proximal SNP/indel information, and designing the second complementary probe such that the stability of its hybridization to the target genome is disrupted by the proximal SNP/indel.
31. The method according to any one of claims 1-26, wherein the method is used for genotyping a polyploid sample, the method further comprising reducing the generation of sequence data from non-informative polyploid genomes, which comprises obtaining sample genome sequence data having target SNP/indel and proximal SNP/indel information, designing the second complementary probe such that the stability of its hybridization to the target genome is disrupted by the proximal SNP/indel, and adding a blocking oligonucleotide complementary to the target genome having the proximal SNP/indel to further prevent hybridization of the second complementary probe to the target genome.
32. The method according to any one of claims 1-26, wherein the method is used for genotyping a polyploid sample, the method further comprising reducing the generation of sequence data from non-informative polyploid genomes, which comprises obtaining sample genome sequence data having target SNP/indel and proximal SNP/indel information, and adding a preliminary PCR amplification step to select the unique genome to be studied.
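
The quantification steps recited in claims 27 and 28 can be illustrated with a minimal Python sketch. It assumes that product polynucleotides have already been sequenced and counted per allele-specific variant of the first complementary probe. The function names, the 0.2/0.8 thresholds for calling a heterozygote, and the default reference copy number of 2 are illustrative assumptions and are not taken from the claims.

```python
def call_genotype(counts, allele_a, allele_b, het_low=0.2, het_high=0.8):
    """Call a biallelic genotype from product-polynucleotide read counts.

    counts maps an allele label (the 3'-end nucleotide of a first-probe
    variant) to the number of product polynucleotides observed for it.
    het_low and het_high are hypothetical cut-offs on the fraction of
    reads that carry allele_a.
    """
    a = counts.get(allele_a, 0)
    b = counts.get(allele_b, 0)
    total = a + b
    if total == 0:
        return None  # no ligation products observed: no call
    fraction_a = a / total  # relative frequency of allele_a products
    if fraction_a >= het_high:
        return f"{allele_a}/{allele_a}"
    if fraction_a <= het_low:
        return f"{allele_b}/{allele_b}"
    return f"{allele_a}/{allele_b}"


def estimate_copy_number(target_signal, reference_signal, reference_copies=2):
    """Estimate copy number by comparing a target signal with a reference.

    reference_signal is the signal from a locus assumed to be present at
    reference_copies copies (an assumption of this sketch).
    """
    if reference_signal <= 0:
        raise ValueError("reference_signal must be positive")
    return reference_copies * target_signal / reference_signal


if __name__ == "__main__":
    print(call_genotype({"A": 95, "G": 5}, "A", "G"))
    print(call_genotype({"A": 48, "G": 52}, "A", "G"))
    print(estimate_copy_number(150, 100))
```

In practice the thresholds would be calibrated on control samples, and polyploid samples (claims 30-32) would need more than three genotype classes; the sketch covers only the diploid, biallelic case.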
CN201680052075.1A 2016-01-31 2016-11-08 Nucleic acid analysis by joining barcoded polynucleotide probes Pending CN108026568A (en)

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
US201662289303P 2016-01-31 2016-01-31
US62/289,303 2016-01-31
US201662317879P 2016-04-04 2016-04-04
US62/317,879 2016-04-04
US201662353088P 2016-06-22 2016-06-22
US62/353,088 2016-06-22
PCT/US2016/060991 WO2017044993A2 (en) 2015-09-08 2016-11-08 Nucleic acid analysis by joining barcoded polynucleotide probes

Publications (1)

Publication Number Publication Date
CN108026568A true CN108026568A (en) 2018-05-11

Family

ID=62083370

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201680052075.1A Pending CN108026568A (en) 2016-01-31 2016-11-08 Nucleic acid analysis by joining barcoded polynucleotide probes

Country Status (3)

Country Link
EP (1) EP3347497A4 (en)
CN (1) CN108026568A (en)
WO (1) WO2017044993A2 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110408717A (en) * 2019-07-23 2019-11-05 Biotechnology and Nuclear Technology Research Institute, Sichuan Academy of Agricultural Sciences Specific amplification primer for the Ganoderma mitochondrial rns gene and application thereof
CN111100935A (en) * 2018-10-26 2020-05-05 Xiamen University Method for detecting drug resistance gene of bacteria
CN113518829A (en) * 2018-12-31 2021-10-19 HTG Molecular Diagnostics, Inc. Method for detecting DNA and RNA in same sample

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110832076B (en) * 2017-06-27 2024-05-17 The University of Tokyo Probes and methods for detecting transcripts resulting from fusion gene and/or exon skipping
CN112074610A (en) * 2018-02-22 2020-12-11 10X Genomics, Inc. Conjugation-mediated nucleic acid analysis
US11639928B2 (en) 2018-02-22 2023-05-02 10X Genomics, Inc. Methods and systems for characterizing analytes from individual cells or cell populations
CN116323969A (en) * 2020-10-01 2023-06-23 Google LLC Linked dual barcode insertion constructs
EP4294945A1 (en) * 2021-02-17 2023-12-27 Act Genomics (IP) Limited Dna fragment joining detecting method and kit thereof
WO2022182682A1 (en) 2021-02-23 2022-09-01 10X Genomics, Inc. Probe-based analysis of nucleic acids and proteins

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101395280A (en) * 2006-03-01 2009-03-25 Keygene N.V. High throughput sequence-based detection of snps using ligation assays
CN104830993A (en) * 2015-06-08 2015-08-12 Ocean University of China High-throughput typing technique universal to various molecular markers

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5912124A (en) * 1996-06-14 1999-06-15 Sarnoff Corporation Padlock probe detection
WO2005001113A2 (en) * 2003-06-27 2005-01-06 Thomas Jefferson University Methods for detecting nucleic acid variations
US8808991B2 (en) * 2003-09-02 2014-08-19 Keygene N.V. Ola-based methods for the detection of target nucleic avid sequences
US7604937B2 (en) * 2004-03-24 2009-10-20 Applied Biosystems, Llc Encoding and decoding reactions for determining target polynucleotides
AU2006213907A1 (en) * 2005-02-09 2006-08-17 Stratagene California Key probe compositions and methods for polynucleotide detection
WO2013106807A1 (en) * 2012-01-13 2013-07-18 Curry John D Scalable characterization of nucleic acids by parallel sequencing

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101395280A (en) * 2006-03-01 2009-03-25 Keygene N.V. High throughput sequence-based detection of snps using ligation assays
CN104830993A (en) * 2015-06-08 2015-08-12 Ocean University of China High-throughput typing technique universal to various molecular markers

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
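L. KVASTAD et al.: "Single cell analysis of cancer cells using an improved RT-MLPA method has potential for cancer diagnosis and monitoring", 《SCIENTIFIC REPORTS》 *
JI Yanli et al.: "Application of high-throughput MLPA genotyping technology in the genetic identification of a family with the Del phenotype", 《Chinese Journal of Blood Transfusion》 *
ZHANG Li et al.: "Combined application of MLPA and sequencing technology to detect thalassemia gene defects", 《Practical Preventive Medicine》 *
HU Fuquan: 《Modern Gene Manipulation Techniques》, 31 October 2000, People's Military Medical Press *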

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111100935A (en) * 2018-10-26 2020-05-05 Xiamen University Method for detecting drug resistance gene of bacteria
CN111100935B (en) * 2018-10-26 2023-03-31 Xiamen University Method for detecting drug resistance gene of bacteria
CN113518829A (en) * 2018-12-31 2021-10-19 HTG Molecular Diagnostics, Inc. Method for detecting DNA and RNA in same sample
CN110408717A (en) * 2019-07-23 2019-11-05 Biotechnology and Nuclear Technology Research Institute, Sichuan Academy of Agricultural Sciences Specific amplification primer for the Ganoderma mitochondrial rns gene and application thereof

Also Published As

Publication number Publication date
WO2017044993A2 (en) 2017-03-16
EP3347497A2 (en) 2018-07-18
EP3347497A4 (en) 2019-01-23
WO2017044993A3 (en) 2017-04-27

Similar Documents

Publication Publication Date Title
CN108026568A (en) Nucleic acid analysis by joining barcoded polynucleotide probes
US20220049296A1 (en) Nucleic acid analysis by joining barcoded polynucleotide probes
ES2873850T3 (en) Next Generation Sequencing Libraries
US20190024141A1 (en) Direct Capture, Amplification and Sequencing of Target DNA Using Immobilized Primers
US8986958B2 (en) Methods for generating target specific probes for solution based capture
EP2395098B1 (en) Base specific cleavage of methylation-specific amplification products in combination with mass analysis
JP6925424B2 (en) A method of increasing the throughput of a single molecule sequence by ligating short DNA fragments
US20110003301A1 (en) Methods for detecting genetic variations in dna samples
US20120003657A1 (en) Targeted sequencing library preparation by genomic dna circularization
EP3129505B1 (en) Methods for clonal replication and amplification of nucleic acid molecules for genomic and therapeutic applications
CN105934523A (en) Multiplex detection of nucleic acids
CN108611398A (en) Genotyping is carried out by new-generation sequencing
CN107109401A (en) It is enriched with using the polynucleotides of CRISPR cas systems
CN106574286A (en) Selective amplification of nucleic acid sequences
JP6234463B2 (en) Nucleic acid multiplex analysis method
US20220364169A1 (en) Sequencing method for genomic rearrangement detection
CN107760772A (en) For the method for nucleic acid match end sequencing, composition, system, instrument and kit
WO2017181670A1 (en) Method for enriching target nucleic acid sequence from nucleic acid sample
US20200299764A1 (en) System and method for transposase-mediated amplicon sequencing
US20070148636A1 (en) Method, compositions and kits for preparation of nucleic acids
US10036053B2 (en) Determination of variants produced upon replication or transcription of nucleic acid sequences
KR102237248B1 (en) SNP marker set for individual identification and population genetic analysis of Pinus densiflora and their use
Ladas Hybridization enrichment of subgenomic targets for next generation sequencing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination