WO2013173394A2 - Method for increasing accuracy in quantitative detection of polynucleotides - Google Patents

Method for increasing accuracy in quantitative detection of polynucleotides

Publication number
WO2013173394A2
Authority
WO
WIPO (PCT)
Prior art keywords
sequences
target
sequencing
umi
targets
Prior art date
Application number
PCT/US2013/041031
Other languages
French (fr)
Other versions
WO2013173394A3 (en
Inventor
Chunlin Wang
Jian Han
Original Assignee
Cb Biotechnologies, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cb Biotechnologies, Inc.
Priority to EP13790712.7A (EP2850211B1)
Priority to ES13790712T (ES2899687T3)
Priority to CA2873585A (CA2873585C)
Priority to US14/401,322 (US20150132754A1)
Publication of WO2013173394A2
Publication of WO2013173394A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
        • C12Q1/6844: Nucleic acid amplification reactions
            • C12Q1/686: Polymerase chain reaction [PCR]
            • C12Q1/6851: Quantitative amplification
        • C12Q1/6809: Methods for determination or identification of nucleic acids involving differential detection
        • C12Q1/6869: Methods for sequencing

Definitions

  • the invention relates to methods for quantitative detection of polynucleotides in a mixed sample of polynucleotides. More particularly, the invention relates to methods for increasing accuracy of quantitation of PCR amplification products.
  • Quantitation of DNA, RNA, and gene products is important in a variety of applications, most notably in the areas of microbial and viral detection in clinical samples and in analyzing clinical samples for immunodiversity. Determining the relative numbers of a potentially disease-causing bacterium, for example, could be useful in the clinical setting for providing information regarding patient status, disease progression, likelihood of progression to disease, etc. Quantitation of T cell receptor expression, B cell antibody production, etc., may provide insight into the status of an individual's immune system, the presence or absence of disease, and the progression of change that may be indicative of disease or may even lead to disease.
  • B cells have heavy and light chains
  • T cells have α and β chains.
  • the human body contains approximately 10^10 lymphocytes, each with a unique combination of gene segments that specify the variable region, the part of the receptor that binds antigen.
  • Each person has an individualized immune repertoire, shaped by three key factors: (1) the genetic polymorphism at the MHC loci; (2) the antigen exposure history; and (3) the constant regulation and modulation of the immune system.
  • Humans are capable of generating 10^15 or more different B and T cells, although not all of these 10^15 B or T cells are present at any given time, due to the history of exposure to various antigens and the process of negative selection during the maturation of immune cells.
  • Random recombination of heavy-chain segments (VH, DH, and JH) and light-chain segments (Vκ and Jκ, or Vλ and Jλ) produces VHDHJH (heavy chain) and VκJκ or VλJλ (light chain) coding units in B cells, and a similar process occurs in T cells.
  • Adding to variable-region diversity is the random deletion of nucleotides at V, D and J segments in the junction position and the random insertion of nucleotides into the regions between the DJ and VD segments in heavy chain or the regions between the VJ segments in light chain.
  • RNA quantitation is prone to error from machine or pipette mis-calibration, or dilution, and these methods often require sample dilution for accurate measurement. For samples in which there is already a very low copy number, or at least a relatively low copy number, given the overall numbers of targets, this is very problematic.
  • spectrophotometry cannot be used to detect such small quantities of RNA. It generally takes at least 10^4 cells to produce enough RNA for accurate quantitation by this method.
  • Next-generation sequencing technologies have provided opportunities to significantly increase the sensitivity of quantifying DNA and/or RNA targets.
  • Various methods have been developed to increase the accuracy of quantification of different polynucleotides in a sample with mixed polynucleotides, including such methods as competitive polymerase chain reaction (PCR), described in U.S. Patent Number 5,213,961, and deep barcode sequencing using unique molecular identifiers (UMI), as described by Smith et al. (Smith, A.M., "Quantitative Phenotyping via Deep Barcode Sequencing," Genome Research (2009) 19: 1836-1842).
  • the present invention relates to a method for increasing accuracy and sensitivity of quantitative detection of target polynucleotides in a sample with different polynucleotides, the method comprising the steps of (a) labeling a target polynucleotide with a unique molecular identifier and a universal primer binding site to produce at least one labeled target polynucleotide; and (b) amplifying the at least one labeled polynucleotide using at least one universal primer to produce multiple copies of the labeled target polynucleotide.
  • the method may be performed by incorporating into a substantial number of individual target sequences in a pool of target sequences at least one randomly-generated sequence comprising from about 4 to about 15 randomly-generated nucleotides, the at least one randomly-generated sequence forming a unique molecular identifier for an individual target sequence, and a universal adapter sequence (i.e., a primer binding site for a universal primer) to form a target/UMI/adapter polynucleotide; attaching the UMI/universal adapter sequence to the target in a reverse transcription (RT) reaction at 50-60 degrees Celsius for RNA targets (A), a primer extension reaction at 50-60 degrees Celsius for DNA targets (B), or a ligation reaction for pre-selected DNA targets (C); and attaching a second universal adapter to the product of the previous step (A) or (B) by a DNA extension reaction at approximately 70 degrees Celsius, and amplifying, with universal primer, products with the universal primer binding site attached at both ends at a temperature of approximately 70 degrees Celsius.
  • the first step of attaching to a target sequence a unique molecular identifier and an adapter sequence is performed by ligation, DNA extension or reverse transcription.
  • the first step using DNA extension or reverse transcription is performed at a temperature of from about 50 to about 60 degrees Celsius.
  • the second step of the method is performed at a temperature of from about 65 to about 75 degrees Celsius.
  • aspects of the invention involve performing the first step of the method by reverse transcription or DNA extension, using a target-specific primer which comprises a unique molecular identifier sequence of from about 4 to about 15 nucleotides and an adapter sequence.
  • a unique molecular identifier of from about 4 to about 15 nucleotides and a universal binding site are added to a target sequence by ligation.
  • the method is performed as an automated method in a closed cassette.
  • the method may further comprise the steps of sequencing the products produced by the amplification step and removing artifacts through statistical filtering.
  • the statistical filtering includes estimating the context-specific error rate based on control DNA sequencing, grouping sequences differing in a single position, assessing the error rate based on the context of the different position, applying a Poisson model to estimate the probability of the sequence with smaller count to be random error and removing those with a probability greater than 0.001 of being random error.
  • Figure 1 is a plot of the coding capacity of random sequences of various lengths, allowing 0.5% of targets to be labeled with the same random sequence. The plot is based on data from 10 simulation experiments.
  • FIG. 2 is a diagram of steps to label a target with a unique molecular identifier (UMI), and subsequent amplification steps.
  • a second gene-specific primer and universal primers are added to the reaction with thermostable DNA polymerase. Both the second gene-specific primer and universal primers are designed with Tm greater than 70 degrees Celsius.
  • UMIs are introduced through ligation, where a double adaptor with UMI is ligated to target molecules and UMIs are introduced to a target at both ends. The UMI-labeled targets are then amplified before sequencing.
  • Figure 3 is a context-specific error pattern derived from control DNA sequences determined by the Illumina HiSeq 2000 platform. For each row, the height (width) of the pattern-filled blocks shows the error rate of the last base of the triplet changed to either A, C, G or T.
  • Figures 5A and 5B are photographs of gels containing PCR amplification products produced by the method of the invention.
  • the first four lanes of Figure 4A contain products generated using universal primers and the second four lanes contain products generated using primers for adding a UMI sequence and adapter sequence during RT-PCR, but under the higher temperature conditions of the 2nd/3rd steps of the method. This illustrates that contamination by UMI tagging primers may be avoided using the 3-step method of the invention.
  • the lanes of Figure 4B contain amplification products generated using primers designed for amplification under higher-temperature conditions of the 2nd and 3rd steps of the method.
  • Figure 6 is a drawing illustrating the steps of adding to a target sequence a unique molecular identifier and an adapter sequence (A); performing a first amplification step using at least one forward primer which comprises an adapter sequence and a universal primer binding site sequence (B); and performing a second amplification step using at least one universal primer (C).
  • FIG. 7 illustrates the benefit of UMI labeling of targets using the method of the invention.
  • Targets in the pool of amplification produced by the present method are sequenced, generally using high-throughput, next-generation sequencing methods.
  • each original template (I.A) is labeled with unique UMI (II.A) and sequenced free-of-error (III.A), where the count of the original templates is the same as the count of the combination of target and UMI.
  • If UMIs are too short and have limited coding capacity, the same UMI might be attached to different templates, which will inevitably result in underestimation of the count of the original templates (II.B).
  • the inventors have developed a method for increasing the accuracy of detecting the numbers of polynucleotides of substantially the same sequence in a mixed sample of polynucleotides, which may be used in analyses as diverse as those of the immune repertoire, microbiome, gene expression profiling, miRNA profiling, copy number variations, and even prenatal diagnosis of trisomies and drug resistance mutation detections (such as low copy number HIV drug resistance mutation detections).
  • the invention provides a method for increasing accuracy of quantitative detection of polynucleotides, the method comprising the steps of (a) labeling a target polynucleotide with a unique molecular identifier and a universal primer binding site to produce at least one labeled target polynucleotide; and (b) amplifying the at least one labeled polynucleotide using at least one universal primer to produce multiple copies of the labeled target polynucleotide.
  • the method may be performed by incorporating into a substantial number of individual target sequences in a pool of target sequences at least one randomly-generated sequence comprising from about 4 to about 15 randomly-generated nucleotides, the at least one randomly-generated sequence forming a unique molecular identifier for an individual target sequence, and a universal adapter sequence (i.e., a primer binding site for a universal primer) to form a target/UMI/adapter polynucleotide; attaching the UMI/universal adapter sequence to the target in a reverse transcription (RT) reaction at 50-60 degrees Celsius for RNA targets (A), a primer extension reaction at 50-60 degrees Celsius for DNA targets (B), or a ligation reaction for pre-selected DNA targets (C); and attaching a second universal adapter to the product of the previous step (A) or (B) by a DNA extension reaction at approximately 70 degrees Celsius, and amplifying, with universal primer, products with the universal primer binding site attached at both ends at a temperature of approximately 70 degrees Celsius.
  • each target in a pool is labeled with a unique barcode by covalently attaching a random sequence of a certain length (barcode) to a target polynucleotide before amplification and sequencing.
  • the combination of barcode and target then works as a proxy for the target during amplification and is ultimately sequenced together.
  • 1) UMIs have to be long enough to provide sufficient coding capacity so that no two identical targets are labeled with the same UMI; 2) UMIs have to be introduced to target sequences before the amplification steps; and 3) both UMIs and target sequences have to be sequenced without errors.
  • the first requirement can be met by using longer UMIs.
  • the inventors have addressed the second requirement by developing a method that incorporates UMIs in a two-step PCR reaction.
  • the inventors address the third requirement by introducing a new statistical approach to correct for sequencing errors. By combining both methods, they make the UMI strategy more practically useful and increase the accuracy for profiling polynucleotides in a complex genetic pool.
  • For an RNA target, a UMI is introduced into the target through reverse transcription (RT) using reverse transcriptase (Figure 2, left panel A).
  • a gene-specific primer, UMI, and a universal adaptor are synthesized to form one single molecule, where the annealing temperature between the gene-specific primer and a target is designed to be between 50 and 60 degrees Celsius.
  • a second gene-specific primer attached to a second universal adaptor, together with a universal primer, is added to the reaction, where the annealing temperature between the second gene-specific primer and targets is designed to be above 70 degrees Celsius.
  • the second annealing and extension temperature is set to 70 degrees Celsius.
  • a PCR reaction is performed at 95 degrees C for 15 seconds, and 72 degrees C for 30-40 cycles.
  • a UMI is introduced into the target through a regular primer extension step with DNA polymerase (Figure 2, center panel B).
  • a gene-specific primer, UMI, and a universal adaptor are synthesized in one single molecule, where the annealing temperature between the gene-specific primer and targets is designed to be between 50 and 60 degrees Celsius.
  • a second gene-specific primer attached to a second universal adaptor, together with a universal primer, is added to the reaction, the annealing temperature between the second gene-specific primer and targets being designed to be above 70 degrees Celsius.
  • the second annealing and extension temperature is set at about 70 degrees Celsius.
  • a PCR reaction is performed at 95 degrees C for 15 seconds, and 72 degrees C for 30-40 cycles.
  • UMI may be added using a ligation reaction. Double-stranded UMI and universal adaptors are ligated to targets directly. Universal primers are then added to the reaction and a PCR reaction is performed at 95 degrees C for 15 seconds, and 72 degrees C for 30-40 cycles. Universal primers are designed to bind 4-6 bases away from the completely random UMI sequences as our pilot study showed that the first 4 bases after the primer region are important for PCR efficiency.
  • the UMI strategy when used in the absence of the added steps provided by the inventors, operates on the assumption that both PCR and sequencing steps report the underlying target and UMI fragment free of error. However, this is an incorrect assumption because errors in both PCR and sequencing are inevitable. It is commonly known that the three popular next-generation sequencing platforms on the market today (Illumina HiSeq, Life Technologies Ion Torrent PGM and 454 FLX system) produce sequences with significant numbers of sequencing errors. Figure 3 plots the error pattern of the bench-top version of the three platforms.
  • This method comprises the steps of 1) estimating error rates by mixing with amplification products of UMI-labeled targets a small amount of control DNA, the sequence of which has been previously determined, sequencing both target and control together, and comparing sequences amplified from control DNA with known sequences, to estimate the context-specific pattern of error; 2) organizing target sequences by counting the distribution of unique sequences, where any two unique sequences are grouped if the two sequences differ in a single position; and 3) estimating the odds that the minor sequence in a group is an artifact according to the Poisson model (Figure 4A).
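The three steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the inventors' implementation: the function names, the `error_rate` callback signature, and the flat 0.003 rate are hypothetical, and treating the Poisson upper-tail probability as the "probability of being random error" (with the 0.001 cutoff mentioned elsewhere in the text) is one plausible reading.

```python
import math

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i        # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

def single_mismatch(a, b):
    """Index of the only differing position, or None if a and b do not differ at exactly one."""
    if len(a) != len(b):
        return None
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return diffs[0] if len(diffs) == 1 else None

def filter_artifacts(counts, error_rate, threshold=0.001):
    """Drop minor sequences whose counts are explainable as errors of a major sequence.

    error_rate(major, pos, new_base) is a hypothetical lookup of the
    context-specific rate estimated from control DNA.
    """
    kept = dict(counts)
    ranked = sorted(counts, key=counts.get, reverse=True)
    for i, major in enumerate(ranked):
        if major not in kept:
            continue
        for minor in ranked[i + 1:]:
            if minor not in kept:
                continue
            pos = single_mismatch(major, minor)
            if pos is None:
                continue
            # Expected number of erroneous reads derived from the major sequence.
            lam = counts[major] * error_rate(major, pos, minor[pos])
            if poisson_tail(counts[minor], lam) > threshold:
                del kept[minor]
    return kept

def flat_rate(major, pos, new_base):
    """Assumed placeholder: the same error rate for every context."""
    return 0.003

counts = {"ACGTACGT": 10000, "ACGTACGA": 25, "ACGTACCT": 400}
kept = filter_artifacts(counts, flat_rate)
```

With these made-up counts, the 25-read variant is consistent with about 30 expected erroneous reads and is removed, while the 400-read variant is far too abundant to be explained by error and is retained.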
  • If the random label segment is 15 nucleotides in length, it can randomly create about 10756894 unique molecular identifiers to label about 99.5% of around 10^7 target polynucleotides.
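The coding capacity of a UMI of a given length can be approximated in closed form instead of by simulation. The sketch below assumes that each target draws one of the 4^L possible UMIs uniformly at random, and it interprets "0.5% of targets labeled with the same random sequence" as an expected 0.5% shortfall in the number of distinct UMIs relative to the number of targets; both the interpretation and the function names are assumptions, not taken from the source.

```python
import math

def expected_distinct(n_targets, umi_length):
    """Expected number of distinct UMIs when n_targets each draw one uniformly."""
    n_umis = 4 ** umi_length
    # N * (1 - (1 - 1/N)^n), written with log1p/expm1 for numerical stability.
    return -n_umis * math.expm1(n_targets * math.log1p(-1.0 / n_umis))

def coding_capacity(umi_length, max_loss=0.005):
    """Largest number of targets whose expected fractional undercount is <= max_loss."""
    lo, hi = 1, 4 ** umi_length
    while lo < hi:
        mid = (lo + hi + 1) // 2
        loss = 1.0 - expected_distinct(mid, umi_length) / mid
        if loss <= max_loss:
            lo = mid          # mid is still acceptable; search higher
        else:
            hi = mid - 1      # too many collisions; search lower
    return lo

capacity_15 = coding_capacity(15)
```

Under these assumptions a 15-nucleotide UMI yields a capacity on the order of 10^7 targets, the same order as the figure quoted above; the simulated value in the text need not match the closed form exactly.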
  • a target polynucleotide is used often herein, but it is to be understood that multiple target polynucleotides generally exist within any clinical sample. These may represent sequences derived from, for example, the same or different bacteria, T cells, B cells, viruses, etc. The term, therefore, encompasses labeling of as many single target polynucleotides as can effectively be labeled within a sample. In some cases, such as in the case of immunorepertoire analysis, target polynucleotides may easily number in the millions.
  • UMI-labeled target polynucleotides comprising copies of the same DNA sequences will be individually labeled with different barcodes, each barcode being counted only once to provide a more accurate representation of the numbers of copies of target polynucleotides in a sample. It is therefore important to introduce the UMI label into the method so that it will not be utilized to prime subsequent amplifications and introduce amplification bias into the sample.
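Counting each barcode only once amounts to counting distinct UMIs per target sequence rather than reads. A minimal Python sketch follows; the read tuples and target names are invented for illustration, and sequencing errors are ignored here.

```python
from collections import defaultdict

def count_molecules(reads):
    """Estimate original molecule counts from (target_sequence, umi) read pairs."""
    umis_per_target = defaultdict(set)
    for target, umi in reads:
        umis_per_target[target].add(umi)   # duplicate reads of one UMI collapse
    return {target: len(umis) for target, umis in umis_per_target.items()}

# Hypothetical reads: TCR_A/ACGT was amplified into three reads but is one molecule.
reads = [
    ("TCR_A", "ACGT"), ("TCR_A", "ACGT"), ("TCR_A", "ACGT"),
    ("TCR_A", "GGCA"),
    ("TCR_B", "TTAG"),
]
molecules = count_molecules(reads)
```

Here four reads of TCR_A collapse to two original molecules, so uneven amplification of one labeled molecule does not inflate the count.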
  • the method of the invention may be performed very effectively using a closed cassette and automated methods such as those described in United States Patent
  • a cassette is insertable into a base machine ("base unit") that operably interfaces with the cassette to provide the necessary movement of a series of parts designed to provide up-and-down vertical movement, horizontal back-and-forth movement, and fluid handling by a cassette pipette which operates within the confines of the area bounded by the top, bottom, ends, and sides of the cassette, these parts being referred to as a cam bar, a lead screw, and a pipette pump assembly, respectively. It is also possible to provide a mechanism that allows the movement of the cassette pipette in any direction in the x-y-z plane, or to allow for circular/rotary movement throughout the enclosed cassette.
  • At least one of the reagent chambers in the cassette may form a PCR reaction chamber for performing the desired first amplification step (PCR1) and second amplification step (PCR2) of the present invention.
  • a reaction chamber may be constructed of different diameter, depth, and wall thickness than other reagent chambers.
  • a reaction chamber preferably will be a thin-walled chamber to aid in thermal conduction between external thermocyclers located in the base unit and the fluid within the reaction chamber.
  • the walls should be tapered so as to easily fit into the thermocycler and make thermal contact with thermocycler without adhering to its surface.
  • the reaction chamber should be of a depth and shape that allows for its fluid volume to be positioned inside the thermocycler.
  • the depth of the PCR chamber should be compatible with the vertical motion of the cassette pipette.
  • the chamber will also be accessible to a user's pipette tip if inserted into the chamber through the cassette's fill port, and the material used to form the PCR chamber may be optically clear so that the user can see when the pipette tip has reached the bottom of the chamber.
  • the method they designed utilizes primers comprising target-specific sequences for promoting binding to targets to initiate primer extension, as well as randomly-generated UMIs and adapters.
  • the purpose of the adapters is to form a binding site for primers used in next steps, those primers being used to add to resulting polynucleotides nucleotide sequences that form binding sites for universal primers, those primers being chosen for their ability to effectively promote amplification at temperatures of from about 65 to about 75 degrees C.
  • Because the primers comprising target-specific sequences are designed for use at lower temperatures, their influence can be limited in the subsequent amplification steps.
  • amplification bias may be further limited.
  • the present method may also comprise the step of removing a portion of the reaction mix, which contains the products of reverse transcription from the first step of the method, and using that portion for the second amplification reaction. This step may be used to further decrease the influence of the target-specific, UMI-labeled primers in the next two steps.
  • Sequencing methods, including next-generation high-throughput sequencing methods, are prone to errors, which may be limited to a small percentage but may produce a significant and unacceptable level of variance when large numbers of nucleotides are sequenced.
  • the method may also further comprise the steps of sequencing the products produced by steps a through c and correcting for sequencing errors using a statistical filtering step using formula I:
  • the combination of individually labeling target molecules, semi-quantitatively amplifying those labeled molecules using the two-step amplification of the present invention, using universal primers to decrease amplification bias and improve amplification efficiency, and statistically correcting the sequencing results, will give a much more accurate result and allow a researcher to better determine the types and numbers of immune system cells, antibodies, bacteria, etc. that are present in a given sample.
  • miIgHC_l ACACTCTTTCCCTACACGACGCTCTTCCGATCT NNNNNNNNNNTCTGACGTCAGTGGGTAGATGGTGGG (SEQ ID NO: 1); miIgHC_2: ACACTCTTTCCCTACACG ACGCTCTTCCG ATCTN NNNNNNNNNN NTCTG ACTGGATAG ACTG ATGGGGGTG (SEQ ID NO: 2); miIgHC_3: ACACTCTTTCCCTACACGACGCTCTTCCGATCT NNNNNNNNNNNN NTCTG ACTGGATAG ACTG ATGGGGGTG (SEQ ID NO: 2); miIgHC_3: ACACTCTTTCCCTACACGACGCTCTT
  • HC_4 AC ACTCTTTCCCTAC ACG ACGCTCTTCCG ATCTN NNNNNNNNN N NTCTG ACAAGGGGTAGAGCTGAGGGTT (SEQ ID NO: 4); miIgHC_5:
  • miIgHC_6 ACACTCTTTCCCTACACGAC GCTCTTCCG ATCTN N NNNNNNNNN NTCTGACGGGGAAGACATTTGGG AAGG (SEQ ID NO: 6);
  • ACACTCTTTCCCTACACG ACGCTCTTCCG ATCTN NNNNNNNNNN NTCTG ACAG A
  • GGAGGAACATGTCAGGT SEQ ID NO: 7
  • miIgHC_8 ACACTCTTTCCCTACACGACGCTCTT CCGATCTN NNNNNNNNNN NTCTG ACGGGATAGACAGATGGGGCTG (SEQ ID NO: 8).
  • TMs of UMI segments targeted for use as annealing sequences were evaluated. Results are listed in Table 1, in order from lowest to highest TM.
  • a second primer sequence was also synthesized (SEQ ID NO: 10: CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAAC CGCTCTTCC (with bold print indicating the adapter sequence).
  • Illumina primers (SEQ ID NO: 11: CTCGGCATTCCTGCTGAACCGCTCTTCCGATCT) served as universal primers.
  • control DNA (e.g., PhiX DNA) was mixed with VDJ amplicons and all were sequenced together. Reads for control DNA were extracted based on matches between reads and the reference sequence for control DNA. Control DNA sequences were aligned to the corresponding reference sequences. The context-specific error patterns were summarized by counting the differences in the alignment between reads and reference (control) DNA, giving an estimate of the context-specific error rate.
  • For example, if for a small (three-nucleotide) fragment GCA there are 1000 GCAs in all alignments: 991 GCA->GCA, 3 GCA->GCC, 2 GCA->GCG, 2 GCA->GCT, 1 GCA->GC- (deletion) and 1 GCA->GCAx (insertion, where x is any one of A, C, G and T), then the error rate for GCA->GCC is 0.003, for GCA->GCG is 0.002, for GCA->GCT is 0.002, for GCA->GC- is 0.001 and for GCA->GCAx is 0.001.
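The GCA example can be reproduced with a short Python sketch. It assumes the alignment step has already produced (reference triplet, observed outcome) pairs; that input shape and the function name are illustrative choices, not something specified in the source.

```python
from collections import Counter

def context_error_rates(observations):
    """Return {(context, observed): rate} for every outcome that differs from its context."""
    counts = Counter(observations)
    totals = Counter()
    for (context, _observed), n in counts.items():
        totals[context] += n               # all occurrences of this triplet
    return {
        (context, observed): n / totals[context]
        for (context, observed), n in counts.items()
        if observed != context
    }

# The GCA example from the text: 1000 aligned occurrences of GCA.
observations = (
    [("GCA", "GCA")] * 991
    + [("GCA", "GCC")] * 3
    + [("GCA", "GCG")] * 2
    + [("GCA", "GCT")] * 2
    + [("GCA", "GC-")] * 1     # deletion
    + [("GCA", "GCAx")] * 1    # insertion of any one base
)
rates = context_error_rates(observations)
```

Each rate is the count of one outcome divided by all occurrences of the triplet, so `rates[("GCA", "GCC")]` is 3/1000, matching the 0.003 given above.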

Abstract

Disclosed is a method for improving the sensitivity and accuracy of quantitative detection of polynucleotides in a sample, such as a clinical specimen, by a method that utilizes a two- or three-step process of tagging/labeling target molecules and adding an adapter sequence for adding a universal primer for efficient amplification of targets while decreasing target amplification bias. When combined with the step of statistically correcting for sequencing errors, the method can significantly increase the accuracy of quantitative detection of polynucleotides in a sample.

Description

METHOD FOR INCREASING ACCURACY IN QUANTITATIVE DETECTION OF
POLYNUCLEOTIDES
Field of the Invention
[0001] The invention relates to methods for quantitative detection of polynucleotides in a mixed sample of polynucleotides. More particularly, the invention relates to methods for increasing accuracy of quantitation of PCR amplification products.
Background of the Invention
[0002] Quantitation of DNA, RNA, and gene products is important in a variety of applications, most notably in the areas of microbial and viral detection in clinical samples and in analyzing clinical samples for immunodiversity. Determining the relative numbers of a potentially disease-causing bacterium, for example, could be useful in the clinical setting for providing information regarding patient status, disease progression, likelihood of progression to disease, etc. Quantitation of T cell receptor expression, B cell antibody production, etc., may provide insight into the status of an individual's immune system, the presence or absence of disease, and the progression of change that may be indicative of disease or may even lead to disease.
[0003] When evaluating the immune system, researchers are faced with a vast array of diversity and potentially very low copy numbers of targets. Determining the relative amounts of each target (e.g., T cell receptor, B cell antibody) can be a daunting task. Antigen receptors displayed by B cells and T cells have two major parts: B cells have heavy and light chains, and most T cells have α and β chains. Estimates are that the human body contains approximately 10^10 lymphocytes, each with a unique combination of gene segments that specify the variable region, the part of the receptor that binds antigen. Each person has an individualized immune repertoire, shaped by three key factors: (1) the genetic polymorphism at the MHC loci; (2) the antigen exposure history; and (3) the constant regulation and modulation of the immune system. Humans are capable of generating 10^15 or more different B and T cells, although not all of these 10^15 B or T cells are present at any given time, due to the history of exposure to various antigens and the process of negative selection during the maturation of immune cells.
[0004] Random recombination of heavy-chain segments (VH, DH, and JH) and light-chain segments (Vκ and Jκ, or Vλ and Jλ) produces VHDHJH (heavy chain) and VκJκ or VλJλ (light chain) coding units in B cells, and a similar process occurs in T cells. Adding to variable-region diversity is the random deletion of nucleotides at V, D and J segments in the junction position and the random insertion of nucleotides into the regions between the DJ and VD segments in heavy chain or the regions between the VJ segments in light chain.
[0005] One method for quantitating gene expression is to isolate RNA from the samples to be compared, quantitate the RNA by UV spectrophotometry or with a fluorescent dye, and then use equal mass amounts of RNA in real-time RT-PCR. However, RNA quantitation is prone to error from machine or pipette mis-calibration, or dilution, and these methods often require sample dilution for accurate measurement. For samples in which there is already a very low copy number, or at least a relatively low copy number, given the overall numbers of targets, this is very problematic. Furthermore, spectrophotometry cannot be used to detect such small quantities of RNA. It generally takes at least 10^4 cells to produce enough RNA for accurate quantitation by this method. Using a fluorescent dye can increase sensitivity up to 100-fold, but for many applications even that level of sensitivity is not enough.
[0006] Next-generation sequencing technologies have provided opportunities to significantly increase the sensitivity of quantifying DNA and/or RNA targets. Various methods have been developed to increase the accuracy of quantification of different polynucleotides in a sample with mixed polynucleotides, including such methods as competitive polymerase chain reaction (PCR), described in U.S. Patent Number 5,213,961, and deep barcode sequencing using unique molecular identifiers (UMI), as described by Smith et al. (Smith, A.M., "Quantitative Phenotyping via Deep Barcode Sequencing," Genome Research (2009) 19: 1836-1842).
[0007] Unique molecular identifiers, or molecular barcodes, provide an advantage in quantifying copy numbers in a sample. However, if UMI are involved in more than the first round of PCR, the same UMI may be introduced into different targets, resulting in counting errors. Also, the UMI method relies on an ideal, but unrealistic, situation in which both PCR and sequencing are perfect and introduce no errors; that is, the UMI strategy assumes that both the PCR and sequencing steps report the underlying targets and UMI fragments free of error. This assumption is erroneous because errors in both PCR and sequencing are inevitable. Every current sequencing platform is subject to sequencing errors, and two very popular platforms each have error rates of around one percent. When large numbers of sequences are obtained, this sequencing error can create a significant number of artificial targets.
[0008] What are needed are methods for improving accuracy of quantification of different polynucleotides in a sample with mixed polynucleotides.
Summary of the Invention
[0009] The present invention relates to a method for increasing accuracy and sensitivity of quantitative detection of target polynucleotides in a sample with different polynucleotides, the method comprising the steps of (a) labeling a target polynucleotide with a unique molecular identifier and a universal primer binding site to produce at least one labeled target polynucleotide; and (b) amplifying the at least one labeled polynucleotide using at least one universal primer to produce multiple copies of the labeled target polynucleotide. The method may be performed by incorporating into a substantial number of individual target sequences in a pool of target sequences at least one randomly-generated sequence comprising from about 4 to about 15 randomly-generated nucleotides, the at least one randomly-generated sequence forming a unique molecular identifier for an individual target sequence, and a universal adapter sequence (i.e., a primer binding site for a universal primer) to form a target/UMI/adapter polynucleotide; attaching the UMI/universal adapter sequence to the target in a reverse transcription (RT) reaction at 50-60 degrees Celsius for RNA targets (A), a primer extension reaction at 50-60 degrees Celsius for DNA targets (B), or a ligation reaction for pre-selected DNA targets (C); and attaching a second universal adapter to the product of the previous step (A) or (B) by a DNA extension reaction at approximately 70 degrees Celsius, and amplifying, with universal primer, products with the universal primer binding site attached at both ends at a temperature of approximately 70 degrees Celsius.
[0010] In various aspects of the method, the first step of attaching to a target sequence a unique molecular identifier and an adapter sequence is performed by ligation, DNA extension or reverse transcription. In various aspects, the first step using DNA extension or reverse transcription is performed at a temperature of from about 50 to about 60 degrees Celsius. In various aspects, the second step of the method is performed at a temperature of from about 65 to about 75 degrees Celsius.
[0011] Aspects of the invention involve performing the first step of the method by reverse transcription or DNA extension, using a target-specific primer which comprises a unique molecular identifier sequence of from about 4 to about 15 nucleotides and an adapter sequence. In other aspects, a unique molecular identifier of from about 4 to about 15 nucleotides and a universal binding site are added to a target sequence by ligation.
[0012] In various aspects of the invention, the method is performed as an automated method in a closed cassette. The method may also further comprise the steps of sequencing the products produced by the amplification step and removing artifacts through statistical filtering. The statistical filtering includes estimating the context-specific error rate based on control DNA sequencing, grouping sequences differing in a single position, assessing the error rate based on the context of the differing position, applying a Poisson model to estimate the probability that the sequence with the smaller count is a random error, and removing those with a probability greater than 0.001 of being random error.
Brief Description of the Drawings
[0013] Figure 1 is a plot of the coding capacity of random sequences of various lengths, allowing 0.5% of targets to be labeled with the same random sequence. The plot is based on data from 10 simulation experiments.
[0014] Figure 2 is a diagram of steps to label a target with a unique molecular identifier (UMI), and subsequent amplification steps. For an RNA target (left panel A), the UMI is introduced by reverse transcriptase through a reverse-transcription (RT) step where the gene-specific primer is designed with a melting temperature (Tm) of between 50 and 60 degrees Celsius. For double-stranded DNA molecules, if only specific regions of the DNA molecules are the targets, a UMI (center panel B) is introduced through chain extension by DNA polymerase with gene-specific primers, which are designed with a Tm of between about 50 and about 60 degrees Celsius. After a first step of labeling, a second gene-specific primer and universal primers are added to the reaction with thermostable DNA polymerase. Both the second gene-specific primer and the universal primers are designed with a Tm greater than 70 degrees Celsius. For pre-selected DNA targets, UMIs are introduced through ligation, where a double adaptor with UMI is ligated to target molecules and UMIs are introduced to a target at both ends. The UMI-labeled targets are then amplified before sequencing.
[0015] Figure 3 is a context-specific error pattern derived from control DNA sequences determined by the Illumina HiSeq2000 platform. For each row, the height (width) of the pattern-filled blocks shows the error rate of the last base of the triplet being changed to either A, C, G or T.
[0016] Figure 4 Panel A shows the formula for estimating the odds of whether a minor sequence is generated through artifact, where n is the count of the minor sequence in a group, N is the count of the major sequence in the same group, and λ is the expected mean number of sequences identical to the minor sequence, which is computed as λ = N*μ, where μ is the estimated error rate from GCA-GCT in panel B and GCA-GCG in panel C. If the value of P is less than 0.001, it is unlikely that the minor sequence is due to artifacts. Panel B gives an example of a minor sequence with the count 878 being considered an artifact, as the value of P is 0.989, which is beyond the 0.001 probability/random error threshold. Panel C gives an example of a minor sequence with the count 2698 being considered authentic, as the value of P is 7.4e-12, less than 0.001.
[0017] Figures 5A and 5B are photographs of gels containing PCR amplification products produced by the method of the invention. The first four lanes of Figure 5A contain products generated using universal primers and the second four lanes contain products generated using primers for adding a UMI sequence and adapter sequence during RT-PCR, but under the higher temperature conditions of the 2nd/3rd steps of the method. This illustrates that contamination by UMI tagging primers may be avoided using the 3-step method of the invention. The lanes of Figure 5B contain amplification products generated using primers designed for amplification under the higher-temperature conditions of the 2nd and 3rd steps of the method.
[0018] Figure 6 is a drawing illustrating the steps of adding to a target sequence a unique molecular identifier and an adapter sequence (A); performing a first amplification step using at least one forward primer which comprises an adapter sequence and a universal primer binding site sequence (B); and performing a second amplification step using at least one universal primer (C).
[0019] Figure 7 illustrates the benefit of UMI labeling of targets using the method of the invention. Targets in the pool of amplification products produced by the present method are sequenced, generally using high-throughput, next-generation sequencing methods. In an ideal situation, each original template (I.A) is labeled with a unique UMI (II.A) and sequenced free of error (III.A), where the count of the original templates is the same as the count of the combinations of target and UMI. If UMIs are too short and have limited coding capacity, the same UMI might be attached to different templates, which will inevitably result in underestimation of the count of the original templates (II.B). If UMIs are attached to targets after they have been amplified, the number of UMI-attached targets is greater than the count of original templates, resulting in over-estimation of the count of certain targets (II.C). If sequencing is not free of error, error could occur in targets, UMI or both. Error occurring in targets results in over-estimation of the count of distinct templates. Error occurring in the UMI region results in over-estimation of the count of certain targets (III.B). With the inventors' statistical filtering technique, those sequencing errors can be detected and removed, which will restore the correct count of distinct targets and the count of each target.
Detailed Description
[0020] The inventors have developed a method for increasing the accuracy of detecting the numbers of polynucleotides of substantially the same sequence in a mixed sample of polynucleotides, which may be used in analyses as diverse as those of the immune repertoire, microbiome, gene expression profiling, miRNA profiling, copy number variations, and even prenatal diagnosis of trisomies and drug resistance mutation detections (such as low copy number HIV drug resistance mutation detections).
[0021] The invention provides a method for increasing accuracy of quantitative detection of polynucleotides, the method comprising the steps of (a) labeling a target polynucleotide with a unique molecular identifier and a universal primer binding site to produce at least one labeled target polynucleotide; and (b) amplifying the at least one labeled polynucleotide using at least one universal primer to produce multiple copies of the labeled target polynucleotide. The method may be performed by incorporating into a substantial number of individual target sequences in a pool of target sequences at least one randomly-generated sequence comprising from about 4 to about 15 randomly-generated nucleotides, the at least one randomly-generated sequence forming a unique molecular identifier for an individual target sequence, and a universal adapter sequence (i.e., a primer binding site for a universal primer) to form a target/UMI/adapter polynucleotide; attaching the UMI/universal adapter sequence to the target in a reverse transcription (RT) reaction at 50-60 degrees Celsius for RNA targets (A), a primer extension reaction at 50-60 degrees Celsius for DNA targets (B), or a ligation reaction for pre-selected DNA targets (C); and attaching a second universal adapter to the product of the previous step (A) or (B) by a DNA extension reaction at approximately 70 degrees Celsius, and amplifying, with universal primer, products with the universal primer binding site attached at both ends at a temperature of approximately 70 degrees Celsius.
[0022] Accurate determination of the composition and quantification of different polynucleotides of varying frequencies in a complex genetic pool is important in a variety of applications, most notably in the areas of microbial and viral detection in clinical samples and in analyzing clinical samples for immunodiversity. Recently, a new method based on deep barcoding or unique molecular identifiers (UMI), as described by Smith et al. (Smith, A.M., "Quantitative Phenotyping via Deep Barcode Sequencing," Genome Research (2009) 19: 1836-1842), has shown promise for decreasing the counting bias introduced during amplification and sequencing. Briefly, each target in a pool is labeled with a unique barcode by covalently attaching a random sequence of a certain length (barcode) to a target polynucleotide before amplification and sequencing. The combination of barcode and target then works as a proxy for the target during amplification and is ultimately sequenced together. At the final step, each unique combination of barcode and target is counted only once. By doing so, the bias introduced during both the amplification stage and the sequencing stage can be suppressed due to the large coding capacity of random sequences of a certain length, which is about 4^N, where N is the length of the barcode (UMI); for example, the coding capacity of random sequences of length 10 is 4^10 = 1,048,576. However, there are three prerequisites for the success of this approach: 1) UMIs have to be long enough to provide sufficient coding capacity so that no two identical targets are labeled with the same UMI; 2) UMIs have to be introduced to target sequences before the amplification steps; and 3) both UMIs and target sequences have to be sequenced without errors. The first requirement can be met by using longer UMIs. The inventors have addressed the second requirement by developing a method that incorporates UMIs in a two-step PCR reaction.
The inventors address the third requirement by introducing a new statistical approach to correct for sequencing errors. By combining both methods, they make the UMI strategy more practically useful and increase the accuracy for profiling polynucleotides in a complex genetic pool.
[0023] For an RNA target, a UMI is introduced into a target through reverse transcription (RT) using reverse transcriptase (Figure 2, left panel A). A gene-specific primer, UMI, and a universal adaptor are synthesized as one single molecule, where the annealing temperature between the gene-specific primer and a target is designed to be between 50 and 60 degrees Celsius. After the RT step, a second gene-specific primer attached to a second universal adaptor, and a universal primer, are added to the reaction, where the annealing temperature between the second gene-specific primer and targets is designed to be above 70 degrees Celsius. The second annealing and extension temperature is set to 70 degrees Celsius. After this step, a PCR reaction is performed at 95 degrees C for 15 seconds, and 72 degrees C for 30-40 cycles.
[0024] For DNA targets embedded in large DNA molecules, a UMI is introduced into the target through a regular primer extension step with DNA polymerase (Figure 2, center panel B). A gene-specific primer, UMI, and a universal adaptor are synthesized as one single molecule, where the annealing temperature between the gene-specific primer and targets is designed to be between 50 and 60 degrees Celsius. After the primer extension reaction, a second gene-specific primer attached to a second universal adaptor, and a universal primer, are added to the reaction, where the annealing temperature between the second gene-specific primer and targets is designed to be above 70 degrees Celsius. The second annealing and extension temperature is set at about 70 degrees Celsius. After this step, a PCR reaction is performed at 95 degrees C for 15 seconds, and 72 degrees C for 30-40 cycles.
[0025] For fragmented DNA targets, UMI may be added using a ligation reaction. Double-stranded UMI and universal adaptors are ligated to targets directly. Universal primers are then added to the reaction and a PCR reaction is performed at 95 degrees C for 15 seconds, and 72 degrees C for 30-40 cycles. Universal primers are designed to bind 4-6 bases away from the completely random UMI sequences as our pilot study showed that the first 4 bases after the primer region are important for PCR efficiency.
[0026] The UMI strategy, when used in the absence of the added steps provided by the inventors, operates on the assumption that both PCR and sequencing steps report the underlying target and UMI fragment free of error. However, this is an incorrect assumption because errors in both PCR and sequencing are inevitable. It is commonly known that the three popular next-generation sequencing platforms on the market today (Illumina HiSeq, Life Technologies Ion Torrent PGM and 454 FLX system) produce sequences with significant numbers of sequencing errors. Figure 3 plots the error pattern of the bench-top version of the three platforms.
[0027] For profiling sequences in a complex genetic pool such as 16S rRNA sequencing and immunodiversity studies, the distribution of templates in a sample varies. Sequencing artifacts inevitably distort the result of profiling of nucleic acids in a genetic pool by sequencing. For instance, errors in the UMI region cause an over-estimation of the count of corresponding targets, and errors in the target sequences cause an over-estimation of the number of different targets in the genetic pool. After studying the error patterns of multiple sequencing attempts, several patterns stand out. First, the error rate of any next-generation sequencing platform is in the range between 0.1% and 5%. Second, errors occur differently in different contexts (i.e., errors are context-specific). Figure 3 shows a context-specific error pattern produced by the Illumina HiSeq2000 platform.
[0028] To suppress artifacts introduced by both PCR and sequencing, the inventors developed a statistical method for identifying those artifacts. This method comprises the steps of 1) estimating error rates by mixing a small amount of control DNA, the sequence of which has been previously determined, with the amplification products of UMI-labeled targets, sequencing both target and control together, and comparing sequences amplified from control DNA with the known sequences to estimate the context-specific pattern of error; 2) organizing target sequences by counting the distribution of unique sequences, where any two unique sequences are grouped if the two sequences differ in a single position; and 3) estimating the odds that the minor sequence in a group is an artifact according to the Poisson model (Figure 4A).
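The grouping in step 2 can be illustrated with a short Python sketch. This is only one possible reading of "differ in a single position" (exactly one mismatch, insertion, or deletion); the function names `differ_in_single_position` and `pair_minor_with_major` are hypothetical and do not come from the inventors' implementation. The pairwise comparison is quadratic in the number of unique sequences and is meant for illustration, not for large data sets.

```python
def differ_in_single_position(a, b):
    """Return True if a and b differ by exactly one mismatch, insertion, or deletion."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one substituted base.
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    # Lengths differ by one: skip one base of the longer sequence.
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    i = 0
    while i < len(short) and short[i] == long_[i]:
        i += 1
    return short[i:] == long_[i + 1:]


def pair_minor_with_major(counts):
    """Pair each sequence with every more abundant sequence one edit away.

    counts maps a unique sequence to its read count (hypothetical input format).
    Returns a list of (major, minor) tuples.
    """
    seqs = sorted(counts, key=counts.get, reverse=True)
    pairs = []
    for i, major in enumerate(seqs):
        for minor in seqs[i + 1:]:
            if counts[major] > counts[minor] and differ_in_single_position(major, minor):
                pairs.append((major, minor))
    return pairs
```

Each (major, minor) pair returned here would then be passed to the Poisson test of step 3, using the context-specific error rate for the position at which the two sequences differ.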
[0029] The inventors noted that if the random label segment is 15 nucleotides in length, it can randomly create about 10,756,894 unique molecular identifiers to label about 99.5% of around 10^7 target polynucleotides.
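The relationship between UMI length and coding capacity can be checked with a small Python sketch. It assumes a simple model that the source does not spell out: every target draws its UMI independently and uniformly from all 4^N possible sequences, and identical targets that receive the same UMI are counted once. The function names are hypothetical.

```python
import math


def coding_capacity(umi_length):
    """Number of distinct random sequences of the given length (4^N)."""
    return 4 ** umi_length


def expected_distinct_fraction(num_targets, umi_length):
    """Expected fraction of identical targets that remain distinguishable.

    Assumes uniform, independent UMI draws (an illustrative assumption).
    The expected number of distinct UMIs drawn is m * (1 - (1 - 1/m)^n).
    """
    m = coding_capacity(umi_length)
    # expm1/log1p keep precision when 1/m is very small.
    distinct = -m * math.expm1(num_targets * math.log1p(-1.0 / m))
    return distinct / num_targets


print(coding_capacity(10))
print(expected_distinct_fraction(10_756_894, 15))
```

Under this model, a 15-nucleotide UMI applied to about 10.76 million copies of the same target leaves roughly 99.5% of them counted, which is in line with the figures quoted above; the simulations behind Figure 1 may have used a different procedure.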
[0030] The term "a target polynucleotide" is used often herein, but it is to be understood that multiple target polynucleotides generally exist within any clinical sample. These may represent sequences derived from, for example, the same or different bacteria, T cells, B cells, viruses, etc. The term, therefore, encompasses labeling of as many single target polynucleotides as can effectively be labeled within a sample. In some cases, such as in the case of immunorepertoire analysis, target polynucleotides may easily number in the millions. Ultimately, UMI-labeled target polynucleotides comprising copies of the same DNA sequences will be individually labeled with different barcodes, each barcode being counted only once to provide a more accurate representation of the numbers of copies of target polynucleotides in a sample. It is therefore important to introduce the UMI label into the method so that it will not be utilized to prime subsequent amplifications and introduce amplification bias into the sample.
[0031] The method of the invention may be performed very effectively using a closed cassette and automated methods such as those described in United States Patent Application Publication Number 20100291668A1. The type of quantitation for which the method of the invention is especially useful (i.e., highly diverse targets, low copy numbers in samples) is also especially sensitive to the risk of contamination, which will negatively impact accurate quantitation. The closed system created by the cassette disclosed in United States Patent Application Publication Number 20100291668A1 significantly reduces the risk of contamination, while increasing the efficiency with which many samples may be processed.
[0032] When using the automated method described in United States Patent Application Publication Number 20100291668A1, a cassette is insertable into a base machine ("base unit") that operably interfaces with the cassette to provide the necessary movement of a series of parts designed to provide up-and-down vertical movement, horizontal back-and-forth movement, and fluid handling by a cassette pipette which operates within the confines of the area bounded by the top, bottom, ends, and sides of the cassette, these parts being referred to as a cam bar, a lead screw, and a pipette pump assembly, respectively. It is also possible to provide a mechanism that allows the movement of the cassette pipette in any direction in the x-y-z plane, or to allow for circular/rotary movement throughout the enclosed cassette.
[0033] At least one of the reagent chambers in the cassette may form a PCR reaction chamber for performing the desired first amplification step (PCR1) and second amplification step (PCR2) of the present invention. Such a reaction chamber may be constructed of different diameter, depth, and wall thickness than other reagent chambers. For example, a reaction chamber preferably will be a thin-walled chamber to aid in thermal conduction between external thermocyclers located in the base unit and the fluid within the reaction chamber. The walls should be tapered so as to easily fit into the thermocycler and make thermal contact with the thermocycler without adhering to its surface. The reaction chamber should be of a depth and shape that allows for its fluid volume to be positioned inside the thermocycler. The depth of the PCR chamber should be compatible with the vertical motion of the cassette pipette. Preferably, the chamber will also be accessible to a user's pipette tip if inserted into the chamber through the cassette's fill port, and the material used to form the PCR chamber may be optically clear so that the user can see when the pipette tip has reached the bottom of the chamber.
[0034] Barcodes, or Unique Molecular Identifiers (UMIs), allow quantitation of PCR products. However, the inventors' experiments with simple addition of UMI sequences in controlled assays in which the number of beginning targets and the relative concentrations of each were known demonstrated that simply adding the UMIs does not give an accurate assessment of the number of targets in, for example, a clinical sample obtained from a human or animal. They hypothesized that utilization of the primers needed for incorporation of the UMI sequences into target-derived polynucleotides could result in additional rounds of amplification in which certain UMIs were added to more than one target. This could result in UMIs representing multiple targets, but being counted as part of a single target, artificially inflating the numbers of some targets. They proposed to develop a method in which tagging/labeling of the target molecules would be performed in a first step, with subsequent steps being designed to limit the influence of the UMI-containing primers so that any primers that remained in the mix would not label additional molecules to an appreciable extent. Counting of products occurs as shown in Fig. 7, where targets may be separated according to their respective sequences and may be quantitated by the numbers of UMIs associated with them in the resulting sequencing results.
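The counting described at the end of the preceding paragraph, in which each target is quantitated by the number of distinct UMIs observed with it, can be sketched in Python as follows. The input format (an iterable of `(target_sequence, umi)` pairs, one per read) and the function name `count_molecules` are assumptions made for illustration, and the sketch applies no error filtering.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per target sequence.

    reads: iterable of (target_sequence, umi) pairs, one pair per sequencing read.
    Each unique (target, UMI) combination is counted once, however many reads support it.
    """
    umis_by_target = defaultdict(set)
    for target, umi in reads:
        umis_by_target[target].add(umi)
    return {target: len(umis) for target, umis in umis_by_target.items()}


# Hypothetical reads: target T1 was amplified unevenly but carries only two UMIs.
example_reads = [("T1", "AAAA"), ("T1", "AAAA"), ("T1", "CCCC"), ("T2", "GGGG")]
print(count_molecules(example_reads))
```

In practice, sequencing errors in either the target or the UMI would inflate these counts, which is why a statistical filtering step would be applied before counting.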
[0035] The method they designed utilizes primers comprising target-specific sequences for promoting binding to targets to initiate primer extension, as well as randomly-generated UMIs and adapters. The purpose of the adapters is to form a binding site for primers used in next steps, those primers being used to add to resulting polynucleotides nucleotide sequences that form binding sites for universal primers, those primers being chosen for their ability to effectively promote amplification at temperatures of from about 65 to about 75 degrees C. When the primers comprising target-specific sequences are designed for use at lower temperatures, their influence can be limited in the subsequent amplification steps. By using universal primers in the third step (2nd amplification step), amplification bias may be further limited.
[0036] Methods for designing primers having desired annealing temperatures are known to those of skill in the art. Methods for generating random nucleotide sequences that may be used as unique molecular identifiers have been described previously and are also known to those of skill in the art.
[0037] The present method may also comprise the step of removing a portion of the reaction mix, which contains the products of reverse transcription from the first step of the method, and using that portion for the second amplification reaction. This step may be used to further decrease the influence of the target-specific, UMI-labeled primers in the next two steps.
[0038] Sequencing methods, including next-generation high-throughput sequencing methods, are prone to errors. Although these errors may be limited to a small percentage, they may produce a significant and unacceptable level of variance when large numbers of nucleotides are sequenced. The method may also further comprise the steps of sequencing the products produced by steps a through c and correcting for sequencing errors using a statistical filtering step using formula I:
Figure imgf000017_0001
Particularly when used in the analysis of a human or animal immunorepertoire or the microbial population of, for example, the human intestine, the combination of individually labeling target molecules, semi-quantitatively amplifying those labeled molecules using the two-step amplification of the present invention, using universal primers to decrease amplification bias and improve amplification efficiency, and statistically correcting the sequencing results, will give a much more accurate result and allow a researcher to better determine the types and numbers of immune system cells, antibodies, bacteria, etc. that are present in a given sample.
[0039] The invention may be further described by means of the following non-limiting examples.
Examples
[0040] The following primers were used to incorporate into each target sequence a unique molecular identifier:
miIgHC_1: ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNNNNNTCTGACGTCAGTGGGTAGATGGTGGG (SEQ ID NO: 1);
miIgHC_2: ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNNNNNTCTGACTGGATAGACTGATGGGGGTG (SEQ ID NO: 2);
miIgHC_3: ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNNNNNTCTGACGTGGATAGACAGATGGGGGT (SEQ ID NO: 3);
miIgHC_4: ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNNNNNTCTGACAAGGGGTAGAGCTGAGGGTT (SEQ ID NO: 4);
miIgHC_5: ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNNNNNTCTGACTGGATAGACCGATGGGGCTG (SEQ ID NO: 5);
miIgHC_6: ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNNNNNTCTGACGGGGAAGACATTTGGGAAGG (SEQ ID NO: 6);
miIgHC_7: ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNNNNNTCTGACAGAGGAGGAACATGTCAGGT (SEQ ID NO: 7); and
miIgHC_8: ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNNNNNTCTGACGGGATAGACAGATGGGGCTG (SEQ ID NO: 8).
[0041] TMs of UMI segments targeted for use as annealing sequences were evaluated. Results are listed in Table 1, in order from lowest to highest TM.
Table 1
Figure imgf000019_0001
Templates containing UMIs were generated using reagents as shown in Table 2, under conditions as shown in Table 3.
Table 2
Figure imgf000019_0002
Table 3
Figure imgf000019_0003
[0042] A first primer sequence was synthesized (SEQ ID NO: 9: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCC, with bold print indicating the adapter sequence). A second primer sequence was also synthesized (SEQ ID NO: 10: CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCC, with bold print indicating the adapter sequence).
[0043] Illumina primers (SEQ ID NO: 11: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT and SEQ ID NO: 12: CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT) served as universal primers.
[0044] Primers were tested in both 2-step and 3-step PCR to determine how well they would perform in the method of the invention. Reaction conditions are shown in Tables 4, 5, and 6. Results are shown in Fig. 5A.
Table 4
Figure imgf000020_0001
Table 5
Figure imgf000021_0001
Table 6
Figure imgf000021_0002
[0045] Universal primers were tested using the following combinations: (1) SEQ ID NO: 12 as forward primer, SEQ ID NO: 11 as reverse primer; (2) SEQ ID NO: 12 as forward primer, UMI primer 1 with SEQ ID NO: 11 as reverse primer; (3) SEQ ID NO: 12 as forward primer, UMI primer 2 with SEQ ID NO: 11 as reverse primer; (4) SEQ ID NO: 12 as forward primer, UMI primer 3 with SEQ ID NO: 11 as reverse primer; and (5) SEQ ID NO: 12 as forward primer, UMI primer 5 with SEQ ID NO: 11 as reverse primer. Results are shown in Fig. 5B.
Clear Errors Exist in Current Technology
[0046] The inventors began with 4 distinct clones, which were then spiked into a background sample at different concentrations. Following amplification and sequencing, results indicated that there were about 50,000 different clones in the sample, a 12,500-fold increase over the 4 clones actually present. Such a result is unacceptable if the purpose of the work is to quantitate the amount of target DNA in order to evaluate a clinical sample.
Example of Use of Formula I for Evaluating Results
[0047] For VDJ sequencing, control DNA (1-5%; e.g., PhiX DNA) was mixed with VDJ amplicons and all were sequenced together. Reads for control DNA were extracted based on matches between reads and the reference sequence for control DNA. Control DNA sequences were aligned to the corresponding reference sequences. Context-specific error patterns were summarized by counting the differences in the alignment between reads and reference (control) DNA, giving an estimate of the context-specific error rate. For example, if for a small (three-nucleotide) fragment GCA there are 1000 GCA's in all alignments: 991 GCA->GCA, 3 GCA->GCC, 2 GCA->GCG, 2 GCA->GCT, 1 GCA->GC- (deletion) and 1 GCA->GCAx (insertion, where x is any one of A, C, G and T), then the error rate for GCA->GCC is 0.003, GCA->GCG is 0.002, GCA->GCT is 0.002, GCA->GC- is 0.001 and GCA->GCAx is 0.001.
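The GCA example above amounts to dividing the count of each observed change by the total number of times the reference context was seen. A minimal Python sketch of that tally follows; the input format (pairs of reference context and observed read context, already extracted from the alignments) and the function name `context_error_rates` are assumptions made for illustration.

```python
from collections import Counter


def context_error_rates(observations):
    """Estimate context-specific error rates from aligned control reads.

    observations: iterable of (reference_context, observed_context) pairs.
    Returns {(reference_context, observed_context): rate} for every observed change.
    """
    totals = Counter()
    changes = Counter()
    for ref, obs in observations:
        totals[ref] += 1
        if obs != ref:
            changes[(ref, obs)] += 1
    return {pair: n / totals[pair[0]] for pair, n in changes.items()}


# The 1000 GCA observations from the example in the text.
observations = (
    [("GCA", "GCA")] * 991
    + [("GCA", "GCC")] * 3
    + [("GCA", "GCG")] * 2
    + [("GCA", "GCT")] * 2
    + [("GCA", "GC-")] * 1    # deletion
    + [("GCA", "GCAx")] * 1   # insertion of any one base
)
rates = context_error_rates(observations)
print(rates[("GCA", "GCC")])
```

The resulting rates correspond to the values of μ that would be supplied to the Poisson test for a pair of sequences differing at that context.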
[0048] For any pair of CDR3 nucleotide sequences (for example A and B, with frequency(A) > frequency(B)) that differ in a single position (due to either mismatch, insertion or deletion), one can look up the error rate calculated above according to the context of this difference. Assuming that sequencing errors follow a Poisson distribution, with frequency(A) = N and frequency(B) = n, the probability that such B would occur n or more times if it were a sequencing error may be calculated using Formula I.
Figure imgf000023_0001
Formula I
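Formula I appears only as an image in this text. Based on the surrounding description (a Poisson model with λ = N·μ, and P defined as the probability of seeing the minor sequence n or more times by error alone), one plausible reading is the Poisson upper tail P(X ≥ n). The Python sketch below implements that reading using only the standard library; the function names and the example values of N, μ, and n are hypothetical and are not taken from Figure 4.

```python
import math


def poisson_upper_tail(n, lam):
    """Return P(X >= n) for X ~ Poisson(lam)."""
    if n <= 0:
        return 1.0
    if lam <= 0:
        return 0.0

    def log_pmf(k):
        return -lam + k * math.log(lam) - math.lgamma(k + 1)

    if n > lam:
        # Sum the upper tail directly; terms shrink as k grows.
        term, total, k = math.exp(log_pmf(n)), 0.0, n
        while term > 0.0 and term > total * 1e-17:
            total += term
            k += 1
            term *= lam / k
        return min(total, 1.0)
    # n <= lam: sum the lower tail P(X <= n - 1) downward and subtract from 1.
    term, lower, k = math.exp(log_pmf(n - 1)), 0.0, n - 1
    while k >= 0 and term > 0.0 and term > lower * 1e-17:
        lower += term
        term *= k / lam
        k -= 1
    return max(0.0, 1.0 - lower)


def is_likely_artifact(minor_count, major_count, error_rate, threshold=0.001):
    """Treat the minor sequence as an artifact unless P falls below the threshold."""
    p = poisson_upper_tail(minor_count, major_count * error_rate)
    return not (p < threshold)


# Hypothetical example: major sequence seen 100,000 times, error rate 0.002 (lambda = 200).
print(is_likely_artifact(150, 100_000, 0.002))  # minor count below the expected error count
print(is_likely_artifact(400, 100_000, 0.002))  # minor count far above the expected error count
```

With the 0.001 threshold, a minor sequence is kept as authentic only when its count is far higher than sequencing error alone would be expected to produce from the major sequence.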

Claims

What is claimed is:
1. A method for increasing accuracy of quantitative detection of polynucleotides, the method comprising the steps of
a) labeling a target polynucleotide with a unique molecular identifier and a universal primer binding site to produce at least one labeled target polynucleotide; and
b) amplifying the at least one labeled polynucleotide using at least one universal primer to produce multiple copies of the labeled target polynucleotide.
2. The method of claim 1 wherein step a) is performed at a temperature of from about 50 to about 60 degrees Celsius.
3. The method of claim 1 wherein step b) is performed at a temperature of from about 65 to about 75 degrees Celsius.
4. The method of claim 1 wherein the step of labeling is performed by reverse transcription.
5. The method of claim 1 wherein the step of labeling is performed by ligation.
6. The method of claim 1 further comprising the steps of
c) estimating error rates by amplifying a small amount of control DNA in step (b), sequencing both labeled target polynucleotide and control DNA together, and comparing sequences for control DNA with known sequences, to estimate a context-specific pattern of error;
d) counting the distribution of unique labeled target polynucleotide sequences, where any two unique sequences are grouped if the two sequences differ in a single position; and
e) estimating the odds of detecting the presence of a minor sequence in a group of artifacts according to the Poisson model using Formula I
Figure imgf000025_0001
Formula I
where λ is the expected number of errors given N reads and is computed by λ = N · μ, and μ is the error rate per site estimated from the sequences of control DNA, with variants that give P < 0.001 considered unlikely to be sequencing errors.
PCT/US2013/041031 2012-05-14 2013-05-14 Method for increasing accuracy in quantitative detection of polynucleotides WO2013173394A2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP13790712.7A EP2850211B1 (en) 2012-05-14 2013-05-14 Method for increasing accuracy in quantitative detection of polynucleotides
ES13790712T ES2899687T3 (en) 2012-05-14 2013-05-14 Method to increase the precision in the quantitative detection of polynucleotides
CA2873585A CA2873585C (en) 2012-05-14 2013-05-14 Method for increasing accuracy in quantitative detection of polynucleotides
US14/401,322 US20150132754A1 (en) 2012-05-14 2013-05-14 Method for increasing accuracy in quantitative detection of polynucleotides

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201261646714P 2012-05-14 2012-05-14
US61/646,714 2012-05-14

Publications (2)

Publication Number Publication Date
WO2013173394A2 true WO2013173394A2 (en) 2013-11-21
WO2013173394A3 WO2013173394A3 (en) 2014-01-23

Family

ID=49584445

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2013/041031 WO2013173394A2 (en) 2012-05-14 2013-05-14 Method for increasing accuracy in quantitative detection of polynucleotides

Country Status (6)

Country Link
US (1) US20150132754A1 (en)
EP (1) EP2850211B1 (en)
CA (1) CA2873585C (en)
ES (1) ES2899687T3 (en)
PT (1) PT2850211T (en)
WO (1) WO2013173394A2 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11286518B2 (en) 2016-05-06 2022-03-29 Regents Of The University Of Minnesota Analytical standards and methods of using same
US10774377B1 (en) 2017-10-05 2020-09-15 Verily Life Sciences Llc Use of unique molecular identifiers for improved sequencing of taxonomically relevant genes
WO2020185967A1 (en) * 2019-03-11 2020-09-17 Red Genomics, Inc. Methods and reagents for enhanced next generation sequencing library conversion and insertion of barcodes into nucleic acids.

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5213961A (en) 1989-08-31 1993-05-25 Brigham And Women's Hospital Accurate quantitation of RNA and DNA by competetitive polymerase chain reaction
US20100291668A1 (en) 2009-05-14 2010-11-18 Jeff Bertrand Apparatus for Performing Amplicon Rescue Multiplex PCR

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102008025656B4 (en) * 2008-05-28 2016-07-28 Genxpro Gmbh Method for the quantitative analysis of nucleic acids, markers therefor and their use
WO2009151407A2 (en) * 2008-06-14 2009-12-17 Veredus Laboratories Pte Ltd Influenza sequences
US8828688B2 (en) * 2010-05-27 2014-09-09 Affymetrix, Inc. Multiplex amplification methods

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
GLENN K FU ET AL., PNAS, 31 May 2011 (2011-05-31), pages 9026 - 9031
HUBERT HUG ET AL., JOURNAL OF THEORETICAL BIOLOGY, vol. 221, no. 4, 1 April 2003 (2003-04-01), pages 615 - 624
See also references of EP2850211A4
SMITH, A.M.: "Quantitative Phenotyping via Deep Barcode Sequencing", GENOME RESEARCH, vol. 19, 2009, pages 1836 - 1842, XP055240815, DOI: 10.1101/gr.093955.109

Cited By (158)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10059991B2 (en) 2009-12-15 2018-08-28 Cellular Research, Inc. Digital counting of individual molecules by stochastic attachment of diverse labels
US10619203B2 (en) 2009-12-15 2020-04-14 Becton, Dickinson And Company Digital counting of individual molecules by stochastic attachment of diverse labels
US9816137B2 (en) 2009-12-15 2017-11-14 Cellular Research, Inc. Digital counting of individual molecules by stochastic attachment of diverse labels
US10047394B2 (en) 2009-12-15 2018-08-14 Cellular Research, Inc. Digital counting of individual molecules by stochastic attachment of diverse labels
US10392661B2 (en) 2009-12-15 2019-08-27 Becton, Dickinson And Company Digital counting of individual molecules by stochastic attachment of diverse labels
US10202646B2 (en) 2009-12-15 2019-02-12 Becton, Dickinson And Company Digital counting of individual molecules by stochastic attachment of diverse labels
US9845502B2 (en) 2009-12-15 2017-12-19 Cellular Research, Inc. Digital counting of individual molecules by stochastic attachment of diverse labels
US9708659B2 (en) 2009-12-15 2017-07-18 Cellular Research, Inc. Digital counting of individual molecules by stochastic attachment of diverse labels
US10941396B2 (en) 2012-02-27 2021-03-09 Becton, Dickinson And Company Compositions and kits for molecular counting
US11634708B2 (en) 2012-02-27 2023-04-25 Becton, Dickinson And Company Compositions and kits for molecular counting
US10161007B2 (en) 2012-08-13 2018-12-25 The Regents Of The University Of California Methods and systems for detecting biological components
US11891666B2 (en) 2012-08-13 2024-02-06 The Regents Of The University Of California Methods and systems for detecting biological components
US11001896B2 (en) 2012-08-13 2021-05-11 The Regents Of The University Of California System and method to synthesize a target molecule within a droplet
US10745762B2 (en) 2012-08-13 2020-08-18 The Regents Of The University Of California Method and system for synthesizing a target polynucleotide within a droplet
US11203787B2 (en) 2012-08-13 2021-12-21 The Regents Of The University Of California Methods and systems for detecting biological components
US10041127B2 (en) 2012-09-04 2018-08-07 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11773453B2 (en) 2012-09-04 2023-10-03 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US9834822B2 (en) 2012-09-04 2017-12-05 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11319598B2 (en) 2012-09-04 2022-05-03 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US9902992B2 (en) 2012-09-04 2018-02-27 Guardant Helath, Inc. Systems and methods to detect rare mutations and copy number variation
US11001899B1 (en) 2012-09-04 2021-05-11 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10995376B1 (en) 2012-09-04 2021-05-04 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10683556B2 (en) 2012-09-04 2020-06-16 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11319597B2 (en) 2012-09-04 2022-05-03 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
GB2533006B (en) * 2012-09-04 2017-06-07 Guardant Health Inc Systems and methods to detect copy number variation
US10961592B2 (en) 2012-09-04 2021-03-30 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
GB2533006A (en) * 2012-09-04 2016-06-08 Guardant Health Inc Systems and methods to detect rare mutations and copy number variation
US10947600B2 (en) 2012-09-04 2021-03-16 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11434523B2 (en) 2012-09-04 2022-09-06 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US9598731B2 (en) 2012-09-04 2017-03-21 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US9840743B2 (en) 2012-09-04 2017-12-12 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10894974B2 (en) 2012-09-04 2021-01-19 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10876171B2 (en) 2012-09-04 2020-12-29 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10876152B2 (en) 2012-09-04 2020-12-29 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10876172B2 (en) 2012-09-04 2020-12-29 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10837063B2 (en) 2012-09-04 2020-11-17 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11879158B2 (en) 2012-09-04 2024-01-23 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10822663B2 (en) 2012-09-04 2020-11-03 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10457995B2 (en) 2012-09-04 2019-10-29 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10793916B2 (en) 2012-09-04 2020-10-06 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10494678B2 (en) 2012-09-04 2019-12-03 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11913065B2 (en) 2012-09-04 2024-02-27 Guardent Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10501810B2 (en) 2012-09-04 2019-12-10 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10501808B2 (en) 2012-09-04 2019-12-10 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10738364B2 (en) 2012-09-04 2020-08-11 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11952622B2 (en) * 2013-07-18 2024-04-09 The Johns Hopkins University Analysis of DNA-containing samples and resolution of mixed contributor DNA samples
US10954570B2 (en) 2013-08-28 2021-03-23 Becton, Dickinson And Company Massively parallel single cell analysis
US10131958B1 (en) 2013-08-28 2018-11-20 Cellular Research, Inc. Massively parallel single cell analysis
US10208356B1 (en) 2013-08-28 2019-02-19 Becton, Dickinson And Company Massively parallel single cell analysis
US10253375B1 (en) 2013-08-28 2019-04-09 Becton, Dickinson And Company Massively parallel single cell analysis
US9637799B2 (en) 2013-08-28 2017-05-02 Cellular Research, Inc. Massively parallel single cell analysis
US11618929B2 (en) 2013-08-28 2023-04-04 Becton, Dickinson And Company Massively parallel single cell analysis
US10927419B2 (en) 2013-08-28 2021-02-23 Becton, Dickinson And Company Massively parallel single cell analysis
US11702706B2 (en) 2013-08-28 2023-07-18 Becton, Dickinson And Company Massively parallel single cell analysis
US9598736B2 (en) 2013-08-28 2017-03-21 Cellular Research, Inc. Massively parallel single cell analysis
US10151003B2 (en) 2013-08-28 2018-12-11 Cellular Research, Inc. Massively Parallel single cell analysis
US9905005B2 (en) 2013-10-07 2018-02-27 Cellular Research, Inc. Methods and systems for digitally counting features on arrays
US11149307B2 (en) 2013-12-28 2021-10-19 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11639525B2 (en) 2013-12-28 2023-05-02 Guardant Health, Inc. Methods and systems for detecting genetic variants
US10889858B2 (en) 2013-12-28 2021-01-12 Guardant Health, Inc. Methods and systems for detecting genetic variants
US9920366B2 (en) 2013-12-28 2018-03-20 Guardant Health, Inc. Methods and systems for detecting genetic variants
US10801063B2 (en) 2013-12-28 2020-10-13 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11149306B2 (en) 2013-12-28 2021-10-19 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11118221B2 (en) 2013-12-28 2021-09-14 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11649491B2 (en) 2013-12-28 2023-05-16 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11667967B2 (en) 2013-12-28 2023-06-06 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11767555B2 (en) 2013-12-28 2023-09-26 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11639526B2 (en) 2013-12-28 2023-05-02 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11767556B2 (en) 2013-12-28 2023-09-26 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11434531B2 (en) 2013-12-28 2022-09-06 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11959139B2 (en) 2013-12-28 2024-04-16 Guardant Health, Inc. Methods and systems for detecting genetic variants
US10883139B2 (en) 2013-12-28 2021-01-05 Guardant Health, Inc. Methods and systems for detecting genetic variants
US10982265B2 (en) 2014-03-05 2021-04-20 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11091796B2 (en) 2014-03-05 2021-08-17 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10870880B2 (en) 2014-03-05 2020-12-22 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10704085B2 (en) 2014-03-05 2020-07-07 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10704086B2 (en) 2014-03-05 2020-07-07 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11667959B2 (en) 2014-03-05 2023-06-06 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11091797B2 (en) 2014-03-05 2021-08-17 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11447813B2 (en) 2014-03-05 2022-09-20 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
JP2017508471A (en) * 2014-03-28 2017-03-30 ジーイー・ヘルスケア・バイオサイエンス・コーポレイション Accurate detection of rare genetic mutations in next-generation sequencing
EP3122894A4 (en) * 2014-03-28 2017-11-08 GE Healthcare Bio-Sciences Corp. Accurate detection of rare genetic variants in next generation sequencing
US10697007B2 (en) 2014-06-27 2020-06-30 The Regents Of The University Of California PCR-activated sorting (PAS)
US11312990B2 (en) 2014-06-27 2022-04-26 The Regents Of The University Of California PCR-activated sorting (PAS)
US11020736B2 (en) 2014-10-22 2021-06-01 The Regents Of The University Of California High definition microdroplet printer
US10434507B2 (en) 2014-10-22 2019-10-08 The Regents Of The University Of California High definition microdroplet printer
US11732287B2 (en) 2015-02-04 2023-08-22 The Regents Of The University Of California Sequencing of nucleic acids via barcoding in discrete entities
WO2016126871A3 (en) * 2015-02-04 2016-10-06 The Regents Of The University Of California Sequencing of nucleic acids via barcoding in discrete entities
US11111519B2 (en) 2015-02-04 2021-09-07 The Regents Of The University Of California Sequencing of nucleic acids via barcoding in discrete entities
US20170009274A1 (en) * 2015-02-04 2017-01-12 The Regents Of The University Of California Sequencing of nucleic acids via barcoding in discrete entities
US11098358B2 (en) 2015-02-19 2021-08-24 Becton, Dickinson And Company High-throughput single-cell analysis combining proteomic and genomic information
US10697010B2 (en) 2015-02-19 2020-06-30 Becton, Dickinson And Company High-throughput single-cell analysis combining proteomic and genomic information
US9727810B2 (en) 2015-02-27 2017-08-08 Cellular Research, Inc. Spatially addressable molecular barcoding
USRE48913E1 (en) 2015-02-27 2022-02-01 Becton, Dickinson And Company Spatially addressable molecular barcoding
WO2016138500A1 (en) * 2015-02-27 2016-09-01 Cellular Research, Inc. Methods and compositions for barcoding nucleic acids for sequencing
US10002316B2 (en) 2015-02-27 2018-06-19 Cellular Research, Inc. Spatially addressable molecular barcoding
US11535882B2 (en) 2015-03-30 2022-12-27 Becton, Dickinson And Company Methods and compositions for combinatorial barcoding
WO2016172362A1 (en) 2015-04-21 2016-10-27 General Automation Lab Technologies, Inc. High resolution systems, kits, apparatus, and methods for high throughput microbiology applications
EP4070887A1 (en) 2015-04-21 2022-10-12 Isolation Bio Inc. Methods for high throughput microbiology applications
US11390914B2 (en) 2015-04-23 2022-07-19 Becton, Dickinson And Company Methods and compositions for whole transcriptome amplification
US10844428B2 (en) 2015-04-28 2020-11-24 Illumina, Inc. Error suppression in sequenced DNA fragments using redundant reads with unique molecular indices (UMIS)
US11866777B2 (en) 2015-04-28 2024-01-09 Illumina, Inc. Error suppression in sequenced DNA fragments using redundant reads with unique molecular indices (UMIS)
US11124823B2 (en) 2015-06-01 2021-09-21 Becton, Dickinson And Company Methods for RNA quantification
US11332776B2 (en) 2015-09-11 2022-05-17 Becton, Dickinson And Company Methods and compositions for library normalization
US10619186B2 (en) 2015-09-11 2020-04-14 Cellular Research, Inc. Methods and compositions for library normalization
US10465232B1 (en) 2015-10-08 2019-11-05 Trace Genomics, Inc. Methods for quantifying efficiency of nucleic acid extraction and detection
US11242569B2 (en) 2015-12-17 2022-02-08 Guardant Health, Inc. Methods to determine tumor gene copy number by analysis of cell-free DNA
US10822643B2 (en) 2016-05-02 2020-11-03 Cellular Research, Inc. Accurate molecular barcoding
US10301677B2 (en) 2016-05-25 2019-05-28 Cellular Research, Inc. Normalization of nucleic acid libraries
US11845986B2 (en) 2016-05-25 2023-12-19 Becton, Dickinson And Company Normalization of nucleic acid libraries
US11397882B2 (en) 2016-05-26 2022-07-26 Becton, Dickinson And Company Molecular label counting adjustment methods
US10202641B2 (en) 2016-05-31 2019-02-12 Cellular Research, Inc. Error correction in amplification of samples
US11525157B2 (en) 2016-05-31 2022-12-13 Becton, Dickinson And Company Error correction in amplification of samples
US10640763B2 (en) 2016-05-31 2020-05-05 Cellular Research, Inc. Molecular indexing of internal sequences
US11220685B2 (en) 2016-05-31 2022-01-11 Becton, Dickinson And Company Molecular indexing of internal sequences
WO2018005390A1 (en) 2016-06-30 2018-01-04 General Automation Lab Technologies, Inc. High resolution systems, kits, apparatus, and methods using lateral flow for high throughput microbiology applications
US11142791B2 (en) 2016-08-10 2021-10-12 The Regents Of The University Of California Combined multiple-displacement amplification and PCR in an emulsion microdroplet
US10338066B2 (en) 2016-09-26 2019-07-02 Cellular Research, Inc. Measurement of protein expression using reagents with barcoded oligonucleotide sequences
US11460468B2 (en) 2016-09-26 2022-10-04 Becton, Dickinson And Company Measurement of protein expression using reagents with barcoded oligonucleotide sequences
US11467157B2 (en) 2016-09-26 2022-10-11 Becton, Dickinson And Company Measurement of protein expression using reagents with barcoded oligonucleotide sequences
US11782059B2 (en) 2016-09-26 2023-10-10 Becton, Dickinson And Company Measurement of protein expression using reagents with barcoded oligonucleotide sequences
US11164659B2 (en) 2016-11-08 2021-11-02 Becton, Dickinson And Company Methods for expression profile classification
US11608497B2 (en) 2016-11-08 2023-03-21 Becton, Dickinson And Company Methods for cell label classification
US11124830B2 (en) 2016-12-21 2021-09-21 The Regents Of The University Of California Single cell genomic sequencing using hydrogel based droplets
US10722880B2 (en) 2017-01-13 2020-07-28 Cellular Research, Inc. Hydrophilic coating of fluidic channels
US10844429B2 (en) 2017-01-18 2020-11-24 Illumina, Inc. Methods and systems for generation and error-correction of unique molecular index sets with heterogeneous molecular lengths
US11761035B2 (en) 2017-01-18 2023-09-19 Illumina, Inc. Methods and systems for generation and error-correction of unique molecular index sets with heterogeneous molecular lengths
US11319583B2 (en) 2017-02-01 2022-05-03 Becton, Dickinson And Company Selective amplification using blocking oligonucleotides
US10676779B2 (en) 2017-06-05 2020-06-09 Becton, Dickinson And Company Sample indexing for single cells
US10669570B2 (en) 2017-06-05 2020-06-02 Becton, Dickinson And Company Sample indexing for single cells
WO2018237092A1 (en) * 2017-06-20 2018-12-27 uBiome, Inc. Method and system for library preparation with unique molecular identifiers
US20180362967A1 (en) * 2017-06-20 2018-12-20 uBiome, Inc. Method and system for library preparation with unique molecular identifiers
CN111201323A (en) * 2017-06-20 2020-05-26 普梭梅根公司 Methods and systems for library preparation using unique molecular identifiers
JP2020528740A (en) * 2017-06-20 2020-10-01 プソマーゲン, インコーポレイテッドPsomagen, Inc. Methods and systems for preparing libraries with unique molecular identifiers
US11447818B2 (en) 2017-09-15 2022-09-20 Illumina, Inc. Universal short adapters with variable length non-random unique molecular identifiers
US11898198B2 (en) 2017-09-15 2024-02-13 Illumina, Inc. Universal short adapters with variable length non-random unique molecular identifiers
US11781129B2 (en) 2017-10-18 2023-10-10 Mission Bio, Inc. Method, systems and apparatus for single cell analysis
US10501739B2 (en) 2017-10-18 2019-12-10 Mission Bio, Inc. Method, systems and apparatus for single cell analysis
US11946095B2 (en) 2017-12-19 2024-04-02 Becton, Dickinson And Company Particles associated with oligonucleotides
US11365409B2 (en) 2018-05-03 2022-06-21 Becton, Dickinson And Company Molecular barcoding on opposite transcript ends
US11773441B2 (en) 2018-05-03 2023-10-03 Becton, Dickinson And Company High throughput multiomics sample analysis
US11639517B2 (en) 2018-10-01 2023-05-02 Becton, Dickinson And Company Determining 5′ transcript sequences
US11932849B2 (en) 2018-11-08 2024-03-19 Becton, Dickinson And Company Whole transcriptome analysis of single cells using random priming
US11492660B2 (en) 2018-12-13 2022-11-08 Becton, Dickinson And Company Selective extension in single cell whole transcriptome analysis
US11371076B2 (en) 2019-01-16 2022-06-28 Becton, Dickinson And Company Polymerase chain reaction normalization through primer titration
US11661631B2 (en) 2019-01-23 2023-05-30 Becton, Dickinson And Company Oligonucleotides associated with antibodies
WO2020178772A1 (en) * 2019-03-04 2020-09-10 King Abdullah University Of Science And Technology Compositions and methods of labeling nucleic acids and sequencing and analysis thereof
US11965208B2 (en) 2019-04-19 2024-04-23 Becton, Dickinson And Company Methods of associating phenotypical data and single cell sequencing data
US11365441B2 (en) 2019-05-22 2022-06-21 Mission Bio, Inc. Method and apparatus for simultaneous targeted sequencing of DNA, RNA and protein
US11667954B2 (en) 2019-07-01 2023-06-06 Mission Bio, Inc. Method and apparatus to normalize quantitative readouts in single-cell experiments
US11939622B2 (en) 2019-07-22 2024-03-26 Becton, Dickinson And Company Single cell chromatin immunoprecipitation sequencing assay
US11970737B2 (en) 2019-08-26 2024-04-30 Becton, Dickinson And Company Digital counting of individual molecules by stochastic attachment of diverse labels
US11773436B2 (en) 2019-11-08 2023-10-03 Becton, Dickinson And Company Using random priming to obtain full-length V(D)J information for immune repertoire sequencing
US11649497B2 (en) 2020-01-13 2023-05-16 Becton, Dickinson And Company Methods and compositions for quantitation of proteins and RNA
US11661625B2 (en) 2020-05-14 2023-05-30 Becton, Dickinson And Company Primers for immune repertoire profiling
US11932901B2 (en) 2020-07-13 2024-03-19 Becton, Dickinson And Company Target enrichment using nucleic acid probes for scRNAseq
US11739443B2 (en) 2020-11-20 2023-08-29 Becton, Dickinson And Company Profiling of highly expressed and lowly expressed proteins
WO2023059599A1 (en) 2021-10-04 2023-04-13 F. Hoffmann-La Roche Ag Online base call compression

Also Published As

Publication number Publication date
PT2850211T (en) 2021-11-29
CA2873585C (en) 2021-11-09
CA2873585A1 (en) 2013-11-21
EP2850211B1 (en) 2021-09-08
EP2850211A2 (en) 2015-03-25
WO2013173394A3 (en) 2014-01-23
EP2850211A4 (en) 2016-01-13
US20150132754A1 (en) 2015-05-14
ES2899687T3 (en) 2022-03-14

Similar Documents

Publication Publication Date Title
CA2873585C (en) Method for increasing accuracy in quantitative detection of polynucleotides
JP5985390B2 (en) Multi-primer amplification method for adding barcode to target nucleic acid
San Segundo-Val et al. Introduction to the gene expression analysis
EP2852682B1 (en) Single-particle analysis of particle populations
EP2569453B1 (en) Nucleic acid isolation methods
JP6404714B2 (en) Multivariate diagnostic assay and method for using the same
US10138519B2 (en) Universal sanger sequencing from next-gen sequencing amplicons
JP2010534069A5 (en)
CN102203287A (en) Assay methods for increased throughput of samples and/or targets
CN106536735A (en) Probe set for analyzing a dna sample and method for using the same
US20170175170A1 (en) High-level multiplex amplification
EP3325152B1 (en) Automated sample to ngs library preparation
Ashton et al. Comparative analysis of single-cell RNA sequencing platforms and methods
JP2010017127A5 (en)
US20220136043A1 (en) Systems and methods for separating decoded arrays
CN105112507A (en) Digital constant-temperature detection method of miRNA
EP3325697B1 (en) Optimized clinical sample sequencing
CN109790587B (en) Method for discriminating origin of human genomic DNA of 100pg or less, method for identifying individual, and method for analyzing degree of engraftment of hematopoietic stem cells
Saini et al. Molecular Tools for Microbial Diversity Analysis
CN105247076B (en) Method for amplifying fragmented target nucleic acids using assembler sequences

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 13790712; Country of ref document: EP; Kind code of ref document: A2)
ENP Entry into the national phase (Ref document number: 2873585; Country of ref document: CA)
WWE WIPO information: entry into national phase (Ref document number: 14401322; Country of ref document: US)
WWE WIPO information: entry into national phase (Ref document number: 2013790712; Country of ref document: EP)