WO2016028316A1 - Methods for quantitative genetic analysis of cell free dna - Google Patents

Methods for quantitative genetic analysis of cell free DNA

Info

Publication number
WO2016028316A1
Authority
WO
WIPO (PCT)
Prior art keywords
cfdna
genetic
library
dna
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2014/052317
Other languages
English (en)
French (fr)
Inventor
Christopher K. Raymond
Lee P. Lim
Christopher D. Armour
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Resolution Bioscience Inc
Original Assignee
Resolution Bioscience Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Resolution Bioscience Inc
Priority to EP14761762.5A (EP3194612B1)
Priority to HUE14761762A (HUE068191T2)
Priority to CA2957657A (CA2957657A1)
Priority to EP24182096.8A (EP4410978A3)
Priority to ES14761762T (ES2984266T3)
Priority to DK14761762.5T (DK3194612T3)
Priority to PT147617625T (PT3194612T)
Priority to JP2017510397A (JP6709778B2)
Priority to CN201480081729.4A (CN107002118B)
Priority to CN202210630244.2A (CN115029342B)
Priority to PCT/US2014/052317 (WO2016028316A1)
Priority to PL14761762.5T (PL3194612T3)
Priority to SG11201701113WA (SG11201701113WA)
Publication of WO2016028316A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1093: General methods of preparing gene libraries, not provided for in other subgroups

Definitions

  • CLFK_002_00US_ST25.txt The text file is 117 KB, was created on August 22, 2014, and is being submitted electronically via EFS-Web.
  • the invention relates generally to compositions and methods for the quantitative genetic analysis of cell free DNA (cfDNA).
  • the present invention relates to improved targeted sequence capture compositions and methods for the genetic characterization and analysis of cfDNA.
  • next-generation sequencing technologies 2004-present
  • NSCLC next-generation sequencing technologies
  • molecular diagnostics have consisted of antibody-based tests (immunohistochemistry), in-situ hybridization with DNA probes (fluorescence in situ hybridization), and hybridization or PCR-based tests that query specific nucleotide sequences.
  • DNA sequencing as a molecular diagnostic tool has been generally limited to the coding exons of one or two genes. While DNA sequencing has been used in the diagnosis and treatment of solid cancers, one of the most significant drawbacks of these methods is that they require direct access to tumor tissues. Such material is often difficult to obtain from the initial biopsy used to diagnose the disease and virtually impossible to obtain in multiple repetitions over time. Similarly, biopsies are not possible in patients with inaccessible tumors and not practical in individuals suffering from metastatic disease.
  • the invention relates generally to improved compositions and methods for the genetic analysis of cfDNA.
  • a method for genetic analysis of cell-free DNA (cfDNA) is provided, comprising: treating cfDNA with one or more end-repair enzymes to generate end-repaired cfDNA; ligating one or more adaptors to each end of the end-repaired cfDNA to generate a cfDNA library; amplifying the cfDNA library to generate cfDNA library clones; determining the number of genome equivalents in the cfDNA clone library; and performing a quantitative genetic analysis of one or more target genetic loci in the cfDNA library clones.
  • the method further comprises isolating cfDNA from a biological sample of a subject.
  • the cfDNA is isolated from a biological sample selected from the group consisting of: amniotic fluid, blood, plasma, serum, semen, lymphatic fluid, cerebral spinal fluid, ocular fluid, urine, saliva, stool, mucous, and sweat.
  • the one or more adaptors comprise a plurality of adaptor species.
  • the one or more adaptors each comprise a primer binding site for amplification of the cfDNA library.
  • the one or more adaptors each comprise one or more unique read codes.
  • the one or more adaptors each comprise one or more sample codes for sample multiplexing.
  • the one or more adaptors each comprise one or more sequences for DNA sequencing.
  • qPCR is performed on the cfDNA clone library and a qPCR measurement is compared to standards of known genome equivalents to determine the genome equivalents of the cfDNA clone library.
  • qPCR is performed with a primer that binds to an Alu sequence and a primer that binds to a sequence in an adaptor.
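The comparison against standards of known genome equivalents can be sketched as a standard-curve fit. This is a minimal illustration, assuming the qPCR readout is a quantification cycle (Cq) that falls linearly with the log10 of the input; the function names and the standards are hypothetical and are not taken from the source.

```python
import math


def fit_standard_curve(standards):
    """Least-squares fit of Cq = slope * log10(genome_equivalents) + intercept.

    standards: list of (known_genome_equivalents, measured_cq) pairs.
    """
    xs = [math.log10(ge) for ge, _ in standards]
    ys = [cq for _, cq in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_genome_equivalents(cq, slope, intercept):
    """Invert the standard curve to read genome equivalents off a measured Cq."""
    return 10 ** ((cq - intercept) / slope)


# Hypothetical standards, for illustration only.
STANDARDS = [(10, 30.0), (100, 26.7), (1000, 23.4), (10000, 20.1)]
slope, intercept = fit_standard_curve(STANDARDS)
library_ge = estimate_genome_equivalents(23.4, slope, intercept)
```

The same inversion applies if the instrument reports a concentration instead of Cq; only the fitted relationship changes.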
  • the quantitative genetic analysis is performed on a plurality of genetic loci in the cfDNA library clones. In a further embodiment, the quantitative genetic analysis is performed on a plurality of genetic loci in a plurality of cfDNA clone libraries.
  • the quantitative genetic analysis comprises hybridizing one or more capture probes to a target genetic locus to form capture probe-cfDNA clone complexes.
  • the quantitative genetic analysis comprises isolating the capture probe-cfDNA clone complexes.
  • the quantitative genetic analysis comprises amplification of the cfDNA clone sequence in the isolated hybridized capture probe-cfDNA clone complexes.
  • the quantitative genetic analysis comprises DNA sequencing to generate a plurality of sequencing reads.
  • the quantitative genetic analysis comprises
  • bioinformatics analysis is used: to quantify the number of genome equivalents analyzed in the cfDNA clone library; to detect genetic variants in a target genetic locus; to detect mutations within a target genetic locus; to detect genetic fusions within a target genetic locus; and to measure copy number fluctuations within a target genetic locus.
  • the subject does not have a genetic disease.
  • the subject has not been diagnosed with a genetic disease.
  • the subject has been diagnosed with a genetic disease.
  • the quantitative genetic analysis is used to identify or detect one or more genetic lesions that cause or are associated with the genetic disease.
  • the genetic lesion comprises a nucleotide transition or transversion, a nucleotide insertion or deletion, a genomic rearrangement, a change in copy number, or a gene fusion.
  • the genetic lesion comprises a genomic
  • the 3' coding region of the ALK gene is fused to the EML4 gene.
  • the genetic disease is cancer.
  • the subject is pregnant.
  • the quantitative genetic analysis is used to identify or detect one or more genetic variants or genetic lesions of one or more target genetic loci in fetal cfDNA.
  • the subject is a transplant recipient.
  • the quantitative genetic analysis is used to identify or detect donor cfDNA in the subject.
  • a method of predicting, diagnosing, or monitoring a genetic disease in a subject comprising: isolating or obtaining cfDNA from a biological sample of a subject; treating the cfDNA with one or more end-repair enzymes to generate end-repaired cfDNA; ligating one or more adaptors to each end of the end-repaired cfDNA to generate a cfDNA library; amplifying the cfDNA library to generate a cfDNA clone library; determining the number of genome equivalents in the cfDNA clone library; and performing a quantitative genetic analysis of one or more target genetic loci associated with the genetic disease in the cfDNA clone library, wherein the identification or detection of one or more genetic lesions in the one or more target genetic loci is prognostic for, diagnostic of, or monitors the progression of the genetic disease.
  • the cfDNA is isolated from a biological sample selected from the group consisting of: amniotic fluid, blood, plasma, serum, semen, lymphatic fluid, cerebral spinal fluid, ocular fluid, urine, saliva, stool, mucous, and sweat.
  • the genetic lesion comprises a nucleotide transition or transversion, a nucleotide insertion or deletion, a genomic rearrangement, a change in copy number, or a gene fusion.
  • the genetic lesion comprises a genomic
  • the 3' coding region of the ALK gene is fused to the EML4 gene.
  • the genetic disease is cancer.
  • a companion diagnostic for a genetic disease comprising: isolating or obtaining cfDNA from a biological sample of a subject; treating the cfDNA with one or more end-repair enzymes to generate end-repaired cfDNA; ligating one or more adaptors to each end of the end-repaired cfDNA to generate a cfDNA library; amplifying the cfDNA library to generate a cfDNA clone library; determining the number of genome equivalents in the cfDNA clone library; and performing a quantitative genetic analysis of one or more biomarkers associated with the genetic disease in the cfDNA clone library, wherein detection of, or failure to detect, at least one of the one or more biomarkers indicates whether the subject should be treated for the genetic disease.
  • the cfDNA is isolated from a biological sample selected from the group consisting of: amniotic fluid, blood, plasma, serum, semen, lymphatic fluid, cerebral spinal fluid, ocular fluid, urine, saliva, stool, mucous, and sweat.
  • the biomarker is a genetic lesion.
  • the genetic lesion comprises a nucleotide transition or transversion, a nucleotide insertion or deletion, a genomic rearrangement, a change in copy number, or a gene fusion.
  • the genetic lesion comprises a genomic rearrangement that fuses the 3' coding region of the ALK gene to another gene.
  • the 3' coding region of the ALK gene is fused to the EML4 gene.
  • the genetic disease is cancer.
  • Figure 1 shows the expected versus observed variant frequencies as a function of admix dilution in the absence of unique read filtering. In the absence of unique read filtering, random base changes at these four selected positions occurred with measurable, non-zero frequencies; thus, demonstrating a lack of sensitivity to detect the particular single nucleotide variants (SNV).
  • Figure 2 shows unique read filtering performed on the data generated in Figure 1.
  • the left hand panel shows the data from Figure 1 on the BRAF I326T SNV without unique read filtering.
  • the right hand panel shows that using unique read filtering of the same data increased the assay sensitivity and allowed the discrimination of true signal from error-prone noise.
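The effect described for Figure 2 can be sketched as follows. This is a minimal illustration, assuming a unique read is identified by the pair (adaptor read code, alignment start coordinate) and that each group of duplicate reads is collapsed to its majority base; that definition, the tuple layout, and the example reads are assumptions made here, not details stated in this section.

```python
from collections import Counter, defaultdict


def variant_frequency(reads, ref_base, unique_filter=True):
    """Fraction of observations at one site that differ from the reference.

    reads: iterable of (read_code, start_pos, base_at_site) tuples.
    With unique_filter=True, reads sharing a read code and start position
    are collapsed into one family represented by its most common base.
    """
    if unique_filter:
        families = defaultdict(Counter)
        for read_code, start_pos, base in reads:
            families[(read_code, start_pos)][base] += 1
        bases = [counts.most_common(1)[0][0] for counts in families.values()]
    else:
        bases = [base for _, _, base in reads]
    if not bases:
        return 0.0
    return sum(1 for b in bases if b != ref_base) / len(bases)


# Hypothetical reads: one family carries a single random base change.
READS = [
    ("ACGTA", 100, "C"), ("ACGTA", 100, "C"), ("ACGTA", 100, "T"),
    ("GGTCA", 100, "C"),
    ("TTAGC", 112, "C"), ("TTAGC", 112, "C"),
]
raw = variant_frequency(READS, "C", unique_filter=False)
filtered = variant_frequency(READS, "C", unique_filter=True)
```

In this toy input the isolated base change contributes a non-zero frequency to the raw data but disappears once duplicates are collapsed, which is the kind of noise reduction the right hand panel reports.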
  • Figure 3 shows capture probe performance as a function of length and wash temperature.
  • the y-axis shows the total number of reads associated with each capture probe.
  • the bars in the bar chart are broken into two categories, where open bars correspond to on-target reads that align to the intended capture probe targets and solid bars show off-target reads that are associated with a capture probe but that map to unintended regions of the genome.
  • the 40-mer and 60-mer capture probes perform substantially the same with 44° C and 47° C washes. With the 50° C wash, the 40-mer capture probes perform erratically.
  • Figure 4 shows a schematic for the targeted and oriented sequencing of intron 19 of the ALK gene.
  • Figure 5 shows a schematic for high density capture probe placement for complete sequencing of target regions.
  • Each capture probe captures a collection of sequences that provide cumulative coverage at each base position.
  • coverage is represented by a line, and the amplitude of the line denotes depth of coverage derived from a particular capture probe.
  • Overlapping coverage from adjacent capture probes provides complete sequencing of target regions in both possible directions.
  • the head-to-head placement of opposite strand capture probes ensures that all capture probe binding sites are sequenced.
  • Figure 6 shows a representative example of the size distribution of fragmented
  • Figure 7 shows the performance of high-density 40-mer capture probes in a representative experiment.
  • the y-axis shows the total number of reads, which are broken out as on-target reads, off-target reads, and unmappable reads.
  • the x-axis enumerates each of the 105 capture probes used in this experiment for sequence capture.
  • Figure 8 shows a representative example of the cumulative coverage of a target region using high density 40-mer capture probes. Shown here is the cumulative coverage of TP53 coding exons.
  • Figure 9A shows a representative example of the size distribution of cell-free DNA (cfDNA) libraries. The dominant band is consistent with a collection of 170±10 bp fragments ligated to 90 bp of adaptors.
  • Figure 9B shows a published gel image of cfDNA and a representative cfDNA library generated using the methods disclosed and/or contemplated herein. The qualitative "ladder" appearance is conserved in the library, but the library is shifted to higher mass by the addition of 90 bp of adaptor sequences.
  • Figure 9C shows a representative example of genomic, plasma-derived cfDNA libraries from ovarian cancer patients (OvC) and "healthy donors" (HD).
  • Figure 10 shows the unique read counts across eight cfDNA libraries derived from four plasma samples. Fragmentation (frag) prior to library construction with this sample 23407 increased the library yield by more than two-fold.
  • Figure 11 shows the representative read coverage of cfDNA across a region of the TP53 gene. Twenty-four 131 bp reads captured by the "TP53_NM_000546_chr17:7579351:region_3:280nt:41:80:r" capture probe (SEQ ID NO:201) were chosen at random and aligned using the BLAT algorithm within the UCSC genome browser. Twenty-one reads map to the target region, and they do so in a pattern of overlapping coverage. The probe used to capture these reads is marked with an arrow.
  • Figure 12 shows an overview of targeted DNA sequencing of the coding regions of the TP53 gene from a cfDNA genomic library.
  • the coverage (horizontal axis) extends across all 10 coding regions and includes intronic regions involved in mRNA splicing.
  • the sequencing depth (vertical axis) reaches a maximum of 4851 and is uniform across all coding exons.
  • Figure 13 shows a plot of unique read counts versus qPCR estimated genome equivalents in an ACA2-based assay. qPCR measurements are shown on the X-axis versus read counts on the Y-axis. Perfect agreement between these measurements is shown as the diagonal. There is very poor, if any, correlation between measurements, especially at lower genomic inputs. These data show that the ACA2-based qPCR assay chronically underestimates library complexity and is inadequate for measuring genome equivalents.
  • Figure 14 shows a schematic of the core elements of a qPCR genome equivalent measurement assay that couples a genomic repeat-specific primer (e.g., Alu) and a long adaptor-specific primer.
  • (A) Standard library amplification using a single, 25 nt primer named ACA2 (primer 1).
  • (B) Longer, 58 nt versions of the ACA2 primer (primer 2) do not amplify genomic libraries because of stem-loop suppression.
  • (C) Forward and reverse primers directed to a consensus human Alu repeat element (primers 3 and 4) recognize 1000's of loci and readily amplify genomic DNA.
  • Figure 15 shows proof-of-concept data for an Alu plus adaptor-based qPCR assay of genome equivalents.
  • (A) Amplification of 10 pg of a standard genomic library with various PCR primers. The X-axis specifies PCR primers used for amplification and the Y-axis (log scale) indicates the PCR signal measured in units of fg/µL.
  • the standard ACA2 primer produced a strong signal, as expected.
  • the ACA2 long primer failed to produce signal owing to PCR suppression.
  • the two Alu primer pairs both produced signal at 1% the amount of ACA2, suggesting that 1% of clones possess an amplifiable Alu sequence.
  • the combination of any Alu primer with the long ACA2 primer also produced signal in ~1% of clones.
  • Alu primer pairs amplify comparable signal from genomic DNA or a genomic library.
  • primer pairs consisting of an Alu primer and a long ACA2 primer amplify genomic DNA poorly (L+A1F) or not at all (L+A1R). These same pairs exhibit amplification of library that slightly exceeds the signal from Alu primer pairs.
  • Figure 16 shows a direct comparison of ACA2 primer qPCR assay with the Alu-ACA2 long-primer qPCR assay.
  • the Alu-ACA2 long-primer qPCR assay shows an 8-fold increase in detectable genome equivalents, which is more consistent with unique read counts derived from sequencing data.
  • Figure 17 shows a representative example of adaptor structure and function for high sensitivity, quantitative genetic assays that provide accurate determinations of genome equivalents analyzed.
  • (A) Fine structure of the adaptor ligation strand. Details relating to each numbered element are provided in Example 4.
  • (B) The duplex formed between 45 nt ligation strands and 12 nt partner oligo strand creates a blunt-end ligation substrate compatible with end-repaired cfDNA fragments (solid bars).
  • (C) Following ligation, the complement to the ligation strand is created by a DNA polymerase-mediated fill-in reaction.
  • Figure 18 shows a representative example of the size distribution of two DNA samples (NA06994 & NCI-H2228) processed to mimic cfDNA.
  • Figure 19 shows a representative example of the sensitivity of detection of the TP53 point mutation Q331* in tumor sample DNA (H2228) admixed with normal genomic DNA (N). The most sensitive detection corresponds to ~1 mutant copy of TP53 among 1000 normal copies of the gene.
  • Figure 20 shows the precise determination of the junction sequence for the EML4-ALK fusion gene harbored in cell line NCI-H2228 using the compositions and methods contemplated herein.
  • Figure 21 shows the detection of the EML4-ALK fusion gene in tumor sample DNA (H2228) admixed with normal genomic DNA (N). Because the fusion is present as a heterozygote in the NCI-H2228 cell line, the most sensitive detection corresponds to one gene fusion among ~100 normal chromosomal copies of the ALK gene (50 genome equivalents).
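The arithmetic linking 50 genome equivalents to one fusion per roughly 100 chromosomal copies can be written out as a short sketch. It assumes a diploid ALK locus (two copies per genome equivalent) and one fusion-bearing copy per tumor genome, as the heterozygous description implies; the function and parameter names are hypothetical.

```python
def fusion_allele_fraction(tumor_genomes, total_genome_equivalents,
                           fusion_copies_per_tumor_genome=1,
                           copies_per_genome=2):
    """Fraction of chromosomal copies at the locus that carry the fusion."""
    total_copies = copies_per_genome * total_genome_equivalents
    fusion_copies = fusion_copies_per_tumor_genome * tumor_genomes
    return fusion_copies / total_copies


# One heterozygous tumor genome among 50 genome equivalents:
# 1 fusion copy out of 2 * 50 = 100 chromosomal copies.
fraction = fusion_allele_fraction(tumor_genomes=1, total_genome_equivalents=50)
```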
  • Figure 22 shows the detection of the MYCN gene amplification in admixtures of cell line NCI-H69 (H69) diluted into normal human DNA (N). The threshold value of two normal diploid copies is shown as a dashed red line.
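One way to read the threshold in Figure 22 is as a ratio against a diploid baseline. The sketch below assumes copy number is estimated by scaling the read depth at the target gene by the depth at control loci presumed to be present at two copies; this normalization, the function names, and the numbers are illustrative assumptions rather than the method stated in the source.

```python
DIPLOID_COPIES = 2.0  # the two normal copies drawn as the threshold line


def estimate_copy_number(target_depth, control_depth):
    """Scale target depth by control depth, anchored at two copies."""
    if control_depth <= 0:
        raise ValueError("control_depth must be positive")
    return DIPLOID_COPIES * target_depth / control_depth


def is_amplified(target_depth, control_depth, threshold=DIPLOID_COPIES):
    """True when the estimated copy number exceeds the diploid threshold."""
    return estimate_copy_number(target_depth, control_depth) > threshold
```

For example, a target covered three times as deeply as the controls would be estimated at six copies under these assumptions.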
  • Figure 23 shows the DNA mutations detected in the TP53 gene of three different cancer patients. The canonical gene model is shown at the top of the figure. The peaks represent DNA sequence coverage (X-axis) and depth (Y-axis). Sequencing depth was >4000 genome equivalents for all samples analyzed. An expanded view of exon 7 below the gene model shows where all detected mutations were localized. The frequency of mutant detection in cfDNA (plasma), tumor tissue, and normal adjacent tissue is shown, where available (NA - not available).
  • OVA1 and OVA2 are ovarian cancer patients; CRC406 and CRC407 are colorectal cancer patients. No mutations in TP53 were found in any of the OVA1 samples.
  • Figure 24 shows the DNA sequencing of a larger, thirteen gene panel (boxed).
  • the sequencing identified a KRAS mutation in cfDNA and tumor from ovarian cancer patient OVA1.
  • Figure 25 shows the DNA sequencing of a larger, twelve gene panel. The sequencing identified an ERBB2 gene amplification in the plasma of colorectal cancer patient CRC407.
  • the present invention contemplates, in part, compositions and methods for the quantitative genetic analysis of the genetic state of an individual using cell-free DNA (cfDNA).
  • the term "genetic state" refers to the sequence of one or more target genome sequences in the genome in relation to a non-causal normal sequence or in relation to a sequence that is causal for a genetic condition or disease.
  • analyzing the genetic state refers to identifying, quantifying, or monitoring a genetic variant in a target genetic locus, wherein the variant varies with respect to a reference sequence (e.g., a normal or mutated sequence).
  • Next-generation sequencing technology has afforded the opportunity to add broad genomic surveys to molecular diagnosis in a variety of scenarios including cancers, fetal diagnostics, paternity testing, pathogen screening and organ transplant monitoring.
  • next-generation sequencing information is being used in a clinical setting to identify mutations within genes that are likely to alter gene function, to identify the gain or loss of genetic material within cells, and to identify genomic rearrangements that are not found in normal, healthy cells.
  • the results of these broad diagnostic surveys are often used to guide patient treatment.
  • molecular diagnostics contemplated herein for the genetic state of an individual leverage the availability of cfDNA to provide deep sequence coverage of select target genes.
  • the cfDNA-based cancer diagnostics contemplated herein possess the ability to detect a variety of genetic changes including somatic sequence variations that alter protein function, large-scale chromosomal rearrangements that create chimeric gene fusions, and copy number variation that includes loss or gain of gene copies.
  • the genetic changes detected using the compositions and methods contemplated herein are detectable and quantifiable in the face of significant dilution, or admixture, of normal sequences within cfDNA that are contributed by the normal turnover processes that happen within healthy tissues.
  • the compositions and methods contemplated herein also successfully address the major challenges associated with detecting rare genetic changes causal of disease; namely, that cfDNA is highly fragmented, that cfDNA levels vary substantially between different individuals, and that the degree of admixture of diseased versus normal sequences is highly variable among patients, even within individuals suffering from the same molecular disease and stage.
  • compositions and methods for genetic analysis comprise interrogating the DNA fraction within biological fluid samples and stool samples.
  • the methods contemplated herein provide a novel comprehensive framework to address molecular genetic analysis using cfDNA available from a variety of biological sources. Cloning of purified cfDNA introduces tagged cfDNA sequences that inform downstream analysis and enable amplification of the resulting clone libraries. Hybrid capture with target specific oligonucleotides is used to retrieve specific sequences for subsequent analysis. Independent measurements of the number of genomes present in the library are applied to each sample, and these assays provide a means to estimate the assay's sensitivity.
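How a genome count bounds sensitivity can be sketched with a simple sampling model. This is an illustration only, assuming each chromosomal copy in the library is independently mutant with probability equal to the variant allele fraction and that the locus is diploid; the model and the function names are assumptions made here, not the estimator described in the source.

```python
import math


def detection_probability(genome_equivalents, allele_fraction,
                          copies_per_genome=2):
    """Probability that at least one mutant copy is present in the library."""
    if not 0.0 <= allele_fraction < 1.0:
        raise ValueError("allele_fraction must be in [0, 1)")
    copies = copies_per_genome * genome_equivalents
    return 1.0 - (1.0 - allele_fraction) ** copies


def min_genome_equivalents(allele_fraction, confidence=0.95,
                           copies_per_genome=2):
    """Smallest genome count giving at least `confidence` of sampling a mutant."""
    if not 0.0 < allele_fraction < 1.0:
        raise ValueError("allele_fraction must be in (0, 1)")
    copies_needed = math.log(1.0 - confidence) / math.log(1.0 - allele_fraction)
    return math.ceil(copies_needed / copies_per_genome)
```

Under this model, a variant present at 1 in 1000 copies needs on the order of 1,500 genome equivalents before it is likely to be present in the library at all, which is why the genome count is measured independently for each sample.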
  • the assays contemplated herein provide reliable, reproducible, and robust methods for the analysis, detection, diagnosis, or monitoring of genetic states, conditions, or disease.
  • the term "about" or "approximately" refers to a quantity, level, value, number, frequency, percentage, dimension, size, amount, weight or length that varies by as much as 15%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2% or 1% to a reference quantity, level, value, number, frequency, percentage, dimension, size, amount, weight or length.
  • the term "about" or "approximately" refers to a range of quantity, level, value, number, frequency, percentage, dimension, size, amount, weight or length ±15%, ±10%, ±9%, ±8%, ±7%, ±6%, ±5%, ±4%, ±3%, ±2%, or ±1% about a reference quantity, level, value, number, frequency, percentage, dimension, size, amount, weight or length.
  • "isolated" means material that is substantially or essentially free from components that normally accompany it in its native state.
  • "obtained" or "derived" is used synonymously with "isolated".
  • DNA refers to deoxyribonucleic acid.
  • DNA refers to genomic DNA, recombinant DNA, synthetic DNA, or cDNA.
  • DNA refers to genomic DNA or cDNA.
  • the DNA comprises a "target region."
  • DNA libraries contemplated herein include genomic DNA libraries and cDNA libraries constructed from RNA, e.g., an RNA expression library.
  • the DNA libraries comprise one or more additional DNA sequences and/or tags.
  • a "target genetic locus" or "DNA target region" refers to a region of interest within a DNA sequence. In various embodiments, targeted genetic analyses are performed on the target genetic locus.
  • the DNA target region is a region of a gene that is associated with a particular genetic state, genetic condition, or genetic disease; fetal testing; genetic mosaicism; paternity testing; predicting response to drug treatment; diagnosing or monitoring a medical condition; microbiome profiling; pathogen screening; or organ transplant monitoring.
  • As used herein, the terms "circulating DNA," "circulating cell-free DNA" and "cell-free DNA" are often used interchangeably and refer to DNA that is extracellular DNA, DNA that has been extruded from cells, or DNA that has been released from necrotic or apoptotic cells.
  • a "subject,” “individual,” or “patient” as used herein, includes any animal that exhibits a symptom of a condition that can be detected or identified with compositions contemplated herein. Suitable subjects include laboratory animals (such as mouse, rat, rabbit, or guinea pig), farm animals (such as horses, cows, sheep, pigs), and domestic animals or pets (such as a cat or dog). In particular embodiments, the subject is a mammal. In certain embodiments, the subject is a non-human primate and, in preferred embodiments, the subject is a human.
  • a method for genetic analysis of cfDNA is provided.
  • a method for genetic analysis of cfDNA comprises: generating and amplifying a cfDNA library, determining the number of genome equivalents in the cfDNA library; and performing a quantitative genetic analysis of one or more genomic target loci.
  • a method for genetic analysis of cfDNA comprises treating cfDNA with one or more end-repair enzymes to generate end-repaired cfDNA and ligating one or more adaptors to each end of the end-repaired cfDNA to generate a cfDNA library; amplifying the cfDNA library to generate cfDNA library clones; determining the number of genome equivalents of cfDNA library clones; and performing a quantitative genetic analysis of one or more target genetic loci in the cfDNA library clones.
  • methods of genetic analysis contemplated herein comprise generating a cfDNA library comprising treating cfDNA with one or more end-repair enzymes to generate end-repaired cfDNA and ligating one or more adaptors to each end of the end-repaired cfDNA to generate the cfDNA library.
  • the methods and compositions contemplated herein are designed to efficiently analyze, detect, diagnose, and/or monitor genetic states, genetic conditions, genetic diseases, genetic mosaicism, fetal diagnostics, paternity testing, microbiome profiling, pathogen screening, and organ transplant monitoring using cell-free DNA (cfDNA) as an analyte.
  • the size distribution of cfDNA ranges from about 150 bp to about 180 bp fragments. Fragmentation may be the result of endonucleolytic and/or exonucleolytic activity and presents a formidable challenge to the accurate, reliable, and robust analysis of cfDNA.
  • Another challenge for analyzing cfDNA is its short half-life in the blood stream, on the order of about 15 minutes.
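To give a sense of scale for a half-life of about 15 minutes, the sketch below assumes simple first-order (exponential) clearance; that kinetic model and the function name are assumptions for illustration, not claims made in the source.

```python
HALF_LIFE_MIN = 15.0  # approximate cfDNA half-life stated above, in minutes


def fraction_remaining(minutes, half_life=HALF_LIFE_MIN):
    """Fraction of an initial cfDNA amount left after `minutes`."""
    return 0.5 ** (minutes / half_life)
```

Under this assumption, a quarter of the starting material remains after 30 minutes and about 6% after an hour.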
  • the present invention contemplates, in part, that analysis of cfDNA is like a "liquid biopsy" and is a real-time snapshot of current biological processes.
  • because cfDNA is not found within cells and may be obtained from a number of suitable sources including, but not limited to, biological fluids and stool samples, it is not subject to the existing limitations that plague next generation sequencing analysis, such as direct access to the tissues being analyzed.
  • biological fluids that are suitable sources from which to isolate cfDNA in particular embodiments include, but are not limited to amniotic fluid, blood, plasma, serum, semen, lymphatic fluid, cerebral spinal fluid, ocular fluid, urine, saliva, mucous, and sweat.
  • the biological fluid is blood or blood plasma.
  • kits and other methods known to the skilled artisan can be used to isolate cfDNA directly from the biological fluids of a patient or from a previously obtained and optionally stabilized biological sample, e.g., by freezing and/or addition of enzyme chelating agents including, but not limited to EDTA, EGTA, or other chelating agents specific for divalent cations.
  • generating a cfDNA library comprises the end-repair of isolated cfDNA.
  • the fragmented cfDNA is processed by end-repair enzymes to generate end-repaired cfDNA with blunt ends, 5'-overhangs, or 3'-overhangs.
  • the end-repair enzymes can yield for example.
  • the end-repaired cfDNA contains blunt ends.
  • the end-repaired cfDNA is processed to contain blunt ends.
  • the blunt ends of the end-repaired cfDNA are further modified to contain a single base pair overhang.
  • end-repaired cfDNA containing blunt ends can be further processed to contain an adenine (A)/thymine (T) overhang.
  • end-repaired cfDNA containing blunt ends can be further processed to contain an adenine (A)/thymine (T) overhang as the single base pair overhang.
  • the end-repaired cfDNA has non-templated 3' overhangs.
  • the end-repaired cfDNA is processed to contain 3' overhangs.
  • the end-repaired cfDNA is processed with terminal transferase (TdT) to contain 3' overhangs.
  • a G-tail can be added by TdT.
  • the end-repaired cfDNA is processed to contain overhang ends using partial digestion with any known restriction enzymes, e.g., with the enzyme Sau3A, and the like.
  • generating a cfDNA library comprises ligating one or more adaptors to each end of the end-repaired cfDNA.
  • the present invention contemplates, in part, an adaptor module designed to accommodate large numbers of genome equivalents in cfDNA libraries.
  • Adaptor modules are configured to measure the number of genome equivalents present in cfDNA libraries, and, by extension, the sensitivity of sequencing assays used to identify sequence mutations.
  • the term "adaptor module" refers to a polynucleotide comprising at least five elements: (i) a first element comprising a PCR primer binding site for the single-primer library amplification; (ii) a second element comprising a 5 nucleotide read code that acts to uniquely identify each sequencing read; (iii) a third element comprising a 3 nucleotide sample code to identify different samples and enable sample multiplexing within a sequencing run; (iv) a fourth element comprising a 12 nucleotide anchor sequence that enables calibration of proper base calls in sequencing reads and acts as an anchor for hybridization to a partner oligonucleotide; and (v) a fifth element comprising the two 3' terminal nucleotides of Element 4 (Figure 17 and Tables 12-16).
  • the adaptor module is hybridized to a partner oligonucleotide that is complementary to Element 4 to form an adaptor suitable for ligating to the ends of cfDNA.
  • an adaptor module comprises one or more PCR primer sequences, one or more read codes, one or more sample codes, one or more anchor sequences, and two or more 3' nucleotides that are efficient ligation substrates.
  • the adaptor module further comprises one or more sequencing primer binding sites.
  • an adaptor module comprises a first element that comprises one or more PCR primer binding sequences for single-primer amplification of a cfDNA library.
  • the PCR primer binding sequence is about 12 to about 40 nucleotides, about 18 to about 40 nucleotides, about 20 to about 35 nucleotides, or about 20 to about 30 nucleotides.
  • the PCR primer binding sequence is about 12 nucleotides, about 13 nucleotides, about 14 nucleotides, about 15 nucleotides, about 16 nucleotides, about 17 nucleotides, about 18 nucleotides, about 19 nucleotides, about 20 nucleotides, about 21 nucleotides, about 22 nucleotides, about 23 nucleotides, about 24 nucleotides, about 25 nucleotides, about 26 nucleotides, about 27 nucleotides, about 28 nucleotides, about 29 nucleotides, about 30 nucleotides, about 31 nucleotides, about 32 nucleotides, about 33 nucleotides, about 34 nucleotides, about 35 nucleotides, about 36 nucleotides, about 37 nucleotides, about 38 nucleotides, about 39 nucleotides, or about 40 nucleotides or more.
  • the PCR primer binding sequence is about 25 nucleotides.
  • an adaptor module comprises a second element that comprises one or more read code sequences.
  • the term "read code" refers to a polynucleotide that is used to identify unique sequencing reads.
  • the read code is a random sequence of nucleotides.
  • the read code is about 1 nucleotide, about 2 nucleotides, about 3 nucleotides, about 4 nucleotides, about 5 nucleotides, about 6 nucleotides, about 7 nucleotides, about 8 nucleotides, about 9 nucleotides, about 10 nucleotides, or more.
  • a set of 5 nucleotide codes consists of 256 possible unique sequences, where each code chosen differs by at least 2 nucleotides from every other code in the set. This feature enables unique and distinct reads to be differentiated from reads that appear to be unique owing to a sequencing error in the code region.
  • codes that have been empirically determined to interfere with adaptor function, owing to particular sequence combinations, may be excluded from use, e.g., seven codes of the 256 had an overrepresentation of G nucleotides and were excluded.
  • each read code of 5, 6, 7, 8, 9, 10 or more nucleotides may differ by 2, 3, 4, or 5 nucleotides from every other read code.
  • the read code is about 5 nucleotides and differs from every other read code by 2 nucleotides.
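The read code properties described above (256 five-nucleotide codes, each at least 2 nucleotides different from every other) can be illustrated with a minimal sketch. The construction shown here is an assumption, not the method used to build the codes in the source: every 4-base prefix receives one check base chosen so that the base indices sum to 0 modulo 4, which guarantees that no two codes differ at only one position. The function names are hypothetical, and the empirical exclusion of G-rich codes is not modeled.

```python
from itertools import product

BASES = "ACGT"


def make_read_codes(length=5):
    """Build 4**(length - 1) codes with pairwise Hamming distance >= 2.

    Assumed construction: append one check base to every prefix so that
    the base indices (A=0, C=1, G=2, T=3) sum to 0 modulo 4.
    """
    codes = []
    for prefix in product(range(4), repeat=length - 1):
        check = (-sum(prefix)) % 4
        codes.append("".join(BASES[i] for i in prefix + (check,)))
    return codes


def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(codes):
    """Smallest Hamming distance between any two codes in the set."""
    return min(
        hamming(a, b) for i, a in enumerate(codes) for b in codes[i + 1:]
    )


read_codes = make_read_codes(5)
print(len(read_codes), min_pairwise_distance(read_codes))  # prints: 256 2
```

Because a single base-call error changes the index sum, a read code with one sequencing error never matches another valid code in this sketch, which is the property the text relies on to separate true unique reads from error-derived ones.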
  • an adaptor module comprises a third element that comprises one or more sample code sequences.
  • the term "sample code" refers to a polynucleotide that is used to identify the sample.
  • the sample code is also useful in establishing multiplex sequencing reactions because each sample code is unique to the sample and thus, can be used to identify a read from a particular sample within a multiplexed sequencing reaction.
  • the sample code comprises sequence that is about 1 nucleotide, about 2 nucleotides, about 3 nucleotides, about 4 nucleotides, or about 5 nucleotides, or more. In another embodiment, each sample code of 2, 3, 4, 5 or more nucleotides may differ from every other sample code by 2, 3, 4, or 5 nucleotides.
  • the sample code is about three nucleotides and differs from every other sample code used in other samples by two nucleotides.
  • an adaptor module comprises a fourth element that comprises one or more anchor sequences.
  • an “anchor sequence” refers to a nucleotide sequence of at least 8 nucleotides, at least 10 nucleotides, at least 12 nucleotides, at least 14 nucleotides, or at least 16 nucleotides that hybridizes to a partner oligonucleotide and that comprises the following three properties: (1) each anchor sequence is part of a family of four anchor sequences that collectively represent each of the four possible DNA bases at each site within extension; this feature, balanced base representation, is useful to calibrate proper base calling in sequencing reads in particular embodiments; (2) each anchor sequence is composed of only two of four possible bases, and these are specifically chosen to be either an equal number of A + C or an equal number of G + T; an anchor sequence formed from only two bases reduces the possibility that the anchor sequence will participate in secondary structure formation that would preclude proper adaptor function; and (3) because each anchor sequence is composed of equal numbers of A+C
  • an adaptor module comprises a fifth element that is comprised of the two 3' terminal nucleotides of Element 4. These two bases at the 3' end of each anchor are chosen based on an empirical determination that shows that these two nucleotides are efficient substrates for ligation to the cfDNA.
  • Element 5 comprises the sequences selected from the group consisting of: AA, CC, TT and GG. In particular embodiments, Element 5 does not comprise the dinucleotide combination CG or TG as the inventors have determined that these combinations are not efficient ligation substrates.
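The anchor sequence properties and the Element 5 dinucleotide rule can be checked mechanically. The sketch below is illustrative only: the seed sequence and the way the four-member family is derived from it are hypothetical and are not taken from Tables 12-16, and "equal number of A + C" is read here as equal counts of A and C within one anchor.

```python
# Hypothetical helpers; the translation tables are an assumed way to derive
# a four-member family from one A/C seed anchor.
SWAP_WITHIN_PAIR = str.maketrans("ACGT", "CATG")  # A<->C and G<->T
TO_OTHER_PAIR = str.maketrans("AC", "GT")         # A->G and C->T


def is_two_base_balanced(anchor):
    """Property (2): only A and C, or only G and T, in equal numbers."""
    if not anchor:
        return False
    for first, second in ("AC", "GT"):
        if set(anchor) <= {first, second}:
            return anchor.count(first) == anchor.count(second)
    return False


def family_is_base_balanced(family):
    """Property (1): each position shows A, C, G, and T once across the family."""
    same_length = len({len(anchor) for anchor in family}) == 1
    return (
        len(family) == 4
        and same_length
        and all(sorted(column) == list("ACGT") for column in zip(*family))
    )


def ends_in_allowed_dinucleotide(anchor, allowed=("AA", "CC", "TT", "GG")):
    """Element 5: the two 3' terminal bases must be an allowed dinucleotide."""
    return anchor[-2:] in allowed


def make_family(seed):
    """Derive four anchors from one A/C seed (assumed construction)."""
    swapped = seed.translate(SWAP_WITHIN_PAIR)
    return [
        seed,
        swapped,
        seed.translate(TO_OTHER_PAIR),
        swapped.translate(TO_OTHER_PAIR),
    ]


family = make_family("ACACCACACCAA")  # hypothetical 12 nucleotide seed
for anchor in family:
    print(anchor, is_two_base_balanced(anchor), ends_in_allowed_dinucleotide(anchor))
print(family_is_base_balanced(family))
```

Deriving the family by base substitution keeps the four anchors position-matched, so balanced base representation at every site follows directly from the seed.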
  • a ligation step comprises ligating an adaptor module to the end-repaired cfDNA to generate a "tagged" cfDNA library.
  • a single adaptor module is employed. In some embodiments, two, three, four or five adaptor modules are employed. In some embodiments, an adaptor module of identical sequence is ligated to each end of the fragmented end-repaired DNA.
  • a plurality of adaptor species is ligated to an end-repaired cfDNA library.
  • Each of the plurality of adaptors may comprise one or more primer binding site for amplification of the cfDNA library, one or more read code sequences, one or more sequences for sample multiplexing, and one or more sequences for DNA sequencing.
  • ligation of one or more adaptors contemplated herein may be carried out by methods known to those of ordinary skill in the art.
  • one or more adaptors contemplated herein are ligated to end-repaired cfDNA that comprises blunt ends.
  • one or more adaptors contemplated herein are ligated to end-repaired cfDNA that comprises complementary ends appropriate for the ligation method employed.
  • one or more adaptors contemplated herein are ligated to end-repaired cfDNA that comprises a 3' overhang.
  • methods of genetic analysis contemplated herein comprise amplification of a cfDNA library to generate a cfDNA clone library or a library of cfDNA clones.
  • Each molecule of the cfDNA library comprises an adaptor ligated to each end of an end-repaired cfDNA, and each adaptor comprises one or more PCR primer binding sites.
  • different adaptors are ligated to different ends of the end-repaired cfDNA.
  • the same adaptor is ligated to both ends of the cfDNA.
  • Ligation of the same adaptor to both ends of end-repaired cfDNA allows for PCR amplification with a single primer sequence.
  • a portion of the adaptor ligated-cfDNA library will be amplified using standard PCR techniques with a single primer sequence driving amplification.
  • the single primer sequence is about 25 nucleotides, optionally with a projected Tm of > 55° C under standard ionic strength conditions.
  • picograms of the initial cfDNA library are amplified into micrograms of cfDNA clones, implying a 10,000-fold amplification.
  • the amount of amplified product can be measured using methods known in the art, e.g.,
  • a method for genetic analysis of cfDNA comprises determining the number of genome equivalents in the cfDNA clone library.
  • the term "genome equivalent" refers to the number of genome copies in each library.
  • An important challenge met by the compositions and methods contemplated herein is achieving sufficient assay sensitivity to detect and analyze rare genetic mutations or differences in genetic sequence.
  • To determine assay sensitivity on a sample-by-sample basis, the number of different and distinct sequences present in each sample is measured by measuring the number of genome equivalents that are present in a sequencing library. To establish sensitivity, the number of genome equivalents must be measured for each sample library.
  • the number of genome equivalents can be determined by qPCR assay or by using bioinformatics-based counting after sequencing is performed.
  • qPCR measurement of genome equivalents is used as a QC step for cfDNA libraries. It establishes an expectation for assay sensitivity prior to sequence analysis and allows a sample to be excluded from analysis if its corresponding cfDNA clone library lacks the required depth of genome equivalents.
  • bioinformatics-based counting of genome equivalents is also used to identify the genome equivalents - and hence the assay sensitivity and false negative estimates - for each given cfDNA clone library.
  • the empirical qPCR assay and statistical counting assays should be well correlated. In cases where sequencing fails to reveal the sequence depth in a cfDNA clone library, reprocessing of the cfDNA clone library and/or additional sequencing may be required.
  • the genome equivalents in a cfDNA clone library are determined using a quantitative PCR (qPCR) assay.
  • a standard library of known concentration is used to construct a standard curve and the measurements from the qPCR assay are fit to the resulting standard curve and a value for genome equivalents is derived from the fit.
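The standard-curve step can be sketched as a least-squares fit of the qPCR quantification cycle against the log of the known standard concentrations, followed by interpolation of the unknown library. The sketch assumes a log-linear curve, which is the usual form for qPCR; the term Cq, the function names, and all numeric values are hypothetical and are not taken from the source.

```python
import math


def fit_standard_curve(concentrations, cq_values):
    """Least-squares fit of Cq = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def genome_equivalents_from_cq(cq, slope, intercept):
    """Invert the fitted curve to estimate genome equivalents for a sample."""
    return 10 ** ((cq - intercept) / slope)


# Hypothetical standard library dilutions (genome equivalents per reaction)
# and their measured Cq values.
standards = [10, 100, 1000, 10000]
standard_cqs = [30.0, 26.7, 23.4, 20.1]

slope, intercept = fit_standard_curve(standards, standard_cqs)
estimate = genome_equivalents_from_cq(25.05, slope, intercept)
print(round(slope, 2), round(intercept, 2), round(estimate))
```

A sample whose estimate falls below the depth required for the intended sensitivity could then be flagged at the QC step before sequencing.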
  • a qPCR "repeat-based" assay comprising one primer that specifically hybridizes to a common sequence in the genome, e.g., a repeat sequence, and another primer that binds to the primer binding site in the adaptor, measured an 8-fold increase in genome equivalents compared to methods using just the adaptor specific primer (present on both ends of the cfDNA clone).
  • the number of genome equivalents measured by the repeat-based assays provides a more consistent library-to-library performance and a better alignment between qPCR estimates of genome equivalents and bioinformatically counted tag equivalents in sequencing runs.
  • repeats suitable for use in the repeat-based genome equivalent assays contemplated herein include, but are not limited to: short interspersed nuclear elements (SINEs), e.g., Alu repeats; long interspersed nuclear elements (LINEs), e.g., LINE1, LINE2, LINE3; microsatellite repeat elements, e.g., short tandem repeats (STRs), simple sequence repeats (SSRs); and mammalian-wide interspersed repeats (MIRs).
  • the repeat is an Alu repeat.
  • a method for genetic analysis of cfDNA comprises quantitative genetic analysis of one or more target genetic loci of the cfDNA library clones.
  • Quantitative genetic analysis comprises one or more of, or all of, the following steps: capturing cfDNA clones comprising a target genetic locus; amplification of the captured targeted genetic locus; sequencing of the amplified captured targeted genetic locus; and bioinformatic analysis of the resulting sequence reads.
  • the present invention contemplates, in part, a capture probe module designed to retain the efficiency and reliability of larger probes but that minimizes uninformative sequence generation in a cfDNA clone library.
  • a “capture probe module” refers to a polynucleotide that comprises a capture probe sequence and a tail sequence.
  • the capture probe module sequence or a portion thereof serves as a primer binding site for one or more sequencing primers.
  • a capture probe module comprises a capture probe.
  • a capture probe refers to a region capable of hybridizing to a specific DNA target region. Because cfDNA is highly fragmented and has an average size of about 150 to about 170 bp, the compositions and methods contemplated herein comprise the use of high density and relatively short capture probes to interrogate DNA target regions of interest.
  • capture probes are designed using specific "sequence rules.” For example, regions of redundant sequence or that exhibit extreme base composition biases are generally excluded in designing capture probes.
  • the present inventors have discovered that the lack of flexibility in capture probe design rules does not substantially impact probe performance.
  • capture probes chosen strictly by positional constraint provide on-target sequence information; exhibit very little off-target and unmappable read capture; and yield uniform, useful, on-target reads with only a few exceptions.
  • the high redundancy at close probe spacing more than compensates for occasional poor-performing capture probes.
  • a target region is targeted by a plurality of capture probes, wherein any two or more capture probes are designed to bind to the target region within 10 nucleotides of each other, within 15 nucleotides of each other, within 20 nucleotides of each other, within 25 nucleotides of each other, within 30 nucleotides of each other, within 35 nucleotides of each other, within 40 nucleotides of each other, within 45 nucleotides of each other, or within 50 nucleotides or more of each other, as well as all intervening nucleotide lengths.
  • the capture probe is about 25 nucleotides, about 26 nucleotides, about 27 nucleotides, about 28 nucleotides, about 29 nucleotides, about 30 nucleotides, about 31 nucleotides, about 32 nucleotides, about 33 nucleotides, about 34 nucleotides, about 35 nucleotides, about 36 nucleotides, about 37 nucleotides, about 38 nucleotides, about 39 nucleotides, about 40 nucleotides, about 41 nucleotides, about 42 nucleotides, about 43 nucleotides, about 44 nucleotides, or about 45 nucleotides.
  • the capture probe is about 100 nucleotides, about 200 nucleotides, about 300 nucleotides, about 400 nucleotides, or about 500 nucleotides. In another embodiment, the capture probe is from about 100 nucleotides to about 500 nucleotides, about 200 nucleotides to about 500 nucleotides, about 300 nucleotides to about 500 nucleotides, or about 400 nucleotides to about 500 nucleotides, or any intervening range thereof.
  • the capture probe is not 60 nucleotides.
  • the capture probe is substantially smaller than 60 nucleotides but hybridizes comparably, as well as, or better than a 60 nucleotide capture probe targeting the same DNA target region.
  • the capture probe is 40 nucleotides.
  • a capture probe module comprises a tail sequence.
  • the term "tail sequence" refers to a polynucleotide at the 5' end of the capture probe module, which in particular embodiments can serve as a primer binding site.
  • a sequencing primer binds to the primer binding site in the tail region.
  • the tail sequence is about 5 to about 100 nucleotides, about 10 to about 100 nucleotides, about 5 to about 75 nucleotides, about 5 to about 50 nucleotides, about 5 to about 25 nucleotides, or about 5 to about 20 nucleotides.
  • the third region is from about 10 to about 50 nucleotides, about 15 to about 40 nucleotides, about 20 to about 30 nucleotides or about 20 nucleotides, or any intervening number of nucleotides.
  • the tail sequence is about 30 nucleotides, about 31 nucleotides, about 32 nucleotides, about 33 nucleotides, about 34 nucleotides, about 35 nucleotides, about 36 nucleotides, about 37 nucleotides, about 38 nucleotides, about 39 nucleotides, or about 40 nucleotides.
  • the capture probe module comprises a specific member of a binding pair to enable isolation and/or purification of one or more captured fragments of a tagged and/or amplified cfDNA library that hybridizes to the capture probe.
  • the capture probe module is conjugated to biotin or another suitable hapten, e.g., dinitrophenol, digoxigenin.
  • the capture probe module is hybridized to a tagged and optionally amplified cfDNA library to form a complex.
  • the multifunctional capture probe module substantially hybridizes to a specific genomic target region in the cfDNA library.
  • Hybridization or hybridizing conditions can include any reaction conditions where two nucleotide sequences form a stable complex; for example, the tagged cfDNA library and capture probe module forming a stable tagged cfDNA library-capture probe module complex.
  • reaction conditions are well known in the art and those of skill in the art will appreciate that such conditions can be modified as appropriate, e.g., decreased annealing temperatures with shorter length capture probes, and within the scope of the present invention.
  • Substantial hybridization can occur when the second region of the capture probe complex exhibits 100%, 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 89%, 88%, 85%, 80%, 75%, or 70% sequence identity, homology or complementarity to a region of the tagged cfDNA library.
  • the capture probe is about 40 nucleotides and has an optimal annealing temperature of about 44° C to about 47° C.
  • the methods contemplated herein comprise isolating a tagged cfDNA library-capture probe module complex.
  • methods for isolating DNA complexes are well known to those skilled in the art and any methods deemed appropriate by one of skill in the art can be employed with the methods of the present invention (Ausubel et al., Current Protocols in Molecular Biology, 2007-2012).
  • the complexes are isolated using biotin-streptavidin isolation techniques.
  • removal of the single stranded 3'-ends from the isolated tagged cfDNA library-capture probe module complex is contemplated.
  • the methods comprise 3'-5' exonuclease enzymatic processing of the isolated tagged DNA library-multifunctional capture probe module complex to remove the single stranded 3' ends.
  • the methods comprise performing 5'-3' DNA polymerase extension of multifunctional capture probe utilizing the isolated tagged DNA library fragments as template. In certain other embodiments, the methods comprise creating a hybrid capture probe-isolated tagged cfDNA target molecule through the concerted action of a 5' FLAP endonuclease, DNA polymerization and nick closure by a DNA ligase.
  • a variety of enzymes can be employed for the 3'-5' exonuclease enzymatic processing of the isolated tagged cfDNA library-multifunctional capture probe module complex.
  • suitable enzymes which exhibit 3'-5' exonuclease enzymatic activity, that can be employed in particular embodiments include, but are not limited to: T4 or Exonucleases I, III, V (see also, Shevelev IV, Hubscher U., "The 3'-5' exonucleases," Nat Rev Mol Cell Biol. 3(5):364-76 (2002)).
  • the enzyme comprising 3'-5' exonuclease activity is T4 polymerase.
  • an enzyme which exhibits 3'-5' exonuclease enzymatic activity and is capable of primer template extension can be employed, including for example T4 or Exonucleases I, III, V. Id.
  • the methods contemplated herein comprise performing sequencing and/or PCR on the 3'-5' exonuclease enzymatically processed complex discussed supra and elsewhere herein.
  • a tail portion of a capture probe molecule is copied in order to generate a hybrid nucleic acid molecule.
  • the hybrid nucleic acid molecule generated comprises the target region capable of hybridizing to the capture probe module and the complement of the capture probe module tail sequence.
  • genetic analysis comprises a) hybridizing one or more capture probe modules to one or more target genetic loci in a plurality of cfDNA library clones to form one or more capture probe module-cfDNA library clone complexes; b) isolating the one or more capture probe module-cfDNA library clone complexes from a); c) enzymatically processing the one or more isolated capture probe module-cfDNA library clone complexes from step b); d) performing PCR on the enzymatically processed complex from c) wherein the tail portion of the capture probe molecule is copied in order to generate amplified hybrid nucleic acid molecules, wherein the amplified hybrid nucleic acid molecules comprise a target sequence in the target genomic locus capable of hybridizing to the capture probe and the complement of the capture probe module tail sequence; and e) performing quantitative genetic analysis on the amplified hybrid nucleic acid molecules from d).
  • methods for determining copy number of a specific target genetic locus comprising: a) hybridizing one or more capture probe modules to one or more target genetic loci in a plurality of cfDNA library clones to form one or more capture probe module-cfDNA library clone complexes; b) isolating the one or more capture probe module-cfDNA library clone complexes from a); c) enzymatically processing the one or more isolated capture probe module-cfDNA library clone complexes from step b); d) performing PCR on the enzymatically processed complex from c) wherein the tail portion of the capture probe molecule is copied in order to generate amplified hybrid nucleic acid molecules, wherein the amplified hybrid nucleic acid molecules comprise a target sequence in the target genetic locus capable of hybridizing to the capture probe and the complement of the capture probe module tail sequence; e) performing PCR amplification of the amplified hybrid nucleic acid molecules in
  • quantitation allows for a determination of copy number of the specific target region.
  • the enzymatic processing of step c) comprises performing 3'-5' exonuclease enzymatic processing on the one or more capture probe module-cfDNA library clone complexes from b) using an enzyme with 3'-5' exonuclease activity to remove the single stranded 3' ends; creating one or more hybrid capture probe module-cfDNA library clone molecules through the concerted action of a 5' FLAP endonuclease, DNA polymerization and nick closure by a DNA ligase; or performing 5'-3' DNA polymerase extension of the capture probe using the isolated cfDNA clone in the complex as a template.
  • step c) comprises performing
  • PCR can be performed using any standard PCR reaction conditions well known to those of skill in the art.
  • the PCR reaction in e) employs two PCR primers.
  • the PCR reaction in e) employs a first PCR primer that hybridizes to a repeat within the target genetic locus.
  • the PCR reaction in e) employs a second PCR primer that hybridizes to the hybrid nucleic acid molecules at the target genetic locus/tail junction.
  • the PCR reaction in e) employs a first PCR primer that hybridizes to the target genetic locus and a second PCR primer hybridizes to the amplified hybrid nucleic acid molecules at the target genetic locus/tail junction.
  • the second primer hybridizes to the target genetic locus/tail junction such that at least one or more nucleotides of the primer hybridize to the target genetic locus and at least one or more nucleotides of the primer hybridize to the tail sequence.
  • the amplified hybrid nucleic acid molecules obtained from step e) are sequenced and the sequences aligned horizontally, i.e., aligned to one another but not aligned to a reference sequence.
  • steps a) through e) are repeated one or more times with one or more capture probe modules.
  • the capture probe modules can be the same or different and designed to target either cfDNA strand of a target genetic locus.
  • when the capture probes are different they hybridize at overlapping or adjacent target sequences within a target genetic locus in the tagged cfDNA clone library.
  • a high density capture probe strategy is used wherein a plurality of capture probes hybridize to a target genetic locus, and wherein each of the plurality of capture probes hybridizes to the target genetic locus within about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 100, or 200 bp of any other capture probe that hybridizes to the target genetic locus in a tagged cfDNA clone library, including all intervening distances.
  • the method can be performed using two capture probe modules per target genetic locus, wherein one hybridizes to the "Watson" strand (non-coding or template strand) upstream of the target region and one hybridizes to the "Crick" strand (coding or non-template strand) downstream of the target region.
  • the methods contemplated herein can further be performed multiple times with any number of capture probe modules, for example 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more capture probe modules per target genetic locus any number of which hybridize to the Watson or Crick strand in any combination.
  • the sequences obtained can be aligned to one another in order to identify any of a number of differences.
  • a plurality of target genetic loci are interrogated, e.g., 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, 10000, 50000, 100000, 500000 or more in a single reaction, using one or more capture probe modules.
  • the quantitative genetic analysis comprises sequencing a plurality of hybrid nucleic acid molecules, as discussed elsewhere herein, supra, to generate sufficient sequencing depths to obtain a plurality of unique sequencing reads.
  • a unique read is defined as the single consensus read from a
  • Each capture probe yields a set of unique reads that are computationally distilled from total reads by grouping into families.
  • the unique reads for a given sample are then computed as the average of all the unique reads observed on a probe-by-probe basis. Cases where there is an obvious copy number change are excluded from the data set used to compute the average.
  • Unique reads are important because each unique read must be derived from a unique cfDNA clone.
  • Each unique read represents the input and analysis of a haploid equivalent of genomic DNA.
  • the sum of unique reads is the sum of haploid genomes analyzed.
  • the number of genomes analyzed defines the sensitivity of the sequencing assay. By way of a non-limiting example, if the average unique read count is 100 genome equivalents, then that particular assay has a sensitivity of being able to detect one mutant read in 100, or 1%. Any observation less than this is not defensible.
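The relationship between unique reads and assay sensitivity stated above reduces to simple arithmetic, sketched below. The probe names and counts are hypothetical, and the set of probes excluded for an obvious copy number change is supplied by the caller rather than detected automatically.

```python
def average_unique_reads(unique_reads_by_probe, excluded_probes=()):
    """Average the per-probe unique read counts, skipping excluded probes."""
    kept = [
        count
        for probe, count in unique_reads_by_probe.items()
        if probe not in excluded_probes
    ]
    return sum(kept) / len(kept)


def assay_sensitivity(genome_equivalents):
    """Lowest detectable mutant fraction: one read in the genomes analyzed."""
    return 1.0 / genome_equivalents


# Hypothetical per-probe unique read counts; probe_4 shows an obvious copy
# number change and is therefore left out of the average.
unique_reads = {"probe_1": 95, "probe_2": 105, "probe_3": 100, "probe_4": 400}

genomes = average_unique_reads(unique_reads, excluded_probes={"probe_4"})
print(genomes, assay_sensitivity(genomes))  # prints: 100.0 0.01
```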
  • the quantitative genetic analysis comprises multiplex sequencing of hybrid nucleic acid molecules derived from a plurality of samples.
  • the quantitative genetic analysis comprises obtaining one or more or a plurality of tagged DNA library clones, each clone comprising a first DNA sequence and a second DNA sequence, wherein the first DNA sequence comprises a sequence in a targeted genetic locus and the second DNA sequence comprises a capture probe sequence; performing a paired end sequencing reaction on the one or more clones and obtaining one or more sequencing reads or performing a sequencing reaction on the one or more clones in which a single long sequencing read of greater than about 100, 200, 300, 400, 500 or more nucleotides is obtained, wherein the read is sufficient to identify both the first DNA sequence and the second DNA sequence; and ordering or clustering the sequencing reads of the one or more clones according to the probe sequences of the sequencing reads.
  • the quantitative genetic analysis further comprises bioinformatic analysis of the sequencing reads.
  • Bioinformatic analysis excludes any purely mental analysis performed in the absence of a composition or method for sequencing.
  • bioinformatics analysis includes, but is not limited to: sequence alignments; genome equivalents analysis; single nucleotide variant (SNV) analysis; gene copy number variation (CNV) analysis; and detection of genetic lesions.
  • bioinformatics analysis is useful to quantify the number of genome equivalents analyzed in the cfDNA clone library; to detect the genetic state of a target genetic locus; to detect genetic lesions in a target genetic locus; and to measure copy number fluctuations within a target genetic locus.
  • Sequence alignments may be performed between the sequence reads and one or more human reference DNA sequences.
  • sequencing alignments can be used to detect genetic lesions in a target genetic locus including, but not limited to detection of a nucleotide transition or transversion, a nucleotide insertion or deletion, a genomic rearrangement, a change in copy number, or a gene fusion.
  • Detection of genetic lesions that are causal or prognostic indicators may be useful in the diagnosis, prognosis, treatment, and/or monitoring of a particular genetic condition or disease.
  • also contemplated are methods for sequence alignment analysis that can be performed without the need for alignment to a reference sequence, referred to herein as horizontal sequence analysis. Such analysis can be performed on any sequences generated by the methods contemplated herein or any other methods.
  • sequence analysis comprises performing sequence alignments on the reads obtained by the methods contemplated herein.
  • the genome equivalents in a cfDNA clone library are determined using bioinformatics-based counting after sequencing is performed. Each sequencing read is associated with a particular capture probe, and the collection of reads assigned to each capture probe is parsed into groups. Within a group, sets of individual reads share the same read code and the same DNA sequence start position within genomic sequence.
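The grouping step described above can be sketched as follows: reads are keyed by capture probe, read code, and genomic start position, and each distinct key is counted as one unique read (one family) for its probe. The record layout and field names are hypothetical.

```python
from collections import defaultdict


def group_into_families(reads):
    """Group reads that share probe, read code, and genomic start position."""
    families = defaultdict(list)
    for read in reads:
        key = (read["probe"], read["read_code"], read["start"])
        families[key].append(read["sequence"])
    return families


def unique_reads_per_probe(families):
    """Count families (unique reads) observed for each capture probe."""
    counts = defaultdict(int)
    for probe, _read_code, _start in families:
        counts[probe] += 1
    return dict(counts)


# Hypothetical sequencing reads already assigned to capture probes.
reads = [
    {"probe": "probe_A", "read_code": "AAAAA", "start": 1000, "sequence": "ACGT"},
    {"probe": "probe_A", "read_code": "AAAAA", "start": 1000, "sequence": "ACGT"},
    {"probe": "probe_A", "read_code": "AAACT", "start": 1000, "sequence": "ACGT"},
    {"probe": "probe_B", "read_code": "AAAAA", "start": 5000, "sequence": "TTGA"},
]

families = group_into_families(reads)
print(unique_reads_per_probe(families))  # prints: {'probe_A': 2, 'probe_B': 1}
```

Averaging these per-probe counts across probes then gives the genome equivalent estimate for the library.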
  • the number of genome equivalents analyzed is about 1/10, about 1/12, about 1/14, about 1/16, about 1/18, about 1/20, about 1/25 or less of the number of possible unique clones. It should be understood that the procedure outlined above is merely illustrative and not limiting.
  • the number of genome equivalents to be analyzed may need to be increased.
  • the first solution is to use more than one adaptor set per sample. By combining adaptors, it is possible to multiplicatively expand the total number of possible clones and therefore, expand the comfortable limits of genomic input.
  • the second solution is to expand the read code by 1, 2, 3, 4, or 5 or more bases.
  • the number of possible read codes that differ by at least 2 bases from every other read code scales as 4^(n-1), where n is the number of bases within a read code.
  • quantitative genetic analysis comprises bioinformatic analysis of sequencing reads to identify rare single nucleotide variants (SNV).
  • Next-generation sequencing has an inherent error rate of roughly 0.2-0.5%, meaning that anywhere from 1/200 to 1/500 base calls are incorrect.
  • analysis of 5000 unique molecules using targeted sequence capture technology would generate - at sufficient sequencing depths of >50,000 reads - a collection of 5000 unique reads, with each unique read belonging to a "family" of reads that all possess the same read code.
  • a SNV that occurs within a family is a candidate for being a rare variant. When this same variant is observed in more than one family, it becomes a very strong candidate for being a rare variant that exists within the starting sample.
  • variants that occur sporadically within families are likely to be sequencing errors and variants that occur within one and only one family are either rare or the result of a base alteration that occurred ex vivo (e.g., oxidation of a DNA base or PCR-introduced errors).
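  • the family-based reasoning above can be sketched as follows. The input layout (a mapping from read family to the base calls observed at one position), the function name and the 0.9 consistency threshold are assumptions made for illustration; the source text does not specify how "sporadic" is distinguished from a variant carried by a whole family.

```python
def classify_variant(families, alt_base, min_fraction=0.9):
    """Classify an alternate base at one position using read families.

    families: dict mapping a family identifier to a string of base calls,
    one call per read in that family (hypothetical layout).
    min_fraction: assumed threshold for calling a family consistent.
    """
    supporting = 0
    sporadic = False
    for calls in families.values():
        alt_fraction = calls.count(alt_base) / len(calls)
        if alt_fraction >= min_fraction:
            supporting += 1      # essentially the whole family carries the variant
        elif alt_fraction > 0:
            sporadic = True      # only scattered reads within the family carry it
    if supporting >= 2:
        return "strong candidate"
    if supporting == 1:
        return "candidate (rare variant or ex vivo alteration)"
    if sporadic:
        return "likely sequencing error"
    return "not observed"

two_families = {"f1": "TTTTT", "f2": "TTTT", "f3": "CCCCC"}
one_family = {"f1": "TTTTT", "f2": "CCCCC"}
scattered = {"f1": "CTCCC", "f2": "CCCCC"}
```

  • a variant confined to a single family cannot be separated from an ex vivo base alteration by this logic alone, which is why it is reported as a candidate rather than as a confirmed rare variant.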
  • the methods of detecting SNVs comprise introducing 10-fold more genomic input (genomes or genome equivalents) than the desired target sensitivity of the assay.
  • the desired sensitivity is 2% (2 in 100)
  • the experimental target is an input of 2000 genomes.
  • bioinformatics analysis of sequencing data is used to detect or identify SNV associated with a genetic state, condition or disease, genetic mosaicism, fetal testing, paternity testing, predicting response to drug treatment, diagnosing or monitoring a medical condition, microbiome profiling, pathogen screening, and monitoring organ transplants.
  • a method for copy number determination analysis comprising obtaining one or more or a plurality of clones, each clone comprising a first DNA sequence and a second DNA sequence, wherein the first DNA sequence comprises a sequence in a targeted genetic locus and the second DNA sequence comprises a capture probe sequence.
  • a paired end sequencing reaction on the one or more clones is performed and one or more sequencing reads are obtained.
  • a sequencing reaction on the one or more clones is performed in which a single long sequencing read of greater than about 100 nucleotides is obtained, wherein the read is sufficient to identify both the first DNA sequence and the second DNA sequence.
  • the sequencing reads of the one or more clones can be ordered or clustered according to the probe sequence of the sequencing reads.
  • Copy number analyses include, but are not limited to, analyses that examine the number of copies of a particular gene or mutation that occurs in a given genomic DNA sample and can further include quantitative determination of the number of copies of a given gene or sequence differences in a given sample.
  • copy number analysis is used to detect or identify gene amplification associated with genetic states, conditions, or diseases, fetal testing, genetic mosaicism, paternity testing, predicting response to drug treatment, diagnosing or monitoring a medical condition, microbiome profiling, pathogen screening, and monitoring organ transplants.
  • bioinformatics analysis of sequencing data is used to detect or identify one or more sequences or genetic lesions in a target locus including, but not limited to detection of a nucleotide transition or transversion, a nucleotide insertion or deletion, a genomic rearrangement, a change in copy number, or a gene fusion. Detection of genetic lesions that are causal or prognostic indicators may be useful in the diagnosis, prognosis, treatment, and/or monitoring of a particular genetic condition or disease.
  • genetic lesions are associated with genetic states, conditions, or diseases, fetal testing, genetic mosaicism, paternity testing, predicting response to drug treatment, diagnosing or monitoring a medical condition, microbiome profiling, pathogen screening, and monitoring organ transplants.
  • the present invention contemplates a method of detecting, identifying, predicting, diagnosing, or monitoring a condition or disease in a subject.
  • a method of detecting, identifying, predicting, diagnosing, or monitoring a genetic state, condition or disease in a subject comprises performing a quantitative genetic analysis of one or more target genetic loci in a cfDNA clone library to detect or identify a change in the sequence at the one or more target genetic loci.
  • a method of detecting, identifying, predicting, diagnosing, or monitoring a genetic state, condition or disease comprises isolating or obtaining cfDNA from a biological sample of a subject; treating the cfDNA with one or more end-repair enzymes to generate end-repaired cfDNA; ligating one or more adaptors to each end of the end-repaired cfDNA to generate a cfDNA library; amplifying the cfDNA library to generate a cfDNA clone library; determining the number of genome equivalents in the cfDNA clone library; and performing a quantitative genetic analysis of one or more target genetic loci in a cfDNA clone library to detect or identify a change in the sequence at the one or more target genetic loci.
  • a method of detecting, identifying, predicting, diagnosing, or monitoring a genetic state, or genetic condition or disease selected from the group consisting of: genetic diseases; genetic mosaicism; fetal testing; paternity testing; predicting response to drug treatment; diagnosing or monitoring a medical condition; microbiome profiling; pathogen screening; and organ transplant monitoring comprising isolating or obtaining cfDNA from a biological sample of a subject; treating the cfDNA with one or more end-repair enzymes to generate end-repaired cfDNA; ligating one or more adaptors to each end of the end-repaired cfDNA to generate a cfDNA library; amplifying the cfDNA library to generate a cfDNA clone library; determining the number of genome equivalents in the cfDNA clone library; and performing a quantitative genetic analysis of one or more target genetic loci in a cfDNA clone library to detect or identify a nucleo
  • genetic diseases that can be detected, identified, predicted, diagnosed, or monitored with the compositions and methods contemplated herein include, but are not limited to cancer, Alzheimer's disease (APOE1), Charcot-Marie-Tooth disease, Leber hereditary optic neuropathy (LHON), Angelman syndrome (UBE3A, ubiquitin-protein ligase E3A), Prader-Willi syndrome (region in chromosome 15), β-Thalassaemia (HBB, β-Globin), Gaucher disease (type I) (GBA, Glucocerebrosidase), Cystic fibrosis (CFTR, Epithelial chloride channel), Sickle cell disease (HBB, β-Globin), Tay-Sachs disease (HEXA, Hexosaminidase A), Phenylketonuria (PAH, Phenylalanine hydroxylase), Familial hypercholesterolaemia (LDLR, Low density lipoprotein receptor), Adult polycystic kidney disease (PKD1, Polycystin), Huntington disease (HDD, Huntingtin), Neurofibromatosis type I (NF1, NF1 tumour suppressor gene), Myotonic dystrophy (DM, Myotonin), Tuberous sclerosis (TSC1, Tuberin), Achondroplasia (FGFR3, Fibroblast growth factor receptor), Fragile X syndrome (FMR1, RNA-binding protein), Duchenne muscular dystrophy (DMD, Dystrophin), Haemophilia A (F8C, Blood coagulation factor VIII), Lesch-Nyhan syndrome (HPRT1, Hypoxanthine guanine ribosyltransferase 1), and Adrenoleukodystrophy (ABCD1).
  • cancers that can be detected, identified, predicted, diagnosed, or monitored with the compositions and methods contemplated herein include, but are not limited to: B cell cancer, e.g., multiple myeloma, melanomas, breast cancer, lung cancer (such as non-small cell lung carcinoma or NSCLC), bronchus cancer, colorectal cancer, prostate cancer, pancreatic cancer, stomach cancer, ovarian cancer, urinary bladder cancer, brain or central nervous system cancer, peripheral nervous system cancer, esophageal cancer, cervical cancer, uterine or endometrial cancer, cancer of the oral cavity or pharynx, liver cancer, kidney cancer, testicular cancer, biliary tract cancer, small bowel or appendix cancer, salivary gland cancer, thyroid gland cancer, adrenal gland cancer, osteosarcoma, chondrosarcoma, cancer of hematological tissues, adenocarcinomas, inflammatory myofibroblastic tumors, gastrointestinal stromal tumor (GIST), colon cancer, multiple myeloma
  • craniopharyngioma, ependymoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, neuroblastoma, retinoblastoma, follicular lymphoma, diffuse large B-cell lymphoma, mantle cell lymphoma, hepatocellular carcinoma, thyroid cancer, gastric cancer, head and neck cancer, small cell cancers, essential thrombocythemia, agnogenic myeloid metaplasia, hypereosinophilic syndrome, systemic mastocytosis, familial hypereosinophilia, chronic eosinophilic leukemia, neuroendocrine cancers, carcinoid tumors, and the like.
  • the genetic lesion is a lesion annotated in the Cosmic database (the lesions and sequence data can be downloaded from cancer.sanger.ac.uk/cosmic/census) or a lesion annotated in the Cancer Genome Atlas (the lesions and sequence data can be downloaded from tcga-data.nci.nih.gov/tcga/tcgaDownload.jsp).
  • genes that harbor one or more genetic lesions associated with cancer that can be detected, identified, predicted, diagnosed, or monitored with the compositions and methods contemplated herein include, but are not limited to ABCB1, ABCC2, ABCC4, ABCG2, ABL1, ABL2, AKT1, AKT2, AKT3, ALDH4A1, ALK, APC, AR, ARAF, ARFRP1, ARID1A, ATM, ATR, AURKA, AURKB, BCL2, BCL2A1, BCL2L1, BCL2L2, BCL6, BRAF, BRCA1, BRCA2, C1orf144, CARD11, CBL, CCND1, CCND2, CCND3, CCNE1, CDH1, CDH2, CDH20, CDH5, CDK4, CDK6, CDK8, CDKN2A, CDKN2B, CDKN2C, CEBPA, CHEK1, CHEK2, CRKL, CRLF2, CTNNB1, CYP1B1, CYP2C19, CYP2C8,
  • the genetic lesion comprises a nucleotide transition or transversion, a nucleotide insertion or deletion, a genomic rearrangement, a change in copy number, or a gene fusion.
  • the genetic lesion is a gene fusion that fuses the 3' coding region of the ALK gene to another gene.
  • the genetic lesion is a gene fusion that fuses the 3' coding region of the ALK gene to the EML4 gene.
  • compositions and methods contemplated herein include but are not limited to: Down Syndrome (Trisomy 21), Edwards Syndrome (Trisomy 18), Patau Syndrome (Trisomy 13), Klinefelter's Syndrome (XXY), Triple X syndrome, XYY syndrome, Trisomy 8, Trisomy 16, Robertsonian translocation, DiGeorge Syndrome and Wolf-Hirschhorn Syndrome.
  • Illustrative examples of alleles suitable for paternity testing that can be detected, identified, predicted, diagnosed, or monitored with the compositions and methods contemplated herein include but are not limited to 16 or more of: D20S1082, D6S474, D12ATA63, D22S1045, D10S1248, D1S1677, D11S4463, D4S2364, D9S1122, D2S1776, D10S1425, D3S3053, D5S2500, D1S1627, D3S4529, D2S441, D17S974, D6S1017, D4S2408, D9S2157, Amelogenin, D17S1301, D1GATA113, D18S853, D20S482, and D14S1434.
  • genes suitable for predicting the response to drug treatment that can be detected, identified, predicted, diagnosed, or monitored with the compositions and methods contemplated herein include, but are not limited to, one or more of the following genes: ABCB1 (ATP-binding cassette, sub-family B (MDR/TAP), member 1), ACE (angiotensin I converting enzyme), ADH1A (alcohol dehydrogenase 1A (class I), alpha polypeptide), ADH1B (alcohol dehydrogenase 1B (class I), beta polypeptide), ADH1C (alcohol dehydrogenase 1C (class I), gamma polypeptide), ADRB1 (adrenergic, beta-1-, receptor), ADRB2 (adrenergic, beta-2-, receptor, surface), AHR (aryl hydrocarbon receptor), ALDH1A1 (aldehyde dehydrogenase 1 family, member A1), ALOX5 (arachidonate 5-lipoxygenase), BRCA1 (breast cancer 1, early onset), COMT (catechol-O-methyltransferase), CYP2A6 (cytochrome P450, family 2, subfamily A, polypeptide 6), CYP2B6 (cytochrome P450, family 2, subfamily B, polypeptide 6), CYP2C9 (cytochrome P450, family 2, subfamily C, polypeptide 9), CYP2C19 (cytochrome P450, family 2, subfamily C, polypeptide 19), CYP2D6 (cytochrome P450, family 2, subfamily D, polypeptide 6), CYP2J2 (cytochrome P450, family 2, subfamily J, polypeptide 2), CYP3A4 (cytochrome P450, family 3, subfamily A, polypeptide 4), CYP3A5 (cytochrome P450, family 3, subfamily A, polypeptide 5), DPYD (dihydropyrimidine dehydrogenase), DRD2 (dopamine receptor D2), F5 (coagulation factor V), GSTP1
  • Illustrative examples of medical conditions that can be detected, identified, predicted, diagnosed, or monitored with the compositions and methods contemplated herein include, but are not limited to: stroke, transient ischemic attack, traumatic brain injury, heart disease, heart attack, angina, atherosclerosis, and high blood pressure.
  • compositions and methods contemplated herein include, but are not limited to: bacteria, fungi, and viruses.
  • Illustrative examples of bacterial species that can be screened for with the compositions and methods contemplated herein include, but are not limited to: a Mycobacterium spp., a Pneumococcus spp., an Escherichia spp., a Campylobacter spp., a Corynebacterium spp., a Clostridium spp., a Streptococcus spp., a Staphylococcus spp., a Pseudomonas spp., a Shigella spp., a Treponema spp., or a Salmonella spp.
  • Illustrative examples of fungal species that can be screened for with the compositions and methods contemplated herein include, but are not limited to: an Aspergillus spp., a Blastomyces spp., a Candida spp., a Coccidioides spp., a Cryptococcus spp., dermatophytes, a Tinea spp., a Trichophyton spp., a Microsporum spp., a Fusarium spp., a Histoplasma spp., a Mucoromycotina spp., a Pneumocystis spp., a Sporothrix spp., an Exserohilum spp., or a Cladosporium spp.
  • Influenza A such as H1N1, H1N2, H3N2 and H5N1 (bird flu), Influenza B, Influenza C virus, Hepatitis A virus, Hepatitis B virus, Hepatitis C virus, Hepatitis D virus, Hepatitis E virus, Rotavirus, any virus of the Norwalk virus group, enteric adenoviruses, parvovirus, Dengue fever virus, Monkey pox, Mononegavirales, Lyssavirus such as rabies virus, Lagos bat virus, Mokola virus, Duvenhage virus, European bat virus 1 & 2 and
  • cytomegalovirus, Epstein-Barr virus (EBV), human herpesviruses (HHV), human herpesvirus type 6 and 8, Moloney murine leukemia virus (M-MuLV), Moloney murine sarcoma virus (MoMSV), Harvey murine sarcoma virus (HaMuSV), murine mammary tumor virus (MuMTV), gibbon ape leukemia virus (GaLV), feline leukemia virus (FLV), spumavirus, Friend murine leukemia virus, Murine Stem Cell Virus (MSCV) and Rous Sarcoma Virus (RSV), HIV (human immunodeficiency virus; including HIV type 1, and HIV type 2), visna-maedi virus (VMV), the caprine arthritis-encephalitis virus (CAEV), equine infectious anemia virus (EIAV), feline immunodeficiency virus (FIV), bovine immune deficiency virus (BIV), simian immunodeficiency virus (SIV), papilloma virus, murine gammaherpesvirus, Arenaviruses such as Argentine hemorrhagic fever virus, Venezuelan hemorrhagic fever virus, Lassa fever virus, Machupo virus, Lymphocytic choriomeningitis virus (LCMV), Bunyaviridae such as Crimean-Congo hemorrhagic fever virus, Hantavirus, hemorrhagic fever with renal syndrome causing virus, Rift Valley fever virus, Filoviridae including Ebola hemorrhagic fever and Marburg hemorrhagic fever, Flaviviridae including Kyasanur Forest disease virus, Omsk hemorrhagic fever virus, Tick-borne encephalitis causing virus and Paramyxoviridae such as Hendra virus and Nipah virus, variola major and variola minor (smallpox), alphaviruses such as Venezuelan equine encephalitis virus, eastern equine encephalitis virus, western equine encephalitis virus, SARS-associated coronavirus (SARS-CoV), West Nile virus, and any encephalitis causing virus.
  • genes suitable for monitoring an organ transplant in a transplant recipient that can be detected, identified, predicted, diagnosed, or monitored with the compositions and methods contemplated herein include, but are not limited to, one or more of the following genes: HLA-A, HLA-B, HLA-C, HLA-DR, HLA-DP, and HLA-DQ.
  • a bioinformatic analysis is used to quantify the number of genome equivalents analyzed in the cfDNA clone library; detect genetic variants in a target genetic locus; detect mutations within a target genetic locus; detect genetic fusions within a target genetic locus; or measure copy number fluctuations within a target genetic locus.
  • a companion diagnostic for a genetic disease comprising: isolating or obtaining cfDNA from a biological sample of a subject; treating the cfDNA with one or more end-repair enzymes to generate end- repaired cfDNA; ligating one or more adaptors to each end of the end-repaired cfDNA to generate a cfDNA library; amplifying the cfDNA library to generate a cfDNA clone library; determining the number of genome equivalents in the cfDNA clone library; and performing a quantitative genetic analysis of one or more biomarkers associated with the genetic disease in the cfDNA clone library, wherein detection of, or failure to detect, at least one of the one or more biomarkers indicates whether the subject should be treated for the genetic disease.
  • the term "companion diagnostic" refers to a diagnostic test that is linked to a particular anti-cancer therapy.
  • the diagnostic methods comprise detection of a genetic lesion in a biomarker in a biological sample, thereby allowing for prompt identification of patients who should or should not be treated with the anti-cancer therapy.
  • Anti-cancer therapy includes, but is not limited to surgery, radiation, chemotherapeutics, anti-cancer drugs, and immunomodulators.
  • anti-cancer drugs include, but are not limited to:
  • alkylating agents such as thiotepa and cyclophosphamide (CYTOXAN™); alkyl sulfonates such as busulfan, improsulfan and piposulfan; aziridines such as benzodopa, carboquone, meturedopa, and uredopa; ethylenimines and methylamelamines including altretamine, triethylenemelamine, triethylenephosphoramide, triethylenethiophosphoramide and trimethylolomelamine; nitrogen mustards such as chlorambucil, chlornaphazine, cholophosphamide, estramustine, ifosfamide, mechlorethamine, mechlorethamine oxide hydrochloride, melphalan, novembichin, phenesterine, prednimustine, trofosfamide, uracil mustard; nitrosoureas such as carmustine, chlorozotocin, fotemus
  • cactinomycin, calicheamicin, carabicin, carminomycin, carzinophilin, chromomycins, dactinomycin, daunorubicin, detorubicin, 6-diazo-5-oxo-L-norleucine, doxorubicin and its pegylated formulations, epirubicin, esorubicin, idarubicin, marcellomycin, mitomycins, mycophenolic acid, nogalamycin, olivomycins, peplomycin, potfiromycin, puromycin, quelamycin, rodorubicin, streptonigrin, streptozocin, tubercidin, ubenimex, zinostatin, zorubicin; anti-metabolites such as methotrexate and 5-fluorouracil (5-FU); folic acid analogues such as denopterin, methotrexate, pteropterin, trimetrexate; purine analogs such as fludar
  • mitoguazone; mitoxantrone; mopidamol; nitracrine; pentostatin; phenamet; pirarubicin; podophyllinic acid; 2-ethylhydrazide; procarbazine; PSK®; razoxane; sizofiran;
  • TAXOTERE® (Rhône-Poulenc Rorer, Antony, France); chlorambucil; gemcitabine; 6-thioguanine; mercaptopurine; methotrexate; platinum analogs such as cisplatin and carboplatin; vinblastine; platinum; etoposide (VP-16); ifosfamide; mitomycin C; mitoxantrone; vincristine; vinorelbine; navelbine; novantrone; teniposide; aminopterin; xeloda; ibandronate; CPT-11; topoisomerase inhibitor RFS 2000; difluoromethylornithine (DMFO); retinoic acid derivatives such as Targretin™
  • anti-hormonal agents that act to regulate or inhibit hormone action on cancers, such as anti-estrogens including, for example, tamoxifen, raloxifene, aromatase inhibiting 4(5)-imidazoles, 4-hydroxytamoxifen, trioxifene, keoxifene, LY117018, onapristone, and toremifene (Fareston); and anti-androgens such as flutamide, nilutamide, bicalutamide, leuprolide, and goserelin; and pharmaceutically acceptable salts, acids or derivatives of any of the above.
  • immunomodulators include, but are not limited to: cyclosporine, tacrolimus, tresperimus, pimecrolimus, sirolimus, verolimus, laflunimus, laquinimod and imiquimod, as well as analogs, derivatives, salts, ions and complexes thereof.
  • the purpose of this experiment was to provide a direct proof-of-principle demonstration of rare variant detection using targeted sequence capture technology.
  • Target sequence capture technology provides quantitative, sequence-based genetic analysis of nucleic acids and can be exploited to perform a combined mutational and copy number analysis of drug metabolism genes.
  • the present inventors used targeted sequence capture technology and subsequent genetic analysis to detect rare sequence variants.
  • Genomic DNA inputs play a central role in rare variant detection, and quantitative analysis and control of genomic inputs place bounds on the estimated sensitivity of rare variant analysis.
  • a genomic qPCR assay was used by the present inventors to estimate genomic inputs.
  • One experimental goal for rare variant analysis is to introduce 10-fold more genomic input as the target sensitivity of the assay.
  • the experimental target is to input 1000 genomes.
  • Downstream of sequencing, bioinformatics analysis reveals the number of unique reads, and this has the desirable quality of being both an orthogonal and a more direct measure of genomic inputs.
  • a cell line (ZR75-30) with known SNVs was admixed with a germ line DNA sample (NA12878) in a dilution series ranging from 1-to-1 through 1-to-1000.
  • Target regions corresponding to known sequence differences were retrieved using targeted sequence capture technology and sequenced. Sequence variants that occur at a frequency of less than 1 per 1000 sequences were detected.
  • MYC_r2_F4 21 GGCGGCTAGGGGACAGGGGCGGGGTGGGCAGCAGCTCGAATTTCTTCCAGATATCCTCGC
  • MYC_r2_R1 22 AGACGAGCTTGGCGGCGGCCGAGAAGCCGCTCCACATACAGTC
  • MYC_r2_R2 23 AGGAGAGCAGAGAATCCGAGGACGGAGAGAAGGCGCTGGAGT
  • MYC_r2_R3 24 TAAGAGTGGCCCGTTAAATAAGCTGCCAATGAAAATGGGAAAG
  • MYC_r2_R4 25 TTGTATTTGTACAGCATTAATCTGGTAATTGATTATTTTAATGTA
  • MYC_r3_F2 27 AGAGGAGGAACGAGCTAAAACGGAGCTTTTTTGCCCTGCGTGA
  • MYC_r3_F3 28 TCCAACTTGACCCTCTTGGCAGCAGGATAGTCCTTCCGAGTGGA
  • MYC_r3_R1 29 GCTTGGACGGACAGGATGTATGCTGTGGCTTTTTTAAGGATAAC
  • MYC_r3_R2 30 GCATTTGATCATGCATTTGAAACAAGTTCATAGGTGATTGCTCA
  • MYC_r3_R3 31 CGCCCCGCGCCCTCCCAGCCGGGTCCAGCCGGAGCCATGGGGC
  • ERBB2r1r 32 CTCTGGCCCCGCCGGCCGCGGGACCTCGGCGGGGCATCCACAG
  • ERBB2r2r 34 GCAGGGCACCTTCTTCTGCCACCCACCTGTAAACAGAGGGCTCA
  • ERBB2r3f 35 CCCAAGATCTCCAAGTACTGGGGAACCCCAGGGAGGCCCTGGG
  • ERBB2r3r 36 CTAATGCACACAAAGCCTCCCCCTGGTTAGCAGTGGCCCTGGTC
  • ERBB2r4f 37 CTGCTCCTCTTTTAGAAGGCAGGAGGGCCCCAAGGGAAGCAGA
  • ERBB2r4r 38 TGGGGCAGTGGCGGGCAGGCACTGGGTTGTAAGTTGGGAGTTT
  • ERBB2r5f 39 TCTGCTGCTGTTTGTGCCTCTCTCTGTTACTAACCCGTCCTCTCG
  • ERBB2r5r 40 CCCACCCCTCCCATGTCACCTGTATGACACCTGCATTCCACCCG
  • ERBB2r6r 42 GGTGCCCACCCCTTGCATCCTGGGGGGTAGAGCACATTGGGCA
  • ERBB2r7f 43 CACCCTGCCTGGTACTGCCCTATTGCCCCTGGCACACCAGGGCA
  • ERBB2r8f 45 TTATTCTTCTTGTGCCTGGGCACGGTAATGCTGCTCATGGTGGT
  • ERBB2r8r 46 GAAGGATAGGACAGGGTGGGCTGGGCCAGGCTGCATGCGCAG
  • ERBB2r9f 47 GGGCCCGGACCCTGATGCTCATGTGGCTGTTGACCTGTCCCGGT
  • TYMSr2r 50 GTGTTGAGAACAGACTACTGACTTCTAATAGCAGCGACTTCTTT
  • TYMSr3f 51 AAAAAAAGGATGGGTTCCATATGGGTGGTGTCAAGTGCCCACC
  • TYMSr4f 53 AACCCACCGAGATCTGCAAACTTTGCAGGATGCACCAGATGTC
  • TYMSr4r 54 TGCCTCCCTCAGGTGCCTCTGCACAAAACCAGATTGCTTCCCTC
  • TYMSr5f 55 GTTTTACTTTGCCTTTAGCTGTGGTCTTTCAAACCACCATCCCTC
  • TYMSr6r 58 CCTGCCCACCACTTCTCCCTAAACTGAAGCCCCACATTTGGAGC
  • TYMSr7r 60 GCACAGTTACATTTGCCAGTGGCAACATCCTTAAAAATTAATAA
  • TYMSr1f 61 CGTCCCGCCGCGCCACTTGGCCTGCCTCCGTCCCGCCGCGCCAC
  • TYMSr1r 62 CTGTAAGGCGAGGAGGACGATGCGTCCCCTCCCTCGCAGGATT
  • Capture probe modules were pooled from stock plates, combined with partner oligo # 138 (SEQ ID NO:63)
  • genomic DNA from germ line sample NA12878 and cell line ZR75-30 was fragmented at a concentration of 10-20 ng/µL to a target fragment size of 500 bp on a Covaris sonication instrument.
  • the DNA was purified with a 1:1 concentration of DNA purification beads and end-repaired using the New England Biolabs (NEB) Quick blunt kit at a final concentration of 15-30 ng/µL.
  • the germ line and cell line DNAs were blended at ratios of 1:1, 10:1, 100:1 and 1000:1, respectively.
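  • as a rough guide to what each blend implies, the fraction of cell line DNA in each admixture can be computed as sketched below. The sketch assumes the stated ratios are germ line to cell line by genome copies; the function name is hypothetical. For a variant present on only one of two cell line alleles, the expected variant frequency would be half the fraction shown, which is a further assumption not stated in the source.

```python
from fractions import Fraction

def cell_line_fraction(germline_parts, cell_line_parts=1):
    """Fraction of the blend contributed by the cell line DNA."""
    return Fraction(cell_line_parts, germline_parts + cell_line_parts)

# Germ line parts per one part of cell line DNA, as listed in the text.
ratios = [1, 10, 100, 1000]
fractions = {r: cell_line_fraction(r) for r in ratios}

for r, f in fractions.items():
    print(f"{r}:1 blend -> cell line fraction {f} (~{float(f):.4%})")
```

  • under these assumptions the 1000:1 blend places cell line sequences at slightly below 1 per 1000, the frequency range addressed by this experiment.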
  • Libraries were constructed, purified and quantified.
  • the sample codes, library quantitation and inputs used for library construction are shown in Table 2.
  • Genomic libraries were pooled, denatured, combined with probe, hybridized and washed.
  • the washed capture probe-tagged genomic library complexes were amplified with forward and reverse full-length primers, purified, and size-selected for 225-600 bp fragments on a Pippin-prep instrument. Finally, the captured material was sequenced using a 150-V3 Illumina sequencing kit.
  • Genomic DNA is highly intact.
  • The literature suggests that the average size of circulating DNA is about 150 bp, which correlates well with the size of DNAs wrapped around a single nucleosomal histone complex.
  • the targeted sequence capture technologies contemplated herein were designed to accommodate highly fragmented DNA and to retain the ability to generate comprehensive sequence coverage of targeted DNA. Capture probe density was increased and the length of capture probe sequences was reduced from 60 nucleotides to 40 nucleotides to minimize uninformative sequence generation in the clone library. The human genome is littered with repetitive sequences and drastic fluctuations in base composition; thus, the suitability of implementing higher capture probe densities and shorter capture probes could not be assumed but required empirical validation of the new assay.
  • the new high density shorter capture probes were successfully used to query short fragmented DNAs and the results indicated that the assay design is well suited to sequencing of circulating DNAs that are found in the plasma fraction of blood.
  • PLP1_ex2_R 69 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCTATCTCCAGGATGGAGAGAGGGAAAAAAAAAAGATGGGTCTGTGTGGGAGGGCA
  • PLP1_ex3_F 70 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGAAAGAAGCCAGGTCTTCAATTAATAAGATTCCCTGGTCTCGTTTGTCTACCTGTT
  • PLP1_ex3_M 71 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCAGACTCGCGCCCAATTTTCCCCCACCCCTTGTTATTGCCACAAAATCCTGAGGAT
  • CYP2D6_F 72 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACAAGCACCTAGCCCCATTCCTGCTGAGCAGGAGGTGGCAGGTACCCCAGACTGGGA
  • CYP2D6_R 73 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACAGTCGGTGGGGCCAGGATGAGGCCCAGTCTGTTCACACATGGCTGCTGCCTCTCAGCTCT
  • chrX_15_F 74 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCTGGCCCTCAGCCAGTACAGAAAGTCATTTGTCAAGGCCTTCAGTTGGCAGACG
  • chrX_15_R 75 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACAGAATTCATTGCCAGCTATAAATCTGTGGAAACGCTGCCACACAATCTTAGCACA
  • chrX_69_F 76 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTTACTTCCCTCCAGTTTTGTTGCTTGCAAAACAACAGAATCTTCTCTCCATGAAATCATG
  • KRAS_ex1_F 78 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTGTTACCTTTAAAAGACATCTGCTTTCTGCCAAAATTAATGTGCTGAACTTAAACT
  • VHL_r3_R 89 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACATCAAGACTCATCAGTACCATCAAAAGCTGAGATGAAACAGTGTAAGTTTCAACA
  • UGT1A1_r_4F 90 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTGTGTCCAGCTGTGAAACTCAGAGATGTAACTGCTGACATCCTCCCTATTTTGCAT
  • UGT1A1_r_4R 91 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACATTTGAAACAATTTTATCATGAATGCCATGACCAAAGTATTCTTCTGTATCTTCTT
  • TNFRSF14_r3_F 92 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTGATGGGTGGGCTCCCGAAGGGGCCTCCCGCAGACTTGCGAAGTTCCCACTCTCT
  • TNFRSF14_r3_R 93 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCAGGGTGCGGGGGCATCCAGGCTGCCCAAGCGGAGGCTGGGCCGGCTGTGCTGG
  • PTEN_r5_F 98 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTACTTGTTAATTAAAAATTCAAGAGTTTTTTTTTCTTATTCTGAGGTTATCTTTTTACCA
  • VHL_r1_F 102 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCGCCCCGCGT
  • VHL_r1_R 103 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCATACGGGCAGCACGACGCGCGGACTGCGATTGCAGAAGATGACCTGGGAGGGCTCGCG
  • VHL_r1_M1 104 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTAGAGGGGCTTCAGACCGTGCTATCGTCCCTGCTGGGTCGGGCCTAAGCGCCGGG
  • VHL_r1_M2 105 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGCGCCGAGGAGGAGATGGAGGCCGGGCGGCCGCGGCCCGTGCTGCGCTCGGTG
  • VHL_r2_F 106 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGTGTGGGCC[…]CCGA
  • VHL_r2_R 107 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACAAGTGGTCTATCCTGTACTTACCACAACAACCTTATCTTTTTAAAAAGTAAAACGT
  • VHL_r3_F 108 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCTTGTTCGTTC
  • VHL_r3_R 109 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACATCAAGACTCATCAGTACCATCAAAAGCTGAGATGAAACAGTGTAAGTTTCAACA
  • PLP1_ex2_F_40 110 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACACCTACTGGATGTGCCTGACTGTTTCCCCTTCTTCTTCCC
  • PLP1_ex2_R_40 111 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGGAAAAAAAAGATGGGTCTGTGTGGGAGGGCAGGTACTT
  • PLP1_ex3_F_40 112 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTTAATAAGATTCCCTGGTCTCGTTTGTCTACCTGTTAATG
  • PLP1_ex3_M_40 113 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCCCACCCCTTGTTATTGCCACAAAATCCTGAGGATGATC
  • CYP2D6_F_40 114 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGCTGAGCAGGAGGTGGCAGGTACCCCAGACTGGGAGGTAA
  • CYP2D6_R_40 115 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGCCCAGTCTGTTCACACATGGCTGCTGCCTCTCAGCTCT
  • chrX_15_F_40 116 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGAAAGTCATTTGTCAAGGCCTTCAGTTGGCAGACGTGCTC
  • chrX_15_R_40 117 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACAATCTGTGGAAACGCTGCCACACAATCTTAGCACACAAGA
  • chrX_69_F_40 118 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTGCTTGCAAAACAACAGAATCTTCTCTCCATGAAATCATG
  • chrX_69_R_40 119 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACATTTTCTCACAAAGGAAACCAAGATAAAAGGTTTAAATGG
  • KRAS_ex1_F_40 120 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTGCTTTCTGCCAAAATTAATGTGCTGAACTTAAACTTACC
  • KRAS_ex1_R_40 121 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCAATGCAACAGACTTTAAAGAAGTTGTGTTTTACAATGC
  • KRAS_ex2_F_40 122 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACATTTTGCAGAAAACAGATCTGTATTTATTTCAGTGTTACT
  • MYC_r2_F1_40 124 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTGCGCCAGGTTTCCGCACCAAGACCCCTTTAACTCAAGAC
  • MYC_r2_F3_40 126 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCATGGTGAACCAGAGTTTCATCTGCGACCCGGACGACGAG
  • MYC_r2_R3_40 127 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGACGGAGAGAAGGCGCTGGAGTCTTGCGAGGCGCAGGACT
  • VHL_r3_F_40 130 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACAGACCCTAGTCTGCCACTGAGGATTTGGTTTTTGCCCTTC
  • VHL_r3_R_40 131 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTCAAAAGCTGAGATGAAACAGTGTAAGTTTCAACAGAAAT
  • UGT1A1_r_4F_40 132 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACAGAGATGTAACTGCTGACATCCTCCCTATTTTGCATCTCA
  • UGT1A1_r_4R_40 133 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGAATGCCATGACCAAAGTATTCTTCTGTATCTTCTTTCTT
  • TNFRSF14_r3_F_40 134 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGGCCTCCCGCAGACTTGCGAAGTTCCCACTCTCTGGGCG
  • TNFRSF14_r3_R_40 135 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGCTGCCCAAGCGGAGGCTGGGCCGGCTGTGCTGGCCTCTT
  • RHD_r5_F_40 138 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTTTGGAGCAGGAGTGTGATTCTGGCCAACCACCCTCTCTG
  • RHD_r5_R_40 139 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCTGTTAGACCCAAGTGCTGCCCAAGGGCAGCGCCCTGCTC
  • PTEN_r5_F_40 140 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACAAGAGTTTTTTTTTCTTATTCTGAGGTTATCTTTTTACCA
  • VHL_r1_F_40 144 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGATCCCGCGGCGTCCGGCCCGGGTGGTCTGGATCGCGGAG
  • VHL_r1_R_40 145 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGCGGACTGCGATTGCAGAAGATGACCTGGGAGGGCTCGCG
  • VHL_r1_M1_40 146 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCTATCGTCCCTGCTGGGTCGGGCCTAAGCGCCGGGCCCGT
  • VHL_r1_M2_40 147 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGCCGGGCGGCCGCGGCCCGTGCTGCGCTCGGTGAACTCG
  • VHL_r2_F_40 148 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGCCACCGGTGTGGCTCTTTAACAACCTTTGCTTGTCCCGA
  • VHL_r2_R_40 149 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACACCACAACAACCTTATCTTTTTAAAAAGTAAAACGTCAGT
  • VHL_r3_F_40 150 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACAGACCCTAGTCTGCCACTGAGGATTTGGTTTTTGCCCTTC
  • VHL_r3_R_40 151 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTCAAAAGCTGAGATGAAACAGTGTAAGTTTCAACAGAAAT
  • the performance of 40 mer capture probes was compared to that of 60 mer capture probes.
  • The 40 mer was designed from the 60 mer by removing 20 nucleotides from the 5' end of the 60 mer. Although the 3' ends of both capture probe sets are identical with respect to the sequences that are copied from captured genomic clones, the probe sequence signature (Read 2 of the paired end read) is different between the 60 mer and 40 mer probe sets. This design is useful because it allows the capture probes to be multiplexed during sequencing and their performance subsequently analyzed during downstream bioinformatics deconvolution.
  • Genomic samples
  • a pool of 12 genomic DNA samples (chosen from a Coriell human panel of 112 human genomic DNAs) was used as the target DNA.
  • the 12 samples were broken into four sets of four samples each, as shown in detail in Table 5.
  • The capture probe oligos were combined with partner oligo; the final concentration of duplex capture probe was 1 nM for each capture probe.
  • Each hybridization reaction had ~2.5 µg of genomic library in 40 µL total volume. Each sample was heated to 98° C for 2 min then cooled on ice. 20 µL of capture probe and 90 µL of hybridization buffer were added and the hybridization mix was incubated for 24 hours starting at 80° C and decreasing one degree every 48 minutes to 50° C. The complexes were bound to 20 µL of streptavidin beads in 1 mL total volume of TEzero buffer + 0.05% Tween20 (TT). The beads were washed 3 times, 5 min each with 200 µL of TT, and once at 45° C for 5 min in wash buffer. The beads were then washed with TEzero and each reaction was resuspended in 20 µL TEzero. The complexes were then PCR amplified with full length forward
  • the capture probe performance as a function of length and wash temperature is shown graphically in Figure 3. Overall, the 40 mer capture probes performed as well as the 60 mer capture probes with 44° C and 47° C washes. With the 50° C wash, the 40 mer capture probes exhibit sporadic behavior. These data empirically validate the use of 40 mer capture probes and wash temperatures in the 44° C to 47° C range when using these reagents.
  • Sequence capture probes are designed using specific "rules." For example, regions of redundant sequence or regions that exhibit extreme base composition biases are generally avoided.
  • One key implication of the requirement for high probe density and close spacing of probes along target regions is that there is little or no latitude to move probes in order to accommodate any such probe design rules.
  • probes were designed based solely on their position relative to one another with no consideration of probe binding sequences; thus, use of this high density approach required empirically validating that the hybridization and processing methods would accommodate such a collection of probes.
  • the human ALK gene encodes a protein kinase important in early development, but normal ALK gene expression is essentially undetectable in normal adults.
  • Oncogenic ALK fusions are created when intron 19 of ALK undergoes illegitimate recombination to fuse the kinase encoding portion of ALK to the 5' end of another gene. Such gene fusions often cause ectopic expression of the ALK kinase, which in turn is important in driving the inappropriate cell proliferation observed in pulmonary tumors. In lung cancer, this "other gene" is often EML4, but other fusion partners have also been detected.
  • 40 nucleotide probes were designed that were placed at 80 nucleotide intervals in intron 19 of ALK. These probes were oriented such that they are antisense relative to the gene ( Figure 4).
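The tiling scheme described above (40 nucleotide probes at 80 nucleotide intervals) can be illustrated with a minimal sketch. The function name, the region coordinates, and the choice to drop a probe that would overhang the region end are assumptions made for illustration; strand orientation and the actual ALK intron 19 coordinates are not modeled.

```python
def tile_probes(region_start, region_end, probe_len=40, spacing=80):
    """Return (start, end) coordinates of probes placed every `spacing` nt.

    Coordinates are 1-based and inclusive. A probe is emitted only if it
    fits entirely inside the region (an assumption of this sketch).
    """
    probes = []
    pos = region_start
    while pos + probe_len - 1 <= region_end:
        probes.append((pos, pos + probe_len - 1))
        pos += spacing
    return probes


# Hypothetical 400 nt region, used only to show the spacing pattern.
probes = tile_probes(1, 400)
for start, end in probes:
    print(start, end)
```

With these defaults, consecutive probe starts are 80 nucleotides apart, leaving a 40 nucleotide gap between adjacent probes.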
  • The TP53 gene encodes a tumor suppressor, and it is often inactivated by mutations in cancers. Mutations that can inactivate gene function are scattered throughout the gene, and hence conclusive sequence-based assays for TP53 inactivating mutations must address the entire coding region and untranslated regions (UTRs) of the gene. Because circulating DNA fragments are short, high density probes were used to interrogate all target regions of the TP53 gene. Unlike ALK, probes for TP53 are placed in both possible orientations (Figure 5). At high probe densities, the cumulative coverage from multiple probes provides uniform deep sequencing coverage of target regions.
  • probes that target the fusion-prone region of ALK and the coding regions of TP53 were also included.
  • ALK_chr2:29446368_ 156 ATGTGA CTGGCACGGGAGTTGATCCTGGTTTTCACGCAGC fusion f TCCTGG r rGCTTCCGGCGGTACACTGCAGGTGGGTG
  • ALK_chr2:29446528_ 158 ATGTGA CTGGCACGGGAGTTGATCCTGGTTTTCACGAAAT fusion f ACTAATi 4AAATGATTAAAGAAGGTGTGTCTTTAAT
  • ALK_chr2:29446768_ 161 ATGTGA CTGGCACGGGAGTTGATCCTGGTTTTCACGTGAA fusion f CCAGCA GACTGTGTTGCAAGTATAACCCCACGTGA
  • ALK_chr2:29447008_ 164 ATGTGA CTGGCACGGGAGTTGATCCTGGTTTTCACTTTGA fusion f GGGTGC AGCTGGGATCTTGGTCAGTTGTGTTTCCT
  • ALK_chr2:29447728_ 173 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACAAGAG fusion f CCTTTCCCTCTGCCCTTTTCAAGCCTCTGCCCATC
  • ALK_chr2:29447888_ 175 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTTTCCT fusion f ATCTCTCTGCCTGGAGGGTGGTGGAGGGCTGGTT
  • ALK_chr2:29447968_ 176 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACAAACA fusion f GGAGCTGCGCCGGTGGAAGCATGTGGGAGCTAGAA
  • ALK_chr2:29448048_ 177 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGACA fusion f CTGAAGGAGCTCCCCACCCCCTGATCAGCCAGGAG
  • ALK_chr2:29448128_ 178 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGGAA fusion f CTGCAGCTGCTCTGGTGGGGGGAAGGTTGGGAGCT
  • MYCNrlf_40 181 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGAAG
  • MYCNrlr_40 182 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCTAAC
  • MYCNrlf2_40 183 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCTGG
  • MYCNrlr2_40 184 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCGCGC
  • MYCNrlG_40 185 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCCAC
  • MYCNrlr3_40 186 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGGCA
  • MYCNr2f_40 187 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACAACAT
  • MYCNr2r_40 188 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTAAAC
  • MYCNr2f2_40 189 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCCTA
  • MYCNr2r2_40 190 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACATGAC
  • MYCNr2G_40 191 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTCTCA
  • MYCNr2r3_40 192 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCAGTG
  • TP53_chrl7:7579779 193 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGCTAG region l:75nt:-59:- GGGGCTGGGGTTGGGGTGGGGGTGGTGGGCCTGCC 20:f
  • TP53_chrl7:7579838 194 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCAGTT region_l:75nt:l:40:f TCCATAGGTCTGAAAATGTTTCCTGACTCAGAGGG
  • TP53_chrl7:7579878 195 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCTGCC region l:75nt:41:+5:r ATGGAGGAGCCGCAGTCAGATCCTAGCGTCGAGCC
  • TP53_chrl7:7579932 196 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTCATG region_l:75nt:+20:+59 CTGGATCCCCACTTTTCCTCTTGCAGCAGCCAGAC
  • TP53_chrl7:7579640 197 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACAGCCC region 2:23nt:-59:- CCCAGCCCTCCAGGTCCCCAGCCCTCCAGGTCCCC 20:f
  • TP53_chrl7:7579741 198 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGCAGA region_2:23nt:+20:+59 GACCTGTGGGAAGCGAAAATTCCATGGGACTGACT :r
  • TP53_chrl7:7579252 199 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGCAGG region 3:280nt:-59:- GGGATACGGCCAGGCATTGAAGTCTCATGGAAGCC 20:f
  • TP53 chrl7:7579311 200 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCGTG region 3:280nt:l:40:f CAAGTCACAGACTTGGCTGTCCCAGAATGCAAGAA
  • TP53 chrl7:7579351 201 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCAGA region 3:280nt:41:80:r AAACCTACCAGGGCAGCTACGGTTTCCGTCTGGGC
  • TP53 chrl7:7579391 202 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGAAGG region 3:280nt:81: 120 GACAGAAGATGACAGGGGCCAGGAGGGGGCTGGTG :f
  • TP53 chrl7:7579431 203 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGTGGC region 3:280nt:121:16 CCCTGCACCAGCAGCTCCTACACCGGCGGCCCCTG 0:r
  • TP53 chrl7:7579471 204 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGGGG region 3:280nt: 161:20 GAGCAGCCTCTGGCATTCTGGGAGCTTCATCTGGA 0:f
  • TP53 chrl7:7579511 205 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCCCG region 3:280nt:201:24 GACGATATTGAACAATGGTTCACTGAAGACCCAGG 0:r
  • TP53_chrl7:7579610 206 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCTGGG region 3:280nt:+20:+5 GGGCTGGGGGGCTGAGGACCTGGTCCTCTGACTGC 9:r
  • TP53_chrl7:7578327 207 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCTGG region 4: 185nt:-43:- GCAACCAGCCCTGTCGTCTCTCCAGCCCCAGCTGC 4:f
  • TP53 chrl7:7578370 208 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCATC region_4: 185nt:l:40:f GCTATCTGAGCAGCGCTCATGGTGGGGGCAGCGCC
  • TP53_chrl7:7578410 209 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGCCAT region 4: 185nt:41:80:r CTACAAGCAGTCACAGCACATGACGGAGGTTGTGA
  • TP53 chrl7:7578450 210 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCATGG region 4: 185nt:81: 120 CGCGGACGCGGGTGCCGGGCGGGGGTGTGGAATCA :f
  • TP53_chrl7:7578490 211 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTTTGC region 4: 185nt:121:16 CAACTGGCCAAGACCTGCCCTGTGCAGCTGTGGGT 0:r
  • TP53_chrl7:7578574 212 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTGCTTT region 4: 185nt:+20:+5 ATCTGTTCACTTGTGCCCTGACTTTCAACTCTGT
  • TP53_chrl7:7578117 213 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGAGGG region 5: 114nt:-59:- CCACTGACAACCACCCTTAACCCCTCCTCCCAGAG 20:f
  • TP53_chrl7:7578176 214 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCTCA region_5: 114nt:l:40:f GGCGGCTCATAGGGCACCACCACACTATGTCGAAA
  • TP53_chrl7:7578216: 215 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACAGGAA region_5 : 114nt:41 : 80:r ATTTGCGTGTGGAGTATTTGGATGACAGAAACACT
  • TP53_chrl7:7578292 216 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCAGG region_5: 114nt:+3:+42 GTCCCCAGGCCTCTGATTCCTCACTGATTGCTCTT
  • TP53_chrl7:7577439 217 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGAGGC region 6: l l lnt:-59:- AAGCAGAGGCTGGGGCACAGCAGGCCAGTGTGCAG 20:f
  • TP53_chrl7:7577498 218 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCTGG region 6: l l lnt:l:40:f AGTCTTCCAGTGTGATGATGGTGAGGATGGGCCTC
  • TP53_chrl7:7577538 219 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACACTAC region 6: l l lnt:41:80:r ATGTAACAGTTCCTGCATGGGCGGCATGAACCG
  • TP53_chrl7:7577628: 220 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCTTGC region 6: l l lnt:+20:+5 CACAGGTCTCCCCAAGGCGCACTGGCCTCATCTTG 9:r
  • TP53_chrl7:7576974 221 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCTGCA region 7: 138nt:-44:- CCCTTGGTCTCCTCCACCGCTTCTTGTCCTGCTTG
  • TP53 chrl7:7577018 222 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCTCG region 7: 138nt:l:40:f CTTAGTGCTCCCTGGGGGCAGCTCGTGGTGAGGCT
  • TP53_chrl7:7577058 223 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGACCG region_7: 138nt:41:80:r GCGCACAGAGGAAGAGAATCTCCGCAAGAAAGGGG
  • TP53 chrl7:7577098 224 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTCTCC region 7: 138nt:81: 120 CAGGACAGGCACAAACACGCACCTCAAAGCTGTTC :f
  • TP53_chrl7:7577138 225 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTCTCTT region 7: 138nt:121:+2 TTCCTATCCTGAGTAGTGGTAATCTACTGGGACG
  • TP53_chrl7:7577175 226 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGACA region 7: 138nt:+20:+5 GGTAGGACCTGATTTCCTTACTGCCTCTTGCTTCT 9:r
  • TP53_chrl7:7576793 227 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGCAT region 8:75nt:-59:- TTTGAGTGTTAGACTGGAAACTTTCCACTTGATAA 20:f
  • TP53 chrl7:7576852 228 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCTGA region 8:75nt:l:40:f AGGGTGAAATATTCTCCATCCAGTGGTTTCTTCTT
  • TP53_chrl7:7576892 229 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCTAG region 8:75nt:41:+5:r CACTGCCCAACAACACCAGCTCCTCTCCCCAGCCA
  • TP53_chrl7:7576931 230 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTGCCT region_8:75nt:+5:+44: CAGATTCACTTTTATCACCTTTCCTTGCCTCTTTC r
  • TP53 chrl7:7573926 232 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCTGG region 9: 108nt:l:40:f AGTGAGCCCTGCTCCCCCCTGGCTCCTTCCCAGCC
  • TP53 chrl7:7573966 233 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACTCCGA region 9: 108nt:41:80:r GAGCTGAATGAGGCCTTGGAACTCAAGGATGCCCA
  • TP53 chrl7:7574053 234 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCATCT region 9: 108nt:+20:+5 TTTAACTCAGGTACTGTGTATATACTTACTTCTCC
  • TP53_chrl7:7572867 235 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGCAG region 10:83nt:-59:- GGGAGGGAGAGATGGGGGTGGGAGGCTGTCAGTGG 20:f
  • TP53 chrl7:7572926 236 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGTCAG region 10:83nt:l:40:f TCTGAGTCAGGCCCTTCTGTCTTGAACATGAGTTT
  • TP53_chrl7:7573028: 238 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGCAC region 10:83nt:+20:+5 AGACCCTCTCACTCATGTGATGTCATCTCTCCTCC 9:r
  • CDKN2A chr9:21970 248 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGGGCC 956 rs3731249:C:T:r ATCGCGATGTCGCACGGTACCTGCGCGCGGCTGCG
  • EPHXl chrl:2260263 251 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACCCACC 27 rs2234922:A:G:f CTGACTGTGCTCTGTCCCCCCAGGGCTGGACATCC
  • TNFRSF14 chrl:2491 258 ATGTGACTGGCACGGGAGTTGATCCTGGTTTTCACGTGGC 346 rs2234163:G:A:r GTAAGCGCGGCACGCGGCGCAGTGGTCCCCGTCCT
  • Germline sample NA 06994 - a normal human sample obtained from the Coriell repository
  • Cancer cell line NCI-H69 - a cell line known to harbor a mutation in TP53, an amplification of the MYCN locus, and SNVs in ALDH4A1, BRCA1, BRCA2, CDKN2A, DPYD, EPHX1, MYC, RB1 and TNFRSF14 that were included in the target probe set.
  • DNA sequencing libraries are generally constructed from sheared DNA fragments. Acoustic disruption was used to generate DNA fragments that ranged in size from 200 to >500 bp. Enzymatic fragmentation of the acoustically fragmented DNA was performed in an effort to emulate circulating DNA, which is thought to be composed of nucleosomal, ~150 bp fragments. Briefly, DNA at 20-40 ng/µL was sonicated on the 200 bp setting, which yields fragments that range in size from 150 bp to 400 bp in a broad smear.
  • DNA was further fragmented by the addition of 0.01 and 0.02 µL of DNAse enzyme (New England Biolabs recombinant bovine DNAse) to 50 µL aliquots of DNA in DNAse buffer (10 mM Tris pH 8.0, 2.5 mM MnCl2, 0.5 mM CaCl2).
  • DNAse reaction was incubated at 37° C for 10 min and stopped with the addition of 0.5 M EDTA to a final concentration of 25 mM.
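As a worked illustration of the stop step, the volume of 0.5 M EDTA needed to reach 25 mM final concentration follows from a simple dilution balance. This is a sketch only: the 50 µL reaction volume is an assumption taken from the aliquot size in the fragmentation step (the small enzyme volume is ignored), and the function name is hypothetical.

```python
def stop_volume_ul(reaction_ul, stock_mm, final_mm):
    """Volume of stock to add so the diluted mixture reaches `final_mm`.

    Solves stock_mm * v = final_mm * (reaction_ul + v) for v.
    """
    if final_mm >= stock_mm:
        raise ValueError("final concentration must be below the stock concentration")
    return final_mm * reaction_ul / (stock_mm - final_mm)


# Assumed 50 uL reaction, 0.5 M (500 mM) EDTA stock, 25 mM final.
edta_ul = stop_volume_ul(reaction_ul=50.0, stock_mm=500.0, final_mm=25.0)
print(round(edta_ul, 2))
```

Under these assumptions roughly 2.6 µL of stock is required; a different reaction volume scales the answer proportionally.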
  • DNA with an average size of 150 bp was purified by "two-sided" bead selection by first adding 0.9 volumes of beads to 1 volume of DNA.
  • Fragmented DNA was end-repaired using the Quick Blunt kit from NEB and blended in the ratios shown in Table 7. Ten nanograms of blended DNA were then ligated to adaptors with the sequences shown in Table 7. For mixes 9 and 15, two ligation reactions with 10 ng each were performed and subsequently pooled. For mix 16, four reactions were done. An estimate of genomic inputs into each library using a qPCR assay is also shown in Table 7.
  • GL:Z 10 1 NNNT vfNCATTGACGTCTAGAGC : 271 181
  • GL:Z 20 1 NTCACTCTACTCGTGAC 272 224
  • The performance of the high density capture probes, which were chosen based on their position within target sequences, was monitored.
  • a graphical display of the performance of each capture probe is shown in Figure 7.
  • Capture probes that captured a high proportion of off-target and unmappable reads were analyzed further. These capture probes were generally positioned in regions of low sequence complexity/high sequence redundancy. Here, however, such capture probes had no significant detrimental impact on the sequencing depth because the high level of probe redundancy (high density probes) means that all regions are covered by reads derived from several probes. The net effect was excellent uniformity of coverage. See, e.g., Figure 8, probe coverage for the TP53 gene using the 40 mer capture probes.
  • CONCLUSION
  • the purpose of this example was to benchmark the genetic analysis of cfDNA using an efficient cloning procedure for cfDNA and target retrieval system.
  • Plasma samples collected from healthy donors and individuals suffering from either ovarian or colon cancers were used to perform the genetic analysis of circulating DNA.
  • the amount and the overall character of circulating cfDNA can vary widely from individual to individual.
  • The present inventors found that cfDNA is readily clonable with an efficiency indistinguishable from highly purified and fragmented genomic DNA; that the fragment size was remarkably consistent, with an average clone insert size of 170 ± 10 bp (in 7/8 samples); and that the genome representation from such samples was uniform and comparable to experiments performed using purified gDNA. It was further established that by counting unique reads, the depth of representation in each library provided an estimate of minor allele frequency for tumor markers present in the cfDNA of diseased patients. This study established that construction and target retrieval systems contemplated herein were effectively applied to the quantitative genetic analysis of cfDNA.
  • Ligation products were purified by the addition of 100 µL beads, washing, and elution in 40 µL TEzero. All 40 µL of ligation product was amplified by PCR with primer ACA2 (SEQ ID NO:283) and the samples were combined in equal mass for targeted capture.
  • A false-color picture of a 2% agarose gel loaded with 50 ng of each library is shown in Figure 9A.
  • The average fragment size was in a tight range of 260 ± 20 bp.
  • the size of the cfDNA libraries had the same basic superficial appearance as cfDNA in kidney dialysis patients (Atamaniuk et al., Clinical Chemistry 52(3):pp. 523-26 (2006)) except that the cfDNA libraries were shifted to higher mass by the addition of adaptor sequences (Figure 9B).
  • the cfDNA libraries differed dramatically from sonicated gDNA libraries, which appear as broad smears.
  • cfDNA libraries were constructed from the two ovarian cancer patient plasma samples and two plasma samples from healthy volunteers. 38 µL aliquots of cfDNA were end-repaired in 50 µL total volume. Ligations included 40 µL of end-repaired fragment, 16 µL of adaptor (10 µM), 8 µL of 10X ligase buffer, 16 µL of 50% PEG and 4 µL of HC T4 DNA ligase in a total volume of 80 µL. The ligation reaction was incubated at 20° C for 1 hour and 65° C for 10 min. For purification, 20 µL of TEzero and 150 µL of beads were added. The purified ligation products were resuspended in 40 µL, all of which was used in a subsequent 200 µL library
  • The average unique read count observed in each of the eight libraries ranged from ~700 unique reads to >3000 unique reads, defining a range of sensitivities from ~0.15% to ~0.03%.
  • Figure 10 A rare mutant read will likely be observed more than once, meaning minimum sensitivities are less than those calculated above.
  • unique reads provide the lower bound on statistically defensible observation frequencies.
  • Sample 23407 was used as a benchmark. 10 ng/mL of cfDNA was recovered from the plasma sample and 20 ng of the isolated cfDNA was used in each of two library construction efforts. The unique read counts indicated that we recovered an average of 700 unique reads (genome equivalents) from unfragmented DNA ("23407" in Figure 10). Given that each genome contains 0.003 ng of gDNA, 2.1 ng of input DNA in this library (10% cloning efficiency) was recovered.
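The cloning-efficiency arithmetic for sample 23407 can be restated as a short calculation. This is a sketch using only the figures given above (700 unique reads, 0.003 ng per genome, 20 ng input); the function and constant names are illustrative and not part of the described method.

```python
NG_PER_GENOME = 0.003  # mass of one genome equivalent, as stated in the text


def recovered_mass_ng(unique_reads, ng_per_genome=NG_PER_GENOME):
    """Mass of input DNA represented by the observed genome equivalents."""
    return unique_reads * ng_per_genome


def cloning_efficiency(unique_reads, input_ng, ng_per_genome=NG_PER_GENOME):
    """Fraction of input DNA mass that was recovered as unique reads."""
    return recovered_mass_ng(unique_reads, ng_per_genome) / input_ng


recovered = recovered_mass_ng(700)              # about 2.1 ng
efficiency = cloning_efficiency(700, input_ng=20.0)  # about 0.105, i.e. ~10%
print(round(recovered, 2), round(efficiency, 3))
```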
  • the cfDNA libraries resembled a set of discrete bands with random coverage of target regions.
  • Figure 11 shows a random sampling of sequence data.
  • a random set of reads from sample 23407 that was not fragmented prior to cloning (see Figure 10), and that were captured by the TP53 probe "chrl7:7579351 :region_3:280nt:41 :80:r" (SEQ ID NO:201) were aligned using BLAT. Given the way that the sample was prepared, these are likely a reflection of cfDNA fragments in general because the left hand portion of these reads (the read start sites) are randomly distributed across the target region. This random distribution indicates the random breakage of genomic DNA, and it demonstrates that despite the band-like appearance of cfDNA libraries, the sequencing output was a random coverage of the target region. The random distribution is important for effective genetic analysis using technology contemplated herein.
  • Figure 12 provides a higher-resolution overview of TP53 coding region sequencing for a typical cfDNA library.
  • The elements of targeted sequencing - coverage across all target regions and uniform depth at each sequenced base - are readily apparent. At this depth of >4000 unique reads per base, and with a requirement that legitimate candidate base changes must be encountered at least twice, it is possible to estimate that the mutation detection sensitivity for this particular library was about 1 mutation in 2000 sequences, or 0.05%. This level of sensitivity represents a surprising and unexpected outstanding technical achievement.
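The sensitivity estimates quoted in this example follow from dividing the minimum number of required observations by the unique read depth. The sketch below reproduces that arithmetic; the function name and the `min_observations` parameter are illustrative assumptions, and the calculation ignores sampling noise and sequencing error.

```python
def detection_sensitivity(unique_depth, min_observations=1):
    """Lowest detectable variant fraction at a given unique read depth.

    A variant must be seen at least `min_observations` times among
    `unique_depth` independent reads to be called.
    """
    if unique_depth <= 0:
        raise ValueError("unique_depth must be positive")
    return min_observations / unique_depth


low_depth = detection_sensitivity(700)       # roughly 0.15%
high_depth = detection_sensitivity(3000)     # roughly 0.03%
tp53_library = detection_sensitivity(4000, min_observations=2)  # 0.05%
print(f"{low_depth:.4%} {high_depth:.4%} {tp53_library:.4%}")
```

Requiring two observations halves the nominal sensitivity but guards against calling a variant from a single erroneous read.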
  • cfDNA was isolated and cloned from plasma clones with an efficiency comparable to highly purified gDNA isolated from cell lines (the gold standard).
  • the cfDNA libraries resembled circulating nucleosomal-sized DNA fragments + adaptors and the ends possessed sufficiently random character, which enabled efficient genetic analysis.
  • assay sensitivity is established on a sample-by-sample basis.
  • the numbers of different and distinct sequences that are present in each sample are measured, by measuring the number of genome equivalents that are present in a sequencing library.
  • the number of genome equivalents must be measured for each sample library.
  • the first method is based on quantitative PCR (qPCR).
  • a genomic library was constructed using ligation of adaptors to genomic fragments and a pair of PCR primers, one that is specific to a common genomic sequence (e.g., Alu I repeat) and one that is specific to the adaptor.
  • the abundance of ligated adaptor: fragment sequences of these cfDNA libraries was measured.
  • a standard library of known concentration was used to construct a standard curve and the measurements were fit to the resulting standard curve and a value for genome equivalents was derived from the fit.
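A standard-curve fit of the kind described can be sketched as a least-squares line of quantification cycle (Cq) against log10 genome equivalents, which is then inverted for an unknown library. The standards and Cq values below are invented purely for illustration and do not come from the described experiments; the function names are also hypothetical.

```python
import math


def fit_standard_curve(genome_equivalents, cq_values):
    """Least-squares fit of Cq = slope * log10(GE) + intercept."""
    xs = [math.log10(ge) for ge in genome_equivalents]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def genome_equivalents_from_cq(cq, slope, intercept):
    """Invert the fitted line to estimate genome equivalents."""
    return 10 ** ((cq - intercept) / slope)


# Hypothetical standards: 100, 1000 and 10000 genome equivalents.
slope, intercept = fit_standard_curve([100, 1000, 10000], [25.0, 21.7, 18.4])
estimate = genome_equivalents_from_cq(20.05, slope, intercept)
print(round(slope, 2), round(intercept, 2), round(estimate))
```

In practice each standard would be run in replicate and the fit checked for linearity before unknowns are interpolated.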
  • the second method to measure genome equivalents used bioinformatics counting after sequencing was performed.
  • Each unique sequence in a library was identified by its random sequence label and the starting nucleotide of the genomic sequence.
  • each unique sequence must be derived from an independent genome. Therefore, the sum of unique sequences present in sequence data established a precise quantitative measurement of the number of genome equivalents present in a sample.
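The counting rule above (a unique sequence is defined by its random sequence label together with the starting nucleotide of the genomic sequence) amounts to counting distinct pairs. The following is a minimal sketch assuming reads have already been aligned and reduced to `(label, start)` tuples for a single target; the example reads are hypothetical.

```python
def count_genome_equivalents(reads):
    """Count distinct (sequence label, genomic start) pairs.

    `reads` is an iterable of (label, start_position) tuples for one target.
    Reads sharing both label and start are treated as PCR duplicates of
    the same original fragment.
    """
    return len(set(reads))


# Hypothetical reads: the first two are duplicates of one fragment.
reads = [
    ("GCAGC", 1001),
    ("GCAGC", 1001),
    ("GCAGC", 1017),  # same label, different start: a distinct fragment
    ("AAACC", 1001),  # same start, different label: a distinct fragment
]
unique_fragments = count_genome_equivalents(reads)
print(unique_fragments)
```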
  • The first version of a qPCR-based genome equivalence assay used the ACA2 primer (Table 10), but this assay chronically under-reported the number of genome equivalents that are present in a cfDNA library (Figure 13).
  • Table 10. PCR primers used in the development of the genome equivalent qPCR assay
  • the improved version of the assay was based on endogenous repeats (e.g., Alu repeats) that are found at very high frequency throughout the human genome.
  • By coupling an Alu-specific primer with an adaptor-specific primer, the frequency with which adaptors are joined to genomic fragments was reliably measured. Standard curves using libraries of known genome equivalents were generated, and the number of genome equivalents in cloned libraries was measured by fitting to the curve.
  • the PCR primers used to develop an Alu + adaptor-based qPCR assay are shown in Table 10.
  • The PCR primers for Alu amplification were designed from a consensus human Alu sequence (Batzer & Deininger, Nat Rev Genet. 3(5):370-9 (2002)) using PRIMER3 (Alu_F1 and Alu_R1, SEQ ID NOs:285 and 286, respectively).
  • The remaining two Alu primers (Alu_F2 and Alu_R2, SEQ ID NOs:287 and 288, respectively) were reported in the literature (Marullo et al., Genome Biology 11:R9 (2010)).
  • A schematic of the assay design is provided in Figure 14. Because a single PCR primer can be used to amplify the genomic DNA libraries (Figure 14A), a primer that recognizes the adaptor sequence but that cannot amplify genomic clones was used.
  • The 58 nucleotide ACA2-FLFP primer (henceforth abbreviated AF, SEQ ID NO:284) fulfills these criteria because its length induces strong stem-loop PCR suppression (Figure 14B). Additionally, a functional pair of Alu primers was used (Figure 14C).
  • a set of such library construction adaptors was specifically designed to measure the number of genome equivalents present in cfDNA libraries, and, by extension, the sensitivity of sequencing assays used to monitor mutant sequences.
  • the architecture of high-sensitivity library adaptors that were configured to accommodate large numbers of genome equivalents in cfDNA libraries is shown in Figure 17. There is a substantial amount of molecular engineering within the 45 nucleotide ligation strand, which is the strand that becomes attached to end repaired cfDNA fragments.
  • the adaptors comprise at least five elements.
  • Element 1 is a PCR primer binding site for the single-primer library
  • Element 2 is a 5 nucleotide read code.
  • the combination of this code with the genomic DNA sequence constitutes the DNA tag that was used to uniquely identify each read.
  • the 5 nucleotide codes consist of 256 possible unique sequences that were chosen to be 2 base changes different from every other code in the set. This feature enabled unique and distinct reads to be differentiated from reads that appeared to be unique owing to a sequencing error in the code region. Seven codes in which G residues are over-represented and that were shown empirically to interfere with adaptor function were removed, leaving 249 random codes. Table 13.
  • GCAGC 307 GTTAG 371 AAGCT 435 ATACG 499
  • GCCAG 308 GTTGA 372 AAGTC 436 ATAGC 500
  • AAACC 342 ATCTC 406 CCTAT 470 TCATC 534
  • Element 3 is a 3 nucleotide sample code; the sample codes differ from one another by at least two base changes. This element was used to identify different samples and enabled sample multiplexing within a sequencing run. Table 14.
  • Element 3 sample multiplexing 16 AAG 539
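One way to obtain a code set in which every member differs from every other by at least two bases is to append a checksum base to every possible shorter sequence. The sketch below uses that construction, which is an assumption for illustration and is not stated to be how the codes in Tables 13 and 14 were chosen. It yields 256 codes of length 5 and 16 codes of length 3, matching the set sizes described for Elements 2 and 3, but the individual codes need not match the tabulated ones.

```python
from itertools import combinations, product

BASES = "ACGT"


def make_codes(length):
    """Build 4**(length-1) codes with pairwise Hamming distance >= 2.

    The last base is a checksum: the sum of the preceding base indices
    modulo 4. Changing any single base breaks the checksum, so no two
    codes in the set can differ at only one position.
    """
    codes = []
    for prefix in product(range(4), repeat=length - 1):
        check = sum(prefix) % 4
        codes.append("".join(BASES[i] for i in prefix) + BASES[check])
    return codes


def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


read_codes = make_codes(5)    # Element 2 style: 5 nt read codes
sample_codes = make_codes(3)  # Element 3 style: 3 nt sample codes
min_read_distance = min(hamming(a, b) for a, b in combinations(read_codes, 2))
min_sample_distance = min(hamming(a, b) for a, b in combinations(sample_codes, 2))
print(len(read_codes), len(sample_codes), min_read_distance, min_sample_distance)
```

Because a single sequencing error can never convert one valid code into another, an observed code that is not in the set can be flagged as erroneous rather than counted as a new unique read.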
  • Element 4 is a 12 nucleotide anchor sequence with three important
  • each 12 base extension is part of a family of four 12 base extensions that collectively represent each of the four possible DNA bases at each site within extension. This feature, balanced base representation, is required by the Illumina sequencing instrument in order to calibrate proper base calling in sequencing reads.
  • Each extension is composed of only two of four possible bases, and these are specifically chosen to be either 6 A's + 6 C's or 6 G's + 6 T's. This extension formed from only two bases greatly reduces the possibility that the extension sequence will participate in secondary structure formation that would preclude proper adaptor function.
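The two constraints on the anchor sequences (balanced base representation across a family of four, and each extension built from 6 A's + 6 C's or 6 G's + 6 T's) can be checked programmatically. In this sketch the seed sequences are hypothetical and the family is built by swapping A with C and G with T, which is one construction that satisfies both constraints; it is not claimed to be the construction used for the actual adaptors.

```python
SWAP = str.maketrans("ACGT", "CATG")  # A<->C and G<->T


def make_family(ac_seed, gt_seed):
    """Return four extensions: each seed plus its base-swapped partner."""
    return [ac_seed, ac_seed.translate(SWAP), gt_seed, gt_seed.translate(SWAP)]


def is_balanced(family):
    """True if every position shows all four bases across the family."""
    return all(set(column) == set(BASES) for column in zip(*family))


def composition_ok(extension):
    """True if the extension is exactly 6 A + 6 C or exactly 6 G + 6 T."""
    counts = tuple(extension.count(base) for base in BASES)
    return counts in {(6, 6, 0, 0), (0, 0, 6, 6)}


BASES = "ACGT"

# Hypothetical 12 nt seeds chosen only to satisfy the stated composition.
family = make_family("ACCAACACCAAC", "GTTGGTGTTGGT")
balanced = is_balanced(family)
compositions = [composition_ok(ext) for ext in family]
print(family, balanced, compositions)
```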
  • Element 5 is the two base sequence found at the 3' end of Element 4.
  • the particular two base extensions were chosen based on empirical data that shows that these two base sequences are efficient substrates for ligation. Table 15.
  • The adaptor module is hybridized to a partner oligonucleotide. Table 16. The hybridization takes place between the sequence within Element 4 and the partner oligonucleotide. The double-stranded adaptor was ligated to end-repaired cfDNA.
  • the procedure that is outlined is provided as an example and the methods contemplated herein are not meant to be bound by this example.
  • the number of genome equivalents to be analyzed may well exceed the 2500 limit illustrated in the preceding paragraph.
  • the second solution is to expand the code in Element 2 of Figure 17A to 6, 7, or more bases.
  • the results from this example showed that two independent methods for the determination of genome equivalents have utility in sample processing workflow.
  • The first method, qPCR, was implemented during the library construction phase of cfDNA analysis and it was used as a quality control step to ensure that adequate numbers of genome equivalents are moved through library amplification, targeted sequence capture, and DNA sequencing.
  • The other method used explicit counting of unique reads as a more direct measure of the actual number of genome equivalents that fell under informatics consideration.
  • genomic events are prevalent in human cancers. These are somatic mutations that alter the function of the affected gene and its expressed protein product(s); genomic rearrangements that create chimeric gene fusions and therefore expressed fusion proteins with novel biological properties; and changes in gene copy number that lead to gene loss and under expression of gene product(s), or, conversely, amplification of genes and over-representation of the corresponding gene product(s).
  • these aberrant loci are admixed (blended) with the patient's normal, germline DNA.
  • cfDNA circulating, cell-free DNA
  • The technology is widely applicable to any analytical, diagnostic and monitoring paradigm in which circulating DNA is a potential analyte, including, but not limited to: genetic diseases; fetal testing; Mendelian disorders; pathogen screening; and organ transplant monitoring.
  • the technical features highlighted in previous examples are applied to the analysis of admixed cancer samples.
  • cancer-derived cell lines were admixed with normal human DNA at defined dilutions, and quantitative genetic analysis was performed.
  • uncharacterized cfDNA was isolated from the plasma of cancer patients and subsequently examined using quantitative genetic analysis.
  • NCI-H2228 - non-small cell lung cancer cell line, harbors mutation in TP53 (Q331*) and EML4-ALK gene fusion (breakpoint unknown); and
  • NCI-H69 - small cell lung cancer cell line (ATCC), harbors amplification of the MYCN gene (~100 copies).
  • Genomic DNA isolated from cell lines is high molecular weight material that is dissimilar to the small size of cfDNA.
  • Genomic DNAs were first fragmented on the "150 bp" setting using a Covaris Acoustic Sonicator. The sonication generally produces a broad smear, and the DNA was further processed using "two sided" bead selection. A dilute solution of DNA purification beads was added to the sample and the higher molecular mass fragments that adhere to the beads were discarded (the size of purified DNA is proportional to the amount of beads added).
  • Fragmented genomic DNA was end repaired, quantified, and mixed in the various ratios shown in Table 17 and described in the results section below.
  • cfDNA libraries may have limited DNA inputs.
  • The amount of cfDNA obtained per mL of patient plasma is widely variable, but the lower limits (e.g., Example 3) are generally ~10 ng/mL, which is equivalent to 3300 human genomes.
  • the admixture experiments were modeled to reflect the lower limits of cfDNA that were routinely collected from patients. This constraint was applied to all but the most extreme admixtures. In these latter admixtures, libraries were made to mimic inputs from 4 mLs (NA06994:H2228 1000: 1) or 8 mLs
  • CNV copy number variation
  • Cell line NCI-H2228 is known to harbor a fusion gene between EML4 and ALK. The cell line serves as a positive control in both fluorescence in situ hybridization assays and in detection of fusion gene transcripts using RT-PCR.
  • sequence analysis revealed precise location and sequence of the junction formed when the two genes fused ( Figure 20).
  • the frequency of normal reads versus junction reads in the NCI-H2228 cell line (378 vs 249, respectively) suggests that the fusion gene is heterozygous with a normal copy of ALK.
  • Detection of junction reads as a function of admixture is shown in Figure 21. As with point mutation detection, the expected values were adjusted to reflect the fact that the mutant allele is found in one copy per diploid genome. No fusion reads were detected in the 1000: 1 admixed sample.
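The adjustment of expected values for a heterozygous event can be written out explicitly. This sketch assumes that mixing ratios by DNA mass translate directly into ratios of genome copies and that the fusion allele is present in one copy per diploid tumor genome, as the read counts above suggest; the function name is illustrative.

```python
def expected_mutant_fraction(normal_parts, tumor_parts, mutant_copies=1, ploidy=2):
    """Expected fraction of reads at a locus that carry the mutant allele."""
    tumor_fraction = tumor_parts / (normal_parts + tumor_parts)
    return tumor_fraction * mutant_copies / ploidy


# Observed junction-read fraction in the undiluted cell line (from the text).
observed_pure = 249 / (378 + 249)

# Expected fractions for progressively more dilute admixtures.
expected_1_to_1 = expected_mutant_fraction(1, 1)
expected_1000_to_1 = expected_mutant_fraction(1000, 1)
print(round(observed_pure, 3), expected_1_to_1, round(expected_1000_to_1, 6))
```

At 1000:1 the expected junction-read fraction is about 0.05%, which lies at the edge of the sensitivity range estimated earlier and is consistent with no fusion reads being detected in that sample.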
  • Figure 22 shows the results of CNV determination for the MYCN gene as a function of admixture.
  • the NCI-H69 cell line harbors a highly amplified MYCN gene.
  • MYCN is normally found as a single copy gene, two per diploid genome, and thus the expected result for progressively more dilute admixtures is that the tag-calculated CNV should asymptotically approach 2 copies (the asymptote is highlighted in the figure).
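The asymptotic behavior described above can be illustrated by computing the expected average copy number in an admixture. The sketch assumes roughly 100 MYCN copies per tumor genome (the approximate figure given for NCI-H69 earlier in this example), 2 copies per normal genome, and that admixture ratios reflect genome copies; the function name is hypothetical.

```python
def expected_copy_number(normal_parts, tumor_parts, tumor_copies, normal_copies=2):
    """Average gene copies per genome equivalent in a normal:tumor admixture."""
    total_parts = normal_parts + tumor_parts
    total_copies = normal_parts * normal_copies + tumor_parts * tumor_copies
    return total_copies / total_parts


# Expected values for a series of normal:tumor admixtures.
ratios = [(1, 1), (10, 1), (100, 1), (1000, 1)]
expected = [expected_copy_number(n, t, tumor_copies=100) for n, t in ratios]
for (n, t), copies in zip(ratios, expected):
    print(f"{n}:{t} -> {copies:.2f} copies")
```

The values fall from 51 copies at 1:1 toward the 2-copy asymptote as the tumor contribution shrinks.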
  • the validation experiment shown here indicated that the assay system described in this invention is robustly sensitive to highly amplified genes.
  • The OVA1 sample, which lacked any detectable changes in TP53, carried a mutation in KRAS that was found in both cfDNA and in the corresponding tumor.
  • This observation highlights a significant feature of the assay system described here.
  • Libraries created from cfDNA can be interrogated with hundreds (as in this example), and even thousands, of targeting probes.
  • The sequencing of the resulting targeted libraries revealed somatic mutations that reside within the tumor and not within the germline of the affected individual. These tumor-associated somatic markers can also be used to quantify the amount of circulating DNA that is shed from the tumor (as opposed to cfDNA that has germline sequence).
  • The discovery of mutations, regardless of their biological significance, also provides an estimate of tumor content in admixed cfDNA.
  • Another example of aberrant gene discovery is shown in Figure 25.
  • The targeted quantitative genetic analysis system revealed the presence of a significant amplification in the ERBB2 gene, otherwise referred to as HER-2/neu. While this type of amplification has been much publicized in breast cancer, it is occasionally identified in colorectal carcinomas as well.
  • Validation experiments with cell line DNA revealed the thresholds of detection of three types of genetic variation that are central to driving neoplastic growth in cancers. Characterization of cfDNA derived from cancer patients revealed tumor-associated genetic changes that were well above the thresholds set by reconstruction experiments in all four samples analyzed. These data indicated that the quantitative analysis system contemplated herein may have significant clinical utility, especially in settings where liquid biopsies are most appropriate.
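
The arithmetic behind the admixture expectations described above can be illustrated with a minimal Python sketch. It is not the inventors' analysis code. The function names are hypothetical, the value of about 3 pg of DNA per haploid human genome is an assumption chosen because it roughly reproduces the 3300-genome figure, the admixture ratios are assumed to be genome-for-genome, and the amplified copy number of 100 used in the demonstration is a placeholder, since the actual MYCN copy number in NCI-H69 is not stated in this section.

```python
# Illustrative helpers only; all names and default values are assumptions.

# Assumption: ~3 pg of DNA per haploid human genome (not stated in this section).
PG_PER_HAPLOID_GENOME = 3.0


def genome_equivalents(mass_ng, pg_per_genome=PG_PER_HAPLOID_GENOME):
    """Convert a DNA mass in nanograms into haploid genome equivalents."""
    return mass_ng * 1000.0 / pg_per_genome


def tumor_fraction(normal_parts, tumor_parts=1.0):
    """Fraction of genomes from the tumor line in a normal:tumor admixture."""
    return tumor_parts / (normal_parts + tumor_parts)


def expected_single_copy_fraction(normal_parts, tumor_parts=1.0):
    """Expected allele fraction for a variant present in one copy per
    diploid tumor genome (for example a heterozygous fusion junction)."""
    return 0.5 * tumor_fraction(normal_parts, tumor_parts)


def expected_copy_number(normal_parts, amplified_copies, tumor_parts=1.0):
    """Expected copies per diploid genome for an amplified gene diluted
    into normal DNA that carries 2 copies."""
    f = tumor_fraction(normal_parts, tumor_parts)
    return f * amplified_copies + (1.0 - f) * 2.0


def junction_read_fraction(normal_reads, junction_reads):
    """Share of reads at a locus that span the fusion junction."""
    return junction_reads / (normal_reads + junction_reads)


if __name__ == "__main__":
    print("genome equivalents in 10 ng:", round(genome_equivalents(10.0)))
    print("junction share, 378 vs 249 reads:",
          round(junction_read_fraction(378, 249), 3))
    # 100 copies is a hypothetical amplification level, not a value from the source.
    for normal_parts in (1, 10, 100, 1000):
        print(normal_parts,
              expected_single_copy_fraction(normal_parts),
              expected_copy_number(normal_parts, amplified_copies=100))
```

Under these assumptions the expected copy number falls toward 2 as the normal DNA share grows, which matches the asymptote described for Figure 22, and the expected fraction of a single-copy variant is half the tumor fraction, which is the adjustment described for Figure 21.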

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biomedical Technology (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • Analytical Chemistry (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Plant Pathology (AREA)
  • Immunology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
PCT/US2014/052317 2014-08-22 2014-08-22 Methods for quantitative genetic analysis of cell free dna Ceased WO2016028316A1 (en)

Priority Applications (13)

Application Number Priority Date Filing Date Title
EP14761762.5A EP3194612B1 (en) 2014-08-22 2014-08-22 Methods for quantitative genetic analysis of cell free dna
HUE14761762A HUE068191T2 (hu) 2014-08-22 2014-08-22 A sejtmentes dns kvantitatív genetikai elemzésére szolgáló eljárások
CA2957657A CA2957657A1 (en) 2014-08-22 2014-08-22 Methods for quantitative genetic analysis of cell free dna
EP24182096.8A EP4410978A3 (en) 2014-08-22 2014-08-22 Methods for quantitative genetic analysis of cell free dna
ES14761762T ES2984266T3 (es) 2014-08-22 2014-08-22 Métodos para el análisis genético cuantitativo del ADN libre de células
DK14761762.5T DK3194612T3 (da) 2014-08-22 2014-08-22 Fremgangsmåder til kvantitativ genetisk analyse af cellefrit dna
PT147617625T PT3194612T (pt) 2014-08-22 2014-08-22 Métodos para análise genética quantitativa de adn livre de células
JP2017510397A JP6709778B2 (ja) 2014-08-22 2014-08-22 無細胞DNA(cfDNA)の定量的遺伝子解析のための方法
CN201480081729.4A CN107002118B (zh) 2014-08-22 2014-08-22 用于无细胞dna的定量遗传分析的方法
CN202210630244.2A CN115029342B (zh) 2014-08-22 2014-08-22 用于无细胞dna的定量遗传分析的方法
PCT/US2014/052317 WO2016028316A1 (en) 2014-08-22 2014-08-22 Methods for quantitative genetic analysis of cell free dna
PL14761762.5T PL3194612T3 (pl) 2014-08-22 2014-08-22 Metody ilościowej analizy genetycznej dna wolnego od komórek
SG11201701113WA SG11201701113WA (en) 2014-08-22 2014-08-22 Methods for quantitative genetic analysis of cell free dna

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2014/052317 WO2016028316A1 (en) 2014-08-22 2014-08-22 Methods for quantitative genetic analysis of cell free dna

Publications (1)

Publication Number Publication Date
WO2016028316A1 true WO2016028316A1 (en) 2016-02-25

Family

ID=51494529

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2014/052317 Ceased WO2016028316A1 (en) 2014-08-22 2014-08-22 Methods for quantitative genetic analysis of cell free dna

Country Status (11)

Country Link
EP (2) EP3194612B1 (en)
JP (1) JP6709778B2 (en)
CN (2) CN107002118B (en)
CA (1) CA2957657A1 (en)
DK (1) DK3194612T3 (en)
ES (1) ES2984266T3 (en)
HU (1) HUE068191T2 (en)
PL (1) PL3194612T3 (en)
PT (1) PT3194612T (en)
SG (1) SG11201701113WA (en)
WO (1) WO2016028316A1 (en)

Cited By (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106047998A (zh) * 2016-05-27 2016-10-26 深圳市海普洛斯生物科技有限公司 一种肺癌基因的检测方法及应用
WO2017190067A1 (en) * 2016-04-28 2017-11-02 Impact Genomics, Inc. Methods of assessing and monitoring tumor load
WO2018039463A1 (en) 2016-08-25 2018-03-01 Resolution Bioscience, Inc. Methods for the detection of genomic copy changes in dna samples
US9932576B2 (en) 2012-12-10 2018-04-03 Resolution Bioscience, Inc. Methods for targeted genomic analysis
WO2018094183A1 (en) * 2016-11-17 2018-05-24 Seracare Life Sciences, Inc. Methods for preparing dna reference material and controls
CN108517567A (zh) * 2018-04-20 2018-09-11 江苏康为世纪生物科技有限公司 用于cfDNA建库的接头、引物组、试剂盒和建库方法
EP3374525A4 (en) * 2015-11-11 2019-05-29 Resolution Bioscience, Inc. HIGHLY EFFICIENT CONSTRUCTION OF DNA LIBRARIES
JP2019518437A (ja) * 2016-04-29 2019-07-04 ザ メディカル カレッジ オブ ウィスコンシン インクThe Medical College Of Wisconsin, Inc. 多重/最適化ミスマッチ増幅(moma)−がんの評価のためのリアルタイムpcr
US10364467B2 (en) 2015-01-13 2019-07-30 The Chinese University Of Hong Kong Using size and number aberrations in plasma DNA for detecting cancer
EP3430170A4 (en) * 2016-03-16 2019-11-27 Dana-Farber Cancer Institute, Inc. METHOD FOR GENOUS CHARACTERIZATION
US10741270B2 (en) 2012-03-08 2020-08-11 The Chinese University Of Hong Kong Size-based analysis of cell-free tumor DNA for classifying level of cancer
WO2021023123A1 (zh) * 2019-08-02 2021-02-11 北京贝瑞和康生物技术有限公司 一种非特异性扩增天然短片段核酸的方法和试剂盒
US11062789B2 (en) 2014-07-18 2021-07-13 The Chinese University Of Hong Kong Methylation pattern analysis of tissues in a DNA mixture
US11211144B2 (en) 2020-02-18 2021-12-28 Tempus Labs, Inc. Methods and systems for refining copy number variation in a liquid biopsy assay
US11211147B2 (en) 2020-02-18 2021-12-28 Tempus Labs, Inc. Estimation of circulating tumor fraction using off-target reads of targeted-panel sequencing
WO2022020662A3 (en) * 2020-07-24 2022-03-03 Cornell University Methods for assessing the severity and progression of sars-cov2 infections using cell-free dna
US11435339B2 (en) 2016-11-30 2022-09-06 The Chinese University Of Hong Kong Analysis of cell-free DNA in urine
US11475981B2 (en) 2020-02-18 2022-10-18 Tempus Labs, Inc. Methods and systems for dynamic variant thresholding in a liquid biopsy assay
WO2023039539A1 (en) * 2021-09-10 2023-03-16 Foundation Medicine, Inc. Gene fusions in sarcoma
US11773434B2 (en) 2017-06-20 2023-10-03 The Medical College Of Wisconsin, Inc. Assessing transplant complication risk with total cell-free DNA
US11832801B2 (en) * 2016-07-11 2023-12-05 Arizona Board Of Regents On Behalf Of Arizona State University Sweat as a biofluid for analysis and disease identification
US11931674B2 (en) 2019-04-04 2024-03-19 Natera, Inc. Materials and methods for processing blood samples
US11939634B2 (en) 2010-05-18 2024-03-26 Natera, Inc. Methods for simultaneous amplification of target loci
US11946101B2 (en) 2015-05-11 2024-04-02 Natera, Inc. Methods and compositions for determining ploidy
US12020778B2 (en) 2010-05-18 2024-06-25 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US12024738B2 (en) 2018-04-14 2024-07-02 Natera, Inc. Methods for cancer detection and monitoring
US12084720B2 (en) 2017-12-14 2024-09-10 Natera, Inc. Assessing graft suitability for transplantation
US12100478B2 (en) 2012-08-17 2024-09-24 Natera, Inc. Method for non-invasive prenatal testing using parental mosaicism data
US12110552B2 (en) 2010-05-18 2024-10-08 Natera, Inc. Methods for simultaneous amplification of target loci
US12146195B2 (en) 2016-04-15 2024-11-19 Natera, Inc. Methods for lung cancer detection
US12152275B2 (en) 2010-05-18 2024-11-26 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US12203142B2 (en) 2014-04-21 2025-01-21 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US12203127B2 (en) 2014-08-22 2025-01-21 Resolution Bioscience, Inc. Methods for quantitative genetic analysis of cell free DNA
US12221653B2 (en) 2010-05-18 2025-02-11 Natera, Inc. Methods for simultaneous amplification of target loci
US12234509B2 (en) 2018-07-03 2025-02-25 Natera, Inc. Methods for detection of donor-derived cell-free DNA
US12260934B2 (en) 2014-06-05 2025-03-25 Natera, Inc. Systems and methods for detection of aneuploidy
US12270073B2 (en) 2010-05-18 2025-04-08 Natera, Inc. Methods for preparing a biological sample obtained from an individual for use in a genetic testing assay
US12305229B2 (en) 2014-04-21 2025-05-20 Natera, Inc. Methods for simultaneous amplification of target loci
US12378302B2 (en) 2012-11-05 2025-08-05 Foundation Medicine, Inc. Fusion molecules and uses thereof
US12410476B2 (en) 2010-05-18 2025-09-09 Natera, Inc. Methods for simultaneous amplification of target loci
US12460264B2 (en) 2016-11-02 2025-11-04 Natera, Inc. Method of detecting tumour recurrence
US12492429B2 (en) 2014-04-21 2025-12-09 Natera, Inc. Detecting mutations and ploidy in chromosomal segments

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7441584B6 (ja) * 2014-08-22 2024-03-15 レゾリューション バイオサイエンス, インコーポレイテッド 無細胞DNA(cfDNA)の定量的遺伝子解析のための方法
CN111684078B (zh) * 2018-02-12 2024-04-19 豪夫迈·罗氏有限公司 通过评价肿瘤遗传异质性来预测对治疗的应答的方法
EP3762513A1 (en) 2018-03-08 2021-01-13 St. Johns University Circulating serum cell-free dna biomarkers and methods
CN108676891B (zh) * 2018-07-12 2022-02-01 吉林大学 一种直肠腺癌易感性预测试剂盒及系统
US11926821B2 (en) * 2018-10-22 2024-03-12 The Chinese University Of Hong Kong Cell-free DNA quality
JP7560112B2 (ja) * 2019-01-15 2024-10-02 一般社団法人生命科学教育研究所 Pvaスポンジを用いるリキッドバイオプシーの調製方法
CN110819622A (zh) * 2019-10-09 2020-02-21 广州达正生物科技有限公司 一种快速提取石蜡组织rna的方法
CN111254194B (zh) * 2020-01-13 2021-09-07 东南大学 基于cfDNA的测序及数据分析的癌症相关生物标记及其在cfDNA样品分类中的应用
HUE072019T2 (hu) * 2020-09-08 2025-10-28 Resolution Bioscience Inc Adapterek és eljárások genetikai könyvtárak nagy hatékonyságú létrehozásához és genetikai elemzéshez

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130288915A1 (en) * 2012-04-27 2013-10-31 Htg Molecular Diagnostics, Inc. Compositions and methods for alk molecular testing
WO2014071295A1 (en) * 2012-11-02 2014-05-08 Enzymatics Inc. Methods and kits for nucleic acid sample preparation for sequencing
WO2014122288A1 (en) * 2013-02-08 2014-08-14 Qiagen Gmbh Method for separating dna by size

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9023769B2 (en) * 2009-11-30 2015-05-05 Complete Genomics, Inc. cDNA library for nucleic acid sequencing
EP2426217A1 (en) * 2010-09-03 2012-03-07 Centre National de la Recherche Scientifique (CNRS) Analytical methods for cell free nucleic acids and applications
JP2015530113A (ja) * 2012-09-28 2015-10-15 セファイド マイクロrna多重アッセイのための2プライマーpcr
WO2014093330A1 (en) * 2012-12-10 2014-06-19 Clearfork Bioscience, Inc. Methods for targeted genomic analysis

Non-Patent Citations (21)

* Cited by examiner, † Cited by third party
Title
"Short Protocols in Molecular Biology: A Compendium of Methods from Current Protocols in Molecular Biology", GREENE PUB. ASSOCIATES AND WILEY-INTERSCIENCE
"Transcription and Translation", 1984
ANAND: "Techniques for the Analysis of Complex Genomes", 1992, ACADEMIC PRESS
ATAMANIUK ET AL., CLINICAL CHEMISTRY, vol. 52, no. 3, 2006, pages 523 - 26
AUSUBEL ET AL.: "Current Protocols in Molecular Biology", 2007
AUSUBEL ET AL.: "Current Protocols in Molecular Biology", July 2008, JOHN WILEY AND SONS
BATZER; DEININGER, NAT REV GENET., vol. 3, no. 5, 2002, pages 370 - 9
GLOVER: "DNA Cloning: A Practical Approach", vol. I, II, 1985, IRL PRESS
HARLOW; LANE: "Antibodies", 1998, COLD SPRING HARBOR LABORATORY PRESS
HOEIJMAKERS WIETEKE A M ET AL: "Linear amplification for deep sequencing.", NATURE PROTOCOLS, vol. 6, no. 7, July 2011 (2011-07-01), pages 1026 - 1036, XP002733471, ISSN: 1750-2799 *
LIN ET AL., MOL. CANCER RES., vol. 7, no. 9, 2009, pages 1466
MANIATIS ET AL.: "Molecular Cloning: A Laboratory Manual", 1982
MANO HIROYUKI: "Non-solid oncogenes in solid tumors: EML4-ALK fusion genes in lung cancer.", CANCER SCIENCE, vol. 99, no. 12, December 2008 (2008-12-01), pages 2349 - 2355, XP002733474, ISSN: 1349-7006 *
MARULLO ET AL., GENOME BIOLOGY, vol. 11, 2010, pages R9
MCKERNAN KEVIN JUDD ET AL: "Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding.", GENOME RESEARCH, vol. 19, no. 9, September 2009 (2009-09-01), pages 1527 - 1541, XP002733473, ISSN: 1549-5469 *
PERBAL: "A Practical Guide to Molecular Cloning", 1984
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual", 1989
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual", 2001
SHEVELEV IV; HUBSCHER U.: "The 3' 5' exonucleases", NAT REV MOL CELL BIOL., vol. 3, no. 5, 2002, pages 364 - 76, XP008098898, DOI: doi:10.1038/nrm801
VOGELSTEIN ET AL., SCIENCE, vol. 339, no. 6127, 2013, pages 1546 - 1558
YEGNASUBRAMANIAN SRINIVASAN: "Preparation of fragment libraries for next-generation sequencing on the applied biosystems SOLiD platform.", 2013, pages 1 - 17, XP002733472, Retrieved from the Internet <URL:http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3991121/> [retrieved on 20141211] *

Cited By (65)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12270073B2 (en) 2010-05-18 2025-04-08 Natera, Inc. Methods for preparing a biological sample obtained from an individual for use in a genetic testing assay
US12020778B2 (en) 2010-05-18 2024-06-25 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US12110552B2 (en) 2010-05-18 2024-10-08 Natera, Inc. Methods for simultaneous amplification of target loci
US11939634B2 (en) 2010-05-18 2024-03-26 Natera, Inc. Methods for simultaneous amplification of target loci
US12410476B2 (en) 2010-05-18 2025-09-09 Natera, Inc. Methods for simultaneous amplification of target loci
US12152275B2 (en) 2010-05-18 2024-11-26 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US12221653B2 (en) 2010-05-18 2025-02-11 Natera, Inc. Methods for simultaneous amplification of target loci
US10741270B2 (en) 2012-03-08 2020-08-11 The Chinese University Of Hong Kong Size-based analysis of cell-free tumor DNA for classifying level of cancer
US11031100B2 (en) 2012-03-08 2021-06-08 The Chinese University Of Hong Kong Size-based sequencing analysis of cell-free tumor DNA for classifying level of cancer
US12100478B2 (en) 2012-08-17 2024-09-24 Natera, Inc. Method for non-invasive prenatal testing using parental mosaicism data
US12378302B2 (en) 2012-11-05 2025-08-05 Foundation Medicine, Inc. Fusion molecules and uses thereof
US9932576B2 (en) 2012-12-10 2018-04-03 Resolution Bioscience, Inc. Methods for targeted genomic analysis
US10907149B2 (en) 2012-12-10 2021-02-02 Resolution Bioscience, Inc. Methods for targeted genomic analysis
US11999949B2 (en) 2012-12-10 2024-06-04 Resolution Bioscience, Inc. Methods for targeted genomic analysis
US12305229B2 (en) 2014-04-21 2025-05-20 Natera, Inc. Methods for simultaneous amplification of target loci
US12203142B2 (en) 2014-04-21 2025-01-21 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US12486542B2 (en) 2014-04-21 2025-12-02 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US12492429B2 (en) 2014-04-21 2025-12-09 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US12260934B2 (en) 2014-06-05 2025-03-25 Natera, Inc. Systems and methods for detection of aneuploidy
US11062789B2 (en) 2014-07-18 2021-07-13 The Chinese University Of Hong Kong Methylation pattern analysis of tissues in a DNA mixture
US11984195B2 (en) 2014-07-18 2024-05-14 The Chinese University Of Hong Kong Methylation pattern analysis of tissues in a DNA mixture
US12203127B2 (en) 2014-08-22 2025-01-21 Resolution Bioscience, Inc. Methods for quantitative genetic analysis of cell free DNA
US10364467B2 (en) 2015-01-13 2019-07-30 The Chinese University Of Hong Kong Using size and number aberrations in plasma DNA for detecting cancer
US11946101B2 (en) 2015-05-11 2024-04-02 Natera, Inc. Methods and compositions for determining ploidy
EP3374525A4 (en) * 2015-11-11 2019-05-29 Resolution Bioscience, Inc. HIGHLY EFFICIENT CONSTRUCTION OF DNA LIBRARIES
US11339391B2 (en) 2015-11-11 2022-05-24 Resolution Bioscience, Inc. High efficiency construction of DNA libraries
EP3889257A1 (en) * 2015-11-11 2021-10-06 Resolution Bioscience, Inc. High efficiency construction of dna libraries
EP3430170A4 (en) * 2016-03-16 2019-11-27 Dana-Farber Cancer Institute, Inc. METHOD FOR GENOUS CHARACTERIZATION
US11479878B2 (en) 2016-03-16 2022-10-25 Dana-Farber Cancer Institute, Inc. Methods for genome characterization
US12146195B2 (en) 2016-04-15 2024-11-19 Natera, Inc. Methods for lung cancer detection
US11408036B2 (en) 2016-04-28 2022-08-09 Lexent Bio, Inc. Methods of assessing and monitoring tumor load
WO2017190067A1 (en) * 2016-04-28 2017-11-02 Impact Genomics, Inc. Methods of assessing and monitoring tumor load
JP2019518437A (ja) * 2016-04-29 2019-07-04 ザ メディカル カレッジ オブ ウィスコンシン インクThe Medical College Of Wisconsin, Inc. 多重/最適化ミスマッチ増幅(moma)−がんの評価のためのリアルタイムpcr
CN106047998A (zh) * 2016-05-27 2016-10-26 深圳市海普洛斯生物科技有限公司 一种肺癌基因的检测方法及应用
US11832801B2 (en) * 2016-07-11 2023-12-05 Arizona Board Of Regents On Behalf Of Arizona State University Sweat as a biofluid for analysis and disease identification
KR102850460B1 (ko) 2016-08-25 2025-08-27 레졸루션 바이오사이언스, 인크. Dna 샘플 중 게놈 카피 변화의 검출을 위한 방법
WO2018039463A1 (en) 2016-08-25 2018-03-01 Resolution Bioscience, Inc. Methods for the detection of genomic copy changes in dna samples
JP2022023213A (ja) * 2016-08-25 2022-02-07 レゾリューション バイオサイエンス, インコーポレイテッド Dna試料中のゲノムコピー変化の検出方法
CN109804080A (zh) * 2016-08-25 2019-05-24 分析生物科学有限公司 用于检测dna样品中基因组拷贝变化的方法
CN109804080B (zh) * 2016-08-25 2023-07-21 分析生物科学有限公司 用于检测dna样品中基因组拷贝变化的方法
JP7304393B2 (ja) 2016-08-25 2023-07-06 レゾリューション バイオサイエンス, インコーポレイテッド Dna試料中のゲノムコピー変化の検出方法
JP2023078336A (ja) * 2016-08-25 2023-06-06 レゾリューション バイオサイエンス, インコーポレイテッド Dna試料中のゲノムコピー変化の検出方法
US11319594B2 (en) 2016-08-25 2022-05-03 Resolution Bioscience, Inc. Methods for the detection of genomic copy changes in DNA samples
JP2019526257A (ja) * 2016-08-25 2019-09-19 レゾリューション バイオサイエンス, インコーポレイテッド Dna試料中のゲノムコピー変化の検出方法
JP2024107320A (ja) * 2016-08-25 2024-08-08 レゾリューション バイオサイエンス, インコーポレイテッド Dna試料中のゲノムコピー変化の検出方法
JP7217224B2 (ja) 2016-08-25 2023-02-02 レゾリューション バイオサイエンス, インコーポレイテッド Dna試料中のゲノムコピー変化の検出方法
KR20230035431A (ko) * 2016-08-25 2023-03-13 레졸루션 바이오사이언스, 인크. Dna 샘플 중 게놈 카피 변화의 검출을 위한 방법
US12460264B2 (en) 2016-11-02 2025-11-04 Natera, Inc. Method of detecting tumour recurrence
WO2018094183A1 (en) * 2016-11-17 2018-05-24 Seracare Life Sciences, Inc. Methods for preparing dna reference material and controls
US11435339B2 (en) 2016-11-30 2022-09-06 The Chinese University Of Hong Kong Analysis of cell-free DNA in urine
US11773434B2 (en) 2017-06-20 2023-10-03 The Medical College Of Wisconsin, Inc. Assessing transplant complication risk with total cell-free DNA
US12084720B2 (en) 2017-12-14 2024-09-10 Natera, Inc. Assessing graft suitability for transplantation
US12385096B2 (en) 2018-04-14 2025-08-12 Natera, Inc. Methods for cancer detection and monitoring
US12024738B2 (en) 2018-04-14 2024-07-02 Natera, Inc. Methods for cancer detection and monitoring
CN108517567B (zh) * 2018-04-20 2020-08-11 江苏康为世纪生物科技有限公司 用于cfDNA建库的接头、引物组、试剂盒和建库方法
CN108517567A (zh) * 2018-04-20 2018-09-11 江苏康为世纪生物科技有限公司 用于cfDNA建库的接头、引物组、试剂盒和建库方法
US12234509B2 (en) 2018-07-03 2025-02-25 Natera, Inc. Methods for detection of donor-derived cell-free DNA
US11931674B2 (en) 2019-04-04 2024-03-19 Natera, Inc. Materials and methods for processing blood samples
WO2021023123A1 (zh) * 2019-08-02 2021-02-11 北京贝瑞和康生物技术有限公司 一种非特异性扩增天然短片段核酸的方法和试剂盒
US11211147B2 (en) 2020-02-18 2021-12-28 Tempus Labs, Inc. Estimation of circulating tumor fraction using off-target reads of targeted-panel sequencing
US11211144B2 (en) 2020-02-18 2021-12-28 Tempus Labs, Inc. Methods and systems for refining copy number variation in a liquid biopsy assay
US11475981B2 (en) 2020-02-18 2022-10-18 Tempus Labs, Inc. Methods and systems for dynamic variant thresholding in a liquid biopsy assay
WO2022020662A3 (en) * 2020-07-24 2022-03-03 Cornell University Methods for assessing the severity and progression of sars-cov2 infections using cell-free dna
EP4399330A4 (en) * 2021-09-10 2025-07-16 Found Medicine Inc GENE FUSIONS IN SARCOMA
WO2023039539A1 (en) * 2021-09-10 2023-03-16 Foundation Medicine, Inc. Gene fusions in sarcoma

Also Published As

Publication number Publication date
EP3194612B1 (en) 2024-07-03
JP2017525371A (ja) 2017-09-07
PL3194612T3 (pl) 2024-11-04
CN115029342A (zh) 2022-09-09
PT3194612T (pt) 2024-10-02
EP4410978A3 (en) 2024-10-30
ES2984266T3 (es) 2024-10-29
JP6709778B2 (ja) 2020-06-17
EP3194612A1 (en) 2017-07-26
HUE068191T2 (hu) 2024-12-28
CN115029342B (zh) 2025-04-08
CN107002118B (zh) 2022-07-08
CA2957657A1 (en) 2016-02-25
SG11201701113WA (en) 2017-03-30
EP4410978A2 (en) 2024-08-07
CN107002118A (zh) 2017-08-01
DK3194612T3 (da) 2024-10-14

Similar Documents

Publication Publication Date Title
US20250179555A1 (en) Methods for quantitative genetic analysis of cell free dna
EP3194612B1 (en) Methods for quantitative genetic analysis of cell free dna
US20220267763A1 (en) High efficiency construction of dna libraries
EP4211246B1 (en) Adaptors and methods for high efficiency construction of genetic libraries and genetic analysis
KR20240004397A (ko) 다중 라이브러리의 동시 유전자 분석을 위한 조성물 및 방법
HK40112151A (en) Methods for quantitative genetic analysis of cell free dna
JP7441584B2 (ja) 無細胞DNA(cfDNA)の定量的遺伝子解析のための方法
HK40061892A (en) High efficiency construction of dna libraries
HK1241932A1 (en) Methods for quantitative genetic analysis of cell free dna
HK1241932B (en) Methods for quantitative genetic analysis of cell free dna
HK40096223A (en) Adaptors and methods for high efficiency construction of genetic libraries and genetic analysis
HK40096223B (en) Adaptors and methods for high efficiency construction of genetic libraries and genetic analysis
HK1261166A1 (en) High efficiency construction of dna libraries
HK1261166B (en) High efficiency construction of dna libraries

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14761762

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2957657

Country of ref document: CA

REEP Request for entry into the european phase

Ref document number: 2014761762

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2014761762

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2017510397

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE