US20170211143A1 - Methods of determining tissues and/or cell types giving rise to cell-free DNA, and methods of identifying a disease or disorder using same - Google Patents

Methods of determining tissues and/or cell types giving rise to cell-free DNA, and methods of identifying a disease or disorder using same

Info

Publication number
US20170211143A1
Authority
US
United States
Prior art keywords
cfdna
tissue
cell
fragment
disease
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US15/329,228
Other languages
English (en)
Inventor
Jay Shendure
Matthew Snyder
Martin Kircher
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Washington
Original Assignee
University of Washington
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Washington
Priority to US15/329,228
Publication of US20170211143A1
Assigned to UNIVERSITY OF WASHINGTON. Assignment of assignors' interest (see document for details). Assignors: KIRCHER, MARTIN; SHENDURE, JAY; SNYDER, MATTHEW
Current legal status: Abandoned

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6881: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for tissue or cell typing, e.g. human leukocyte antigen [HLA] probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C12Q2535/00: Reactions characterised by the assay type for determining the identity of a nucleotide base or a sequence of oligonucleotides
    • C12Q2535/122: Massive parallel sequencing
    • C12Q2537/00: Reactions characterised by the reaction format or use of a specific feature
    • C12Q2537/10: Reactions characterised by the reaction format or use of a specific feature the purpose or use of
    • C12Q2537/165: Mathematical modelling, e.g. logarithm, ratio
    • G06F19/18
    • G06F19/24
    • G06F19/26
    • G06F19/345
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/10: Ploidy or copy number detection
    • G16B20/20: Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G16B20/30: Detection of binding sites or motifs
    • G16B30/00: ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10: Sequence alignment; Homology search
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/10: Signal processing, e.g. from mass spectrometry [MS] or from PCR
    • G16B45/00: ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Definitions

  • the present disclosure relates to methods of determining one or more tissues and/or cell-types giving rise to cell-free DNA.
  • the present disclosure provides a method of identifying a disease or disorder in a subject as a function of one or more determined tissues and/or cell-types associated with cell-free DNA in a biological sample from the subject.
  • cfDNA: cell-free DNA
  • the cfDNA comprises double-stranded DNA fragments that are relatively short (primarily less than 200 base-pairs) and are normally at a low concentration (e.g. 1-100 ng/mL in plasma).
  • cfDNA is believed to primarily derive from apoptosis of blood cells (i.e., normal cells of the hematopoietic lineage).
  • other tissues can contribute substantially to the composition of cfDNA in bodily fluids such as circulating plasma.
  • cfDNA has been used in certain specialties (e.g., reproductive medicine, cancer diagnostics, and transplant medicine).
  • existing tests based on cfDNA rely on differences in genotypes (e.g., primary sequence or copy number representation of a particular sequence) between two or more cell populations (e.g., maternal genome vs. fetal genome; normal genome vs. cancer genome; transplant recipient genome vs. donor genome, etc.).
  • the present disclosure provides methods of determining one or more tissues and/or cell-types giving rise to cell-free DNA (“cfDNA”) in a biological sample of a subject.
  • the present disclosure provides a method of identifying a disease or disorder in a subject as a function of one or more determined tissues and/or cell-types associated with cfDNA in a biological sample from the subject.
  • the present disclosure provides a method of determining tissues and/or cell types giving rise to cell-free DNA (cfDNA) in a subject, the method comprising isolating cfDNA from a biological sample from the subject, the isolated cfDNA comprising a plurality of cfDNA fragments; determining a sequence associated with at least a portion of the plurality of cfDNA fragments; determining a genomic location within a reference genome for at least some cfDNA fragment endpoints of the plurality of cfDNA fragments as a function of the cfDNA fragment sequences; and determining at least some of the tissues and/or cell types giving rise to the cfDNA fragments as a function of the genomic locations of at least some of the cfDNA fragment endpoints.
  • the present disclosure provides a method of identifying a disease or disorder in a subject, the method comprising isolating cell-free DNA (cfDNA) from a biological sample from the subject, the isolated cfDNA comprising a plurality of cfDNA fragments; determining a sequence associated with at least a portion of the plurality of cfDNA fragments; determining a genomic location within a reference genome for at least some cfDNA fragment endpoints of the plurality of cfDNA fragments as a function of the cfDNA fragment sequences; determining at least some of the tissues and/or cell types giving rise to the cfDNA as a function of the genomic locations of at least some of the cfDNA fragment endpoints; and identifying the disease or disorder as a function of the determined tissues and/or cell types giving rise to the cfDNA.
  • the present disclosure provides a method for determining tissues and/or cell types giving rise to cell-free DNA (cfDNA) in a subject, the method comprising: (i) generating a nucleosome map by obtaining a biological sample from the subject, isolating the cfDNA from the biological sample, and measuring distributions (a), (b) and/or (c) by library construction and massively parallel sequencing of cfDNA; (ii) generating a reference set of nucleosome maps by obtaining a biological sample from control subjects or subjects with known disease, isolating the cfDNA from the biological sample, measuring distributions (a), (b) and/or (c) by library construction and massively parallel sequencing of cfDNA; and (iii) determining tissues and/or cell types giving rise to the cfDNA from the biological sample by comparing the nucleosome map derived from the cfDNA from the biological sample to the reference set of nucleosome maps; wherein (a), (b) and (c)
  • the present disclosure provides a method for determining tissues and/or cell types giving rise to cfDNA in a subject, the method comprising: (i) generating a nucleosome map by obtaining a biological sample from the subject, isolating the cfDNA from the biological sample, and measuring distributions (a), (b) and/or (c) by library construction and massively parallel sequencing of cfDNA; (ii) generating a reference set of nucleosome maps by obtaining a biological sample from control subjects or subjects with known disease, isolating the cfDNA from the biological sample, measuring distributions (a), (b) and/or (c) by library construction and massively parallel sequencing of DNA derived from fragmentation of chromatin with an enzyme such as micrococcal nuclease, DNase, or transposase; and (iii) determining tissues and/or cell types giving rise to the cfDNA from the biological sample by comparing the nucleosome map derived from the cfDNA from
  • the present disclosure provides a method for diagnosing a clinical condition in a subject, the method comprising: (i) generating a nucleosome map by obtaining a biological sample from the subject, isolating cfDNA from the biological sample, and measuring distributions (a), (b) and/or (c) by library construction and massively parallel sequencing of cfDNA; (ii) generating a reference set of nucleosome maps by obtaining a biological sample from control subjects or subjects with known disease, isolating the cfDNA from the biological sample, measuring distributions (a), (b) and/or (c) by library construction and massively parallel sequencing of cfDNA; and (iii) determining the clinical condition by comparing the nucleosome map derived from the cfDNA from the biological sample to the reference set of nucleosome maps; wherein (a), (b) and (c) are: (a) the distribution of likelihoods any specific base-pair in a human genome will appear at a terminus of
  • the present disclosure provides a method for diagnosing a clinical condition in a subject, the method comprising (i) generating a nucleosome map by obtaining a biological sample from the subject, isolating cfDNA from the biological sample, and measuring distributions (a), (b) and/or (c) by library construction and massively parallel sequencing of cfDNA; (ii) generating a reference set of nucleosome maps by obtaining a biological sample from control subjects or subjects with known disease, isolating the cfDNA from the biological sample, measuring distributions (a), (b) and/or (c) by library construction and massively parallel sequencing of DNA derived from fragmentation of chromatin with an enzyme such as micrococcal nuclease (MNase), DNase, or transposase; and (iii) determining the tissue-of-origin composition of the cfDNA from the biological sample by comparing the nucleosome map derived from the cfDNA from the biological sample to the reference set of
  • FIG. 1 shows three types of information that relate cfDNA fragmentation patterns to nucleosome occupancy, exemplified for a small genomic region. These same types of information might also arise through fragmentation of chromatin with an enzyme such as micrococcal nuclease (MNase), DNase, or transposase.
  • FIG. 1A shows the distribution of likelihoods any specific base-pair in a human genome will appear at a terminus of a sequenced fragment (i.e. points of fragmentation);
  • FIG. 1B shows the distribution of likelihoods that any pair of base-pairs of a human genome will appear as a pair of termini of a sequenced fragment (i.e. consecutive pairs of fragmentation points that give rise to an individual molecule);
  • FIG. 1C shows the distribution of likelihoods that any specific base-pair in a human genome will appear within a sequenced fragment (i.e. relative coverage) as a consequence of differential nucleosome occupancy.
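The three quantities described for FIG. 1A to FIG. 1C can be tallied directly from aligned fragment coordinates. The following is a minimal sketch, assuming each fragment is given as a half-open `(start, end)` interval on a single chromosome. The function name `fragmentation_profiles` and the toy fragments are illustrative and do not come from the disclosure. The function returns raw counts; dividing by the relevant totals would give empirical likelihoods.

```python
from collections import Counter

def fragmentation_profiles(fragments):
    """Tally (a) endpoint counts, (b) endpoint-pair counts, (c) per-base coverage.

    fragments: iterable of (start, end), start inclusive and end exclusive
    (an assumed convention for this sketch).
    """
    endpoints = Counter()   # (a) position -> number of fragment termini
    pairs = Counter()       # (b) (first_base, last_base) -> number of fragments
    coverage = Counter()    # (c) position -> number of fragments covering it
    for start, end in fragments:
        last = end - 1
        endpoints[start] += 1
        endpoints[last] += 1
        pairs[(start, last)] += 1
        for pos in range(start, end):
            coverage[pos] += 1
    return endpoints, pairs, coverage

# Hypothetical toy input: two identical 167 bp fragments and one shorter one
frags = [(100, 267), (100, 267), (110, 260)]
endpoints, pairs, coverage = fragmentation_profiles(frags)
```

The per-base loop keeps the sketch readable; a genome-scale version would more likely use difference arrays or interval trees.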
  • FIG. 2 shows insert size distribution of a typical cfDNA sequencing library; here shown for the pooled cfDNA sample derived from human plasma containing contributions from an unknown number of healthy individuals (bulk.cfDNA).
  • FIG. 3A shows average periodogram intensities from Fast Fourier Transformation (FFT) of read start coordinates mapping to the first (chr1) human autosome across all cfDNA samples (Plasma), cfDNA from tumor patient samples (Tumor), cfDNA from pregnant female individuals (Pregnancy), MNase digestion of different human cell lines (Cell lines) and a human DNA shotgun sequencing library (Shotgun).
  • FFT: Fast Fourier Transformation
  • FIG. 3B shows average periodogram intensities from Fast Fourier Transformation (FFT) of read start coordinates mapping to the last (chr22) human autosome across all cfDNA samples (Plasma), cfDNA from tumor patient samples (Tumor), cfDNA from pregnant female individuals (Pregnancy), MNase digestion of different human cell lines (Cell lines) and a human DNA shotgun sequencing library (Shotgun).
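The periodogram intensities described for FIG. 3 measure how strongly read-start positions repeat at a given spacing. The sketch below illustrates the idea under stated assumptions: it evaluates a direct discrete Fourier sum at a few chosen periods rather than running a full FFT, and the function name `periodogram_at_periods`, the normalization, and the toy input are illustrative choices that do not come from the disclosure.

```python
import cmath
import math

def periodogram_at_periods(read_starts, region_start, region_len, periods):
    """Return {period: intensity} for the read-start count signal in one block."""
    # Build a per-base count of read starts inside the block
    counts = [0] * region_len
    for pos in read_starts:
        idx = pos - region_start
        if 0 <= idx < region_len:
            counts[idx] += 1
    # Remove the mean so the zero-frequency component does not dominate
    mean = sum(counts) / region_len
    centered = [c - mean for c in counts]
    result = {}
    for period in periods:
        freq = 1.0 / period
        coeff = sum(x * cmath.exp(-2j * math.pi * freq * n)
                    for n, x in enumerate(centered))
        result[period] = abs(coeff) ** 2 / region_len
    return result

# Hypothetical toy input: one read start every 190 bp across a 10 kbp block
starts = list(range(0, 10_000, 190))
intensities = periodogram_at_periods(starts, 0, 10_000, [150, 190, 230])
```

With this input the intensity at the 190 bp period is far larger than at 150 bp or 230 bp, which is the kind of contrast a nucleosome-scale periodicity would produce.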
  • FIG. 4 shows first three principal components (PC) of intensities at 196 base-pairs (bp) periodicity in 10 kilobase-pair (kbp) blocks across all autosomes:
  • FIG. 4A shows PC 2 vs. PC 1;
  • FIG. 4B shows PC 3 vs. PC 2.
  • FIG. 5 shows hierarchical clustering dendrogram of Euclidean distances of intensities measured at 196 bp periodicity in 10 kbp blocks across all autosomes.
  • FIG. 6 shows first three principal components of intensities at 181 bp to 202 bp periodicity in 10 kbp blocks across all autosomes:
  • FIG. 6A shows PC 2 vs. PC 1;
  • FIG. 6B shows PC3 vs. PC 2.
  • FIG. 7 shows hierarchical clustering dendrogram of Euclidean distances of intensities measured at 181 bp to 202 bp periodicity in 10 kbp blocks across all autosomes.
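The dendrograms in FIG. 5 and FIG. 7 are built from Euclidean distances between per-sample intensity vectors (one value per 10 kbp block). The sketch below shows one way such a clustering could proceed. The linkage rule is not stated in the text above, so single linkage is an assumption here, and the sample names and values are made up for illustration.

```python
import math
from itertools import combinations

def euclidean(u, v):
    """Euclidean distance between two equal-length intensity vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def single_linkage(profiles):
    """Agglomerate samples; return merges as (members_a, members_b, distance)."""
    clusters = {name: (name,) for name in profiles}  # cluster id -> member names
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            # Single linkage (assumed): distance between the closest members
            d = min(euclidean(profiles[x], profiles[y])
                    for x in clusters[a] for y in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters.pop(a) + clusters.pop(b)
        clusters["+".join(merged)] = merged
    return merges

# Hypothetical intensity vectors (three 10 kbp blocks per sample)
profiles = {
    "plasma_1": [1.0, 2.0, 3.0],
    "plasma_2": [1.1, 2.1, 2.9],
    "mnase_1": [5.0, 0.5, 1.0],
}
merges = single_linkage(profiles)
```

The order of merges and their distances are what a dendrogram draws: here the two plasma-like profiles join first, and the MNase-like profile joins last.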
  • FIG. 8 shows principal component analysis (first 7 of 10 PCs) of intensities at 181 bp to 202 bp periodicity in 10 kbp blocks across all autosomes for the cfDNA data sets:
  • FIG. 8A shows PC 2 vs. PC 1;
  • FIG. 8B shows PC 3 vs. PC 2;
  • FIG. 8C shows PC 4 vs. PC 3;
  • FIG. 8D shows PC 5 vs. PC 4;
  • FIG. 8E shows PC 6 vs. PC 5;
  • FIG. 8F shows PC 7 vs. PC 6.
  • FIG. 9 shows principal component analysis of intensities at 181 bp to 202 bp periodicity in 10 kbp blocks across all autosomes for the MNase data sets:
  • FIG. 9A shows PC 2 vs. PC 1;
  • FIG. 9B shows PC 3 vs. PC 2;
  • FIG. 9C shows PC 4 vs. PC 3;
  • FIG. 9D shows PC 5 vs. PC 4;
  • FIG. 9E shows PC 6 vs. PC 5.
  • FIG. 10 shows average periodogram intensities for a representative human autosome (chr11) across all synthetic cfDNA and MNase data set mixtures.
  • FIG. 11 shows first two principal components of intensities at 181 bp to 202 bp periodicity in 10 kbp blocks across all autosomes for the synthetic MNase data set mixtures.
  • FIG. 12 shows first two principal components of intensities at 181 bp to 202 bp periodicity in 10 kbp blocks across all autosomes for the synthetic cfDNA data set mixtures.
  • FIG. 13 shows hierarchical clustering dendrogram of Euclidean distances of intensities at 181 bp to 202 bp periodicity in 10 kbp blocks across all autosomes for the synthetic MNase and cfDNA mixture data sets.
  • FIG. 14 shows read-start density in 1 kbp window around 23,666 CTCF binding sites for a set of samples with at least 100M reads.
  • FIG. 15 shows read-start density in 1 kbp window around 5,644 c-Jun binding sites for a set of samples with at least 100M reads.
  • FIG. 16 shows read-start density for 1 kbp window around 4,417 NF-YB binding sites for a set of samples with at least 100M reads.
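The read-start densities in FIG. 14 to FIG. 16 aggregate read starts over many binding sites within a fixed window. A minimal sketch of that aggregation follows, assuming all coordinates are on one chromosome and that a 1 kbp window means 500 bp on either side of the site. The function name `read_start_density` and the toy coordinates are illustrative only.

```python
def read_start_density(read_starts, sites, half_window=500):
    """Sum read starts at each offset in [-half_window, +half_window] over all sites."""
    # Count read starts per position for constant-time lookup
    starts = {}
    for pos in read_starts:
        starts[pos] = starts.get(pos, 0) + 1
    density = [0] * (2 * half_window + 1)
    for site in sites:
        for offset in range(-half_window, half_window + 1):
            density[offset + half_window] += starts.get(site + offset, 0)
    return density

# Hypothetical toy input: two sites, read starts 10 bp either side of each
density = read_start_density([990, 1010, 1990, 2010, 2010], [1000, 2000],
                             half_window=20)
```

Index `half_window` in the returned list corresponds to the site itself; strand orientation of the motif is ignored in this sketch.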
  • FIG. 17 shows a schematic overview of the processes giving rise to cfDNA fragments.
  • Apoptotic and/or necrotic cell death results in near-complete digestion of native chromatin.
  • Protein-bound DNA fragments, typically associated with histones or transcription factors, preferentially survive digestion and are released into the circulation, while naked DNA is lost. Fragments can be recovered from peripheral blood plasma following proteinase treatment.
  • cfDNA is primarily derived from myeloid and lymphoid cell lineages, but contributions from one or more additional tissues may be present in certain medical conditions.
  • FIG. 18 shows fragment length of cfDNA observed with conventional sequencing library preparation. Length is inferred from alignment of paired-end sequencing reads. A reproducible peak in fragment length at 167 base-pairs (bp) (green dashed line) is consistent with association with chromatosomes. Additional peaks evidence ~10.4 bp periodicity, corresponding to the helical pitch of DNA on the nucleosome core. Enzymatic end-repair during library preparation removes 5′ and 3′ overhangs and may obscure true cleavage sites.
  • FIG. 19 shows a dinucleotide composition of 167 bp fragments and flanking genomic sequence in conventional libraries. Observed dinucleotide frequencies in the BH01 library were compared to expected frequencies from simulated fragments (matching for endpoint biases resulting from both cleavage and adapter ligation preferences).
  • FIG. 20 shows a schematic of a single-stranded library preparation protocol for cfDNA fragments.
  • FIG. 21 shows fragment length of cfDNA observed with single-stranded sequencing library preparation. No enzymatic end-repair is performed to template molecules during library preparation. Short fragments of 50-120 bp are highly enriched compared to conventional libraries. While ~10.4 bp periodicity remains, its phase is shifted by ~3 bp.
  • FIG. 22 shows a dinucleotide composition of 167 bp fragments and flanking genomic sequence in single-stranded libraries. Observed dinucleotide frequencies in the IH02 library were compared to expected frequencies derived from simulated fragments, again matching for endpoint biases. The apparent difference in the background level of bias between BH01 and IH02 relates to differences between the simulations, rather than the real libraries (data not shown).
  • FIG. 23A shows a gel image of a representative cfDNA sequencing library prepared with the conventional protocol.
  • FIG. 23B shows a gel image of a representative cfDNA sequencing library prepared with the single-stranded protocol.
  • FIG. 24A shows mononucleotide cleavage biases of cfDNA fragments.
  • FIG. 24B shows dinucleotide cleavage biases of cfDNA fragments.
  • FIG. 25 shows a schematic overview of inference of nucleosome positioning.
  • a per-base windowed protection score (WPS) is calculated by subtracting the number of fragment endpoints within a 120 bp window from the number of fragments completely spanning the window. High WPS values indicate increased protection of DNA from digestion; low values indicate that DNA is unprotected. Peak calls identify contiguous regions of elevated WPS.
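The windowed protection score described above can be written out for a single position as follows. This is a sketch under stated assumptions: fragments are `(first_base, last_base)` pairs with both ends inclusive, the window is placed so that it starts `window // 2` bases before the position, and a fragment that spans the window is counted only as spanning even when one of its ends coincides with a window edge. The function name `windowed_protection_score` and the toy fragments are illustrative.

```python
def windowed_protection_score(fragments, position, window=120):
    """WPS at one position: spanning fragments minus endpoints inside the window.

    fragments: iterable of (first_base, last_base), both inclusive (assumed).
    """
    w_start = position - window // 2
    w_end = w_start + window - 1          # inclusive last base of the window
    spanning = 0
    endpoints_inside = 0
    for first, last in fragments:
        if first <= w_start and last >= w_end:
            spanning += 1                 # fragment completely covers the window
        else:
            # Count each terminus that falls inside the window
            endpoints_inside += (w_start <= first <= w_end) + (w_start <= last <= w_end)
    return spanning - endpoints_inside

# Hypothetical fragments around position 1000 (window covers 940..1059)
frags = [(900, 1100), (950, 1010), (1030, 1200), (990, 1005)]
score = windowed_protection_score(frags, 1000)
```

Here one fragment spans the window and five endpoints fall inside it, so the score is negative, which under the description above indicates unprotected DNA at that position.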
  • FIG. 26 shows strongly positioned nucleosomes at a well-studied alpha-satellite array. Coverage, fragment endpoints, and WPS values from sample CH01 are shown for long fragment (120 bp window; 120-180 bp reads) or short fragment (16 bp window; 35-80 bp reads) bins at a pericentromeric locus on chromosome 12. Nucleosome calls from CH01 (middle, blue boxes) are regularly spaced across the locus. Nucleosome calls based on MNase digestion from two published studies (middle, purple and black boxes) are also displayed. The locus overlaps with an annotated alpha-satellite array.
  • FIG. 27 shows inferred nucleosome positioning around a DNase I hypersensitive site (DHS) on chromosome 9. Coverage, fragment endpoints, and WPS values from sample CH01 are shown for long and short fragment bins. The hypersensitive region, highlighted in gray, is marked by reduced coverage in the long fragment bin. Nucleosome calls from CH01 (middle, blue boxes) adjacent to the DHS are spaced more widely than typical adjacent pairs, consistent with accessibility of the intervening sequence to regulatory proteins including transcription factors. Coverage of shorter fragments, which may be associated with such proteins, is increased at the DHS, which overlaps with several annotated transcription factor binding sites (not shown). Nucleosome calls based on MNase digestion from two published studies are shown as in FIG. 26.
  • DHS: DNase I hypersensitive site
  • FIG. 28 shows a schematic of peak calling and scoring according to one embodiment of the present disclosure.
  • FIG. 29 shows CH01 peak density by GC content.
  • FIG. 30 shows a histogram of distances between adjacent peaks by sample. Distances are measured from peak call to adjacent call.
  • FIG. 31 shows a comparison of peak calls between samples. For each pair of samples, the distances between each peak call in the sample with fewer peaks and the nearest peak call in the other sample are calculated and visualized as a histogram with bin size of 1. Negative numbers indicate the nearest peak is upstream; positive numbers indicate the nearest peak is downstream.
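The peak-call comparison described for FIG. 31 can be sketched as a nearest-neighbor search over sorted peak positions. The version below assumes peaks are integer coordinates on one chromosome and resolves exact ties in favor of the upstream peak, which is a choice made for this sketch. The function name `nearest_peak_offsets` and the toy peak lists are illustrative.

```python
from bisect import bisect_left
from collections import Counter

def nearest_peak_offsets(peaks_a, peaks_b):
    """Histogram (bin size 1) of signed offsets from each peak in the smaller
    set to the nearest peak in the other set. Negative means upstream."""
    query, target = sorted(peaks_a), sorted(peaks_b)
    if len(query) > len(target):
        query, target = target, query     # always query from the smaller set
    histogram = Counter()
    for peak in query:
        i = bisect_left(target, peak)
        # Only the neighbors on either side of the insertion point can be nearest
        candidates = target[max(i - 1, 0):i + 1]
        nearest = min(candidates, key=lambda t: abs(t - peak))
        histogram[nearest - peak] += 1
    return histogram

# Hypothetical peak calls from two samples
offsets = nearest_peak_offsets([100, 290, 480], [103, 285, 470, 660])
```

Each key of the returned counter is an offset in base pairs, so plotting the counter directly gives a histogram with bin size 1.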
  • FIG. 32 shows a comparison of peak calls between samples: FIG. 32A shows IH01 vs. BH01; FIG. 32B shows IH02 vs. BH01; FIG. 32C shows IH02 vs. IH01.
  • FIG. 33A shows nucleosome scores for real vs. simulated peaks.
  • FIG. 33B shows median peak offset within a score bin as a function of the score bin (left y-axis), and the number of peaks in each score bin (right y-axis).
  • FIG. 34 shows a comparison of peak calls between samples and matched simulations: FIG. 34A shows BH01 simulation vs. BH01 actual; FIG. 34B shows IH01 simulation vs. IH01 actual; FIG. 34C shows IH02 simulation vs. IH01 actual.
  • FIG. 35 shows distances between adjacent peaks, sample CH01.
  • the dotted black line indicates the mode of the distribution (185 bp).
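The distribution of distances between adjacent peaks, and its mode, can be computed in a few lines. This is a minimal sketch with illustrative function names and made-up peak coordinates; when several distances are equally common, `Counter.most_common` returns the one encountered first.

```python
from collections import Counter

def adjacent_peak_distances(peaks):
    """Distances between each peak call and the next one downstream."""
    ordered = sorted(peaks)
    return [b - a for a, b in zip(ordered, ordered[1:])]

def mode_distance(distances):
    """Most frequent inter-peak distance."""
    return Counter(distances).most_common(1)[0][0]

# Hypothetical peak calls on one chromosome
distances = adjacent_peak_distances([1000, 1185, 1370, 1560, 1745])
mode = mode_distance(distances)
```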
  • FIG. 36 shows aggregate, adjusted windowed protection scores (WPS; 120 bp window) around 22,626 transcription start sites (TSS). TSS are aligned at the 0 position after adjusting for strand and direction of transcription. Aggregate WPS is tabulated for both real data and simulated data by summing per-TSS WPS at each position relative to the centered TSS. The values plotted represent the difference between the real and simulated aggregate WPS, further adjusted to local background as described in greater detail below. Higher WPS values indicate preferential protection from cleavage.
  • WPS: windowed protection score
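The aggregation described for FIG. 36 sums per-site WPS at each offset from a strand-oriented site and then subtracts the matching simulated aggregate. The sketch below covers only those two steps and omits the further local-background adjustment mentioned in the figure description. The function names, the dictionary representation of WPS, and the toy values are illustrative assumptions.

```python
def aggregate_around_sites(wps, sites, flank=3):
    """Sum per-site scores at each offset relative to strand-oriented sites.

    wps: dict position -> score (missing positions count as 0)
    sites: iterable of (position, strand), strand "+" or "-"
    """
    total = [0.0] * (2 * flank + 1)
    for pos, strand in sites:
        sign = 1 if strand == "+" else -1   # flip offsets for minus-strand sites
        for offset in range(-flank, flank + 1):
            total[offset + flank] += wps.get(pos + sign * offset, 0.0)
    return total

def adjusted_aggregate(real, simulated):
    """Difference between real and simulated aggregate scores, per offset."""
    return [r - s for r, s in zip(real, simulated)]

# Hypothetical per-base WPS near one plus-strand and one minus-strand site
wps = {99: 2, 100: 5, 101: 3, 199: 4, 200: 6, 201: 2}
real = aggregate_around_sites(wps, [(100, "+"), (200, "-")], flank=1)
adjusted = adjusted_aggregate(real, [1.0, 1.0, 1.0])
```

For the minus-strand site the upstream offset reads from the higher coordinate, so both sites contribute to the same side of the aligned profile.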
  • FIG. 37 shows aggregate, adjusted WPS around 22,626 start codons.
  • FIG. 38 shows aggregate, adjusted WPS around 224,910 splice donor sites.
  • FIG. 39 shows aggregate, adjusted WPS around 224,910 splice acceptor sites.
  • FIG. 40 shows aggregate, adjusted WPS around various genic features with data from CH01, including for real data, matched simulation, and their difference.
  • FIG. 41 shows nucleosome spacing in A/B compartments. Median nucleosome spacing in non-overlapping 100 kilobase (kb) bins, each containing ~500 nucleosome calls, is calculated genome-wide. A/B compartment predictions for GM12878, also with 100 kb resolution, are from published sources. Compartment A is associated with open chromatin and compartment B with closed chromatin.
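Median nucleosome spacing per 100 kb bin can be sketched as below. The text does not say how a pair of adjacent calls that straddles a bin boundary is assigned, so this sketch assigns each distance to the bin of the upstream call, which is an assumption. The function name `median_spacing_by_bin` and the toy coordinates are illustrative.

```python
from statistics import median

def median_spacing_by_bin(peaks, bin_size=100_000):
    """Median distance between adjacent nucleosome calls, keyed by bin index.

    Each distance is assigned to the bin containing the upstream call (assumed).
    """
    ordered = sorted(peaks)
    by_bin = {}
    for a, b in zip(ordered, ordered[1:]):
        by_bin.setdefault(a // bin_size, []).append(b - a)
    return {bin_index: median(d) for bin_index, d in sorted(by_bin.items())}

# Hypothetical nucleosome calls falling in two 100 kb bins
spacing = median_spacing_by_bin([0, 180, 365, 100_010, 100_210, 100_400])
```

Using the median rather than the mean keeps a single long gap, such as the one between the two clusters of calls here, from distorting the bin value.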
  • FIG. 42 shows nucleosome spacing and A/B compartments on chromosomes 7 and 11.
  • A/B segmentation (red and blue bars)
  • chromosomal G-banding (ideograms, gray bars)
  • Median nucleosome spacing (black dots) is calculated in 100 kb bins and plotted above the A/B segmentation.
  • FIG. 43 shows aggregate, adjusted WPS for 93,550 CTCF sites for the long (top) and short (bottom) fractions.
  • FIG. 44 shows a zoomed-in view of the aggregate, adjusted WPS for short fraction cfDNA at CTCF sites.
  • the light red bar (and corresponding shading within the plot) indicates the position of the known 52 bp CTCF binding motif.
  • the dark red subsection of this bar indicates the location of the 17 bp motif used for the FIMO motif search.
  • FIG. 45 shows -1 to +1 nucleosome spacing calculated around CTCF sites derived from clustered FIMO predicted CTCF sites (purely motif-based: 518,632 sites), a subset of these predictions overlapping with ENCODE ChIP-seq peaks (93,530 sites), and a further subset that have been experimentally observed to be active across 19 cell lines (23,723 sites).
  • the least stringent set of CTCF sites are predominantly separated by distances that are approximately the same as the genome-wide average (~190 bp). However, at the highest stringency, most CTCF sites are separated by a much wider distance (~260 bp), consistent with active CTCF binding and repositioning of adjacent nucleosomes.
  • FIGS. 46-48 show CTCF occupancy repositions flanking nucleosomes:
  • FIG. 46 shows inter-peak distances for the three closest upstream and three closest downstream peak calls for 518,632 CTCF binding sites predicted by FIMO.
  • FIG. 47 shows inter-peak distances for the three closest upstream and three closest downstream peak calls for 518,632 CTCF binding sites predicted by FIMO as in FIG. 46 , but where the same set of CTCF sites has been filtered based on overlap with ENCODE ChIP-seq peaks, leaving 93,530 sites.
  • FIG. 48 shows inter-peak distances for the three closest upstream and three closest downstream peak calls for 93,530 CTCF binding sites predicted by FIMO as in FIG. 47 , but where the set of CTCF sites has been filtered based on overlap with the set of active CTCF sites experimentally observed across 19 cell lines, leaving 23,732 sites.
  • FIG. 49 shows, for the subset of putative CTCF sites with flanking nucleosomes spaced widely (230-270 bp), that both the long (top) and short (bottom) fractions exhibit a stronger signal of positioning with increasingly stringent subsets of CTCF sites. See FIG. 45 for key defining colored lines.
  • FIGS. 50-52 show CTCF occupancy repositions flanking nucleosomes:
  • FIG. 50 shows mean short fraction WPS (top panel) and mean long fraction WPS (bottom panel) for the 518,632 sites, partitioned into distance bins denoting the number of base-pairs separating the flanking +1 and -1 nucleosome calls for each site.
  • FIG. 51 shows mean short fraction WPS (top panel) and mean long fraction WPS (bottom panel) for the 518,632 sites of FIG. 50 , but where the same set of CTCF sites has been filtered based on overlap with ENCODE ChIP-seq peaks.
  • FIG. 52 shows mean short fraction WPS (top panel) and mean long fraction WPS (bottom panel) for the sites of FIG. 51 , but where the same set of sites has been further filtered based on overlap with the set of active CTCF sites experimentally observed across 19 cell lines. The key defining the colored lines in FIG. 50 is the same as in FIG. 51 and FIG. 52 .
  • FIGS. 53A-H show footprints of transcription factor binding sites from short and long cfDNA fragments. Clustered FIMO binding site predictions were intersected with ENCODE ChIP-seq data to obtain a confident set of transcription factor (TF) binding sites for a set of additional factors. Aggregate, adjusted WPS for regions flanking the resulting sets of TF binding sites is displayed for both the long and short fractions of cfDNA fragments. Higher WPS values indicate higher likelihood of nucleosome or TF occupancy, respectively.
  • FIG. 53A AP-2;
  • FIG. 53B E2F-2;
  • FIG. 53C EBOX-TF;
  • FIG. 53D IRF;
  • FIG. 53E MYC-MAX;
  • FIG. 53F PAX5-2;
  • FIG. 53G RUNX-AML;
  • FIG. 53H YY1.
  • FIG. 54 shows aggregate, adjusted WPS for transcription factor ETS (210,798 sites). WPS calculated from both long (top) and short (bottom) cfDNA fractions are shown. Signal consistent with TF protection at the binding site itself (short fraction) with organization of the surrounding nucleosomes (long fraction) is observed. Similar analyses for additional TFs are shown in FIGS. 53A-H .
  • FIG. 55 shows aggregate, adjusted WPS for transcription factor MAFK (32,159 sites). WPS calculated from both long (top) and short (bottom) cfDNA fractions are shown. Signal consistent with TF protection at the binding site itself (short fraction) with organization of the surrounding nucleosomes (long fraction) is observed. Similar analyses for additional TFs are shown in FIGS. 53A-H .
  • FIG. 56 shows the inference of mixtures of cell-types contributing to cell-free DNA based on DNase hypersensitivity (DHS) sites.
  • FIG. 57 shows how partitioning of adjusted WPS scores around transcriptional start sites (TSS) into five gene expression bins (quintiles) defined for NB-4 (an acute promyelocytic leukemia cell line) reveals differences in the spacing and placement of nucleosomes. Highly expressed genes show a strong phasing of nucleosomes within the transcript body. Upstream of the TSS, -1 nucleosomes are well-positioned across expression bins, but -2 and -3 nucleosomes are only well-positioned for medium to highly expressed genes.
  • FIG. 58 shows that, for medium to highly expressed genes, a short fragment peak is observed between the TSS and the -1 nucleosome, consistent with footprinting of the transcription preinitiation complex, or some component thereof, at transcriptionally active genes.
  • FIG. 60 shows how, to deconvolve multiple contributions, fast Fourier transformation (FFT) was used to quantify the abundance of specific frequency contributions (intensities) in the long fragment WPS for the first 10 kb of gene bodies starting at each TSS. Shown are trajectories of correlation between RNA expression in 76 cell lines and primary tissues with these intensities at different frequencies. Marked with a bold black line is the NB-4 cell line. Correlations are strongest in magnitude for intensities in the 193-199 bp frequency range.
  • FIG. 61 shows the inference of cell-types contributing to cell-free DNA in healthy states and cancer.
  • the top panel shows the ranks of correlation for 76 RNA expression datasets with average intensity in the 193-199 bp frequency range for various cfDNA libraries, categorized by type and listed from highest rank (top rows) to lowest rank (bottom rows). Correlation values and full cell line or tissue names are provided in Table 3. All of the strongest correlations for all three healthy samples (BH01, IH01 and IH02; first three columns) are with lymphoid and myeloid cell lines as well as bone marrow.
  • cfDNA samples obtained from stage IV cancer patients show top correlations with various cancer cell lines, e.g. IC17 (hepatocellular carcinoma, HCC) showing highest correlations with HepG2 (hepatocellular carcinoma cell line), and IC35 (breast ductal carcinoma, DC) with MCF7 (metastatic breast adenocarcinoma cell line).
  • small cell lung carcinoma (SCLC).
  • SCLC-21H: small cell lung carcinoma cell line.
  • IC20: small cell lung carcinoma (SCC).
  • SK-BR-3: metastatic breast adenocarcinoma cell line.
  • IC37: colorectal adenocarcinoma (AC).
  • FIG. 62 shows quantitation of aneuploidy to select samples with high burden of circulating tumor DNA, based on coverage ( FIG. 62A ) or allele balance ( FIG. 62B ).
  • FIG. 62A shows the sums of Z scores for each chromosome calculated based on observed vs. expected numbers of sequencing reads for each sample (black dots) compared to simulated samples that assume no aneuploidy (red dots).
  • FIG. 62B shows the allele balance at each of 48,800 common SNPs, evaluated per chromosome, for a subset of samples that were selected for additional sequencing.
  • FIG. 63 shows a comparison of peak calls to published nucleosome call sets:
  • FIG. 63A shows the distance between nucleosome peak calls across three published data sets (Gaffney et al. 2012, J. S. Pedersen et al. 2014, and A Schep et al. 2015) as well as the calls generated here, including the matched simulation of CA01.
  • Previously published data sets do not show one defined mode at the canonical ~185 bp nucleosome distance, probably due to their sparse sampling or wide call ranges.
  • all the nucleosome calls from cfDNA show one well-defined mode.
  • the matched simulated data set has a shorter mode (166 bp) and a wider distribution.
  • FIG. 63B shows the number of nucleosomes for each of the same list of sets as FIG. 63A .
  • the cfDNA nucleosome calls present the most comprehensive call set with nearly 13M nucleosome peak calls.
  • FIG. 63C shows the distances between each peak call in the IH01 cfDNA sample and the nearest peak call from three previously published data sets.
  • FIG. 63D shows the distances between each peak call in the IH02 cfDNA sample and the nearest peak call from three previously published data sets.
  • FIG. 63E shows the distances between each peak call in the BH01 cfDNA sample and the nearest peak call from three previously published data sets.
  • FIG. 63F shows the distances between each peak call in the CH01 cfDNA sample and the nearest peak call from three previously published data sets.
  • FIG. 63G shows the distances between each peak call in the CA01 cfDNA sample and the nearest peak call from three previously published data sets. Negative numbers indicate the nearest peak is upstream; positive numbers indicate the nearest peak is downstream. With increased cfDNA coverage, a higher proportion of previously published calls are found in closer proximity to the determined nucleosome call. Highest concordance was found with calls generated by Gaffney et al., PLoS Genet ., vol.
  • FIG. 63H shows the distances between each peak call and the nearest peak call from three previously published data sets, but this time for the matched simulation of CA01.
  • the closest real nucleosome positions tend to be away from the peaks called in the simulation for the Gaffney et al., PLoS Genet., vol. 8, e1003036 (2012) and JS Pedersen et al., Genome Research, vol. 24, pp. 454-466 (2014) calls.
  • Calls generated by A Schep et al. (2015) seem to show some overlap with the simulated calls.
  • the present disclosure provides methods of determining one or more tissues and/or cell-types giving rise to cell-free DNA in a subject's biological sample. In some embodiments, the present disclosure provides a method of identifying a disease or disorder in a subject as a function of one or more determined tissues and/or cell-types associated with cfDNA in a biological sample from the subject.
  • the present disclosure is based on a prediction that cfDNA molecules originating from different cell types or tissues differ with respect to: (a) the distribution of likelihoods any specific base-pair in a human genome will appear at a terminus of a cfDNA fragment (i.e. points of fragmentation); (b) the distribution of likelihoods that any pair of base-pairs of a human genome will appear as a pair of termini of a cfDNA fragment (i.e. consecutive pairs of fragmentation points that give rise to an individual cfDNA molecule); and (c) the distribution of likelihoods that any specific base-pair in a human genome will appear in a cfDNA fragment (i.e. relative coverage) as a consequence of differential nucleosome occupancy.
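The three distributions (a), (b) and (c) can be illustrated with a small tabulation over aligned fragments. The following is a minimal sketch only: the function name `fragment_distributions`, the 0-based inclusive coordinate convention, and the toy fragment list are assumptions made for illustration, and the counts would still need normalization to be read as likelihoods.

```python
from collections import Counter

import numpy as np


def fragment_distributions(fragments, region_length):
    """Tabulate raw counts behind distributions (a), (b) and (c).

    fragments: iterable of (start, end) pairs, 0-based inclusive
               coordinates on a single toy reference region (assumption).
    region_length: length of that reference region in base-pairs.
    """
    endpoint_counts = np.zeros(region_length, dtype=np.int64)  # (a) termini
    pair_counts = Counter()                                     # (b) termini pairs
    diff = np.zeros(region_length + 1, dtype=np.int64)          # helper for (c)

    for start, end in fragments:
        endpoint_counts[start] += 1
        endpoint_counts[end] += 1
        pair_counts[(start, end)] += 1
        # Difference array: +1 where a fragment begins, -1 just past its end.
        diff[start] += 1
        diff[end + 1] -= 1

    coverage = np.cumsum(diff)[:region_length]                  # (c) coverage
    return endpoint_counts, pair_counts, coverage


# Hypothetical fragments on a 400 bp toy region.
frags = [(10, 176), (10, 176), (12, 180), (200, 366)]
ends, pairs, cov = fragment_distributions(frags, 400)
```

Dividing each array by its total would give empirical frequencies that could then be compared between samples or against reference data.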
  • nucleosome maps might also be measured through the sequencing of fragments derived from the fragmentation of chromatin with an enzyme such as micrococcal nuclease (MNase), DNase, or transposase, or equivalent procedures that preferentially fragment genomic DNA between or at the boundaries of nucleosomes or chromatosomes.
  • In healthy individuals, cfDNA overwhelmingly derives from apoptosis of blood cells, i.e. cells of the hematopoietic lineage. As these cells undergo programmed cell death, their genomic DNA is cleaved and released into circulation, where it continues to be degraded by nucleases.
  • the length distribution of cfDNA oscillates with a period of approximately 10.5 base-pairs (bp), corresponding to the helical pitch of DNA coiled around the nucleosome, and has a marked peak around 167 bp, corresponding to the length of DNA associated with a linker-associated mononucleosome ( FIG. 2 ).
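One way to check for such a periodicity in a fragment length histogram is to look for the strongest Fourier component within a plausible range of periods. The sketch below is an illustration under assumptions: `dominant_period` is a hypothetical helper, the 5 to 20 bp search range is arbitrary, and the input is a synthetic histogram (a flat baseline plus a 10.5 bp ripple), not data from the disclosure.

```python
import numpy as np


def dominant_period(length_counts, min_period=5.0, max_period=20.0):
    """Return the period (in bp) with the largest FFT magnitude in a range."""
    counts = np.asarray(length_counts, dtype=float)
    detrended = counts - counts.mean()           # remove the constant offset
    magnitudes = np.abs(np.fft.rfft(detrended))
    freqs = np.fft.rfftfreq(counts.size, d=1.0)  # cycles per bp
    with np.errstate(divide="ignore"):
        periods = 1.0 / freqs                    # freqs[0] == 0 gives inf
    in_range = (periods >= min_period) & (periods <= max_period)
    best = np.argmax(np.where(in_range, magnitudes, -np.inf))
    return periods[best]


# Synthetic histogram over lengths 0..209: flat baseline plus 10.5 bp ripple.
lengths = np.arange(210)
toy_counts = 1000 + 50 * np.cos(2 * np.pi * lengths / 10.5)
period = dominant_period(toy_counts)
```

On real length histograms the broad peak near 167 bp would dominate low frequencies, so some form of smoothing or detrending beyond mean subtraction would likely be needed before the ripple stands out.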
  • the present disclosure defines a nucleosome map as the measurement of distributions (a), (b) and/or (c) by library construction and massively parallel sequencing of either cfDNA from a bodily fluid or DNA derived from the fragmentation of chromatin with an enzyme such as micrococcal nuclease (MNase), DNase, or transposase, or equivalent procedures that preferentially fragment genomic DNA between or at the boundaries of nucleosomes or chromatosomes.
  • For example, one could aggregate or summarize signal in the vicinity of tissue-specific DNase I hypersensitive sites.
  • the present disclosure provides a dense, genome-wide map of in vivo nucleosome protection inferred from plasma-borne cfDNA fragments.
  • the CH01 map, derived from cfDNA of healthy individuals, comprises nearly 13M uniformly spaced local maxima of nucleosome protection that span the vast majority of the mappable human reference genome. Although the number of peaks is essentially saturated in CH01, other metrics of quality continued to be a function of sequencing depth ( FIGS. 33A-B ).
  • the dense, genome-wide map of nucleosome protection disclosed herein approaches saturation of the mappable portion of the human reference genome, with peak-to-peak spacing that is considerably more uniform and consistent with the expected nucleosome repeat length than previous efforts to generate human genome-wide maps of nucleosome positioning or protection ( FIGS. 63A-H ).
  • the fragments observed herein are generated by endogenous physiological processes, and are therefore less likely to be subject to the technical variation associated with in vitro micrococcal nuclease digestion.
  • the cell types that give rise to cfDNA considered in this reference map are inevitably heterogeneous (e.g. a mixture of lymphoid and myeloid cell types in healthy individuals). Nonetheless, the map's relative completeness may facilitate a deeper understanding of the processes that dictate nucleosome positioning and spacing in human cells, as well as the interplay of nucleosomes with epigenetic regulation, transcriptional output, and nuclear architecture.
  • the present technology may be used to determine (e.g., predict) the tissue(s) and/or cell type(s) which contribute to the cfDNA in a subject's biological sample.
  • the present disclosure provides a method of determining tissues and/or cell-types giving rise to cell-free DNA (cfDNA) in a subject, the method comprising isolating cfDNA from a biological sample from the subject, the isolated cfDNA comprising a plurality of cfDNA fragments; determining a sequence associated with at least a portion of the plurality of cfDNA fragments; determining a genomic location within a reference genome for at least some cfDNA fragment endpoints of the plurality of cfDNA fragments as a function of the cfDNA fragment sequences; and determining at least some of the tissues and/or cell types giving rise to the cfDNA fragments as a function of the genomic locations of at least some of the cfDNA fragment endpoints.
  • the biological sample comprises, consists essentially of, or consists of whole blood, peripheral blood plasma, urine, or cerebral spinal fluid.
  • the step of determining at least some of the tissues and/or cell-types giving rise to the cfDNA fragments comprises comparing the genomic locations of at least some of the cfDNA fragment endpoints, or mathematical transformations of their distribution, to one or more reference maps.
  • the term “reference map” refers to any type or form of data which can be correlated or compared to an attribute of the cfDNA in the subject's biological sample as a function of the coordinate within the genome to which cfDNA sequences are aligned (e.g., the reference genome).
  • the reference map may be correlated or compared to an attribute of the cfDNA in the subject's biological sample by any suitable means.
  • the correlation or comparison may be accomplished by analyzing frequencies of cfDNA endpoints, either directly or after performing a mathematical transformation on their distribution across windows within the reference genome, in the subject's biological sample in view of numerical values or any other states defined for equivalent coordinates of the reference genome by the reference map.
  • the correlation or comparison may be accomplished by analyzing the determined nucleosome spacing(s) based on the cfDNA of the subject's biological sample in view of the determined nucleosome spacing(s), or another property that correlates with nucleosome spacing(s), in the reference map.
  • the reference map(s) may be sourced or derived from any suitable data source including, for example, public databases of genomic information, published data, or data generated for a specific population of reference subjects which may each have a common attribute (e.g., disease status).
  • the reference map comprises a DNase I hypersensitivity dataset.
  • the reference map comprises an RNA expression dataset.
  • the reference map comprises a chromosome conformation map.
  • the reference map comprises a chromatin accessibility map.
  • the reference map comprises data that is generated from at least one tissue or cell-type that is associated with a disease or a disorder.
  • the reference map comprises positions of nucleosomes and/or chromatosomes in a tissue or cell type.
  • the reference map is generated by a procedure that includes digesting chromatin with an exogenous nuclease (e.g., micrococcal nuclease).
  • the reference map comprises chromatin accessibility data determined by a transposition-based method (e.g., ATAC-seq).
  • the reference map comprises data associated with positions of a DNA binding and/or DNA occupying protein for a tissue or cell type.
  • the DNA binding and/or DNA occupying protein is a transcription factor.
  • the positions are determined by a procedure that includes chromatin immunoprecipitation of a crosslinked DNA-protein complex. In some embodiments, the positions are determined by a procedure that includes treating DNA associated with the tissue or cell type with a nuclease (e.g., DNase-I).
  • the reference map is generated by sequencing of cfDNA fragments from a biological sample from one or more individuals with a known disease. In some embodiments, this biological sample from which the reference map is generated is collected from an animal to which human cells or tissues have been xenografted.
  • the reference map comprises a biological feature corresponding to positions of a DNA binding or DNA occupying protein for a tissue or cell type. In some embodiments, the reference map comprises a biological feature corresponding to quantitative RNA expression of one or more genes. In some embodiments, the reference map comprises a biological feature corresponding to the presence or absence of one or more histone marks. In some embodiments, the reference map comprises a biological feature corresponding to hypersensitivity to nuclease cleavage.
  • the step of comparing the genomic locations of at least some of the cfDNA fragment endpoints to one or more reference maps may be accomplished in a variety of ways.
  • the cfDNA data generated from the biological sample (e.g., the genomic locations of the cfDNA fragments, their endpoints, the frequencies of their endpoints, and/or nucleosome spacing(s) inferred from their distribution) is compared to one or more reference maps.
  • the tissues or cell-types associated with the reference maps which correlate most highly with the cfDNA data in the biological sample are deemed to be contributing.
  • the reference map(s) having the most similar list of cfDNA endpoints and their locations within the reference genome may be deemed to be contributing.
  • the reference map(s) having the most correlation (or increased correlation, relative to cfDNA from a healthy subject) with a mathematical transformation of the distribution of cfDNA fragment endpoints from the biological sample may be deemed to be contributing.
  • the tissue types and/or cell types which correspond to those reference maps deemed to be contributing are then considered as potential sources of the cfDNA isolated from the biological sample.
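The comparison described above can be sketched as ranking reference maps by how strongly each correlates with a per-region signal derived from the sample. This is an illustrative sketch only: the function name `rank_reference_maps`, the use of Pearson correlation, the choice to rank by highest positive correlation, and the two toy reference maps are assumptions, and the appropriate sign convention depends on the feature being compared.

```python
import numpy as np


def rank_reference_maps(sample_signal, reference_maps):
    """Rank reference maps by Pearson correlation with a sample signal.

    sample_signal: one value per genomic region derived from the cfDNA.
    reference_maps: dict mapping a tissue/cell-type name to one value per
                    region, in the same region order as the sample.
    """
    sample = np.asarray(sample_signal, dtype=float)
    scored = []
    for name, values in reference_maps.items():
        r = np.corrcoef(sample, np.asarray(values, dtype=float))[0, 1]
        scored.append((name, float(r)))
    # Highest correlation first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


sample = [5.0, 3.0, 8.0, 1.0, 6.0]
refs = {
    "tissue_A": [4.0, 3.5, 7.0, 2.0, 5.0],  # hypothetical: tracks the sample
    "tissue_B": [1.0, 6.0, 2.0, 7.0, 3.0],  # hypothetical: anti-correlated
}
ranking = rank_reference_maps(sample, refs)
```

A variant could instead rank by the change in correlation relative to cfDNA from a healthy subject, as the preceding text also contemplates.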
  • the step of determining at least some of the tissues and/or cell types giving rise to the cfDNA fragments comprises performing a mathematical transformation on a distribution of the genomic locations of at least some of the cfDNA fragment endpoints.
  • a mathematical transformation suitable for use in connection with the present technology is a Fourier transformation, such as a fast Fourier transformation (“FFT”).
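As an illustration of how a Fourier transformation might summarize nucleosome spacing, the sketch below computes the mean FFT magnitude of a per-base signal for components whose period falls in a chosen band. The helper name `band_intensity`, the mean-magnitude summary, and the synthetic sinusoidal track are assumptions for illustration; the 193-199 bp band and the 10 kb track length echo the figure descriptions but the code is not presented as the method actually used.

```python
import numpy as np


def band_intensity(signal, min_period, max_period):
    """Mean FFT magnitude for components whose period (bp) lies in a band."""
    values = np.asarray(signal, dtype=float)
    magnitudes = np.abs(np.fft.rfft(values - values.mean()))
    freqs = np.fft.rfftfreq(values.size, d=1.0)  # cycles per bp
    periods = np.full(freqs.shape, np.inf)
    periods[1:] = 1.0 / freqs[1:]                # skip the zero frequency
    mask = (periods >= min_period) & (periods <= max_period)
    if not mask.any():
        raise ValueError("no FFT bin falls inside the requested period band")
    return float(magnitudes[mask].mean())


# Synthetic 10 kb track with a 196 bp repeat (toy stand-in for a score track).
positions = np.arange(10_000)
toy_track = np.sin(2 * np.pi * positions / 196.0)
inside = band_intensity(toy_track, 193, 199)
outside = band_intensity(toy_track, 140, 160)
```

With a 10 kb track the frequency resolution is coarse, so a narrow period band may contain only one or two FFT bins; longer tracks or averaging over many regions would give a more stable estimate.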
  • the method further comprises determining a score for each of at least some coordinates of the reference genome, wherein the score is determined as a function of at least the plurality of cfDNA fragment endpoints and their genomic locations, and wherein the step of determining at least some of the tissues and/or cell types giving rise to the observed cfDNA fragments comprises comparing the scores to one or more reference maps.
  • the score may be any metric (e.g., a numerical ranking or probability) which may be used to assign relative or absolute values to a coordinate of the reference genome.
  • the score may consist of, or be related to, a probability, such as a probability that the coordinate represents a location of a cfDNA fragment endpoint, or a probability that the coordinate represents a location of the genome that is preferentially protected from nuclease cleavage by nucleosome or protein binding.
  • the score may relate to nucleosome spacing in particular regions of the genome, as determined by a mathematical transformation of the distribution of cfDNA fragment endpoints within that region.
  • scores may be assigned to the coordinate by any suitable means including, for example, by counting absolute or relative events (e.g., the number of cfDNA fragment endpoints) associated with that particular coordinate, or performing a mathematical transformation on the values of such counts in the region or a genomic coordinate.
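One possible per-coordinate score of this kind counts fragments that fully span a window centered on the coordinate and subtracts fragments with an endpoint inside that window. The sketch below is modeled loosely on the windowed protection score named in the figure descriptions, but the exact definition used here (inclusive bounds at center ± window/2, each fragment counted at most once per window) and the function name `windowed_score` are assumptions, and the nested loop is written for clarity on toy data rather than for genome-scale use.

```python
import numpy as np


def windowed_score(fragments, region_length, window=120):
    """Per-coordinate score: spanning fragments minus in-window endpoints.

    fragments: iterable of (start, end) pairs, 0-based inclusive (assumption).
    """
    half = window // 2
    scores = np.zeros(region_length, dtype=np.int64)
    for center in range(region_length):
        lo, hi = center - half, center + half    # inclusive window bounds
        for start, end in fragments:
            if start <= lo and end >= hi:
                scores[center] += 1              # fragment spans the window
            elif lo <= start <= hi or lo <= end <= hi:
                scores[center] -= 1              # an endpoint is in the window
    return scores


# Two hypothetical, nearly coincident fragments on a 400 bp toy region.
fragments = [(100, 266), (102, 268)]
scores = windowed_score(fragments, 400)
```

Positions near the middle of the two fragments receive positive scores, while positions near their shared boundaries receive negative scores, which is the qualitative behavior expected of a protection-style metric.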
  • the score for a coordinate is related to the probability that the coordinate is a location of a cfDNA fragment endpoint. In other embodiments, the score for a coordinate is related to the probability that the coordinate represents a location of the genome that is preferentially protected from nuclease cleavage by nucleosome or protein binding. In some embodiments, the score is related to nucleosome spacing in the genomic region of the coordinate.
  • tissue(s) and/or cell-type(s) referred to in the methods described herein may be any tissue or cell-type which gives rise to cfDNA.
  • the tissue or cell-type is a primary tissue from a subject having a disease or disorder.
  • the disease or disorder is selected from the group consisting of: cancer, normal pregnancy, a complication of pregnancy (e.g., aneuploid pregnancy), myocardial infarction, inflammatory bowel disease, systemic autoimmune disease, localized autoimmune disease, allotransplantation with rejection, allotransplantation without rejection, stroke, and localized tissue damage.
  • the tissue or cell type is a primary tissue from a healthy subject.
  • the tissue or cell type is an immortalized cell line.
  • the tissue or cell type is a biopsy from a tumor.
  • the reference map is based on sequence data obtained from samples obtained from at least one reference subject.
  • this sequence data defines positions of cfDNA fragment endpoints within a reference genome—for example, if the reference map is generated by sequencing of cfDNA from subject(s) with known disease.
  • this sequence data on which the reference map is based may comprise any one or more of: a DNase I hypersensitive site dataset, an RNA expression dataset, a chromosome conformation map, or a chromatin accessibility map, or nucleosome positioning map generated by digestion of chromatin with micrococcal nuclease.
  • the reference subject is healthy.
  • the reference subject has a disease or disorder, optionally selected from the group consisting of: cancer, normal pregnancy, a complication of pregnancy (e.g., aneuploid pregnancy), myocardial infarction, inflammatory bowel disease, systemic autoimmune disease, localized autoimmune disease, allotransplantation with rejection, allotransplantation without rejection, stroke, and localized tissue damage.
  • the reference map comprises scores for at least a portion of coordinates of the reference genome associated with the tissue or cell type. In some embodiments, the reference map comprises a mathematical transformation of the scores, such as a Fourier transformation of the scores. In some embodiments, the scores are based on annotations of reference genomic coordinates for the tissue or cell type. In some embodiments, the scores are based on positions of nucleosomes and/or chromatosomes. In some embodiments, the scores are based on transcription start sites and/or transcription end sites. In some embodiments, the scores are based on predicted binding sites of at least one transcription factor. In some embodiments, the scores are based on predicted nuclease hypersensitive sites. In some embodiments, the scores are based on predicted nucleosome spacing.
  • the scores are associated with at least one orthogonal biological feature.
  • the orthogonal biological feature is associated with highly expressed genes. In some embodiments, the orthogonal biological feature is associated with lowly expressed genes.
  • the threshold value is determined before determining the tissue(s) and/or the cell type(s) giving rise to the cfDNA. In other embodiments, the threshold value is determined after determining the tissue(s) and/or the cell type(s) giving rise to the cfDNA.
  • the step of determining the tissues and/or cell types giving rise to the cfDNA as a function of a plurality of the genomic locations of at least some of the cfDNA fragment endpoints comprises comparing a mathematical transformation of the distribution of the genomic locations of at least some of the cfDNA fragment endpoints of the sample with one or more features of one or more reference maps.
  • a mathematical transformation suitable for this purpose is a Fourier transformation, such as a fast Fourier transformation (“FFT”).
  • the method may further comprise generating a report comprising a list of the determined tissues and/or cell-types giving rise to the isolated cfDNA.
  • the report may optionally further include any other information about the sample and/or the subject, the type of biological sample, the date the biological sample was obtained from the subject, the date the cfDNA isolation step was performed and/or tissue(s) and/or cell-type(s) which likely did not give rise to any cfDNA isolated from the biological sample.
  • the report further includes a recommended treatment protocol including, for example and without limitation, a suggestion to obtain an additional diagnostic test from the subject, a suggestion to begin a therapeutic regimen, a suggestion to modify an existing therapeutic regimen with the subject, and/or a suggestion to suspend or stop an existing therapeutic regimen.
  • the present technology may be used to determine (e.g., predict) a disease or disorder, or the absence of a disease or a disorder, based at least in part on the tissue(s) and/or cell type(s) which contribute to cfDNA in a subject's biological sample.
  • the present disclosure provides a method of identifying a disease or disorder in a subject, the method comprising isolating cell free DNA (cfDNA) from a biological sample from the subject, the isolated cfDNA comprising a plurality of cfDNA fragments; determining a sequence associated with at least a portion of the plurality of cfDNA fragments; determining a genomic location within a reference genome for at least some cfDNA fragment endpoints of the plurality of cfDNA fragments as a function of the cfDNA fragment sequences; determining at least some of the tissues and/or cell types giving rise to the cfDNA as a function of the genomic locations of at least some of the cfDNA fragment endpoints; and identifying the disease or disorder as a function of the determined tissues and/or cell types giving rise to the cfDNA.
  • the biological sample comprises, consists essentially of, or consists of whole blood, peripheral blood plasma, urine, or cerebral spinal fluid.
  • the step of determining the tissues and/or cell-types giving rise to the cfDNA comprises comparing the genomic locations of at least some of the cfDNA fragment endpoints, or mathematical transformations of their distribution, to one or more reference maps.
  • the term “reference map” as used in connection with these embodiments may have the same meaning described above with respect to methods of determining tissue(s) and/or cell type(s) giving rise to cfDNA in a subject's biological sample.
  • the reference map may comprise any one or more of: a DNase I hypersensitive site dataset, an RNA expression dataset, a chromosome conformation map, a chromatin accessibility map, sequence data that is generated from samples obtained from at least one reference subject, enzyme-mediated fragmentation data corresponding to at least one tissue that is associated with a disease or a disorder, and/or positions of nucleosomes and/or chromatosomes in a tissue or cell type.
  • the reference map is generated by sequencing of cfDNA fragments from a biological sample from one or more individuals with a known disease. In some embodiments, this biological sample from which the reference map is generated is collected from an animal to which human cells or tissues have been xenografted.
  • the reference map is generated by digesting chromatin with an exogenous nuclease (e.g., micrococcal nuclease).
  • the reference maps comprise chromatin accessibility data determined by a transposition-based method (e.g., ATAC-seq).
  • the reference maps comprise data associated with positions of a DNA binding and/or DNA occupying protein for a tissue or cell type.
  • the DNA binding and/or DNA occupying protein is a transcription factor.
  • the positions are determined by chromatin immunoprecipitation of a crosslinked DNA-protein complex.
  • the positions are determined by treating DNA associated with the tissue or cell type with a nuclease (e.g., DNase-I).
  • the reference map comprises a biological feature corresponding to positions of a DNA binding or DNA occupying protein for a tissue or cell type. In some embodiments, the reference map comprises a biological feature corresponding to quantitative expression of one or more genes. In some embodiments, the reference map comprises a biological feature corresponding to the presence or absence of one or more histone marks. In some embodiments, the reference map comprises a biological feature corresponding to hypersensitivity to nuclease cleavage.
  • the step of determining the tissues and/or cell types giving rise to the cfDNA comprises performing a mathematical transformation on a distribution of the genomic locations of at least some of the plurality of the cfDNA fragment endpoints.
  • the mathematical transformation includes a Fourier transformation.
  • the method further comprises determining a score for each of at least some coordinates of the reference genome, wherein the score is determined as a function of at least the plurality of cfDNA fragment endpoints and their genomic locations, and wherein the step of determining at least some of the tissues and/or cell types giving rise to the observed cfDNA fragments comprises comparing the scores to one or more reference maps.
  • the score may be any metric (e.g., a numerical ranking or probability) which may be used to assign relative or absolute values to a coordinate of the reference genome.
  • the score may consist of, or be related to a probability, such as a probability that the coordinate represents a location of a cfDNA fragment endpoint, or a probability that the coordinate represents a location of the genome that is preferentially protected from nuclease cleavage by nucleosome or protein binding.
  • the score may relate to nucleosome spacing in particular regions of the genome, as determined by a mathematical transformation of the distribution of cfDNA fragment endpoints within that region.
  • scores may be assigned to the coordinate by any suitable means including, for example, by counting absolute or relative events (e.g., the number of cfDNA fragment endpoints) associated with that particular coordinate, or performing a mathematical transformation on the values of such counts in the region or a genomic coordinate.
  • the score for a coordinate is related to the probability that the coordinate is a location of a cfDNA fragment endpoint. In other embodiments, the score for a coordinate is related to the probability that the coordinate represents a location of the genome that is preferentially protected from nuclease cleavage by nucleosome or protein binding. In some embodiments, the score is related to nucleosome spacing in the genomic region of the coordinate.
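One of the simplest scores described above is a count of cfDNA fragment endpoints at each coordinate. The following is a minimal Python sketch of that idea, assuming fragments are supplied as hypothetical (start, end) pairs of inclusive coordinates on a single chromosome. The function name, the input format, and the relative-frequency normalization are illustrative assumptions and are not taken from the disclosure.

```python
from collections import Counter

def endpoint_scores(fragments, relative=False):
    """Count fragment endpoints per coordinate.

    fragments: iterable of (start, end) inclusive coordinates
               (hypothetical input format, one chromosome).
    relative:  if True, divide each count by the total number of endpoints,
               giving a crude empirical endpoint frequency.
    """
    counts = Counter()
    for start, end in fragments:
        counts[start] += 1  # left endpoint of the fragment
        counts[end] += 1    # right endpoint of the fragment
    if not relative:
        return dict(counts)
    total = sum(counts.values())
    return {pos: n / total for pos, n in counts.items()}

# Three toy fragments sharing some endpoints
fragments = [(100, 266), (100, 270), (120, 266)]
scores = endpoint_scores(fragments)
```

A score based on protection from cleavage or on nucleosome spacing would need a further transformation of these counts; this sketch covers only the counting step.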
  • the tissue or cell-type used for generating a reference map is a primary tissue from a subject having a disease or disorder.
  • the disease or disorder is selected from the group consisting of: cancer, normal pregnancy, a complication of pregnancy (e.g., aneuploid pregnancy), myocardial infarction, systemic autoimmune disease, localized autoimmune disease, inflammatory bowel disease, allotransplantation with rejection, allotransplantation without rejection, stroke, and localized tissue damage.
  • the tissue or cell type is a primary tissue from a healthy subject.
  • the tissue or cell type is an immortalized cell line.
  • the tissue or cell type is a biopsy from a tumor.
  • the reference map is based on sequence data obtained from samples obtained from at least one reference subject.
  • this sequence data defines positions of cfDNA fragment endpoints within a reference genome—for example, if the reference map is generated by sequencing of cfDNA from subject(s) with known disease.
  • this sequence data on which the reference map is based may comprise any one or more of: a DNase I hypersensitive site dataset, an RNA expression dataset, a chromosome conformation map, or a chromatin accessibility map, or nucleosome positioning map generated by digestion with micrococcal nuclease.
  • the reference subject is healthy.
  • the reference subject has a disease or disorder.
  • the disease or disorder is selected from the group consisting of: cancer, normal pregnancy, a complication of pregnancy (e.g., aneuploid pregnancy), myocardial infarction, systemic autoimmune disease, inflammatory bowel disease, localized autoimmune disease, allotransplantation with rejection, allotransplantation without rejection, stroke, and localized tissue damage.
  • the reference map comprises cfDNA fragment endpoint probabilities, or a quantity that correlates with such probabilities, for at least a portion of the reference genome associated with the tissue or cell type. In some embodiments, the reference map comprises a mathematical transformation of the cfDNA fragment endpoint probabilities, or a quantity that correlates with such probabilities.
  • the reference map comprises scores for at least a portion of coordinates of the reference genome associated with the tissue or cell type. In some embodiments, the reference map comprises a mathematical transformation of the scores, such as a Fourier transformation of the scores. In some embodiments, the scores are based on annotations of reference genomic coordinates for the tissue or cell type. In some embodiments, the scores are based on positions of nucleosomes and/or chromatosomes. In some embodiments, the scores are based on transcription start sites and/or transcription end sites. In some embodiments, the scores are based on predicted binding sites of at least one transcription factor. In some embodiments, the scores are based on predicted nuclease hypersensitive sites.
  • the scores are associated with at least one orthogonal biological feature.
  • the orthogonal biological feature is associated with highly expressed genes. In some embodiments, the orthogonal biological feature is associated with lowly expressed genes.
  • the threshold value is determined before determining the tissue(s) and/or the cell type(s) giving rise to the cfDNA. In other embodiments, the threshold value is determined after determining the tissue(s) and/or the cell type(s) giving rise to the cfDNA.
  • the step of determining the tissues and/or cell types giving rise to the cfDNA as a function of a plurality of the genomic locations of at least some of the cfDNA fragment endpoints comprises a mathematical transformation of the distribution of the genomic locations of at least some of the cfDNA fragment endpoints of the sample with one or more features of one or more reference maps.
  • this mathematical transformation includes a Fourier transformation.
  • the reference map comprises enzyme-mediated fragmentation data corresponding to at least one tissue that is associated with the disease or disorder.
  • the reference genome is associated with a human.
  • the methods described herein are used for detection, monitoring and tissue(s) and/or cell-type(s)-of-origin assessment of malignancies from analysis of cfDNA in bodily fluids. It is now well documented that in patients with malignancies, a portion of cfDNA in bodily fluids such as circulating plasma can be derived from the tumor. The methods described here can potentially be used to detect and quantify this tumor derived portion. Furthermore, as nucleosome occupancy maps are cell-type specific, the methods described here can potentially be used to determine the tissue(s) and/or cell-type(s)-of-origin of a malignancy.
  • the methods described above may enable cancer detection, monitoring, and/or tissue(s) and/or cell-type(s)-of-origin assignment based on signal from these other tissues rather than the cancer cells per se.
  • the methods described herein are used for detection, monitoring and tissue(s) and/or cell-type(s)-of-origin assessment of tissue damage from analysis of cfDNA in bodily fluids. It is to be expected that many pathological processes will result in a portion of cfDNA in bodily fluids such as circulating plasma deriving from damaged tissues.
  • the methods described here can potentially be used to detect and quantify cfDNA derived from tissue damage, including identifying the relevant tissues and/or cell-types of origin. This may enable diagnosis and/or monitoring of pathological processes including myocardial infarction (acute damage of heart tissue), autoimmune disease (chronic damage of diverse tissues), and many others involving either acute or chronic tissue damage.
  • the methods described herein are used for estimating the fetal fraction of cfDNA in pregnancy and/or enhancing detection of chromosomal or other genetic abnormalities.
  • Relatively shallow sequencing of the maternal plasma-borne DNA fragments, coupled with nucleosome maps described above, may allow a cost-effective and rapid estimation of fetal fraction in both male and female fetus pregnancies.
  • these methods may also enhance the performance of tests directed at detecting chromosomal aberrations (e.g. trisomies) through analysis of cfDNA in maternal bodily fluids.
  • the methods described herein are used for quantifying the contribution of a transplant (autologous or allograft) to cfDNA. Current methods for early and noninvasive detection of acute allograft rejection involve sequencing plasma-borne DNA and identifying increased concentrations of fragments derived from the donor genome. This approach relies on relatively deep sequencing of this pool of fragments to detect, for example, 5-10% donor fractions. An approach based instead on nucleosome maps of the donated organ may enable similar estimates with shallower sequencing, or more sensitive estimates with an equivalent amount of sequencing. Analogous to cancer, it is also possible that cell types other than the transplant itself contribute to cfDNA composition during transplant rejection. To the extent that contributions from such other tissues to cfDNA are consistent between patients during transplant rejection, the methods described above may enable monitoring of transplant rejection based on signal from these other tissues rather than the transplant donor cells per se.
  • the present disclosure also provides methods of diagnosing a disease or disorder using nucleosome reference map(s) generated from subjects having a known disease or disorder.
  • the method comprises: (1) generating a reference set of nucleosome maps, wherein each nucleosome map is derived from either cfDNA from bodily fluids of individual(s) with defined clinical conditions (e.g.
  • STEP 1 Generating a reference set of nucleosome maps, and aggregating or summarizing signal from nucleosome positioning.
  • a preferred method for generating a nucleosome map includes DNA purification, library construction (by adaptor ligation and possibly PCR amplification) and massively parallel sequencing of cfDNA from a bodily fluid.
  • An alternative source for nucleosome maps which are useful in the context of this invention as reference points or for identifying principal components of variation, is DNA derived from digestion of chromatin with micrococcal nuclease (MNase), DNase treatment, ATAC-Seq or other related methods wherein information about nucleosome positioning is captured in distributions (a), (b) or (c). Descriptions of these distributions (a), (b) and (c) are provided above in [0078] and are shown graphically in FIG. 1 .
  • nucleosome occupancy patterns can be summarized or aggregated across continuous or discontinuous regions of the genome.
  • distribution i.e. distribution (a)
  • kbp kilobase-pair
  • In Example 3, we quantify the distribution of sites in the reference human genome to which sequencing read start sites map, i.e. distribution (a), in the immediate vicinity of transcription factor binding sites (TFBS) of a specific transcription factor (TF), which are often immediately flanked by nucleosomes when the TFBS is bound by the TF.
  • nucleosome occupancy includes those generated from cfDNA samples associated with a known disease, as reference maps, i.e. without aggregating signal, for the purposes of comparison to an unknown cfDNA sample.
  • this biological sample from which the reference map of nucleosome occupancy is generated is collected from an animal to which human cells or tissues have been xenografted.
  • sequenced cfDNA fragments mapping to the human genome will exclusively derive from the xenografted cells or tissues, as opposed to representing a mixture of cfDNA derived from the cells/tissues of interest along with hematopoietic lineages.
  • STEP 2 Predicting pathology(s), clinical condition(s) and/or tissue/cell-types-of-origin composition on the basis of comparing the cfDNA-derived nucleosome map of one or more new individuals/samples to the reference set of nucleosome maps either directly or after mathematical transformation of each map.
  • In Examples 1 & 2, we first summarize long-range nucleosome ordering within 10 kbp windows along the genome in a diverse set of samples, and then perform principal components analysis (PCA) to cluster samples (Example 1) or to estimate mixture proportions (Example 2).
  • any one of the samples could in principle have been the “unknown”, with its behavior in the PCA relative to all other nucleosome maps used to predict the presence/absence of a clinical condition or its tissue/cell-type-of-origin.
  • the unknown sample does not necessarily need to be precisely matched to 1+ members of the reference set in a 1:1 manner. Rather, its similarities to each can be quantified (Example 1), or its nucleosome map can be modeled as a non-uniform mixture of 2+ samples from the reference set (Example 2).
  • tissue/cell-type-of-origin composition of cfDNA in each sample need not be predicted or ultimately known for the method of the present invention to be successful. Rather, the method described herein relies on the consistency of tissue/cell-type-of-origin composition of cfDNA in the context of a particular pathology or clinical condition.
  • by surveying the nucleosome maps of a large number of tissues and/or cell types directly by analysis of DNA derived from digestion of chromatin and adding these to the nucleosome map it would be possible to estimate the tissue(s) and/or cell-type(s) contributing to an unknown cfDNA-derived sample.
  • the method may further comprise generating a report comprising a statement identifying the disease or disorder.
  • the report may further comprise a list of the determined tissues and/or cell types giving rise to the isolated cfDNA.
  • the report further comprises a list of diseases and/or disorders which are unlikely to be associated with the subject.
  • the report may optionally further include any other information about the sample and/or the subject, the type of biological sample, the date the biological sample was obtained from the subject, the date the cfDNA isolation step was performed and/or tissue(s) and/or cell type(s) which likely did not give rise to any cfDNA isolated from the biological sample.
  • the report further includes a recommended treatment protocol including, for example and without limitation, a suggestion to obtain an additional diagnostic test from the subject, a suggestion to begin a therapeutic regimen, a suggestion to modify an existing therapeutic regimen with the subject, and/or a suggestion to suspend or stop an existing therapeutic regimen.
  • cfDNA samples human plasma containing contributions from an unknown number of healthy individuals; bulk.cfDNA
  • MC2.cfDNA a cfDNA sample from single healthy male control individual
  • four cfDNA samples from patients with intracranial tumors tumor.2349, tumor.2350, tumor.2351, tumor.2353
  • six MNase digestion experiments from five different human cell lines (Hap1.MNase, HeLa.MNase, HEK.MNase, NA12878.MNase, HeLaS3, MCF.7) and seven cfDNA samples from different pregnant female individuals (gm1matplas, gm2matplas, im1matplas, fgs002, fgs003, fgs004, fgs005) were analyzed and contrasted with regular shotgun sequencing data set of DNA extracted from a
  • Read start coordinates were extracted and periodograms were created using Fast Fourier Transformation (FFT) as described in the Methods section.
  • FIGS. 4 and 5 explore visualizations of the periodogram intensities at 196 bp across contiguous, non-overlapping 10 kbp blocks tiling the full length of human autosomes (see Methods for details).
  • FIG. 4 presents a Principal Component Analysis (PCA) of the data and the projections across the first three components.
  • PC1 Principal component 1
  • PC2 (9.7% of variance) captures the differences between MNase and cfDNA samples.
  • PC3 (6.4% variance) captures differences between individual samples.
  • FIG. 5 shows the hierarchical clustering dendrogram of this data based on Euclidean distances of the intensity vectors. We note that the two HeLa S3 experiments tightly cluster in the PCA and dendrogram, even though data was generated in different labs and following different experimental protocols. “Normal” cfDNA samples, tumor cfDNA samples and groups of cell line MNase samples also clustered.
  • the three tumor samples originating from the same tumor type appear to cluster, separately from tumor.2351 sample which originates from a different tumor type (see Table 1).
  • the GM1 and IM1 samples cluster separately from the other cfDNA samples obtained from pregnant women. This coincides with higher intensities observed for frequencies below the peak in these samples (i.e., a more pronounced left shoulder in FIG. 3 ). This might indicate subtle differences in the preparation of the cfDNA between the two sets of samples, or biological differences which were not controlled for (e.g., gestational age).
  • FIGS. 6 and 7 show the results of equivalent analyses but based on the frequency range of 181 bp to 202 bp. Comparing these plots, the results are largely stable to a wider frequency range; however additional frequencies may improve sensitivity in more fine-scaled analyses.
  • the cfDNA and MNase data sets were analyzed separately using PCA of intensities for this frequency range. In the following set of analyses, the five cfDNA samples from pregnant women, which show the pronounced left shoulder in FIG. 3 , were excluded.
  • FIG. 8 shows the first 7 principal components of the cfDNA data and FIG. 9 all six principal components for the six MNase data sets.
  • In Example 1, basic clustering of samples that were generated or downloaded from public databases was studied. The analyses showed that read start coordinates in these data sets capture a strong signal of nucleosome positioning (across a range of sequencing depths, from 20 million sequences to more than 1,000 million sequences) and that sample origin correlates with this signal. For the goals of this method, it would also be useful to be able to identify mixtures of known cell types and to some extent quantify the contributions of each cell type from this signal. For this purpose, this example explored synthetic mixtures (i.e., based on sequence reads) of two samples.
  • FIG. 10 shows the average intensities for chromosome 11, equivalent to FIG. 3 but for these synthetic mixtures. It can be seen from FIG. 10 how the different sample contributions cause shifts in the global frequency intensity patterns. This signal can be exploited to infer the synthetic mixture proportions.
  • FIG. 11 shows the first two principal components for the MNase data set mixtures and FIG. 12 shows the first two principal components for the cfDNA data set mixtures. In both cases, the first PC directly captures the composition of the mixed data set. It is therefore directly conceivable how mixture proportions for two and possibly more cell types could be estimated from transformation of the frequency intensity data given the appropriate reference sets and using for example regression models.
  • FIG. 13 shows the dendrogram of both data sets, confirming the overall similarities of mixture samples deriving from similar sample proportions as well as the separation of the cfDNA and MNase samples.
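The regression idea mentioned for estimating mixture proportions can be illustrated for the two-sample case. The Python sketch below assumes, purely for illustration, that each sample is summarized as a vector of per-block intensities and that the vector of a mixture is approximately a weighted average of the two reference vectors. The source does not establish this linearity, and the function name and toy numbers are hypothetical.

```python
def mixture_proportion(unknown, ref_a, ref_b):
    """Least-squares estimate of p in unknown ≈ p*ref_a + (1 - p)*ref_b.

    The estimate is clamped to the interval [0, 1].
    All three arguments are equal-length sequences of numbers.
    """
    diff = [a - b for a, b in zip(ref_a, ref_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("reference vectors are identical")
    # Project (unknown - ref_b) onto the direction (ref_a - ref_b)
    numer = sum((x - b) * d for x, b, d in zip(unknown, ref_b, diff))
    return min(1.0, max(0.0, numer / denom))

# Toy reference vectors and a synthetic 30:70 mixture (made-up values)
ref_a = [4.0, 1.0, 3.0, 2.0]
ref_b = [1.0, 5.0, 2.0, 6.0]
mixed = [0.3 * a + 0.7 * b for a, b in zip(ref_a, ref_b)]
p = mixture_proportion(mixed, ref_a, ref_b)
```

For more than two reference samples, a constrained regression (for example non-negative least squares) would be the natural extension.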
  • nucleosome positioning is influenced by nearby TF occupancy.
  • the effect on local remodeling of chromatin, and thus on the stable positioning of nearby nucleosomes, is not uniform across the set of TFs; occupancy of a given TF may have local effects on nucleosome positioning that are preferentially 5′ or 3′ of the binding site and stretch for greater or lesser genomic distance in specific cell types.
  • the set of TF binding sites occupied in vivo in a particular cell varies between tissues and cell types, such that if one were able to identify TF binding site occupancy maps for tissues or cell types of interest, and repeated this process for one or more TFs, one could identify components of the mixture of cell types and tissues contributing to a population of cfDNA by identifying enrichment or depletion of one or more cell type- or tissue-specific TF binding site occupancy profiles.
  • ChIP-seq transcription factor (TF) peaks were obtained from the Encyclopedia of DNA Elements (“ENCODE”) project (National Human Genome Research Institute, National Institutes of Health, Bethesda, Md.). Because the genomic intervals of these peaks are broad (200 to 400 bp on average), the active binding sites within these intervals were discerned by informatically scanning the genome for respective binding motifs with a conservative p-value cutoff (1 × 10⁻⁵, see Methods for details). The intersection of these two independently derived sets of predicted TF binding sites was then carried forward into downstream analysis.
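The intersection of ChIP-seq peaks with motif matches can be sketched as follows. This Python example assumes hypothetical inputs of (chromosome, start, end) tuples with half-open coordinates, and keeps a motif match only if it lies entirely within a peak. Whether the original analysis required full containment or any overlap is not stated, so that criterion is an assumption.

```python
from collections import defaultdict

def motifs_within_peaks(peaks, motifs):
    """Return motif intervals that lie entirely inside at least one peak.

    peaks, motifs: lists of (chrom, start, end) tuples, half-open coordinates
                   (hypothetical input format).
    A simple quadratic scan per chromosome; adequate for a sketch only.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    kept = []
    for chrom, m_start, m_end in motifs:
        if any(p_start <= m_start and m_end <= p_end
               for p_start, p_end in by_chrom.get(chrom, [])):
            kept.append((chrom, m_start, m_end))
    return kept

# Toy data: two peaks and four motif matches (made-up coordinates)
peaks = [("chr1", 100, 400), ("chr2", 50, 300)]
motifs = [("chr1", 150, 169), ("chr1", 390, 409),
          ("chr2", 60, 79), ("chr3", 10, 29)]
candidate_sites = motifs_within_peaks(peaks, motifs)
```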
  • the number of read-starts at each position within 500 bp of each candidate TF binding site was calculated in samples with at least 100 million sequences. Within each sample, all read-starts were summed at each position, yielding a total of 1,014 to 1,019 positions per sample per TF, depending on the length of the TF recognition sequence.
  • FIG. 14 shows the distribution of read-starts around 24,666 CTCF binding sites in the human genome in a variety of different samples, centered around the binding site itself.
  • CTCF is an insulator binding protein and plays a major role in transcriptional repression.
  • Previous studies suggest that CTCF binding sites anchor local nucleosome positioning such that at least 20 nucleosomes are symmetrically and regularly spaced around a given binding site, with an approximate period of 185 bp.
  • One striking feature common to nearly all of the samples in FIG. 14 is the clear periodicity of nucleosome positioning both upstream and downstream of the binding site, suggesting that the local and largely symmetrical effects of CTCF binding in vivo are recapitulated in a variety of cfDNA and MNase-digested samples.
  • the periodicity of the upstream and downstream peaks is not uniform across the set of samples; the MNase-digested samples display slightly wider spacing of the peaks relative to the binding site, suggesting the utility of not only the intensity of the peaks, but also their period.
  • FIG. 15 shows the distribution of read-starts around 5,644 c-Jun binding sites. While the familiar periodicity is again visually identifiable for several samples in this figure, the effect is not uniform. Of note, three of the MNase-digested samples (Hap1.MNase, HEK.MNase, and NA12878.MNase) have much flatter distributions, which may indicate that c-Jun binding sites are not heavily occupied in these cells, or that the effect of c-Jun binding on local chromatin remodeling is less pronounced in these cell types.
  • FIG. 16 shows the distribution of read-starts around 4,417 NF-YB binding sites.
  • the start site distributions in the neighborhood of these TF binding sites demonstrate a departure from symmetry: here, the downstream effects (to the right within each plot) appear to be stronger than the upstream effects, as evidenced by the slight upward trajectory in the cfDNA samples.
  • the MNase-digested samples and the cfDNA samples differ: the former show, on average, a flatter profile in which peaks are difficult to discern, whereas the latter have both more clearly discernible periodicity and more identifiable peaks.
  • Whole blood was drawn from pregnant women fgs002, fgs003, fgs004, and fgs005 during routine third-trimester prenatal care and stored briefly in Vacutainer tubes containing EDTA (BD).
  • Whole blood from pregnant women IM1, GM1, and GM2 was obtained at 18, 13, and 10 weeks gestation, respectively, and stored briefly in Vacutainer tubes containing EDTA (BD).
  • Whole blood from glioma patients 2349, 2350, 2351, and 2353 was collected as part of brain surgical procedures and stored for less than three hours in Vacutainer tubes containing EDTA (BD).
  • MC2 Male Control 2
  • Plasma was separated from whole blood by centrifugation at 1,000 × g for 10 minutes at 4° C., after which the supernatant was collected and centrifuged again at 2,000 × g for 15 minutes at 4° C. Purified plasma was stored in 1 ml aliquots at −80° C. until use.
  • Circulating cfDNA was purified from 2 ml of each plasma sample with the QIAamp Circulating Nucleic Acids kit (Qiagen, Venlo, Netherlands) as per the manufacturer's protocol. DNA was quantified with a Qubit fluorometer (Invitrogen, Carlsbad, Calif.) and a custom qPCR assay targeting a human Alu sequence.
  • Approximately 50 million cells of each line (GM12878, HeLa S3, HEK, Hap1) were grown using standard methods. Growth media was aspirated and cells were washed with PBS. Cells were trypsinized and neutralized with 2× volume of CSS media, then pelleted in conical tubes by centrifugation at 1,300 rpm for 5 minutes at 4° C. Cell pellets were resuspended in 12 ml ice-cold PBS with 1× protease inhibitor cocktail added, counted, and then pelleted by centrifugation at 1,300 rpm for 5 minutes at 4° C.
  • Cell pellets were resuspended in RSB buffer (10 mM Tris-HCl, 10 mM NaCl, 3 mM MgCl2, 0.5 mM spermidine, 0.02% NP-40, 1× protease inhibitor cocktail) to a concentration of 3 million cells per ml and incubated on ice for 10 minutes with gentle inversion. Nuclei were pelleted by centrifugation at 1,300 rpm for five minutes at 4° C.
  • Pelleted nuclei were resuspended in NSB buffer (25% glycerol, 5 mM MgAc2, 5 mM HEPES, 0.08 mM EDTA, 0.5 mM spermidine, 1 mM DTT, 1× protease inhibitor cocktail) to a final concentration of 15M per ml.
  • Nuclei were again pelleted by centrifugation at 1,300 rpm for 5 minutes at 4° C., and resuspended in MN buffer (500 mM Tris-HCl, 10 mM NaCl, 3 mM MgCl2, 1 mM CaCl2, 1× protease inhibitor cocktail) to a final concentration of 30M per ml.
  • Nuclei were split into 200 µl aliquots and digested with 4 U of micrococcal nuclease (Worthington Biochemical Corp., Lakewood, N.J., USA) for five minutes at 37° C. The reaction was quenched on ice with the addition of 85 µl of MNSTOP buffer (500 mM NaCl, 50 mM EDTA, 0.07% NP-40, 1× protease inhibitor), followed by a 90 minute incubation at 4° C. with gentle inversion. DNA was purified using phenol:chloroform:isoamyl alcohol extraction. Mononucleosomal fragments were size selected with 2% agarose gel electrophoresis using standard methods and quantified with a Nanodrop spectrophotometer (Thermo Fisher Scientific Inc., Waltham, Mass., USA).
  • Barcoded sequencing libraries for all samples were prepared with the ThruPLEX-FD or ThruPLEX DNA-seq 48 D kits (Rubicon Genomics, Ann Arbor, Mich.), comprising a proprietary series of end-repair, ligation, and amplification reactions. Between 3.0 and 10.0 ng of DNA were used as input for all clinical sample libraries. Two bulk plasma cfDNA libraries were constructed with 30 ng of input to each library; each library was separately barcoded. Two libraries from MC2 were constructed with 2 ng of input to each library; each library was separately barcoded. Libraries for each of the MNase-digested cell lines were constructed with 20 ng of size-selected input DNA. Library amplification for all samples was monitored by real-time PCR to avoid over-amplification.
  • One lane of sequencing was performed for each of samples 2349, 2350, 2351, and 2353, yielding approximately 2.0 × 10⁸ read-pairs per sample.
  • One lane of sequencing was performed for each of the four cell line MNase-digested libraries, yielding approximately 2.0 × 10⁸ read-pairs per library.
  • Four lanes of sequencing were performed for one of the two replicate MC2 libraries and three lanes for one of the two replicate bulk plasma libraries, yielding a total of 10.6 × 10⁹ and 7.8 × 10⁸ read-pairs per library, respectively.
  • DNA insert sizes for both cfDNA and MNase libraries tend to be short (majority of data between 80 bp and 240 bp); adapter sequence at the read ends of some molecules was therefore expected.
  • Adapter sequences starting at read ends were trimmed, and forward and reverse reads of paired-end (“PE”) data for short original molecules were collapsed into single reads (“SRs”); PE reads that overlapped by at least 11 bp were collapsed to SRs.
  • the SRs shorter than 30 bp or showing more than 5 bases with a quality score below 10 were discarded.
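The read filter described above can be written as a small predicate. This Python sketch assumes a read is available as a sequence string plus a list of per-base Phred quality scores. The function and parameter names are hypothetical, with defaults set to the thresholds stated in the text.

```python
def keep_read(sequence, qualities, min_length=30,
              max_low_quality=5, quality_cutoff=10):
    """Return True if a collapsed single read passes both filters.

    A read is discarded if it is shorter than `min_length` bases, or if more
    than `max_low_quality` of its bases have a quality below `quality_cutoff`.
    """
    if len(sequence) < min_length:
        return False
    low_quality_bases = sum(1 for q in qualities if q < quality_cutoff)
    return low_quality_bases <= max_low_quality
```

For example, a 40-base read with exactly five bases below quality 10 is kept, while the same read with six such bases is discarded.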
  • the remaining PE and SR data were aligned to the human reference genome (GRCh37, 1000 G release v2) using fast alignment tools (BWA-ALN or BWA-MEM).
  • the resulting SAM (Sequence Alignment/Map) format was converted to sorted BAM (Binary Sequence Alignment/Map format) using SAMtools.
  • PE data provides information about the two physical ends of DNA molecules used in sequencing library preparation. This information was extracted using the SAMtools application programming interface (API) from BAM files. Both outer alignment coordinates of PE data for which both reads aligned to the same chromosome and where reads have opposite orientations were used. For non-trimmed SR data, only one read end provides information about the physical end of the original DNA molecule. If a read was aligned to the plus strand of the reference genome, the left-most coordinate was used. If a read was aligned to the reverse strand, its right-most coordinate was used instead. In cases where PE data was converted to single read data by adapter trimming, both end coordinates were considered. Both end coordinates were also considered if at least five adapter bases were trimmed from a SR sequencing experiment.
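The endpoint rules for single reads can be summarized in a few lines. The Python sketch below uses a hypothetical `SingleRead` record rather than the SAMtools API used in the original analysis. The field names are assumptions, and the logic follows the rules stated above: the left-most coordinate for plus-strand reads, the right-most for reverse-strand reads, and both ends for collapsed pairs or reads with at least five adapter bases trimmed.

```python
from typing import List, NamedTuple

class SingleRead(NamedTuple):
    """Hypothetical minimal alignment record (not a SAMtools structure)."""
    left: int                         # left-most aligned coordinate
    right: int                        # right-most aligned coordinate
    is_reverse: bool                  # True if aligned to the reverse strand
    adapter_bases_trimmed: int = 0    # adapter bases removed from the read end
    collapsed_from_pair: bool = False # True if merged from a read pair

def informative_ends(read: SingleRead) -> List[int]:
    """Return the coordinates that reflect physical ends of the molecule."""
    if read.collapsed_from_pair or read.adapter_bases_trimmed >= 5:
        # The whole molecule was read through, so both ends are informative.
        return [read.left, read.right]
    # Otherwise only the end where sequencing started is informative.
    return [read.right] if read.is_reverse else [read.left]
```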
  • the ratio of read-starts and coverage was calculated for each non-empty block of each sample. If the coverage was 0, the ratio was set to 0. These ratios were used to calculate a periodogram of each block using Fast Fourier Transform (FFT, spec.pgram in the R statistical programming environment) with frequencies between 1/500 bases and 1/100 bases.
  • Parameters to smooth (3 bp Daniell smoother; moving average giving half weight to the end values) and detrend the data (e.g., subtract the mean of the series and remove a linear trend) were optionally used.
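  • The read-start to coverage ratio and its periodogram can be sketched as follows. The analysis described above used spec.pgram in R; this Python sketch instead evaluates a raw periodogram directly at the Fourier frequencies whose periods fall between 100 and 500 bases, after removing the mean and a linear trend. It omits tapering and Daniell smoothing, and all function names are assumptions made for illustration.

```python
import cmath
import math


def start_coverage_ratio(read_starts, coverage):
    """Per-position ratio of read-starts to coverage; 0 where coverage is 0."""
    return [s / c if c > 0 else 0.0 for s, c in zip(read_starts, coverage)]


def detrend(values):
    """Subtract the mean of the series and remove a least-squares linear trend."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    slope = sxy / sxx if sxx else 0.0
    return [v - mean_v - slope * (t - mean_t) for t, v in enumerate(values)]


def periodogram(values, min_period=100, max_period=500):
    """Return (period, intensity) pairs for frequencies between 1/500 and 1/100."""
    x = detrend(values)
    n = len(x)
    result = []
    for k in range(1, n // 2 + 1):
        period = n / k
        if min_period <= period <= max_period:
            coefficient = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                              for t, v in enumerate(x))
            result.append((period, abs(coefficient) ** 2 / n))
    return result
```

  • For a block whose ratio oscillates with a 200 bp period, the largest intensity returned by `periodogram` falls at a period of 200.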
  • PCA Principal component analysis
  • Putative transcription factor binding sites, identified through analysis of ChIP-seq data generated across a number of cell types, were obtained from the ENCODE project.
  • Chromosomal coordinates from both sets of predicted sites were intersected with bedtools v2.17.0. To preserve any asymmetry in the plots, only predicted binding sites on the “+” strand were used. Read-starts were tallied for each sample if they fell within 500 bp of either end of the predicted binding site, and summed within samples by position across all such sites. Only samples with at least 100 million total reads were used for this analysis.
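  • The tallying of read-starts around predicted binding sites can be sketched as follows. This is a minimal sketch for one sample and one chromosome; it assumes that positions are reported relative to the start of each site, which the text above does not specify, and the function name and input layout are hypothetical.

```python
from collections import Counter


def tally_read_starts(sites, read_starts, flank=500):
    """Sum read-starts by position relative to site start across all sites.

    `sites` is a list of (site_start, site_end) tuples for "+" strand sites,
    and `read_starts` maps a genomic position to the number of read-starts
    observed there (assumed input formats). Read-starts are counted if they
    fall within `flank` bp of either end of a site.
    """
    profile = Counter()
    for site_start, site_end in sites:
        for position in range(site_start - flank, site_end + flank + 1):
            count = read_starts.get(position, 0)
            if count:
                profile[position - site_start] += count
    return profile
```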
  • cfDNA was deeply sequenced to better understand the processes that give rise to it.
  • the resulting data was used to build a genome-wide map of nucleosome occupancy that built on previous work by others, but is substantially more comprehensive.
  • TFs transcription factors
  • sequencing-related statistics including the total number of fragments sequenced, read lengths, the percentage of such fragments aligning to the reference with and without a mapping quality threshold, mean coverage, duplication rate, and the proportion of sequenced fragments in two length bins, were tabulated.
  • Fragment length was inferred from alignment of paired-end reads. Due to the short read lengths, coverage was calculated by assuming the entire fragment had been read. The estimated number of duplicate fragments was based on fragment endpoints, which may overestimate the true duplication rate in the presence of highly stereotyped cleavage.
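  • The coverage and duplication estimates described above can be sketched as follows. This is a minimal sketch, assuming fragments are given as (chromosome, start, end) tuples with inclusive coordinates; the function names are hypothetical. As noted above, counting duplicates by shared endpoints may overestimate the true duplication rate.

```python
from collections import Counter


def mean_coverage(fragments, genome_length):
    """Mean coverage, assuming the entire inferred fragment had been read."""
    covered_bases = sum(end - start + 1 for _, start, end in fragments)
    return covered_bases / genome_length


def duplicate_rate(fragments):
    """Fraction of fragments sharing both endpoints with an earlier fragment."""
    if not fragments:
        return 0.0
    counts = Counter(fragments)
    duplicates = sum(count - 1 for count in counts.values())
    return duplicates / len(fragments)
```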
  • SSP single-stranded library preparation protocol.
  • DSP double-stranded library preparation protocol.
  • dsDNA short double stranded DNA
  • cfDNA was denatured and a biotin-conjugated, single-stranded adaptor was ligated to the resulting fragments. The ligated fragments were then subjected to second-strand synthesis, end-repair and ligation of a second adaptor while the fragments were immobilized to streptavidin beads. Finally, minimal PCR amplification was performed to enrich for adaptor-bearing molecules while also appending a sample index ( FIG. 20 ; Table 2).
  • the resulting library was sequenced to 30-fold coverage (779M fragments).
  • the fragment length distribution again exhibited a dominant peak at ~167 bp corresponding to the chromatosome, but was considerably enriched for shorter fragments relative to conventional library preparation ( FIGS. 21, 22, 23A-B, 24A-B).
  • Although all libraries exhibit ~10.4 bp periodicity, the fragment sizes are offset by 3 bp for the two methods, consistent with damaged or non-flush input molecules whose true endpoints are more faithfully represented in single-stranded libraries.
  • WPS Windowed Protection Score
  • the value of the WPS correlates with the locations of nucleosomes within strongly positioned arrays, as mapped by other groups with in vitro methods or ancient DNA ( FIG. 26 ). At other sites, the WPS correlates with genomic features such as DNase I hypersensitive (DHS) sites (e.g., consistent with the repositioning of nucleosomes flanking a distal regulatory element) ( FIG. 27 ).
  • fragment endpoints were also simulated, matching for the depth, size distribution and terminal dinucleotide frequencies of each sample. Genome-wide WPS were then calculated, and 10.3M, 10.2M, and 8.0M local maxima were called by the same heuristic, for simulated datasets matched to BH01, IH01 and IH02, respectively. Peaks from simulated datasets were associated with lower scores than peaks from real datasets ( FIGS. 33A-B ). Furthermore, the relatively reproducible locations of peaks called from real datasets ( FIG. 31 ; FIGS. 32A-C ) did not align well with the locations of peaks called from simulated datasets ( FIG. 31 ; FIGS. 34A-C ).
  • the cfDNA sequencing data from BH01, IH01, and IH02 were pooled and reanalyzed for a combined 231-fold coverage (‘CH01’; 3.8 B fragments; Table 1).
  • the WPS was calculated and 12.9M peaks were called for this combined sample.
  • This set of peak calls was associated with higher scores and approached saturation in terms of the number of peaks ( FIGS. 33A-B ).
  • the CH01 peak set spans 2.53 gigabases (Gb) of the human reference genome.
  • Nucleosomes are known to be well-positioned in relation to landmarks of gene regulation, for example transcriptional start sites and exon-intron boundaries. Consistent with that understanding, similar positioning was observed in this data as well, in relation to landmarks of transcription, translation and splicing ( FIGS. 36-40 ).
  • the median peak-to-peak spacing within 100 kilobase (kb) windows that had been assigned to compartment A (enriched for open chromatin) or compartment B (enriched for closed chromatin) on the basis of long-range interactions (in situ Hi-C) in a lymphoblastoid cell line was examined.
  • Nucleosomes in compartment A exhibited tighter spacing than nucleosomes in compartment B (median 187 bp (A) vs. 190 bp (B)), with further differences between certain subcompartments ( FIG. 41 ).
  • median nucleosome spacing dropped sharply in pericentromeric regions, driven by strong positioning across arrays of alpha satellites (171 bp monomer length; FIG. 42 ; FIG. 26 ).
  • the long fraction WPS supports strong organization of nucleosomes in the vicinity of CTCF binding sites ( FIG. 43 ). However, a strong signal in the short fraction WPS is also observed that is coincident with the CTCF binding site itself ( FIGS. 44-45 ).
  • CTCF binding sites were stratified based on a presumption that they are bound in vivo (all FIMO predictions vs. the subset intersecting with ENCODE ChIP-seq vs. the further subset intersecting with those that appear to be utilized across 19 cell lines).
  • Similar analyses were performed for additional TFs for which both FIMO predictions and ENCODE ChIP-seq data were available ( FIGS. 53A-H ). For many of these TFs, such as ETS and MAFK ( FIGS. 54-55 ), a short fraction footprint was observed, accompanied by periodic signal in the long fraction WPS. This is consistent with strong positioning of nucleosomes surrounding bound TFBS. Overall, these data support the view that short cfDNA fragments, which are recovered markedly better by the single-stranded protocol ( FIG. 18 , FIG. 21 ), directly footprint the in vivo occupancy of DNA-bound transcription factors, including CTCF and others.
  • Nucleosome spacing varies between cell types, and as a function of chromatin state and gene expression. In general, open chromatin and transcription are associated with a shorter nucleosome repeat length, consistent with this Example's analyses of compartment A vs. B ( FIG. 41 ).
  • cfDNA samples obtained from five late-stage cancer patients were sequenced.
  • the patterns of nucleosome spacing in these samples revealed additional contributions to cfDNA that correlated most strongly with non-hematopoietic tissues or cell lines, often matching the anatomical origin of the patient's cancer.
  • Table 4 shows clinical and histological diagnoses for 48 patients from whom plasma-borne cfDNA was screened for evidence of high tumor burden, along with total cfDNA yield from 1.0 ml of plasma from each individual and relevant clinical covariates. Of these 48, 44 passed QC and had sufficient material. Of these 44, five were selected for deeper sequencing. cfDNA yield was determined by Qubit Fluorometer 2.0 (Life Technologies).
  • Human peripheral blood plasma for 52 individuals with clinical diagnosis of Stage IV cancer (Table 4) was obtained from Conversant Bio or PlasmaLab International (Everett, Wash., USA) and stored in 0.5 ml or 1 ml aliquots at −80° C. until use.
  • Human peripheral blood plasma for four individuals with clinical diagnosis of systemic lupus erythematosus was obtained from Conversant Bio and stored in 0.5 ml aliquots at −80° C. until use. Frozen plasma aliquots were thawed on the bench-top immediately before use.
  • Circulating cell-free DNA was purified from 2 ml of each plasma sample with the QiaAMP Circulating Nucleic Acids kit (Qiagen) as per the manufacturer's protocol. DNA was quantified with a Qubit fluorometer (Invitrogen). To verify cfDNA yield in a subset of samples, purified DNA was further quantified with a custom qPCR assay targeting a multicopy human Alu sequence; the two estimates were found to be concordant.
  • each sample was scored on two metrics of aneuploidy to identify a subset likely to contain a high proportion of tumor-derived cfDNA: first, the deviation from the expected proportion of reads derived from each chromosome ( FIG. 62A ); and second, the per-chromosome allele balance profile for a panel of common single nucleotide polymorphisms ( FIG. 62B ).
  • single-stranded libraries derived from five individuals were sequenced to a depth similar to that of IH02 in Example 4 (Table 5; mean 30-fold coverage):
  • Table 5 tabulates sequencing-related statistics, including the total number of fragments sequenced, read lengths, the percentage of such fragments aligning to the reference with and without a mapping quality threshold, mean coverage, duplication rate, and the proportion of sequenced fragments in two length bins, for each sample.
  • Fragment length was inferred from alignment of paired-end reads. Due to the short read lengths, coverage was calculated by assuming the entire fragment had been read. The estimated number of duplicate fragments is based on fragment endpoints, which may overestimate the true duplication rate in the presence of highly stereotyped cleavage.
  • FFT was performed on the long fragment WPS values across gene bodies, and the average intensity in the 193-199 bp frequency range was correlated against the same 76 expression datasets for human cell lines and primary tissues.
  • many of the most highly ranked cell lines or tissues represent non-hematopoietic lineages, in some cases aligning with the cancer type ( FIG. 61 ; Table 3).
  • the top-ranked correlation was with HepG2, a hepatocellular carcinoma cell line.
  • For IC35, where the patient had a ductal carcinoma in situ breast cancer, the top-ranked correlation was with MCF7, a metastatic breast adenocarcinoma cell line.
  • the largest change in correlation rank (−31) was for a small cell lung cancer cell line (SCLC-21H).
  • IC20 a lung squamous cell carcinoma
  • IC35 a colorectal adenocarcinoma
  • a greedy, iterative approach was used to estimate the proportions of various cell-types and/or tissues contributing to cfDNA derived from the biological sample.
  • the cell-type or tissue whose reference map (here, defined by the 76 RNA expression datasets) had the highest correlation with the average FFT intensity in the 193-199 bp frequency of the WPS long fragment values across gene bodies for a given cfDNA sample was identified.
  • a series of “two tissue” linear mixture models were fitted, including the cell-type or tissue with the highest correlation as well as each of the other remaining cell-types or tissues from the full set of reference maps.
  • the cell-type or tissue with the highest coefficient was retained as contributory, unless the coefficient was below 1% in which case the procedure was terminated and this last tissue or cell-type not included. This procedure was repeated, i.e. “three-tissue”, “four-tissue”, and so on, until termination based on the newly added tissue being estimated by the mixture model to contribute less than 1%.
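  • The greedy, iterative procedure described above can be sketched as follows. This is a minimal sketch under several assumptions that the text does not confirm: the mixture is fitted by ordinary least squares without an intercept, proportions are the fitted coefficients normalized to sum to one, the first tissue is chosen by the strongest absolute Pearson correlation, and the reference maps are not collinear. All function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def least_squares(columns, target):
    """Solve the normal equations (no intercept) by Gauss-Jordan elimination."""
    k = len(columns)
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * t for p, t in zip(columns[i], target))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [rv - factor * cv for rv, cv in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def greedy_mixture(references, signal, min_fraction=0.01):
    """Add tissues one at a time until the newly added one contributes < 1%."""
    first = max(references, key=lambda name: abs(pearson(references[name], signal)))
    selected = [first]
    while len(selected) < len(references):
        best_name, best_fraction = None, 0.0
        for name in references:
            if name in selected:
                continue
            coeffs = least_squares([references[n] for n in selected + [name]], signal)
            total = sum(coeffs)
            fraction = coeffs[-1] / total if total else 0.0
            if fraction > best_fraction:
                best_name, best_fraction = name, fraction
        if best_name is None or best_fraction < min_fraction:
            break  # the last candidate is not included
        selected.append(best_name)
    coeffs = least_squares([references[n] for n in selected], signal)
    total = sum(coeffs)
    return {name: c / total for name, c in zip(selected, coeffs)}
```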
  • the mixture model takes the form:
  • a cfDNA sample derived from a patient with advanced hepatocellular carcinoma predicted 9 contributory cell types, including Hep_G2 (28.6%), HMC.1 (14.3%), REH (14.0%), MCF7 (12.6%), AN3.CA (10.7%), THP.1 (7.4%), NB.4 (5.5%), U.266.84 (4.5%), and U.937 (2.4%).
  • a cfDNA sample corresponding to a mixture of healthy individuals predicted 7 contributory cell types or tissues, including bone marrow (30.0%), NB.4 (19.6%), HMC.1 (13.9%), U.937 (13.4%), U.266.84 (12.5%), Karpas.707 (6.5%), and REH (4.2%).
  • For IC17, the sample derived from a cancer patient, the highest proportion of predicted contribution corresponds to a cell line that is closely associated with the cancer type that is present in the patient from whom this cfDNA was derived (Hep_G2 and hepatocellular carcinoma).
  • this approach predicts contributions corresponding only to tissues or cell types that are primarily associated with hematopoiesis, the predominant source of plasma cfDNA in healthy individuals.
  • Human peripheral blood plasma for 52 individuals with clinical diagnosis of Stage IV cancer (Supplementary Table 4) was obtained from Conversant Bio or PlasmaLab International (Everett, Wash., USA) and stored in 0.5 ml or 1 ml aliquots at −80° C. until use.
  • Human peripheral blood plasma for four individuals with clinical diagnosis of systemic lupus erythematosus was obtained from Conversant Bio and stored in 0.5 ml aliquots at −80° C. until use.
  • Circulating cell-free DNA was purified from 2 ml of each plasma sample with the QiaAMP Circulating Nucleic Acids kit (Qiagen) as per the manufacturer's protocol. DNA was quantified with a Qubit fluorometer (Invitrogen). To verify cfDNA yield in a subset of samples, purified DNA was further quantified with a custom qPCR assay targeting a multicopy human Alu sequence; the two estimates were found to be concordant.
  • Barcoded sequencing libraries were prepared with the ThruPLEX-FD or ThruPLEX DNA-seq 48 D kits (Rubicon Genomics), comprising a proprietary series of end-repair, ligation, and amplification reactions. Between 0.5 ng and 30.0 ng of cfDNA were used as input for all clinical sample libraries. Library amplification for all samples was monitored by real-time PCR to avoid over-amplification, and was typically terminated after 4-6 cycles.
  • Adapter 2 was prepared by combining 4.5 μl TE (pH 8), 0.5 μl 1M NaCl, 10 μl 500 μM oligo Adapter2.1, and 10 μl 500 μM oligo Adapter2.2, incubating at 95° C. for 10 seconds, and decreasing the temperature to 14° C. at a rate of 0.1° C./s.
  • Purified cfDNA fragments were dephosphorylated by combining 2× CircLigase II buffer (Epicentre), 5 mM MnCl2, and 1 U FastAP alkaline phosphatase (Thermo Fisher) with 0.5-10 ng fragments in a 20 μl reaction volume and incubating at 37° C. for 30 minutes.
  • Fragments were then denatured by heating to 95° C. for 3 minutes, and were immediately transferred to an ice bath.
  • the reaction was supplemented with biotin-conjugated adapter oligo CL78 (5 pmol), 20% PEG-6000 (w/v), and 200 U CircLigase II (Epicentre) for a total volume of 40 μl, and was incubated overnight with rotation at 60° C., heated to 95° C. for 3 minutes, and placed in an ice bath.
  • BBB bead binding buffer
  • Beads were washed once with 500 μl wash buffer A (WBA) (10 mM Tris-HCl [pH 8], 1 mM EDTA [pH 8], 0.05% Tween-20, 100 mM NaCl, 0.5% SDS) and once with 500 μl wash buffer B (WBB) (10 mM Tris-HCl [pH 8], 1 mM EDTA [pH 8], 0.05% Tween-20, 100 mM NaCl).
  • Beads were combined with 1× Isothermal Amplification Buffer (NEB), 2.5 μM oligo CL9, 250 μM (each) dNTPs, and 24 U Bst 2.0 DNA Polymerase (NEB) in a reaction volume of 50 μl, incubated with gentle shaking by ramping temperature from 15° C. to 37° C. at 1° C./minute, and held at 37° C. for 10 minutes. After collection on a magnetic rack, beads were washed once with 200 μl WBA, resuspended in 200 μl of stringency wash buffer (SWB) (0.1× SSC, 0.1% SDS), and incubated at 45° C. for 3 minutes.
  • Beads were again collected and washed once with 200 μl WBB. Beads were then combined with 1× CutSmart Buffer (NEB), 0.025% Tween-20, 100 μM (each) dNTPs, and 5 U T4 DNA Polymerase (NEB) and incubated with gentle shaking for 30 minutes at room temperature. Beads were washed once with each of WBA, SWB, and WBB as described above. Beads were then mixed with 1× CutSmart Buffer (NEB), 5% PEG-6000, 0.025% Tween-20, 2 μM double-stranded adapter 2, and 10 U T4 DNA Ligase (NEB), and incubated with gentle shaking for 2 hours at room temperature.
  • Beads were washed once with each of WBA, SWB, and WBB as described above, and resuspended in 25 μl TET buffer (10 mM Tris-HCl [pH 8], 1 mM EDTA [pH 8], 0.05% Tween-20). Second strands were eluted from beads by heating to 95° C., collecting beads on a magnetic rack, and transferring the supernatant to a new tube. Library amplification for all samples was monitored by real-time PCR to avoid over-amplification, and required an average of 4 to 6 cycles per library.
  • Barcoded paired end (PE) Illumina sequencing data was split allowing up to one substitution in the barcode sequence. Reads shorter than or equal to the read length were consensus called and adapter trimmed. Remaining consensus single end reads (SR) and the individual PE reads were aligned to the human reference genome sequence (GRCh37, 1000 Genomes phase 2 technical reference downloaded from ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/) using the ALN algorithm implemented in BWA v0.7.10.
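  • Splitting reads by barcode while allowing up to one substitution can be sketched as follows. This is a minimal sketch; the function name `assign_barcode` and the choice to leave a read unassigned when more than one known barcode matches are assumptions made for illustration.

```python
def assign_barcode(observed, known_barcodes, max_substitutions=1):
    """Return the known barcode matching `observed`, or None if not unique.

    A known barcode matches if it has the same length as the observed
    sequence and differs from it by at most `max_substitutions` bases.
    """
    matches = [
        barcode for barcode in known_barcodes
        if len(barcode) == len(observed)
        and sum(a != b for a, b in zip(barcode, observed)) <= max_substitutions
    ]
    return matches[0] if len(matches) == 1 else None
```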
  • PE reads were further processed with BWA SAMPE to resolve ambiguous placement of read pairs or to rescue missing alignments by a more sensitive alignment step around the location of one placed read end.
  • Aligned SR and PE data was directly converted to sorted BAM format using the SAMtools API. BAM files of the sample were merged across lanes and sequencing runs.
  • Quality control was performed using FastQC (v0.11.2) and included obtaining a library complexity estimate (Picard tools v1.113), determining the proportion of adapter dimers, analyzing the inferred library insert size and the nucleotide and dinucleotide frequencies at the outer read ends, and checking the mapping quality distributions of each library.
  • Aligned sequencing data was simulated (SR if shorter than 45 bp, PE 45 bp otherwise) for all major chromosomes of the human reference (GRCh37). For this purpose, dinucleotide frequencies were determined from real data on both read ends and both strand orientations. Dinucleotide frequencies were also recorded for the reference genome on both strands. Further, the insert size distribution of the real data was extracted for the 1-500 bp range. Reads were simulated by iterating through the sequence of the major reference chromosomes.
  • (1) the strand is randomly chosen, (2) the ratio of the dinucleotide frequency in the real data over the frequency in the reference sequence is used to randomly decide whether the initiating dinucleotide is considered, (3) an insert size is sampled from the provided insert-size distribution, and (4) the frequency ratio of the terminal dinucleotide is used to randomly decide whether the generated alignment is reported.
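  • The four simulation steps described above can be sketched as follows. This is a minimal sketch that visits each position of a sequence once; it assumes that each dinucleotide frequency ratio is used directly as an acceptance probability (ratios of 1 or more always accept), that insert sizes are at least 2 bp, and that minus-strand fragments are generated on the reverse complement and then mapped back to forward coordinates. The function names and input layout are hypothetical.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    return sequence.translate(COMPLEMENT)[::-1]


def simulate_fragments(sequence, start_ratio, end_ratio, sizes, weights, rng):
    """Return simulated fragments as (start, end, strand), 0-based inclusive.

    `start_ratio` and `end_ratio` map a dinucleotide to the ratio of its
    frequency in real data over its frequency in the reference; `sizes` and
    `weights` describe the insert-size distribution; `rng` is a random.Random.
    """
    n = len(sequence)
    templates = {"+": sequence, "-": reverse_complement(sequence)}
    fragments = []
    for pos in range(n - 1):
        strand = rng.choice("+-")                              # step 1
        template = templates[strand]
        if rng.random() >= start_ratio.get(template[pos:pos + 2], 0.0):
            continue                                           # step 2
        size = rng.choices(sizes, weights=weights)[0]          # step 3
        end = pos + size - 1
        if end >= n:
            continue
        if rng.random() >= end_ratio.get(template[end - 1:end + 1], 0.0):
            continue                                           # step 4
        if strand == "+":
            fragments.append((pos, end, strand))
        else:
            fragments.append((n - 1 - end, n - 1 - pos, strand))
    return fragments
```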
  • the simulated coverage was matched to that of the original data after PCR duplicate removal.
  • the data of the present disclosure provides information about the two physical ends of DNA molecules used in sequencing library preparation. We extract this information using the SAMtools application programming interface (API) from BAM files.
  • we consider both end coordinates of the SR alignment as read starts.
  • For coverage we consider all positions between the two (inferred) molecule ends, including these end positions.
  • WPS windowed protection scores
  • If the identified region is 150-450 bp long, we apply the same above-median contiguous window approach, but only report those windows that are between 50-150 bp in size. For score calculation of multiple windows derived from the 150-450 bp regions, we assume the neighboring minima within the region to be zero. We discard regions shorter than 50 bp and longer than 450 bp.
  • clustered FIMO motif-based
  • E2F-2, EBF1, Ebox-CACCTG, Ebox, ESR1, ETS, IRF-2, IRF-3, IRF, MAFK, MEF2A-2, MEF2A, MYC-MAX, PAX5-2, RUNX2, RUNX-AML, STAF-2, TCF-LEF, YY1
  • the set of sites was refined to a more confident set of actively bound transcription factor binding sites based on experimental data. For this purpose, only predicted binding sites that overlap with peaks defined by ChIP-seq experiments from publicly available ENCODE data (TfbsClusteredV3 set downloaded from UCSC) were retained.
  • Genomic features such as transcription start sites, transcription end sites, start codons, splice donor, and splice acceptor sites were obtained from Ensembl Build version 75. Adjusted WPS surrounding these features was calculated and plotted as described above for transcription factor binding sites.
  • CTCF sites used for this analysis first included clustered FIMO predictions of CTCF binding sites (computationally predicted via motifs). We then created two additional subsets of this set: 1) intersection with the set of CTCF ChIP-seq peaks available through the ENCODE TfbsClusteredV3 (see above), and 2) intersection with a set of CTCF sites that are experimentally observed to be active across 19 tissues.
  • the positions of 10 nucleosomes on either side of the binding site were extracted for each site.
  • the distribution of −1 to +1 nucleosome spacing changed substantially, shifting to larger spacing, particularly in the 230-270 bp range. This suggested that truly active CTCF sites largely shift towards wider spacing between the −1 and +1 nucleosomes, and that a difference in WPS for both long and short read fractions might therefore be apparent. Therefore, the mean short and long fragment WPS at each position relative to the center of CTCF sites were additionally calculated.
  • nucleosome spacing was taken within bins of −1 to +1 nucleosome spacing of less than 160, 160-200, 200-230, 230-270, 270-420, 420-460, and greater than 460 bp. These intervals approximately captured spacings of interest, such as the dominant peak and the emerging peak at 230-270 bp for more confidently active sites.
  • DHS peaks for 349 primary tissue and cell line samples in BED format by Maurano et al. were downloaded from the University of Washington ENCODE database.
  • Samples derived from fetal tissues, comprising 233 of these peak sets, were removed from the analysis as they behaved inconsistently within tissue type, possibly because of unequal representation of multiple cell types within each tissue sample.
  • 116 samples representing a variety of cell lineages were retained for analysis.
  • the nearest upstream and downstream calls in the CH01 callset were identified, and the genomic distance between the centers of those two calls was calculated. The distribution of all such distances was visualized for each DHS peak callset using a smoothed density estimate calculated for distances between 0 and 500 bp.
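  • The distance between the nearest upstream and downstream calls flanking a site can be sketched as follows. This is a minimal sketch, assuming the call centers are available as a sorted list of positions on one chromosome and that a call located exactly at the site position is counted as neither upstream nor downstream; the function name is hypothetical.

```python
from bisect import bisect_left, bisect_right


def flanking_call_distance(call_centers, site_position):
    """Distance between the nearest calls upstream and downstream of a site.

    `call_centers` must be sorted in ascending order. Returns None if the
    site has no call on one of its two sides.
    """
    upstream_index = bisect_left(call_centers, site_position) - 1
    downstream_index = bisect_right(call_centers, site_position)
    if upstream_index < 0 or downstream_index >= len(call_centers):
        return None
    return call_centers[downstream_index] - call_centers[upstream_index]
```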
  • FPKM expression values measured for 20,344 Ensembl gene identifiers in 44 human cell lines and 32 primary tissues by the Human Protein Atlas (“rna.csv” file) were used in this study. For analyses across tissues, genes with fewer than 3 non-zero expression values were excluded (19,378 genes passing this filter). The expression data set was provided with one decimal precision for the FPKM values. Thus, a zero expression value (0.0) indicates expression between 0 and a value less than 0.05. Unless otherwise noted, the minimum expression value was set to 0.04 FPKM before log2-transformation of the expression values.
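  • The gene filter and log2-transformation described above can be sketched as follows. This is a minimal sketch, assuming the expression table has already been loaded into a dictionary mapping each gene identifier to its list of FPKM values across samples; the function name and input layout are hypothetical.

```python
import math


def filter_and_transform(expression, min_nonzero=3, floor=0.04):
    """Drop genes with fewer than `min_nonzero` non-zero values, then log2.

    Values below `floor` (including reported zeros) are raised to `floor`
    before the log2-transformation.
    """
    transformed = {}
    for gene, values in expression.items():
        if sum(1 for value in values if value > 0) < min_nonzero:
            continue
        transformed[gene] = [math.log2(max(value, floor)) for value in values]
    return transformed
```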
  • the long fragment WPS was used to calculate periodograms of genomic regions using Fast Fourier Transform (FFT, spec.pgram in the R statistical programming environment) with frequencies between 1/500 bases and 1/100 bases.
  • Parameters to smooth (3 bp Daniell smoother; moving average giving half weight to the end values) and de-trend the data (i.e. subtract the mean of the series and remove a linear trend) are optionally additionally used.
  • the recursive time series filter as implemented in the R statistical programming environment was used to remove high frequency variation from trajectories.
  • 24 filter frequencies (1/seq(5,100,4)) were used, and the first 24 values of the trajectory were used as initial values. Adjustments for the 24-value shift in the resulting trajectories were made by repeating the last 24 values of the trajectory.
  • the intensity values as determined from smooth periodograms (FFT) in the context of gene expression for the 120-280 bp range were analyzed.
  • An S-shaped Pearson correlation between gene expression values and FFT intensities around the major inter-nucleosome distance peak was observed.
  • a pronounced negative correlation was observed in the 193-199 bp range.
  • the intensities in this frequency range were averaged and correlated with log2-transformed expression values.
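  • The averaging and correlation step described above can be sketched as follows. This is a minimal sketch, assuming each gene's periodogram is available as a list of (period in bp, intensity) pairs and that the log2-transformed expression values for one cell line or tissue are given in the same gene order; the function names are hypothetical.

```python
import math


def mean_band_intensity(periodogram, low=193, high=199):
    """Average the periodogram intensities whose period lies in [low, high] bp."""
    band = [intensity for period, intensity in periodogram if low <= period <= high]
    return sum(band) / len(band)


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlate_with_expression(gene_periodograms, log2_expression):
    """Correlate per-gene mean band intensity with log2 expression values."""
    intensities = [mean_band_intensity(p) for p in gene_periodograms]
    return pearson(intensities, log2_expression)
```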
  • a method of determining tissues and/or cell types giving rise to cell free DNA (cfDNA) in a subject comprising:
  • isolating cfDNA from a biological sample from the subject, the isolated cfDNA comprising a plurality of cfDNA fragments;
  • The method of Example 7 wherein the step of determining at least some of the tissues and/or cell types giving rise to the cfDNA fragments comprises comparing the genomic locations of at least some of the cfDNA fragment endpoints to one or more reference maps.
  • The method of Example 7 or Example 8 wherein the step of determining at least some of the tissues and/or cell types giving rise to the cfDNA fragments comprises performing a mathematical transformation on a distribution of the genomic locations of at least some of the cfDNA fragment endpoints.
  • The method of Example 9 wherein the mathematical transformation includes a Fourier transformation.
  • The method of any preceding Example further comprising determining a score for each of at least some coordinates of the reference genome, wherein the score is determined as a function of at least the plurality of cfDNA fragment endpoints and their genomic locations, and wherein the step of determining at least some of the tissues and/or cell types giving rise to the observed cfDNA fragments comprises comparing the scores to one or more reference maps.
  • The method of Example 11 wherein the score for a coordinate represents or is related to the probability that the coordinate is a location of a cfDNA fragment endpoint.
  • the reference map comprises a DNase I hypersensitive site map generated from at least one cell-type or tissue.
  • the reference map comprises an RNA expression map generated from at least one cell-type or tissue.
  • the reference map comprises a chromosome conformation map generated from at least one cell-type or tissue.
  • the reference map comprises a chromatin accessibility map generated from at least one cell-type or tissue.
  • the reference map comprises sequence data obtained from samples obtained from at least one reference subject.
  • The method of any one of Examples 8 to 20 wherein the reference map is generated by digesting chromatin obtained from at least one cell-type or tissue with an exogenous nuclease (e.g., micrococcal nuclease).
  • the reference maps comprise chromatin accessibility data determined by a transposition-based method (e.g., ATAC-seq) from at least one cell-type or tissue.
  • the reference maps comprise data associated with positions of a DNA binding and/or DNA occupying protein for a tissue or cell type.
  • The method of Example 23 wherein the DNA binding and/or DNA occupying protein is a transcription factor.
  • The method of Example 23 or Example 24 wherein the positions are determined by chromatin immunoprecipitation of a crosslinked DNA-protein complex.
  • The method of Example 23 wherein the positions are determined by treating DNA associated with the tissue or cell type with a nuclease (e.g., DNase-I).
  • the reference map comprises a biological feature related to the positions or spacing of nucleosomes, chromatosomes, or other DNA binding or DNA occupying proteins within a tissue or cell type.
  • The method of Example 27 wherein the biological feature is quantitative expression of one or more genes.
  • The method of Example 27 or Example 28 wherein the biological feature is presence or absence of one or more histone marks.
  • tissue or cell type used to generate a reference map is a primary tissue from a subject having a disease or disorder.
  • The method of Example 31 wherein the disease or disorder is selected from the group consisting of: cancer, normal pregnancy, a complication of pregnancy (e.g., aneuploid pregnancy), myocardial infarction, inflammatory bowel disease, systemic autoimmune disease, localized autoimmune disease, allotransplantation with rejection, allotransplantation without rejection, stroke, and localized tissue damage.
  • tissue or cell type used to generate a reference map is a primary tissue from a healthy subject.
  • tissue or cell type used to generate a reference map is an immortalized cell line.
  • tissue or cell type used to generate a reference map is a biopsy from a tumor.
  • The method of Example 18 wherein the sequence data comprises positions of cfDNA fragment endpoints.
  • The method of Example 36 wherein the reference subject is healthy.
  • The method of Example 36 wherein the reference subject has a disease or disorder.
  • The method of Example 38 wherein the disease or disorder is selected from the group consisting of: cancer, normal pregnancy, a complication of pregnancy (e.g., aneuploid pregnancy), myocardial infarction, inflammatory bowel disease, systemic autoimmune disease, localized autoimmune disease, allotransplantation with rejection, allotransplantation without rejection, stroke, and localized tissue damage.
  • the reference map comprises reference scores for at least a portion of coordinates of the reference genome associated with the tissue or cell type.
  • The method of Example 40 wherein the reference map comprises a mathematical transformation of the scores.
  • The method of Example 40 wherein the scores represent a subset of all reference genomic coordinates for the tissue or cell type.
  • The method of Example 42 wherein the subset is associated with positions or spacing of nucleosomes and/or chromatosomes.
  • The method of Example 42 or Example 43 wherein the subset is associated with transcription start sites and/or transcription end sites.
  • The method of Example 47 wherein the orthogonal biological feature is associated with high expression genes.
  • The method of Example 47 wherein the orthogonal biological feature is associated with low expression genes.
  • The method of any one of Examples 7 to 51 wherein the step of determining the tissues and/or cell types giving rise to the cfDNA as a function of a plurality of the genomic locations of at least some of the cfDNA fragment endpoints comprises comparing a Fourier transform of the plurality of the genomic locations of at least some of the cfDNA fragment endpoints, or a mathematical transformation thereof, with a reference map.
  • a method of identifying a disease or disorder in a subject comprising:
  • Example 54 wherein the step of determining the tissues and/or cell types giving rise to the cfDNA comprises comparing the genomic locations of at least some of the cfDNA fragment endpoints to one or more reference maps.
  • Example 54 or Example 55 wherein the step of determining the tissues and/or cell types giving rise to the cfDNA comprises performing a mathematical transformation on a distribution of the genomic locations of at least some of the plurality of the cfDNA fragment endpoints.
  • The method of Example 56 wherein the mathematical transformation includes a Fourier transformation.
  • any one of Examples 54 to 57 further comprising determining a score for each of at least some coordinates of the reference genome, wherein the score is determined as a function of at least the plurality of cfDNA fragment endpoints and their genomic locations, and wherein the step of determining at least some of the tissues and/or cell types giving rise to the observed cfDNA fragments comprises comparing the scores to one or more reference maps.
  • Example 58 wherein the score for a coordinate represents or is related to the probability that the coordinate is a location of a cfDNA fragment endpoint.
  • the reference map comprises a DNase I hypersensitive site map, an RNA expression map, expression data, a chromosome conformation map, a chromatin accessibility map, chromatin fragmentation map, or sequence data obtained from samples obtained from at least one reference subject, and corresponding to at least one cell type or tissue that is associated with a disease or a disorder, and/or positions or spacing of nucleosomes and/or chromatosomes in a tissue or cell type.
  • any one of Examples 55 to 60 wherein the reference map is generated by digesting chromatin from at least one cell-type or tissue with an exogenous nuclease (e.g., micrococcal nuclease).
  • Example 60 or Example 61 wherein the reference maps comprise chromatin accessibility data determined by applying a transposition-based method (e.g., ATAC-seq) to nuclei or chromatin from at least one cell-type or tissue.
  • any one of Examples 55 to 62 wherein the reference maps comprise data associated with positions of a DNA binding and/or DNA occupying protein for a tissue or cell type.
  • The method of Example 63 wherein the DNA binding and/or DNA occupying protein is a transcription factor.
  • Example 63 or Example 64 wherein the positions are determined by applying chromatin immunoprecipitation of a crosslinked DNA-protein complex to at least one cell-type or tissue.
  • Example 63 or Example 64 wherein the positions are determined by treating DNA associated with the tissue or cell type with a nuclease (e.g., DNase-I).
  • the reference map comprises a biological feature related to the positions or spacing of nucleosomes, chromatosomes, or other DNA binding or DNA occupying proteins within a tissue or cell type.
  • The method of Example 67 wherein the biological feature is quantitative expression of one or more genes.
  • The method of Example 67 or Example 68 wherein the biological feature is presence or absence of one or more histone marks.
  • tissue or cell type used to generate a reference map is a primary tissue from a subject having a disease or disorder.
  • The method of Example 71 wherein the disease or disorder is selected from the group consisting of: cancer, normal pregnancy, a complication of pregnancy (e.g., aneuploid pregnancy), myocardial infarction, inflammatory bowel disease, systemic autoimmune disease, localized autoimmune disease, allotransplantation with rejection, allotransplantation without rejection, stroke, and localized tissue damage.
  • tissue or cell type used to generate a reference map is a primary tissue from a healthy subject.
  • tissue or cell type used to generate a reference map is an immortalized cell line.
  • tissue or cell type used to generate a reference map is a biopsy from a tumor.
  • The method of Example 60 wherein the sequence data obtained from samples obtained from at least one reference subject comprises positions of cfDNA fragment endpoint probabilities.
  • The method of Example 76 wherein the reference subject is healthy.
  • The method of Example 76 wherein the reference subject has a disease or disorder.
  • The method of Example 78 wherein the disease or disorder is selected from the group consisting of: cancer, normal pregnancy, a complication of pregnancy (e.g., aneuploid pregnancy), myocardial infarction, inflammatory bowel disease, systemic autoimmune disease, localized autoimmune disease, allotransplantation with rejection, allotransplantation without rejection, stroke, and localized tissue damage.
  • the reference map comprises cfDNA fragment endpoint probabilities for at least a portion of the reference genome associated with the tissue or cell type.
  • The method of Example 80 wherein the reference map comprises a mathematical transformation of the cfDNA fragment endpoint probabilities.
  • The method of Example 80 wherein the cfDNA fragment endpoint probabilities represent a subset of all reference genomic coordinates for the tissue or cell type.
  • The method of Example 82 wherein the subset is associated with positions or spacing of nucleosomes and/or chromatosomes.
  • The method of Example 82 or Example 83 wherein the subset is associated with transcription start sites and/or transcription end sites.
  • The method of Example 87 wherein the orthogonal biological feature is associated with high expression genes.
  • The method of Example 87 wherein the orthogonal biological feature is associated with low expression genes.
  • any one of Examples 54 to 91 wherein the step of determining the tissue(s) and/or cell type(s) of the cfDNA as a function of a plurality of the genomic locations of at least some of the cfDNA fragment endpoints comprises comparing a Fourier transform of the plurality of the genomic locations of at least some of the cfDNA fragment endpoints, or a mathematical transformation thereof, with a reference map.
  • the reference map comprises DNA or chromatin fragmentation data corresponding to at least one tissue that is associated with the disease or disorder.
  • The method of Example 95 wherein the report further comprises a list of the determined tissue(s) and/or cell type(s) of the isolated cfDNA.
  • the biological sample comprises, consists essentially of, or consists of whole blood, peripheral blood plasma, urine, or cerebral spinal fluid.
  • a method for determining tissues and/or cell types giving rise to cell-free DNA (cfDNA) in a subject comprising:
  • a method for determining tissues and/or cell types giving rise to cell-free DNA in a subject comprising:
  • a method for diagnosing a clinical condition in a subject comprising:
  • a method for diagnosing a clinical condition in a subject comprising
  • nucleosome map is generated by:
  • nucleosome occupancy signals are summarized in accordance with any one of aggregating signal from distributions (a), (b), and/or (c), or a mathematical transformation of one of these distributions, around other genomic landmarks such as DNase I hypersensitive sites, transcription start sites, topological domains, other epigenetic marks or subsets of all such sites defined by correlated behavior in other datasets (e.g. gene expression, etc.).
  • any one of Examples 98-101 wherein the distributions are transformed in order to aggregate or summarize the periodic signal of nucleosome positioning within various subsets of the genome, e.g. quantifying periodicity in contiguous windows or, alternatively, in discontiguous subsets of the genome defined by transcription factor binding sites, gene model features (e.g. transcription start sites), tissue expression data or other correlates of nucleosome positioning.
  • The method of Example 109, wherein we first summarize long-range nucleosome ordering within contiguous windows along the genome in a diverse set of samples, and then perform principal components analysis (PCA) to cluster samples or to estimate mixture proportions.
  • The method of Example 100 or Example 101, wherein the clinical condition is cancer, i.e. malignancies.
  • The method of Example 111, wherein the biological sample is circulating plasma containing cfDNA, some portion of which is derived from a tumor.
  • The method of Example 100 or Example 101, wherein the clinical condition is selected from tissue damage, myocardial infarction (acute damage of heart tissue), autoimmune disease (chronic damage of diverse tissues), pregnancy, chromosomal aberrations (e.g. trisomies), and transplant rejection.
  • The method of Example 114 wherein the proportion assigned to each of the one or more determined tissues or cell types is based at least in part on a degree of correlation or of increased correlation, relative to cfDNA from a healthy subject or subjects.
  • Example 114 or Example 115 wherein the degree of correlation is based at least in part on a comparison of a mathematical transformation of the distribution of cfDNA fragment endpoints from the biological sample with the reference map associated with the determined tissue or cell type.
  • Example 114 to 116 wherein the proportion assigned to each of the one or more determined tissues or cell types is based on a mixture model.
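
Several of the items above compare a Fourier transform of cfDNA fragment endpoint locations with a reference map and assign tissues by degree of correlation. The following is a minimal sketch of that idea, not the applicants' implementation. The function names, the window layout, the 190 bp period (chosen as a rough nucleosome repeat length) and the toy reference maps are all assumptions made for illustration.

```python
import cmath
import math


def endpoint_counts(endpoints, start, length):
    """Count cfDNA fragment endpoints at each coordinate of one genomic window."""
    counts = [0] * length
    for pos in endpoints:
        if start <= pos < start + length:
            counts[pos - start] += 1
    return counts


def power_at_period(signal, period):
    """Power of the discrete Fourier component of `signal` at `period` (in bp)."""
    n = len(signal)
    mean = sum(signal) / n
    component = sum(
        (value - mean) * cmath.exp(-2j * math.pi * i / period)
        for i, value in enumerate(signal)
    )
    return abs(component) ** 2 / n


def periodicity_profile(endpoints, windows, period=190):
    """One periodicity value per (start, length) window; 190 bp is an assumed period."""
    return [
        power_at_period(endpoint_counts(endpoints, start, length), period)
        for start, length in windows
    ]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_reference_maps(profile, reference_maps):
    """Sort hypothetical per-tissue reference maps by correlation with the profile."""
    scored = [(name, pearson(profile, ref)) for name, ref in reference_maps.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy data: the first window has endpoints every 190 bp, the second does not.
    endpoints = list(range(0, 1900, 190)) + [1950, 2400, 2410, 3300, 3650]
    windows = [(0, 1900), (1900, 1900)]
    profile = periodicity_profile(endpoints, windows)
    # Hypothetical reference maps: one value per window for each tissue.
    reference_maps = {"tissue_A": [1.0, 0.0], "tissue_B": [0.0, 1.0]}
    for name, r in rank_reference_maps(profile, reference_maps):
        print(name, round(r, 3))
```

A real analysis would use many windows, reference maps derived from measured data (for example expression or chromatin accessibility), and possibly a mixture model rather than a simple ranking. The sketch only shows how a periodicity summary and a correlation step fit together.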

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biotechnology (AREA)
  • Analytical Chemistry (AREA)
  • Organic Chemistry (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Theoretical Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Public Health (AREA)
  • Immunology (AREA)
  • Data Mining & Analysis (AREA)
  • Epidemiology (AREA)
  • Databases & Information Systems (AREA)
  • Microbiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biochemistry (AREA)
  • Pathology (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Bioethics (AREA)
  • Cell Biology (AREA)
  • Primary Health Care (AREA)
  • Signal Processing (AREA)
  • Hospice & Palliative Care (AREA)
US15/329,228 2014-07-25 2015-07-27 Methods of determining tissues and/or cell types giving rise to cell-free dna, and methods of identifying a disease or disorder using same Abandoned US20170211143A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/329,228 US20170211143A1 (en) 2014-07-25 2015-07-27 Methods of determining tissues and/or cell types giving rise to cell-free dna, and methods of identifying a disease or disorder using same

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201462029178P 2014-07-25 2014-07-25
US201462087619P 2014-12-04 2014-12-04
US15/329,228 US20170211143A1 (en) 2014-07-25 2015-07-27 Methods of determining tissues and/or cell types giving rise to cell-free dna, and methods of identifying a disease or disorder using same
PCT/US2015/042310 WO2016015058A2 (fr) 2014-07-25 2015-07-27 Methods of determining tissues and/or cell types giving rise to cell-free dna, and methods of identifying a disease or disorder using same

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2015/042310 A-371-Of-International WO2016015058A2 (fr) 2014-07-25 2015-07-27 Methods of determining tissues and/or cell types giving rise to cell-free dna, and methods of identifying a disease or disorder using same

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US16/160,990 Continuation US20190127794A1 (en) 2014-07-25 2018-10-15 Methods of determining tissues and/or cell types giving rise to cell-free dna, and methods of identifying a disease or disorder using same
US16/160,330 Continuation US10477023B1 (en) 2014-07-25 2018-10-15 Using enhanced answering machine detection (“AMD”) to detect reassigned numbers

Publications (1)

Publication Number Publication Date
US20170211143A1 true US20170211143A1 (en) 2017-07-27

Family

ID=55163988

Family Applications (4)

Application Number Title Priority Date Filing Date
US15/329,228 Abandoned US20170211143A1 (en) 2014-07-25 2015-07-27 Methods of determining tissues and/or cell types giving rise to cell-free dna, and methods of identifying a disease or disorder using same
US16/160,990 Abandoned US20190127794A1 (en) 2014-07-25 2018-10-15 Methods of determining tissues and/or cell types giving rise to cell-free dna, and methods of identifying a disease or disorder using same
US16/880,884 Active US11352670B2 (en) 2014-07-25 2020-05-21 Methods of determining tissues and/or cell types giving rise to cell-free DNA, and methods of identifying a disease or disorder using same
US17/805,656 Pending US20230212672A1 (en) 2014-07-25 2022-06-06 Methods of determining tissues and/or cell types giving rise to cell-free dna, and methods of identifying a disease or disorder using same

Family Applications After (3)

Application Number Title Priority Date Filing Date
US16/160,990 Abandoned US20190127794A1 (en) 2014-07-25 2018-10-15 Methods of determining tissues and/or cell types giving rise to cell-free dna, and methods of identifying a disease or disorder using same
US16/880,884 Active US11352670B2 (en) 2014-07-25 2020-05-21 Methods of determining tissues and/or cell types giving rise to cell-free DNA, and methods of identifying a disease or disorder using same
US17/805,656 Pending US20230212672A1 (en) 2014-07-25 2022-06-06 Methods of determining tissues and/or cell types giving rise to cell-free dna, and methods of identifying a disease or disorder using same

Country Status (8)

Country Link
US (4) US20170211143A1 (fr)
EP (2) EP4358097A1 (fr)
JP (3) JP2017522908A (fr)
KR (2) KR102441391B1 (fr)
CN (2) CN107002122B (fr)
AU (2) AU2015292311B2 (fr)
CA (1) CA2956208A1 (fr)
WO (1) WO2016015058A2 (fr)

Cited By (51)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9850523B1 (en) 2016-09-30 2017-12-26 Guardant Health, Inc. Methods for multi-resolution analysis of cell-free nucleic acids
US9920366B2 (en) 2013-12-28 2018-03-20 Guardant Health, Inc. Methods and systems for detecting genetic variants
US10041127B2 (en) 2012-09-04 2018-08-07 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
WO2018112100A3 (fr) * 2016-12-13 2018-09-13 Bellwether Bio, Inc. Determining a physiological condition in an individual by analyzing cell-free DNA fragment endpoints in a biological sample
CN109448783A (zh) * 2018-08-07 2019-03-08 清华大学 Method for analyzing chromatin topological domain boundaries
US10240209B2 (en) 2015-02-10 2019-03-26 The Chinese University Of Hong Kong Detecting mutations for cancer screening
US10364467B2 (en) 2015-01-13 2019-07-30 The Chinese University Of Hong Kong Using size and number aberrations in plasma DNA for detecting cancer
US10453556B2 (en) 2015-07-23 2019-10-22 The Chinese University Of Hong Kong Analysis of fragmentation patterns of cell-free DNA
WO2019222657A1 (fr) * 2018-05-18 2019-11-21 The Johns Hopkins University Cell-free DNA for assessing and/or treating cancer
WO2020006369A1 (fr) * 2018-06-29 2020-01-02 Guardant Health, Inc. Methods and systems for analyzing CTCF binding regions in cell-free DNA
CN110739027A (zh) * 2019-10-23 2020-01-31 深圳吉因加医学检验实验室 Method and system for localizing cancer tissue based on chromatin region coverage depth
US10633713B2 (en) 2017-01-25 2020-04-28 The Chinese University Of Hong Kong Diagnostic applications using nucleic acid fragments
US10636512B2 (en) 2017-07-14 2020-04-28 Cofactor Genomics, Inc. Immuno-oncology applications using next generation sequencing
US10704086B2 (en) 2014-03-05 2020-07-07 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
CN111386362A (zh) * 2017-11-27 2020-07-07 深圳华大生命科学研究院 Library construction method for cell-free DNA from body fluid and application thereof
WO2020160414A1 (fr) 2019-01-31 2020-08-06 Guardant Health, Inc. Compositions and methods for isolating cell-free DNA
US10741270B2 (en) 2012-03-08 2020-08-11 The Chinese University Of Hong Kong Size-based analysis of cell-free tumor DNA for classifying level of cancer
WO2020243722A1 (fr) 2019-05-31 2020-12-03 Guardant Health, Inc. Methods and systems for improving patient monitoring after surgery
US10894974B2 (en) 2012-09-04 2021-01-19 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
CN112424372A (zh) * 2019-01-24 2021-02-26 Illumina公司 Methods and systems for monitoring organ health and disease
WO2021067484A1 (fr) 2019-09-30 2021-04-08 Guardant Health, Inc. Compositions and methods for analyzing cell-free DNA in methylation partitioning assays
WO2021108708A1 (fr) 2019-11-26 2021-06-03 Guardant Health, Inc. Methods, compositions and systems for improving the binding of methylated polynucleotides
US11062789B2 (en) 2014-07-18 2021-07-13 The Chinese University Of Hong Kong Methylation pattern analysis of tissues in a DNA mixture
US11062791B2 (en) 2016-09-30 2021-07-13 Guardant Health, Inc. Methods for multi-resolution analysis of cell-free nucleic acids
WO2021141220A1 (fr) * 2020-01-09 2021-07-15 서울대학교 산학협력단 Normalization of ATAC-seq data and method of using same
WO2021222828A1 (fr) 2020-04-30 2021-11-04 Guardant Health, Inc. Methods for sequence determination using partitioned nucleic acids
US11211147B2 (en) 2020-02-18 2021-12-28 Tempus Labs, Inc. Estimation of circulating tumor fraction using off-target reads of targeted-panel sequencing
US11211144B2 (en) 2020-02-18 2021-12-28 Tempus Labs, Inc. Methods and systems for refining copy number variation in a liquid biopsy assay
WO2022026761A1 (fr) 2020-07-30 2022-02-03 Guardant Health, Inc. Compositions and methods for isolating cell-free DNA
US11242569B2 (en) 2015-12-17 2022-02-08 Guardant Health, Inc. Methods to determine tumor gene copy number by analysis of cell-free DNA
US11261494B2 (en) 2012-06-21 2022-03-01 The Chinese University Of Hong Kong Method of measuring a fractional concentration of tumor DNA
WO2022046947A1 (fr) 2020-08-25 2022-03-03 Guardant Health, Inc. Methods and systems for predicting the origin of a variant
WO2022073012A1 (fr) 2020-09-30 2022-04-07 Guardant Health, Inc. Compositions and methods for analyzing DNA using partitioning and a methylation-dependent nuclease
WO2022087309A1 (fr) 2020-10-23 2022-04-28 Guardant Health, Inc. Compositions and methods for analyzing DNA using partitioning and base conversion
WO2022115810A1 (fr) 2020-11-30 2022-06-02 Guardant Health, Inc. Compositions and methods for enriching methylated polynucleotides
WO2022140629A1 (fr) 2020-12-23 2022-06-30 Guardant Health, Inc. Methods and systems for analyzing methylated polynucleotides
US11410750B2 (en) 2018-09-27 2022-08-09 Grail, Llc Methylation markers and targeted methylation probe panel
WO2022174109A1 (fr) 2021-02-12 2022-08-18 Guardant Health, Inc. Methods and compositions for detecting nucleic acid variants
US11435339B2 (en) 2016-11-30 2022-09-06 The Chinese University Of Hong Kong Analysis of cell-free DNA in urine
WO2022204730A1 (fr) 2021-03-25 2022-09-29 Guardant Health, Inc. Methods and compositions for quantifying immune cell DNA
US11475981B2 (en) 2020-02-18 2022-10-18 Tempus Labs, Inc. Methods and systems for dynamic variant thresholding in a liquid biopsy assay
US11514289B1 (en) * 2016-03-09 2022-11-29 Freenome Holdings, Inc. Generating machine learning models using genetic data
WO2023282916A1 (fr) 2021-07-09 2023-01-12 Guardant Health, Inc. Methods for detecting genomic rearrangements using cell-free nucleic acids
US11781183B2 (en) * 2018-03-13 2023-10-10 Yissum Research Development Company Of The Hebrew University Of Jerusalem Ltd. Diagnostic use of cell free DNA chromatin immunoprecipitation
WO2023197004A1 (fr) 2022-04-07 2023-10-12 Guardant Health, Inc. Detecting the presence of a tumor based on the methylation status of cell-free nucleic acid molecules
US11810672B2 (en) * 2017-10-12 2023-11-07 Nantomics, Llc Cancer score for assessment and response prediction from biological fluids
US11821027B2 (en) 2017-01-10 2023-11-21 Juno Therapeutics, Inc. Epigenetic analysis of cell therapy and related methods
WO2024006908A1 (fr) 2022-06-30 2024-01-04 Guardant Health, Inc. Enrichment of aberrantly methylated DNA
US11913065B2 (en) 2012-09-04 2024-02-27 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
WO2024056720A1 (fr) 2022-09-13 2024-03-21 Medizinische Universität Graz Health status determination and treatment monitoring with cell-free DNA
WO2024073508A2 (fr) 2022-09-27 2024-04-04 Guardant Health, Inc. Methods and compositions for quantifying immune cell DNA

Families Citing this family (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012129363A2 (fr) 2011-03-24 2012-09-27 President And Fellows Of Harvard College Detection and analysis of nucleic acid from a single cell
AU2015292311B2 (en) 2014-07-25 2022-01-20 University Of Washington Methods of determining tissues and/or cell types giving rise to cell-free DNA, and methods of identifying a disease or disorder using same
WO2016123698A1 (fr) * 2015-02-06 2016-08-11 Uti Limited Partnership Diagnostic assay for post-transplant assessment of potential rejection of donor organs
ES2967443T3 (es) * 2016-07-06 2024-04-30 Guardant Health Inc Methods for fragmentome profiling of cell-free nucleic acids
WO2018013837A1 (fr) 2016-07-15 2018-01-18 The Regents Of The University Of California Methods of producing nucleic acid libraries
US11788135B2 (en) 2016-08-05 2023-10-17 The Broad Institute, Inc. Methods for genome characterization
US20200048711A1 (en) * 2016-10-12 2020-02-13 Bellwether Bio, Inc Determining cell type origin of circulating cell-free dna with molecular counting
IL265769B2 (en) 2016-10-19 2023-12-01 Univ Hong Kong Chinese Estimation of gestational age using methylation and size profile of maternal plasma DNA
WO2018081130A1 (fr) 2016-10-24 2018-05-03 The Chinese University Of Hong Kong Methods and systems for tumor detection
IL302912A (en) 2016-12-22 2023-07-01 Guardant Health Inc Methods and systems for analyzing nucleic acid molecules
US11535896B2 (en) 2017-05-15 2022-12-27 Katholieke Universiteit Leuven Method for analysing cell-free nucleic acids
WO2018227202A1 (fr) 2017-06-09 2018-12-13 Bellwether Bio, Inc. Determining cancer type in a subject by probabilistic modeling of circulating nucleic acid fragment endpoints
WO2018227211A1 (fr) * 2017-06-09 2018-12-13 Bellwether Bio, Inc. Diagnosis of cancer or other physiological conditions using circulating nucleic acid fragment sentinel endpoints
CN107545153B (zh) * 2017-10-25 2021-06-11 桂林电子科技大学 Nucleosome classification and prediction method based on a convolutional neural network
CN108061794B (zh) * 2017-12-25 2020-03-27 苏州大学 Stain-free, probe-free, non-destructive method for detecting the type and cycle of cells or microorganisms with cell-like structures
AU2019207900A1 (en) 2018-01-12 2020-07-09 Claret Bioscience, Llc Methods and compositions for analyzing nucleic acid
WO2019209884A1 (fr) * 2018-04-23 2019-10-31 Grail, Inc. Methods and systems for screening for conditions
US20190385700A1 (en) * 2018-06-04 2019-12-19 Guardant Health, Inc. Methods and systems for determining the cellular origin of cell-free nucleic acids
WO2019236726A1 (fr) 2018-06-06 2019-12-12 The Regents Of The University Of California Methods of producing nucleic acid libraries and compositions and kits for practicing same
CN108796056A (zh) * 2018-06-28 2018-11-13 元码基因科技(北京)股份有限公司 Method for tracing tissue of origin using target gene capture technology based on cell-free DNA
CN108913682A (zh) * 2018-07-05 2018-11-30 上海奥测医疗科技有限公司 Method for preparing a cfDNA reference material
SG11202100564PA (en) * 2018-07-23 2021-02-25 Univ Hong Kong Chinese Cell-free dna damage analysis and its clinical applications
EP3815005A4 (fr) * 2018-10-08 2022-03-30 Freenome Holdings, Inc. Profilage du facteur de transcription
CN111172263A (zh) * 2018-11-12 2020-05-19 北京医院 Reference material for non-invasive prenatal testing and preparation method thereof
US20200199685A1 (en) 2018-12-17 2020-06-25 Guardant Health, Inc. Determination of a physiological condition with nucleic acid fragment endpoints
ES2968457T3 (es) * 2018-12-19 2024-05-09 Univ Hong Kong Chinese Characteristics of the ends of circulating cell-free DNA
US11657897B2 (en) * 2018-12-31 2023-05-23 Nvidia Corporation Denoising ATAC-seq data with deep learning
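CN109841265B (zh) * 2019-02-22 2021-09-21 清华大学 Method and system for determining the tissue origin of plasma cell-free nucleic acid molecules using fragmentation patterns, and application thereof
WO2020198942A1 (fr) * 2019-03-29 2020-10-08 中国科学技术大学 Method and system for analyzing single-cell chromatin accessibility sequencing data based on peak clustering
CN110272985B (zh) * 2019-06-26 2021-08-17 广州市雄基生物信息技术有限公司 Tumor screening kit based on high-throughput sequencing of peripheral blood plasma cell-free DNA, and system and method thereof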
US20210189494A1 (en) * 2019-12-18 2021-06-24 The Chinese University Of Hong Kong Cell-free dna fragmentation and nucleases
WO2021126091A1 (fr) * 2019-12-19 2021-06-24 Agency For Science, Technology And Research Method of estimating circulating tumor DNA burden and related kits and methods
US20230042332A1 (en) 2019-12-24 2023-02-09 Vib Vzw Disease Detection in Liquid Biopsies
AU2021205853A1 (en) * 2020-01-08 2023-11-23 Grail, Inc. Biterminal dna fragment types in cell-free samples and uses thereof
CN111254194B (zh) * 2020-01-13 2021-09-07 东南大学 Cancer-related biomarkers based on cfDNA sequencing and data analysis, and their use in classifying cfDNA samples
WO2021163630A1 (fr) * 2020-02-13 2021-08-19 10X Genomics, Inc. Systems and methods for joint interactive visualization of gene expression and DNA chromatin accessibility
CN111724860B (zh) * 2020-06-18 2021-03-16 深圳吉因加医学检验实验室 Method and device for identifying open chromatin regions based on sequencing data
CN111881418B (zh) * 2020-07-27 2023-05-16 中国农业科学院农业信息研究所 Bisection-based method and system for predicting soybean meteorological yield
CN112085067B (zh) * 2020-08-17 2022-07-12 浙江大学 Method for high-throughput screening of DNA damage response inhibitors
EP4214329A1 (fr) * 2020-09-17 2023-07-26 The Regents of the University of Colorado, a body corporate Signatures in cell-free DNA to detect disease, track treatment response, and inform treatment decisions
TW202242145A (zh) * 2020-12-29 2022-11-01 比利時商比利時意志有限公司 Transcription factor binding site analysis of nucleosome-depleted circulating cell-free chromatin fragments
EP4291681A1 (fr) * 2021-02-09 2023-12-20 Illumina, Inc. Methods and systems for monitoring organ health and disease onset
AU2022255198A1 (en) * 2021-04-08 2023-11-23 Fred Hutchinson Cancer Center Cell-free dna sequence data analysis method to examine nucleosome protection and chromatin accessibility
IL307524A (en) * 2021-04-08 2023-12-01 Delfi Diagnostics Inc A method for cancer detection using genome-wide CFDNA fragility profiles
WO2022271730A1 (fr) 2021-06-21 2022-12-29 Guardant Health, Inc. Methods and compositions for copy-number-informed tissue-of-origin analysis
WO2023056065A1 (fr) 2021-09-30 2023-04-06 Guardant Health, Inc. Compositions and methods for synthesis and use of probes targeting nucleic acid rearrangements
GB202205710D0 (en) 2022-04-19 2022-06-01 Univ Of Essex Enterprises Limited Cell-free DNA-based methods

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130230858A1 (en) * 2012-03-02 2013-09-05 Sequenom, Inc. Methods and processes for non-invasive assessment of genetic variations

Family Cites Families (55)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB0016742D0 (en) * 2000-07-10 2000-08-30 Simeg Limited Diagnostic method
JP2002272497A (ja) 2001-03-15 2002-09-24 Venture Link Co Ltd Method for diagnosing cancer and diagnostic vector therefor
US6927028B2 (en) 2001-08-31 2005-08-09 Chinese University Of Hong Kong Non-invasive methods for detecting non-host DNA in a host using epigenetic differences between the host and non-host DNA
AU2004217872B2 (en) 2003-03-05 2010-03-25 Genetic Technologies Limited Identification of fetal DNA and fetal cell markers in maternal plasma or serum
EP1524321B2 (fr) 2003-10-16 2014-07-23 Sequenom, Inc. Détection non invasive de traits génétiques létaux
US20070122823A1 (en) 2005-09-01 2007-05-31 Bianchi Diana W Amniotic fluid cell-free fetal DNA fragment size pattern for prenatal diagnosis
TR201910868T4 (tr) 2006-02-02 2019-08-21 Univ Leland Stanford Junior Non-invasive fetal genetic screening by digital analysis.
WO2007100911A2 (fr) 2006-02-28 2007-09-07 University Of Louisville Research Foundation Detecting chromosomal abnormalities using tandem single nucleotide polymorphisms
TWI335354B (en) 2006-09-27 2011-01-01 Univ Hong Kong Chinese Methods for the detection of the degree of the methylation of a target dna and kits
US7842482B2 (en) 2007-02-26 2010-11-30 The Chinese University Of Hong Kong Methods and kits for diagnosis, prognosis or monitoring of Epstein-Barr virus (EBV)-associated cancer
US20100112590A1 (en) 2007-07-23 2010-05-06 The Chinese University Of Hong Kong Diagnosing Fetal Chromosomal Aneuploidy Using Genomic Sequencing With Enrichment
CA3176319A1 (fr) 2007-07-23 2009-01-29 The Chinese University Of Hong Kong Analyse d'adn tumoral dans un echantillon acellulaire
US20090053719A1 (en) 2007-08-03 2009-02-26 The Chinese University Of Hong Kong Analysis of nucleic acids by digital pcr
US8748100B2 (en) 2007-08-30 2014-06-10 The Chinese University Of Hong Kong Methods and kits for selectively amplifying, detecting or quantifying target DNA with specific end sequences
WO2009051842A2 (fr) 2007-10-18 2009-04-23 The Johns Hopkins University Detection of cancer by measuring genomic copy number and strand length in cell-free DNA
US20100041048A1 (en) 2008-07-31 2010-02-18 The Johns Hopkins University Circulating Mutant DNA to Assess Tumor Dynamics
US8476013B2 (en) * 2008-09-16 2013-07-02 Sequenom, Inc. Processes and compositions for methylation-based acid enrichment of fetal nucleic acid from a maternal sample useful for non-invasive prenatal diagnoses
EP2334812B1 (en) 2008-09-20 2016-12-21 The Board of Trustees of The Leland Stanford Junior University Noninvasive diagnosis of fetal aneuploidy by sequencing
US8835110B2 (en) 2008-11-04 2014-09-16 The Johns Hopkins University DNA integrity assay (DIA) for cancer diagnostics, using confocal fluorescence spectroscopy
SG174401A1 (en) 2009-03-31 2011-11-28 Oridis Biomarkers Gmbh Method for diagnosis of cancer and monitoring of cancer treatments
WO2011053790A2 (en) 2009-10-30 2011-05-05 Fluidigm Corporation Assay of closely linked targets in fetal diagnosis and coincidence detection assay for genetic analysis
HUE034854T2 (en) 2009-11-05 2018-03-28 Univ Hong Kong Chinese Fetal genomic analysis from maternal biological samples
EP3406737B1 (en) 2009-11-06 2023-05-31 The Chinese University of Hong Kong Size-based genomic analysis
US20120010085A1 (en) 2010-01-19 2012-01-12 Rava Richard P Methods for determining fraction of fetal nucleic acids in maternal samples
US9260745B2 (en) 2010-01-19 2016-02-16 Verinata Health, Inc. Detecting and classifying copy number variation
US20130210645A1 (en) * 2010-02-18 2013-08-15 The Johns Hopkins University Personalized tumor biomarkers
CN103370456A (en) * 2010-08-24 2013-10-23 比奥Dx股份有限公司 Defining diagnostic and therapeutic targets of conserved free-floating fetal DNA in maternal circulating blood
EP2426217A1 (en) 2010-09-03 2012-03-07 Centre National de la Recherche Scientifique (CNRS) Analytical methods for cell-free nucleic acids and applications
NZ611599A (en) 2010-11-30 2015-05-29 Univ Hong Kong Chinese Detection of genetic or molecular aberrations associated with cancer
WO2013043922A1 (en) 2011-09-22 2013-03-28 ImmuMetrix, LLC Compositions and methods for analyzing heterogeneous samples
US10196681B2 (en) 2011-10-06 2019-02-05 Sequenom, Inc. Methods and processes for non-invasive assessment of genetic variations
EP2771483A1 (en) 2011-10-25 2014-09-03 ONCOTYROL - Center for Personalized Cancer Medicine GmbH Method for diagnosing a disease based on plasma DNA distribution
US9757458B2 (en) 2011-12-05 2017-09-12 Immunomedics, Inc. Crosslinking of CD22 by epratuzumab triggers BCR signaling and caspase-dependent apoptosis in hematopoietic cancer cells
CN104334742A (en) 2012-01-27 2015-02-04 The Board of Trustees of The Leland Stanford Junior University Methods for profiling and quantitating cell-free RNA
US9892230B2 (en) 2012-03-08 2018-02-13 The Chinese University Of Hong Kong Size-based analysis of fetal or tumor DNA fraction in plasma
EP3573066B1 (en) 2012-03-13 2023-09-27 The Chinese University Of Hong Kong Methods for analyzing massively parallel sequencing data for noninvasive prenatal diagnosis
CA2874195C (en) * 2012-05-21 2020-08-25 Sequenom, Inc. Methods and processes for non-invasive assessment of genetic variations
US20150105267A1 (en) * 2012-05-24 2015-04-16 University Of Washington Through Its Center For Commercialization Whole genome sequencing of a human fetus
US11261494B2 (en) * 2012-06-21 2022-03-01 The Chinese University Of Hong Kong Method of measuring a fractional concentration of tumor DNA
IL305303A (en) 2012-09-04 2023-10-01 Guardant Health Inc Systems and methods for detecting rare mutations and changes in number of copies
WO2014039729A1 (en) * 2012-09-05 2014-03-13 Stamatoyannopoulos John A Methods and compositions related to regulation of nucleic acids
US9732390B2 (en) 2012-09-20 2017-08-15 The Chinese University Of Hong Kong Non-invasive determination of methylome of fetus or tumor from plasma
EP3354747B1 (en) 2012-09-20 2021-02-17 The Chinese University Of Hong Kong Non-invasive determination of methylome of tumor from plasma
AU2014268377B2 (en) * 2013-05-24 2020-10-08 Sequenom, Inc. Methods and processes for non-invasive assessment of genetic variations
LT3004388T (en) * 2013-05-29 2019-01-25 Chronix Biomedical Detection and quantification of donor cell-free DNA in the circulation of organ transplant recipients
US10262755B2 (en) 2014-04-21 2019-04-16 Natera, Inc. Detecting cancer mutations and aneuploidy in chromosomal segments
CN106460070B (en) * 2014-04-21 2021-10-08 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
CA2950596C (en) * 2014-05-30 2023-10-31 Verinata Health, Inc. Detection of fetal sub-chromosomal aneuploidies and copy number variations
CN106795562B (en) 2014-07-18 2022-03-25 The Chinese University Of Hong Kong Methylation pattern analysis of tissues in a DNA mixture
AU2015292311B2 (en) 2014-07-25 2022-01-20 University Of Washington Methods of determining tissues and/or cell types giving rise to cell-free DNA, and methods of identifying a disease or disorder using same
US10364467B2 (en) 2015-01-13 2019-07-30 The Chinese University Of Hong Kong Using size and number aberrations in plasma DNA for detecting cancer
US11242559B2 (en) 2015-01-13 2022-02-08 The Chinese University Of Hong Kong Method of nuclear DNA and mitochondrial DNA analysis
US10319463B2 (en) 2015-01-23 2019-06-11 The Chinese University Of Hong Kong Combined size- and count-based analysis of maternal plasma for detection of fetal subchromosomal aberrations
SG11201706529TA (en) 2015-02-10 2017-09-28 Univ Hong Kong Chinese Detecting mutations for cancer screening and fetal analysis
HUE064231T2 (en) 2015-07-23 2024-02-28 Univ Hong Kong Chinese Analysis of fragmentation patterns of cell-free DNA

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130230858A1 (en) * 2012-03-02 2013-09-05 Sequenom, Inc. Methods and processes for non-invasive assessment of genetic variations

Cited By (114)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10741270B2 (en) 2012-03-08 2020-08-11 The Chinese University Of Hong Kong Size-based analysis of cell-free tumor DNA for classifying level of cancer
US11031100B2 (en) 2012-03-08 2021-06-08 The Chinese University Of Hong Kong Size-based sequencing analysis of cell-free tumor DNA for classifying level of cancer
US11261494B2 (en) 2012-06-21 2022-03-01 The Chinese University Of Hong Kong Method of measuring a fractional concentration of tumor DNA
US11319598B2 (en) 2012-09-04 2022-05-03 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10961592B2 (en) 2012-09-04 2021-03-30 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11879158B2 (en) 2012-09-04 2024-01-23 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11773453B2 (en) 2012-09-04 2023-10-03 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11001899B1 (en) 2012-09-04 2021-05-11 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10457995B2 (en) 2012-09-04 2019-10-29 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11319597B2 (en) 2012-09-04 2022-05-03 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10494678B2 (en) 2012-09-04 2019-12-03 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10501810B2 (en) 2012-09-04 2019-12-10 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10501808B2 (en) 2012-09-04 2019-12-10 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11913065B2 (en) 2012-09-04 2024-02-27 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10837063B2 (en) 2012-09-04 2020-11-17 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10947600B2 (en) 2012-09-04 2021-03-16 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10738364B2 (en) 2012-09-04 2020-08-11 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10683556B2 (en) 2012-09-04 2020-06-16 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11434523B2 (en) 2012-09-04 2022-09-06 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10894974B2 (en) 2012-09-04 2021-01-19 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10995376B1 (en) 2012-09-04 2021-05-04 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10876152B2 (en) 2012-09-04 2020-12-29 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10876171B2 (en) 2012-09-04 2020-12-29 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10876172B2 (en) 2012-09-04 2020-12-29 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10793916B2 (en) 2012-09-04 2020-10-06 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10041127B2 (en) 2012-09-04 2018-08-07 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10822663B2 (en) 2012-09-04 2020-11-03 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11667967B2 (en) 2013-12-28 2023-06-06 Guardant Health, Inc. Methods and systems for detecting genetic variants
US10801063B2 (en) 2013-12-28 2020-10-13 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11149307B2 (en) 2013-12-28 2021-10-19 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11639525B2 (en) 2013-12-28 2023-05-02 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11149306B2 (en) 2013-12-28 2021-10-19 Guardant Health, Inc. Methods and systems for detecting genetic variants
US9920366B2 (en) 2013-12-28 2018-03-20 Guardant Health, Inc. Methods and systems for detecting genetic variants
US10883139B2 (en) 2013-12-28 2021-01-05 Guardant Health, Inc. Methods and systems for detecting genetic variants
US10889858B2 (en) 2013-12-28 2021-01-12 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11434531B2 (en) 2013-12-28 2022-09-06 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11959139B2 (en) 2013-12-28 2024-04-16 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11639526B2 (en) 2013-12-28 2023-05-02 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11649491B2 (en) 2013-12-28 2023-05-16 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11118221B2 (en) 2013-12-28 2021-09-14 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11767556B2 (en) 2013-12-28 2023-09-26 Guardant Health, Inc. Methods and systems for detecting genetic variants
US11767555B2 (en) 2013-12-28 2023-09-26 Guardant Health, Inc. Methods and systems for detecting genetic variants
US10870880B2 (en) 2014-03-05 2020-12-22 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11667959B2 (en) 2014-03-05 2023-06-06 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10982265B2 (en) 2014-03-05 2021-04-20 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11447813B2 (en) 2014-03-05 2022-09-20 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10704085B2 (en) 2014-03-05 2020-07-07 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US10704086B2 (en) 2014-03-05 2020-07-07 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11091797B2 (en) 2014-03-05 2021-08-17 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11091796B2 (en) 2014-03-05 2021-08-17 Guardant Health, Inc. Systems and methods to detect rare mutations and copy number variation
US11984195B2 (en) 2014-07-18 2024-05-14 The Chinese University Of Hong Kong Methylation pattern analysis of tissues in a DNA mixture
US11062789B2 (en) 2014-07-18 2021-07-13 The Chinese University Of Hong Kong Methylation pattern analysis of tissues in a DNA mixture
US10364467B2 (en) 2015-01-13 2019-07-30 The Chinese University Of Hong Kong Using size and number aberrations in plasma DNA for detecting cancer
US10240209B2 (en) 2015-02-10 2019-03-26 The Chinese University Of Hong Kong Detecting mutations for cancer screening
US11168370B2 (en) 2015-02-10 2021-11-09 The Chinese University Of Hong Kong Detecting mutations for cancer screening
US10453556B2 (en) 2015-07-23 2019-10-22 The Chinese University Of Hong Kong Analysis of fragmentation patterns of cell-free DNA
US11615865B2 (en) 2015-07-23 2023-03-28 The Chinese University Of Hong Kong Analysis of fragmentation patterns of cell-free DNA
US11581063B2 (en) 2015-07-23 2023-02-14 The Chinese University Of Hong Kong Analysis of fragmentation patterns of cell-free DNA
US11605445B2 (en) 2015-07-23 2023-03-14 The Chinese University Of Hong Kong Analysis of fragmentation patterns of cell-free DNA
US11242569B2 (en) 2015-12-17 2022-02-08 Guardant Health, Inc. Methods to determine tumor gene copy number by analysis of cell-free DNA
US11514289B1 (en) * 2016-03-09 2022-11-29 Freenome Holdings, Inc. Generating machine learning models using genetic data
US9850523B1 (en) 2016-09-30 2017-12-26 Guardant Health, Inc. Methods for multi-resolution analysis of cell-free nucleic acids
US11817177B2 (en) 2016-09-30 2023-11-14 Guardant Health, Inc. Methods for multi-resolution analysis of cell-free nucleic acids
US11817179B2 (en) 2016-09-30 2023-11-14 Guardant Health, Inc. Methods for multi-resolution analysis of cell-free nucleic acids
US11062791B2 (en) 2016-09-30 2021-07-13 Guardant Health, Inc. Methods for multi-resolution analysis of cell-free nucleic acids
US11435339B2 (en) 2016-11-30 2022-09-06 The Chinese University Of Hong Kong Analysis of cell-free DNA in urine
WO2018112100A3 (en) * 2016-12-13 2018-09-13 Bellwether Bio, Inc. Determining a physiological condition in an individual by analyzing cell-free DNA fragment endpoints in a biological sample
US11821027B2 (en) 2017-01-10 2023-11-21 Juno Therapeutics, Inc. Epigenetic analysis of cell therapy and related methods
US11479825B2 (en) 2017-01-25 2022-10-25 The Chinese University Of Hong Kong Diagnostic applications using nucleic acid fragments
US10633713B2 (en) 2017-01-25 2020-04-28 The Chinese University Of Hong Kong Diagnostic applications using nucleic acid fragments
US10636512B2 (en) 2017-07-14 2020-04-28 Cofactor Genomics, Inc. Immuno-oncology applications using next generation sequencing
US11810672B2 (en) * 2017-10-12 2023-11-07 Nantomics, Llc Cancer score for assessment and response prediction from biological fluids
CN111386362A (en) * 2017-11-27 2020-07-07 深圳华大生命科学研究院 Library construction method for cell-free DNA in body fluids and application thereof
US11781183B2 (en) * 2018-03-13 2023-10-10 Yissum Research Development Company Of The Hebrew University Of Jerusalem Ltd. Diagnostic use of cell free DNA chromatin immunoprecipitation
WO2019222657A1 (en) * 2018-05-18 2019-11-21 The Johns Hopkins University Cell-free DNA for assessing and/or treating cancer
CN112805563A (en) * 2018-05-18 2021-05-14 The Johns Hopkins University Cell-free DNA for assessing and/or treating cancer
US10975431B2 (en) 2018-05-18 2021-04-13 The Johns Hopkins University Cell-free DNA for assessing and/or treating cancer
US10982279B2 (en) 2018-05-18 2021-04-20 The Johns Hopkins University Cell-free DNA for assessing and/or treating cancer
WO2020006369A1 (en) * 2018-06-29 2020-01-02 Guardant Health, Inc. Methods and systems for analysis of CTCF binding regions in cell-free DNA
CN109448783A (en) * 2018-08-07 2019-03-08 Tsinghua University Method for analyzing boundaries of chromatin topological domains
US11725251B2 (en) 2018-09-27 2023-08-15 Grail, Llc Methylation markers and targeted methylation probe panel
US11410750B2 (en) 2018-09-27 2022-08-09 Grail, Llc Methylation markers and targeted methylation probe panel
US11685958B2 (en) 2018-09-27 2023-06-27 Grail, Llc Methylation markers and targeted methylation probe panel
US11795513B2 (en) 2018-09-27 2023-10-24 Grail, Llc Methylation markers and targeted methylation probe panel
US20210310067A1 (en) * 2019-01-24 2021-10-07 Illumina, Inc. Methods and systems for monitoring organ health and disease
CN112424372A (en) * 2019-01-24 2021-02-26 Illumina, Inc. Methods and systems for monitoring organ health and disease
WO2020160414A1 (en) 2019-01-31 2020-08-06 Guardant Health, Inc. Compositions and methods for isolating cell-free DNA
US11643693B2 (en) 2019-01-31 2023-05-09 Guardant Health, Inc. Compositions and methods for isolating cell-free DNA
WO2020243722A1 (en) 2019-05-31 2020-12-03 Guardant Health, Inc. Methods and systems for improving patient monitoring after surgery
US11939636B2 (en) 2019-05-31 2024-03-26 Guardant Health, Inc. Methods and systems for improving patient monitoring after surgery
WO2021067484A1 (en) 2019-09-30 2021-04-08 Guardant Health, Inc. Compositions and methods for analyzing cell-free DNA in methylation partitioning assays
US11891653B2 (en) 2019-09-30 2024-02-06 Guardant Health, Inc. Compositions and methods for analyzing cell-free DNA in methylation partitioning assays
CN110739027A (en) * 2019-10-23 2020-01-31 深圳吉因加医学检验实验室 Method and system for localizing cancer tissue based on chromatin region coverage depth
WO2021108708A1 (en) 2019-11-26 2021-06-03 Guardant Health, Inc. Methods, compositions and systems for improving the binding of methylated polynucleotides
WO2021141220A1 (en) * 2020-01-09 2021-07-15 Seoul National University R&DB Foundation Normalization of ATAC-seq data and method of using same
US11211147B2 (en) 2020-02-18 2021-12-28 Tempus Labs, Inc. Estimation of circulating tumor fraction using off-target reads of targeted-panel sequencing
US11475981B2 (en) 2020-02-18 2022-10-18 Tempus Labs, Inc. Methods and systems for dynamic variant thresholding in a liquid biopsy assay
US11211144B2 (en) 2020-02-18 2021-12-28 Tempus Labs, Inc. Methods and systems for refining copy number variation in a liquid biopsy assay
WO2021222828A1 (en) 2020-04-30 2021-11-04 Guardant Health, Inc. Methods for sequence determination using partitioned nucleic acids
WO2022026761A1 (en) 2020-07-30 2022-02-03 Guardant Health, Inc. Compositions and methods for isolating cell-free DNA
WO2022046947A1 (en) 2020-08-25 2022-03-03 Guardant Health, Inc. Methods and systems for predicting the origin of a variant
US11946106B2 (en) 2020-09-30 2024-04-02 Guardant Health, Inc. Methods and systems to improve the signal to noise ratio of DNA methylation partitioning assays
WO2022073011A1 (en) 2020-09-30 2022-04-07 Guardant Health, Inc. Methods and systems to improve the signal to noise ratio of DNA methylation partitioning assays
WO2022073012A1 (en) 2020-09-30 2022-04-07 Guardant Health, Inc. Compositions and methods for analyzing DNA using partitioning and a methylation-dependent nuclease
WO2022087309A1 (en) 2020-10-23 2022-04-28 Guardant Health, Inc. Compositions and methods for analyzing DNA using partitioning and base conversion
WO2022115810A1 (en) 2020-11-30 2022-06-02 Guardant Health, Inc. Compositions and methods for enrichment of methylated polynucleotides
WO2022140629A1 (en) 2020-12-23 2022-06-30 Guardant Health, Inc. Methods and systems for analyzing methylated polynucleotides
WO2022174109A1 (en) 2021-02-12 2022-08-18 Guardant Health, Inc. Methods and compositions for detecting nucleic acid variants
WO2022204730A1 (en) 2021-03-25 2022-09-29 Guardant Health, Inc. Methods and compositions for quantifying immune cell DNA
WO2023282916A1 (en) 2021-07-09 2023-01-12 Guardant Health, Inc. Methods for detecting genomic rearrangements using cell-free nucleic acids
WO2023197004A1 (en) 2022-04-07 2023-10-12 Guardant Health, Inc. Detecting the presence of a tumor based on methylation status of cell-free nucleic acid molecules
WO2024006908A1 (en) 2022-06-30 2024-01-04 Guardant Health, Inc. Enrichment of aberrantly methylated DNA
WO2024056720A1 (en) 2022-09-13 2024-03-21 Medizinische Universität Graz Health status determination and treatment monitoring with cell-free DNA
WO2024073508A2 (en) 2022-09-27 2024-04-04 Guardant Health, Inc. Methods and compositions for quantifying immune cell DNA

Also Published As

Publication number Publication date
AU2022202587A1 (en) 2022-05-26
EP3172341A2 (fr) 2017-05-31
JP2023123420A (ja) 2023-09-05
WO2016015058A2 (fr) 2016-01-28
EP3172341A4 (fr) 2018-03-28
KR20170044660A (ko) 2017-04-25
JP2021045161A (ja) 2021-03-25
US20210010081A1 (en) 2021-01-14
US20230212672A1 (en) 2023-07-06
AU2015292311B2 (en) 2022-01-20
KR20220127359A (ko) 2022-09-19
CN107002122B (zh) 2023-09-19
CA2956208A1 (fr) 2016-01-28
CN117402950A (zh) 2024-01-16
EP4358097A1 (fr) 2024-04-24
CN107002122A (zh) 2017-08-01
JP2017522908A (ja) 2017-08-17
AU2015292311A1 (en) 2017-03-09
US20190127794A1 (en) 2019-05-02
US11352670B2 (en) 2022-06-07
KR102441391B1 (ko) 2022-09-07
WO2016015058A3 (fr) 2016-03-17

Similar Documents

Publication Publication Date Title
US11352670B2 (en) Methods of determining tissues and/or cell types giving rise to cell-free DNA, and methods of identifying a disease or disorder using same
Snyder et al. Cell-free DNA comprises an in vivo nucleosome footprint that informs its tissues-of-origin
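CN113096726B (en) Using cell-free DNA fragment size to determine copy number variations
JP6829211B2 (en) Detecting mutations for cancer screening and fetal analysis
CN105518151B (en) Identification and use of circulating nucleic acid tumor markers
JP7022188B2 (en) Methods for multi-resolution analysis of cell-free nucleic acids
RU2018121254A (en) High-efficiency construction of DNA libraries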
JP2019504618A5 (fr)
Liu At the dawn: cell-free DNA fragmentomics and gene regulation
US20240110238A1 (en) Methods for genome characterization
AU2015288920A1 (en) A method for detecting a genetic variant
KR102029393B1 (en) Method for detecting circulating tumor DNA in a sample containing cell-free DNA and use thereof
US20190309374A1 (en) Determining a physiological condition in an individual by analyzing cell-free DNA fragment endpoints in a biological sample
US20220403467A1 (en) Determining cell type origin of circulating cell-free DNA with molecular counting
US20200255905A1 (en) Diagnosis of cancer or other physiological condition using circulating nucleic acid fragment sentinel endpoints
Zhou Fragmentomic and Epigenetic Analyses for Cell-Free DNA Molecules
WO2024056722A1 (en) Health status determination with circulating cell-free DNA using cis-regulatory elements and interaction networks
CN108796056A (en) Method for tissue-of-origin tracing based on target gene capture of cell-free DNA

Legal Events

Date Code Title Description
AS Assignment

Owner name: UNIVERSITY OF WASHINGTON, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SHENDURE, JAY;SNYDER, MATTHEW;KIRCHER, MARTIN;REEL/FRAME:045434/0720

Effective date: 20150812

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION