US20160371428A1 - Systems and methods for determining aneuploidy risk using sample fetal fraction - Google Patents

Info

Publication number
US20160371428A1
Authority
US
United States
Prior art keywords
fetal fraction
target sample
genetic data
maternal
gestational age
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US15/186,774
Inventor
Allison Ryan
Katie Kobara
Zachary Demko
Susan J. Gross
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Natera Inc
Original Assignee
Natera Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Natera Inc
Priority to US15/186,774
Assigned to NATERA, INC.: assignment of assignors interest (see document for details). Assignors: GROSS, SUSAN J.; DEMKO, ZACHARY; KOBARA, KATIE; RYAN, ALLISON
Publication of US20160371428A1
Assigned to ORBIMED ROYALTY OPPORTUNITIES II, LP: security interest (see document for details). Assignor: NATERA, INC.
Assigned to NATERA, INC.: release by secured party (see document for details). Assignor: ORBIMED ROYALTY OPPORTUNITIES II, LP
Priority to US17/365,786 (US20210327542A1)
Current legal status: Abandoned

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G06F19/18
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/10 Ploidy or copy number detection

Definitions

  • the present invention generally relates to molecular biology methods and systems, and more specifically to methods and systems for determining aneuploidy risk in a target maternal blood sample.
  • Noninvasive prenatal testing using cell-free DNA can be used to detect abnormalities in a fetus.
  • cfDNA cell-free DNA
  • Noninvasive prenatal testing is used to determine the genetic state of a fetus from genetic material that is obtained in a noninvasive manner, for example from a blood draw on the pregnant mother.
  • the blood could be separated and the plasma isolated, and size selection can optionally be used to isolate the DNA of the appropriate length.
  • This isolated DNA can then be measured by a number of means, such as by hybridizing to a genotyping array and measuring the fluorescence, or by sequencing on a high throughput sequencer.
  • SNP based noninvasive prenatal testing is one type of noninvasive prenatal test. SNP based noninvasive prenatal testing is often used to screen for fetal aneuploidies. But the accuracy of SNP based tests is dependent on the amount of fetal DNA present in a maternal blood or plasma sample. SNP based testing returns no call results when the amount of fetal DNA is not sufficient to provide the desired accuracy.
  • Low amounts of fetal DNA may be caused by a number of factors.
  • One common factor is maternal weight. For example, as maternal weight increases, the amount of fetal DNA in maternal blood plasma, or other fluids often decreases. Thus, SNP based noninvasive prenatal tests that screen for fetal aneuploidies are sometimes unavailable to pregnant women.
  • FIG. 1 illustrates a fetal fraction distribution, according to an example embodiment.
  • FIG. 2 illustrates a log normal fetal fraction distribution, according to an example embodiment.
  • FIG. 3A illustrates a generated model for 19 weeks gestational age, according to an example embodiment.
  • FIG. 3B illustrates a generated model for 13 weeks gestational age, according to an example embodiment.
  • FIG. 4 is a flow chart of a method according to one embodiment of the invention.
  • FIG. 5 illustrates an example computer system for performing embodiments of the present invention.
  • FIG. 6 illustrates an example system for performing embodiments of the present invention.
  • FIG. 7 illustrates a posterior fetal fraction risk distribution, according to an example embodiment.
  • FIG. 8 illustrates a result set for an example embodiment for fetal fraction-based high risk assessment that predicts an aneuploidy in cases with low fetal fraction.
  • FIG. 9 illustrates a redraw success rate distribution, according to an example embodiment.
  • FIG. 10 illustrates a distribution of fetal fraction based risk scores in cases identified as high risk and low fetal fraction, according to an example embodiment.
  • FIG. 11A illustrates an estimated detection rate for trisomy 13 and 18, according to an example embodiment.
  • FIG. 11B illustrates an estimated detection rate for digynic triploidy, according to an example embodiment.
  • FIG. 12 illustrates a probability density function (PDF) of normalized euploid data, according to an example embodiment.
  • PDF probability density function
  • FIG. 13 illustrates a cumulative distribution function (CDF) of normalized euploid data, according to an example embodiment.
  • CDF cumulative distribution function
  • FIG. 14 illustrates a plot of redraw success rate, according to an example embodiment.
  • FIG. 15 illustrates a result set for identified high risk samples, according to an example embodiment.
  • Provided herein are system, method, and/or computer program product embodiments for determining aneuploidy risk in a target sample of maternal blood, plasma, or other fluid based on the amount of fetal DNA.
  • Such embodiments may be used in situations where a low or extremely low fetal fraction renders traditional aneuploidy risk methodologies inconclusive or inaccurate.
  • such embodiments may be used to determine the risk of trisomy 13, trisomy 18, or maternal triploidy, which are all aneuploidies associated with a low or extremely low fetal fraction.
  • An embodiment operates by receiving genetic data for a target sample (sample of interest) of maternal blood, plasma, or other fluid, and known genetic data from a plurality of known noninvasive prenatal testing samples.
  • a fetal fraction distribution is determined for the received known genetic data based on gestational age and the maternal weight associated with the target sample.
  • a model for a plurality of ploidy states is then generated based on a fixed ratio reduction of the determined fetal fraction distribution.
  • a fetal fraction based data likelihood for the target sample is then determined for each of the plurality of ploidy states using the generated model and the fetal fraction associated with the target sample.
  • An aneuploidy risk score is then output for the target sample based on applying a Bayesian probability determination that combines each fetal fraction based data likelihood with a previously determined risk score as a conditional value. Accordingly, aneuploidy risk can be determined in a target sample even when the sample contains a low amount of fetal DNA. This allows aneuploidy risk to be determined even when SNP based noninvasive prenatal testing would be unreliable or unavailable.
  • obtaining genetic data refers to both, unless indicated otherwise by context, (1) acquiring DNA sequence information by laboratory techniques, e.g. use of an automated high throughput DNA sequencer, and (2) acquiring information that had been previously obtained by laboratory techniques, wherein the information is electronically transmitted to an analyzer, e.g. by computer over the Internet, by electronic transfer from the sequencing device, etc.
  • Aneuploidy refers to the state where the wrong number of chromosomes is present in a cell.
  • In the case of a somatic human cell, it refers to the case where a cell does not contain 22 pairs of autosomal chromosomes and one pair of sex chromosomes.
  • In the case of a human gamete, it refers to the case where a cell does not contain one of each of the 23 chromosomes.
  • In the case of a single chromosome, it refers to the case where more or fewer than two homologous but nonidentical chromosomes are present, and where each of the two chromosomes originates from a different parent.
  • Ploidy state refers to the quantity and chromosomal identity of one or more chromosomes in a cell.
  • Certain aneuploidies are often associated with a reduced amount of fetal DNA in the target sample.
  • trisomy 13, trisomy 18, and maternal triploidy are often associated with a reduced amount of fetal DNA in the target sample.
  • Embodiments of the invention determine aneuploidy risk in a target maternal sample based on a relationship between the amount of fetal DNA in the target sample and the presence of certain aneuploidies.
  • FIG. 6 illustrates a system for performing embodiments of the present invention.
  • FIG. 6 includes an analysis system 602 for determining a risk of fetal aneuploidy.
  • Analysis system 602 may include one or more processors for executing the functions described herein. Such functions may be implemented on the processor as engines or logical elements that perform the analytical functionality described herein, such as a modeling engine and a probability engine. Interaction of a user with such analytical engines may be conducted through an appropriate user interface.
  • Analysis system 602 may be coupled to a database of known samples 604 via, for example, a network 610 .
  • Network 610 may be any type of communication network, including intranets, local area networks, or wide area networks such as the Internet. Genetic data from samples with a known ploidy state may be used to form a baseline for comparison with a target sample in question, as discussed further below.
  • Database 604 may be a collection of data from a variety of sources including clinical studies and commercial data sets.
  • a fetal fraction distribution is defined for such known genetic data from the plurality of known prenatal testing samples by analysis system 602 .
  • the fetal fraction distribution may be based on the maternal weight and the gestational age corresponding to each sample. This is because gestational age and maternal weight are often factors for the amount of fetal DNA present in a maternal blood sample.
  • a fetal fraction distribution may be defined for known genetic data from a plurality of noninvasive known prenatal testing samples.
  • the plurality of known prenatal testing samples may be selected based on various criteria to ensure an accurate and representative fetal fraction distribution.
  • known genetic data for a known prenatal testing sample may be selected or filtered for inclusion in the fetal fraction distribution based on an associated low aneuploidy risk result, a no call result due to low fetal fraction, and a low confidence result.
  • Known genetic data for a known prenatal testing sample may also be selected based on whether the maternal weight associated with the sample is available or whether the sample was collected in a clinical trial in the United States or a foreign country. A selection based on country of origin may be done to prevent unit conversion uncertainty in maternal weight for the sample.
  • known genetic data for a plurality of known prenatal testing samples may be grouped into sets according to gestational age and maternal weight.
  • known genetic data for the plurality of known prenatal testing samples may include sample data taken at a gestational age ranging from 9 to 20 weeks at one week increments.
  • Known genetic data for the plurality of known prenatal testing samples may also include sample data corresponding to a maternal weight ranging from 110 to 250 pounds at 20 pounds increments.
  • sampling of the known genetic data from known prenatal testing samples may be accurate to within plus or minus ten days of gestational age and plus or minus five pounds of maternal weight.
  • the average fetal fraction is computed for the known genetic data in each set of known prenatal testing samples.
  • the standard deviation may also be computed.
  • the average fetal fraction and standard deviation are computed only for sets of known prenatal testing samples containing at least 50 samples. This may be done to ensure an accurate and representative fetal fraction distribution.
  • the result is a grid of distribution parameters (e.g. average fetal fractions and standard deviations) that correspond to the grid of sample conditions.
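The grouping described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the embodiments: the names `nearest_bin` and `fetal_fraction_grid`, the tuple layout of the input, and the choice to snap each sample to its nearest grid point are all hypothetical. The grid limits (9 to 20 weeks in one week steps, 110 to 250 pounds in 20 pound steps) and the 50 sample minimum are taken from the text.

```python
from collections import defaultdict
from statistics import mean, stdev

MIN_SAMPLES = 50  # minimum set size stated in the text


def nearest_bin(value, start, stop, step):
    """Snap value to the nearest grid point; return None if it falls off the grid."""
    idx = round((value - start) / step)
    point = start + idx * step
    return point if start <= point <= stop else None


def fetal_fraction_grid(samples):
    """Build {(gestational_age_weeks, weight_lb): (mean_ff, stdev_ff)}.

    samples: iterable of (gestational_age_weeks, maternal_weight_lb, fetal_fraction).
    Snapping to the nearest grid point is an assumption made for illustration.
    """
    bins = defaultdict(list)
    for ga, weight, ff in samples:
        ga_bin = nearest_bin(ga, 9, 20, 1)         # 9 to 20 weeks, 1 week steps
        wt_bin = nearest_bin(weight, 110, 250, 20)  # 110 to 250 lb, 20 lb steps
        if ga_bin is not None and wt_bin is not None:
            bins[(ga_bin, wt_bin)].append(ff)
    # Keep only sets large enough to give a representative estimate.
    return {
        key: (mean(values), stdev(values))
        for key, values in bins.items()
        if len(values) >= MIN_SAMPLES
    }
```

Sets with fewer than 50 samples are simply absent from the returned grid, so a caller can tell that no distribution parameters are available for that condition.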
  • FIG. 1 illustrates an example fetal fraction distribution based on known genetic data from a plurality of known prenatal testing samples grouped according to gestational age and maternal weight, according to an example embodiment.
  • a set of known prenatal testing samples is grouped together based on the gestational ages of 9 weeks, 12 weeks, and 18 weeks.
  • each set of known prenatal testing samples is further grouped together based on maternal weight.
  • the average fetal fraction is computed for the known genetic data of each resulting set of known prenatal testing samples.
  • the average fetal fraction of prenatal testing samples with a maternal weight of 200 lbs. and a gestational age of 9 weeks is around 0.06.
  • the average fetal fraction of prenatal testing samples with a maternal weight of 200 lbs. and a gestational age of 12 weeks is around 0.07.
  • the average fetal fraction of prenatal testing samples with a maternal weight of 200 lbs. and a gestational age of 18 weeks is around 0.08.
  • a fetal fraction distribution may become more symmetric when transformed to log space. Therefore, modeling of fetal fraction may be conducted in log space.
  • the fetal fraction distribution may be transformed to a log-normal distribution.
  • the fetal fraction distribution may be transformed to a continuous probability distribution of the fetal fraction whose logarithm is normally distributed.
  • the logarithm of the fetal fraction is assumed Gaussian distributed with a mean and standard deviation that are a function of gestational age and maternal weight for the known genetic data of the known prenatal testing samples.
  • FIG. 2 illustrates an example log normal fetal fraction distribution based on the transformation of a fetal fraction distribution to log space, according to an example embodiment.
  • the logarithm of the fetal fraction is assumed Gaussian distributed with a mean and standard deviation that are a function of gestational age and maternal weight for the known prenatal testing samples.
  • the gestational age is 10 weeks plus or minus 10 days and the maternal weight is 230 pounds plus or minus 5 pounds.
  • the log normal fetal fraction distribution represents a probability density function (PDF) that describes the relative likelihood for fetal fraction to take on a given value where the gestational age is around 10 weeks plus or minus 10 days and the maternal weight is 230 pounds plus or minus 5 pounds.
  • the probability of having an aneuploidy can be computed from the log normal fetal fraction distribution. Specifically, the probability of having an aneuploidy can be computed as the integral of the PDF over a defined range.
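As a rough sketch of how a log-normal PDF can be evaluated and integrated over a range, the Python below uses only the standard library. The function names, the use of the natural logarithm, and the parameters `mu` and `sigma` (mean and standard deviation of the log fetal fraction) are assumptions introduced for illustration; the text does not specify them.

```python
import math


def lognormal_pdf(ff, mu, sigma):
    """Density of fetal fraction ff when log(ff) is Normal(mu, sigma)."""
    z = (math.log(ff) - mu) / sigma
    return math.exp(-0.5 * z * z) / (ff * sigma * math.sqrt(2 * math.pi))


def lognormal_cdf(ff, mu, sigma):
    """P(fetal fraction <= ff) under the same model."""
    z = (math.log(ff) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))


def probability_in_range(lo, hi, mu, sigma):
    """Integral of the PDF over [lo, hi], computed as a difference of CDF values."""
    return lognormal_cdf(hi, mu, sigma) - lognormal_cdf(lo, mu, sigma)
```

Using the closed-form CDF avoids numerical integration; the range limits themselves would have to come from the embodiment in question.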
  • the effect of an aneuploidy may be modeled as a fixed rate reduction in the average fetal fraction compared to the expected average fetal fraction for a given maternal weight and gestational age.
  • the average fetal fraction of a trisomy 13 pregnancy may be 80% of the average fetal fraction for a euploid pregnancy of the same maternal weight and gestational age.
  • Trisomy 13, trisomy 18, and maternal triploidy may be modeled using a fixed rate reduction in the average fetal fraction.
  • the effect of an aneuploidy may be modeled according to various other reductions in the average fetal fraction compared to the expected average fetal fraction for a given maternal weight and gestational age.
  • a model may be generated for a plurality of ploidy states based on the fixed ratio reduction of the fetal fraction distribution.
  • a ploidy state may be referred to as a hypothesis.
  • a fetal fraction distribution may be transformed to a log-normal distribution of fetal fraction prior to generation of a model.
  • a model may be generated for three hypotheses: trisomy 13, trisomy 18, and maternal triploidy.
  • a fixed rate reduction in the average fetal fraction corresponds to a constant subtracted offset.
  • the log fetal fraction for euploid prenatal testing samples is Gaussian distributed with a mean m and a standard deviation s.
  • the log fetal fraction for prenatal testing samples with an aneuploidy is Gaussian distributed with a mean m-c and a standard deviation s, where c is a constant subtracted offset for a given aneuploidy.
  • a constant subtracted offset for a given aneuploidy may be determined by an analysis of empirical data.
  • the constant subtracted offset for trisomies 13 and 18 is log(0.79). In other words, in this example, the means of the trisomy 13 and 18 hypothesis distributions are reduced by log(0.79).
  • the constant subtracted offset for maternal triploidy is log(0.22).
  • the mean of the maternal triploidy hypothesis distribution is reduced by log(0.22).
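The link between a fixed ratio reduction and a constant offset in log space can be written out as below. This is a sketch under two assumptions that the text does not state explicitly: that the aneuploid fetal fraction behaves like the euploid fetal fraction scaled by a fixed ratio r, and that log denotes the natural logarithm. The symbols F_eu, F_an, and r are introduced here for illustration; m and s are the euploid log-space mean and standard deviation from the text. "Reduced by log(0.79)" is read here as shifting the mean by log 0.79, which is a negative number.

```latex
\begin{align*}
\log F_{\text{eu}} &\sim \mathcal{N}(m,\, s^2), \qquad F_{\text{an}} = r\,F_{\text{eu}}, \quad 0 < r < 1 \\
\log F_{\text{an}} &= \log F_{\text{eu}} + \log r \;\sim\; \mathcal{N}(m + \log r,\, s^2) \\
\log 0.79 &\approx -0.236 \quad \text{(trisomy 13 and trisomy 18)} \\
\log 0.22 &\approx -1.514 \quad \text{(maternal triploidy)}
\end{align*}
```

Under this reading, scaling the fetal fraction moves the log-space mean and leaves the log-space spread unchanged.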
  • analysis system 602 may also be coupled to a database 606 containing genetic data for a target sample, either directly or over network 610 .
  • Genetic data about the target sample, stored in database 606, may have been obtained from, for example, a sequencer 608 .
  • the target sample is one for which a fetal aneuploidy risk is to be determined. While the examples herein will refer to maternal blood, one of skill in the art will recognize that the target sample may be, for example, a maternal blood or plasma sample containing both maternal DNA and fetal DNA. Such DNA may be, for example, cell-free DNA. As would be appreciated by a person of ordinary skill in the art, a target maternal blood sample that contains fetal DNA may be obtained using various methods.
  • the obtained prenatal target sample is modified using standard molecular biology techniques in order to be sequenced on a DNA sequencer, such as sequencer 608 .
  • the technique will involve forming a genetic library containing priming sites for the DNA sequencing procedure.
  • a plurality of loci may be targeted for site specific amplification.
  • the targeted loci are polymorphic loci, e.g., single nucleotide polymorphisms.
  • libraries may be encoded using a DNA sequence that is specific for the patient, e.g. barcoding, thereby permitting multiple patients to be analyzed in a single flow cell (or flow cell equivalent) of a high throughput DNA sequencer.
  • although the samples are mixed together in the DNA sequencer flow cell, the determination of the sequence of the barcode permits identification of the patient source that contributed the DNA that had been sequenced.
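The barcode-based identification described above can be illustrated with a short Python sketch. It assumes, purely for illustration, that each read begins with a fixed-length barcode and that barcodes are matched exactly; the function name `demultiplex` and the read and barcode values in the usage note are hypothetical and do not come from the text.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_patient, barcode_len):
    """Group pooled reads by patient using a leading barcode of fixed length."""
    by_patient = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        patient = barcode_to_patient.get(barcode)
        if patient is not None:  # reads with an unrecognized barcode are skipped
            by_patient[patient].append(insert)
    return dict(by_patient)
```

For example, with the made-up mapping `{"ACGT": "patient_A", "TTAG": "patient_B"}` and a barcode length of 4, the read `"ACGTTTGACC"` would be assigned to `patient_A` with insert `"TTGACC"`.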
  • Methods are known in the art for obtaining genetic data from a sample. Typically this involves amplification of DNA in the sample, a process which transforms a small amount of genetic material to a larger amount of genetic material that contains a similar set of genetic data. This can be done by a wide variety of methods, including, but not limited to, Polymerase Chain Reaction (PCR), ligation-mediated PCR, degenerate oligonucleotide primer PCR, Multiple Displacement Amplification, allele-specific amplification techniques, Molecular Inversion Probes (MIP), padlock probes, other circularizing probes, and combinations thereof.
  • PCR Polymerase Chain Reaction
  • MIP Molecular Inversion Probes
  • the DNA amplification transforms the initial sample of DNA into a sample of DNA that is similar in the set of sequences, but of much greater quantity. In some cases, amplification may not be required.
  • the genetic data of the target sample can be transformed from a molecular state to an electronic state by measuring the appropriate genetic material using tools and/or techniques taken from a group including, but not limited to, genotyping microarrays and high throughput sequencing.
  • Some high throughput sequencing methods and systems include Sanger DNA sequencing, pyrosequencing, the ILLUMINA SOLEXA platform, ILLUMINA's GENOME ANALYZER, ILLUMINA's HISEQ or MISEQ, APPLIED BIOSYSTEM's SOLiD platform, ION TORRENT'S PGM or PROTON platforms, HELICOS's TRUE SINGLE MOLECULE SEQUENCING platform, HALCYON MOLECULAR's electron microscope sequencing method, or any other sequencing method. All of these methods physically transform the genetic data stored in a sample of DNA into a set of genetic data that is typically stored in a memory device en route to being processed.
  • a fetal fraction based data likelihood for a target sample may be computed by analysis system 602 for each ploidy state (e.g., trisomy 13, trisomy 18, and maternal triploidy) using the generated model and the fetal fraction associated with the target sample, where each ploidy state corresponds to a hypothesis.
  • a fetal fraction based data likelihood for a target sample may be computed for each hypothesis (e.g. trisomy 13, trisomy 18, maternal triploidy, etc.) by evaluating the Gaussian probability density function at the observed log value of the fetal fraction associated with the target sample at each of the three hypotheses.
  • FIG. 3A illustrates an example of a generated model for trisomy 13, trisomy 18, and maternal triploidy based on a fixed ratio reduction of a determined fetal fraction distribution, according to an embodiment.
  • FIG. 3A illustrates an example of a generated model for trisomy 13, trisomy 18, and maternal triploidy where the gestational age is 19 weeks and the maternal weight is 166 pounds.
  • a fetal fraction based data likelihood for a target sample with a gestational age of 19 weeks and a maternal weight of 166 pounds may be computed for trisomy 13, trisomy 18, and maternal triploidy by evaluating the respective Gaussian probability density function at the observed log value of the fetal fraction associated with the target sample.
  • the fetal fraction based data likelihood of trisomy 13 or trisomy 18 for a target sample with a fetal fraction of 0.10, a maternal weight of 166 pounds, and a gestational age of 19 weeks is around 35%.
  • the fetal fraction based data likelihood of trisomy 13 or trisomy 18 for a target sample with a fetal fraction of 0.20, a maternal weight of 166 pounds, and a gestational age of 19 weeks is around 10%.
  • FIG. 3B illustrates an example of a generated model for trisomy 13, trisomy 18, and maternal triploidy based on a fixed ratio reduction of a determined fetal fraction distribution, according to an embodiment.
  • FIG. 3B illustrates an example of a generated model for trisomy 13, trisomy 18, and maternal triploidy where the gestational age is 13 weeks and the maternal weight is 166 pounds.
  • a fetal fraction based data likelihood for a target sample with a gestational age of 13 weeks and a maternal weight of 166 pounds may be computed for trisomy 13, trisomy 18, and maternal triploidy by evaluating the respective Gaussian probability density function at the observed log value of the fetal fraction associated with the target sample.
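The likelihood evaluation described above can be sketched in Python as follows. The ratios 0.79 and 0.22 are the example values given in the text; everything else is an assumption made for illustration: the function names, the natural logarithm, the shared log-space standard deviation across hypotheses, and the reading that each hypothesis mean is the euploid mean shifted by the log of its ratio.

```python
import math

# Example ratios from the text: 0.79 for trisomies 13 and 18, 0.22 for maternal triploidy.
RATIOS = {"trisomy 13": 0.79, "trisomy 18": 0.79, "maternal triploidy": 0.22}


def normal_pdf(x, mean, sd):
    """Gaussian probability density at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def hypothesis_likelihoods(fetal_fraction, euploid_log_mean, log_sd):
    """Evaluate each hypothesis density at the observed log fetal fraction.

    euploid_log_mean and log_sd are the log-space parameters for the target
    sample's gestational age and maternal weight (assumed to be given).
    """
    x = math.log(fetal_fraction)
    return {
        name: normal_pdf(x, euploid_log_mean + math.log(ratio), log_sd)
        for name, ratio in RATIOS.items()
    }
```

The returned values are densities rather than probabilities, so they are meaningful mainly relative to one another and to a prior.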
  • an aneuploidy risk score for the fetus associated with the target sample may be determined.
  • each fetal fraction based data likelihood can be combined with a previously determined risk score in order to determine the aneuploidy risk score for the fetus associated with the target sample.
  • a previously determined risk score may be, for example, an age based prior risk score for the mother associated with the target sample.
  • a previously determined risk score may be a SNP-based prior risk score.
  • a previously determined risk score may be based on other prior risk factors, including a combination of prior risk factors.
  • an aneuploidy risk score for the fetus associated with the target sample may be determined based on the posterior probability of the presence of any of trisomy 13, trisomy 18, and maternal triploidy.
  • the fetal fraction based data likelihoods may be combined with previously determined risk scores for trisomy 13, trisomy 18, and maternal triploidy using Bayes' theorem to determine an aneuploidy risk score for the fetus associated with the target sample.
  • the previously determined risk scores for trisomy 13 and trisomy 18 depend on maternal age and gestational age and may be determined empirically.
  • the previously determined risk score for maternal triploidy is 1/5505.
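A minimal Python sketch of the Bayesian combination is shown below. The prior of 1/5505 for maternal triploidy comes from the text. The inclusion of an explicit euploid hypothesis, the requirement that priors sum to one, and the names `posterior_risk` and `aneuploidy_risk_score` are assumptions made so that the example is complete; the text does not spell out these details.

```python
TRIPLOIDY_PRIOR = 1 / 5505  # prior for maternal triploidy given in the text


def posterior_risk(likelihoods, priors):
    """Apply Bayes' theorem over mutually exclusive hypotheses.

    likelihoods, priors: dicts keyed by hypothesis name. The priors are
    assumed to sum to 1, with an explicit euploid hypothesis included.
    """
    joint = {h: likelihoods[h] * priors[h] for h in priors}
    total = sum(joint.values())
    return {h: joint[h] / total for h in joint}


def aneuploidy_risk_score(posterior, euploid_key="euploid"):
    """Posterior probability that any modeled aneuploidy is present."""
    return 1.0 - posterior[euploid_key]
```

In this sketch the risk score is the combined posterior of all non-euploid hypotheses, which matches the idea of a risk based on the presence of any of the modeled aneuploidies.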
  • FIG. 4 is a flowchart of a method 400 for determining aneuploidy risk in a target maternal blood sample, according to an embodiment.
  • Method 400 can be performed by processing logic that can comprise hardware (e.g., circuitry, dedicated logic, programmable logic, microcode, etc.), software (e.g., instructions run on a processing device), or a combination thereof.
  • processing logic may be implemented in, for example, analysis system 602 .
  • In step 402 of FIG. 4, known genetic data from a plurality of known noninvasive prenatal testing samples is received.
  • the known genetic data from the plurality of known prenatal testing samples may be received from a variety of sources including clinical studies and commercial data sets.
  • a fetal fraction distribution may be defined for known genetic data from a plurality of noninvasive known prenatal testing samples, a plurality of invasive known prenatal testing samples, or a combination of both.
  • the received known genetic data from the plurality of known prenatal testing samples may be optionally filtered based on various criteria to ensure that an accurate and representative fetal fraction distribution is determined in step 406 .
  • known genetic data for the known prenatal testing samples may be filtered based on an associated low aneuploidy risk result, a no call result due to low fetal fraction, and a low confidence result.
  • the received known genetic data for the known prenatal testing samples may also be filtered based on whether the maternal weight associated with a sample is available or whether a sample was collected in a clinical trial in the United States or a foreign country. The filtering based on country of origin may be done to prevent unit conversion uncertainty in maternal weight for a sample.
  • In step 404 of FIG. 4, genetic data for a target maternal blood sample containing fetal DNA is received.
  • the genetic data includes at least a gestational age of the associated fetus, a maternal weight, and a fetal DNA fraction of the target sample.
  • a target maternal blood sample that contains fetal DNA may be obtained using various methods.
  • In step 406, a fetal fraction distribution is determined for the known genetic data from step 402 .
  • the determined fetal fraction distribution is based on the maternal weight and the gestational age associated with the target blood sample of step 404 .
  • the received known genetic data for the plurality of known prenatal testing samples is grouped into sets according to gestational age and maternal weight. As discussed above, the sampling of the known genetic data from known prenatal testing samples may be done at various intervals for gestational age and maternal weight.
  • the average fetal fraction is then computed.
  • the average fetal fraction may only be computed where a set of known prenatal testing samples includes a minimum number of 50 samples. This may be done to ensure an accurate and representative fetal fraction distribution.
  • the fetal fraction distribution is transformed to a log-normal distribution.
  • the logarithm of the fetal fraction is assumed Gaussian distributed with a mean and standard deviation that are a function of gestational age and maternal weight for the received known genetic data of step 402 .
  • the log normal fetal fraction distribution represents a PDF that describes the relative likelihood for fetal fraction to take on a given value where the gestational age and the maternal weight are equal to those associated with the received genetic data for the target sample of step 404.
  • a model is generated for a plurality of ploidy states based on the log-normal distribution of fetal fraction of step 408 .
  • trisomy 13, trisomy 18, and maternal triploidy distributions are generated from the log-normal distribution of fetal fraction of step 408. This involves reducing the mean of each of the trisomy 13, trisomy 18, and maternal triploidy distributions by a respective constant subtracted offset. As would be appreciated by a person of ordinary skill in the art, the constant subtracted offsets for the trisomy 13, trisomy 18, and maternal triploidy distributions may be determined experimentally.
  • fetal fraction based data likelihoods for the received target sample of step 404 are computed for each of the ploidy states using the generated model of step 410 and the fetal fraction associated with the target sample.
  • a fetal fraction based data likelihood for the received target sample is computed for trisomy 13, trisomy 18, and maternal triploidy by evaluating the Gaussian probability density functions for trisomy 13, trisomy 18, and maternal triploidy at the observed log value of the fetal fraction associated with the target sample.
  • a Bayesian probability determination is applied to combine the fetal fraction based data likelihoods of step 412 with previously determined risk scores.
  • a previously determined risk score may be an age based prior risk score for the mother associated with the target sample or an SNP-based prior risk score.
  • aneuploidy risk scores for trisomy 13, trisomy 18, and maternal triploidy are output based on the applying in step 414 .
  • the outputting may be performed using various methods and mediums.
  • the aneuploidy risk scores for trisomy 13, trisomy 18, and maternal triploidy are independently determined. Because each aneuploidy risk score is an independent posterior probability of the presence of either trisomy 13, trisomy 18, or maternal triploidy, the resulting aneuploidy risk scores can be compared to identify the most likely ploidy state.
  • a probability that the sample is euploid is also determined and taken into account.
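The Bayesian combination of a prior risk with the fetal fraction based likelihoods can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes each aneuploidy is weighed against the euploid hypothesis on its own, which is one reading of "independently determined", and the function names `posterior_risk` and `is_high_risk` are hypothetical. The 1/100 cutoff is the high risk threshold mentioned in connection with FIG. 7.

```python
HIGH_RISK_CUTOFF = 1 / 100  # cutoff stated for FIG. 7


def posterior_risk(prior, lik_aneuploid, lik_euploid):
    """Posterior probability of one aneuploidy versus euploidy (Bayes' rule).

    prior         -- previously determined risk (e.g. age based), in (0, 1)
    lik_aneuploid -- fetal fraction likelihood under the aneuploid hypothesis
    lik_euploid   -- fetal fraction likelihood under the euploid hypothesis
    """
    numerator = prior * lik_aneuploid
    denominator = numerator + (1.0 - prior) * lik_euploid
    if denominator == 0.0:
        raise ValueError("both likelihoods are zero; posterior is undefined")
    return numerator / denominator


def is_high_risk(risk, cutoff=HIGH_RISK_CUTOFF):
    """Flag a posterior risk that exceeds the high risk cutoff."""
    return risk > cutoff
```

With equal likelihoods the posterior equals the prior, so the fetal fraction only moves the risk score when the observed value is more probable under one hypothesis than under the other.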
  • an additional type of analysis is made available to individuals whose aneuploidy risk may not be able to be determined by traditional methods, such as SNP-based methods.
  • This analysis may also be used to confirm a previously determined risk score in situations where extremely low fetal fraction is an issue.
  • FIG. 7 illustrates a posterior fetal fraction risk distribution, according to an example embodiment.
  • a posterior risk distribution is computed by combining data likelihoods with prior risk for a gestational age between 9 and 11 weeks. The cutoff is at 1/100 risk. This sets the fetal fraction limit for a high risk call.
  • FIG. 8 illustrates a result set for a pilot study of an example embodiment for fetal fraction-based high risk assessment that predicts an aneuploidy in cases with low fetal fraction.
  • the result set of FIG. 8 indicates that the example embodiment for fetal fraction-based risk assessment is able to predict abnormalities in a clinical data sample set.
  • FIG. 8 illustrates some of the abnormalities detected in the pilot study.
  • FIG. 9 illustrates a redraw success rate distribution, according to an example embodiment.
  • FIG. 9 shows fetal fraction change observed from approximately 3,000 Non-Invasive Prenatal Testing (NIPT) redraws.
  • the example embodiment of FIG. 9 provides useful information when an embodiment for NIPT single-nucleotide polymorphism (SNP) fails to provide a prediction.
  • the example embodiment of FIG. 9 provides a fetal fraction-based risk score and a probability of successful call on redraw, making it possible to predict redraw success based on a predicted range of redraw fetal fraction.
  • FIG. 10 illustrates a distribution of fetal fraction based risk scores in cases identified as high risk and low fetal fraction, according to follow up study of an example embodiment.
  • FIG. 10 shows that roughly 5 cases had a fetal fraction based risk score of 0.2.
  • the objective was to test whether high fetal fraction-based risk predicts aneuploidy in cases with unusually low fetal fraction.
  • An attempt to collect follow up was made for 896 samples, where the adjusted fetal fraction was below approximately the 2nd percentile, and the maternal weight was available. 525 samples were eligible for inclusion in the follow up study, from domestic clinics and direct sales clinics.
  • 143 samples were identified as having high fetal fraction-based risk with low fetal fraction. In particular, the fetal fraction-based risk was greater than 0.01 and the fetal fraction was 2.5 SD below the mean.
  • Karyotype was available for 70 samples.
  • FIG. 11A illustrates an estimated detection rate for trisomy 13 and 18, according to an example embodiment. Specifically, FIG. 11A illustrates what fraction of affected cases that are not identified by a NIPT SNP embodiment will be identified by the fetal fraction-based risk score >1/100. The estimated detection rate is based on the sample data set of FIG. 10 . In FIG. 11A , the estimated detection rate for trisomy 13/18 is 91.4%.
  • FIG. 11B illustrates an estimated detection rate for digynic triploidy, according to an example embodiment. Specifically, FIG. 11B illustrates what fraction of affected cases that are not identified by a NIPT SNP embodiment will be identified by the fetal fraction-based risk score >1/100. The estimated detection rate is based on the sample data set of FIG. 10 . In FIG. 11B , the estimated detection rate for digynic triploidy is 96.6%. Retroactive application of such high risk fetal fraction criteria to 29,000 NIPT cases would have resulted in 432 high risk calls (1.5%). Application of the SNP method would result in 115 (0.4%) high risk calls (for T13, T18, digynic triploidy). This results in a 1.8% combined high risk call rate. The expected aneuploidy rate based on priors was 0.3%. The theoretical PPV was thus 16% (0.3%/1.8%).
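The call rate and PPV arithmetic in the preceding paragraph can be checked with a few lines. This is only a worked calculation on the figures quoted in the text; the combined call rate of 1.8% and the expected aneuploidy rate of 0.3% are taken as stated rather than derived, and the variable names are illustrative.

```python
total_cases = 29_000
ff_high_risk_calls = 432    # fetal fraction based high risk calls
snp_high_risk_calls = 115   # SNP method high risk calls

ff_call_rate = ff_high_risk_calls / total_cases     # about 1.5%
snp_call_rate = snp_high_risk_calls / total_cases   # about 0.4%

combined_call_rate = 0.018        # as stated in the text
expected_aneuploidy_rate = 0.003  # as stated in the text

# Theoretical PPV: the share of high risk calls expected to be true positives,
# assuming every expected aneuploid case receives a high risk call.
theoretical_ppv = expected_aneuploidy_rate / combined_call_rate

print(f"FF call rate:    {ff_call_rate:.1%}")
print(f"SNP call rate:   {snp_call_rate:.1%}")
print(f"Theoretical PPV: {theoretical_ppv:.1%}")
```

The ratio 0.3%/1.8% works out to one sixth, or about 16.7%, which the text reports as 16%.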
  • FIG. 12 illustrates a PDF of normalized euploid data, according to an example embodiment.
  • FIG. 12 shows empirical density plots of fetal fractions after normalization. There are 39 density curves. Each of the 39 density curves comes from a set of data with approximately the same maternal weight and gestational age, with between 400 and 500 samples each. Each data set is normalized by its observed mean and variance. The plot in FIG. 12 shows that the Gaussian fit is appropriate because the distributions are very similar.
  • FIG. 13 illustrates a CDF of the normalized euploid data of FIG. 12 , according to an example embodiment.
  • FIG. 13 shows empirical cumulative distribution plots of fetal fractions after normalization. There are 39 curves. Each of the 39 curves comes from a set of data with approximately the same maternal weight and gestational age, with between 400 and 500 samples each. Each data set is normalized by its observed mean and variance. The plot in FIG. 13 shows that the Gaussian fit is appropriate because the distributions are very similar.
  • FIG. 14 illustrates a plot of redraw success rate, according to an example embodiment. Specifically, FIG. 14 plots the redraw success rate against maternal weight bucket center. This plot shows that another characteristic of the fetal fraction distribution is the redraw success rate. Specifically, the ability to make a call is strongly dependent on fetal fraction and a successful redraw is often based on an increase in fetal fraction between the first and second draw. The ability to predict the probability of success for a redraw is often useful for doctors and patients. This is because many cases with low fetal fraction will not be at high risk for aneuploidy, but still have low probability of a successful redraw, and so other testing embodiments may be preferred.
  • FIG. 15 illustrates an example result set for identified high risk samples, according to an embodiment. Specifically, FIG. 15 illustrates a result set for 143 sample cases that were identified as having high extremely low fetal fraction (ELFF) risk based on not having received a successful high or low risk draw call, and having a computed ELFF risk score greater than 0.01. FIG. 15 further illustrates that follow-up results were successfully collected for 70 of these sample cases. Of these 70 sample cases, 7 were found to be aneuploid.
  • Computer system 500 can be any well-known computer capable of performing the functions described herein.
  • Computer system 500 includes one or more processors (also called central processing units, or CPUs), such as a processor 504.
  • Processor 504 is connected to a communication infrastructure or bus 506 .
  • One or more processors 504 may each be a graphics processing unit (GPU).
  • a GPU is a processor that is a specialized electronic circuit designed to process mathematically intensive applications.
  • the GPU may have a parallel structure that is efficient for parallel processing of large blocks of data, such as mathematically intensive data common to computer graphics applications, images, videos, etc.
  • Computer system 500 also includes user input/output device(s) 503 , such as monitors, keyboards, pointing devices, etc., that communicate with communication infrastructure 506 through user input/output interface(s) 502 .
  • Computer system 500 also includes a main or primary memory 508 , such as random access memory (RAM).
  • Main memory 508 may include one or more levels of cache.
  • Main memory 508 has stored therein control logic (i.e., computer software) and/or data.
  • Computer system 500 may also include one or more secondary storage devices or memory 510 .
  • Secondary memory 510 may include, for example, a hard disk drive 512 and/or a removable storage device or drive 514 .
  • Removable storage drive 514 may be a floppy disk drive, a magnetic tape drive, a compact disk drive, an optical storage device, tape backup device, and/or any other storage device/drive.
  • Removable storage drive 514 may interact with a removable storage unit 518 .
  • Removable storage unit 518 includes a computer usable or readable storage device having stored thereon computer software (control logic) and/or data.
  • Removable storage unit 518 may be a floppy disk, magnetic tape, compact disk, DVD, optical storage disk, and/or any other computer data storage device.
  • Removable storage drive 514 reads from and/or writes to removable storage unit 518 in a well-known manner.
  • secondary memory 510 may include other means, instrumentalities or other approaches for allowing computer programs and/or other instructions and/or data to be accessed by computer system 500 .
  • Such means, instrumentalities or other approaches may include, for example, a removable storage unit 522 and an interface 520 .
  • the removable storage unit 522 and the interface 520 may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM or PROM) and associated socket, a memory stick and USB port, a memory card and associated memory card slot, and/or any other removable storage unit and associated interface.
  • Computer system 500 may further include a communication or network interface 524 .
  • Communication interface 524 enables computer system 500 to communicate and interact with any combination of remote devices, remote networks, remote entities, etc. (individually and collectively referenced by reference number 528 ).
  • communication interface 524 may allow computer system 500 to communicate with remote devices 528 over communications path 526 , which may be wired and/or wireless, and which may include any combination of LANs, WANs, the Internet, etc. Control logic and/or data may be transmitted to and from computer system 500 via communication path 526 .
  • a tangible apparatus or article of manufacture comprising a tangible computer useable or readable medium having control logic (software) stored thereon is also referred to herein as a computer program product or program storage device.
  • control logic when executed by one or more data processing devices (such as computer system 500 ), causes such data processing devices to operate as described herein.
  • references herein to “one embodiment,” “an embodiment,” “an example embodiment,” or similar phrases indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it would be within the knowledge of persons skilled in the relevant art(s) to incorporate such feature, structure, or characteristic into other embodiments whether or not explicitly mentioned or described herein.

Abstract

Disclosed herein are system, method, and computer program product embodiments for determining aneuploidy risk in a target sample of maternal blood or plasma based on the amount of fetal DNA. An embodiment operates by receiving known genetic data from known prenatal testing samples and genetic data for the target sample. A fetal fraction distribution is determined for the known genetic data based on gestational age and the maternal weight associated with the target sample. A model is then generated based on a fixed ratio reduction of the determined fetal fraction distribution. A fetal fraction based data likelihood for the target sample is then determined for each of the plurality of ploidy states using the generated model. An aneuploidy risk score is then outputted based on applying a Bayesian probability determination that combines each fetal fraction based data likelihood with a previously determined risk score as a conditional value.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application claims the benefit of U.S. Provisional Patent Application No. 62/182,085, filed Jun. 19, 2015, which is hereby incorporated by reference herein in its entirety.
  • FIELD OF THE INVENTION
  • The present invention generally relates to molecular biology methods and systems, and more specifically to methods and systems for determining aneuploidy risk in a target maternal blood sample.
  • BACKGROUND
  • Noninvasive prenatal testing using cell-free DNA (cfDNA) can be used to detect abnormalities in a fetus. As a result, noninvasive prenatal testing is rapidly becoming part of clinical care for pregnant women.
  • Noninvasive prenatal testing is used to determine the genetic state of a fetus from genetic material that is obtained in a noninvasive manner, for example from a blood draw on the pregnant mother. The blood could be separated and the plasma isolated, and size selection can optionally be used to isolate the DNA of the appropriate length. This isolated DNA can then be measured by a number of means, such as by hybridizing to a genotyping array and measuring the fluorescence, or by sequencing on a high throughput sequencer.
  • Single Nucleotide Polymorphism (SNP) based noninvasive prenatal testing is one type of noninvasive prenatal test. SNP based noninvasive prenatal testing is often used to screen for fetal aneuploidies. But the accuracy of SNP based tests is dependent on the amount of fetal DNA present in a maternal blood or plasma sample. SNP based testing returns no call results when the amount of fetal DNA is not sufficient to provide the desired accuracy.
  • Low amounts of fetal DNA may be caused by a number of factors. One common factor is maternal weight. For example, as maternal weight increases, the amount of fetal DNA in maternal blood plasma, or other fluids often decreases. Thus, SNP based noninvasive prenatal tests that screen for fetal aneuploidies are sometimes unavailable to pregnant women.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The presently disclosed embodiments will be further explained with reference to the attached drawings, wherein like structures are referred to by like numerals throughout the several views. The drawings shown are not necessarily to scale, with emphasis instead generally being placed upon illustrating the principles of the presently disclosed embodiments.
  • FIG. 1 illustrates a fetal fraction distribution, according to an example embodiment.
  • FIG. 2 illustrates a log normal fetal fraction distribution, according to an example embodiment.
  • FIG. 3A illustrates a generated model for 19 weeks gestational age, according to an example embodiment.
  • FIG. 3B illustrates a generated model for 13 weeks gestational age, according to an example embodiment.
  • FIG. 4 is a flow chart of a method according to one embodiment of the invention.
  • FIG. 5 illustrates an example computer system for performing embodiments of the present invention.
  • FIG. 6 illustrates an example system for performing embodiments of the present invention.
  • FIG. 7 illustrates a posterior fetal fraction risk distribution, according to an example embodiment.
  • FIG. 8 illustrates a result set for an example embodiment for fetal fraction-based high risk assessment that predicts an aneuploidy in cases with low fetal fraction.
  • FIG. 9 illustrates a redraw success rate distribution, according to an example embodiment.
  • FIG. 10 illustrates a distribution of fetal fraction based risk scores in cases identified as high risk and low fetal fraction, according to an example embodiment.
  • FIG. 11A illustrates an estimated detection rate for trisomy 13 and 18, according to an example embodiment.
  • FIG. 11B illustrates an estimated detection rate for digynic triploidy, according to an example embodiment.
  • FIG. 12 illustrates a probability density function (PDF) of normalized euploid data, according to an example embodiment.
  • FIG. 13 illustrates a cumulative distribution function (CDF) of normalized euploid data, according to an example embodiment.
  • FIG. 14 illustrates a plot of redraw success rate, according to an example embodiment.
  • FIG. 15 illustrates a result set for identified high risk samples, according to an example embodiment.
  • While the above-identified drawings set forth presently disclosed embodiments, other embodiments are also contemplated, as noted in the discussion. This disclosure presents illustrative embodiments by way of representation and not limitation. Numerous other modifications and embodiments can be devised by those skilled in the art which fall within the scope and spirit of the principles of the presently disclosed embodiments.
  • DETAILED DESCRIPTION
  • Provided herein are system, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for determining aneuploidy risk in a target sample of maternal blood, plasma, or other fluid based on the amount of fetal DNA. Such embodiments may be used in situations where a low or extremely low fetal fraction renders traditional aneuploidal risk methodologies inconclusive or inaccurate. For example, such embodiments may be used to determine the risk of trisomy 13, trisomy 18, or maternal triploidy, which are all aneuploidies associated with a low or extremely low fetal fraction. An embodiment operates by receiving genetic data for a target sample (sample of interest) of maternal blood, plasma, or other fluid, and known genetic data from a plurality of known noninvasive prenatal testing samples. A fetal fraction distribution is determined for the received known genetic data based on gestational age and the maternal weight associated with the target sample. A model for a plurality of ploidy states is then generated based on a fixed ratio reduction of the determined fetal fraction distribution. A fetal fraction based data likelihood for the target sample is then determined for each of the plurality of ploidy states using the generated model and the fetal fraction associated with the target sample. An aneuploidy risk score is then output for the target sample based on applying a Bayesian probability determination that combines each fetal fraction based data likelihood with a previously determined risk score as a conditional value. Accordingly, aneuploidy risk can be determined in a target sample even when the sample contains a low amount of fetal DNA. This allows aneuploidy risk to be determined even when SNP based noninvasive prenatal testing would be unreliable or unavailable.
  • The term “obtaining genetic data” as used herein refers to both, unless indicated otherwise by context, (1) acquiring DNA sequence information by laboratory techniques, e.g. use of an automated high throughput DNA sequencer, and (2) acquiring information that had been previously obtained by laboratory techniques, wherein the information is electronically transmitted to an analyzer, e.g. by computer over the Internet, by electronic transfer from the sequencing device, etc.
  • The term “aneuploidy” refers to the state where the wrong number of chromosomes is present in a cell. In the case of a somatic human cell it refers to the case where a cell does not contain 22 pairs of autosomal chromosomes and one pair of sex chromosomes. In the case of a human gamete, it refers to the case where a cell does not contain one of each of the 23 chromosomes. In the case of a single chromosome, it refers to the case where more or fewer than two homologous but nonidentical chromosomes are present, and where each of the two chromosomes originates from a different parent.
  • The term “ploidy state” refers to the quantity and chromosomal identity of one or more chromosomes in a cell.
  • Certain aneuploidies are often associated with a reduced amount of fetal DNA in the target sample. For example, trisomy 13, trisomy 18, and maternal triploidy are often associated with a reduced amount of fetal DNA in the target sample. Embodiments of the invention determine aneuploidy risk in a target maternal sample based on a relationship between the amount of fetal DNA in the target sample and the presence of certain aneuploidies.
  • FIG. 6 illustrates a system for performing embodiments of the present invention.
  • FIG. 6 includes an analysis system 602 for determining a risk of fetal aneuploidy. Analysis system 602 may include one or more processors for executing the functions described herein. Such functions may be implemented on the processor as engines or logical elements that perform the analytical functionality described herein, such as a modeling engine and a probability engine. Interaction of a user with such analytical engines may be conducted through an appropriate user interface. Analysis system 602 may be coupled to a database of known samples 604 via, for example, a network 610. Network 610 may be any type of communication network, including intranets, local area networks, or wide area networks such as the Internet. Genetic data from samples with a known ploidy state may be used to form a baseline for comparison with a target sample in question, as discussed further below. Database 604 may be a collection of data from a variety of sources including clinical studies and commercial data sets.
  • In an embodiment, a fetal fraction distribution is defined for such known genetic data from the plurality of known prenatal testing samples by analysis system 602. The fetal fraction distribution may be based on the maternal weight and the gestational age corresponding to each sample. This is because gestational age and maternal weight are often factors for the amount of fetal DNA present in a maternal blood sample. A fetal fraction distribution may be defined for known genetic data from a plurality of noninvasive known prenatal testing samples.
  • The plurality of known prenatal testing samples may be selected based on various criteria to ensure an accurate and representative fetal fraction distribution. In an embodiment, known genetic data for a known prenatal testing sample may be selected or filtered for inclusion in the fetal fraction distribution based on an associated low aneuploidy risk result, a no call result due to low fetal fraction, and a low confidence result. Known genetic data for a known prenatal testing sample may also be selected based on whether the maternal weight associated with the sample is available or whether the sample was collected in a clinical trial in the United States or a foreign country. A selection based on country of origin may be done to prevent unit conversion uncertainty in maternal weight for the sample.
  • In an embodiment, known genetic data for a plurality of known prenatal testing samples may be grouped into sets according to gestational age and maternal weight. In an embodiment, known genetic data for the plurality of known prenatal testing samples may include sample data taken at a gestational age ranging from 9 to 20 weeks in one-week increments. Known genetic data for the plurality of known prenatal testing samples may also include sample data corresponding to a maternal weight ranging from 110 to 250 pounds in 20-pound increments. In an embodiment, sampling of the known genetic data from known prenatal testing samples may be accurate to within plus or minus ten days of gestational age and plus or minus five pounds of maternal weight.
  • In an embodiment, the average fetal fraction is computed for the known genetic data in each set of known prenatal testing samples. The standard deviation may also be computed. In an embodiment, the average fetal fraction and standard deviation are only computed for sets of known prenatal testing samples containing at least 50 samples. This may be done to ensure an accurate and representative fetal fraction distribution. The result is a grid of distribution parameters (e.g. average fetal fractions and standard deviations) that correspond to the grid of sample conditions.
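The grouping and per-set statistics described above can be sketched as follows. This is a minimal illustration under stated assumptions: samples are given as (gestational age in weeks, maternal weight in pounds, fetal fraction) tuples, each sample is assigned to the nearest grid cell, and the names `bin_key` and `grid_parameters` are hypothetical. Only the one-week and 20-pound grid steps and the 50-sample minimum come from the text.

```python
from collections import defaultdict
from statistics import mean, stdev

MIN_SAMPLES = 50      # minimum set size stated in the text
WEIGHT_START_LB = 110  # lower end of the stated maternal weight range
WEIGHT_STEP_LB = 20    # stated weight increment


def bin_key(ga_weeks, weight_lb):
    """Assign a sample to the nearest (gestational age, weight) grid cell.

    Nearest-cell assignment is an assumption made for this sketch.
    """
    ga_bin = round(ga_weeks)
    steps = round((weight_lb - WEIGHT_START_LB) / WEIGHT_STEP_LB)
    return ga_bin, WEIGHT_START_LB + WEIGHT_STEP_LB * steps


def grid_parameters(samples, min_samples=MIN_SAMPLES):
    """Return {(ga_bin, weight_bin): (mean_ff, std_ff)} for well-populated sets."""
    groups = defaultdict(list)
    for ga_weeks, weight_lb, fetal_fraction in samples:
        groups[bin_key(ga_weeks, weight_lb)].append(fetal_fraction)
    return {
        key: (mean(values), stdev(values))
        for key, values in groups.items()
        if len(values) >= min_samples  # skip sparse sets
    }
```

Sets with fewer than 50 samples are simply left out of the grid, so a lookup for a sparse cell fails visibly instead of returning an unreliable estimate.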
  • FIG. 1 illustrates an example fetal fraction distribution based on known genetic data from a plurality of known prenatal testing samples grouped according to gestational age and maternal weight, according to an example embodiment. In the example of FIG. 1, a set of known prenatal testing samples is grouped together based on the gestational ages of 9 weeks, 12 weeks, and 18 weeks. Moreover, each set of known prenatal testing samples is further grouped together based on maternal weight. The average fetal fraction is computed for the known genetic data of each resulting set of known prenatal testing samples.
  • For example, in FIG. 1, the average fetal fraction of prenatal testing samples with a maternal weight of 200 lbs. and a gestational age of 9 weeks is around 0.06. The average fetal fraction of prenatal testing samples with a maternal weight of 200 lbs. and a gestational age of 12 weeks is around 0.07. The average fetal fraction of prenatal testing samples with a maternal weight of 200 lbs. and a gestational age of 18 weeks is around 0.08.
  • Given a particular gestational age and maternal weight, a fetal fraction distribution may become more symmetric when transformed to log space. Therefore, modeling of fetal fraction may be conducted in log space.
  • In an embodiment, the fetal fraction distribution may be transformed to a log-normal distribution. In other words, the fetal fraction distribution may be transformed to a continuous probability distribution of the fetal fraction whose logarithm is normally distributed. Specifically, the logarithm of the fetal fraction is assumed Gaussian distributed with a mean and standard deviation that are a function of gestational age and maternal weight for the known genetic data of the known prenatal testing samples.
  • FIG. 2 illustrates an example log normal fetal fraction distribution based on the transformation of a fetal fraction distribution to log space, according to an example embodiment. In the example of FIG. 2, the logarithm of the fetal fraction is assumed Gaussian distributed with a mean and standard deviation that are a function of gestational age and maternal weight for the known prenatal testing samples.
  • In the example of FIG. 2, for known genetic data for around 800 known prenatal testing samples, the gestational age is 10 weeks plus or minus 10 days and the maternal weight is 230 pounds plus or minus 5 pounds. Thus, in FIG. 2, the log normal fetal fraction distribution represents a probability density function (PDF) that describes the relative likelihood for fetal fraction to take on a given value where the gestational age is around 10 weeks plus or minus 10 days and the maternal weight is 230 pounds plus or minus 5 pounds.
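A log space fit of the kind shown in FIG. 2 can be sketched as follows. This is an illustrative sketch only: it assumes the natural logarithm (the text does not state the base), estimates the Gaussian parameters from the sample mean and standard deviation of the log fetal fractions in one set, and uses the hypothetical names `fit_log_normal` and `log_ff_pdf`.

```python
import math
from statistics import mean, stdev


def fit_log_normal(fetal_fractions):
    """Estimate (m, s): mean and standard deviation of log fetal fraction."""
    logs = [math.log(ff) for ff in fetal_fractions]  # natural log assumed
    return mean(logs), stdev(logs)


def log_ff_pdf(log_ff, m, s):
    """Gaussian probability density of a log fetal fraction value."""
    z = (log_ff - m) / s
    return math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))
```

For example, `m, s = fit_log_normal(set_of_fetal_fractions)` followed by `log_ff_pdf(math.log(0.04), m, s)` gives the relative likelihood of observing a fetal fraction of 0.04 for that gestational age and maternal weight set.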
  • In an embodiment, the probability of having an aneuploidy can be computed from the log normal fetal fraction distribution. Specifically, the probability of having an aneuploidy can be computed as the integral of the PDF over a defined range.
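Integrating the log space PDF over a range reduces to a difference of Gaussian cumulative distribution values. The sketch below shows that calculation for the probability that fetal fraction falls between two bounds under a given hypothesis; how the range is chosen is not specified here, and the function names are hypothetical. The natural logarithm is assumed.

```python
import math


def normal_cdf(x, m, s):
    """Cumulative distribution function of a Gaussian with mean m and std s."""
    return 0.5 * (1.0 + math.erf((x - m) / (s * math.sqrt(2.0))))


def prob_ff_between(ff_low, ff_high, m, s):
    """Integral of the log-normal fetal fraction PDF over [ff_low, ff_high].

    m and s are the mean and standard deviation of log fetal fraction.
    """
    return normal_cdf(math.log(ff_high), m, s) - normal_cdf(math.log(ff_low), m, s)
```

Using the closed form via `math.erf` avoids numerical integration and keeps the sketch free of third-party dependencies.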
  • In an embodiment, the effect of an aneuploidy may be modeled as a fixed rate reduction in the average fetal fraction compared to the expected average fetal fraction for a given maternal weight and gestational age. For example, the average fetal fraction of a trisomy 13 pregnancy may be 80% of the average fetal fraction for a euploid pregnancy of the same maternal weight and gestational age. Trisomy 13, trisomy 18, and maternal triploidy may be modeled using a fixed rate reduction in the average fetal fraction. As would be appreciated by a person of ordinary skill in the art, the effect of an aneuploidy may be modeled according to various other reductions in the average fetal fraction compared to the expected average fetal fraction for a given maternal weight and gestational age.
  • In an embodiment, a model may be generated for a plurality of ploidy states based on the fixed ratio reduction of the fetal fraction distribution. A ploidy state may be referred to as a hypothesis.
  • A fetal fraction distribution may be transformed to a log-normal distribution of fetal fraction prior to generation of a model. In an embodiment, a model may be generated for three hypotheses: trisomy 13, trisomy 18, and maternal triploidy.
  • In an embodiment for a log-normal distribution of fetal fraction, a fixed rate reduction in the average fetal fraction corresponds to a constant subtracted offset. Thus, for a pregnancy with a particular gestational age and maternal weight, the log fetal fraction for euploid prenatal testing samples is Gaussian distributed with a mean m and a standard deviation s, but the log fetal fraction for prenatal testing samples with an aneuploidy is Gaussian distributed with a mean m-c and a standard deviation s-c where c is a constant subtracted offset for a given aneuploidy. As would be appreciated by a person of ordinary skill in the art, a constant subtracted offset for a given aneuploidy may be determined by an analysis of empirical data.
  • In an embodiment, there may be a single constant subtracted offset for trisomies 13 and 18 and a different offset for maternal triploidy. In an embodiment, the constant subtracted offset for trisomies 13 and 18 is log(0.79). In other words, in this example, the means of the trisomy 13 and trisomy 18 hypothesis distributions are each reduced by the magnitude of log(0.79), which corresponds to an average fetal fraction that is 79% of the euploid average.
  • In an embodiment, the constant subtracted offset for maternal triploidy is log(0.22). In other words, in this example, the mean of the maternal triploidy hypothesis distribution is reduced by the magnitude of log(0.22), which corresponds to an average fetal fraction that is 22% of the euploid average.
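  • The offsets described above can be sketched as follows. This Python example derives log-space parameters for each hypothesis from hypothetical euploid parameters. The ratios 0.79 and 0.22 come from the examples above; the function and variable names, the use of the natural logarithm, and the choice to leave the standard deviation unchanged are assumptions made for illustration.

```python
import math

# Fixed ratios from the examples above: average fetal fraction relative to euploid.
FIXED_RATIOS = {
    "trisomy 13": 0.79,
    "trisomy 18": 0.79,
    "maternal triploidy": 0.22,
}

def hypothesis_parameters(euploid_log_mean, euploid_log_sd):
    """Return {hypothesis: (log_mean, log_sd)} for each aneuploidy hypothesis.

    Multiplying fetal fraction by a ratio r adds ln(r) to the log-space mean.
    Because r < 1, ln(r) is negative and the mean is lowered. The standard
    deviation is left unchanged (an assumption of this sketch).
    """
    return {
        name: (euploid_log_mean + math.log(ratio), euploid_log_sd)
        for name, ratio in FIXED_RATIOS.items()
    }

# Hypothetical euploid parameters: median fetal fraction 0.10, log-scale SD 0.4.
params = hypothesis_parameters(math.log(0.10), 0.4)
for name, (log_mean, log_sd) in params.items():
    print(f"{name}: median FF = {math.exp(log_mean):.3f}, log SD = {log_sd}")
```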
  • Returning to FIG. 6, analysis system 602 may also be coupled to a database 606 containing genetic data for a target sample, either directly or over network 610. Genetic data about the target sample, stored in database 606, may have been obtained from, for example, a sequencer 608. The target sample is one for which a fetal aneuploidy risk is to be determined. While the examples herein will refer to maternal blood, one of skill in the art will recognize that the target sample may be, for example, a maternal blood or plasma sample containing both maternal DNA and fetal DNA. Such DNA may be, for example, cell-free DNA. As would be appreciated by a person of ordinary skill in the art, a target maternal blood sample that contains fetal DNA may be obtained using various methods.
  • In some embodiments of the invention, the obtained prenatal target sample is modified using standard molecular biology techniques in order to be sequenced on a DNA sequencer, such as sequencer 608. In some embodiments, the technique will involve forming a genetic library containing priming sites for the DNA sequencing procedure. A plurality of loci may be targeted for site specific amplification. In some embodiments the targeted loci are polymorphic loci, e.g., single nucleotide polymorphisms. In embodiments employing the formation of genetic libraries, libraries may be encoded using a DNA sequence that is specific for the patient, e.g., barcoding, thereby permitting multiple patients to be analyzed in a single flow cell (or flow cell equivalent) of a high throughput DNA sequencer. Although the samples are mixed together in the DNA sequencer flow cell, the determination of the sequence of the barcode permits identification of the patient source that contributed the DNA that had been sequenced.
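  • The barcode-based identification described above can be illustrated with a short Python sketch that assigns pooled reads back to their patient of origin. The barcode sequences, the patient labels, and the assumptions that the barcode occupies the first bases of each read and is matched exactly are all hypothetical; real library designs and demultiplexing tools differ.

```python
def demultiplex(reads, barcode_to_patient, barcode_length):
    """Group pooled reads by patient using the barcode at the start of each read.

    Returns (reads_by_patient, unassigned). The barcode is trimmed from each
    assigned read; reads with an unrecognized barcode are returned untouched.
    """
    reads_by_patient = {patient: [] for patient in barcode_to_patient.values()}
    unassigned = []
    for read in reads:
        barcode, insert = read[:barcode_length], read[barcode_length:]
        patient = barcode_to_patient.get(barcode)
        if patient is None:
            unassigned.append(read)
        else:
            reads_by_patient[patient].append(insert)
    return reads_by_patient, unassigned

# Hypothetical 4-base barcodes and reads from a single pooled run.
barcodes = {"ACGT": "patient_A", "TGCA": "patient_B"}
pooled = ["ACGTTTGA", "TGCAGGCC", "NNNNAAAA"]
by_patient, leftover = demultiplex(pooled, barcodes, 4)
print(by_patient, leftover)
```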
  • Methods are known in the art for obtaining genetic data from a sample. Typically this involves amplification of DNA in the sample, a process which transforms a small amount of genetic material to a larger amount of genetic material that contains a similar set of genetic data. This can be done by a wide variety of methods, including, but not limited to, Polymerase Chain Reaction (PCR), ligation-mediated PCR, degenerate oligonucleotide primed PCR, Multiple Displacement Amplification, allele-specific amplification techniques, Molecular Inversion Probes (MIP), padlock probes, other circularizing probes, and combinations thereof. Many variants of the standard protocol can be used, for example increasing or decreasing the times of certain steps in the protocol, increasing or decreasing the temperature of certain steps, increasing or decreasing the amounts of various reagents, etc. The DNA amplification transforms the initial sample of DNA into a sample of DNA that is similar in the set of sequences, but of much greater quantity. In some cases, amplification may not be required.
  • The genetic data of the target sample can be transformed from a molecular state to an electronic state by measuring the appropriate genetic material using tools and/or techniques taken from a group including, but not limited to, genotyping microarrays and high throughput sequencing. Some high throughput sequencing methods and systems include Sanger DNA sequencing, pyrosequencing, the ILLUMINA SOLEXA platform, ILLUMINA's GENOME ANALYZER, ILLUMINA's HISEQ or MISEQ, APPLIED BIOSYSTEM's SOLiD platform, ION TORRENT'S PGM or PROTON platforms, HELICOS's TRUE SINGLE MOLECULE SEQUENCING platform, HALCYON MOLECULAR's electron microscope sequencing method, or any other sequencing method. All of these methods physically transform the genetic data stored in a sample of DNA into a set of genetic data that is typically stored in a memory device en route to being processed.
  • In an embodiment, a fetal fraction based data likelihood for a target sample may be computed by analysis system 602 for each ploidy state (e.g., trisomy 13, trisomy 18, and maternal triploidy) using the generated model and the fetal fraction associated with the target sample, where each ploidy state corresponds to a hypothesis. Specifically, a fetal fraction based data likelihood for a target sample may be computed for each hypothesis (e.g. trisomy 13, trisomy 18, maternal triploidy, etc.) by evaluating the Gaussian probability density function at the observed log value of the fetal fraction associated with the target sample at each of the three hypotheses.
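  • The likelihood evaluation described above can be sketched in Python as follows. Each hypothesis is represented by the mean and standard deviation of the logarithm of the fetal fraction, and the likelihood is the Gaussian probability density evaluated at the logarithm of the observed fetal fraction. The function names and the numeric parameters (a euploid median of 0.10 and a log-scale standard deviation of 0.4) are assumptions made for illustration, not values from the disclosure.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Probability density of a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def fetal_fraction_likelihoods(fetal_fraction, hypotheses):
    """Evaluate each hypothesis density at the log of the observed fetal fraction.

    hypotheses maps a hypothesis name to (log_mean, log_sd).
    """
    log_ff = math.log(fetal_fraction)
    return {
        name: gaussian_pdf(log_ff, log_mean, log_sd)
        for name, (log_mean, log_sd) in hypotheses.items()
    }

# Hypothetical log-space parameters for one gestational age and maternal weight.
euploid_log_mean, log_sd = math.log(0.10), 0.4
hypotheses = {
    "trisomy 13": (euploid_log_mean + math.log(0.79), log_sd),
    "trisomy 18": (euploid_log_mean + math.log(0.79), log_sd),
    "maternal triploidy": (euploid_log_mean + math.log(0.22), log_sd),
}
print(fetal_fraction_likelihoods(0.03, hypotheses))
```

  • Note that these values are densities in log space rather than probabilities, so they are meaningful mainly relative to one another.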
  • FIG. 3A illustrates an example of a generated model for trisomy 13, trisomy 18, and maternal triploidy based on a fixed ratio reduction of a determined fetal fraction distribution, according to an embodiment. Specifically, FIG. 3A illustrates an example of a generated model for trisomy 13, trisomy 18, and maternal triploidy where the gestational age is 19 weeks and the maternal weight is 166 pounds. Thus, in FIG. 3A, a fetal fraction based data likelihood for a target sample with a gestational age of 19 weeks and a maternal weight of 166 pounds may be computed for trisomy 13, trisomy 18, and maternal triploidy by evaluating the respective Gaussian probability density function at the observed log value of the fetal fraction associated with the target sample.
  • For example, in FIG. 3A, the fetal fraction based data likelihood of trisomy 13 or trisomy 18 for a target sample with a fetal fraction of 0.10, a maternal weight of 166 pounds, and a gestational age of 19 weeks is around 35%. Similarly, the fetal fraction based data likelihood of trisomy 13 or trisomy 18 for a target sample with a fetal fraction of 0.20, a maternal weight of 166 pounds, and a gestational age of 19 weeks is around 10%.
  • FIG. 3B illustrates an example of a generated model for trisomy 13, trisomy 18, and maternal triploidy based on a fixed ratio reduction of a determined fetal fraction distribution, according to an embodiment. Specifically, FIG. 3B illustrates an example of a generated model for trisomy 13, trisomy 18, and maternal triploidy where the gestational age is 13 weeks and the maternal weight is 166 pounds. Thus, in FIG. 3B, a fetal fraction based data likelihood for a target sample with a gestational age of 13 weeks and a maternal weight of 166 pounds may be computed for trisomy 13, trisomy 18, and maternal triploidy by evaluating the respective Gaussian probability density function at the observed log value of the fetal fraction associated with the target sample.
  • By determining fetal fraction based data likelihoods for different ploidy states using a generated model for a target sample, an aneuploidy risk score for the fetus associated with the target sample may be determined. Specifically, in an embodiment, each fetal fraction based data likelihood can be combined with a previously determined risk score in order to determine the aneuploidy risk score for the fetus associated with the target sample. A previously determined risk score may be, for example, an age based prior risk score for the mother associated with the target sample. In another example, a previously determined risk score may be a SNP-based prior risk score. As would be appreciated by a person of ordinary skill in the art, a previously determined risk score may be based on other prior risk factors, including a combination of prior risk factors.
  • In an embodiment, an aneuploidy risk score for the fetus associated with the target sample may be determined based on the posterior probability of the presence of any of trisomy 13, trisomy 18, and maternal triploidy. Specifically, the fetal fraction based data likelihoods may be combined with previously determined risk scores for trisomy 13, trisomy 18, and maternal triploidy using Bayes' theorem to determine an aneuploidy risk score for the fetus associated with the target sample. In an embodiment, the previously determined risk scores for trisomy 13 and trisomy 18 depend on maternal age and gestational age and may be determined empirically. In an embodiment, the previously determined risk score for maternal triploidy is 1/5505.
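  • One way to read the Bayesian combination described above is sketched below in Python: each prior is multiplied by the corresponding data likelihood and the products are normalized across hypotheses. The prior of 1/5505 for maternal triploidy is taken from the text; the trisomy priors, the likelihood values, the inclusion of a euploid hypothesis, and the joint normalization are assumptions made for illustration.

```python
def posterior_probabilities(priors, likelihoods):
    """Apply Bayes' theorem over a set of mutually exclusive hypotheses.

    posterior[h] = prior[h] * likelihood[h] / sum over all hypotheses.
    """
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    if total <= 0.0:
        raise ValueError("likelihoods and priors give zero total probability")
    return {h: value / total for h, value in joint.items()}

# Priors: maternal triploidy from the text; trisomy values are hypothetical.
priors = {
    "trisomy 13": 1.0 / 5000.0,
    "trisomy 18": 1.0 / 2500.0,
    "maternal triploidy": 1.0 / 5505.0,
}
priors["euploid"] = 1.0 - sum(priors.values())

# Hypothetical fetal fraction based data likelihoods for one target sample.
likelihoods = {
    "trisomy 13": 0.35,
    "trisomy 18": 0.35,
    "maternal triploidy": 0.90,
    "euploid": 0.05,
}
posterior = posterior_probabilities(priors, likelihoods)
print(posterior)
```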
  • FIG. 4 is a flowchart of a method 400 for determining aneuploidy risk in a target maternal blood sample, according to an embodiment. Method 400 can be performed by processing logic that can comprise hardware (e.g., circuitry, dedicated logic, programmable logic, microcode, etc.), software (e.g., instructions run on a processing device), or a combination thereof. Such processing logic may be implemented in, for example, analysis system 602.
  • In step 402 of FIG. 4, known genetic data from a plurality of known noninvasive prenatal testing samples is received. As would be appreciated by a person of ordinary skill in the art, the known genetic data from the plurality of known prenatal testing samples may be received from a variety of sources including clinical studies and commercial data sets. Moreover, as would be appreciated by a person of ordinary skill in the art, a fetal fraction distribution may be defined for known genetic data from a plurality of noninvasive known prenatal testing samples, a plurality of invasive known prenatal testing samples, or a combination of both.
  • The received known genetic data from the plurality of known prenatal testing samples may be optionally filtered based on various criteria to ensure that an accurate and representative fetal fraction distribution is determined in step 406. In an embodiment, known genetic data for the known prenatal testing samples may be filtered based on an associated low aneuploidy risk result, a no call result due to low fetal fraction, and a low confidence result. The received known genetic data for the known prenatal testing samples may also be filtered based on whether the maternal weight associated with a sample is available or whether a sample was collected in a clinic in the United States or a foreign country. The filtering based on country of origin may be done to prevent unit conversion uncertainty in maternal weight for a sample.
  • In step 404 of FIG. 4, genetic data for a target maternal blood sample containing fetal DNA is received. The genetic data includes at least gestational age of the associated fetus, a maternal weight, and a fetal DNA fraction of the target sample. As would be appreciated by a person of ordinary skill in the art, a target maternal blood sample that contains fetal DNA may be obtained using various methods.
  • In step 406 of FIG. 4, a fetal fraction distribution is determined for the known genetic data from step 402. The determined fetal fraction distribution is based on the maternal weight and the gestational age associated with the target blood sample of step 404. In other words, the received known genetic data for the plurality of known prenatal testing samples is grouped into sets according to gestational age and maternal weight. As discussed above, the sampling of the known genetic data from known prenatal testing samples may be done at various intervals for gestational age and maternal weight.
  • For each set of known prenatal testing samples, the average fetal fraction is then computed. In an embodiment, the average fetal fraction may only be computed where a set of known prenatal testing samples includes a minimum number of 50 samples. This may be done to ensure an accurate and representative fetal fraction distribution.
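  • The grouping and averaging described above can be sketched in Python as follows. Samples are binned by gestational age and maternal weight, and distribution parameters are computed only for bins that meet the minimum of 50 samples. The bin widths, the tuple layout of a sample, and the function name are assumptions made for illustration.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

def build_parameter_grid(samples, ga_bin_weeks=1.0, weight_bin_lb=10.0, min_samples=50):
    """Group (gestational_age_weeks, maternal_weight_lb, fetal_fraction) samples.

    Returns {(ga_bin, weight_bin): (average_fetal_fraction, standard_deviation)}
    for every bin containing at least min_samples samples.
    """
    groups = defaultdict(list)
    for gestational_age, maternal_weight, fetal_fraction in samples:
        key = (
            math.floor(gestational_age / ga_bin_weeks),
            math.floor(maternal_weight / weight_bin_lb),
        )
        groups[key].append(fetal_fraction)
    return {
        key: (mean(values), stdev(values))
        for key, values in groups.items()
        if len(values) >= min_samples
    }

# Hypothetical data: one well-populated bin and one sparse bin that is skipped.
samples = (
    [(12.3, 151.0, 0.10)] * 30
    + [(12.7, 158.0, 0.12)] * 30
    + [(20.1, 200.0, 0.08)] * 10
)
grid = build_parameter_grid(samples)
print(grid)
```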
  • In step 408 of FIG. 4, the fetal fraction distribution is transformed to a log-normal distribution. In an embodiment, the logarithm of the fetal fraction is assumed Gaussian distributed with a mean and standard deviation that are a function of gestational age and maternal weight for the received known genetic data of step 402. As would be appreciated by a person of ordinary skill in the art, the log-normal fetal fraction distribution represents a PDF that describes the relative likelihood for fetal fraction to take on a given value where the gestational age and the maternal weight are equal to the gestational age and the maternal weight associated with the received genetic data for the target sample of step 404.
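  • A minimal Python sketch of the transformation in step 408 is given below, assuming the natural logarithm and a sample standard deviation. It estimates the Gaussian parameters of the log fetal fraction for one set of known samples; the function name and input values are hypothetical.

```python
import math
from statistics import mean, stdev

def fit_log_normal(fetal_fractions):
    """Return (log_mean, log_sd): the mean and sample SD of ln(fetal fraction)."""
    logs = [math.log(ff) for ff in fetal_fractions]
    return mean(logs), stdev(logs)

# Hypothetical fetal fractions from one gestational age and maternal weight set.
log_mean, log_sd = fit_log_normal([0.06, 0.09, 0.10, 0.12, 0.15])
print(f"median FF = {math.exp(log_mean):.3f}, log SD = {log_sd:.3f}")
```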
  • In step 410 of FIG. 4, a model is generated for a plurality of ploidy states based on the log-normal distribution of fetal fraction of step 408. In an embodiment, trisomy 13, trisomy 18, and maternal triploidy distributions are generated from the log-normal distribution of fetal fraction of step 408. This involves reducing the mean for each of the trisomy 13, trisomy 18, and maternal triploidy distributions by a respective constant subtracted offset. As would be appreciated by a person of ordinary skill in the art, the constant subtracted offsets for the trisomy 13, trisomy 18, and maternal triploidy distributions may be determined experimentally.
  • In step 412 of FIG. 4, fetal fraction based data likelihoods for the received target sample of step 404 are computed for each of the ploidy states using the generated model of step 410 and the fetal fraction associated with the target sample. In an embodiment, a fetal fraction based data likelihood for the received target sample is computed for trisomy 13, trisomy 18, and maternal triploidy by evaluating the Gaussian probability density functions for trisomy 13, trisomy 18, and maternal triploidy at the observed log value of the fetal fraction associated with the target sample.
  • In step 414 of FIG. 4, a Bayesian probability determination is applied to combine the fetal fraction based data likelihoods of step 412 with previously determined risk scores. As would be appreciated by a person of ordinary skill in the art, a previously determined risk score may be an age based prior risk score for the mother associated with the target sample or an SNP-based prior risk score.
  • In step 416 of FIG. 4, aneuploidy risk scores for trisomy 13, trisomy 18, and maternal triploidy are output based on the applying in step 414. As would be appreciated by a person of ordinary skill in the art, the outputting may be performed using various methods and mediums.
  • In an embodiment, the aneuploidy risk scores for trisomy 13, trisomy 18, and maternal triploidy are independently determined. Because each aneuploidy risk score is an independent posterior probability of the presence of either trisomy 13, trisomy 18, or maternal triploidy, the resulting aneuploidy risk scores can be compared to identify the most likely ploidy state.
  • In an embodiment, a probability that the sample is euploid is also determined and taken into account.
  • In this manner, an additional type of analysis is made available to individuals whose aneuploidy risk may not be able to be determined by traditional methods, such as SNP-based methods. This analysis may also be used to confirm a previously determined risk score in situations where extremely low fetal fraction is an issue.
  • FIG. 7 illustrates a posterior fetal fraction risk distribution, according to an example embodiment. In the example of FIG. 7, a posterior risk distribution is computed by combining data likelihoods with prior risk for a gestational age between 9 and 11 weeks. The cutoff is at 1/100 risk. This sets the fetal fraction limit for a high risk call.
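  • To illustrate how a 1/100 risk cutoff can translate into a fetal fraction limit, the following Python sketch searches for the fetal fraction at which the posterior risk of a single aneuploidy hypothesis equals the cutoff. It assumes a two-hypothesis comparison (one aneuploidy versus euploid), a shared log-scale standard deviation, and hypothetical euploid parameters; the ratio 0.22 and prior 1/5505 are the maternal triploidy example values from the text. This is not presented as the method used to produce FIG. 7.

```python
import math

def gaussian_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_risk(fetal_fraction, log_mean, log_sd, ratio, prior):
    """Posterior probability of the aneuploidy hypothesis versus euploid."""
    x = math.log(fetal_fraction)
    like_aneuploid = gaussian_pdf(x, log_mean + math.log(ratio), log_sd)
    like_euploid = gaussian_pdf(x, log_mean, log_sd)
    numerator = prior * like_aneuploid
    return numerator / (numerator + (1.0 - prior) * like_euploid)

def fetal_fraction_limit(log_mean, log_sd, ratio, prior, cutoff=0.01, lo=1e-4, hi=0.5):
    """Bisect (on a log scale) for the fetal fraction where the risk equals cutoff.

    With ratio < 1 the posterior risk falls as fetal fraction rises, so samples
    below the returned limit exceed the cutoff.
    """
    risk_lo = posterior_risk(lo, log_mean, log_sd, ratio, prior)
    risk_hi = posterior_risk(hi, log_mean, log_sd, ratio, prior)
    if not (risk_lo > cutoff > risk_hi):
        raise ValueError("cutoff is not bracketed by the search interval")
    for _ in range(100):
        mid = math.sqrt(lo * hi)
        if posterior_risk(mid, log_mean, log_sd, ratio, prior) > cutoff:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Hypothetical euploid parameters: median fetal fraction 0.10, log-scale SD 0.4.
limit = fetal_fraction_limit(math.log(0.10), 0.4, ratio=0.22, prior=1.0 / 5505.0)
print(f"high risk call below fetal fraction of about {limit:.4f}")
```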
  • FIG. 8 illustrates a result set for a pilot study of an example embodiment for fetal fraction-based high risk assessment that predicts an aneuploidy in cases with low fetal fraction. The result set of FIG. 8 indicates that the example embodiment for fetal fraction-based risk assessment is able to predict abnormalities in a clinical data sample set. Specifically, in the example of FIG. 8, there were 143 cases with high risk and low fetal fraction. Karyotype was available for 70 of these cases. There was a 10% positive predictive value (PPV) if the associated clinical sample data set was restricted to cases with karyotype and a 4.9% PPV if missing karyotypes were assumed unaffected. FIG. 8 illustrates some of the abnormalities detected in the pilot study.
  • FIG. 9 illustrates a redraw success rate distribution, according to an example embodiment. FIG. 9 shows fetal fraction change observed from approximately 3,000 Non-Invasive Prenatal Testing (NIPT) redraws. The example embodiment of FIG. 9 provides useful information when an embodiment for NIPT single-nucleotide polymorphism (SNP) fails to provide a prediction. Specifically, the example embodiment of FIG. 9 provides a fetal fraction-based risk score and a probability of successful call on redraw, making it possible to predict redraw success based on a predicted range of redraw fetal fraction.
  • FIG. 10 illustrates a distribution of fetal fraction based risk scores in cases identified as high risk and low fetal fraction, according to a follow-up study of an example embodiment. For example, FIG. 10 shows that roughly 5 cases had a fetal fraction based risk score of 0.2. In the follow-up study, the objective was to test whether high fetal fraction-based risk predicts aneuploidy in cases with unusually low fetal fraction. An attempt to collect follow-up was made for 896 samples, where the adjusted fetal fraction was below approximately the 2nd percentile, and the maternal weight was available. 525 samples were eligible for inclusion in the follow-up study, from domestic clinics and direct sales clinics. 143 samples were identified as having high fetal fraction-based risk with low fetal fraction. In particular, the fetal fraction-based risk was greater than 0.01 and the fetal fraction was 2.5 SD below the mean. Karyotype was available for 70 samples.
  • FIG. 11A illustrates an estimated detection rate for trisomy 13 and 18, according to an example embodiment. Specifically, FIG. 11A illustrates what fraction of affected cases that are not identified by a NIPT SNP embodiment will be identified by the fetal fraction-based risk score >1/100. The estimated detection rate is based on the sample data set of FIG. 10. In FIG. 11A, the estimated detection rate for trisomy 13/18 is 91.4%.
  • FIG. 11B illustrates an estimated detection rate for digynic triploidy, according to an example embodiment. Specifically, FIG. 11B illustrates what fraction of affected cases that are not identified by a NIPT SNP embodiment will be identified by the fetal fraction-based risk score >1/100. The estimated detection rate is based on the sample data set of FIG. 10. In FIG. 11B, the estimated detection rate for digynic triploidy is 96.6%. Retroactive application of such high risk fetal fraction criteria to 29,000 NIPT cases would have resulted in 432 high risk calls (1.5%). Application of the SNP method would result in 115 (0.4%) high risk calls (for T13, T18, digynic triploidy). This results in a 1.8% combined high risk call rate. The expected aneuploidy rate based on priors was 0.3%. The theoretical PPV was thus 16% (0.3%/1.8%).
  • FIG. 12 illustrates a PDF of normalized euploid data, according to an example embodiment. Specifically, FIG. 12 shows empirical density plots of fetal fractions after normalization. There are 39 density curves. Each of the 39 density curves comes from a set of data with approximately the same maternal weight and gestational age, with between 400 and 500 samples each. Each data set is normalized by its observed mean and variance. The plot in FIG. 12 shows that the Gaussian fit is appropriate because the distributions are very similar.
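  • The normalization described above can be sketched in Python as a standard score transform, in which each value has the observed mean of its set subtracted and is divided by the observed standard deviation. Whether the normalization behind FIG. 12 is applied to fetal fraction or to its logarithm is not stated, so this example takes generic values; the function name and data are hypothetical.

```python
from statistics import mean, stdev

def normalize(values):
    """Return standard scores: (value - observed mean) / observed sample SD."""
    mu = mean(values)
    sd = stdev(values)
    if sd == 0.0:
        raise ValueError("cannot normalize a set with zero variance")
    return [(value - mu) / sd for value in values]

# Hypothetical measurements from one maternal weight and gestational age set.
scores = normalize([0.06, 0.09, 0.10, 0.12, 0.15])
print([round(score, 3) for score in scores])
```

  • After this transform every set has mean 0 and standard deviation 1, which is what allows the density curves of different sets to be overlaid and compared against a single Gaussian.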
  • FIG. 13 illustrates a CDF of the normalized euploid data of FIG. 12, according to an example embodiment. Specifically, FIG. 13 shows empirical cumulative distribution plots of fetal fractions after normalization. There are 39 curves. Each of the 39 curves comes from a set of data with approximately the same maternal weight and gestational age, with between 400 and 500 samples each. Each data set is normalized by its observed mean and variance. The plot in FIG. 13 shows that the Gaussian fit is appropriate because the distributions are very similar.
  • FIG. 14 illustrates a plot of redraw success rate, according to an example embodiment. Specifically, FIG. 14 plots the redraw success rate against maternal weight bucket center. This plot shows that another characteristic of the fetal fraction distribution is the redraw success rate. Specifically, the ability to make a call is strongly dependent on fetal fraction and a successful redraw is often based on an increase in fetal fraction between the first and second draw. The ability to predict the probability of success for a redraw is often useful for doctors and patients. This is because many cases with low fetal fraction will not be at high risk for aneuploidy, but still have low probability of a successful redraw, and so other testing embodiments may be preferred.
  • FIG. 15 illustrates an example result set for identified high risk samples, according to an embodiment. Specifically, FIG. 15 illustrates a result set for 143 sample cases that were identified as having high extremely low fetal fraction (ELFF) risk based on not having received a successful high or low risk draw call, and having a computed ELFF risk score greater than 0.01. FIG. 15 further illustrates that follow-up results were successfully collected for 70 of these sample cases. Of these 70 sample cases, 7 were found to be aneuploid.
  • FIG. 15 shows that among the cohort with successful follow-up, the positive predictive value of high ELFF risk is 7/58=12.07%. FIG. 15 further shows that assuming all cases without follow-up are euploid, the positive predictive value is 7/113=6.19%. This value can be considered the lower bound PPV based on the data set of FIG. 15.
  • Various embodiments can be implemented, for example, using one or more well-known computer systems, such as computer system 500 shown in FIG. 5. Computer system 500 can be any well-known computer capable of performing the functions described herein.
  • Computer system 500 includes one or more processors (also called central processing units, or CPUs), such as a processor 504. Processor 504 is connected to a communication infrastructure or bus 506.
  • One or more processors 504 may each be a graphics processing unit (GPU). In an embodiment, a GPU is a processor that is a specialized electronic circuit designed to process mathematically intensive applications. The GPU may have a parallel structure that is efficient for parallel processing of large blocks of data, such as mathematically intensive data common to computer graphics applications, images, videos, etc.
  • Computer system 500 also includes user input/output device(s) 503, such as monitors, keyboards, pointing devices, etc., that communicate with communication infrastructure 506 through user input/output interface(s) 502.
  • Computer system 500 also includes a main or primary memory 508, such as random access memory (RAM). Main memory 508 may include one or more levels of cache. Main memory 508 has stored therein control logic (i.e., computer software) and/or data.
  • Computer system 500 may also include one or more secondary storage devices or memory 510. Secondary memory 510 may include, for example, a hard disk drive 512 and/or a removable storage device or drive 514. Removable storage drive 514 may be a floppy disk drive, a magnetic tape drive, a compact disk drive, an optical storage device, tape backup device, and/or any other storage device/drive.
  • Removable storage drive 514 may interact with a removable storage unit 518. Removable storage unit 518 includes a computer usable or readable storage device having stored thereon computer software (control logic) and/or data. Removable storage unit 518 may be a floppy disk, magnetic tape, compact disk, DVD, optical storage disk, and/or any other computer data storage device. Removable storage drive 514 reads from and/or writes to removable storage unit 518 in a well-known manner.
  • According to an exemplary embodiment, secondary memory 510 may include other means, instrumentalities or other approaches for allowing computer programs and/or other instructions and/or data to be accessed by computer system 500. Such means, instrumentalities or other approaches may include, for example, a removable storage unit 522 and an interface 520. Examples of the removable storage unit 522 and the interface 520 may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM or PROM) and associated socket, a memory stick and USB port, a memory card and associated memory card slot, and/or any other removable storage unit and associated interface.
  • Computer system 500 may further include a communication or network interface 524. Communication interface 524 enables computer system 500 to communicate and interact with any combination of remote devices, remote networks, remote entities, etc. (individually and collectively referenced by reference number 528). For example, communication interface 524 may allow computer system 500 to communicate with remote devices 528 over communications path 526, which may be wired and/or wireless, and which may include any combination of LANs, WANs, the Internet, etc. Control logic and/or data may be transmitted to and from computer system 500 via communication path 526.
  • In an embodiment, a tangible apparatus or article of manufacture comprising a tangible computer useable or readable medium having control logic (software) stored thereon is also referred to herein as a computer program product or program storage device. This includes, but is not limited to, computer system 500, main memory 508, secondary memory 510, and removable storage units 518 and 522, as well as tangible articles of manufacture embodying any combination of the foregoing. Such control logic, when executed by one or more data processing devices (such as computer system 500), causes such data processing devices to operate as described herein.
  • Based on the teachings contained in this disclosure, it will be apparent to persons skilled in the relevant art(s) how to make and use embodiments of the invention using data processing devices, computer systems and/or computer architectures other than that shown in FIG. 5. In particular, embodiments may operate with software, hardware, and/or operating system implementations other than those described herein.
  • It is to be appreciated that the Detailed Description section, and not the Summary and Abstract sections (if any), is intended to be used to interpret the claims. The Summary and Abstract sections (if any) may set forth one or more but not all exemplary embodiments of the invention as contemplated by the inventor(s), and thus, are not intended to limit the invention or the appended claims in any way.
  • While the invention has been described herein with reference to exemplary embodiments for exemplary fields and applications, it should be understood that the invention is not limited thereto. Other embodiments and modifications thereto are possible, and are within the scope and spirit of the invention. For example, and without limiting the generality of this paragraph, embodiments are not limited to the software, hardware, firmware, and/or entities illustrated in the figures and/or described herein. Further, embodiments (whether or not explicitly described herein) have significant utility to fields and applications beyond the examples described herein.
  • Embodiments have been described herein with the aid of functional building blocks illustrating the implementation of specified functions and relationships thereof. The boundaries of these functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined as long as the specified functions and relationships (or equivalents thereof) are appropriately performed. Also, alternative embodiments may perform functional blocks, steps, operations, methods, etc. using orderings different than those described herein.
  • References herein to “one embodiment,” “an embodiment,” “an example embodiment,” or similar phrases, indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it would be within the knowledge of persons skilled in the relevant art(s) to incorporate such feature, structure, or characteristic into other embodiments whether or not explicitly mentioned or described herein.
  • The breadth and scope of the invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims (24)

What is claimed is:
1. A method for determining aneuploidy risk in a target sample, comprising:
receiving known genetic data from a plurality of known noninvasive prenatal testing samples;
receiving genetic data for the target sample, the genetic data including a gestational age, a maternal weight, and a fetal fraction associated with the target sample;
determining a fetal fraction distribution for the received known genetic data based on the gestational age and the maternal weight associated with the target sample;
generating a model for a plurality of ploidy states based on a fixed ratio reduction of the determined fetal fraction distribution compared to an expected average fetal fraction for the gestational age and the maternal weight associated with the target sample;
determining a fetal fraction based data likelihood for the target sample for each of the plurality of ploidy states using the generated model and the fetal fraction associated with the target sample;
applying a Bayesian probability determination to combine each fetal fraction based data likelihood with a previously determined risk score as a conditional value; and
outputting an aneuploidy risk score for the target sample based on the applying.
2. The method of claim 1, wherein the previously determined risk score is a SNP based risk score.
3. The method of claim 1, further comprising:
transforming the determined fetal fraction distribution to logarithm space, wherein a logarithm of the fetal fraction is assumed Gaussian distributed with a mean and standard deviation that are a function of gestational age and maternal weight for the known prenatal testing samples.
4. The method of claim 1, wherein determining a fetal fraction based data likelihood for the target sample comprises computing an integral of a probability density function of the generated model.
5. The method of claim 1, wherein the generated model is associated with trisomy 13.
6. The method of claim 1, wherein the generated model is associated with trisomy 18.
7. The method of claim 1, wherein the generated model is associated with maternal triploidy.
8. The method of claim 1, wherein determining a fetal fraction distribution for the received known genetic data comprises:
grouping the genetic data for the plurality of known prenatal testing samples into sets according to gestational age and maternal weight; and
generating a grid of distribution parameters corresponding to each set, wherein the distribution parameters include average fetal fraction and standard deviation.
9. A system for determining aneuploidy risk in a target sample, comprising:
means for receiving known genetic data from a plurality of known noninvasive prenatal testing samples;
means for receiving genetic data for the target sample, the genetic data including a gestational age, a maternal weight, and a fetal fraction associated with the target sample;
means for determining a fetal fraction distribution for the received known genetic data based on the gestational age and the maternal weight associated with the target sample;
means for generating a model for a plurality of ploidy states based on a fixed ratio reduction of the determined fetal fraction distribution compared to an expected average fetal fraction for the gestational age and the maternal weight associated with the target sample;
means for determining a fetal fraction based data likelihood for the target sample for each of the plurality of ploidy states using the generated model and the fetal fraction associated with the target sample;
means for applying a Bayesian probability determination to combine each fetal fraction based data likelihood with a previously determined risk score as a conditional value; and
means for outputting an aneuploidy risk score for the target sample based on the applying.
10. The system of claim 9, wherein the previously determined risk score is a SNP based risk score.
11. The system of claim 9, further comprising:
means for transforming the determined fetal fraction distribution to logarithm space, wherein a logarithm of the fetal fraction is assumed Gaussian distributed with a mean and standard deviation that are a function of gestational age and maternal weight for the known prenatal testing samples.
12. The system of claim 9, wherein the means for determining a fetal fraction based data likelihood for the target sample comprises means for computing an integral of a probability density function of the generated model.
13. The system of claim 9, wherein the generated model is associated with trisomy 13.
14. The system of claim 9, wherein the generated model is associated with trisomy 18.
15. The system of claim 9, wherein the generated model is associated with maternal triploidy.
16. The system of claim 9, wherein the means for determining a fetal fraction distribution for the received known genetic data comprises:
means for grouping the genetic data for the plurality of known prenatal testing samples into sets according to gestational age and maternal weight; and
means for generating a grid of distribution parameters corresponding to each set, wherein the distribution parameters include average fetal fraction and standard deviation.
17. A system for determining aneuploidy risk in a target sample, comprising:
a known testing samples database containing known genetic data from a plurality of known noninvasive prenatal testing samples;
a target sample database containing genetic data for at least the target sample, the genetic data including a gestational age, a maternal weight, and a fetal fraction associated with the target sample;
an aneuploidy risk analysis system in communication with the known testing samples database and the target sample database, the aneuploidy risk analysis system comprises:
a logical element configured to determine a fetal fraction distribution for the received known genetic data based on the gestational age and the maternal weight associated with the target sample;
a modeling engine configured to generate a model for a plurality of ploidy states based on a fixed ratio reduction of the determined fetal fraction distribution compared to an expected average fetal fraction for the gestational age and the maternal weight associated with the target sample; and
a probability engine configured to determine a fetal fraction based data likelihood for the target sample for each of the plurality of ploidy states using the generated model and the fetal fraction associated with the target sample, apply a Bayesian probability determination to combine each fetal fraction based data likelihood with a previously determined risk score as a conditional value, and output an aneuploidy risk score for the target sample based on the Bayesian probability determination.
18. The system of claim 17, wherein the previously determined risk score is a SNP based risk score.
19. The system of claim 17, wherein the modeling engine is further configured to transform the determined fetal fraction distribution to logarithm space, wherein a logarithm of the fetal fraction is assumed Gaussian distributed with a mean and standard deviation that are a function of gestational age and maternal weight for the known prenatal testing samples.
20. The system of claim 17, wherein the logical element is configured to determine a fetal fraction based data likelihood for the target sample by computing an integral of a probability density function of the generated model.
21. The system of claim 17, wherein the generated model is associated with trisomy 13.
22. The system of claim 17, wherein the generated model is associated with trisomy 18.
23. The system of claim 17, wherein the generated model is associated with maternal triploidy.
The system of claim 17, wherein the probability engine is configured to determine a fetal fraction distribution for the received known genetic data by grouping the genetic data for the plurality of known prenatal testing samples into sets according to gestational age and maternal weight, and generating a grid of distribution parameters corresponding to each set, wherein the distribution parameters include average fetal fraction and standard deviation.
24. The system of claim 17, further comprising a DNA sequencer in communication with the target sample database and configured to supply genetic data about the target sample.
US15/186,774 2015-06-19 2016-06-20 Systems and methods for determining aneuploidy risk using sample fetal fraction Abandoned US20160371428A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US15/186,774 US20160371428A1 (en) 2015-06-19 2016-06-20 Systems and methods for determining aneuploidy risk using sample fetal fraction
US17/365,786 US20210327542A1 (en) 2015-06-19 2021-07-01 Systems and methods for determining aneuploidy risk using sample fetal fraction

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201562182085P 2015-06-19 2015-06-19
US15/186,774 US20160371428A1 (en) 2015-06-19 2016-06-20 Systems and methods for determining aneuploidy risk using sample fetal fraction

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/365,786 Continuation US20210327542A1 (en) 2015-06-19 2021-07-01 Systems and methods for determining aneuploidy risk using sample fetal fraction

Publications (1)

Publication Number Publication Date
US20160371428A1 (en) 2016-12-22

Family

ID=57587067

Family Applications (2)

Application Number Title Priority Date Filing Date
US15/186,774 Abandoned US20160371428A1 (en) 2015-06-19 2016-06-20 Systems and methods for determining aneuploidy risk using sample fetal fraction
US17/365,786 Pending US20210327542A1 (en) 2015-06-19 2021-07-01 Systems and methods for determining aneuploidy risk using sample fetal fraction

Family Applications After (1)

Application Number Title Priority Date Filing Date
US17/365,786 Pending US20210327542A1 (en) 2015-06-19 2021-07-01 Systems and methods for determining aneuploidy risk using sample fetal fraction

Country Status (1)

Country Link
US (2) US20160371428A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019020180A1 (en) * 2017-07-26 2019-01-31 Trisomytest, S.R.O. A method for non-invasive prenatal detection of fetal chromosome aneuploidy from maternal blood based on bayesian network
US20210164048A1 (en) * 2018-08-07 2021-06-03 Singlera Genomics, Inc. A non-invasive prenatal test with accurate fetal fraction measurement
US11525134B2 (en) 2017-10-27 2022-12-13 Juno Diagnostics, Inc. Devices, systems and methods for ultra-low volume liquid biopsy
WO2023034090A1 (en) 2021-09-01 2023-03-09 Natera, Inc. Methods for non-invasive prenatal testing

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9424392B2 (en) 2005-11-26 2016-08-23 Natera, Inc. System and method for cleaning noisy genetic data from target individuals using genetic data from genetically related individuals
US11326208B2 (en) 2010-05-18 2022-05-10 Natera, Inc. Methods for nested PCR amplification of cell-free DNA
US10316362B2 (en) 2010-05-18 2019-06-11 Natera, Inc. Methods for simultaneous amplification of target loci
US11332785B2 (en) 2010-05-18 2022-05-17 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US20190010543A1 (en) 2010-05-18 2019-01-10 Natera, Inc. Methods for simultaneous amplification of target loci
US11339429B2 (en) 2010-05-18 2022-05-24 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11322224B2 (en) 2010-05-18 2022-05-03 Natera, Inc. Methods for non-invasive prenatal ploidy calling
CA2798758C (en) 2010-05-18 2019-05-07 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US9677118B2 (en) 2014-04-21 2017-06-13 Natera, Inc. Methods for simultaneous amplification of target loci
US11939634B2 (en) 2010-05-18 2024-03-26 Natera, Inc. Methods for simultaneous amplification of target loci
US11408031B2 (en) 2010-05-18 2022-08-09 Natera, Inc. Methods for non-invasive prenatal paternity testing
US11332793B2 (en) 2010-05-18 2022-05-17 Natera, Inc. Methods for simultaneous amplification of target loci
RU2717641C2 (en) 2014-04-21 2020-03-24 Натера, Инк. Detection of mutations and ploidy in chromosomal segments
US11479812B2 (en) 2015-05-11 2022-10-25 Natera, Inc. Methods and compositions for determining ploidy
US11485996B2 (en) 2016-10-04 2022-11-01 Natera, Inc. Methods for characterizing copy number variation using proximity-litigation sequencing
US10011870B2 (en) 2016-12-07 2018-07-03 Natera, Inc. Compositions and methods for identifying nucleic acid molecules
US11525159B2 (en) 2018-07-03 2022-12-13 Natera, Inc. Methods for detection of donor-derived cell-free DNA

Also Published As

Publication number Publication date
US20210327542A1 (en) 2021-10-21

Similar Documents

Publication Publication Date Title
US20210327542A1 (en) Systems and methods for determining aneuploidy risk using sample fetal fraction
JP7159270B2 (en) Methods and procedures for non-invasive evaluation of genetic mutations
Chiang et al. The impact of structural variation on human gene expression
US11923046B2 (en) Noninvasive prenatal molecular karyotyping from maternal plasma
ES2939547T3 (en) Methods and procedures for the non-invasive evaluation of genetic variations
DK2183693T4 (en) Diagnosis of fetal chromosomal aneuploidy using genome sequencing
US9121069B2 (en) Diagnosing cancer using genomic sequencing
EP2805280B1 (en) Diagnostic processes that factor experimental conditions
EP3489368B1 (en) Molecular testing of multiple pregnancies
JP2016540520A (en) Methods and processes for non-invasive assessment of chromosomal changes
US8756020B2 (en) Enhanced risk probabilities using biomolecule estimations
US20130261984A1 (en) Methods and systems for determining fetal chromosomal abnormalities
EP3662479A1 (en) A method for non-invasive prenatal detection of fetal sex chromosomal abnormalities and fetal sex determination for singleton and twin pregnancies
CN110770839A (en) Method for the accurate computational decomposition of DNA mixtures from contributors of unknown genotype
CN116583904A (en) Sample validation for cancer classification
TW201720932A (en) Accurate quantification of fetal DNA fraction by shallow-depth sequencing of maternal plasma DNA
EP3588506B1 (en) Systems and methods for genomic and genetic analysis
Chu et al. High‐resolution epigenomic liquid biopsy for noninvasive phenotyping in pregnancy
WO2023097278A1 (en) Sample contamination detection of contaminated fragments for cancer classification

Legal Events

Date Code Title Description
AS Assignment

Owner name: NATERA, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:RYAN, ALLISON;KOBARA, KATIE;DEMKO, ZACHARY;AND OTHERS;SIGNING DATES FROM 20150623 TO 20150624;REEL/FRAME:038957/0045

AS Assignment

Owner name: ORBIMED ROYALTY OPPORTUNITIES II, LP, NEW YORK

Free format text: SECURITY INTEREST;ASSIGNOR:NATERA, INC.;REEL/FRAME:043482/0472

Effective date: 20170808

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

AS Assignment

Owner name: NATERA, INC., CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:ORBIMED ROYALTY OPPORTUNITIES II, LP;REEL/FRAME:052472/0712

Effective date: 20200421

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION