AU2016256598A1 - Breast cancer risk assessment - Google Patents

Publication number
AU2016256598A1
Authority
AU
Australia
Prior art keywords
risk
breast cancer
processing system
subject
cancer phenotype
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2016256598A
Inventor
Paul James
Sarah Dilys SAWYER
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peter MacCallum Cancer Institute
Original Assignee
Peter MacCallum Cancer Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2015901495A
Application filed by Peter MacCallum Cancer Institute
Publication of AU2016256598A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Hospice & Palliative Care (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

A method, system and processing system for assessing a risk of an individual developing a breast cancer phenotype. The method (300) includes determining, in a biological sample from a human subject, an absence of an identified pathogenic mutation to the BRCA-1 and BRCA-2 genes (310). In response to a successful determination, the method (300) includes determining in the biological sample a presence or absence of risk alleles of common allelic variants associated with a breast cancer phenotype at a plurality of independent loci (320). A polygenic risk score for the human subject can then be calculated (330) based upon the presence or absence of the risk alleles and using case data indicative of women who developed breast cancer and who did not carry the pathogenic mutation to the BRCA-1 and BRCA-2 genes. A high polygenic risk score indicates a higher risk for developing a breast cancer phenotype.

Description

BREAST CANCER RISK ASSESSMENT
Field of Invention
[001] The present invention relates to breast cancer risk assessment. In particular embodiments, the present invention relates to a method, system and processing system for assessing a risk of an individual developing a breast cancer phenotype.
Background
[002] Large, international genome-wide association studies (GWAS) have compared many thousands of common genomic variants (also known as single-nucleotide polymorphisms, SNPs) in breast cancer cases from the general population (unselected for family history) against healthy controls to identify variants conferring increased risk of disease. These variants have minor allele frequencies of greater than 5% and have individual breast cancer relative risks of <1.5. Due to biallelic inheritance, these variants give rise to three genotypes: the common allele homozygote (AA), the heterozygote (Aa) and the minor allele homozygote (aa).
[003] Individually, the associations for these variants are too weak to be useful; however, when the risks for these variants are combined multiplicatively (termed polygenic risk, measured by the polygenic risk score, PRS), the risk becomes more appreciable.
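To illustrate how weak per-variant effects can compound when combined multiplicatively, the following minimal sketch multiplies per-allele relative risks across loci. The variant names, risk values and allele counts are hypothetical and are not taken from this specification.

```python
import math

# Hypothetical per-allele relative risks (each below 1.5) and the number of
# risk alleles (0, 1 or 2) carried at each locus. Illustrative values only.
per_allele_rr = {"snp_a": 1.10, "snp_b": 1.25, "snp_c": 1.05, "snp_d": 1.30}
allele_counts = {"snp_a": 2, "snp_b": 1, "snp_c": 0, "snp_d": 2}


def combined_relative_risk(rr, counts):
    """Multiply per-allele relative risks, once for each risk allele carried."""
    return math.prod(rr[snp] ** counts[snp] for snp in rr)


combined = combined_relative_risk(per_allele_rr, allele_counts)
print(f"combined relative risk: {combined:.3f}")
```

In this invented example no single variant exceeds a relative risk of 1.3 per allele, yet the combined value is roughly 2.6.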
[004] Whilst particular methods have previously been disclosed which determine the risk of a female developing a breast cancer phenotype based upon detecting a pathogenic mutation to the BRCA1 and BRCA2 genes, it is known that a large proportion of females who develop a breast cancer phenotype have no identifiable pathogenic mutation within the BRCA1 and BRCA2 genes.
[005] Therefore, it would be desirable to alleviate one or more of the above-mentioned problems or at least provide a commercial alternative.
[006] The reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as an acknowledgment or admission or any form of suggestion that the prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavour to which this specification relates.
Summary
[007] In one aspect there is provided a method for assessing a human subject's risk for developing a breast cancer phenotype comprising: determining, in a biological sample from the human subject, an absence of an identified pathogenic mutation to the BRCA-1 and BRCA-2 genes; in response to a successful determination, determining in the biological sample a presence or absence of risk alleles of common allelic variants associated with a breast cancer phenotype at a plurality of independent loci; and calculating, based upon the presence or absence of the risk alleles, a polygenic risk score for the human subject using odds ratios based upon case data indicative of women who developed breast cancer, wherein each woman of the case data did not carry the pathogenic mutation to the BRCA-1 and BRCA-2 genes; wherein a high polygenic risk score indicates a higher risk for developing a breast cancer phenotype.
[008] In certain embodiments, determining the presence or absence of risk alleles is achieved by amplification of nucleic acid from said sample.
[009] In certain embodiments, amplification comprises polymerase chain reaction (PCR).
[010] In certain embodiments, primers for amplification are located on a chip.
[011] In certain embodiments, the amplification comprises: admixing an amplification primer or amplification primer pair with a nucleic acid template isolated from the biological sample, wherein the primer or primer pair is complementary or partially complementary to a region proximal to or including the polymorphism, and is capable of initiating nucleic acid polymerization by a polymerase on the nucleic acid template; and extending the primer or primer pair in a DNA polymerization reaction comprising a polymerase and the template nucleic acid to generate an amplicon.
[012] In certain embodiments, the amplicon is detected by a process that includes one or more of: hybridizing the amplicon to an array, digesting the amplicon with a restriction enzyme, or real-time PCR analysis.
[013] In certain embodiments, the amplification comprises performing a polymerase chain reaction (PCR), reverse transcriptase PCR (RT-PCR), or ligase chain reaction (LCR) using nucleic acid isolated from the organism or biological sample as a template in the PCR, RT-PCR, or LCR.
[014] In certain embodiments, the method further comprises cleaving amplified nucleic acid.
[015] In certain embodiments, said sample is derived from saliva or blood.
[016] In certain embodiments, the method further comprises the step of making a decision on at least one of timing and frequency of breast cancer phenotype diagnostic testing for said subject.
[017] In certain embodiments, the method further comprises the step of making a decision on at least one of timing and frequency of breast cancer phenotype treatment for said subject.
[018] In certain embodiments, the method further comprises the step of subjecting the subject identified as having an increased risk of developing a breast cancer phenotype to breast cancer phenotype treatment.
[019] In certain embodiments, the presence or absence of risk alleles is determined for all single nucleotide polymorphisms set forth in Table 1.
[020] In certain embodiments, the presence or absence of risk alleles is determined for at least some of the single nucleotide polymorphisms set forth in Table 1.
[021] In certain embodiments, the polygenic risk score is calculated based on said determination.
[022] In certain embodiments, the method further comprises the step of recording results of assessing the human subject's risk for developing a breast cancer phenotype on a computer readable medium.
[023] In certain embodiments, said results are communicated to at least one of the subject and the subject's physician.
[024] In certain embodiments, said results are recorded in the form of a report.
[025] In certain embodiments, the method includes: calculating, at a processing system, a first probability that the subject develops a breast cancer phenotype, wherein the first probability is calculated based upon: the polygenic risk score for the individual; and a first mean and a first standard deviation of first polygenic risk scores for the women of the case data; calculating, at the processing system, a second probability that the subject develops a breast cancer phenotype, wherein the second probability is calculated based upon: the polygenic risk score for the individual; and a second mean and a second standard deviation of polygenic risk scores for a control population; and calculating, at the processing system, the risk for the subject developing a breast cancer phenotype based upon: a population lifetime risk; the first probability; and the second probability.
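One way to read the calculation in paragraph [025] is sketched below. The equations referred to elsewhere in the specification are not reproduced here, so this sketch makes its own assumptions, which may differ from the specification's equations: the first and second probabilities are taken to be normal densities of the subject's score under the case and control distributions, and they are combined with the population lifetime risk by Bayes' theorem. The function name and all numeric values are hypothetical.

```python
from statistics import NormalDist


def absolute_risk(score, case_mean, case_sd, control_mean, control_sd,
                  lifetime_risk):
    """Combine case and control score densities with a population lifetime risk.

    Assumption: both score distributions are normal, and the two
    "probabilities" are densities evaluated at the subject's score.
    """
    # First probability: density of the score among the case data.
    p_case = NormalDist(case_mean, case_sd).pdf(score)
    # Second probability: density of the score among population controls.
    p_control = NormalDist(control_mean, control_sd).pdf(score)
    # Bayes' theorem, using the lifetime risk as the prior probability.
    numerator = lifetime_risk * p_case
    return numerator / (numerator + (1.0 - lifetime_risk) * p_control)


# Hypothetical inputs: a score of 1.8, cases centred at 0.5, controls at 0.0.
risk = absolute_risk(1.8, case_mean=0.5, case_sd=1.0,
                     control_mean=0.0, control_sd=1.0, lifetime_risk=0.12)
print(f"estimated risk: {risk:.3f}")
```

Under these assumptions, a subject whose score is equally likely in both distributions is assigned exactly the population lifetime risk, and a score more typical of cases than of controls raises the estimate above it.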
[026] In certain embodiments, the method includes the processing system calculating the first probability according to Equation 3.
[027] In certain embodiments, the method includes the processing system calculating the second probability according to Equation 4.
[028] In certain embodiments, the method includes the processing system obtaining, from a data store, at least one of: the first standard deviation; the second standard deviation; the first mean; the second mean; and the population lifetime risk.
[029] In certain embodiments, the risk is an absolute risk.
[030] In certain embodiments, the method includes the processing system calculating the absolute risk for the subject according to Equation 5.
[031] In certain embodiments, the method includes the processing system calculating a relative risk for the subject according to Equation 6.
[032] In another aspect there is provided a report comprising the results of the method outlined above.
[033] Other aspects and embodiments will be appreciated throughout the detailed description.
Brief Description of the Figures
[034] Example embodiments should become apparent from the following description, which is given by way of example only, of at least one preferred but non-limiting embodiment, described in connection with the accompanying figures.
[035] Figure 1 illustrates a functional block diagram of an example processing system that can be utilised to embody or give effect to a particular embodiment;
[036] Figure 2 illustrates an example network infrastructure that can be utilised to embody or give effect to a particular embodiment;
[037] Figure 3A illustrates a flowchart representing a method for assessing a risk of an individual developing a breast cancer phenotype;
[038] Figure 3B illustrates a flowchart representing a method performed by a processing system for assessing a risk of an individual developing a breast cancer phenotype;
[039] Figure 4 illustrates a flowchart representing an example initialization method performed by a processing system for assessing a risk of an individual developing a breast cancer phenotype;
[040] Figure 5 illustrates a flowchart representing an example method performed by a processing system for assessing a risk of an individual developing a breast cancer phenotype;
[041] Figure 6 illustrates a sample clinical report generated by the processing system indicating the risk of an individual at 'high risk' of developing a breast cancer phenotype;
[042] Figure 7A is a normal Q-Q plot of polygenic risk scores based on 55 common genomic variants for 915 familial breast cancer cases with no identifiable BRCA1 or BRCA2 mutation; and
[043] Figure 7B is a normal Q-Q plot of polygenic risk scores based on 55 common genomic variants for 711 general population unaffected controls.
Detailed Description of the Preferred Embodiments
Definitions
[044] Unless otherwise indicated, the recombinant protein, cell culture, and immunological techniques utilized in the present invention are standard procedures, well known to those skilled in the art. Such techniques are described and explained throughout the literature in sources such as J. Perbal, A Practical Guide to Molecular Cloning, John Wiley and Sons (1984), J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press (1989), T.A. Brown (editor), Essential Molecular Biology: A Practical Approach, Volumes 1 and 2, IRL Press (1991), D.M. Glover and B.D. Hames (editors), DNA Cloning: A Practical Approach, Volumes 1-4, IRL Press (1995 and 1996), F.M. Ausubel et al. (editors), Current Protocols in Molecular Biology, Greene Pub. Associates and Wiley-Interscience (1988, including all updates until present), Ed Harlow and David Lane (editors), Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory (1988), and J.E. Coligan et al. (editors), Current Protocols in Immunology, John Wiley & Sons (including all updates until present).
[045] Unless stated otherwise, the following terms and phrases as used herein are intended to have the following meanings.
[046] As used herein, "biological sample" refers to any sample that can be taken or derived from a human patient, e.g., bodily fluids (blood, saliva, urine etc.), biopsy, tissue, and/or waste from the patient. Thus, tissue biopsies, stool, sputum, saliva, blood, lymph, tears, sweat, urine, vaginal secretions, or the like can easily be screened for SNPs, as can essentially any tissue of interest that contains the appropriate nucleic acids. These samples are typically taken, following informed consent, from a patient by standard medical laboratory methods. The sample may be in a form taken directly from the patient, or may be at least partially processed (purified) to remove at least some non-nucleic acid material.
[047] As used herein, the term "SNP" or "single nucleotide polymorphism" refers to a genetic variation between individuals; e.g., a single nitrogenous base position in the DNA of organisms that is variable. As used herein, "SNPs" is the plural of SNP. Of course, when one refers to DNA herein, such reference may include derivatives of the DNA such as amplicons, RNA transcripts thereof, etc.
[048] A "polymorphism" is a locus that is variable; that is, within a population, the nucleotide sequence at a polymorphism has more than one version or allele. One example of a polymorphism is a "single nucleotide polymorphism", which is a polymorphism at a single nucleotide position in a genome (the nucleotide at the specified position varies between individuals or populations).
[049] The term "mammal" as used herein refers to any animal classified as a mammal, including, without limitation, humans, higher primates, domestic and farm animals, and zoo, sports or pet animals such as horses, pigs, cattle, dogs, cats and ferrets, etc. In a preferred embodiment of the invention, the mammal is a human or another higher primate.
[050] Administration "in combination with" one or more further therapeutic agents includes simultaneous (concurrent) and consecutive administration in any order.
[051] As used herein, "breast cancer phenotype" refers to a phenotype that displays a predisposition towards developing breast cancer in an individual. A phenotype that displays a predisposition for breast cancer can, for example, show a higher likelihood that the cancer will develop in an individual with the phenotype than in members of a relevant general population under a given set of environmental conditions (diet, physical activity regime, geographic location, etc.).
[052] The term "allele" refers to one of two or more different nucleotide sequences that occur or are encoded at a specific locus, or two or more different polypeptide sequences encoded by such a locus. For example, a first allele can occur on one chromosome, while a second allele occurs on a second homologous chromosome, e.g., as occurs for different chromosomes of a heterozygous individual, or between different homozygous or heterozygous individuals in a population.
[053] An allele "positively" correlates with a trait when it is linked to it and when presence of the allele is an indicator that the trait or trait form will occur in an individual comprising the allele. An allele "negatively" correlates with a trait when it is linked to it and when presence of the allele is an indicator that a trait or trait form will not occur in an individual comprising the allele.
[054] A marker polymorphism or allele is "correlated" or "associated" with a specified phenotype (e.g. breast cancer phenotype susceptibility, etc.) when it can be statistically linked (positively or negatively) to the phenotype. That is, the specified polymorphism occurs more commonly in a case population (e.g., breast cancer patients) than in a control population. This correlation is often inferred as being causal in nature, but it need not be; simple genetic linkage to (association with) a locus for a trait that underlies the phenotype is sufficient for correlation/association to occur.
[055] A "favourable allele" is an allele at a particular locus that positively correlates with a desirable phenotype, e.g., resistance to a breast cancer phenotype, e.g., an allele that negatively correlates with predisposition to a breast cancer phenotype. A favourable allele of a linked marker is a marker allele that segregates with the favourable allele. A favourable allelic form of a chromosome segment is a chromosome segment that includes a nucleotide sequence that positively correlates with the desired phenotype, or that negatively correlates with the unfavourable phenotype at one or more genetic loci physically located on the chromosome segment.
[056] An "unfavourable allele" is an allele at a particular locus that negatively correlates with a desirable phenotype, or that correlates positively with an undesirable phenotype, e.g., positive correlation to breast cancer susceptibility. An unfavourable allele of a linked marker is a marker allele that segregates with the unfavourable allele. An unfavourable allelic form of a chromosome segment is a chromosome segment that includes a nucleotide sequence that negatively correlates with the desired phenotype, or positively correlates with the undesirable phenotype at one or more genetic loci physically located on the chromosome segment.
[057] A "risk allele" is an allele that positively correlates with the risk of developing a disease or condition, such as a breast cancer phenotype, i.e. indicates that an individual has an increased likelihood to develop a breast cancer phenotype.
[058] The "polygenic risk score" is used to define an individual's risk of developing a disease or progressing to a more advanced stage of a disease, based on a large number, typically thousands, of common genetic variants, each of which may make only a modest individual contribution to the disease or its progression but which in aggregate have significant predictive value. In the present case, the polygenic risk score is used to predict the likelihood that a patient will develop a breast cancer phenotype using common single nucleotide polymorphisms (SNPs) associated with the breast cancer phenotype. The log of the odds ratio (OR) from every variant reaching a P<0.1 in the discovery dataset is used to calculate the polygenic risk score. Specifically, for each variant used in the score, the log of the odds ratio is multiplied by the number of reference alleles (0, 1 or 2) carried by the individual, and the products are summed across variants. The resulting log-additive score is then standardized against the same measurement amongst population controls, resulting in the final polygenic risk score.
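The score calculation described in this definition can be sketched as follows. This is a minimal illustration, not the specification's implementation: the variant names, odds ratios and control scores are hypothetical, and "standardized" is assumed here to mean subtracting the mean of the control scores and dividing by their sample standard deviation.

```python
import math
from statistics import mean, stdev


def raw_score(odds_ratios, allele_counts):
    """Sum of log(odds ratio) times the allele count (0, 1 or 2) per variant."""
    return sum(math.log(odds_ratios[v]) * allele_counts[v] for v in odds_ratios)


def standardize(score, control_scores):
    """Express a raw score in control standard deviations (assumed z-score)."""
    return (score - mean(control_scores)) / stdev(control_scores)


# Hypothetical odds ratios and one subject's allele counts.
odds_ratios = {"snp_a": 1.08, "snp_b": 1.20, "snp_c": 0.95}
subject = {"snp_a": 2, "snp_b": 1, "snp_c": 1}

# Hypothetical raw scores previously computed for population controls.
control_scores = [0.05, 0.12, 0.20, 0.27, 0.31, 0.18]

prs = standardize(raw_score(odds_ratios, subject), control_scores)
print(f"standardized polygenic risk score: {prs:.2f}")
```

Summing logarithms is equivalent to multiplying the odds ratios themselves, which is why the score is described as log-additive.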
[059] "Allele frequency" refers to the frequency (proportion or percentage) at which an allele is present at a locus within an individual, within a line, or within a population of lines. For example, for an allele "A", diploid individuals of genotype "AA", "Aa" or "aa" have allele frequencies of 1.0, 0.5, or 0.0, respectively. One can estimate the allele frequency within a line or population (e.g., cases or controls) by averaging the allele frequencies of a sample of individuals from that line or population. Similarly, one can calculate the allele frequency within a population of lines by averaging the allele frequencies of lines that make up the population.
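The averaging described above can be sketched as follows. The sketch assumes that an individual's allele frequency is the fraction of that individual's two alleles that are the allele of interest; the genotype strings and function names are hypothetical.

```python
from statistics import mean


def individual_frequency(genotype, allele="A"):
    """Fraction of an individual's alleles at the locus that match `allele`."""
    return genotype.count(allele) / len(genotype)


def population_frequency(genotypes, allele="A"):
    """Estimate the population frequency by averaging individual frequencies."""
    return mean(individual_frequency(g, allele) for g in genotypes)


# Hypothetical sample of four diploid individuals.
sample = ["AA", "Aa", "aa", "Aa"]
print(population_frequency(sample))
```

The same function applies to cases and controls separately, so the two estimates can be compared for a given allele.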
[060] An individual is "homozygous" if the individual has only one type of allele at a given locus (e.g., a diploid individual has a copy of the same allele at a locus for each of two homologous chromosomes). An individual is "heterozygous" if more than one allele type is present at a given locus (e.g., a diploid individual with one copy each of two different alleles). The term "homogeneity" indicates that members of a group have the same genotype at one or more specific loci. In contrast, the term "heterogeneity" is used to indicate that individuals within the group differ in genotype at one or more specific loci.
[061] A "locus" is a chromosomal position or region. For example, a polymorphic locus is a position or region where a polymorphic nucleic acid, trait determinant, gene or marker is located. In a further example, a "gene locus" is a specific chromosome location (region) in the genome of a species where a specific gene can be found. Similarly, the term "quantitative trait locus" or "QTL" refers to a locus with at least two alleles that differentially affect the expression or alter the variation of a quantitative or continuous phenotypic trait in at least one genetic background, e.g., in at least one population or progeny.
[062] A "marker," "molecular marker" or "marker nucleic acid" refers to a nucleotide sequence or encoded product thereof (e.g., a protein) used as a point of reference when identifying a locus or a linked locus. A marker can be derived from genomic nucleotide sequence or from expressed nucleotide sequences (e.g., from an RNA, nRNA, mRNA, a cDNA, etc.), or from an encoded polypeptide. The term also refers to nucleic acid sequences complementary to or flanking the marker sequences, such as nucleic acids used as probes or primer pairs capable of amplifying the marker sequence. In one aspect, the present invention provides marker loci correlating with a phenotype of interest, e.g., breast cancer susceptibility/resistance. Each of the identified markers is expected to be in close physical and genetic proximity (resulting in physical and/or genetic linkage) to a genetic element, e.g., a QTL, that contributes to the relevant phenotype.
[063] A "marker probe" is a nucleic acid sequence or molecule that can be used to identify the presence of a marker locus, e.g., a nucleic acid probe that is complementary to a marker locus sequence. Nucleic acids are "complementary" when they specifically hybridize in solution, e.g., according to Watson-Crick base pairing rules.
[064] A "marker locus" is a locus that can be used to track the presence of a second linked locus, e.g., a linked or correlated locus that encodes or contributes to the population variation of a phenotypic trait. For example, a marker locus can be used to monitor segregation of alleles at a locus, such as a QTL, that are genetically or physically linked to the marker locus. Thus, a "marker allele," alternatively an "allele of a marker locus" is one of a plurality of polymorphic nucleotide sequences found at a marker locus in a population that is polymorphic for the marker locus. In one aspect, the present invention provides marker loci correlating with a phenotype of interest, e.g., a phenotype increasing the likelihood that an individual will develop a breast cancer phenotype. Markers corresponding to genetic polymorphisms between members of a population can be detected by methods well-established in the art. These include, e.g., PCR-based sequence specific amplification methods, detection of restriction fragment length polymorphisms (RFLP), detection of isozyme markers, detection of allele specific hybridization (ASH), detection of single nucleotide extension, detection of amplified variable sequences of the genome, detection of self-sustained sequence replication, detection of simple sequence repeats (SSRs), detection of single nucleotide polymorphisms (SNPs), or detection of amplified fragment length polymorphisms (AFLPs).
[065] The term "amplifying" in the context of nucleic acid amplification is any process whereby additional copies of a selected nucleic acid (or a transcribed form thereof) are produced. Typical amplification methods include various polymerase based replication methods, including the polymerase chain reaction (PCR), ligase mediated methods such as the ligase chain reaction (LCR) and RNA polymerase based amplification (e.g., by transcription) methods.
[066] A "gene" is one or more sequence(s) of nucleotides in a genome that together encode one or more expressed molecules, e.g., an RNA, or polypeptide. The gene can include coding sequences that are transcribed into RNA which may then be translated into a polypeptide sequence, and can include associated structural or regulatory sequences that aid in replication or expression of the gene.
[067] A "genotype" is the genetic constitution of an individual (or group of individuals) at one or more genetic loci. Genotype is defined by the allele(s) of one or more known loci of the individual, typically, the compilation of alleles inherited from its parents.
[068] A "haplotype" is the genotype of an individual at a plurality of genetic loci on a single DNA strand. Typically, the genetic loci described by a haplotype are physically and genetically linked, i.e., on the same chromosome strand.
[069] A "set" of markers or probes refers to a collection or group of markers or probes, or the data derived therefrom, used for a common purpose, e.g., identifying an individual with a specified phenotype. Frequently, data corresponding to the markers or probes, or derived from their use, is stored in an electronic medium. While each of the members of a set possesses utility with respect to the specified purpose, individual markers selected from the set, as well as subsets including some but not all of the markers, are also effective in achieving the specified purpose.
[070] The term "treatment", as used herein, unless otherwise indicated, means alleviating, inhibiting, blocking, reducing, preventing or otherwise delaying the progression, either partially or completely, of breast cancer or a symptom thereof. The term "treatment" also includes a prophylactic regimen and therefore encompasses alleviating, inhibiting, blocking, preventing or otherwise delaying the onset of breast cancer or a symptom thereof. The term "treatment" encompasses the use of natural and synthetic substances and pharmaceutical agents (e.g., drugs), as well as any other treatment modalities, illustrative examples of which include dietary supplements, physical therapy or exercise regimen, surgical intervention, and combinations thereof.
[071] A "computer readable medium" is an information storage medium that can be accessed by a computer using an available or custom interface. Examples include memory (e.g., ROM or RAM, flash memory, etc.), optical storage media (e.g., CD-ROM), magnetic storage media (e.g., computer hard drives, floppy disks, etc.), punch cards, and many others that are available and known to those skilled in the art. Information can be transmitted between a system of interest and the computer, or to or from the computer to or from the computer readable medium for storage or access of stored information. This transmission can be an electrical transmission, or can be made by other available methods, such as an IR link, a wireless connection, or the like.
[072] The polymorphisms and genes, and corresponding marker probes, amplicons or primers described above can be embodied in any system herein, either in the form of physical nucleic acids, or in the form of system instructions that include sequence information for the nucleic acids. For example, the system can include primers or amplicons corresponding to (or that amplify a portion of) a gene or polymorphism described herein. As in the methods above, the set of marker probes or primers optionally detects a plurality of polymorphisms in a plurality of said genes or genetic loci. Thus, for example, the set of marker probes or primers detects at least one polymorphism in each of these polymorphisms or genes, or any other polymorphism, gene or locus defined herein. Any such probe or primer can include a nucleotide sequence of any such polymorphism or gene, or a complementary nucleic acid thereof, or a transcribed product thereof (e.g., a nRNA or mRNA form produced from a genomic sequence, e.g., by transcription or splicing).
[073] As used herein, "logistic regression" refers to methods for predicting the probability of occurrence of an event by fitting data to a logistic curve. One skilled in the art will understand how to employ such methods in the context of the invention.
Example Processing System and Networked Communications System
[074] A particular embodiment can be realised using a processing system, an example of which is shown in Fig. 1. In particular, the processing system 100 generally includes at least one processor 102, or processing unit or plurality of processors, memory 104, at least one input device 106 and at least one output device 108, coupled together via a bus or group of buses 110. In certain embodiments, input device 106 and output device 108 could be the same device. An interface 112 also can be provided for coupling the processing system 100 to one or more peripheral devices; for example, interface 112 could be a PCI card or PC card. At least one storage device 114 which houses at least one database 116 can also be provided. The memory 104 can be any form of memory device, for example, volatile or non-volatile memory, solid state storage devices, magnetic devices, etc. The processor 102 could include more than one distinct processing device, for example to handle different functions within the processing system 100.
[075] Input device 106 receives input data 118 and can include, for example, a keyboard, a pointer device such as a pen-like device or a mouse, audio receiving device for voice controlled activation such as a microphone, data receiver or antenna such as a modem or wireless data adaptor, data acquisition card, etc. Input data 118 could come from different sources, for example keyboard instructions in conjunction with data received via a network. Output device 108 produces or generates output data 120 and can include, for example, a display device or monitor in which case output data 120 is visual, a printer in which case output data 120 is printed, a port for example a USB port, a peripheral component adaptor, a data transmitter or antenna such as a modem or wireless network adaptor, etc. Output data 120 could be distinct and derived from different output devices, for example a visual display on a monitor in conjunction with data transmitted to a network. A user could view data output, or an interpretation of the data output, on, for example, a monitor or using a printer. The storage device 114 can be any form of data or information storage means, for example, volatile or non-volatile memory, solid state storage devices, magnetic devices, etc.
[076] In use, the processing system 100 is adapted to allow data or information to be stored in and/or retrieved from, via wired or wireless communication means, the at least one database 116 and/or the memory 104. The interface 112 may allow wired and/or wireless communication between the processing unit 102 and peripheral components that may serve a specialised purpose. The processor 102 receives instructions as input data 118 via input device 106 and can display processed results or other output to a user by utilising output device 108. More than one input device 106 and/or output device 108 can be provided. It should be appreciated that the processing system 100 may be any form of terminal, server, specialised hardware, or the like.
[077] The processing device 100 may be a part of a networked communications system 200, as shown in Fig. 2. Processing device 100 could connect to network 202, for example the Internet or a WAN. Input data 118 and output data 120 could be communicated to other devices via network 202. Other terminals, for example, thin client 204, further processing systems 206 and 208, notebook computer 210, mainframe computer 212, PDA 214, pen-based computer 216, server 218, etc., can be connected to network 202. A large variety of other types of terminals or configurations could be utilised. The transfer of information and/or data over network 202 can be achieved using wired communications means 220 or wireless communications means 222. Server 218 can facilitate the transfer of data between network 202 and one or more databases 224. Server 218 and one or more databases 224 provide an example of an information source.
[078] Other networks may communicate with network 202. For example, telecommunications network 230 could facilitate the transfer of data between network 202 and mobile or cellular telephone 232 or a PDA-type device 234, by utilising wireless communication means 236 and receiving/transmitting station 238. Satellite communications network 240 could communicate with satellite signal receiver 242 which receives data signals from satellite 244 which in turn is in remote communication with satellite signal transmitter 246. Terminals, for example further processing system 248, notebook computer 250 or satellite telephone 252, can thereby communicate with network 202. A local network 260, which for example may be a private network, LAN, etc., may also be connected to network 202. For example, network 202 could be connected with Ethernet 262 which connects terminals 264, server 266 which controls the transfer of data to and/or from database 268, and printer 270. Various other types of networks could be utilised.
[079] The processing device 100 is adapted to communicate with other terminals, for example further processing systems 206, 208, by sending and receiving data, 118, 120, to and from the network 202, thereby facilitating possible communication with other components of the networked communications system 200.
[080] Thus, for example, the networks 202, 230, 240 may form part of, or be connected to, the Internet, in which case, the terminals 206, 212, 218, for example, may be web servers, Internet terminals or the like. The networks 202, 230, 240, 260 may be or form part of other communication networks, such as LAN, WAN, Ethernet, token ring, FDDI ring, star, etc., networks, or mobile telephone networks, such as GSM, CDMA, 3G, 4G, etc., networks, and may be wholly or partially wired, including for example optical fibre, or wireless networks, depending on a particular implementation.
Breast Cancer Risk Assessment
[081] The following modes, given by way of example only, are described in order to provide a more precise understanding of the subject matter of a preferred embodiment or embodiments. In the figures, incorporated to illustrate features of an example embodiment, like reference numerals are used to identify like parts throughout the figures.
[082] Embodiments involve detection and analysis of a large number of common genetic variants (e.g. SNPs) which can be used to calculate a polygenic risk score suitable for identifying individuals at a greater risk of developing a breast cancer phenotype.
[083] Detection methods for detecting relevant alleles include a variety of methods well known in the art, e.g., gene amplification technologies. For example, detection can include amplifying the polymorphism or a sequence associated therewith and detecting the resulting amplicon. This can include admixing an amplification primer or amplification primer pair with a nucleic acid template isolated from the organism or biological sample (e.g., comprising the SNP or other polymorphism), where the primer or primer pair is complementary or partially complementary to at least a portion of the target gene, or to a sequence proximal thereto. Amplification can be performed by DNA polymerization reaction (such as PCR, RT-PCR) comprising a polymerase and the template nucleic acid to generate the amplicon. The amplicon is detected by any available detection method, e.g., sequencing, hybridizing the amplicon to an array (or affixing the amplicon to an array and hybridizing probes to it), digesting the amplicon with a restriction enzyme (e.g., RFLP), real-time PCR analysis, single nucleotide extension, allele-specific hybridization, or the like. Genotyping can also be performed by other known techniques, such as using primer mass extension and MALDI-TOF mass spectrum (MS) analysis, such as the MassEXTEND methodology of Sequenom, San Diego, Calif. In certain embodiments, primers for amplification are located on a chip. Amplification can include performing a polymerase chain reaction (PCR), reverse transcriptase PCR (RT-PCR), or ligase chain reaction (LCR) using nucleic acid isolated from the organism or biological sample as a template in the PCR, RT-PCR, or LCR. In certain embodiments, the method further comprises cleaving the amplified nucleic acid.
[084] A polygenic risk score was generated by determining the association of 55 previously identified SNPs from genome-wide association studies in 915 breast cancer cases that had no detectable BRCA1 or BRCA2 gene mutation and 711 general population controls, and creating a rank-ordered list of all independent SNPs below a P<0.1 threshold.
[085] A hypothesis was tested that a polygenic risk score derived from the measured associations of common variants could be predictive of development of a breast cancer phenotype when a pathogenic mutation to the BRCA1 and BRCA2 genes was not detected. It was found that the polygenic risk score effectively identifies individuals at higher risk of developing a breast cancer phenotype. The score for each individual is the number of risk variants carried, weighted for the effect size (Odds Ratio). In the next step, performance of polygenic risk score to predict progression to the development of a breast cancer phenotype was assessed.
[086] The results, discussed in the Example below, show that the polygenic risk score is a useful tool to identify such patients for early intervention, and also to test candidate agents that might be effective in slowing down or inhibiting the development of a breast cancer phenotype by the most vulnerable patient population which do not carry a pathogenic mutation to the BRCA-1 and BRCA-2 genes. In particular, a decision can be made on at least one of timing and frequency of breast cancer phenotype diagnostic testing for the patient. Additionally or alternatively, a decision can be made on at least one of timing and frequency of breast cancer phenotype treatment for the patient. Additionally or alternatively, due to the results, the assessed individual can be subjected to breast cancer phenotype treatment.
[087] In particular embodiments, this method provides enhanced early detection options to identify patients that are at the greatest risk for developing a breast cancer phenotype, making it possible, in some cases, to prevent development, or at least slow down the progress, of a breast cancer phenotype (e.g., by taking early preventative action, treating the patients with any existing treatment option, or making changes in the patient's lifestyle, including diet, exercise, etc.). In addition, the polygenic risk score determined in accordance with the present invention can also assist in providing an indication of how likely it is that a patient will respond to any particular therapy for the treatment of a breast cancer phenotype, including experimental therapies. Accordingly, the present invention also enables the identification of a patient population for testing treatment options for preventing or slowing down the development of a breast cancer phenotype.
[088] Presence of a high genetic propensity to breast cancer phenotypes can be treated as a warning to commence prophylactic or therapeutic treatment. For example, individuals with elevated risk of developing a breast cancer phenotype may be monitored differently (e.g., more frequent mammography) or may be treated prophylactically (e.g., with one or more drugs or surgery). Presence of a high propensity to a breast cancer phenotype also indicates the utility of performing secondary testing, such as a biopsy and other methods known in the art.
[089] Polymorphic profiling is useful, for example, in selecting agents to affect treatment or prophylaxis of breast cancer phenotypes in a given individual. Individuals having similar polymorphic profiles are likely to respond to agents in a similar way.
[090] Polymorphic profiling is also useful for stratifying individuals in clinical trials of agents being tested for capacity to treat breast cancer phenotypes or related conditions. Such trials are performed on treated or control populations having similar or identical polymorphic profiles (see EP 99965095.5), for example, a polymorphic profile indicating an individual has an increased risk of developing a breast cancer phenotype. Use of genetically matched populations eliminates or reduces variation in treatment outcome due to genetic factors, leading to a more accurate assessment of the efficacy of a potential drug. Computer-implemented algorithms can be used to identify more genetically homogenous subpopulations in which treatment or prophylaxis has a significant effect notwithstanding that the treatment or prophylaxis is ineffective in more heterogeneous larger populations. In such methods, data is provided for a first population with a breast cancer phenotype treated with an agent, and a second population also with the breast cancer phenotype but treated with a placebo. Subpopulations of each of the first and second populations are then selected such that the individuals in the subpopulations have greater similarity of polymorphic profiles with each other than do the individuals in the original first and second populations. There are many criteria by which similarity can be assessed. For example, one criterion is to require that individuals in the subpopulations have at least one susceptibility allele at each of at least one of the above genes. Another criterion is that individuals in the subpopulations have at least 75% susceptibility alleles for each of the polymorphic sites at which the polymorphic profile is determined. Regardless of the criteria used to assess similarity, the endpoint data of the subpopulations are compared to determine whether treatment or prophylaxis has achieved a statistically significant result in the subpopulations. 
As a result of computer implementation, billions of criteria for similarity can be analysed to identify one or a few subpopulations showing statistical significance.
[091] Polymorphic profiling is also useful for excluding individuals with no predisposition to breast cancer phenotypes from clinical trials. Including such individuals in the trial increases the size of the population needed to achieve a statistically significant result. Individuals with no predisposition to breast cancer phenotypes can be identified by determining the numbers of resistance and susceptibility alleles in a polymorphic profile as described above. For example, if a subject is genotyped at ten sites in ten genes of the invention associated with breast cancer phenotypes, twenty alleles are determined in total. If over 50%, and preferably over 60% or 75%, of these are resistance alleles, the individual is unlikely to develop a breast cancer phenotype and can be excluded from the trial.
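The exclusion rule in the preceding paragraph can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the "R"/"S" allele encoding are hypothetical and do not come from the specification, and the default 50% threshold is simply the first of the example thresholds given above.

```python
def fraction_resistance(alleles):
    """Return the fraction of alleles called as resistance alleles.

    `alleles` is a list of "R" (resistance) or "S" (susceptibility) calls,
    two per genotyped site. The encoding is an assumption for illustration.
    """
    if not alleles:
        raise ValueError("no alleles supplied")
    return sum(1 for a in alleles if a == "R") / len(alleles)


def exclude_from_trial(alleles, threshold=0.5):
    """True when the resistance-allele fraction exceeds the threshold."""
    return fraction_resistance(alleles) > threshold


# Ten sites genotyped, so twenty alleles in total; 13 of 20 are resistance alleles.
profile = ["R"] * 13 + ["S"] * 7
print(exclude_from_trial(profile))                   # 0.65 > 0.50
print(exclude_from_trial(profile, threshold=0.75))   # 0.65 is not > 0.75
```

Raising the threshold to 60% or 75%, as the paragraph suggests, only changes the `threshold` argument.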
[092] In other embodiments, stratifying individuals in clinical trials may be accomplished using polymorphic profiling in combination with other stratification methods, including, but not limited to, family history, risk models (e.g., Gail Score, Claus model), clinical phenotypes (e.g., atypical lesions), and specific candidate biomarkers.
[093] Polymorphic profiles can also be used after the completion of a clinical trial to elucidate differences in response to a given treatment. For example, the set of polymorphisms can be used to stratify the enrolled patients into disease sub-types or classes. It is also possible to use the polymorphisms to identify subsets of patients with similar polymorphic profiles who have unusual (high or low) response to treatment or who do not respond at all (non-responders). In this way, information about the underlying genetic factors influencing response to treatment can be used in many aspects of the development of treatment (these range from the identification of new targets, through the design of new trials to product labelling and patient targeting). Additionally, the polymorphisms can be used to identify the genetic factors involved in adverse response to treatment (adverse events). For example, patients who show adverse response may have more similar polymorphic profiles than would be expected by chance. This allows the early identification and exclusion of such individuals from treatment. It also provides information that can be used to understand the biological causes of adverse events and to modify the treatment to avoid such outcomes.
[094] Polymorphic profiles can also be used for other purposes, including paternity testing and forensic analysis as described by US 6,525,185. In forensic analysis, the polymorphic profile from a sample at the scene of a crime is compared with that of a suspect. A match between the two is evidence that the suspect in fact committed the crime, whereas lack of a match excludes the suspect. The present polymorphic sites can be used in such methods, as can other polymorphic sites in the human genome.
[095] Referring to Figure 3A there is shown a flowchart representing a method 300 for assessing a human subject's risk for developing a breast cancer phenotype.
[096] In particular, at step 310 the method 300 includes determining, in a biological sample from the human subject, an absence of an identified pathogenic mutation to the BRCA-1 and BRCA-2 genes.
[097] At step 320, in response to a successful determination, the method 300 includes determining in the biological sample the presence or absence of risk alleles of common allelic variants associated with a breast cancer phenotype at a plurality of independent loci.
[098] At step 330 the method 300 includes calculating, based upon the presence or absence of the risk alleles, the polygenic risk score for the human subject using odds ratios based upon case data indicative of women who developed breast cancer, wherein each woman of the case data did not carry the pathogenic mutation to the BRCA-1 and BRCA-2 genes. A high polygenic risk score is indicative of a higher risk for developing a breast cancer phenotype.
[099] Referring to Figure 3B there is shown a flowchart representing a method 350 for assessing a risk of an individual developing a breast cancer phenotype.
[0100] In particular, at step 360, the method 350 includes calculating, at a processing system 100, a first probability that the individual, who does not have an identifiable pathogenic mutation within the BRCA-1 and BRCA-2 genes, develops a breast cancer phenotype. In particular, the first probability is calculated based upon a polygenic risk score for the individual and first score data. The first score data is indicative of polygenic risk scores for individuals diagnosed with a breast cancer phenotype, wherein each individual had no identifiable pathogenic mutation within the BRCA-1 and BRCA-2 genes.
[0101] At step 370, the method 350 includes calculating, at the processing system 100, a second probability that an individual develops a breast cancer phenotype. The second probability is calculated based upon the polygenic risk score for the individual and second score data indicative of polygenic risk scores for a control population.
[0102] At step 380, the method 350 includes calculating, at the processing system 100, the risk for the individual developing breast cancer based upon a population lifetime risk, the first probability and the second probability.
[0103] Referring more specifically to Figure 4 there is shown a flowchart representing an initialisation method 400 used for assessing the risk of an individual developing a breast cancer phenotype.
[0104] In particular, at step 410, the method 400 includes the processing system 100 obtaining case data indicative of common genomic variants for women diagnosed with breast cancer. In particular, 55 common genomic variants as outlined in Table 1 are tested for the women. Each woman has a ‘high risk’ for breast and/or ovarian cancer based on their personal and family history through questionnaires, and in-person consultation. Cancer histories are verified, where possible, through local cancer registries. Importantly, all women for the case data have no identifiable pathogenic mutation in the BRCA-1 and BRCA-2 genes. The case data is stored in a data store 116 accessible by the processing system 100.
Table 1: Common genomic variant associations used to calculate polygenic risk score
(A) Major Allele/Minor Allele; MAF = Minor allele frequency
[0105] At step 420, the method 400 includes the processing system 100 obtaining control data indicative of the genotype data for the same common genomic variants as for the case data for women randomly identified from the general population.
[0106] At step 430, the method 400 includes the processing system 100 performing logistic regression to estimate the per-allele association for each variant and familial breast cancer risk (measured by Odds ratios and 95% confidence intervals). Additionally, the processing system 100 calculates p-values for each variant using the Cochran-Armitage trend test.
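As an illustration of the p-value calculation named in step 430, the sketch below implements the standard Cochran-Armitage trend test for a single variant from genotype counts in cases and controls. The additive weights (0, 1, 2), the normal approximation for the two-sided p-value and the function name are assumptions made for this example; the specification does not state which weights or software were used.

```python
import math


def cochran_armitage_trend(case_counts, control_counts, weights=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x k table of genotype counts.

    case_counts / control_counts: counts of individuals carrying 0, 1 and 2
    copies of the minor allele. Returns (z, two_sided_p).
    """
    r, s, t = list(case_counts), list(control_counts), list(weights)
    if not (len(r) == len(s) == len(t)):
        raise ValueError("counts and weights must have the same length")
    R, S = sum(r), sum(s)                       # row totals (cases, controls)
    N = R + S
    n = [ri + si for ri, si in zip(r, s)]       # column totals per genotype
    # Trend statistic: weighted difference between cases and controls.
    T = sum(ti * (ri * S - si * R) for ti, ri, si in zip(t, r, s))
    # Variance of T under the null hypothesis of no association.
    var = (R * S / N) * (
        sum(ti ** 2 * ni * (N - ni) for ti, ni in zip(t, n))
        - 2 * sum(
            t[i] * t[j] * n[i] * n[j]
            for i in range(len(t))
            for j in range(i + 1, len(t))
        )
    )
    if var <= 0:
        raise ValueError("variance is zero; the variant is not polymorphic")
    z = T / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))        # two-sided normal p-value
    return z, p


# Hypothetical counts: the minor allele is enriched among cases.
z, p = cochran_armitage_trend([10, 20, 30], [30, 20, 10])
print(round(z, 3), p)
```

A positive z indicates that the minor allele is more frequent among cases than controls under the chosen weights.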
[0107] At step 440, the method 400 includes the processing system 100 calculating adjusted odds ratios for any participant whose missing genotype data is less than a threshold number of genotypes, for example fewer than 16 genotypes. The adjusted OR(i) is calculated by the processing system 100 using the minor allele frequency for either breast cancer cases (no identifiable BRCA mutation) or population controls, with an odds ratio assigned based on the calculated heterozygous and homozygous odds ratios for the specific group. The processing system 100 is configured to calculate the adjusted odds ratios for participants according to Equation 1:
... Eqn. 1 where: m is the minor allele frequency; h is the homozygous odds ratio for the minor allele (aa); and h is the odds ratio for the heterozygotes (Aa).
[0108] Each adjusted odds ratio that is calculated by the processing system 100 is stored in the data store 116 as the odds ratio for the respective participant.
[0109] At step 450, the method 400 includes the processing system 100 calculating the polygenic risk score for each participant in the case and control data. In particular, for each variant, the natural log of the per-allele odds ratio (OR) measured between the breast cancer cases and population controls, is combined for each participant. This log-additive score is then standardised by the processing system 100 to the same measure in the control population. In particular the processing system 100 calculates the polygenic risk score (PRS) for each participant according to Equation 2 as shown below:
... Eqn. (2) where: x is the polygenic risk score for the individual; n is the number of variants; A is the per-allele odds ratio for the variant i; R is the number of alleles (0, 1 or 2); and A is the mean score for the same measure in the control population.
[0110] The polygenic risk scores for the individuals of the case data and the control data are stored in the data store 116.
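The log-additive score described for step 450 can be sketched as follows. Equation 2 is not reproduced in this text, so the form used here is an assumption inferred from the surrounding definitions: the sum over variants of ln(OR) multiplied by the allele count, with "standardised" taken to mean subtracting the mean of the same sum in the control population. The function names and the numeric inputs are hypothetical.

```python
import math


def raw_log_additive_score(allele_counts, odds_ratios):
    """Sum over variants of ln(per-allele OR) times allele count (0, 1 or 2)."""
    if len(allele_counts) != len(odds_ratios):
        raise ValueError("one allele count is needed per odds ratio")
    if any(c not in (0, 1, 2) for c in allele_counts):
        raise ValueError("allele counts must be 0, 1 or 2")
    return sum(math.log(o) * c for o, c in zip(odds_ratios, allele_counts))


def polygenic_risk_score(allele_counts, odds_ratios, control_mean):
    """Raw score centred on the control-population mean (assumed reading)."""
    return raw_log_additive_score(allele_counts, odds_ratios) - control_mean


# Hypothetical three-variant example; real use would cover every tested variant.
odds_ratios = [1.2, 0.9, 1.1]
counts = [2, 1, 0]
print(polygenic_risk_score(counts, odds_ratios, control_mean=0.1))
```

Working on the log scale keeps the contribution of each variant additive, which is why protective variants (OR below 1) lower the score.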
[0111] At step 460, the method 400 includes the processing system 100 calculating a first mean value of the calculated polygenic risk scores for the individuals of the case data. Preferably, the processing system 100 may use a statistical executable function stored in memory of the processing system 100 which is applied to the polygenic risk scores for the individuals of the case data to determine the mean of the polygenic risk scores of the individuals associated with the case data. The first mean value is stored in the data store 116.
[0112] At step 470, the method 400 includes the processing system 100 calculating a first standard deviation value of the calculated polygenic risk scores for the individuals of the case data. Preferably, the processing system 100 may call a statistical executable function stored in memory of the processing system 100 which is applied to the polygenic risk scores for the individuals of the case data to determine the standard deviation of the individuals associated with the case data. The first standard deviation is stored in the data store 116.
[0113] At step 480, the method 400 includes the processing system 100 calculating a second mean value of the calculated polygenic risk scores for the individuals of the control data. Preferably, the processing system 100 may use a statistical executable function stored in memory of the processing system 100 which is applied to the polygenic risk scores for the individuals of the control data to determine the mean of the polygenic risk scores of the individuals associated with the control data. The second mean value is stored in the data store 116.
[0114] At step 490, the method 400 includes the processing system 100 calculating a second standard deviation value of the calculated polygenic risk scores for the individuals of the control data. Preferably, the processing system 100 may call a statistical executable function stored in memory of the processing system 100 which is applied to the polygenic risk scores for the individuals of the control data to determine the standard deviation of the individuals associated with the control data. The second standard deviation is stored in the data store 116.
[0115] It will be appreciated that the initialization method need only be performed once or potentially a small number of times such that the first and second mean values and the first and second standard deviation values have been calculated and stored in the data store 116 for use in tests for a test subject. It will be appreciated that new case and control data may be acquired over time and therefore the initialization process may be undertaken again by the processing system 100. It will also be appreciated that the processing system 100 that performs the initialization process may not be the same processing system 100 which is used for assessing the risk for a test subject, wherein the data stored in the data store 116 by the processing system 100 performing the initialization process can be accessed by another processing system 100 determining the risk for a test subject.
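Steps 460 to 490 amount to computing and storing four summary statistics. A minimal sketch using Python's standard library is shown below; the dictionary standing in for the data store 116, the key names and the example scores are hypothetical, and the choice of the sample standard deviation (rather than the population form) is an assumption, since the specification does not say which is used.

```python
import statistics


def summarise_scores(scores):
    """Return the mean and sample standard deviation of a list of PRS values."""
    if len(scores) < 2:
        raise ValueError("at least two scores are needed for a standard deviation")
    return {"mean": statistics.mean(scores), "sd": statistics.stdev(scores)}


def initialise(case_scores, control_scores):
    """Build the reference values that later risk calculations read back."""
    return {
        "case": summarise_scores(case_scores),        # first mean, first SD
        "control": summarise_scores(control_scores),  # second mean, second SD
    }


# Hypothetical polygenic risk scores.
store = initialise([0.2, 0.5, 0.8], [-0.3, 0.0, 0.3])
print(store)
```

Because the result is a plain dictionary, it can be serialised once and reloaded by a different processing system, in line with the paragraph above.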
[0116] Referring to Figure 5 there is shown another example method 500 for assessing a risk of an individual developing a breast cancer phenotype. The method 500 depicted by the flowchart of Figure 5 is intended to be performed after the initialization process described in relation to Figure 4 has been completed.
[0117] In particular, at step 510, the method 500 includes using a set of marker probes or primers configured to detect, in the biological sample derived from the individual, the presence of at least two single nucleotide polymorphisms (SNPs) known to be associated with a breast cancer phenotype. More specifically, the at least two SNPs can include at least some or all of the SNPs provided in Table 1 outlined above.
[0118] At step 520, the method 500 includes determining, based on the one or more signals from the detector, whether the test individual has a pathogenic mutation to the BRCA-1 or BRCA-2 genes. In the event that these mutations do not exist for the test individual, the method continues to step 530, otherwise the method ends.
[0119] At step 530, the method 500 includes the processing system 100 calculating the polygenic risk score for the test individual based upon the presence or absence of the at least two single nucleotide polymorphisms for the biological sample derived from the individual. The same method described above in relation to Equation 2 can be used by the processing system 100 to calculate the PRS for the test individual. Data can be retrieved by the processing system 100 from the data store 116 to calculate the PRS for the test individual.
[0120] At step 540, the method 500 includes the processing system 100 calculating the first probability that the test individual develops a breast cancer phenotype based upon the PRS of the individual, the first mean value and the first standard deviation. More specifically, the processing system 100 calculates the first probability according to Equation 3 below:
...Eqn (3) where: fp (x) is the first probability; σρ is the first standard deviation; μρ is the first mean; and x is the polygenic risk score for the individual.
[0121] The first mean value and the first standard deviation value can be retrieved from the data store 116 for use in Equation 3. Alternatively, a function corresponding to Equation 3 may be stored in memory which has the first mean value and the first standard deviation value already stored in the function. The first probability can be considered a conditional probability. The first probability is stored in memory of the processing system 100, which can be either volatile or non-volatile memory.
[0122] At step 550, the method 500 includes the processing system 100 calculating the second probability that the test individual develops a breast cancer phenotype based upon the PRS of the individual, the second mean value and the second standard deviation. More specifically, the processing system 100 calculates the second probability according to Equation 4 below:
... Eqn (4) where: fp (x) is the second probability; σρ is the second standard deviation; μρ is the second mean; and x is the polygenic risk score for the individual.
[0123] The second mean value and the second standard deviation value can be retrieved from the data store 116 for use in Equation 4. Alternatively, a function corresponding to Equation 4 may be stored in memory which has the second mean value and the second standard deviation value already stored in the function. The second probability can be considered a conditional probability. The second probability is stored in memory of the processing system 100, which can be either volatile or non-volatile memory.
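Equations 3 and 4 are not reproduced in this text. Because each is parameterised only by a mean, a standard deviation and the individual's polygenic risk score, one plausible reading is a normal density evaluated at that score. The sketch below assumes that form; it should not be taken as the specification's actual equations, and the function name and numeric values are hypothetical.

```python
import math


def normal_density(x, mean, sd):
    """Normal (Gaussian) density at x for the given mean and standard deviation."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Hypothetical reference values for the case and control score distributions.
prs = 0.6
first_probability = normal_density(prs, mean=0.5, sd=1.0)    # case distribution
second_probability = normal_density(prs, mean=0.0, sd=1.0)   # control distribution
print(first_probability, second_probability)
```

Under this assumption the same function serves both steps 540 and 550; only the stored mean and standard deviation differ.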
[0124] At step 560, the method 500 includes the processing system 100 calculating the absolute risk for the test individual to develop a breast cancer phenotype based on the first probability, the second probability and the population lifetime risk which can be retrieved by the processing system 100 from the data store 116. In particular, the processing system 100 can calculate the absolute risk for the test individual according to Equation 5 as shown below:
... Eqn (5) where: τ is the population lifetime risk; fp (x) is the first probability; and fp (x) is the second probability.
[0125] The population lifetime risk can be retrieved by the processing system 100 from the data store 116. The absolute risk can be stored in memory of the processing system 100.
[0126] At step 570, the method 500 includes the processing system 100 calculating the relative risk of the test individual developing a breast cancer phenotype. In particular, the processing system 100 is configured to calculate the relative risk of the test individual according to Equation 6 as shown below:
... Eqn (6) where: τ is the population lifetime risk.
[0127] The relative risk can be stored in memory 104 of the processing system 100.
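As an illustration of steps 540 to 570, the following Python sketch makes three assumptions that are not confirmed by the text reproduced here: that Equations 3 and 4 are normal probability density functions evaluated at the PRS, that Equation 5 combines the two probabilities with the population lifetime risk τ by Bayes' rule, and that Equation 6 divides the absolute risk by τ. The function names and all numeric values are hypothetical.

```python
from statistics import NormalDist


def density(x, mean, sd):
    """Normal probability density at x (assumed form of Equations 3 and 4)."""
    return NormalDist(mu=mean, sigma=sd).pdf(x)


def absolute_risk(x, case_mean, case_sd, control_mean, control_sd, tau):
    """Assumed form of Equation 5: Bayes' rule with prior tau."""
    first = density(x, case_mean, case_sd)          # first probability (case data)
    second = density(x, control_mean, control_sd)   # second probability (controls)
    return tau * first / (tau * first + (1.0 - tau) * second)


def relative_risk(abs_risk, tau):
    """Assumed form of Equation 6: absolute risk relative to the population risk."""
    return abs_risk / tau


# Hypothetical inputs: a standardised PRS of 0.8 and a 10% lifetime risk.
tau = 0.10
risk = absolute_risk(0.8, case_mean=0.4, case_sd=1.0,
                     control_mean=0.0, control_sd=1.0, tau=tau)
print(round(risk, 3), round(relative_risk(risk, tau), 2))
```

Under these assumptions, a PRS at which the case and control densities are equal returns an absolute risk equal to τ and a relative risk of 1, which corresponds to the population average line of the report.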
[0128] At step 580, the method 500 includes the processing system 100 generating a clinical report in relation to the risk that the test individual develops a breast cancer phenotype. A sample report is shown in Figure 6. More specifically, the following information is presented in the report:
a. The relative risk for familial breast cancer cases plotted along the x-axis of a graph forming a ‘bell-shaped’ curve (n=915, line 610).
b. The population average (relative risk of 1) indicated on the graph with solid line 620.
c. Lifestyle factors known to increase breast cancer risk indicated on the graph, which include alcohol consumption (RR 1.13, 2 standard drinks per week, line 625, [26]) and a body mass index above 30 (RR 1.25, line 630).
d. The breast cancer relative risk associated with a BRCA-1 and BRCA-2 mutation (RR>10), box 640.
e. The relative risk of breast cancer calculated using the above described method indicated on the graph by line 650.
[0129] The clinical report generated can include interpretation statements for patients who fall into high, intermediate and low-risk categories based on their PRS. Risk categories are based on quartiles of PRS for the breast cancer cases: ‘low’ consists of cases in the first quartile; ‘intermediate’ comprises cases in the second and third quartiles; and ‘high’ consists of cases in the fourth quartile. The categories are outlined in Table 2 below.
Table 2: Polygenic risk score category ranges and interpretation statements
[0130] The results which can be embodied in the form of the report can be communicated to at least one of the subject and the subject's physician.
[0131] It will be appreciated that the processing system 100 may be a distributed processing system which includes a plurality of processing systems which perform different processing steps.
[0132] Additionally, it will be appreciated that the processing system 100 may be embodied as a server processing system which serves client processing systems. In particular, a client processing system transfers, via a computer network, a risk request to the server processing system which is indicative of genotype data for the test individual. The server processing system then performs the method 500 as described above to determine the absolute and relative risk for the test individual and generate the report. The server processing system can then transfer, via the computer network, a risk response indicative of the absolute and relative risk for the test individual and/or the generated report. It will be appreciated that the computer network could be a local network or a public network such as the Internet.
Example
[0133] Women diagnosed with breast cancer were tested for 55 common genomic variants (summarised in Table 1 above). These women are participants from the Victorian familial breast and ovarian cancer cohort (VFBOCC) and had all been assessed as ‘high risk’ for breast and/or ovarian cancer based on their personal and family history through questionnaires and in-person consultation at Victorian and Tasmanian familial cancer clinics. Cancer histories were verified, where possible, through local cancer registries. All cases had consented to diagnostic genetic testing of the BRCA-1 and BRCA-2 genes and had no identifiable pathogenic mutation.
[0134] General population control genotype data was obtained from the Australian Ovarian Cancer Study. Population controls consisted of unaffected adults randomly identified from the Australian electoral roll.
[0135] Logistic regression was used to estimate the per-allele association between each variant and familial breast cancer risk (measured by odds ratios and 95% confidence intervals), and p-values were calculated by the Cochran-Armitage trend test. Statistical analysis was conducted using statistical software and Microsoft Excel.
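For illustration only, the Cochran-Armitage trend test named above can be sketched in Python for a single variant. The sketch assumes a 2 x 3 table of genotype counts (0, 1 or 2 risk alleles) with linear weights 0, 1, 2 and a normal approximation for the two-sided p-value; the function name and the counts are hypothetical and are not taken from the study data.

```python
import math
from statistics import NormalDist


def cochran_armitage_trend(case_counts, control_counts, weights=(0, 1, 2)):
    """Return (z, two-sided p) for a 2 x 3 genotype table."""
    totals = [r + s for r, s in zip(case_counts, control_counts)]
    n_cases, n_controls = sum(case_counts), sum(control_counts)
    n_all = n_cases + n_controls

    sum_tr = sum(t * r for t, r in zip(weights, case_counts))
    sum_tn = sum(t * n for t, n in zip(weights, totals))
    sum_t2n = sum(t * t * n for t, n in zip(weights, totals))

    # Trend statistic and its variance under the null hypothesis of no trend.
    statistic = n_all * sum_tr - n_cases * sum_tn
    variance = n_cases * n_controls * (n_all * sum_t2n - sum_tn ** 2) / n_all

    z = statistic / math.sqrt(variance)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical genotype counts for one variant (915 cases, 711 controls).
z, p = cochran_armitage_trend([300, 450, 165], [280, 330, 101])
print(round(z, 2), round(p, 4))
```

A positive z indicates that the proportion of cases rises with the number of risk alleles carried.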
[0136] The polygenic risk score (PRS) was calculated as described by previous studies (Comen et al., 2011, Pharoah et al., 2008, Gail, 2008, Sawyer et al., 2012). For each variant, the log of the per-allele odds ratio (OR), measured between the breast cancer cases and population controls, was combined for each individual (x). This log-additive score was then standardised to the same measure in population controls as discussed above. Further details of this analysis can be found in the following publication (22 variants): Sawyer, S, Mitchell, G, McKinley, J, Chenevix-Trench, G, Beesley, J, Chen, X, Bowtell, D, Trainer, A, Harris, M, Lindeman, G and James, PA. A role for common genetic variants in the assessment of familial breast cancer, Journal of Clinical Oncology 30(35):4330.
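A minimal Python sketch of a log-additive score of this kind is given below. It assumes that each genotype is coded as the number of risk alleles carried (0, 1 or 2) and that standardising means subtracting the mean and dividing by the standard deviation of the same score in population controls. The variant identifiers, odds ratios and genotypes are hypothetical.

```python
import math
from statistics import mean, stdev


def raw_prs(genotypes, odds_ratios):
    """Sum over variants of risk-allele count times log per-allele odds ratio."""
    return sum(count * math.log(odds_ratios[snp])
               for snp, count in genotypes.items())


def standardise(score, control_scores):
    """Express a raw score in units of the control population's spread."""
    return (score - mean(control_scores)) / stdev(control_scores)


# Hypothetical per-allele odds ratios for three variants.
odds_ratios = {"snp_a": 1.20, "snp_b": 1.10, "snp_c": 0.95}

# Hypothetical control genotypes, used only to standardise the score.
controls = [
    {"snp_a": 0, "snp_b": 1, "snp_c": 2},
    {"snp_a": 1, "snp_b": 1, "snp_c": 1},
    {"snp_a": 2, "snp_b": 0, "snp_c": 0},
    {"snp_a": 1, "snp_b": 2, "snp_c": 1},
]
control_scores = [raw_prs(g, odds_ratios) for g in controls]

subject = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
print(standardise(raw_prs(subject, odds_ratios), control_scores))
```

Because the score is additive on the log scale, a variant with an odds ratio below 1 lowers the score for each allele carried.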
[0137] Adjusted odds ratios were used for participants missing data for no more than 16 genotypes. The adjusted odds ratio was calculated using the minor allele frequency for breast cancer cases (mutation negative), and an odds ratio was assigned based on the calculated heterozygous and homozygous odds ratios.
[0138] The parameters of the polygenic risk score for each participant group (breast cancer cases and population controls) were examined by quantile-quantile probability plots generated using statistical software. The observed normal values (x) and expected normal values (y) were found to be linearly related (y = x), indicating the data was normally distributed, as shown in Figure 7A and Figure 7B. Figure 7A shows the mutation-negative breast cancer patients (n=915) and Figure 7B shows the control population (n=711).
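The quantile-quantile check described above can be sketched in Python as follows. The sketch assumes plotting positions of (i - 0.5)/n, which is one common convention and not necessarily the one used by the statistical software referred to; the function name and the simulated scores are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def qq_points(scores):
    """Return (observed, expected) pairs for a normal Q-Q plot."""
    n = len(scores)
    fitted = NormalDist(mean(scores), stdev(scores))
    observed = sorted(scores)
    # Expected value of each ordered score if the data were normal.
    expected = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(observed, expected))


# Simulated scores standing in for the PRS of one participant group.
simulated = NormalDist(0.0, 1.0).samples(200, seed=7)
for observed, expected in qq_points(simulated)[:3]:
    print(round(observed, 2), round(expected, 2))
```

Points lying close to the line y = x suggest that the scores are approximately normally distributed.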
[0139] The polygenic risk score mean (μ) and standard deviation (σ) were then calculated for breast cancer cases and population controls.
[0140] Probability functions were then used to calculate an absolute risk of breast cancer that took into consideration a 10% population lifetime risk (τ) of developing breast cancer within the general population, based on data provided by the Australian Institute of Health and Welfare (AIHW, 2012).
[0141] The relative and absolute risks (derived from the polygenic risk score) for familial breast cancer cases (mutation negative BRCA1/2 index cases of the VFBOCC) were then presented in a clinical report generated in Microsoft Excel (see Figure 6).
[0142] The relative risk for familial breast cancer cases was plotted along the x-axis of a graph forming a ‘bell-shaped’ curve (n=915, line 610, Figure 6).
[0143] The population average (relative risk of 1) was then indicated on the graph with line 620.
[0144] Lifestyle factors known to increase breast cancer risk were included in the graph and consisted of alcohol consumption (RR 1.13, 2 standard drinks per week, line 625 (Hamajima et al., 2002)) and a body mass index above 30 (RR 1.25, line 630 (Parkin and Boyd, 2011)).
[0145] The breast cancer relative risk associated with a BRCA-1 and BRCA-2 mutation (RR>10) was greater than the range derived from the common genomic variant information (Antoniou et al., 2003). Consequently, the relative risk for a BRCA mutation was indicated in a box 640 on the bottom right side of the graph.
[0146] The index case relative risk of breast cancer was calculated from their PRS and indicated on the graph by a line 650.
[0147] The clinical report included interpretation statements for patients who fell into high, intermediate and low-risk categories based on their PRS (see Table 2). The interpretation statements for each risk category were extrapolated from the findings of Sawyer et al. (2012) and unpublished data on 55 common genomic variants.
Risk categories were based on quartiles of PRS for the breast cancer cases: ‘low’ consisted of cases in the first quartile; ‘intermediate’ comprised cases in the second and third quartiles; and ‘high’ consisted of cases in the fourth quartile.
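The quartile-based categories can be sketched in Python as follows. The actual category ranges of Table 2 are not reproduced in this text, so the cut points here are computed from hypothetical case scores; the function name and the default quantile method of Python's statistics module are assumptions.

```python
from statistics import quantiles


def risk_category(score, case_scores):
    """Assign 'low', 'intermediate' or 'high' from quartiles of the case PRS."""
    first_cut, _median, third_cut = quantiles(case_scores, n=4)
    if score <= first_cut:
        return "low"            # first quartile
    if score > third_cut:
        return "high"           # fourth quartile
    return "intermediate"       # second and third quartiles


# Hypothetical case scores standing in for the PRS of the breast cancer cases.
case_scores = [x / 10.0 for x in range(-20, 21)]
for score in (-1.5, 0.0, 1.5):
    print(score, risk_category(score, case_scores))
```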
[0148] The performance of the polygenic risk score in distinguishing familial breast cancer cases (n=915) from the general population (n=711) was examined using receiver operating characteristic (ROC) curves. The discriminatory accuracy of the polygenic risk score is measured by the area under the curve (AUC) (the balance between sensitivity and specificity). An AUC of 0.5 indicates that the score's discriminatory accuracy is no better than chance, whereas an AUC of 1.0 indicates that the score has perfect discriminatory accuracy.
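As a hedged illustration of the AUC, the Python sketch below uses the nonparametric estimate: the proportion of case-control pairs in which the case has the higher score, with ties counted as one half. This is a sketch of the general technique rather than the analysis software used in the study, and the scores are hypothetical.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical standardised scores for a handful of cases and controls.
cases = [0.9, 0.4, 1.3, -0.2, 0.7]
controls = [0.1, -0.5, 0.6, -1.0, 0.3]
print(auc(cases, controls))
```

Identical score distributions in the two groups give a value of 0.5, and complete separation gives 1.0, matching the two reference points described above.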
[0149] ROC analysis was used to compare the area under the curve for polygenic risk scores that were calculated by adding single nucleotide variants (SNPs, also known as common genetic variants). Table 3 below lists the area under the curve for each combination of SNPs. A polygenic risk score based on 2 SNPs has an AUC of 0.52, indicating its performance in predicting breast cancer cases is no better than chance alone. However, the AUC gradually increases when more SNPs are used to generate a PRS. When all 55 SNPs are used in the polygenic risk score, an AUC of 0.64 is achieved.
Table 3 - Area under the curve for descending combinations of SNPs according to odds ratio from Table 1
[0150] A pairwise comparison was used to determine the confidence interval and p-value for the difference in the area under the curve (AUC) between the risk prediction scores [27]. This analysis found there was a statistically significant difference between the AUC for a PRS based on 2 SNPs and the AUC for the PRS derived from 55 SNPs (P < 0.0001). This indicates there is a significant improvement in the ability of the polygenic risk score to predict familial breast cancer when information from all 55 SNPs is utilised to generate a PRS. Table 4 shows the results of the pairwise comparison of the ROC analysis of Table 3, based on PRS generated according to descending per-allele odds ratio associations.
a DeLong et al., 1988
Table 4 - Pairwise comparison of ROC analysis for descending combinations of SNPs according to odds ratio from Table 1
[0151] Referring to Table 5, there is shown another example ROC analysis in which polygenic risk scores have been generated by adding SNPs from Table 1 in a random order.
[0152] Table 5 below lists the area under the curve for each combination of SNPs. A polygenic risk score based on 2 SNPs has an AUC of 0.55, indicating its performance in predicting breast cancer cases is no better than chance alone. Once more, the AUC gradually increases when more SNPs are used to generate a PRS. When all 55 SNPs are used in the polygenic risk score, an AUC of 0.64 is achieved.
Table 5 - Area under the curve for random combinations of SNPs according to odds ratio from Table 1
[0153] A pairwise comparison was used to determine the confidence interval and p-value for the difference in the area under the curve (AUC) between the risk prediction scores [27]. Table 6 shows the results of the pairwise comparison of the ROC analysis of Table 5, based on PRS generated by randomly adding SNPs.
a DeLong et al., 1988
Table 6 - Pairwise comparison of ROC analysis for randomly added SNPs from Table 1
[0154] The above analysis found there was a statistically significant difference between the AUC for a PRS based on 2 SNPs and the AUC for the PRS derived from 55 SNPs (P < 0.0001). This indicates there is a significant improvement in the ability of the polygenic risk score to predict familial breast cancer when information from all 55 SNPs is utilised to generate a PRS.
Conclusion
[0155] Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.
[0156] Many modifications will be apparent to those skilled in the art without departing from the scope of the present invention.
References
1. Michailidou, K., et al., Large-scale genotyping identifies 41 new loci associated with breast cancer risk. Nat Genet, 2013. 45(4): p. 353-61.
2. Thomas, G., et al., A multistage genome-wide association study in breast cancer identifies two new risk alleles at 1p11.2 and 14q24.1 (RAD51L1). Nat Genet, 2009. 41(5): p. 579-84.
3. Garcia-Closas, M., et al., Genome-wide association studies identify four ER negative-specific breast cancer risk loci. Nat Genet, 2013. 45(4): p. 392-8.
4. Cox, A., et al., A common coding variant in CASP8 is associated with breast cancer risk. Nat Genet, 2007. 39(3): p. 352-8.
5. Stacey, S.N., et al., Common variants on chromosomes 2q35 and 16q12 confer susceptibility to estrogen receptor-positive breast cancer. Nat Genet, 2007. 39(7): p. 865-9.
6. Gaudet, M.M., et al., Identification of a BRCA2-specific modifier locus at 6p24 related to breast cancer risk. PLoS Genet, 2013. 9(3): p. e1003173.
7. Stacey, S.N., et al., Common variants on chromosome 5p12 confer susceptibility to estrogen receptor-positive breast cancer. Nat Genet, 2008. 40(6): p. 703-6.
8. Wang, X., et al., Common variants associated with breast cancer in genome-wide association studies are modifiers of breast cancer risk in BRCA1 and BRCA2 mutation carriers. Hum Mol Genet, 2010. 19(14): p. 2886-97.
9. Bojesen, S.E., et al., Multiple independent variants at the TERT locus are associated with telomere length and risks of breast and ovarian cancer. Nat Genet, 2013. 45(4): p. 371-84, 384e1-2.
10. Easton, D.F., et al., Genome-wide association study identifies novel breast cancer susceptibility loci. Nature, 2007. 447(7148): p. 1087-93.
11. Zheng, W., et al., Genome-wide association study identifies a new breast cancer susceptibility locus at 6q25.1. Nat Genet, 2009. 41(3): p. 324-8.
12. Antoniou, A.C., et al., Common alleles at 6q25.1 and 1p11.2 are associated with breast cancer risk for BRCA1 and BRCA2 mutation carriers. Hum Mol Genet, 2011. 20(16): p. 3304-21.
13. Turnbull, C., et al., Genome-wide association study identifies five new breast cancer susceptibility loci. Nat Genet, 2010. 42(6): p. 504-7.
14. Fletcher, O., et al., Novel breast cancer susceptibility locus at 9q31.2: results of a genome-wide association study. J Natl Cancer Inst, 2011. 103(5): p. 425-35.
15. Antoniou, A.C., et al., Common variants in LSP1, 2q35 and 8q24 and breast cancer risk for BRCA1 and BRCA2 mutation carriers. Hum Mol Genet, 2009. 18(22): p. 4442-56.
16. French, J.D., et al., Functional variants at the 11q13 risk locus for breast cancer regulate cyclin D1 expression through long-range enhancers. Am J Hum Genet, 2013. 92(4): p. 489-503.
17. Antoniou, A.C., et al., Common variants at 12p11, 12q24, 9p21, 9q31.2 and in ZNF365 are associated with breast cancer risk for BRCA1 and/or BRCA2 mutation carriers. Breast Cancer Res, 2012. 14(1): p. R33.
18. Ghoussaini, M., et al., Genome-wide association analysis identifies three new breast cancer susceptibility loci. Nat Genet, 2012.
19. Couch, F.J., et al., Common Variants at the 19p13.1 and ZNF365 Loci Are Associated with ER Subtypes of Breast Cancer and Ovarian Cancer Risk in BRCA1 and BRCA2 Mutation Carriers. Cancer Epidemiol Biomarkers Prev, 2012.
20. Antoniou, A.C., et al., Common variants at 12p11, 12q24, 9p21, 9q31.2 and in ZNF365 are associated with breast cancer risk for BRCA1 and/or BRCA2 mutation carriers. Breast Cancer Res, 2012. 14(1): p. R33.
21. Figueroa, J.D., et al., Associations of common variants at 1p11.2 and 14q24.1 (RAD51L1) with breast cancer risk and heterogeneity by tumor subtype: findings from the Breast Cancer Association Consortium. Hum Mol Genet, 2011.
22. Couch, F.J., et al., Genome-wide association study in BRCA1 mutation carriers identifies novel loci associated with breast and ovarian cancer risk. PLoS Genet, 2013. 9(3): p. e1003212.
23. Orr, N., et al., Genetic variants at chromosomes 2q35, 5p12, 6q25.1, 10q26.13, and 16q12.1 influence the risk of breast cancer in men. PLoS Genet, 2011. 7(9): p. e1002290.
24. Ahmed, S., et al., Newly discovered breast cancer susceptibility loci on 3p24 and 17q23.2. Nat Genet, 2009. 41(5): p. 585-90.
25. Stevens, K.N., et al., Common breast cancer susceptibility loci are associated with triple-negative breast cancer. Cancer Res, 2011. 71(19): p. 6240-9.
26. Hamajima, N., et al., Alcohol, tobacco and breast cancer: collaborative reanalysis of individual data from 53 epidemiological studies, including 58,515 women with breast cancer and 95,067 women without the disease. Br J Cancer, 2002. 87(11): p. 1234-45.
27. DeLong, E.R., D.M. DeLong, and D.L. Clarke-Pearson, Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics, 1988. 44(3): p. 837-45.

Claims (24)

  1. A method for assessing a human subject's risk for developing a breast cancer phenotype comprising: determining, in a biological sample from the human subject, an absence of an identified pathogenic mutation to the BRCA-1 and BRCA-2 genes; in response to a successful determination, determining in the biological sample a presence or absence of risk alleles of common allelic variants associated with a breast cancer phenotype at a plurality of independent loci; and calculating, based upon the presence or absence of the risk alleles, a polygenic risk score for the human subject using odds ratios based upon case data indicative of women who developed breast cancer, wherein each woman of the case data did not carry the pathogenic mutation to the BRCA-1 and BRCA-2 genes; wherein a high polygenic risk score indicates a higher risk for developing a breast cancer phenotype.
  2. The method according to claim 1, wherein determining the presence or absence of risk alleles is achieved by amplification of nucleic acid from said sample.
  3. The method according to claim 2, wherein amplification comprises polymerase chain reaction (PCR).
  4. The method according to claim 3, wherein primers for amplification are located on a chip.
  5. The method according to claim 3, wherein the amplification comprises: admixing an amplification primer or amplification primer pair with a nucleic acid template isolated from the biological sample, wherein the primer or primer pair is complementary or partially complementary to a region proximal to or including the polymorphism, and is capable of initiating nucleic acid polymerization by a polymerase on the nucleic acid template; and extending the primer or primer pair in a DNA polymerization reaction comprising a polymerase and the template nucleic acid to generate an amplicon.
  6. The method according to claim 5, wherein the amplicon is detected by a process that includes one or more of: hybridizing the amplicon to an array, digesting the amplicon with a restriction enzyme, or real-time PCR analysis.
  7. The method of claim 2, wherein the amplification comprises performing a polymerase chain reaction (PCR), reverse transcriptase PCR (RT-PCR), or ligase chain reaction (LCR) using nucleic acid isolated from the organism or biological sample as a template in the PCR, RT-PCR, or LCR.
  8. The method of claim 2, further comprising cleaving amplified nucleic acid.
  9. The method of any one of claims 1 to 8, wherein said sample is derived from saliva or blood.
  10. The method of any one of claims 1 to 9, further comprising the step of making a decision on at least one of timing and frequency of breast cancer phenotype diagnostic testing for said subject.
  11. The method of any one of claims 1 to 10, further comprising the step of making a decision on at least one of timing and frequency of breast cancer phenotype treatment for said subject.
  12. The method of any one of claims 1 to 11, further comprising the step of subjecting the subject identified as having an increased risk of developing a breast cancer phenotype to breast cancer phenotype treatment.
  13. The method of any one of claims 1 to 12, wherein the presence or absence of the risk alleles is determined for all single nucleotide polymorphisms set forth in Table 1.
  14. The method of any one of claims 1 to 12, wherein the presence or absence of the risk alleles is determined for at least some of the single nucleotide polymorphisms set forth in Table 1.
  15. The method of any one of claims 1 to 14, further comprising the step of recording results of assessing the human subject's risk for developing a breast cancer phenotype on a computer readable medium.
  16. The method of claim 15, wherein said results are communicated to at least one of the subject and the subject's physician.
  17. The method of claim 15 or 16, wherein said results are recorded in the form of a report.
  18. The method according to any one of claims 1 to 17, wherein the method includes: calculating, at a processing system, a first probability that the subject develops a breast cancer phenotype, wherein the first probability is calculated based upon: the polygenic risk score for the individual; and a first mean and a first standard deviation of first polygenic risk scores for the women of the case data; calculating, at the processing system, a second probability that the subject develops a breast cancer phenotype, wherein the second probability is calculated based upon: the polygenic risk score for the individual; and a second mean and a second standard deviation of polygenic risk scores for a control population; and calculating, at the processing system, the risk for the individual developing breast cancer based upon: a population lifetime risk; the first probability; and the second probability.
  19. The method according to claim 18, wherein the method includes the processing system calculating the first probability according to Equation 3.
  20. The method according to claim 18 or 19, wherein the method includes the processing system calculating the second probability according to Equation 4.
  21. The method according to any one of claims 18 to 20, wherein the method includes the processing system obtaining, from a data store, at least one of: the first standard deviation; the second standard deviation; the first mean; the second mean; and the population lifetime risk.
  22. The method according to any one of claims 18 to 21, wherein the risk is an absolute risk.
  23. The method according to claim 22, wherein the method includes the processing system calculating the absolute risk for the subject according to Equation 5.
  24. The method according to claim 23, wherein the method includes the processing system calculating a relative risk for the subject according to Equation 6.
AU2016256598A 2015-04-27 2016-04-27 Breast cancer risk assessment Abandoned AU2016256598A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
AU2015901495 2015-04-27
AU2015901495A AU2015901495A0 (en) 2015-04-27 Breast cancer risk assessment
PCT/AU2016/050297 WO2016172764A1 (en) 2015-04-27 2016-04-27 Breast cancer risk assessment

Publications (1)

Publication Number Publication Date
AU2016256598A1 true AU2016256598A1 (en) 2017-10-26

Family

ID=57197937

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2016256598A Abandoned AU2016256598A1 (en) 2015-04-27 2016-04-27 Breast cancer risk assessment

Country Status (2)

Country Link
AU (1) AU2016256598A1 (en)
WO (1) WO2016172764A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220157404A1 (en) * 2019-03-19 2022-05-19 Themba Inc. Using relatives' information to determine genetic risk for non-mendelian phenotypes

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107868799A (en) * 2017-11-20 2018-04-03 湖南丰晖生物科技有限公司 Expression vector and its construction method and application
US20200294618A1 (en) * 2019-03-12 2020-09-17 Ambry Genetics Corporation Method to perform medical procedures on breast cancer patients guided by an snp derived polygenic risk score
CN110172510A (en) * 2019-04-30 2019-08-27 北京组学生物科技有限公司 For the reagent system and kit of breast cancer and its application
EP4139508A1 (en) * 2020-04-20 2023-03-01 Myriad Genetics, Inc. Comprehensive polygenic risk prediction for breast cancer
WO2024085660A1 (en) * 2022-10-18 2024-04-25 제노플랜 인크 Device and method for predicting risk of disease incidence

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009097270A2 (en) * 2008-01-28 2009-08-06 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Method of determining breast cancer risk

Also Published As

Publication number Publication date
WO2016172764A1 (en) 2016-11-03

Legal Events

Date Code Title Description
MK1 Application lapsed section 142(2)(a) - no request for examination in relevant period