WO2017176214A1 - System and method for detecting variations in nucleic acid sequence for use in next-generation sequencing - Google Patents

Publication number
WO2017176214A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
seq
chrl
panel
cancer
Prior art date
Application number
PCT/SG2017/050195
Other languages
French (fr)
Inventor
Mei Ling CHONG
Mei Peng
Original Assignee
Angsana Molecular And Diagnostics Laboratory Pte. Ltd.
Priority date
Filing date
Publication date
Application filed by Angsana Molecular And Diagnostics Laboratory Pte. Ltd.
Publication of WO2017176214A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813: Hybridisation assays
    • C12Q1/6816: Hybridisation assays characterised by the detection means
    • C12Q1/6823: Release of bound markers

Definitions

  • the present invention generally relates to next generation nucleic acid sequencing and the use of methods in detecting variations in nucleic acid sequence preferably for use in cancer detection and targeted cancer therapy guidance.
  • Pathological examinations play a very important and active role in the diagnosis of cancer.
  • Traditional histology of cancer tissue samples is still used.
  • the tissue is preserved in formalin fixed paraffin embedded (FFPE) blocks and processed for microscopic examination.
  • Methods such as FISH and IHC are still the gold standards recommended for companion diagnosis for NSCLC targeted therapy. While pathological examination goes a long way towards determining the best treatment plan, biomarker detection is also relevant.
  • CML chronic myeloid leukemia
  • NGS next-generation sequencing
  • an aspect of the invention relates to a method for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample from a subject comprising: (a) extracting nucleic acid from a slice of the sample; (b) contacting the extracted nucleic acid with a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; (c) amplifying the nucleic acid; (d) purifying the amplified nucleic acid; (e) contacting the amplified nucleic acid with reversible dye terminators; (f) detecting a signal from the dye terminators; and (g) converting the signal to a nucleic acid sequence and analysing the same for variations.
  • Another aspect of the invention relates to a system for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample comprising: a chamber for extracting and amplifying nucleic acid; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; a sensor for detecting signals; an analyser for converting the signals to nucleic acid sequence details; and a display.
  • kits for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample in a system as described herein comprising: polymerase chain reaction reagents; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; and fluorescent nucleotides.
  • Another aspect of the invention relates to a panel of primers for use in detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample in a parallel sequencing method comprising primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748.
  • Another aspect of the invention relates to a method for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising: (a) extracting nucleic acid from a slice of a fixed formalin paraffin embedded sample from a subject; (b) contacting the extracted nucleic acid with a panel of primers designed to target regions of nucleic acid sequences comprising oligonucleotides identified by SEQ ID NOS.
  • Another aspect of the invention relates to a system for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising: a chamber for extracting and amplifying nucleic acid; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; a sensor for detecting signals; an analyser for converting the signals to nucleic acid sequence details; and a display.
  • a kit for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising: polymerase chain reaction reagents; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; and fluorescent nucleotides.
  • Figure 1 The workflow of Solid Tumour Panel and the different QC steps from nucleic acid extraction until report generation.
  • Figure 5 Percentage of amplicons having 500x or higher sequencing depth in all the tested samples.
  • FIG. 1 A KRAS substitution mutation (c.35G>A p.G12D) was identified in colorectal cancer.
  • FIG. 2 An EGFR deletion (c.2235_2249del15 p.E746_A750delELREA) was identified in lung cancer.
  • Figure 7. Detected variants in Horizon Diagnostics and clinical samples. Red cells indicate single nucleotide variation, blue cells indicate indel and green cells indicate wildtype.
  • an aspect of the invention relates to a method for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample from a subject comprising: (a) extracting nucleic acid from a slice of the sample; (b) contacting the extracted nucleic acid with a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; (c) amplifying the nucleic acid; (d) purifying the amplified nucleic acid; (e) contacting the amplified nucleic acid with reversible dye terminators; (f) detecting a signal from the dye terminators; and (g) converting the signal to a nucleic acid sequence and analysing the same for variations.
  • the method is an alternative massive parallel sequencing or next generation nucleic acid sequencing method.
  • a variation in nucleic acid refers to any alteration present in the nucleic acid sequences isolated from a subject relative to a reference sequence often found in the majority of subjects, wherein the alteration may be indicative of a disease condition or a response to treatment.
  • the reference sequences are the nucleic acid sequences identified by SEQ ID NOS. 719-748.
  • a variation in nucleic acid may include a single nucleotide polymorphism (SNP) or a single nucleotide variant (SNV) in somatic cells, such as a nonsynonymous SNP or SNV that is either missense or nonsense.
  • a variation in nucleic acid may include structural variation such as deletions, inversions, insertions, duplications or an indel which can refer to either an insertion or a deletion in the nucleic acid sequence.
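The variant classes named here (single nucleotide variants, insertions, deletions) can be told apart from a pair of reference and alternate alleles. The following is a minimal sketch, assuming alleles are written in a left-anchored style in which an insertion or deletion shares its leading base(s) with the reference; the function name and the "complex" catch-all are illustrative assumptions, not part of the described method.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a variant from its reference and alternate alleles.

    Assumes left-anchored alleles (the shorter allele is a prefix of the
    longer one for simple insertions and deletions).
    """
    if not ref or not alt:
        raise ValueError("ref and alt must be non-empty allele strings")
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return "wildtype"
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(alt) > len(ref) and alt.startswith(ref):
        return "insertion"
    if len(ref) > len(alt) and ref.startswith(alt):
        return "deletion"
    # Anything else (multi-base substitution, combined indel) is not resolved here.
    return "complex"


print(classify_variant("G", "A"))    # a single-base substitution
print(classify_variant("AGT", "A"))  # two bases removed after the anchor base
```

Inversions and duplications need positional information beyond a simple allele pair, so this sketch deliberately leaves them in the "complex" bucket.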
  • a variation in nucleic acid may include any one of the variations to the reference nucleic acid sequences identified by SEQ ID NOS. 719-748 listed in tables 2, 3, 4 or 6.
  • the term 'nucleic acid' refers to deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) or complementary deoxyribonucleic acid (cDNA).
  • DNA or RNA is isolated from a fixed formalin paraffin embedded (FFPE) sample from a subject.
  • FFPE fixed formalin paraffin embedded
  • a cDNA is formed based on an RNA isolated from an FFPE sample from a subject.
  • a cDNA is formed using reverse transcription methods known in the art. Generally a nucleic acid containing an adenine base or derivative thereof will hybridise to a nucleic acid containing either a thymine base or derivative thereof or a uridine base or derivative thereof.
  • a nucleic acid containing a cytosine base or derivative thereof will hybridise to a nucleic acid containing a guanine base or derivative thereof.
  • complementary is essentially an anti-sense strand of an mRNA whereby under normal physiological conditions each base will hybridise to the mRNA from which it is formed.
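The base-pairing rules described above (adenine with thymine or uridine, cytosine with its partner base) are what make a cDNA strand the anti-sense copy of the mRNA it is formed from. A minimal sketch of that relationship, with hypothetical function names and no handling of ambiguous or modified bases:

```python
# Pairing tables: A pairs with T (or U in RNA), C pairs with G.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_TO_CDNA = {"A": "T", "U": "A", "C": "G", "G": "C"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def first_strand_cdna(mrna: str) -> str:
    """Return the anti-sense DNA strand (5'->3') that pairs with an mRNA."""
    return "".join(RNA_TO_CDNA[base] for base in reversed(mrna.upper()))


print(reverse_complement("ATGC"))
print(first_strand_cdna("AUGC"))
```

Both helpers raise `KeyError` on any character outside the four expected bases, which is a simple way to surface unexpected input in a sketch like this.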
  • the term 'sample' refers to any fixed formalin paraffin embedded (FFPE) sample from a subject.
  • FFPE sample is thinly sliced with a microtome.
  • the slices are 5 to 20 µm in thickness or preferably about 10 µm in thickness.
  • the sliced sample must be unstained to avoid any of the stains destroying or denaturing the nucleic acid in the sample.
  • the term 'subject' refers to a vertebrate animal such as a mammal from which a biopsy sample is taken, fixed in formalin and embedded in a paraffin block.
  • the subject is a vertebrate animal such as a mammal suspected of having or suffering from a proliferative disease such as cancer.
  • this may include a subject at risk of having a proliferative disease, a subject that has a proliferative disease or a subject that has had a proliferative disease in the past.
  • this may include a subject at risk of having cancer, a subject that has cancer or a subject that has had cancer in the past.
  • the subject comprises a human.
  • nucleic acid from the sample is isolated from a slice of the FFPE block using methods known in the art. There are many processes known for purification of DNA from a sample using a combination of physical and chemical methods.
  • panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748 may include a target region known to have variations that may relate to a disease state or a likelihood of responding to a treatment.
  • the target region may include any one of the variations to the nucleic acid sequences identified by SEQ ID NOS. 719-748 listed in tables 3, 4 or 6.
  • the primers are designed to amplify the target region which may include any one of the variations listed in tables 2, 3, 4 or 6 and the nucleotides either side of the variation by about 5-10 nucleotides, or 7-15 nucleotides, or 10 to 20 nucleotides, or 15 to 30 nucleotides, or more.
  • the target region may include any of the 359 regions discussed herein.
  • at least one target region is amplified with a primer pair for each of the nucleic acid sequences identified by SEQ ID NOS. 719-748.
  • SEQ ID NO. 719 amplifies 2 regions with 2 primer pairs;
  • SEQ ID NO. 720 amplifies 9 regions with 9 primer pairs;
  • SEQ ID NO. 721 amplifies 2 regions with 2 primer pairs;
  • SEQ ID NO. 722 amplifies 2 regions with 2 primer pairs;
  • SEQ ID NO. 723 amplifies 33 regions with 33 primer pairs;
  • SEQ ID NO. 724 amplifies 10 regions with 10 primer pairs;
  • SEQ ID NO. 725 amplifies 3 regions with 3 primer pairs;
  • SEQ ID NO. 726 amplifies 2 regions with 2 primer pairs;
  • SEQ ID NO. 727 amplifies 2 regions with 2 primer pairs;
  • SEQ ID NO. 728 amplifies 4 regions with 4 primer pairs;
  • SEQ ID NO. 729 amplifies 4 regions with 4 primer pairs;
  • SEQ ID NO. 730 amplifies 1 region with 1 primer pair;
  • SEQ ID NO. 731 amplifies 1 region with 1 primer pair;
  • SEQ ID NO. 732 amplifies 2 regions with 2 primer pairs;
  • SEQ ID NO. 733 amplifies 2 regions with 2 primer pairs;
  • SEQ ID NO. 734 amplifies 10 regions with 10 primer pairs;
  • SEQ ID NO. 735 amplifies 3 regions with 3 primer pairs;
  • SEQ ID NO. 736 amplifies 19 regions with 19 primer pairs;
  • SEQ ID NO.
  • SEQ ID NO. 738 amplifies 7 regions with 7 primer pairs
  • SEQ ID NO. 739 amplifies 110 regions with 110 primer pairs
  • SEQ ID NO. 740 amplifies 3 regions with 3 primer pairs
  • SEQ ID NO. 741 amplifies 5 regions with 5 primer pairs
  • SEQ ID NO. 742 amplifies 11 regions with 11 primer pairs
  • SEQ ID NO. 743 amplifies 61 regions with 61 primer pairs
  • SEQ ID NO. 744 amplifies 8 regions with 8 primer pairs
  • SEQ ID NO. 745 amplifies 4 regions with 4 primer pairs
  • SEQ ID NO. 746 amplifies 2 regions with 2 primer pairs
  • SEQ ID NO. 747 amplifies 5 regions with 5 primer pairs
  • SEQ ID NO. 748 amplifies 12 regions with 12 primer pairs.
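As a consistency check, the per-sequence region counts listed above can be tallied and compared with the 359 target regions stated for the panel. The sketch below assumes the counts for SEQ ID NO. 739 and 742 read as 110 and 11, and it omits SEQ ID NO. 737 because its entry is cut off in this text; the dictionary name is hypothetical.

```python
# Region counts per reference sequence, as listed in the text.
# SEQ ID NO. 737 is absent because its entry is truncated in the source.
LISTED_REGIONS = {
    719: 2, 720: 9, 721: 2, 722: 2, 723: 33, 724: 10, 725: 3,
    726: 2, 727: 2, 728: 4, 729: 4, 730: 1, 731: 1, 732: 2,
    733: 2, 734: 10, 735: 3, 736: 19, 738: 7, 739: 110, 740: 3,
    741: 5, 742: 11, 743: 61, 744: 8, 745: 4, 746: 2, 747: 5,
    748: 12,
}
PANEL_TOTAL = 359  # total target regions stated for the panel

listed_total = sum(LISTED_REGIONS.values())
print(f"{len(LISTED_REGIONS)} sequences listed, {listed_total} regions")
print(f"regions not covered by the listed entries: {PANEL_TOTAL - listed_total}")
```

Each region is amplified by one primer pair, so the same tally applies to primer pairs.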
  • the panel of primers designed to target regions of nucleic acid sequences comprise oligonucleotides identified by SEQ ID NOS. 1-718.
  • the nucleic acid may be amplified by any method known in the art provided multiple amplicons may be formed.
  • the nucleic acid is amplified by polymerase chain reaction (PCR) with the primers.
  • purifying the amplified nucleic acid comprises removing some of the primers either side of the amplicons.
  • the primers are removed using FuPa reagent but any other method known in the art may be used for this purpose.
  • purifying the amplified nucleic acid may further comprise adding identifier sequences, sometimes called barcode sequences, to either side of the amplicons where the primer sequences have been removed.
  • purifying the amplified nucleic acid may also or alternatively comprise adding paramagnetic particles to the amplified nucleic acid, removing any nucleic acid not adhering to the paramagnetic particles, and removing the paramagnetic particles.
  • the method further comprises amplifying the purified amplified nucleic acid after removing the paramagnetic particles.
  • the amplicons may be sequenced in a mass parallel method using reversible dye terminators.
  • This sequencing method is based on reversible dye-terminators that enable the identification of single bases as they are introduced into DNA strands.
  • the amplicons aggregate in clusters on a chip; primers are then added together with fluorescently tagged modified nucleotides carrying reversible 3' blockers, which force the primers to add only one nucleotide at a time.
  • a signal from the dye terminators is detected; for example, a camera takes a picture of the chip and a computer determines which base was added from the wavelength of the fluorescent tag and records it for every spot on the chip.
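The detection step can be pictured as choosing, for each spot and each sequencing cycle, the fluorescent channel with the strongest signal and recording the base assigned to that channel. This is a simplified sketch only: the channel names, the channel-to-base assignment and the intensity values are hypothetical, and real base callers also correct for effects such as channel cross-talk and phasing.

```python
from typing import Dict, List

# Hypothetical assignment of four fluorescent channels to bases.
CHANNEL_TO_BASE = {"ch_a": "A", "ch_c": "C", "ch_g": "G", "ch_t": "T"}


def call_base(intensities: Dict[str, float]) -> str:
    """Return the base whose channel has the highest intensity in one cycle."""
    brightest = max(intensities, key=intensities.get)
    return CHANNEL_TO_BASE[brightest]


def call_read(cycles: List[Dict[str, float]]) -> str:
    """Call one base per cycle for a single spot on the chip."""
    return "".join(call_base(cycle) for cycle in cycles)


# Three cycles of made-up intensities for one spot.
spot_cycles = [
    {"ch_a": 0.1, "ch_c": 0.2, "ch_g": 0.9, "ch_t": 0.1},
    {"ch_a": 0.8, "ch_c": 0.1, "ch_g": 0.1, "ch_t": 0.2},
    {"ch_a": 0.1, "ch_c": 0.1, "ch_g": 0.2, "ch_t": 0.7},
]
print(call_read(spot_cycles))
```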
  • the method further comprises quantifying the amplified nucleic acid and adjusting its concentration to at least 1000 parts per million.
  • the method further comprises staining a second slice of the fixed formalin paraffin embedded sample; capturing a micrograph of the stained sample; and displaying the micrograph with sequence variations detected. Displaying the variations in the nucleic acid sequence and the histological details of the FFPE sample will allow a pathologist or other professionals to make more accurate diagnoses and devise treatment regimes.
  • FISH fluorescence in situ hybridization
  • immunohistochemistry (IHC) to detect the amount of target protein in a sample.
  • the latter method is however semi-quantitative. In various embodiments these results may be displayed together with the variations in the nucleic acid sequence.
  • the sample is a solid tumour sample.
  • a solid tumour refers to an abnormal mass of tissue that usually does not contain cysts or liquid areas. Different types of solid tumours are named for the type of cells that form them.
  • the solid tumour is selected from any one of colorectal cancer; lung cancer; gastrointestinal cancer; poorly differentiated malignant neoplasm likely malignant adrenocortical tumour; mucinous adenocarcinoma; or signet ring cell diffuse adenocarcinoma.
  • the method further comprises determining a cancer treatment regime for the subject based on the variations detected.
  • the terms 'cancer treatment regime', 'treatment', 'treat', 'cancer therapy' and synonyms thereof refer to both therapeutic treatment and prophylactic or preventative measures, wherein the object is to cure, prevent or slow down (lessen) a cancer condition.
  • a subject that responds to a treatment refers to measures that cure, prevent or slow down (lessen) a cancer condition in the subject.
  • a sample with a KRAS (p.Ala146Thr) mutation predicts or suggests that the subject is likely to respond to treatment with oxaliplatin and to be resistant or unresponsive to panitumumab, bevacizumab or cetuximab.
  • a sample with a loss of function mutation in PTEN indicates or predicts that the subject is likely to respond to treatment with everolimus.
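The two associations described above (a KRAS Ala146Thr mutation and a PTEN loss-of-function mutation) can be held in a simple lookup keyed by gene and variant. The sketch below encodes only those two examples from the text; the table layout, key spellings and function name are illustrative assumptions, not a description of how the report is actually generated.

```python
from typing import Dict, List, Optional, Tuple

# Only the two associations stated in the text are encoded here.
TREATMENT_NOTES: Dict[Tuple[str, str], Dict[str, List[str]]] = {
    ("KRAS", "p.Ala146Thr"): {
        "likely_responsive": ["oxaliplatin"],
        "likely_resistant": ["panitumumab", "bevacizumab", "cetuximab"],
    },
    ("PTEN", "loss of function"): {
        "likely_responsive": ["everolimus"],
        "likely_resistant": [],
    },
}


def treatment_notes(gene: str, variant: str) -> Optional[Dict[str, List[str]]]:
    """Return the stated associations for a detected variant, or None."""
    return TREATMENT_NOTES.get((gene, variant))


print(treatment_notes("KRAS", "p.Ala146Thr"))
print(treatment_notes("EGFR", "p.Leu858Arg"))  # not in the table, so None
```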
  • Figure 1 is a non-limiting example of the workflow of various embodiments.
  • Another aspect of the invention relates to a system for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample comprising: a chamber for extracting and amplifying nucleic acid; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; a sensor for detecting signals; an analyser for converting the signals to nucleic acid sequence details; and a display.
  • the sensor detects the wavelength of each fluorescent tag. Any method known in the art to detect individual wavelengths of the four different fluorescent tags used would be suitable.
  • the analyser is a computer programmed to make bioinformatics calculations including variant calling for somatic amplicon sequencing.
  • the display may be a screen for displaying a visual report. In various embodiments the display may be a print out of the analysed results. A non-limiting example of a display of various embodiments is depicted in figure 2.
  • the panel of primers comprise oligonucleotides identified by SEQ ID NOS. 1-718.
  • kits for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample in a system as described herein comprising: polymerase chain reaction reagents; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; and fluorescent nucleotides.
  • the kit further comprises paramagnetic particles. In various embodiments the kit further comprises a panel of primers comprising oligonucleotides identified by SEQ ID NOS. 1-718.
  • Another aspect of the invention relates to a panel of primers for use in detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample in a parallel sequencing method comprising primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748.
  • the panel of primers comprise oligonucleotides identified by SEQ ID NOS. 1-718.
  • Another aspect of the invention relates to a method for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising: (a) extracting nucleic acid from a slice of a fixed formalin paraffin embedded sample from a subject; (b) contacting the extracted nucleic acid with a panel of primers designed to target regions of nucleic acid sequences comprising oligonucleotides identified by SEQ ID NOS.
  • the solid tumour is selected from any one of colorectal cancer; lung cancer; gastrointestinal cancer; poorly differentiated malignant neoplasm likely malignant adrenocortical tumour; mucinous adenocarcinoma; or signet ring cell diffuse adenocarcinoma.
  • purifying the nucleic acid comprises adding paramagnetic particles to the amplified nucleic acid and removing any nucleic acid not adhering to the paramagnetic particles and removing the paramagnetic particles.
  • the method further comprises amplifying the purified amplified nucleic acid after removing the paramagnetic particles.
  • the method further comprises quantifying the amplified nucleic acid and adjusting its concentration to at least 1000 parts per million.
  • the panel of primers designed to target regions of nucleic acid sequences comprise oligonucleotides identified by SEQ ID NOS. 1-718.
  • a variation from the reference sequence SEQ ID NO. 737 indicates that a treatment regime with oxaliplatin is preferred.
  • a variation from the reference sequence SEQ ID NO. 746 indicates that a treatment regime with everolimus is preferred.
  • Another aspect of the invention relates to a system for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising: a chamber for extracting and amplifying nucleic acid; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; a sensor for detecting signals; an analyser for converting the signals to nucleic acid sequence details; and a display.
  • the system further comprises paramagnetic particles; and a magnetic plate for placing under the chamber.
  • the panel of primers comprise oligonucleotides identified by SEQ ID NOS. 1-718.
  • kits for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising: polymerase chain reaction reagents; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; and fluorescent nucleotides.
  • the kit further comprises paramagnetic particles.
  • the panel of primers comprise oligonucleotides identified by SEQ ID NOS. 1-718.
  • the method is designed to balance the need for comprehensive coverage of hotspot regions of each gene of interest (359 regions), sequencing depth (average 4000X, minimum 500X for reporting 5% MAF variants) to assure sensitivity (5% MAF), and assay throughput (10 clinical samples) to match the potential clinical volume. A total of 50 clinical samples have been successfully sequenced.
  • the assay validation has also shown that the test can consistently detect a variety of mutations at the 5% detection limit (Figures 3 and 4) in FFPE samples.
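The depth and sensitivity figures quoted above lend themselves to two small calculations: the share of amplicons that reach the 500X reporting minimum, and the number of variant-supporting reads expected at a 5% allele fraction for a given depth. The sketch below takes the 500X and 5% values from the text; the function names and the example depths are hypothetical.

```python
from typing import Sequence

MIN_REPORTING_DEPTH = 500    # from the text: minimum depth for reporting 5% MAF variants
DETECTION_LIMIT_MAF = 0.05   # from the text: 5% detection limit


def fraction_at_or_above(depths: Sequence[int],
                         min_depth: int = MIN_REPORTING_DEPTH) -> float:
    """Fraction of amplicons whose sequencing depth meets the minimum."""
    if not depths:
        raise ValueError("no amplicon depths supplied")
    return sum(d >= min_depth for d in depths) / len(depths)


def expected_variant_reads(depth: int, maf: float = DETECTION_LIMIT_MAF) -> float:
    """Expected number of reads carrying a variant at the given allele fraction."""
    return depth * maf


example_depths = [4200, 3900, 650, 480]  # made-up per-amplicon depths
print(fraction_at_or_above(example_depths))  # share of amplicons at 500X or more
print(expected_variant_reads(500))           # variant reads expected at the minimum depth
print(expected_variant_reads(4000))          # variant reads expected at the average depth
```

At the 500X minimum a 5% variant is expected in roughly 25 reads, which illustrates why a depth floor is tied to the reporting threshold.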
  • panitumumab, bevacizumab and cetuximab based on reports relating to this mutation (Douillard et al. The New England journal of medicine 2013, 369:1023-34).
  • Investigational drugs pimasertib, refametinib, selumetinib and sorafenib, inhibitors of the RAF/MEK/ERK pathway, were also listed in the report as potential choices based on preclinical studies and clinical trials information.
  • 15-123 signet ring cell diffuse adenocarcinoma
  • FFPE formalin-fixed paraffin-embedded
  • HD200 Horizon Diagnostics, UK
  • the extracted DNA was quantified by the Quant-iT™ dsDNA HS Assay (Life Technologies, Darmstadt, Germany) on the Qubit 3.0 fluorometer (Life Technologies).
  • PCR was performed using 5-30 ng of DNA per primer pool, followed by adding 2 µl of FuPa to the reaction. Pool 1 and pool 2 mixtures were combined prior to adding 8 µl of Switch Solution, 0.5 µl of Illumina TruSeq adapters (IDT) and 4 µl of DNA ligase. Generated libraries were purified using Agencourt AMPure XP reagent and were further amplified using KAPA HiFi HotStart ReadyMix (Kapa Biosystems, South Africa). Each individual library was quantified by KAPA Library Quantification Kit (Kapa Biosystems). Generated libraries were normalized to the same concentration before pooling together for sequencing. In total, 10 samples were pooled together with HD200 and NTC. After the pooled library preparation, samples were sequenced using the MiSeq Reagent Kit v2 (2 × 150 bp) according to the manufacturer's instructions.
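Normalizing libraries to the same concentration before pooling is a dilution calculation (C1·V1 = C2·V2). The following sketch shows that arithmetic under stated assumptions: the target concentration, the final volume, the units and the sample names are hypothetical, since the text does not give the values used.

```python
from typing import Dict, Tuple


def dilution_volumes(conc_nm: float, target_nm: float,
                     final_ul: float) -> Tuple[float, float]:
    """Return (library volume, diluent volume) in µl to reach target_nm."""
    if conc_nm <= 0 or target_nm <= 0 or final_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    if conc_nm < target_nm:
        raise ValueError("library is already below the target concentration")
    library_ul = target_nm * final_ul / conc_nm  # from C1 * V1 = C2 * V2
    return library_ul, final_ul - library_ul


# Hypothetical measured library concentrations (nM).
measured: Dict[str, float] = {"sample_01": 8.0, "sample_02": 16.0, "sample_03": 4.0}
TARGET_NM = 4.0   # assumed common target concentration
FINAL_UL = 10.0   # assumed final volume per normalized library

for name, conc in measured.items():
    lib_ul, diluent_ul = dilution_volumes(conc, TARGET_NM, FINAL_UL)
    print(f"{name}: {lib_ul:.2f} µl library + {diluent_ul:.2f} µl diluent")
```

Equal volumes of the normalized libraries can then be combined so that each sample contributes a similar share of reads.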
  • SEQ ID NO. 719 is the reference sequence to Homo sapiens serine/threonine kinase 1 (AKT1) having GenBank accession number NG_012188, version NG_012188.1
  • SEQ ID NO. 720 is the reference sequence to Homo sapiens ALK receptor tyrosine kinase (ALK) having GenBank accession number NG_009445, version NG_009445.1
  • SEQ ID NO. 721 is the reference sequence to Homo sapiens B-Raf proto-oncogene, serine/threonine kinase (BRAF) having GenBank accession number NG_007873, version NG_007873.3
  • SEQ ID NO. 722 is the reference sequence to Homo sapiens catenin beta 1 (CTNNB1) having GenBank accession number NG_013302, version
  • SEQ ID NO. 723 is the reference sequence to Homo sapiens discoidin domain receptor tyrosine kinase 2 (DDR2) having GenBank accession number NG_016290, version NG_016290.1
  • SEQ ID NO. 724 is the reference sequence to Homo sapiens epidermal growth factor receptor (EGFR) having GenBank accession number
  • SEQ ID NO. 725 is the reference sequence to Homo sapiens erb-b2 receptor tyrosine kinase 2 (ERBB2) having GenBank accession number NG_007503, version NG_007503.1
  • SEQ ID NO. 726 is the reference sequence to Homo sapiens estrogen receptor 1 (ESR1) having GenBank accession number NG_008493, version NG_008493.2
  • SEQ ID NO. 727 is the reference sequence to Homo sapiens fibroblast growth factor receptor 1 (FGFR1) having GenBank accession number
  • SEQ ID NO. 728 is the reference sequence to Homo sapiens fibroblast growth factor receptor 2 (FGFR2) having GenBank accession number
  • SEQ ID NO. 729 is the reference sequence to Homo sapiens fibroblast growth factor receptor 3 (FGFR3) having GenBank accession number
  • SEQ ID NO. 730 is the reference sequence to Homo sapiens G protein subunit alpha 11 (GNA11) having GenBank accession number NG_033852, version NG_033852.2
  • SEQ ID NO. 731 is the reference sequence to Homo sapiens G protein subunit alpha q (GNAQ) having GenBank accession number NG_027904, version NG_027904.2
  • SEQ ID NO. 732 is the reference sequence to Homo sapiens GNAS complex locus (GNAS) having GenBank accession number NG_016194, version
  • SEQ ID NO. 733 is the reference sequence to Homo sapiens HRas proto-oncogene, GTPase (HRAS) having GenBank accession number NG_007666, version NG_007666.1
  • SEQ ID NO. 734 is the reference sequence to Homo sapiens KIT proto-oncogene receptor tyrosine kinase (KIT) having GenBank accession number NG_007456, version NG_007456.1
  • SEQ ID NO. 735 is the reference sequence to Homo sapiens KRAS proto-oncogene, GTPase (KRAS) having GenBank accession number NG_007524, version NG_007524.1
  • SEQ ID NO. 736 is the reference sequence to Homo sapiens mitogen-activated protein kinase kinase 1 (MAP2K1) having GenBank accession number NG_008305, version NG_008305.1
  • SEQ ID NO. 737 is the reference sequence to Homo sapiens mitogen-activated protein kinase kinase 2 (MAP2K2) having GenBank accession number NG_007996, version NG_007996.1
  • SEQ ID NO. 738 is the reference sequence to Homo sapiens MET proto-oncogene, receptor tyrosine kinase (MET) having GenBank accession number NG_008996, version NG_008996.1
  • SEQ ID NO. 739 is the reference sequence to Homo sapiens mechanistic target of rapamycin (MTOR) having GenBank accession number
  • SEQ ID NO. 740 is the reference sequence to Homo sapiens NRAS proto-oncogene, GTPase (NRAS) having GenBank accession number NG_007572, version NG_007572.1
  • SEQ ID NO. 741 is the reference sequence to Homo sapiens platelet derived growth factor receptor alpha (PDGFRA) having GenBank accession number NG_009250, version NG_009250.1
  • SEQ ID NO. 742 is the reference sequence to Homo sapiens phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) having GenBank accession number NM_006218, version NM_006218.3
  • SEQ ID NO. 743 is the reference sequence to Homo sapiens patched 1 (PTCH1) having GenBank accession number NG_007664, version
  • SEQ ID NO. 744 is the reference sequence to Homo sapiens phosphatase and tensin homolog (PTEN) having GenBank accession number
  • SEQ ID NO. 745 is the reference sequence to Homo sapiens ret proto-oncogene (RET) having GenBank accession number NG_007489, version
  • SEQ ID NO. 746 is the reference sequence to Homo sapiens ROS proto-oncogene 1, receptor tyrosine kinase (ROS1) having GenBank accession number NG_033929, version NG_033929.1
  • SEQ ID NO. 747 is the reference sequence to Homo sapiens smoothened, frizzled class receptor (SMO) having GenBank accession number
  • SEQ ID NO. 748 is the reference sequence to Homo sapiens tumor protein p53 (TP53) having GenBank accession number NG_017013, version
  • Angsana Solid Tumour Panel includes 15 amplicons (out of a total of 374) that were in the original design but were subsequently disclaimed due to low specificity. During the wet-lab process these 15 amplicons are still amplified, but they are disregarded in the subsequent bioinformatics analysis. These amplicons are listed as the shaded last 15 primer pairs in Table 2.
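Disregarding the 15 disclaimed amplicons during analysis amounts to filtering the 374 designed amplicons against an exclusion list, which leaves the 359 reported regions. A minimal sketch of that filter, assuming hypothetical amplicon identifiers (the actual identifiers are not given in this text):

```python
from typing import Iterable, List


def reportable_amplicons(all_ids: Iterable[str],
                         disclaimed_ids: Iterable[str]) -> List[str]:
    """Keep amplicons that are not on the disclaimed list, preserving order."""
    disclaimed = set(disclaimed_ids)
    return [amp for amp in all_ids if amp not in disclaimed]


# Hypothetical identifiers: 374 designed amplicons, the last 15 disclaimed.
designed = [f"AMP{i:03d}" for i in range(1, 375)]
disclaimed = designed[-15:]

kept = reportable_amplicons(designed, disclaimed)
print(len(designed), len(disclaimed), len(kept))
```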
  • chr1 11194502 11194591 TGCATGGTCTGGACAAAATGCT GGTTGACTCATGCCTATCAGATGAA MTOR
  • chr1 11199386 11199519 CGGGCACTCTTCCACATGTTT CTGGGCTGTCCAGTTCTGTC MTOR
  • chr1 11205007 11205110 C GCTCAGTCCCAAGGGTTGTTTC MTOR
  • chr1 11206727 11206854 AGAGATCTGGGTGCATGTAGGT TGATGCTTTCTGTCTATGTGTGTGT MTOR
  • chr1 11210127 11210249 AGCACCTTACTCTTCTGATGCG CAGCAGTGCTGTGAAAAGTGG MTOR
  • chr1 11210239 11210313 CATCTTGGCTTGGGTCTCATCA GGAAGATTGGTAGTTTAAGGAGATTTGGAT MTOR
  • chr1 11217199 11217316 AGCCACACATGCCATCATTCTA GCTACCTGGTATGAGAAACTGCA MTOR
  • chr1 11217300 11217375 TCCATTTTCTTGTCATAGGCCACAA TGATCCCTTCTTCTGTGTACCTCA MTOR
  • chr1 11227468 11227597 CTCCCTAGCCAAGTCTCTACCT CCTCAGACCTGTTGAGTGTAACT MTOR
  • chr1
  • chr1 11272337 11272461 A TGACTATGCCTCCCGGATCATT MTOR
  • chr1 11272455 11272546 CTCTGGTCCAGTGTTCGAACAA GGAGATTGTATGTTAGCAGTTACTGTATGT MTOR
  • chr1 11272789 11272914 GAAAGATGGCCTGGGAACTTAAGA CAACCTGGATGACTACCTGCAT MTOR
  • chr1 11272904 11272976 GGGCATCAAACAACTTAACAATAGGA I GAC I A I I CA I CACCA I GC I 1 1 I GC I MTOR
  • chr1 11273446 11273559 AGAAAATCTCTCTGGAGGATGACGTA ATCATTCTTCTCATTGAGCAAATTGTGG MTOR
  • chr1 11273549 11273628 GGGCAGGTAGAGCTTAAATTCACC I CAGC I CC I C I GAC I 1 1 I C I C I 1 MTOR
  • chr1 11276181 11276308 TCCTTGACCCA
  • chr1 162743184 162743300 CC GCACTGACATCTAGGGCAAAATCTTT DDR2
  • chr1 162745431 162745551 GGCAACTTCTGAGTTTATCTATGTCTGT CTGATTGAGATCTCCATTCTCCATGT DDR2
  • chr1 162745541 162745663 TGTATCACTGATGACCCTCTCTGTAT CCAGGTTACTCTCATGACCACA DDR2
  • chr1 162745932 162746052 C CCTGCTCATTCCAAAGTCAGCTAT DDR2
  • chr1 162746042 162746166 ACGAAACTGTTTAGTGGGTAAGAACT AGACAGGGCTTTAAAATGCTGAGA DDR2
  • chr1 162748329 162748449 GAAATTTAACAGGGTGTTGTTGTGCA CTGTTCATCTGACAGCTGGGAATAG DDR2
  • chr1 162748439 162748533 TTGTGGGAGACTTTCACCTTTTGT ACAGGTCCACATCCATTCATCC DDR2
  • chr1 162749869 162749988 CCCGTCTTTGTAATATTCTCTCTCTCTCT GCAGAAGGTGGAI M CI 1 GGAATGA DDR2
  • chr1 162749978 162750075 AGCTGCTGGAGAAGAGATACGA GTGAGTGGTAGGTCTTGTAGGGA DDR2
  • chr10 43609076 43609165 GGCTATGGC
  • chr12 25378561 25378663 TT AGTTAAGGACTCTGAAGATGTACCTATGG KRAS
  • chr12 25380260 25380337 TCCTCATGTACTGGTCCCTCAT GTTTCTCCCTTCTCAGGATTCCTAC KRAS
  • chr12 25398189 25398310 TGGTCCTGCACCAGTAATATGC ATTATAAGGCCTGCTGAAAATGACTGA KRAS
  • chr14 105241352 105241484 CCACCTCGTCCTGTAAAGCAG GCACTTTCGGCAAGGTGATC AKT1
  • chr14 105246444 105246557 GCGCCACAGAGAAGTTGTT TCTCACCACCCGCACGT AKT1
  • chr15 66735605 66735722 GAACATTGTCACTAACTGGTCTGGTA CCGACTCATTAACTTATCAATGAGGAACTT MAP2K1
  • chr15 66736897 66737023 AAAGGAGGAAGGCAAATTTGTGATG CCTCTGTGCATGATCTTGTGCTT MAP2K1
  • chr15 66737013 66737080 TCCTCTAGGTAATAAAAGGCCTGACAT GGAAGCAACAGCCTTTGGATTATATCTAAA MAP2K1
  • chr15 66774071 66774199 TACCTGTGTCAGTTCCCTCCTT TCATACCGACATGTAGGACCTTGT MAP2K1
  • chr15 66781535 66781633 CAAGGAGCCAGGCA 1 1 1 1 1 L 1 1 AT CAAATCCAGAGTATACGCTTCCAGA MAP2K1
  • chr19 4094446 4094533 CAGAACCCGCTGGCATCA GACTGCCCTGTCTTGTCTCC MAP2K2
  • chr19 4095347 4095447 CGGAAGGAGTGGCACATCTG ACCCTCTGTTCTCCTCCACAG MAP2K2
  • chr19 4095411 4095521 ACCATTTATTGACAAACTCCTGGAAGT GTGACCGGACAGGACAGTGA MAP2K2
  • chr19 4099182 4099279 GCGTCCAGACCGGAAGTT AAGAGCTGGAGGCCATCTTTG MAP2K2
  • chr19 4101238 4101346 GGCCTTACCTCGGTGCATG CTCTGACTGCTCAGCTCTGAC MAP2K2
  • chr19 4110387 4110512 CATCGACTGCCTTGAGAAGGAT GGGAGATCAGCATTTGCATGGAA MAP2K2
  • chr19 4110512 4110639 CGGACGCACTCACCATGT TTGCAGCTGATCCACCTTGA MAP2K2
  • chr19 4117321 4117442 GGACAGAGCCTGGAGCTAATC GTCACCAAAGTCCAGCACAGAC MAP2K2
  • chr19 4117552 4117651 CCTTGGCTTTCTGGGTGAGAAA ATGGAGTCTCCCTAGGTAGCTAAC MAP2K2
  • chr2 29432565 29432669 AGTTGACAGGGTACCAGGAGAT CCAAGATTGGAGACTTCGGGAT ALK
  • chr2 29432659 29432776 AGGCAGTCTTTACTCACCTGTAGA CGTTGTACACTCATCTTCCTAGGGAT ALK
  • chr2 29436753 29436870 GCTCTGGAGGGAGACCTAGTAT CCTGTGGCTGTCAGTATTTGGA ALK
  • chr2 29436861 29436982 CTTTGACTCACCGGTGGATGAA GATTTCCCTCCTCTCACTGACAAG ALK
  • chr2 29443489 29443620 GGTTCCATCGAGGAACTTGCT GTTCATCCTGCTGGAGCTCAT ALK
  • chr2 29443608 29443710 TCTCTCGGAGGAAGGACTTGAG CTCAGTTAATTTTGG
  • chr3 41266075 41266201 ACAGAAAAGCGGCTGTTAGTCA TGAAGGACTGAGAAAATCCCTGTTC 1
  • chr3 178916742 178916857
  • PIK3CA chr3 178916808 178916905
  • chr3 178921451 178921570 ATGCCATCTTATTCCAGACGCAT TAAGCATCAGCATTTGACTTTACCTTATCA PIK3CA chr3 178927401 178927517 TTTACATAGGTGGAATGAATGGCTGAA AGCGG 1 A 1 AA 1 AGGAG 1 1 1 1 AAAGG 1 AA PIK3CA
  • chr3 178947769 178947887 a TACTTGTCCATCGTCTTTCACCATG PIK3CA chr3 178951974 178952095 TCAATGATGCTTGGCTCTGGAA GAAGA 1 CCAA 1 CCA 1 1 1 1 I G I I G I CCA PIK3CA
  • chr3 178952089 178952205 CA GGTCTTTGCCTGCTGAGAGTTA PIK3CA chr4 1803554 1803640 CCCTGAGCGTCATCTGC TCACTGTACACCTTGCAGTGG FGFR3 chr4 1806069 1806205 TTTGCAGCCGAGGAGGAGC AGATCTTGTGCACGGTGGG FGFR3 chr4 1807831 1807932 CTGGTGACCGAGGACAACGT GGCGTCCTACTGGCATGA FGFR3 chr4 1808937 1809063 GGGACGACTCCGTGTTTG CTCCATCTGCACTGAGTCTCATG FGFR3
  • chr4 55144502 55144621 GGL 1 1 1 1 1 CTGTTCTTCATTTTCATACCC GATATCCAGCTCTTTCTTTGGCTTCT A
  • chr4 55151978 55152104 G CTATTC AG CTAC AG ATGG CTTG A TGCCTTTCGACACATAGTTCGAAT A
  • chr4 55592134 55592251 AA CATGACTGATATGGTAGACAGAGCCTA KIT
  • chr4 55593388 55593508 CCACATTTCTCTTCCATTGTAGAGCA
  • chr4 55593527 55593645 TTTGTTCTCTCTCCAGAGTGCTCTA GATCATAAGGAAGTTGTGTTGGGTCTA KIT
  • chr4 55593630 55593750 GGAAGGTTGTTGAGGAGATAAATGGA TGGAGTTCCTTAAAGTCACTGTTATGTG KIT
  • chr4 55595458 55595572 TCT CATGATCTTCCTGCTTTGAACAAATAAATG KIT chr4 55597439 55597559 GGAGGTAGAGCATGACCCATGA CCTATTCTCACAGATCTCCTTTTGTCG KIT chr4 55599266 55599387 CAGAGACTTGGCAGCCAGAAAT ATCGAAAGTTGAAACTAAAAATCaTTGCA KIT chr4 55602678 55602798 TTCTATTACAGGCTCGACTACCTGT CAAGAAGATGCTCTGAGTCTAATGAAGTT KIT chr6 117630010 117630122 GTCTCCCTCCTGTTTGCACATA ACTACTGTAAACCTGGTGTTTGTAATAAGT ROS1
  • chr6 117638291 117638405 GG CAAATTTAATCATCCCAACATTCTGAAGCA ROS1 chr6 152419790 152419918 GTGTCTTTGGAGTTCCTCTTCCTT CTCCAGCAGCAGGTCATAGAG ESR1 chr6 152419908 152420028 ATCTGTACAGCATGAAGTGCAAGA TGCAAGGAATGCGATGAAGTAGAG ESR1 chr7 55211031 55211131 GGTGGCTGGTTATGTCCTCATT CTTCAGTCCGGTTTTATTTGCATCA EGFR chr7 55221796 55221914 CCACGTACCAGATGGATGTGAA GACTCTCCAAGATGGGATACTCCA EGFR chr7 55233005 55233131 ACTGTATCCAGTGTGCCCACTA CTGTTCTCCTTCACTTTCCACTCA EGFR chr7 55241601 55241731 GGTGACCCTTGTCTCTGTGTTC TGTGCCAGGGACC
  • chr7 116411830 116411952 G GGGCACTTACAAGCCTATCCAA MET chr7 116411950 116412069 CGATGCAAGAGTACACACTCCT ACAACCCACTGAGGTATATGTATAGGTATT MET chr7 116417439 116417558 AG 1 G 1 AACCAAG 1 I I M U M I GC AAGCTATTTATTAGGTTGCAAACCACAA MET chr7 116423367 116423490 TGTCCTTTCTGTAGGCTGGATGA C I GAC I I GG I GG I AAAC I 1 1 I GAG I 1 I G MET chr7 128845060 128845188 GCCAGAATGAGGTGCAGAACA CGATGTAGCTGTGCATGTCCT SMO chr7 128846007 128846135 GCACAGCTCCAATGAGACTCTG GCAGGTGGAAGTAGGAGGTCTT SMO chr7 128846285 128846415
  • chr9 98218550 98218660 C CTGATGACGGTCGAGCTGTT PTCH1 chr9 98218647 98218772 GCACTGAGCTTGATTCCGATGA GAT GAACCGAGGACACCTTAG PTCH1 chr9 98220270 98220396 GCTATGCTGAAAGGAATTTGACTTCC CCTCTTCTGGGAGCAGTACATC PTCH1 chr9 98220376 98220496 CAACACCACGCTGATGAACAG GGACACCTCAGACTTTGTGGA PTCH1 chr9 98220486 98220609 TTGCTGCAGATGGTCCTTACTT GAAACTGTGATGCTCTTCTACCCT PTCH1 chr9 98221822 98221902 GGATGAAGGCTGTTGCTGAGTT GTCCACGACAAAGCCGACTACA PTCH1 chr9 98221895 98222021 CG CTACTTACTTCTC AG CCTTGT GGATGC
  • chr9 98229603 98229711 TCT CTGACCTTGTGCCTCTTCTGTT PTCH1 chr9 98229707 98229777 CCCAGAAAAAGGAAGATCACCACTA GTGGTGGTGAAAACAAGGTATTAACTAGA PTCH1
  • chr9 98230980 98231070 A GGACACTCTCATL 1111 GCTGAGA PTCH1 chr9 98231060 98231187 GG 1111 111 CAAGAGGAAAGG GTGACACAGGACACCCTCAGCT PTCH1 chr9 98231164 98231249 GAGAGCAGGTCCCTTGTGG ACGCACGTGTACTACACCAC PTCH1 chr9 98231231 98231363 GTGACGGGCTGCACAGAGAT ACACGACAATACCCGCTACAG PTCH1 chr9 98231337 98231446 AATCTGCGTTTCATGGGCAAAG TCAGCATAACATTGCAACATGTTTCC PTCH1 chr9 98232021 98232118 CCCGTTACCCACATTCCTTTATAAGT TATATCGACGCGAGGACAGGAGACT PTCH1
  • chr9 98232108 98232226 GC GCATGTTGGTGACCTCTGAAI 1111 PTCH1 chr9 98238353 98238463 GCAGAGCGGGAATTGGGATT ATGTCCAGTG C AG CTCTC AG PTCH1 chr9 98239036 98239147 CCAAAG 1 1 1 ICI 11 IGI 111 IGC GICIAAIGCCACCAICCICIGI 111 PTCH1 chr9 98239721 98239845 ATCCTAGTGGAAAAGGCTGCAA TGTGCTCATTGATCGGAATTTCCTTTA PTCH1
  • chr9 98240292 98240413 AACA TCAAAAGGTGCTTTCCTTCACCA PTCH1
  • chr9 98240402 98240479 CAGAGAAGGATTTCAGGATGTCGT CATTTGGGCATTTCGCATTCTGT PTCH1
  • chr9 98241249 98241365 AG CAAGCAAATGTACGAGCACTTCAAG PTCH1 chr9 98241355 98241443 CGTTCCAGTTGATGTGTGAGACA GAATACTGATGATGTGCCTTCCCTT PTCH1 chr9 98242207 98242332 AGGGAAGTGGCTTTTGAGGAAAG CCTTGTTTTGAATGGTGGATGTCATG PTCH1 chr9 98242322 98242401 CCTCCTGCCAGTGCATATACTT CATGTGACCTGCCTACTAATTCCC PTCH1 chr9 98242647 98242766 GTGTTTTGCTCTCCACCCTTCT CAGCTGGGAGGAAATGCTGAATA PTCH1 chr9 98242758 98242875 CGGTCCATGTAACCATGACCAA TTTTCATGGTCTCGTCTCCTAATTTCTTT PTCH1 chr9 98244209 98244335
  • Table 3. A list of known mutations and the mutations detected using the system.
  • HD266 was identified as wild-type for BRAF, EGFR, KIT and PDGFRA, whereas HD141 was identified as wild-type for EGFR, KIT, KRAS and PDGFRA.
  • Clinical samples and proficiency test samples with wild-type KRAS exon 2 were used to assess the specificity of this assay (Table 5). In conclusion, 100% specificity (33/33) was achieved, as shown in the table below.
  • HD705, HD273, HD308, HD179, HD127, HD301, HD300, HD850 and HD200, which harboured known positive mutations with mutation allele frequencies (MAF) ranging from 1-10%, were used to assess the lowest limit of detection of this assay.
  • HD705, HD301, HD300 and HD200 were diluted to 5 ng, 15 ng and 30 ng for library generation in order to assess the lowest DNA input for detection of mutations with allele frequency ⁇ 5%.
  • Three independent experiments were performed with 5 ng of DNA input.
  • HD705, HD301, HD300 and HD200 were used to assess the lowest limit of DNA input.
  • Three different DNA amounts (5 ng, 15 ng and 30 ng) were used to generate NGS libraries.
  • Library generation and sequencing were successful for each DNA input amount.
  • The positive mutations in these samples were identified consistently in three independent experiments (Figure 4). Repeated sequencing using 5 ng DNA input demonstrated reliable sensitivity at 5% variant frequency with high intra-run and inter-run reproducibility. This suggests that successful sequencing can be obtained from at least 5 ng of DNA input.
  • Sequencing products were further purified using BigDye XTerminator (Life Technologies) and analysed on the 3500 DNA analyser using the Sequencing Analysis 3.0 software (Applied Biosystems, Carlsbad, CA).
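The specificity and reproducibility figures quoted in the validation notes above can be reproduced with simple arithmetic. The following is a minimal sketch, not the applicant's analysis pipeline: the function names are illustrative, the 33/33 count is taken from the text, and the replicate variant frequencies are hypothetical placeholder values chosen only to show how a mean ± SE of the kind plotted in Figure 4 might be computed.

```python
from math import sqrt
from statistics import mean, stdev


def specificity(true_negatives: int, false_positives: int) -> float:
    """Percentage of wild-type samples that were correctly called wild-type."""
    return 100.0 * true_negatives / (true_negatives + false_positives)


def mean_and_se(values):
    """Mean and standard error (sample SD / sqrt(n)) of replicate measurements."""
    return mean(values), stdev(values) / sqrt(len(values))


# 33 of 33 wild-type KRAS exon 2 samples were called wild-type (count from the text).
kras_specificity = specificity(true_negatives=33, false_positives=0)

# HYPOTHETICAL observed variant frequencies (%) for one sample across three runs.
replicate_vaf = [4.8, 5.1, 5.4]
vaf_mean, vaf_se = mean_and_se(replicate_vaf)

print(f"specificity = {kras_specificity:.0f}%")
print(f"variant frequency = {vaf_mean:.2f} ± {vaf_se:.2f}%")
```

A small standard error relative to the mean across runs is what the text describes as high intra-run and inter-run reproducibility.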

Abstract

A system and method for detecting variations in nucleic acid sequence isolated from a fixed formalin paraffin embedded sample from a subject for use in next-generation sequencing. The system and method, including primer panels, are suitable for use in cancer detection and targeted cancer therapy guidance. Preferably, the primer panels are designed to target regions of nucleic acid sequences of AKT1, ALK, BRAF, CTNNB1, DDR2, EGFR, ERBB2, ESR1, FGFR1, FGFR2, FGFR3, GNA11, GNAQ, GNAS, HRAS, KIT, KRAS, MAP2K1, MAP2K2, MET, MTOR, NRAS, PDGFRA, PIK3CA, PTCH1, PTEN, RET, ROS1, SMO, TP53.

Description

System and Method for detecting variations in nucleic acid sequence for use in next-generation sequencing
[001]. FIELD OF INVENTION
[002]. The present invention generally relates to next generation nucleic acid sequencing and the use of methods in detecting variations in nucleic acid sequence preferably for use in cancer detection and targeted cancer therapy guidance.
[003]. BACKGROUND
[004]. The following discussion of the background to the invention is intended to facilitate an understanding of the present invention. However, it should be appreciated that the discussion is not an acknowledgment or admission that any of the material referred to was published, known or part of the common general knowledge in any jurisdiction as at the priority date of the application.
[005]. Scientists were first able to sequence DNA with the advent of the Sanger chain termination method (Sanger et al. PNAS 1977, 74:5463-5467). This was enhanced by capillary electrophoresis (CE) based sequencing machines. These technologies are referred to as first generation nucleic acid sequencing. Next generation nucleic acid sequencing refers to short read, massively parallel sequencing methods. Next generation sequencing represents a powerful tool for detecting genetic variation associated with human disease. Personalized medicine and targeted therapy promise to revolutionise cancer treatment.
[006]. Traditionally, oncologists have used single gene testing to interrogate a few well-studied mutations when identifying targeted therapies. First generation sequencing such as the Sanger sequencing method was the most widely used approach to identify single nucleotide variations and indels in cancer specimens (Muzzey et al. Current genetic medicine reports 2015, 3:158-65). However, it remains labour-intensive, time-consuming and expensive when done on a large scale.
[007]. The landscape of cancer treatment has been transformed in the last decade by genetics and personalized medicine (Schilsky Nature reviews Clinical oncology 2014, 11:432-8). In the past, the same set of cancer therapies was applied to each patient indiscriminately; drug response was measured over time and the current treatment was changed if it was deemed ineffective (Simon et al. Nature reviews Drug discovery 2013, 12:358-69). This trial and error approach often results in loss of valuable time and resources for patients, in addition to adverse reactions to certain treatments (Frampton et al. Nature biotechnology 2013, 31:1023-31).
[008]. Pathological examinations play a very important and active role in the diagnosis of cancer. Traditional histology of cancer tissue samples is still used. The tissue is preserved in formalin fixed paraffin embedded (FFPE) blocks and processed for microscopic examination. Methods such as FISH and IHC are still the gold standards recommended for companion diagnosis for NSCLC targeted therapy. While pathological examination goes a long way towards determining the best treatment plan, biomarker detection is also relevant.
[009]. The limitations of extracting DNA from an FFPE biopsy sample are small sample size and low tumour content (< 40%), owing to the difficulty of reaching less necrotic regions during the biopsy procedure. In such low tumour content cases, oncologists can choose a real-time PCR based assay with higher sensitivity instead of conventional Sanger sequencing. However, real-time PCR needs 20-50 ng of DNA input for each amplicon and it can only detect a known mutation in each reaction. Another major challenge for any PCR based test on FFPE treated tumour tissue is poor DNA quality.
[0010]. At present, cancer treatment plans are increasingly dependent on the understanding of biomarkers. Over the past few years, strong associations were observed between the presence of kinase mutations and response to molecular targeted drugs. For instance, chronic myeloid leukemia (CML) patients with Bcr-Abl aberration show a high response rate to the Bcr-Abl kinase inhibitor imatinib (Gadzicki et al. Cancer genetics and cytogenetics 2005, 159:164-7). Other well-known examples include associations between EGFR mutations and response to EGFR inhibitors (gefitinib or erlotinib) in non-small cell lung cancer (NSCLC) (Paez et al. Science 2004, 304:1497-500) and between HER2 amplification and response to HER2 inhibitors (trastuzumab or lapatinib) in breast cancer (Smith et al. Lancet 2007, 369:29-36). These findings strongly support the notion that predictive biomarkers are useful for the selection of an effective targeted therapy for a given patient.
[0011]. Because cancer drug response is a complex biological system involving multiple biomarkers in multiple genes, a more comprehensive and more sensitive interrogation of biomarkers is needed for characterizing system functions and predicting treatment responses and outcomes (Zhang et al. Molecular aspects of medicine 2015). The demand for faster, more accurate and more cost-effective genomic information has therefore led to the development of next-generation sequencing (NGS) technologies.
[0012]. In contrast to Sanger sequencing, where only a single DNA fragment is sequenced at once, NGS methods are able to interrogate millions to even billions of individual DNA fragments in parallel (Muzzey et al. 2015). Several targeted NGS platforms exist, including cancer panels from Illumina and Ion Torrent. The emergence of bench top NGS instruments such as the Ion Torrent PGM and MiSeq has offered a powerful mutation detection method for clinical genetic testing (Bodi et al. JBT 2013, 24:73-86). The Ion Torrent PGM has reported high accuracy in calling single nucleotide variations, requires only 10 ng of DNA input compared with up to 1 μg of DNA required by other enrichment protocols, and targets relatively short gene segments for amplification (Goswami et al. Methods in molecular biology 2016, 1392:143-51). However, owing to the nature of its sequencing chemistry, it tends to generate many false positive calls when detecting insertions and deletions (indels), which may hinder its utility for clinical genetic testing (Quail et al. BMC genomics 2012, 13:341). It has been documented that indel errors occurring in homopolymer DNA regions significantly affect the specificity of indel detection on the PGM. The competing MiSeq platform from Illumina is capable of producing DNA sequences with high accuracy but is far more demanding in the quantity of starting DNA material required.
[0013]. There is a need for alternative methods and systems to ameliorate at least one of the problems mentioned above.
[0014]. SUMMARY
[0015]. It is an object of the present invention to provide a system and method for detecting mutations in fixed formalin paraffin embedded samples that may be used in non-limiting applications such as diagnosis of cancer and prognosis of effective treatment.
[0016]. Accordingly, an aspect of the invention relates to a method for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample from a subject comprising: (a) extracting nucleic acid from a slice of the sample; (b) contacting the extracted nucleic acid with a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; (c) amplifying the nucleic acid; (d) purifying the amplified nucleic acid; (e) contacting the amplified nucleic acid with reversible dye terminators; (f) detecting a signal from the dye terminators; and (g) converting the signal to a nucleic acid sequence and analysing the same for variations.
[0017]. Another aspect of the invention relates to a system for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample comprising: a chamber for extracting and amplifying nucleic acid; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; a sensor for detecting signals; an analyser for converting the signals to nucleic acid sequence details; and a display.
[0018]. Another aspect of the invention relates to a kit for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample in a system as described herein, comprising: polymerase chain reaction reagents; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; and fluorescent nucleotides.
[0019]. Another aspect of the invention relates to a panel of primers for use in detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample in a parallel sequencing method comprising primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748.
[0020]. Another aspect of the invention relates to a method for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising: (a) extracting nucleic acid from a slice of a fixed formalin paraffin embedded sample from a subject; (b) contacting the extracted nucleic acid with a panel of primers designed to target regions of nucleic acid sequences comprising oligonucleotides identified by SEQ ID NOS. 719-748; (c) amplifying the nucleic acid; (d) purifying the amplified nucleic acid; (e) contacting the amplified nucleic acid with reversible dye terminators; (f) detecting a signal from the dye terminators; and (g) converting the signal to a nucleic acid sequence and analysing the same for variations, wherein detecting variations from the reference sequence in the nucleic acid indicates the subject has a solid tumour cancer and indicates a cancer treatment regime for the subject.
[0021]. Another aspect of the invention relates to a system for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising: a chamber for extracting and amplifying nucleic acid; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; a sensor for detecting signals; an analyser for converting the signals to nucleic acid sequence details; and a display.
[0022]. Another aspect of the invention relates to a kit for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising: polymerase chain reaction reagents; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; and fluorescent nucleotides.
[0023]. Other aspects of the invention will become apparent to those of ordinary skill in the art upon review of the following description of specific embodiments of the invention in conjunction with the accompanying figures.
[0024]. BRIEF DESCRIPTION OF DRAWINGS
[0025]. The present invention will now be described, by way of illustrative example only, with reference to the accompanying drawings, of which:
[0026]. Figure 1. The workflow of Solid Tumour Panel and the different QC steps from nucleic acid extraction until report generation.
[0027]. Figure 2. An example of a report.
[0028]. Figure 3. Correlation of expected MAF and observed MAF in Horizon Diagnostics samples.
[0029]. Figure 4. Mean ± SE variant frequencies obtained from four Horizon Diagnostics samples based on two intra-run and three independent inter-run experiments.
[0030]. Figure 5. Percentage of amplicons having 500x or higher sequencing depth in all the tested samples.
[0031]. Figure 6. Examples of mutations detected using Sanger Sequencing. (A) A KRAS substitution mutation (c.35G>A p.G12D) was identified in colorectal cancer. (B) An EGFR deletion (c.2235_2249del15 p.E746_A750delELREA) was identified in lung cancer.
[0032]. Figure 7. Detected variants in Horizon Diagnostics and clinical samples. Red cells indicate single nucleotide variation, blue cells indicate indel and green cells indicate wildtype.
[0033]. The accompanying drawings are not to be understood as superseding the generality of the preceding description of the invention.
DETAILED DESCRIPTION
[0035]. In this study, a novel next generation sequencing method was demonstrated that exploits the advantages of high accuracy in calling single nucleotide variations, a requirement of only 10 ng of DNA input compared with up to 1 μg of DNA required by other enrichment protocols, targeting of relatively short gene segments for amplification and/or production of DNA sequences with high accuracy. The assay is compatible with 5 ng of FFPE isolated DNA and the libraries were sequenced using the Illumina platform to achieve good quality sequencing results.
[0036]. Accordingly, an aspect of the invention relates to a method for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample from a subject comprising: (a) extracting nucleic acid from a slice of the sample; (b) contacting the extracted nucleic acid with a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; (c) amplifying the nucleic acid; (d) purifying the amplified nucleic acid; (e) contacting the amplified nucleic acid with reversible dye terminators; (f) detecting a signal from the dye terminators; and (g) converting the signal to a nucleic acid sequence and analysing the same for variations.
[0037]. The method is an alternative massive parallel sequencing or next generation nucleic acid sequencing method.
[0038]. As used herein the term 'variations in nucleic acid' refers to any alteration in the nucleic acid sequences isolated from a subject relative to a reference sequence found in the majority of subjects, wherein the alteration may be indicative of a disease condition or a response to treatment. In various embodiments the reference sequences are the nucleic acid sequences identified by SEQ ID NOS. 719-748. In various embodiments a variation in nucleic acid may include a single nucleotide polymorphism (SNP) or a single nucleotide variant (SNV) in somatic cells, such as a nonsynonymous SNP or SNV that is either missense or nonsense. In various embodiments a variation in nucleic acid may include structural variation such as deletions, inversions, insertions, duplications or an indel, which can refer to either an insertion or a deletion in the nucleic acid sequence. In various embodiments a variation in nucleic acid may include any one of the variations to the reference nucleic acid sequences identified by SEQ ID NOS. 719-748 listed in tables 2, 3, 4 or 6.
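By way of illustration only, the distinction drawn above between single nucleotide variants and indels can be expressed as a comparison of a reference allele with an observed allele. The sketch below assumes alleles are written with a shared leading anchor base for insertions and deletions (a common variant-file convention, not something stated in this specification); the function name and category labels are hypothetical.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a variation by comparing reference and observed alleles.

    Assumes insertions and deletions are written with a shared leading
    anchor base, e.g. ref="A", alt="ACGT" for an insertion of CGT.
    """
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return "no variation"
    if len(ref) == 1 and len(alt) == 1:
        return "single nucleotide variant"
    if len(ref) == len(alt):
        return "multi-nucleotide substitution"
    if alt.startswith(ref):
        return "insertion"
    if ref.startswith(alt):
        return "deletion"
    return "complex indel"


# The KRAS c.35G>A substitution mentioned for Figure 6 is a G-to-A change.
print(classify_variant("G", "A"))
# Hypothetical alleles showing a three-base deletion after an anchor base.
print(classify_variant("ACGT", "A"))
```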
[0039]. As used herein the term 'nucleic acid' refers to deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) or complementary deoxyribonucleic acid (cDNA). In various embodiments the DNA or RNA is isolated from a fixed formalin paraffin embedded (FFPE) sample from a subject. In various other embodiments a cDNA is formed based on an RNA isolated from an FFPE sample from a subject. In various other embodiments a cDNA is formed using reverse transcription methods known in the art. Generally a nucleic acid containing an adenine base or derivative thereof will hybridise to a nucleic acid containing either a thymine base or derivative thereof or a uracil base or derivative thereof. Similarly, a nucleic acid containing a cytosine base or derivative thereof will hybridise to a nucleic acid containing a guanine base or derivative thereof. As such, complementary DNA is essentially an anti-sense strand of an mRNA, whereby under normal physiological conditions each base will hybridise to the mRNA from which it is formed.
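The base-pairing rules described above (adenine with thymine or uracil, cytosine with guanine) can be sketched in a few lines. This is a minimal illustration with hypothetical function names; it models only the standard Watson-Crick pairs and ignores base derivatives.

```python
# Watson-Crick pairing for DNA, in upper and lower case.
DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the strand that hybridises to `seq`, written 5'->3'."""
    return seq.translate(DNA_COMPLEMENT)[::-1]


def cdna_from_mrna(mrna: str) -> str:
    """First-strand cDNA (5'->3') that is anti-sense to the given mRNA."""
    return reverse_complement(mrna.upper().replace("U", "T"))


print(reverse_complement("ATGC"))
print(cdna_from_mrna("AUGGCC"))
```

Applying `reverse_complement` twice returns the original sequence, which is a quick sanity check on the pairing table.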
[0040]. As used herein the term 'sample' refers to any fixed formalin paraffin embedded (FFPE) sample from a subject. In various embodiments the FFPE sample is thinly sliced with a microtome. In various embodiments the slices are 5 to 20 μm in thickness or preferably about 10 μm in thickness. In various embodiments the sliced sample must be unstained to avoid any of the stains destroying or denaturing the nucleic acid in the sample.
[0041]. As used herein the term 'subject' refers to a vertebrate animal such as a mammal from which a biopsy sample is taken, fixed in formalin and embedded in a paraffin block. In various embodiments the subject is a vertebrate animal such as a mammal suspected of having or suffering from a proliferative disease such as cancer. In various embodiments this may include a subject at risk of having a proliferative disease, a subject that has a proliferative disease or a subject that has had a proliferative disease in the past. In various embodiments this may include a subject at risk of having cancer, a subject that has cancer or a subject that has had cancer in the past. In various embodiments the subject comprises a human.
[0042]. In various embodiments nucleic acid from the sample is isolated from a slice of the FFPE block using methods known in the art. There are many processes known for purification of DNA from a sample using a combination of physical and chemical methods.
[0043]. In various embodiments the panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748 may include a target region known to have variations that may relate to a disease state or a likelihood of responding to a treatment. In various embodiments the target region may include any one of the variations to the nucleic acid sequences identified by SEQ ID NOS. 719-748 listed in tables 3, 4 or 6. In various embodiments the primers are designed to amplify the target region which may include any one of the variations listed in tables 2, 3, 4 or 6 and the nucleotides either side of the variation by about 5-10 nucleotides, or 7-15 nucleotides, or 10 to 20 nucleotides, or 15 to 30 nucleotides, or more. In various embodiments the target region may include any of the 359 regions discussed herein. In various embodiments at least one target region is amplified with a primer pair for each of the nucleic acid sequences identified by SEQ ID NOS. 719-748. In various embodiments SEQ ID NO. 719 amplifies 2 regions with 2 primer pairs; SEQ ID NO. 720 amplifies 9 regions with 9 primer pairs; SEQ ID NO. 721 amplifies 2 regions with 2 primer pairs; SEQ ID NO. 722 amplifies 2 regions with 2 primer pairs; SEQ ID NO. 723 amplifies 33 regions with 33 primer pairs; SEQ ID NO. 724 amplifies 10 regions with 10 primer pairs; SEQ ID NO. 725 amplifies 3 regions with 3 primer pairs; SEQ ID NO. 726 amplifies 2 regions with 2 primer pairs; SEQ ID NO. 727 amplifies 2 regions with 2 primer pairs; SEQ ID NO. 728 amplifies 4 regions with 4 primer pairs; SEQ ID NO. 729 amplifies 4 regions with 4 primer pairs; SEQ ID NO. 730 amplifies 1 region with 1 primer pair; SEQ ID NO. 731 amplifies 1 region with 1 primer pair; SEQ ID NO. 732 amplifies 2 regions with 2 primer pairs; SEQ ID NO. 733 amplifies 2 regions with 2 primer pairs; SEQ ID NO. 734 amplifies 10 regions with 10 primer pairs; SEQ ID NO. 735 amplifies 3 regions with 3 primer pairs; SEQ ID NO. 736 amplifies 19 regions with 19 primer pairs; SEQ ID NO. 737 amplifies 20 regions with 20 primer pairs; SEQ ID NO. 738 amplifies 7 regions with 7 primer pairs; SEQ ID NO. 739 amplifies 110 regions with 110 primer pairs; SEQ ID NO. 740 amplifies 3 regions with 3 primer pairs; SEQ ID NO. 741 amplifies 5 regions with 5 primer pairs; SEQ ID NO. 742 amplifies 11 regions with 11 primer pairs; SEQ ID NO. 743 amplifies 61 regions with 61 primer pairs; SEQ ID NO. 744 amplifies 8 regions with 8 primer pairs; SEQ ID NO. 745 amplifies 4 regions with 4 primer pairs; SEQ ID NO. 746 amplifies 2 regions with 2 primer pairs; SEQ ID NO. 747 amplifies 5 regions with 5 primer pairs; SEQ ID NO. 748 amplifies 12 regions with 12 primer pairs.
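The per-sequence region counts recited above can be cross-checked against the stated total of 359 regions. The short tally below is a sketch for that purpose only; it reads the counts for SEQ ID NOS. 739 and 742 as 110 and 11, and the variable names are illustrative.

```python
# Regions (and primer pairs) recited for each reference sequence, keyed by SEQ ID NO.
REGIONS_PER_SEQ_ID = {
    719: 2, 720: 9, 721: 2, 722: 2, 723: 33, 724: 10, 725: 3, 726: 2,
    727: 2, 728: 4, 729: 4, 730: 1, 731: 1, 732: 2, 733: 2, 734: 10,
    735: 3, 736: 19, 737: 20, 738: 7, 739: 110, 740: 3, 741: 5, 742: 11,
    743: 61, 744: 8, 745: 4, 746: 2, 747: 5, 748: 12,
}

total_regions = sum(REGIONS_PER_SEQ_ID.values())
total_primers = 2 * total_regions  # one forward and one reverse primer per region

print(len(REGIONS_PER_SEQ_ID), "reference sequences")
print(total_regions, "target regions")
print(total_primers, "primers")
```

Thirty reference sequences, 359 regions and two primers per region give 718 primers, which agrees with a panel of oligonucleotides numbered SEQ ID NOS. 1-718.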
[0044]. In various embodiments the panel of primers designed to target regions of nucleic acid sequences comprise oligonucleotides identified by SEQ ID NOS. 1-718.
[0045]. In various embodiments the nucleic acid may be amplified by any method known in the art provided multiple amplicons may be formed. Preferably the nucleic acid is amplified by polymerase chain reaction (PCR) with the primers.
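As a rough illustration of what a single primer pair in such a panel defines, the sketch below parses one row of the primer listing and derives the amplicon span. It assumes the columns are chromosome, start, end, forward primer, reverse primer and gene, and that the span is simply end minus start; neither the column meanings nor the coordinate convention is stated explicitly in the text. The row used is the KRAS entry from the listing, reading the extracted label "chrl2" as chr12, and the class and function names are hypothetical.

```python
from typing import NamedTuple


class Amplicon(NamedTuple):
    chrom: str
    start: int
    end: int
    forward_primer: str
    reverse_primer: str
    gene: str

    @property
    def length(self) -> int:
        # ASSUMPTION: span taken as end - start; the coordinate convention is not stated.
        return self.end - self.start


def parse_amplicon(row: str) -> Amplicon:
    """Parse one whitespace-separated row of the primer listing."""
    chrom, start, end, fwd, rev, gene = row.split()
    return Amplicon(chrom, int(start), int(end), fwd, rev, gene)


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


row = "chr12 25398189 25398310 TGGTCCTGCACCAGTAATATGC ATTATAAGGCCTGCTGAAAATGACTGA KRAS"
amplicon = parse_amplicon(row)
print(amplicon.gene, amplicon.length, gc_fraction(amplicon.forward_primer))
```

Spans of roughly this size are consistent with the short gene segments the description says are targeted for amplification.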
[0046]. In various embodiments purifying the amplified nucleic acid comprises removing some of the primers either side of the amplicons. In various embodiments the primers are removed using FuPa reagent but any other method known in the art may be used for this purpose. In various embodiments purifying the amplified nucleic acid may further comprise adding identifier sequences, sometimes called barcode sequences, to either side of the amplicons where the primer sequences have been removed.
[0047]. In various embodiments purifying the amplified nucleic acid may also or alternatively comprise adding paramagnetic particles to the amplified nucleic acid, removing any nucleic acid not adhering to the paramagnetic particles, and removing the paramagnetic particles. In various embodiments the method further comprises amplifying the purified amplified nucleic acid after removing the paramagnetic particles.
[0048]. In various embodiments the amplicons may be sequenced in a massively parallel method using reversible dye terminators. This sequencing method is based on reversible dye-terminators that enable the identification of single bases as they are introduced into DNA strands. The amplicons aggregate in clusters on a chip, and primers are added together with modified nucleotides that carry fluorescent tags and reversible 3' blockers, which force the primers to add on only one nucleotide at a time. After each round of this addition, a signal from the dye terminators is detected; for example, a camera takes a picture of the chip and a computer determines what base was added by the wavelength of the fluorescent tag and records it for every spot on the chip. After each round, non-incorporated molecules are washed away. A chemical de-blocking step is then used to remove the 3' terminal blocking group and the dye in a single step. The process continues until the nucleic acid molecules present are sequenced. With this technology, thousands of places throughout the genome are sequenced at once via massive parallel sequencing.
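The cycle-by-cycle logic described above can be sketched as follows. This is a simplified illustration, not the instrument's base-calling algorithm: it assumes one fluorescent channel per base, represents each cycle for one cluster as a mapping from base to measured intensity, and calls the brightest channel. The intensity values and function names are hypothetical.

```python
BASES = "ACGT"


def call_base(intensities: dict) -> str:
    """Return the base whose fluorescent channel is brightest in one cycle."""
    return max(BASES, key=lambda base: intensities[base])


def call_read(cycles: list) -> str:
    """Call one base per sequencing cycle for a single cluster."""
    return "".join(call_base(cycle) for cycle in cycles)


# HYPOTHETICAL normalised intensities for three cycles of one cluster.
cycles = [
    {"A": 0.05, "C": 0.02, "G": 0.90, "T": 0.03},
    {"A": 0.88, "C": 0.04, "G": 0.05, "T": 0.03},
    {"A": 0.02, "C": 0.06, "G": 0.04, "T": 0.88},
]
print(call_read(cycles))
```

Because only one blocked nucleotide is incorporated per round, each cycle contributes exactly one base to the read for every spot on the chip.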
[0049]. In various embodiments the method further comprises quantifying the amplified nucleic acid and adjusting its concentration to at least 1000 parts per million.
[0050]. In various embodiments the method further comprises staining a second slice of the fixed formalin paraffin embedded sample; capturing a micrograph of the stained sample; and displaying the micrograph with sequence variations detected. Displaying the variations in the nucleic acid sequence and the histological details of the FFPE sample will allow a pathologist or other professional to make a more accurate diagnosis and select a treatment regime.
[0051]. In various embodiments further testing for variations may be conducted on the FFPE sample including but not limited to fluorescence in situ hybridization (FISH) to detect the amount of the target gene present in a sample and/or immunohistochemistry (IHC) to detect the amount of target protein in a sample. The latter method is however semi-quantitative. In various embodiments these results may be displayed together with the variations in the nucleic acid sequence.
[0052]. In various embodiments the sample is a solid tumour sample. As used herein a solid tumour refers to an abnormal mass of tissue that usually does not contain cysts or liquid areas. Different types of solid tumours are named for the type of cells that form them.
[0053]. In various embodiments the solid tumour is selected from any one of colorectal cancer; lung cancer; gastrointestinal cancer; poorly differentiated malignant neoplasm likely malignant adrenocortical tumour; mucinous adenocarcinoma; or signet ring cell diffuse adenocarcinoma.
[0054]. In various embodiments the method further comprises determining a cancer treatment regime for the subject based on the variations detected.
[0055]. As used herein the terms 'cancer treatment regime', 'treatment', 'treat', 'cancer therapy' and synonyms thereof refer to both therapeutic treatment and prophylactic or preventative measures, wherein the object is to cure, prevent or slow down (lessen) a cancer condition. A subject that responds to a treatment is a subject in whom such measures cure, prevent or slow down (lessen) a cancer condition.
[0056]. In various embodiments a sample with a KRAS (p.Ala146Thr) mutation predicts or suggests that the subject is likely to respond to treatment with oxaliplatin and to be resistant or unresponsive to panitumumab, bevacizumab or cetuximab.
[0057]. In various embodiments a sample with a loss of function mutation in PTEN indicates or predicts that the subject is likely to respond to treatment with everolimus.
[0058]. Other such relationships between biomarkers in the target regions and whether a cancer will be responsive to a particular treatment are also known. The advantage of the current system is that the method evaluates 30 genes in parallel, instead of relying on the existing single gene tests for companion diagnosis.
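By way of illustration only, the two relationships stated in paragraphs [0056] and [0057] could be held in a simple lookup so that every variant detected in one run is annotated in a single pass. The following is a minimal sketch; the names BIOMARKER_NOTES and annotate are hypothetical and are not part of the described system, and only the two entries taken from the text above are included.

```python
# Hypothetical lookup; the two entries restate paragraphs [0056] and [0057] only.
BIOMARKER_NOTES = {
    ("KRAS", "p.Ala146Thr"): {
        "likely_responsive": ["oxaliplatin"],
        "likely_resistant": ["panitumumab", "bevacizumab", "cetuximab"],
    },
    ("PTEN", "loss of function"): {
        "likely_responsive": ["everolimus"],
        "likely_resistant": [],
    },
}


def annotate(variants):
    """Map each (gene, change) pair to its note, or None when no entry exists."""
    return {variant: BIOMARKER_NOTES.get(variant) for variant in variants}


# One panel run can report several genes at once; variants without an entry
# in this illustrative table are returned with None rather than guessed.
report = annotate([("KRAS", "p.Ala146Thr"), ("CTNNB1", "p.Ser33Phe")])
for variant, note in report.items():
    print(variant, note)
```

A variant that has no entry is passed through with no note, which keeps the sketch from implying a treatment relationship that the text does not state.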
[0059]. Here the validation of a laboratory developed solid tumour targeted hotspots panel test with NGS to replace the single gene companion diagnostic tests is reported. A few genes with unique diagnostic or prognostic value such as GNAQ, GNAS, GNA11 and TP53 have also been included.
[0060]. Such an approach is ideal for cancer genomics, where multiple regions of the patient genome need to be sequenced in parallel to very high sequencing depth to detect mutations occurring at low frequency. Furthermore, the method is able to achieve this with limited starting materials, a significant advantage over conventional sequencing platforms that require relatively larger DNA quantities.
[0061]. Figure 1 is a non-limiting example of the workflow of various embodiments.
[0062]. Another aspect of the invention relates to a system for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample comprising: a chamber for extracting and amplifying nucleic acid; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; a sensor for detecting signals; an analyser for converting the signals to nucleic acid sequence details; and a display.
[0063]. Like terms have similar meanings to the same terms used for other embodiments.
[0064]. In various embodiments the sensor detects the wavelength of each fluorescent tag. Any method known in the art to detect individual wavelengths of the four different fluorescent tags used would be suitable.
[0065]. In various embodiments the analyser is a computer programmed to make bioinformatics calculations including variant calling for somatic amplicon sequencing.
[0066]. In various embodiments the display may be a screen for displaying a visual report. In various embodiments the display may be a print out of the analysed results. A non-limiting example of a display of various embodiments is depicted in figure 2.
[0067]. In various embodiments the panel of primers comprises oligonucleotides identified by SEQ ID NOS. 1-718.
[0068]. Another aspect of the invention relates to a kit for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample in a system as described herein, comprising: polymerase chain reaction reagents; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; and fluorescent nucleotides.
[0069]. In various embodiments the kit further comprises paramagnetic particles.
[0070]. In various embodiments the kit further comprises a panel of primers comprising oligonucleotides identified by SEQ ID NOS. 1-718.
[0071]. Another aspect of the invention relates to a panel of primers for use in detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample in a parallel sequencing method comprising primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748.
[0072]. In various embodiments the panel of primers comprises oligonucleotides identified by SEQ ID NOS. 1-718.
[0073]. The advantage and the potential clinical utility of simultaneously evaluating 30 genes, instead of using the existing single gene tests for companion diagnosis, have been identified.
[0074]. In summary, the uniqueness of this laboratory developed cancer related gene panel lies in its ability to work with a limited amount of FFPE sample, adopting a specific library preparation approach while keeping the high sequencing quality of the Illumina MiSeq platform, which shortened the pipeline development time and effort. Its clinical utility in a variety of cancers and in a few treatment resistance cases has been demonstrated in our early adoption of this NGS-based 30 gene panel clinical test for targeted therapy guidance.
[0075]. Another aspect of the invention relates to a method for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising: (a) extracting nucleic acid from a slice of a fixed formalin paraffin embedded sample from a subject; (b) contacting the extracted nucleic acid with a panel of primers designed to target regions of nucleic acid sequences comprising oligonucleotides identified by SEQ ID NOS. 719-748; (c) amplifying the nucleic acid; (d) purifying the amplified nucleic acid; (e) contacting the amplified nucleic acid with reversible dye terminators; (f) detecting a signal from the dye terminators; and (g) converting the signal to a nucleic acid sequence and analysing the same for variations, wherein detecting variations from the reference sequence in the nucleic acid indicates the subject has a solid tumour cancer and indicates a cancer treatment regime for the subject.
[0076]. In various embodiments the solid tumour is selected from any one of colorectal cancer; lung cancer; gastrointestinal cancer; poorly differentiated malignant neoplasm likely malignant adrenocortical tumour; mucinous adenocarcinoma; or signet ring cell diffuse adenocarcinoma.
[0077]. In various embodiments purifying the nucleic acid comprises adding paramagnetic particles to the amplified nucleic acid and removing any nucleic acid not adhering to the paramagnetic particles and removing the paramagnetic particles.
[0078]. In various embodiments the method further comprises amplifying the purified amplified nucleic acid after removing the paramagnetic particles.
[0079]. In various embodiments the method further comprises quantifying the amplified nucleic acid and adjusting its concentration to at least 1000 parts per million.
[0080]. In various embodiments the panel of primers designed to target regions of nucleic acid sequences comprise oligonucleotides identified by SEQ ID NOS. 1-718.
[0081]. In various embodiments a variation from the reference sequence SEQ ID NO. 737 (cG436A) indicates that a treatment regime with oxaliplatin is preferred.
[0082]. In various embodiments a variation from the reference sequence SEQ ID NO. 746 indicates that a treatment regime with everolimus is preferred.
[0083]. Another aspect of the invention relates to a system for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising: a chamber for extracting and amplifying nucleic acid; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; a sensor for detecting signals; an analyser for converting the signals to nucleic acid sequence details; and a display.
[0084]. In various embodiments the system further comprises paramagnetic particles; and a magnetic plate for placing under the chamber.
[0085]. In various embodiments the panel of primers comprises oligonucleotides identified by SEQ ID NOS. 1-718.
[0086]. Another aspect of the invention relates to a kit for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising: polymerase chain reaction reagents; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; and fluorescent nucleotides.
[0087]. In various embodiments the kit further comprises paramagnetic particles.
[0088]. In various embodiments the panel of primers comprises oligonucleotides identified by SEQ ID NOS. 1-718.
[0089]. EXAMPLES
[0090]. The method is designed to balance the need for comprehensive coverage of the hotspot regions of each gene of interest (359 regions), sequencing depth (average 4000X, minimum 500X for reporting 5% MAF variants) to assure sensitivity (5% MAF), and assay throughput (10 clinical samples) to match the potential clinical volume. A total of 50 clinical samples have been successfully sequenced. The assay validation has also shown that the test can consistently detect a variety of mutations at the 5% detection limit (Figures 3 and 4) in FFPE samples.
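The reporting thresholds in paragraph [0090] can be pictured with a short sketch. It assumes that MAF is computed as the number of variant-supporting reads divided by the total depth at the position, which the text does not define; the function and constant names are hypothetical. Integer arithmetic is used so that the 5% boundary is tested exactly.

```python
MIN_DEPTH = 500        # minimum depth for reporting, from paragraph [0090]
MIN_MAF_PERCENT = 5    # 5% MAF detection limit, from paragraph [0090]


def is_reportable(variant_reads: int, depth: int) -> bool:
    """True when the position meets both the depth and the 5% MAF thresholds.

    Assumes MAF = variant_reads / depth; compared in integers to avoid
    floating-point rounding at the exact 5% boundary.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    deep_enough = depth >= MIN_DEPTH
    frequent_enough = 100 * variant_reads >= MIN_MAF_PERCENT * depth
    return deep_enough and frequent_enough


# At the minimum depth of 500X, a 5% variant corresponds to 25 supporting reads;
# at the average depth of 4000X it corresponds to 200 supporting reads.
print(is_reportable(25, 500))    # meets both thresholds
print(is_reportable(24, 500))    # below 5% at 500X
print(is_reportable(40, 499))    # above 5% but depth below 500X
```

Under this assumption the minimum depth of 500X guarantees that a reported 5% variant is supported by at least 25 reads.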
[0091]. 30 samples from three major tumor types (lung adenocarcinoma, colorectal cancer and GIST) were evaluated together with a few rare types of cancer such as "poorly differentiated malignant neoplasm likely malignant adrenocortical tumor" (15-22), mucinous adenocarcinoma (15-24) and signet ring cell diffuse adenocarcinoma (15-123). In case 15-22 (poorly differentiated malignant neoplasm likely malignant adrenocortical tumor), a gain of function mutation (p.Ser33Phe) was detected in the CTNNB1 gene in a liver cancer patient. In in vitro studies this mutation results in the stabilization of the beta-catenin protein, increasing transcription of TCF/LEF-responsive target genes and therefore activating the Wnt signalling pathway. Inhibitors that target β-catenin are still in preclinical development (Lepourcelet et al. Cancer Cell 2004, 5:91-102). However, inhibitors that target the Wnt signalling pathway have been reported to confer response in solid tumours. Synthesis of β-catenin has been reported to be regulated by MEK, a downstream effector of the RAS signalling pathway (Gosens et al. FASEB Journal 2010, 24:757-68). Aberrant Wnt/β-catenin signalling is associated with RAS pathway stabilization and Ras-induced transformation in colorectal tumorigenesis (Jeong et al. Science Signaling 2012, 5:ra30), suggesting that MEK inhibitors could be a therapy option for this tumour type.
[0092]. In case 15-24 (mucinous adenocarcinoma), the mucinous adenocarcinoma in the colon was positive for a KRAS (p.Ala146Thr) mutation. This suggests that the specific cancer is likely to be responsive to oxaliplatin and resistant to panitumumab, bevacizumab and cetuximab, based on reports relating to this mutation (Douillard et al. The New England Journal of Medicine 2013, 369:1023-34). The investigational drugs pimasertib, refametinib, selumetinib and sorafenib, inhibitors of the RAF/MEK/ERK pathway, were also listed in the report as potential choices based on preclinical studies and clinical trial information. In case 15-123 (signet ring cell diffuse adenocarcinoma), a PTEN loss of function mutation was found in a signet ring diffuse stomach adenocarcinoma. Loss of PTEN leads to PI3K/Akt/mTOR pathway activation (DeGraffenried et al. Ann Oncol 2004, 15:1510-6), and this indicates that investigational drugs such as everolimus, AKT5363, AKT MK2206 and buparlisib could be potential targeted therapy choices.
[0093]. Tumour Samples and DNA Extraction
[0094]. In total, 30 formalin-fixed, paraffin-embedded (FFPE) solid tumour samples from colorectal cancer (n=10), lung cancer (n=12) and gastrointestinal stromal tumour (n=10) were analysed by NGS in parallel with Sanger sequencing. FFPE samples were obtained from Hong Kong and Singapore hospitals with informed consent from each patient. H&E staining was performed on all the clinical samples and the tumour content of each sample was assessed by a pathologist. Only cases with more than 20% tumour content were included in this study. For each case, a 10 μm section of tumour tissue was manually macrodissected using the matched H&E-stained slide as guidance. DNA was extracted according to Qiagen GeneRead (Qiagen, Hilden, Germany). HD200 (Horizon Diagnostics, UK) was used as a positive control for DNA extraction. The extracted DNA was quantified by the Quant-iT™ dsDNA HS Assay (Life Technologies, Darmstadt, Germany) on the Qubit 3.0 fluorometer (Life Technologies).
[0095]. Library Preparation and Sequencing using MiSeq
[0096]. In total, 30 cancer related genes were shortlisted in the test based on the following criteria: (i) actionable driver mutations guiding FDA approved effective drugs (e.g. EGFR mutations for TKIs); (ii) driver mutations guiding FDA approved drugs but used in other cancer types; (iv) driver mutations used in investigational therapies of potential benefit with open clinical trials and (v) known drug resistance mutations (Table 1). Two sets of primer pools that target the hotspot regions were designed with the Ion AmpliSeq Designer (Applied Biosystems, Life Technologies). Library preparation for each sample was performed using the customized primers and the Ion AmpliSeq Library Kit 2.0 (Applied Biosystems, Life Technologies). HD200 (Horizon Diagnostics) was used as a positive control for library generation. PCR was performed using 5-30 ng of DNA per primer pool, followed by adding 2 μl of FuPa to the reaction. The pool 1 and pool 2 mixtures were combined prior to adding 8 μl of Switch Solution, 0.5 μM of Illumina TruSeq adapters (IDT) and 4 μl of DNA ligase. Generated libraries were purified using Agencourt AMPure XP reagent and were further amplified using KAPA HiFi HotStart ReadyMix (Kapa Biosystems, South Africa). Each individual library was quantified by the KAPA Library Quantification Kit (Kapa Biosystems). Generated libraries were normalized to the same concentration before being pooled together for sequencing. In total, 10 samples were pooled together with HD200 and NTC. After the pooled library preparation, samples were sequenced using the MiSeq Reagent Kit v2 (2x 150 bp) according to the manufacturer's instructions.
[0097]. Table 1. Solid Tumour Panel contains 30 clinically relevant genes
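The normalization of libraries to the same concentration before pooling, as described in paragraph [0096], can be pictured with the standard dilution relation C1 x V1 = C2 x V2. The sketch below is illustrative only: the target concentration, the final volume and the sample names are assumed values, not figures taken from the protocol.

```python
def dilution_volumes(stock_conc, target_conc, final_volume):
    """Return (library_volume, diluent_volume) from C1*V1 = C2*V2.

    Concentrations share one unit (e.g. pM) and volumes share one unit (e.g. ul).
    """
    if stock_conc <= 0 or target_conc <= 0 or final_volume <= 0:
        raise ValueError("concentrations and volume must be positive")
    if stock_conc < target_conc:
        raise ValueError("library is more dilute than the target concentration")
    library_volume = target_conc * final_volume / stock_conc
    return library_volume, final_volume - library_volume


# Hypothetical quantification results (pM) for three libraries.
measured = {"sample_A": 4000.0, "sample_B": 2000.0, "sample_C": 1000.0}

# Assumed target of 1000 pM in a 20 ul final volume for every library.
plan = {name: dilution_volumes(conc, 1000.0, 20.0) for name, conc in measured.items()}
for name, (lib_ul, diluent_ul) in plan.items():
    print(f"{name}: {lib_ul:.1f} ul library + {diluent_ul:.1f} ul diluent")
```

After dilution each library is at the same concentration, so pooling equal volumes gives each sample an equal share of the sequencing run.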
[0098]. Wherein SEQ ID NO. 719 is the reference sequence to Homo sapiens serine/threonine kinase 1 (AKT1) having GenBank accession number NG_012188, version NG_012188.1
[0099]. Wherein SEQ ID NO. 720 is the reference sequence to Homo sapiens ALK receptor tyrosine kinase (ALK) having GenBank accession number NG_009445, version NG_009445.1
[00100]. Wherein SEQ ID NO. 721 is the reference sequence to Homo sapiens B-Raf proto-oncogene, serine/threonine kinase (BRAF) having GenBank accession number NG_007873, version NG_007873.3
[00101]. Wherein SEQ ID NO. 722 is the reference sequence to Homo sapiens catenin beta 1 (CTNNB1) having GenBank accession number NG_013302, version NG_013302.2
[00102]. Wherein SEQ ID NO. 723 is the reference sequence to Homo sapiens discoidin domain receptor tyrosine kinase 2 (DDR2) having GenBank accession number NG_016290, version NG_016290.1
[00103]. Wherein SEQ ID NO. 724 is the reference sequence to Homo sapiens epidermal growth factor receptor (EGFR) having GenBank accession number NG_007726, version NG_007726.3
[00104]. Wherein SEQ ID NO. 725 is the reference sequence to Homo sapiens erb-b2 receptor tyrosine kinase 2 (ERBB2) having GenBank accession number NG_007503, version NG_007503.1
[00105]. Wherein SEQ ID NO. 726 is the reference sequence to Homo sapiens estrogen receptor 1 (ESR1) having GenBank accession number NG_008493, version NG_008493.2
[00106]. Wherein SEQ ID NO. 727 is the reference sequence to Homo sapiens fibroblast growth factor receptor 1 (FGFR1) having GenBank accession number NG_007729, version NG_007729.1
[00107]. Wherein SEQ ID NO. 728 is the reference sequence to Homo sapiens fibroblast growth factor receptor 2 (FGFR2) having GenBank accession number NG_012449, version NG_012449.2
[00108]. Wherein SEQ ID NO. 729 is the reference sequence to Homo sapiens fibroblast growth factor receptor 3 (FGFR3) having GenBank accession number NG_012632, version NG_012632.1
[00109]. Wherein SEQ ID NO. 730 is the reference sequence to Homo sapiens G protein subunit alpha 11 (GNA11) having GenBank accession number NG_033852, version NG_033852.2
[00110]. Wherein SEQ ID NO. 731 is the reference sequence to Homo sapiens G protein subunit alpha q (GNAQ) having GenBank accession number NG_027904, version NG_027904.2
[00111]. Wherein SEQ ID NO. 732 is the reference sequence to Homo sapiens GNAS complex locus (GNAS) having GenBank accession number NG_016194, version NG_016194.1
[00112]. Wherein SEQ ID NO. 733 is the reference sequence to Homo sapiens HRas proto-oncogene, GTPase (HRAS) having GenBank accession number NG_007666, version NG_007666.1
[00113]. Wherein SEQ ID NO. 734 is the reference sequence to Homo sapiens KIT proto-oncogene receptor tyrosine kinase (KIT) having GenBank accession number NG_007456, version NG_007456.1
[00114]. Wherein SEQ ID NO. 735 is the reference sequence to Homo sapiens KRAS proto-oncogene, GTPase (KRAS) having GenBank accession number NG_007524, version NG_007524.1
[00115]. Wherein SEQ ID NO. 736 is the reference sequence to Homo sapiens mitogen-activated protein kinase kinase 1 (MAP2K1) having GenBank accession number NG_008305, version NG_008305.1
[00116]. Wherein SEQ ID NO. 737 is the reference sequence to Homo sapiens mitogen-activated protein kinase kinase 2 (MAP2K2) having GenBank accession number NG_007996, version NG_007996.1
[00117]. Wherein SEQ ID NO. 738 is the reference sequence to Homo sapiens MET proto-oncogene, receptor tyrosine kinase (MET) having GenBank accession number NG_008996, version NG_008996.1
[00118]. Wherein SEQ ID NO. 739 is the reference sequence to Homo sapiens mechanistic target of rapamycin (MTOR) having GenBank accession number NG_033239, version NG_033239.1
[00119]. Wherein SEQ ID NO. 740 is the reference sequence to Homo sapiens NRAS proto-oncogene, GTPase (NRAS) having GenBank accession number NG_007572, version NG_007572.1
[00120]. Wherein SEQ ID NO. 741 is the reference sequence to Homo sapiens platelet derived growth factor receptor alpha (PDGFRA) having GenBank accession number NG_009250, version NG_009250.1
[00121]. Wherein SEQ ID NO. 742 is the reference sequence to Homo sapiens phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) having GenBank accession number NM_006218, version NM_006218.3
[00122]. Wherein SEQ ID NO. 743 is the reference sequence to Homo sapiens patched 1 (PTCH1) having GenBank accession number NG_007664, version NG_007664.1
[00123]. Wherein SEQ ID NO. 744 is the reference sequence to Homo sapiens phosphatase and tensin homolog (PTEN) having GenBank accession number NG_007466, version NG_007466.2
[00124]. Wherein SEQ ID NO. 745 is the reference sequence to Homo sapiens ret proto-oncogene (RET) having GenBank accession number NG_007489, version NG_007489.1
[00125]. Wherein SEQ ID NO. 746 is the reference sequence to Homo sapiens ROS proto-oncogene 1, receptor tyrosine kinase (ROS1) having GenBank accession number NG_033929, version NG_033929.1
[00126]. Wherein SEQ ID NO. 747 is the reference sequence to Homo sapiens smoothened, frizzled class receptor (SMO) having GenBank accession number NG_023340, version NG_023340.1
[00127]. Wherein SEQ ID NO. 748 is the reference sequence to Homo sapiens tumor protein p53 (TP53) having GenBank accession number NG_017013, version NG_017013.2
[00128]. Library preparation and sequencing performance
[00129]. A sufficient amount of library (>1000 pM) was generated from all cases with 15 ng of starting material, except case CRC9_14MA3112-9. All runs were successful, with a PhiX error rate of <2%, a cluster density of 850-1200 K/mm2, >80% of clusters passing filter and >80% of bases at ≥Q30 (data not shown). As shown in Figure 5, in the majority of the samples 100% of the amplicons had at least 500x sequencing depth. All the cases fulfilled the acceptance criterion of a minimum of 70% of amplicons having 500x or higher sequencing depth. The overall workflow and the quality control metrics are summarized in Supplementary Figure 5. HD200, with different mutations at various MAF (1.0-24.5%), was included in each run. It served as a control from DNA extraction through to variant calling.
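The per-sample acceptance criterion in paragraph [00129] (at least 70% of amplicons at 500x or deeper) lends itself to a simple check. The sketch below is illustrative; the function names are hypothetical. The run-level limits in run_within_reported_ranges are the ranges reported above for successful runs, and treating them as pass/fail thresholds is an assumption of this sketch.

```python
def fraction_at_depth(amplicon_depths, min_depth=500):
    """Fraction of amplicons whose depth is at least min_depth."""
    if not amplicon_depths:
        raise ValueError("no amplicon depths supplied")
    return sum(d >= min_depth for d in amplicon_depths) / len(amplicon_depths)


def sample_passes(amplicon_depths, min_fraction=0.70, min_depth=500):
    """Acceptance criterion from [00129]: >=70% of amplicons at >=500x."""
    return fraction_at_depth(amplicon_depths, min_depth) >= min_fraction


def run_within_reported_ranges(phix_error_pct, cluster_density_k_mm2,
                               passing_filter_pct, q30_pct):
    """Assumed run-level check built from the ranges reported in [00129]."""
    return (phix_error_pct < 2
            and 850 <= cluster_density_k_mm2 <= 1200
            and passing_filter_pct > 80
            and q30_pct > 80)


# Hypothetical depths for a ten-amplicon toy sample.
good_sample = [4200, 3900, 650, 800, 5100, 700, 900, 1200, 300, 450]
poor_sample = [4200, 3900, 650, 800, 5100, 700, 100, 200, 300, 450]
print(sample_passes(good_sample), sample_passes(poor_sample))
print(run_within_reported_ranges(1.2, 1000, 90.0, 88.5))
```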
[00130]. Amplicon-based sequencing can be susceptible to pseudogene regions and segmental duplications. The Angsana Solid Tumour Panel includes 15 amplicons (out of a total of 374) that were in the original design but were subsequently disclaimed due to low specificity. During the wet-lab process these 15 amplicons are still amplified, but they are disregarded in the subsequent bioinformatics analysis. These amplicons are listed as the shaded last 15 primer pairs in Table 2.
[00131]. Table 2. List of primers.
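The handling of the 15 disclaimed amplicons can be sketched as a filter applied after sequencing: reads from every amplicon are produced, but calls that fall in a disclaimed amplicon are dropped before reporting. The amplicon identifiers, the record layout and the function name below are hypothetical; only the counts (374 designed, 15 disclaimed) come from the text.

```python
TOTAL_AMPLICONS = 374      # amplicons in the original design, from [00130]
DISCLAIMED_AMPLICONS = 15  # amplicons disregarded in analysis, from [00130]

# Amplicons that remain in the bioinformatics analysis.
analysed_amplicons = TOTAL_AMPLICONS - DISCLAIMED_AMPLICONS


def drop_disclaimed(calls, disclaimed_ids):
    """Keep only variant calls whose amplicon is not in the disclaimed set."""
    disclaimed = set(disclaimed_ids)
    return [call for call in calls if call["amplicon"] not in disclaimed]


# Hypothetical identifiers and calls, for illustration only.
disclaimed_ids = {"AMP_360", "AMP_361"}
calls = [
    {"amplicon": "AMP_001", "gene": "MTOR"},
    {"amplicon": "AMP_360", "gene": "PTEN"},
    {"amplicon": "AMP_120", "gene": "KRAS"},
]
kept = drop_disclaimed(calls, disclaimed_ids)
print(analysed_amplicons, [call["amplicon"] for call in kept])
```

The difference of 359 agrees with the number of hotspot regions stated in paragraph [0090].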
chr1 11194502 11194591 TGCATGGTCTGGACAAAATGCT GGTTGACTCATGCCTATCAGATGAA MTOR
chr1 11199386 11199519 CGGGCACTCTTCCACATGTTT CTGGGCTGTCCAGTTCTGTC MTOR
chr1 11199514 11199637 GCAAGAGCCTTAAAAATAAGAGAAACTGG GGTCAGCCCTCATGAAGACATG MTOR
chr1 11199636 11199741 GCTTGCATACTTGAGCCAGGTT GCCACAAAAACAATGTATCAGAAGACATAT MTOR
chr1 11204620 11204721 AGGCAGGAAAAGCAAGTTGAGTA CGAGAGATCATCCGCCAGATCT MTOR
chr1 11204710 11204817 CTTTCTACCAAGCTCACCTGCA TAATTAGTTTGGAATGCGATTTGCTTCC MTOR
chr1 11205007 11205110 GAAATAAGCCTCAAAAATGACAATGTGC GCTCAGTCCCAAGGGTTGTTTC MTOR
chr1 11206727 11206854 AGAGATCTGGGTGCATGTAGGT TGATGCTTTCTGTCTATGTGTGTGT MTOR
chr1 11210127 11210249 AGCACCTTACTCTTCTGATGCG CAGCAGTGCTGTGAAAAGTGG MTOR
chr1 11210239 11210313 CATCTTGGCTTGGGTCTCATCA GGAAGATTGGTAGTTTAAGGAGATTTGGAT MTOR
chr1 11217199 11217316 AGCCACACATGCCATCATTCTA GCTACCTGGTATGAGAAACTGCA MTOR
chr1 11217300 11217375 TCCATTTTCTTGTCATAGGCCACAA TGATCCCTTCTTCTGTGTACCTCA MTOR
chr1 11227468 11227597 CTCCCTAGCCAAGTCTCTACCT CCTCAGACCTGTTGAGTGTAACT MTOR
chr1 11259280 11259409 TGGGAGAAGAGAGGTCATTTTGC CATTGTTCTGCTGGGTGAGAGA MTOR
chr1 11259397 11259472 TAGTGT AGTG CTTTGG C ATATG CTC CA M 1 I CC I G I GC I G I GAGG I 1 1 1 MTOR
chr1 11259497 11259617 AAAACCTCACAGCACAGGAAAATG CAGACCCTCTTAAACTTGGCTGAAT MTOR
chr1 11259607 11259706 GGGAAAAGTCTCACCTTGTCACT GCTGGTCTGAACTGAATGAAGATCA MTOR
chr1 11259696 11259778 GCCAACTCGATGCTTCTGATGA CACGGGAGGTCTCTCCATTTTC MTOR
chr1 11264598 11264713 CG I 1 1 I U A I G I GU GA I U I C I CCA TCCAAAGATGACTGGCTGGAATG MTOR
chr1 11264691 11264770 GCGATGATGAGTCCTTCAGCA ACTGATTGTTTTCCATTGTCACCTGT MTOR
chr1 11269363 11269489 TCCCAGTCACCTGAAACAATGG GATACACACTTGCTGATGAAGAGGA MTOR
chr1 11269479 11269569 CCTAAGCATCCGATGCTGGTAAA GCACCTCCCAAAATTTAAGGATATTTCCTA MTOR
chr1 11270862 11270980 CAGGGACTTCAGAACAGAAAAGAAGT CACTCTCCAAAATGATAGTTTCTCATCTCT MTOR
chr1 11272337 11272461 ACAAAGTCTTCTTTCCAAATAAGGCAGA TGACTATGCCTCCCGGATCATT MTOR
chr1 11272455 11272546 CTCTGGTCCAGTGTTCGAACAA GGAGATTGTATGTTAGCAGTTACTGTATGT MTOR
chr1 11272789 11272914 GAAAGATGGCCTGGGAACTTAAGA CAACCTGGATGACTACCTGCAT MTOR
chr1 11272904 11272976 GGGCATCAAACAACTTAACAATAGGA I GAC I A I I CA I CACCA I GC I 1 1 I GC I MTOR
chr1 11273446 11273559 AGAAAATCTCTCTGGAGGATGACGTA ATCATTCTTCTCATTGAGCAAATTGTGG MTOR
chr1 11273549 11273628 GGGCAGGTAGAGCTTAAATTCACC I CAGC I CC I C I GAC I 1 1 I C I C I C I 1 MTOR
chr1 11276181 11276308 TCCTTGACCCAAACATGGAAGAG CCTTCATGATACCAGCTGGTTGA MTOR
chr1 11288708 11288836 TGCAGAGGAGAAAGAGAAGGATTG ATCATCACACCATGGTTGTCCA MTOR
chr1 11288826 11288935 AGTCCCAGGGACTTGAAGATGA GAAATGCTGGTCAACATGGGAAA MTOR
chr1 11288925 11289005 GACACAGCTGGGTAGAACTCAT AAGTCACGTCATGCCATGTGTA MTOR
chr1 11290958 11291085 TGTGAGTGAGAACTTGGCAAGT A I CCG I G I G I I AGGGC I 1 1 I AGG MTOR
chr1 11291075 11291145 GCCAATGTTCACTTTGTGCTTGTAAG GTGATACTTGTTGTAATCATGTTGTGATGA MTOR
chr1 11291281 11291405 CTGACTGACACTGTCAGAGCAT GAAGTACCCTACTTTGCTTGAGGT MTOR
chr1 11291395 11291512 CTGGTTCTGCTCAGTCTTCAGAA TTTTCTCTCGTACTGGCTCATTGAAT MTOR
chr1 11292478 11292593 GGTCTGTCTTGCTCAATCAGGA CC I 1 I C I GCCACCC I C I 1 1 1 1 CA MTOR
chr1 11293441 11293560 CCCTATCTCCGCTATGGAAAAAGT AGTAGAGTCTAATCTGCATTCATTTTCTCA MTOR
chr1 11294174 11294298 GCTGATGACAACGCACAGAGAA TTTTGACAGAGTTGGAGCACAGT MTOR
chr1 11294290 11294385 GGGCACTCTGCTCTTTGATTCT G l I CCC I GCCAC I 1 1 1 I C I G I I G MTOR
chr1 11297916 11297999 CGGTTACCTGGATGAGCATCTTG TGTTTGTGGCTCTGAATGACCA MTOR
chr1 11298007 11298110 CGGATCTCAAACACCTGGTCAT TGTTCAAAGGCTCCATTGCTCT MTOR
chr1 11298396 11298521 CTCTAGAGAGGCCATGTAATGAAACTG CATGCTCATGTGGTTAGCCAGA MTOR
chr1 11298504 11298606 GAGCAGTTTGCTAAGCACATCTG TGAACAGTGAGCACAAGGAGATC MTOR
chr1 11298639 11298739 TGTGCTCACTGTTCAGGAAATGAT CCTCAAAATGGAAAATTCAGCTAGCA MTOR
chr1 11300264 11300394 GAATCAGCTACCCTGTCTTTAGCA GATGTGGGCAGCATCACTCT MTOR
chr1 11300381 11300494 CTTACCTTCAAATTCAAAGCTGCCAA TCCCTGGTCCTTATGCACAAAC MTOR
chr1 11300524 11300614 CATAAGGACCAGGGACAGCATT CAGCCCTTATGTGACTTGTTTCC MTOR
chr1 11301515 11301641 GCTATCCTGTCCATCCAGTTGAA GCATCCAGCAGGATATCAAGGAG MTOR
chr1 11301617 11301743 GGTTTCTGACACCCACCTTAGTC CTCAACTAGCCACTCTCCATTTCTC MTOR
chr1 11303113 11303237 GTAAGCTCCGTGGATCTGAAATAGA GTGGCTGTGAGGTCTGAGTTTA MTOR
chr1 11303234 11303319 GTCCAGCACGCGAGGCAAATAG TCCAAGATACCATGAACCATGTCCTA MTOR
chr1 11303309 11303397 CGCTGTACGTTCCTTCTCCTTC CCTTTGACTGTTGATATGTAGACCCTAAC MTOR
chr1 11307673 11307799 GATGGGTAATGATGTCTTCCATGGA CCCTGCCAACCCTTTATCCTTC MTOR
chr1 11307843 11307956 GGAAAAGTGAGGTGTGGAGCTT CACCAAGGCCTCATGGGATTT MTOR
chr1 11307938 11308058 TCTCCACCAGGGTGGACTTAG GCTTCGGAACAAAACCTCGT MTOR
chr1 11308048 11308122 TACAGCCTGGAAACTGGTGAAG CGTCTGAGAGAAGAAATGGAAGAAATCAC MTOR
chr1 11308112 11308199 TTTGCAGTACTTGTCGTGTACCA TAGAATCCACAGTGCCCAGTTTG MTOR
chr1 11313860 11313953 CCTCGCTCACAGAATGGTACAC GCATGAATCGGGATGATCGGAT MTOR
chr1 11313943 11314035 GACCAGCTCGTTAAGGATCAACA ATGAAGGCACCCTGTCTCTCTA MTOR
chr1 11315962 11316091 CCAAAACGTGATTCTTGCTCCAT GTGCCTGTCTGATTCTCACAAC MTOR
chr1 11316062 11316188 CATCTCCTTACCCTGTACCACTGA CTTCTTCTTCCAGCAAGTGCAAC MTOR
chr1 11316177 11316256 GTCCCACACGGCCACAAAAATGT CAA I GAGAG I I C I CGG I 1 I GC I 1 1 1 1 MTOR
chr1 11316922 11317043 1 1 1 1 1 AAGGAA 1 GAGCC 1 CAGAAGGA TGAGTACGTGGAATTTGAGGTGAAG MTOR
chr1 11317000 11317110 GTAGACACTCACAGCTGCATGT GTTGTCATGGAAATGGCATCCAA MTOR
chr1 11317084 11317193 CGTACTCAGCGGTAAAAGTGTCC TCATAGGAGTGGAAGGTGGGAAT MTOR
chr1 11318483 11318575 TAGTATCCAGTAAGTGGCAGACACA CA I 1 1 1 I GAAI I GG I 1 I CCAGC I CAGAI MTOR
chr1 11318565 11318677 TATGGCCAAGATGCCACCTTTC CCACCACCACAGTTAGAGAATTATTAGAAA MTOR
chr1 11319280 11319407 ATTTAGCCCACACATCCCACAAT GCCACCACATCTAGCAATGTGAG MTOR
chr1 11319384 11319463 TTTCCTCATTCCGGCTCTTTAGG CTCTAAAGAACCTCAGGGCAAGATG MTOR
chr1 115252147 115252263 TGGATCACATCTCTACCAGAGTTAATCA TGATTTGCCAACAAGGACAGTTG NRAS
chr1 115256463 115256578 ATCCGCAAATGACTTGCTATTATTGATG CCCAGGATTCTTACAGAAAACAAGTG NRAS
chr1 115258687 115258801 CCTCACCTCTATGGTGGGATCA CCAATTAACCCTGATTACTGGTTTCCAA NRAS
chr1 162688846 162688943 CAGACTCCAGTTCCAACACCAT GGTCTTCTAGCATATAGCTGAGTAATAGC DDR2
chr1 162722876 162722996 IG I CC I C I C I 1 1 I U U 1 IGG I 1 I C I C CAGAACATGGGCTTTCTTGATGTAAC DDR2
chr1 162724362 162724481 GCTTGCCTGTGAACCAGTAAAC GTGCAAGTCAATCTGCAGAAACT DDR2
chr1 162724471 162724574 CTGAGATTCCAGTGGAACCTGA CCCGACTGTAATTGATCTTGTACATGG DDR2
chr1 162724528 162724650 CACCCTCCATTTTATCACTCTGGT ATCCAGGATCTGGATGTCTCTTCTT DDR2
chr1 162724923 162725042 GTGAGCATGATTTAATACCACCTCTTCA CACTCTCATACACACATTCATGGAGT DDR2
chr1 162725032 162725126 CCAGATTTGTCCGGTTCATTCC C I I CAI 1 I CCCAGAAG I IGGAG I 1 1 1 IAA I DDR2
chr1 162725378 162725461 GGACACGCTGTGCAAGCTTATA CAGCTGGAGCATTGTAAGACAC DDR2
chr1 162725451 162725567 CTCACTTGGCTGTGTTTCCTTTG CCAATTTCCCATTCTAATAAAGTTTCCCA DDR2
chr1 162729529 162729636 CGGGTAAAAGCTCTTCCACGAA GGTATTCATGGGTCTGGGTGAAA DDR2
chr1 162729626 162729736 CCAATTGACCGATGGTGTGTCT TAGTGAAATTCCTGATGCGGTCAAA DDR2
chr1 162729726 162729805 GCCACCAATGGCTACATTGAGA CTCTACCACACTATGATTTGTGCCT DDR2
chr1 162730914 162731031 AGTTCAGAATAGCTCGAATCAGGGTA TAGCACTGTACCTCCTTAAAGATCTTCA DDR2
chr1 162731021 162731140 AGGTCCACTGCAACAACATGTT GTGGAGAGGCACCGTGACAAAC DDR2
chr1 162731130 162731205 CCTTGTCCTGGATGACGTCAAC TGATCTCACTGAACATCATCCAGGTAT DDR2
chr1 162731195 162731278 GCCAGTGCCATCAAGTGTCAATA TGCATTCCTCCTTCATCACCTTG DDR2
chr1 162735755 162735873 CAGAGAAAACACTAGCTGTCTGTCT CCCAGTAAGTCCCTTCAAATTAATTTGTG DDR2
chr1 162736948 162737075 CAGGGTCTACCTCCATGTTTCC GGCCAGGAGGATAAAGATGATGG DDR2
chr1 162740024 162740131 CATTTGAGTGGGAGAGCTGAGT GCTAGAATCACTTGGCAGGGAAA DDR2
chr1 162740121 162740249 CGGAGGATGCTGGATGATGAAAT GGAG I 1 1 I CG IA I CAGCC IGGAI DDR2
chr1 162740226 162740328 TCGACTTACGATCGCATCTTTCC CTTGCTGAGGTTTCTCCCTTGA DDR2
chr1 162741774 162741901 TCCTTCCTGAAGAGATCTCCAAAGA GTATGTGTTGCCTCCTGTCACT DDR2
chr1 162741970 162742042 CCTGCTCTCAGGAAAAGATGTGG CTTCGAGCTATAAGGGAATCAAAGAATCAA DDR2
chr1 162743184 162743300 GCCTGAGTTGTAAGAGTTAGTTTATCTCC GCACTGACATCTAGGGCAAAATCTTT DDR2
chr1 162745431 162745551 GGCAACTTCTGAGTTTATCTATGTCTGT CTGATTGAGATCTCCATTCTCCATGT DDR2
chr1 162745541 162745663 TGTATCACTGATGACCCTCTCTGTAT CCAGGTTACTCTCATGACCACA DDR2
chr1 162745820 162745942 CAAGGGATTCATCAAGAGTAGAGAAAGA CAGAGGCAATTTGGGTAGCCATAA DDR2
chr1 162745932 162746052 CTGTCTTCTTGTCTATTTCCTCAGTTACAC CCTGCTCATTCCAAAGTCAGCTAT DDR2
chr1 162746042 162746166 ACGAAACTGTTTAGTGGGTAAGAACT AGACAGGGCTTTAAAATGCTGAGA DDR2
chr1 162748329 162748449 GAAATTTAACAGGGTGTTGTTGTGCA CTGTTCATCTGACAGCTGGGAATAG DDR2
chr1 162748439 162748533 TTGTGGGAGACTTTCACCTTTTGT ACAGGTCCACATCCATTCATCC DDR2
chr1 162749869 162749988 CCCGTCTTTGTAATATTCTCTCTCTCTCT GCAGAAGGTGGAI M CI 1 GGAATGA DDR2
chr1 162749978 162750075 AGCTGCTGGAGAAGAGATACGA GTGAGTGGTAGGTCTTGTAGGGA DDR2
chr10 43609076 43609165 GGCTATGGCACCTGCAACT GATGTGCTGTTGAGACCTCTGT RET
chr10 43609877 43610007 CATACGCAGCCTGTACCCA TAGCAGTGGATGCAGAAGGC RET
chr10 43615535 43615633 IGC IAI 1 1 1 1 CC 1 ACAG 1 G 1 1 ACCTGGCTCCTCTTCACGTA RET
chr10 43617361 43617465 CCCTCCTTCCTAGAGAGTTAGAGT CAAGAGAGCAACACCCACATA RET
chr10 89624162 89624287 GCCATTTCCATCCTGCAGAAGA ATGGATACAGGTCAAGTCTAAGTCGA PTEN
chr10 89685251 89685367 CI CA I 1 1 1 IG I 1 AAIGG 1 GGCI 1 1 1 I G I CTCACTCTAACAAGCAGATAACTTTCACTT PTEN
chr10 89692775 89692895 IGAGG I IAI C I 1 1 1 IACCACAG I IGCA ATATCATTACACCAGTTCGTCCCTTTC PTEN
chr10 89692861 89692983 GATCTTGACCAATGGCTAAGTGAAGA CC I 1 1 1 IG I C I C IGG I CC I IACI I CC PTEN
chr10 89711802 89711921 TGGCTACGACCCAGTTACCA GGTCTATAATCCAGATGATTCTTTAACAGG PTEN
chr10 89717568 89717676 AI CG I 1 1 1 IGACAG I 1 IGACAG I 1 AAAGG GAACTCAAAGTACATGAACTTGTCTTCC PTEN
chr10 89717623 89717748 AACCATGCAGATCCTCAGTTTGT 1 1 1 1 IAGCA I C I IG I I CI G I 1 I G I GG PTEN
chr10 89720803 89720913 GTGCAGATAATGACAAGGAATATCTAGTACT ACTCCTAGAATTAAACACACATCACATACATAC PTEN
chr10 123257971 123258092 GCCATTTCTAAAATGATTCAATCAAACTGC TGCCACAGAGAAAGACCTTTCTG FGFR2
chr10 123274733 123274857 TGTTCTTCATTCGGCACAGGAT CTAACTCTATGGCCTGCTTATCTGTTC FGFR2
chr10 123279419 123279544 ATCCTCTCTCAACTCCAACAGGA AGTGGATCAAGCACGTGGAAAA FGFR2
chr10 123279641 123279764 CG ACCACTGTGG AGG CATTT C I 1 1 1 C 1 GGCA 1 GAGG 1 CAC 1 GA FGFR2
chr11 533782 533882 GTTCACCTGTACTGGTGGATGT GCCTGTTGGACATCCTGGATAC HRAS
chr11 534220 534306 CGCCAGGCTCACCTCTATAG AGGAGCGATGACGGAATATAAGC HRAS
chr12 25378561 25378663 ATTTCAGTGTTACTTACCTGTCTTGTCTTT AGTTAAGGACTCTGAAGATGTACCTATGG KRAS
chr12 25380260 25380337 TCCTCATGTACTGGTCCCTCAT GTTTCTCCCTTCTCAGGATTCCTAC KRAS
chr12 25398189 25398310 TGGTCCTGCACCAGTAATATGC ATTATAAGGCCTGCTGAAAATGACTGA KRAS
chr14 105241352 105241484 CCACCTCGTCCTGTAAAGCAG GCACTTTCGGCAAGGTGATC AKT1
chr14 105246444 105246557 GCGCCACAGAGAAGTTGTT TCTCACCACCCGCACGT AKT1
chr15 66679646 66679755 GAGGAAGCGAGAGGTGCTG CCCATACTTACTCCGCAGAGC MAP2K1
chr15 66679747 66679845 CCGACGGCTCTGCAGTTA AGACGTACGTGACAAACCAGATC MAP2K1
chr15 66727308 66727432 GAGCTTTCTTTCCATGATAGGAGTACT CTTCTGGGTAAGAAAGGCCTCAA MAP2K1
chr15 66727419 66727527 AGGAGCTAGAGCTTGATGAGCA GCTTGTGGGAGACCTTGAACAC MAP2K1
chr15 66727541 66727651 CGGTGTGGTGTTCAAGGTCT GCACAAAAAGGTATTGAGTAAAATCAGTCT MAP2K1
chr15 66728983 66729091 TGTTTGTAACAACTTAACCTGTTTCTCCT GATTGCGGGTTTGATCTCCAGA MAP2K1
chr15 66729087 66729211 C 1 1 1 LI 1 CCACCTTTCTCCAGCTA CATACCATGTGCTCCATGCAGATA MAP2K1
chr15 66729201 66729286 CTATGGTGCGTTCTACAGCGAT CCTCCCAGACCAAAGATTAGGC MAP2K1
chr15 66735605 66735722 GAACATTGTCACTAACTGGTCTGGTA CCGACTCATTAACTTATCAATGAGGAACTT MAP2K1
chr15 66736897 66737023 AAAGGAGGAAGGCAAATTTGTGATG CCTCTGTGCATGATCTTGTGCTT MAP2K1
chr15 66737013 66737080 TCCTCTAGGTAATAAAAGGCCTGACAT GGAAGCAACAGCCTTTGGATTATATCTAAA MAP2K1
chr15 66774071 66774199 TACCTGTGTCAGTTCCCTCCTT TCATACCGACATGTAGGACCTTGT MAP2K1
chr15 66774099 66774222 CTATTTTCTCTTCCCTGCAGATGTCA GCTCAAGCAATGGAAACTTCTGTT MAP2K1
chr15 66777315 66777438 CAAGTTAGGTTAGGTGATTATCACTGT CAAACATCAGCTCCAGCTCCTT MAP2K1
chr15 66777427 66777548 TTGGGAGGTATCCCATCCCT CCTCCAACAGTCCAAGATGGG MAP2K1
chr15 66779529 66779647 CCCTTGCCTCATATTAACAAGTAATCTGT AGCAAGTAAATTCCAAGGTGAAGGA MAP2K1
chr15 66781535 66781633 CAAGGAGCCAGGCA 1 1 1 1 1 L 1 1 AT CAAATCCAGAGTATACGCTTCCAGA MAP2K1
chr15 66782015 66782132 GCAAACATGTTATTTGAGCTAGAACCA CTGACAGAGAAGAACAAATGAATAAACAGG MAP2K1
chr15 66782829 66782961 ACACCACGTCCTCTCGTTTC GGACTCGCTCTTTGTTGCTTC MAP2K1
chr17 7573862 7573991 CATGAAGGCAGGATGAGAATGGA CGAGATGTTCCGAGAGCTGAAT TP53
chr17 7573995 7574070 CCTTGAGTTCCAAGGCCTCATT L I I GAALLA I L I 1 1 1 AAL 1 LAGG 1 AL IG 1 TP53
chr17 7576947 7577070 GTGCTAGGAAAGAGGCAAGGAAA GCGCACAGAGGAAGAGAATCTC TP53
chr17 7577032 7577156 CTTG CTTACCTCG CTTAGTG CT I L I I GLI 1 C I C I 1 1 1 LL 1 A 1 LLI GAG 1 A TP53
chr17 7577514 7577595 CCTGACCTGGAGTCTTCCAGT ATCTCCTAGGTTGGCTCTGACT TP53
chr17 7577519 7577608 ACCTGGAGTCTTCCAGTGTGAT CTTGGGCCTGTGTTATCTCCTAG TP53
chr17 7578143 7578233 GCCACTGACAACCACCCTTAAC GAAGGAAATTTGCGTGTGGAGTAT TP53
chr17 7578209 7578293 TCATAGGGCACCACCACACTAT CCTCTGATTCCTCACTGATTGCTC TP53
chr17 7578304 7578422 GGCCAGACCTAAGAGCAATCAG TCTACAAGCAGTCACAGCACATG TP53
chr17 7578380 7578511 CAGCTGCTCACCATCGCTAT TTTTGCCAACTGGCCAAGA TP53
chr17 7579350 7579483 GGCTGTCCCAGAATGCAAGA GAAGCTCCCAGAATGCCAGA TP53
chr17 7579840 7579960 CCTTCCAATGGATCCACTCACA GTTGGAAGTGTCTCATGCTGGAT TP53
chr17 37880147 37880244 CCCACGCTCTTCTCACTCATAT GGGCTTACGTCTAAGATTTCTTTGTTG ERBB2
chr17 37880961 37881060 GTGGTCTCCCATACCCTCTCA CCATAGGGCATAAGCTGTGTCA ERBB2
chr17 37881325 37881455 GGATGAGCTACCTGGAGGATGT TCCTTGGTCCTTCACCTAACCT ERBB2
chr19 3118876 3118993 CGTCCTGGGATTGCAGATTGG CTGAGGGCGACGAGAAACATG GNA11
chr19 4090646 4090739 GGTTCAGCCGCAGGGTTT CCGTCCTACTGTGACTCCAG MAP2K2
chr19 4094446 4094533 CAGAACCCGCTGGCATCA GACTGCCCTGTCTTGTCTCC MAP2K2
chr19 4095347 4095447 CGGAAGGAGTGGCACATCTG ACCCTCTGTTCTCCTCCACAG MAP2K2
chr19 4095411 4095521 ACCATTTATTGACAAACTCCTGGAAGT GTGACCGGACAGGACAGTGA MAP2K2
MAP2K chrl9 4097194 4097326 ACTCTTGCTCTGGTCAGAGAGA TGCAGGTCACGGGATGGATA 2
MAP2K chrl9 4097316 4097449 CCAGGAGTTCAAAGATGGCCAT CCGCCTCCAATTTAGGCTG 2
MAP2K chrl9 4099182 4099279 GCGTCCAGACCGGAAGTT AAGAGCTGGAGGCCATCTTTG 2
MAP2K chrl9 4099260 4099386 GCTGTGAGGCTCTCCTTCTTC CGGTTGCAGGGCACACATTA 2
MAP2K chrl9 4099378 4099495 CCATGCTCCAGATGTCCGAC CATTAGCCATGGAGAGGGTGAC 2
MAP2K chrl9 4100886 4101015 AGCTGCGCAGGAGAACTG G CG CTCCTAC ATG G CTG 2
MAP2K chrl9 4101028 4101124 GGGACTCACAGCCATGTAGG CCAGATGTGAAGCCCTCCAAC 2
MAP2K chrl9 4101104 4101221 CCGAAGTCACACAGCTTGATCTC CCAGATCATGCACCGAGGTAAG 2
MAP2K chrl9 4101238 4101346 GGCCTTACCTCGGTGCATG CTCTGACTGCTCAGCTCTGAC 2
MAP2K chrl9 4102330 4102463 CGTGTGGAAGAGGTCCGT AGCCTGCACTCACTCCTTG 2
MAP2K chrl9 4110387 4110512 CATCGACTGCCTTGAGAAGGAT GGGAGATCAGCATTTGCATGGAA 2
MAP2K chrl9 4110512 4110639 CGGACGCACTCACCATGT TTGCAGCTGATCCACCTTGA 2
MAP2K chrl9 4110586 4110693 CGATGTACGGCGAGTTGCAT CCAAATGCCTTCATCCC 1 1 1 1 C 2
MAP2K chrl9 4117321 4117442 GGACAGAGCCTGGAGCTAATC GTCACCAAAGTCCAGCACAGAC 2
MAP2K chrl9 4117432 4117562 CTCACCTTCCTGGCCATGAT CTTGACGAGCAGCAGAAGAAGC 2
MAP2K chrl9 4117552 4117651 CCTTGGCTTTCTGGGTGAGAAA ATGGAGTCTCCCTAGGTAGCTAAC 2 chr2 29432565 29432669 AGTTGACAGGGTACCAGGAGAT CCAAGATTGGAGACTTCGGGAT ALK chr2 29432659 29432776 AGGCAGTCTTTACTCACCTGTAGA CGTTGTACACTCATCTTCCTAGGGAT ALK chr2 29436753 29436870 GCTCTGGAGGGAGACCTAGTAT CCTGTGGCTGTCAGTATTTGGA ALK chr2 29436861 29436982 CTTTGACTCACCGGTGGATGAA GATTTCCCTCCTCTCACTGACAAG ALK chr2 29443489 29443620 GGTTCCATCGAGGAACTTGCT GTTCATCCTGCTGGAGCTCAT ALK chr2 29443608 29443710 TCTCTCGGAGGAAGGACTTGAG CTCAGTTAATTTTGGTTACATCCCTCTCT ALK chr2 29443700 29443771 CGAACAATGTTCTGGTGGTTGAATT GCTGCTGAAAATGTAACTTTGTATCCTG ALK chr2 29445140 29445223 ACGATTTCCCTTGGAGATATCGATCT GACGAACTGGATTTCCTCATGGAA ALK chr2 29445213 29445318 GTGTCTCTCTGTGGCTTTACCTG CTCTGTAGG CTG C AGTTCTC AG ALK
CCTTTGCTTTAGATTGGCAATTATTACT
chr20 57484360 57484474 GT AAAAAGGTAACAGTTGGCTTACTGGA GNAS chr20 57484535 57484659 AGAGATCATGGTTTCTTGACATTCACC ACAGAAGCAAAGCGTTCTTTACG GNAS
CATTTCCAATCTACTAATGCTAATACTG CTNNB
chr3 41266008 41266086 TTTCG TGGATTCCAGAGTCCAGGTAAGAC 1
CTNNB
chr3 41266075 41266201 ACAGAAAAGCGGCTGTTAGTCA TGAAGGACTGAGAAAATCCCTGTTC 1 chr3 178916742 178916857 GCCTCCGTGAGGCTACATTAAT TCACAAAGTCGTCTTGTTTCATCAAAAA PIK3CA chr3 178916808 178916905 CCCTCCATCAACTTCTTCAAGATGA GGTTGCCTACTGGTTCAATTACTTTTAAAA PIK3CA chr3 178921451 178921570 ATGCCATCTTATTCCAGACGCAT TAAGCATCAGCATTTGACTTTACCTTATCA PIK3CA chr3 178927401 178927517 TTTACATAGGTGGAATGAATGGCTGAA AGCGG 1 A 1 AA 1 AGGAG 1 1 1 1 1 AAAGG 1 AA PIK3CA
GGGAAGAAAAGTGTTTTGAAATGTGTT
chr3 178927934 178928016 T CA I 1 1 1 1 CCAGATACTAGAGTGTCTGTGT PIK3CA
ACACAGACACTCTAGTATCTGGAAAAA
chr3 178928046 178928161 TG CATAAGAGAGAAGGTTTGACTGCCAT PIK3CA
ATTTTACAGAGTAACAGACTAG AGA AAAGAAAAAGAAACAGAGAATCTCCATTTTAG
chr3 178936020 178936128 GACA C PIK3CA
GATCTGAGATGCACAATAAAACAGTTA
chr3 178938802 178938919 GC C 1 1 1 1 G 1 G 1 1 1 CATCCTTCTTCTCCT PIK3CA
TGGCCTGAATCACTATATTTCCATACTA
chr3 178947769 178947887 a TACTTGTCCATCGTCTTTCACCATG PIK3CA chr3 178951974 178952095 TCAATGATGCTTGGCTCTGGAA GAAGA 1 CCAA 1 CCA 1 1 1 1 I G I I G I CCA PIK3CA
TTCATGAAACAAATGAATGATGCACAT
chr3 178952089 178952205 CA GGTCTTTGCCTGCTGAGAGTTA PIK3CA chr4 1803554 1803640 CCCTGAGCGTCATCTGC TCACTGTACACCTTGCAGTGG FGFR3 chr4 1806069 1806205 TTTGCAGCCGAGGAGGAGC AGATCTTGTGCACGGTGGG FGFR3 chr4 1807831 1807932 CTGGTGACCGAGGACAACGT GGCGTCCTACTGGCATGA FGFR3 chr4 1808937 1809063 GGGACGACTCCGTGTTTG CTCCATCTGCACTGAGTCTCATG FGFR3
PDGFR
chr4 55140978 55141102 GGTGCACTGGGACTTTGGTAAT CATCTCTTGGAAACTCCCATCTTGAG A
PDGFR
chr4 55144088 55144208 CGGCCAGATCCAGTGAAAAACA AACAG 1 1 1 1 CACAACCACATGTGTC A
PDGFR
chr4 55144502 55144621 GGL 1 1 1 1 CTGTTCTTCATTTTCATACCC GATATCCAGCTCTTTCTTTGGCTTCT A
PDGFR
chr4 55151978 55152104 G CTATTC AG CTAC AG ATGG CTTG A TGCCTTTCGACACATAGTTCGAAT A
PDGFR
chr4 55152064 55152162 CTCCTGGCACAAGGAAAAATTGT CTGACTTTAGAGATTAAAGTGAAGGAGGAT A chr4 55561699 55561822 GGCTCTTCTCAACCATCTGTGA CATTCTGCTTATTCTCATTCGTTTCATCC KIT
TTCTATAGATTCTAGTGCATTCAAGCAC
chr4 55592134 55592251 AA CATGACTGATATGGTAGACAGAGCCTA KIT chr4 55593388 55593508 CCACATTTCTCTTCCATTGTAGAGCA AGTCATTAGAGCACTCTGGAGAGA KIT chr4 55593527 55593645 TTTGTTCTCTCTCCAGAGTGCTCTA GATCATAAGGAAGTTGTGTTGGGTCTA KIT chr4 55593630 55593750 GGAAGGTTGTTGAGGAGATAAATGGA TGGAGTTCCTTAAAGTCACTGTTATGTG KIT
CTAAAATGCATGTTTCCAATTTTAGCGA
chr4 55594181 55594296 G AGACAATAAAAGGCAGCTTGGACA KIT
GGG I A I 1 1 1 1 A 1 GGGAGGCAGAA 1 1 AA
chr4 55595458 55595572 TCT CATGATCTTCCTGCTTTGAACAAATAAATG KIT chr4 55597439 55597559 GGAGGTAGAGCATGACCCATGA CCTATTCTCACAGATCTCCTTTTGTCG KIT chr4 55599266 55599387 CAGAGACTTGGCAGCCAGAAAT ATCGAAAGTTGAAACTAAAAATCaTTGCA KIT chr4 55602678 55602798 TTCTATTACAGGCTCGACTACCTGT CAAGAAGATGCTCTGAGTCTAATGAAGTT KIT chr6 117630010 117630122 GTCTCCCTCCTGTTTGCACATA ACTACTGTAAACCTGGTGTTTGTAATAAGT R0S1
GGCATTTGCATTATGAAACCAATATTAT
chr6 117638291 117638405 GG CAAATTTAATCATCCCAACATTCTGAAGCA ROS1 chr6 152419790 152419918 GTGTCTTTGGAGTTCCTCTTCCTT CTCCAGCAGCAGGTCATAGAG ESR1 chr6 152419908 152420028 ATCTGTACAGCATGAAGTGCAAGA TGCAAGGAATGCGATGAAGTAGAG ESR1 chr7 55211031 55211131 GGTGGCTGGTTATGTCCTCATT CTTCAGTCCGGTTTTATTTGCATCA EGFR chr7 55221796 55221914 CCACGTACCAGATGGATGTGAA GACTCTCCAAGATGGGATACTCCA EGFR chr7 55233005 55233131 ACTGTATCCAGTGTGCCCACTA CTGTTCTCCTTCACTTTCCACTCA EGFR chr7 55241601 55241731 GGTGACCCTTGTCTCTGTGTTC TGTGCCAGGGACCTTACCTTAT EGFR chr7 55242353 55242482 TGGTAACATCCACCCAGATCACT TTCCTTGTTGGCTTTCGGAGAT EGFR chr7 55242440 55242563 CTCTGGATCCCAGAAGGTGAGA CTGCCAGACATGAGAAAAGGTG EGFR chr7 55248957 55249064 TCTGGCCACCATGCGAAGC GGCATGAGCTGCGTGATGA EGFR chr7 55249005 55249139 GAAGCCTACGTGATGGCCA CACACACCAGTTGAGCAGGT EGFR chr7 55249121 55249200 GGACTATGTCCGGGAACACAAA TGAGGATCCTGGCTCCTTATCTC EGFR chr7 55259482 55259582 GCCAGGAACGTACTGGTGAAAA CAGGAAAATGCTGGCTGACCTA EGFR chr7 116339603 116339683 CCCACAATCATACTGCTGACATACA GGTCCTTTACAGATGAAAGGACTTTG MET chr7 116340199 116340323 GCACAAAGCAAGCCAGATTCTG TGATGTGACTTACCCTATTAAAGCAGTG MET chr7 116403179 116403279 GCAACAGCTGAATCTGCAACTC TTCATTGCCCATTGAGATCATCACT MET
ATGTAGTCCATAAAACCCATGAGTTCT
chr7 116411830 116411952 G GGGCACTTACAAGCCTATCCAA MET chr7 116411950 116412069 CGATGCAAGAGTACACACTCCT ACAACCCACTGAGGTATATGTATAGGTATT MET chr7 116417439 116417558 AG 1 G 1 AACCAAG 1 I I M U M I GC AAGCTATTTATTAGGTTGCAAACCACAA MET chr7 116423367 116423490 TGTCCTTTCTGTAGGCTGGATGA C I GAC I I GG I GG I AAAC I 1 1 I GAG I 1 I G MET chr7 128845060 128845188 GCCAGAATGAGGTGCAGAACA CGATGTAGCTGTGCATGTCCT SMO chr7 128846007 128846135 GCACAGCTCCAATGAGACTCTG GCAGGTGGAAGTAGGAGGTCTT SMO chr7 128846285 128846415 GCCAGTAACCCACCTTCTGT TCACTCACCTCGGATGAGGAA SMO chr7 128850319 128850446 CAACCTGTTTGCCATGTTTGGA GGACCCGACAAAACCTAAAGATGG SMO chr7 128851500 128851620 GGCTTGGCCTTTGACCTCAAT CACCTTCCTCCAGAAGCTTGAA SMO chr7 140453098 140453215 CCATCCACAAAATGGATCCAGACA GCTCTGATAGGAAAATGAGATCTACTGTTT BRAF chr7 140481392 140481512 CATACTTACCATGCCACTTTCCCTT C I 1 1 1 I C I G I 1 I GGC I I GAC I I GAC I 1 BRAF chr8 38282171 38282294 1 ACCCAGGGCCA 1 G 1 1 1 1 CCATCACTGGGAAAGCCAAGT FGFR1 chr8 38285880 38285998 TCTCACGCATACGGTTTGGTTT CTGCACTAGCCTTGGTGAAATCT FGFR1 chr9 80409382 80409502 ATTGCCTGTCTAAAGAACACTTACCTC TGAGTATTGTTAACCTTGCAGAATGGT GNAQ chr9 98209150 98209246 GGAGAACCTTGTCCTCCTCTTTG GGGATTCGAAGGTGGAAGTCATTG PTCH1 chr9 98209236 98209321 GCCTCTCCTCGCATTCCAC AGGCTACCCTGAGACTGACC PTCH1 chr9 98209302 98209431 GACGTGGAAAGGCACGTG GCTACTGCCAGCCCATCAC PTCH1 chr9 98209432 98209567 GCAGAAGCCGTCACAGTG CGCGCAGAGACG 1 I M G PTCH1 chr9 98209561 98209694 GGCCAGAATGCCCTTCAGTA GCATCACCCACCCTCGAAC PTCH1 chr9 98209618 98209743 TGGCCACAAGCCTTCTCTG ACCCTTCTAACCCACCCTCA PTCH1 chr9 98211246 98211372 TCTGTACCTTATCTCTGCATCCCAT TGATCGTGGAAGCCACAGAAAA PTCH1 chr9 98211366 98211496 CTTACAGTGGAGTGGGCGAA CGCACAGCGGGTCTGATT PTCH1 chr9 98211491 98211614 CGTCTGGGAACTATACTCCGAGT GACAAGAACA 1 1 1 1 AACA 1 GGAA 1 CC PTCH1 chr9 98212029 98212157 TGTTTACTGAAGAACCACCAGCAA CICAAIGGGCIGGI 11 IGCI IC PTCH1 chr9 98212147 98212227 ACCTCAGGATATGGTCCAAAGAAAGA GTGTTCCCGTTTCCTCTTGATCT PTCH1 chr9 98215734 98215835 GGCCCAATCACAATGATTTCTAAAAC 
CCTGGAGCACATGTTTGCAC PTCH1 chr9 98215783 98215907 CCTGACAATGAAGTCGAACTCAGA ACCTGATCTTGTGAACATCCTCATTG PTCH1
11111 CACAAA 1111 I U I AAAI I
chr9 98218550 98218660 C CTGATGACGGTCGAGCTGTT PTCH1 chr9 98218647 98218772 GCACTGAGCTTGATTCCGATGA GAT GAACCGAGGACACCTTAG PTCH1 chr9 98220270 98220396 GCTATGCTGAAAGGAATTTGACTTCC CCTCTTCTGGGAGCAGTACATC PTCH1 chr9 98220376 98220496 CAACACCACGCTGATGAACAG GGACACCTCAGACTTTGTGGA PTCH1 chr9 98220486 98220609 TTGCTGCAGATGGTCCTTACTT GAAACTGTGATGCTCTTCTACCCT PTCH1 chr9 98221822 98221902 GGATGAAGGCTGTTGCTGAGTT GTCCACGACAAAGCCGACTACA PTCH1 chr9 98221895 98222021 CG CTACTTACTTCTC AG CCTTGT GGATGCAGATGGCATCATTAATCC PTCH1 chr9 98222013 98222109 CCAAGCCGTCAGGTAGATGTAG TCTCAAGGCAGAAGTGTGTTTACC PTCH1 chr9 98224044 98224172 AGTGCCTTAGGTCTCCAGAGA GCCTACAAACTCCTGGTGCAAA PTCH1 chr9 98224162 98224260 GGCTGATGTCGATGGGCTTATC TTCTAGGACTTCAGGATGCATTTGAC PTCH1 chr9 98224250 98224376 GTTTGGCATGATTTTCCCGGTT TGTGTTTCTGATGGACGTCTAAGAG PTCH1
TCTAACGCTCTCATAATCATGACAAAG
chr9 98229337 98229430 G GGAAGAAAACAAACAGCTTCCCAAAA PTCH1 chr9 98229420 98229499 CCCTGAAGCCAGTCTCTGAAGTA CTACCCGAATATCCAGCACTTACTTT PTCH1 chr9 98229489 98229613 GACATACTTCACGTTACTGAAACTCCT GCTGGACCTTACGGACATTGTAC PTCH1
G AATTGTG C AG CA ATAA AGTC AT ATTC
chr9 98229603 98229711 TCT CTGACCTTGTGCCTCTTCTGTT PTCH1 chr9 98229707 98229777 CCCAGAAAAAGGAAGATCACCACTA GTGGTGGTGAAAACAAGGTATTAACTAGA PTCH1
AAAGTAGAAGCAATCTGATGAACTCCA
chr9 98230980 98231070 A GGACACTCTCATL 1111 GCTGAGA PTCH1 chr9 98231060 98231187 GG 1111 111 CAAGAGGAAAGG GTGACACAGGACACCCTCAGCT PTCH1 chr9 98231164 98231249 GAGAGCAGGTCCCTTGTGG ACGCACGTGTACTACACCAC PTCH1 chr9 98231231 98231363 GTGACGGGCTGCACAGAGAT ACACGACAATACCCGCTACAG PTCH1 chr9 98231337 98231446 AATCTGCGTTTCATGGGCAAAG TCAGCATAACATTGCAACATGTTTCC PTCH1 chr9 98232021 98232118 CCCGTTACCCACATTCCTTTATAAGT TATATCGACGCGAGGACAGGAGACT PTCH1
CAGTCTGAAAATGTACCTTGTAAAACA
chr9 98232108 98232226 GC GCATGTTGGTGACCTCTGAAI 1111 PTCH1 chr9 98238353 98238463 GCAGAGCGGGAATTGGGATT ATGTCCAGTG C AG CTCTC AG PTCH1 chr9 98239036 98239147 CCAAAG 1 1 1 ICI 11 IGI 111 IGC GICIAAIGCCACCAICCICIGI 111 PTCH1 chr9 98239721 98239845 ATCCTAGTGGAAAAGGCTGCAA TGTGCTCATTGATCGGAATTTCCTTTA PTCH1
AGATAAATGGCTCCTTTAGTACCTGAG
chr9 98239835 98239958 T CCTATGCCTGTCTAACCATGCT PTCH1 chr9 98239906 98240005 GCCACTGACAGTGCAACCA Al ICIAAIGI ICGGCI 11 IGI ICIGIG PTCH1
AGCAGTCATGGAAAAGTAAAGACTAA
chr9 98240292 98240413 AACA TCAAAAGGTGCTTTCCTTCACCA PTCH1 chr9 98240402 98240479 CAGAGAAGGATTTCAGGATGTCGT CATTTGGGCATTTCGCATTCTGT PTCH1
ATGGTGAAAATGAAGAATTGCATAACC
chr9 98241249 98241365 AG CAAGCAAATGTACGAGCACTTCAAG PTCH1 chr9 98241355 98241443 CGTTCCAGTTGATGTGTGAGACA GAATACTGATGATGTGCCTTCCCTT PTCH1 chr9 98242207 98242332 AGGGAAGTGGCTTTTGAGGAAAG CCTTGTTTTGAATGGTGGATGTCATG PTCH1 chr9 98242322 98242401 CCTCCTGCCAGTGCATATACTT CATGTGACCTGCCTACTAATTCCC PTCH1 chr9 98242647 98242766 GTGTTTTGCTCTCCACCCTTCT CAGCTGGGAGGAAATGCTGAATA PTCH1 chr9 98242758 98242875 CGGTCCATGTAACCATGACCAA TTTTCATGGTCTCGTCTCCTAATTTCTTT PTCH1 chr9 98244209 98244335 GAGTCCTAGAGAAGTCACAGACATCA CCTAACGCATGGCCTCTTCTTT PTCH1 chr9 98244373 98244497 GCGTTAGGTTAAGGCACACTACT ACAG 1 AGAAA 1111 IGICICIGCI 11 PTCH1 chr9 98247944 98248030 CGCCTTACCTGCTGCTCATTAG TGCTAATGTCCTGACCACAGAAG PTCH1 chr9 98268634 98268753 GTG CG CTGG CG AATATCTCTAT TTGTGGGCCTCCTCATATTTGG PTCH1 chr9 98268740 98268812 CICGAGGI ICGCIGCI 11 IAAIC GCGAAGTTTCAGAGACTCTTATTTAAACTG PTCH1 chr9 98268798 98268887 CCAAGAA 1 I CCGCA I 1111 ACTCCTCCCTTCTGCTTCGT PTCH1 chr9 98270321 98270449 GCI 111 1 GAGAGG 1 1 A CCTTCGCTCTGGAGCAGAT PTCH1 chr9 98270510 98270624 GTGCAGATAGTCCCGGTCC ATGGCCTCGGCTGGTAACG PTCH1 chr9 98270646 98270744 TTACCAGCCGAGGCCATGT AGCAGCGTCCTCGCAAG PTCH1 chr9 98278887 98278982 CCTCCGTCTTCTCCCAGTCTT TTCCGTTCCTTTTGTAAAGACGGA PTCH1 chr9 98278969 98279090 1111 ICI ICICCICCGI 11 ICI ICI ICI GAGGGCCATGGAACTGCTT PTCH1 chr9 98279082 98279177 GCGGACTCACAATTACAAGCCT CTTTGaTTCCTTGAGTTTATTGTAAAGGG PTCH1 chrl 11167480 11167597 CTTTCTCACCATGGTTTCAGTTTAGTG CAAGTTATCTGTGTAAACaTTG AAG AAG C MTOR chrl 11168135 11168261 AGATCTGAGCTCTAACTGCCCTT CAAACAAGCGACATCCCATGAAAA MTOR chrl 11168251 11168350 CCCACTCACCAGCCAATATAGC AGAGA 1 AAA 1 1111 IGI ICCICCI MTOR lAIGICCCI 111 AAG 1 AAA ACA 1 ACA
chrl 11169342 11169447 CA TGTTGTATTGCTCCCATTCTTACAGTTATT MTOR
AATGAGAAATTCATGGAACCTTTTCTG
chrl 11169596 11169715 C GGACCACAGTGCCAGAATCTATT MTOR chrl 11169705 11169804 GGTTCCTCAGAGGCTGAACTTA GGCI 111 IGGIGI 1 IGAAI 11 ICIGI IAAI MTOR
[00132]. Sequencing of FFPE samples with known mutations
[00133]. Horizon Diagnostics samples, clinical samples and proficiency test samples with known mutations were used to assess the accuracy of this assay. The criteria for the selection of these samples were i) tumour content >20%; ii) known mutation status of KRAS exon 2 or iii) presence of mutations within the 30 genes. Representative examples of mutations detected by Sanger sequencing are shown in Figure 6. Details of the tumour type tested, mutations known to be present in these tumours, detected MAF using NGS and coverage are summarized in Table 3.
[00134]. Table 3. Samples with known mutations and the mutations detected using the system.
PT KITA KIT p.Y503_F504insAY KIT p.Y503_F504insAY
[00135]. To assess the sensitivity of this assay, 10 HapMap samples were pooled together, followed by library generation and sequencing. The expected MAF for each test base substitution or indel in the pooled sample was calculated from the number of alternate alleles present in the mix constituents and from the mixing ratios (Table 4). In total, 81 SNPs and 3 indels were expected to be identified, with MAF ranging from 5% to 95%. All of these mutations were successfully detected in the pooled HapMap sample, except rs369915085 (MAF=5%) and rs1800863 (MAF=10%) (Supplementary Figure 3). In addition, sensitivity assessment was also performed using FFPE samples. Positive mutations (n=22) obtained from Horizon Diagnostics samples were shown to be consistent with the data provided by Horizon Diagnostics (Figure 7). In total, 8 indels and 25 single nucleotide variations were identified in the 44 cases. GIST samples contained KIT abnormalities: 2 cases with an insertion in exon 9 and 2 cases with a deletion in exon 11. In the 2 tumour samples with EGFR aberrations, a 15-bp deletion in exon 19 was successfully detected and called by the bioinformatics pipeline. This suggests that the sensitivity of this assay is 98.5% (131/133) in detecting single nucleotide variants (SNVs) and indels.
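The expected-MAF calculation and the sensitivity figure described above can be sketched as follows. This is a minimal illustration, assuming diploid samples and mixing ratios expressed as fractions summing to 1; the function names and the example genotype vector are hypothetical and are not taken from the specification.

```python
def expected_maf(alt_allele_counts, mixing_ratios=None):
    """Expected alternate-allele frequency (%) in a pool of diploid samples.

    alt_allele_counts: per-sample count of alternate alleles (0, 1 or 2).
    mixing_ratios: per-sample fraction of the pool; equal mixing if omitted.
    """
    n = len(alt_allele_counts)
    if mixing_ratios is None:
        mixing_ratios = [1.0 / n] * n
    # Each diploid sample contributes (alt alleles / 2) weighted by its share of the pool.
    return 100.0 * sum(w * c / 2 for w, c in zip(mixing_ratios, alt_allele_counts))


def sensitivity_percent(detected, expected):
    """Percentage of expected variants that were detected."""
    return 100.0 * detected / expected


# Hypothetical genotypes: one heterozygous carrier among 10 equally pooled samples.
one_het_in_ten = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
lowest_maf = expected_maf(one_het_in_ten)

# Counts quoted in the text: 131 of 133 expected variants detected.
assay_sensitivity = sensitivity_percent(131, 133)
```

With equal pooling of 10 diploid samples, one heterozygous carrier contributes 1 of 20 alleles, which corresponds to the 5% lower end of the MAF range quoted above.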
[00136]. Table 4. Expected SNPs and indels, with their expected MAF, in the pooled HapMap sample.
Chromosome Position SNP ID Reference Alternate Expected MAF (%)
chr1 11169676 rs2275525 C T 30
chr1 11174850 rs1148477 C T 5
chr1 11174851 rs2275523 G A 15
chr1 11181327 rs11121691 C T 35
chr1 11181457 rs17235633 G T 15
chr1 11182063 rs55951261 G A 5
chr1 11189723 - C T 5
chr1 11190646 rs2275527 G A 35
chr1 11190730 rs17848553 G A 15
chr1 11194591 rs3730381 G A 15
chr1 11199518 rs28730682 T C 15
chr1 11199541 rs17848548 G A 15
chr1 11199608 rs202113962 G A 5
chr1 11205058 rs1057079 C T 25
chr1 11288758 rs1064261 G A 55
chr1 11291512 rs77132482 G A 15
chr1 11298583 rs79942701 G A 15
chr1 11298739 rs12132215 T C 10
chr1 11301714 rs1135172 A G 40
chr1 11303153 rs12141961 C A 15
chr1 11308074 rs28730694 G A 15
chr1 162722881 rs76156815 C T 15
chr1 162724636 rs2271305 T C 5
chr1 162740327 rs1780003 T C 95
chr1 162741794 rs3738807 C T 10
chr1 162745844 rs79279549 G T 15
chr2 29432625 rs3738868 C A
chr2 29432776 rs3738867 T C
chr2 29443726 rs146360301 C G
chr3 178927410 rs2230461 A G
chr4 1807894 rs7688609 G A 85
chr4 55141055 rs1873778 A G 95
chr4 55152040 rs2228230 C T 25
chr4 55593481 rs55986963 A G 5
chr4 55602765 rs3733542 G C
chr7 55233038 rs17290162 G A
chr7 55233089 rs17290169 C T
chr7 55249063 rs1050171 G A 20
chr7 116340223 rs77523018 T C
chr7 116340269 rs28444388 C T
chr7 116411848 rs140824245 C CCT
chr7 128846328 rs2228617 G C 55
chr7 140481425 rs56216404 T C
chr9 98209156 rs56237839 T C
chr9 98209594 rs357564 G A 40
chr9 98211549 rs138240178 G A 5
chr9 98211572 rs2236405 T A 20
chr9 98220322 rs2066835 A C
chr9 98221861 rs2236406 T C
chr9 98224360 rs2274692 C G 50
chr9 98229389 rs2066829 C G
chr9 98231008 rs16909898 A G
chr9 98231379 rs200321717 T C 5
chr9 98238379 rs1805155 A G
chr9 98239147 rs2277184 A G
chr9 98239730 rs28448271 G A
chr9 98270322 rs369915085 TGA T
chr9 98270324 rs28475503 A T
chr9 98270342 rs79663574 A T 5
chr9 98278940 rs73527759 C G 5
chr9 98279017 rs200842963 C A 5
chr10 43615633 rs1800863 C G 10
chr10 89720907 rs555895 T G 80
chr10 123279745 rs2981448 C T
chr11 534242 rs12628 A G 20
chr14 105241378 rs3730346 C T
chr14 105241399 rs3730345 C T 5
chr14 105241422 rs34670300 G A 5
chr15 66679649 rs142027715 T TC 30
chr15 66679684 rs77796976 A G
chr15 66727325 rs8043109 C G 5
chr15 66727597 rs16949924 G C
chr15 66729250 rs16949939 C T
chr15 66782048 rs41306345 C T
chr17 7579472 rs1042522 G C 65
chr19 4097365 rs146376140 T C 10
chr19 4099187 rs350911 T C 90
chr19 4101062 rs10250 G T 45
chr19 4101304 rs139730969 C T
chr19 4102354 rs73918030 T C
chr19 4102449 rs17851657 G A
chr19 4110408 rs12975436 T C
chr19 4110492 rs10424545 C A
chr19 4110552 rs10424722 C G
[00137]. HD266 was identified as wild-type for BRAF, EGFR, KIT and PDGFRA, whereas HD141 was identified as wild-type for EGFR, KIT, KRAS and PDGFRA. Clinical samples and proficiency test samples with wild-type KRAS exon 2 were used to assess the specificity of this assay (Table 5). In conclusion, 100% specificity (33/33) was achieved, as shown in the table below.
[00138]. Table 5. FFPE samples with known wild-type status and the wild-type status detected using the system.
Lung 3_1 IS 18972-1-3 KRAS exon 2 wild-type KRAS exon 2 wild-type
Lung 4_10S 18323-1-6 KRAS exon 2 wild-type KRAS exon 2 wild-type
Lung 6_llS12219-II-l KRAS exon 2 wild-type KRAS exon 2 wild-type
Lung 7_12S6181-10 KRAS exon 2 wild-type KRAS exon 2 wild-type
Lung 9_02S15020-II-4 KRAS exon 2 wild-type KRAS exon 2 wild-type
Lung 10_l lS7163-I-4 KRAS exon 2 wild-type KRAS exon 2 wild-type
GIST 1_07S 11911-7 KRAS exon 2 wild-type KRAS exon 2 wild-type
GIST 2_08S 11437-2 KRAS exon 2 wild-type KRAS exon 2 wild-type
GIST 3_09S003015-7 KRAS exon 2 wild-type KRAS exon 2 wild-type
GIST 4_10S010014-6 KRAS exon 2 wild-type KRAS exon 2 wild-type
GIST 5_10S012883-II-4 KRAS exon 2 wild-type KRAS exon 2 wild-type
GIST 6_08S026635-2 KRAS exon 2 wild-type KRAS exon 2 wild-type
GIST 7_07S9452-4 KRAS exon 2 wild-type KRAS exon 2 wild-type
GIST 8_08S018606 KRAS exon 2 wild-type KRAS exon 2 wild-type
GIST 9_07S9067-3 KRAS exon 2 wild-type KRAS exon 2 wild-type
GIST 10_09S 1489-5 KRAS exon 2 wild-type KRAS exon 2 wild-type
PT EGFR01 KRAS exon 2 wild-type KRAS exon 2 wild-type
PT EGFR02 KRAS exon 2 wild-type KRAS exon 2 wild-type
PT EGFR03 KRAS exon 2 wild-type KRAS exon 2 wild-type
PT KITA KRAS exon 2 wild-type KRAS exon 2 wild-type
[00139]. Assessment of accuracy, limit of detection and reproducibility
[00140]. Accuracy of the assay was assessed using 11 FFPE samples from Horizon Diagnostics, 1 pooled HapMap sample and 41 clinical FFPE samples. NA18595, NA18618, NA18867, NA18957, NA19059, NA19147, NA19190, NA19214, NA19235 and NA19247 were purchased from the Coriell Institute. Individual HapMap samples were pooled by pipetting equal volumes of individual DNA samples at equimolar concentrations. All the expected single nucleotide polymorphisms (SNPs) for each sample were downloaded from the HapMap database and summarized in Table 2.
[00141]. HD705, HD273, HD308, HD179, HD127, HD301, HD300, HD850 and HD200, which harboured known positive mutations with mutation allele frequencies (MAF) ranging from 1% to 10%, were used to assess the lowest limit of detection of this assay. In addition, HD705, HD301, HD300 and HD200 were diluted to 5 ng, 15 ng and 30 ng for library generation in order to assess the lowest DNA input for detection of mutations with allele frequency ≥5%. Three independent experiments were performed with 5 ng of DNA input.
[00142]. Inter-run assay reproducibility was assessed by sequencing 3 patient samples and 1 Horizon Diagnostics sample (HD200) with different barcodes randomly across 8 independent multiplexed sequencing runs. To test the intra-run assay reproducibility, libraries prepared from 3 patient samples indexed with different barcodes were multiplexed and sequenced on the same flow cell.
[00143]. Lowest limit of detection
[00144]. In total, 9 Horizon Diagnostics samples were used to assess the lowest limit of detection of mutation allele frequency (Table 6). BRAF p.V600E (1.40%) in HD273; EGFR p.L861Q (1.00%), EGFR p.E746_A750delELREA (1.00%), EGFR p.T790M (1.00%), EGFR p.L858R (1.00%) and EGFR p.G719S (1.00%) in HD850; and EGFR p.E746_A750delELREA (2.00%), EGFR p.T790M (1.00%) and EGFR p.L858R (3.00%) in HD200 were not identified using the current bioinformatics pipeline, although these mutations were observed during IGV visual inspection. Nonetheless, all these mutations were under the limit of detection for this assay. The remaining mutations, with MAF ≥5%, were detected in all the Horizon Diagnostics samples, and a strong correlation was observed between the expected and detected MAF (Pearson correlation = 0.996) (Figure 3).
[00145]. Table 6. Known mutations with MAF ranging from 1% to 50% and the corresponding MAF detected using the system.
HD850 EGFR p.L861Q 1.00% under detection limit
HD850 EGFR p.E746_A750delELREA 1.00% under detection limit
HD850 EGFR p.T790M 1.00% under detection limit
HD850 EGFR p.L858R 1.00% under detection limit
HD850 EGFR p.G719S 1.00% under detection limit
HD200 EGFR p.E746_A750delELREA 2.00% under detection limit
HD200 EGFR p.T790M 1.00% under detection limit
HD200 EGFR p.L858R 3.00% under detection limit
HD200 BRAF p.V600E 10.50% 12.01%
HD200 KIT p.D816V 10.00% 7.84%
HD200 EGFR p.G719S 24.50% 24.94%
HD200 KRAS p.G13D 15.00% 14.60%
HD200 KRAS p.G12D 6.00% 5.36%
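The comparison of expected and detected MAF can be illustrated with a short Pearson correlation sketch. It uses only the five HD200 rows that survive in the table above, so it is not expected to reproduce the coefficient of 0.996 reported for the full data set; the function name is an assumption made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


# Expected vs detected MAF (%) for the HD200 rows listed in Table 6:
# BRAF p.V600E, KIT p.D816V, EGFR p.G719S, KRAS p.G13D, KRAS p.G12D.
expected = [10.50, 10.00, 24.50, 15.00, 6.00]
detected = [12.01, 7.84, 24.94, 14.60, 5.36]

r = pearson(expected, detected)
print(f"Pearson r over {len(expected)} HD200 variants: {r:.3f}")
```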
[00146]. HD705, HD301, HD300 and HD200 were used to assess the lowest limit of DNA input. Three different DNA amounts (5 ng, 15 ng and 30 ng) were used to generate NGS libraries. Library generation and sequencing were successful for all three DNA inputs. The positive mutations in these samples were identified consistently in three independent experiments (Figure 4). Repeated sequencing using 5 ng of DNA input demonstrated reliable sensitivity at 5% variant frequency, with high intra-run and inter-run reproducibility. This suggests that successful sequencing can be obtained from as little as 5 ng of DNA input.
[00147]. Reproducibility
[00148]. Three different clinical samples were used to assess the reproducibility of the assay. The samples were analyzed 4 times, covering three independent experiments and two intra-run replicates. Detected mutations in the samples showed 100% concordance, with minimal variation in the mean ± SE variant frequencies. Examples include KRAS (p.G12D, 20.01% ± 0.13%) and PIK3CA (p.E545D, 10.96% ± 0.67%) in CRC1-10S17715-11-8, EGFR (p.T790M, 29.23% ± 0.37%; p.L858R, 29.13% ± 0.32%) in Lung-4-10S18323-1-6, and TP53 (p.P72R, 49.29% ± 1.10%) in GIST-1-07S11911-7. This suggests that the assay is able to reproduce the results in FFPE clinical samples.
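The mean ± SE summary quoted above can be computed as in the following sketch. The replicate values are hypothetical placeholders, since the specification reports only the summary statistics, and the standard error is assumed here to be the sample standard deviation divided by the square root of the number of replicates.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_se(frequencies):
    """Return (mean, standard error) for replicate variant frequencies in %."""
    n = len(frequencies)
    # Standard error of the mean, using the sample (n - 1) standard deviation.
    return mean(frequencies), stdev(frequencies) / sqrt(n)


# Hypothetical variant frequencies (%) from four replicate analyses of one sample.
replicates = [19.8, 20.1, 20.3, 19.9]
avg, se = mean_and_se(replicates)
print(f"{avg:.2f}% ± {se:.2f}%")
```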
[00149]. Confirmation of Mutations by Sanger Sequencing
[00150]. All the positive mutations were confirmed by Sanger sequencing. Regions of interest for Sanger sequencing were PCR amplified using forward and reverse primers tagged with M13 universal sequences (M13 forward: 5'-TGTAAAACGACGGCCAGT-3'; M13 reverse: 5'-CAGGAAACAGCTATGACC-3'). Primer pairs for amplicons were designed to work under the same PCR conditions to facilitate simultaneous PCR amplification of different genes. PCR was performed using 8 ng of DNA template. Subsequently, 3.5 μL of PCR product was used to perform sequencing PCR. Sequencing products were further purified using BigDye XTerminator (Life Technologies) and analysed on the 3500 DNA analyser using the Sequencing Analysis 3.0 software (Applied Biosystems, Carlsbad, CA).
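As an illustration of the M13 tagging described above, the sketch below prepends the two universal sequences to a gene-specific primer pair. The M13 sequences are those given in the text; the gene-specific primers are placeholders borrowed from a KRAS amplicon of the sequencing panel, and the actual Sanger primers are not stated in the specification.

```python
# M13 universal sequences as given in the text (5' to 3').
M13_FORWARD = "TGTAAAACGACGGCCAGT"
M13_REVERSE = "CAGGAAACAGCTATGACC"


def tag_primer_pair(forward_primer, reverse_primer):
    """Prepend the M13 universal tags to the 5' ends of a primer pair."""
    return M13_FORWARD + forward_primer, M13_REVERSE + reverse_primer


# Placeholder gene-specific primers (illustrative only, 5' to 3').
gene_forward = "TGGTCCTGCACCAGTAATATGC"
gene_reverse = "ATTATAAGGCCTGCTGAAAATGACTGA"

tagged_forward, tagged_reverse = tag_primer_pair(gene_forward, gene_reverse)
print(tagged_forward)
print(tagged_reverse)
```

Tagging every pair with the same universal sequences means that a single pair of M13 sequencing primers can be used for all amplicons.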
[00151]. Data analysis
[00152]. Base calling and de-multiplexing were carried out by MiSeq Real Time Analysis and MiSeq Reporter, respectively. The resulting FASTQ files were mapped to the human reference genome hg19 using BWA. Mapped reads were filtered to ensure that only mappings corresponding to target amplicons were selected for downstream analysis. Short reads, reads whose start and stop positions did not coincide with amplicon start and end positions, and reads that spanned multiple amplicons were ignored. Selected reads were down-sampled to a maximum of 5000x coverage per amplicon to facilitate efficient variant calling. Variant calling was carried out using an in-house program fine-tuned for somatic amplicon sequencing. All variants detected at a mutant allele frequency ≥2.85% and a total coverage ≥500 were reported. Resulting mutations were annotated against a predefined list of hotspot mutations comprising non-synonymous COSMIC mutations.
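The read selection, down-sampling and reporting thresholds described above can be sketched as follows. This is not the in-house program, which is not disclosed; the `Amplicon` and `Read` classes, the `tolerance` parameter and the use of random sampling for down-sampling are assumptions made for illustration.

```python
import random
from dataclasses import dataclass

MAX_COVERAGE = 5000      # maximum reads kept per amplicon
MIN_MAF_PERCENT = 2.85   # minimum mutant allele frequency for reporting
MIN_COVERAGE = 500       # minimum total coverage for reporting


@dataclass(frozen=True)
class Amplicon:
    chrom: str
    start: int
    end: int


@dataclass(frozen=True)
class Read:
    chrom: str
    start: int
    end: int


def select_reads(reads, amplicon, tolerance=0, seed=0):
    """Keep reads whose ends coincide with the amplicon, then down-sample.

    tolerance (in bases) is a hypothetical parameter; 0 requires an exact match.
    """
    matching = [
        r for r in reads
        if r.chrom == amplicon.chrom
        and abs(r.start - amplicon.start) <= tolerance
        and abs(r.end - amplicon.end) <= tolerance
    ]
    if len(matching) > MAX_COVERAGE:
        # Random down-sampling is an assumption; the text gives only the cap.
        matching = random.Random(seed).sample(matching, MAX_COVERAGE)
    return matching


def is_reportable(alt_reads, total_reads):
    """Apply the MAF and coverage thresholds stated in the text."""
    if total_reads < MIN_COVERAGE:
        return False
    return 100.0 * alt_reads / total_reads >= MIN_MAF_PERCENT


# Example using a KRAS amplicon from the panel (chr12:25398189-25398310).
kras = Amplicon("chr12", 25398189, 25398310)
on_target = [Read("chr12", 25398189, 25398310)] * 6000
off_target = [Read("chr12", 25398200, 25398310)] * 10
kept = select_reads(on_target + off_target, kras)
```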
[00153]. Clinical reporting of mutations
[00154]. Each actionable sequence variant shortlisted by the bioinformatics pipeline was confirmed by IGV visual inspection. Synonymous and intronic mutations were not included in the clinical report. Variants with MAF <20% will only be reported with adequate sequencing coverage (500x). If the read depth is between 100x and 500x, only variants with MAF higher than 20% will be reported. The mutational status of genes for which the amplicons failed to reach the minimum coverage of 100x is marked as indeterminate. The variant calling data (VCF file) are sent to a third party for analysis and interpretation. Detected mutations will be associated with FDA-approved drugs, off-label use of FDA-approved drugs, currently open clinical trials and prognosis information.
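The coverage-dependent reporting rules above can be expressed as a small decision function. This is a sketch under stated assumptions: the function name and return labels are hypothetical, a depth of exactly 500x is treated as adequate coverage, and a MAF of exactly 20% at intermediate depth is treated as not reportable, following the wording "higher than 20%".

```python
def reporting_status(maf_percent, depth):
    """Classify a variant call according to the depth and MAF rules in the text."""
    if depth < 100:
        # Amplicon failed to reach the minimum coverage of 100x.
        return "indeterminate"
    if depth >= 500:
        # Adequate coverage: variants below 20% MAF may also be reported.
        return "report"
    # Depth between 100x and 500x: only variants above 20% MAF are reported.
    return "report" if maf_percent > 20.0 else "not reported"


for maf, depth in [(5.0, 800), (5.0, 300), (35.0, 300), (35.0, 60)]:
    print(maf, depth, reporting_status(maf, depth))
```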
[00155]. Throughout this document, unless otherwise indicated to the contrary, the terms "comprising", "consisting of", "having" and the like, are to be construed as non-exhaustive, or in other words, as meaning "including, but not limited to".
[00156]. Furthermore, throughout the specification, unless the context requires otherwise, the word "include" or variations such as "includes" or "including" will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.
[00157]. As used in the specification, the singular form "a", "an" and "the" include plural references unless the context clearly dictates otherwise.
[00158]. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by a skilled person to which the subject matter herein belongs.
[00159]. It should be further appreciated by the person skilled in the art that variations and combinations of features described above, not being alternatives or substitutes, may be combined to form yet further embodiments falling within the intended scope of the invention.

Claims

1. Method for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample from a subject comprising:
(a) extracting nucleic acid from a slice of the sample;
(b) contacting the extracted nucleic acid with a panel of primers designed to target regions of nucleic acid sequences comprising oligonucleotides identified by SEQ ID NOS. 719-748;
(c) amplifying the nucleic acid;
(d) purifying the amplified nucleic acid;
(e) contacting the amplified nucleic acid with reversible dye terminators;
(f) detecting a signal from the dye terminators; and
(g) converting the signal to a nucleic acid sequence to analysing the same for variations.
2. The method according to claim 1, wherein purifying the nucleic acid comprises adding paramagnetic particles to the amplified nucleic acid and removing any nucleic acid not adhering to the paramagnetic particles and removing the paramagnetic particles.
3. The method according to claim 2, further comprising amplifying the purified amplified nucleic acid after removing the paramagnetic particles.
4. The method according to any one of claims 1 to 3, further comprising quantifying the amplified nucleic acid and adjusting its concentration to at least 1000 parts per million.
5. The method according to any one of claims 1 to 4, wherein the panel of primers designed to target regions of nucleic acid sequences comprise oligonucleotides identified by SEQ ID NOS. 1-718.
6. The method according to any one of claims 1 to 5, further comprising staining a second slice of the fixed formalin paraffin embedded sample; capturing a micrograph of the stained sample; and displaying the micrograph with sequence variations detected.
7. The method according to any one of claims 1 to 6, wherein the sample is a solid tumour sample.
8. The method according to claim 7, wherein the solid tumour is selected from any one of colorectal cancer; lung cancer; gastrointestinal cancer; poorly differentiated malignant neoplasm likely malignant adrenocortical tumour; mucinous adenocarcinoma; or signet ring cell diffuse adenocarcinoma.
9. The method according to any one of claims 1 to 8, further comprising determining a cancer treatment regime for the subject based on the variations detected.
10. System for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample comprising: a chamber for extracting and amplifying nucleic acid; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; a sensor for detecting signals; an analyser for converting the signals to nucleic acid sequence details; and a display.
11. The system according to claim 10, further comprising paramagnetic particles; and a magnetic plate for placing under the chamber.
12. The system according to claim 10 or 11, wherein the panel of primers comprise oligonucleotides identified by SEQ ID NOS. 1-718.
13. A kit for detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample in a system according to claim 6, comprising: polymerase chain reaction reagents; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; and fluorescent nucleotides.
14. The kit according to claim 13, further comprising paramagnetic particles.
15. The kit according to claim 13 or 14, wherein the panel of primers comprise oligonucleotides identified by SEQ ID NOS. 1-718.
16. A panel of primers for use in detecting variations in nucleic acid isolated from a fixed formalin paraffin embedded sample in a parallel sequencing method comprising primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748.
17. The panel according to claim 16, wherein the panel of primers comprise oligonucleotides identified by SEQ ID NOS. 1-718.
18. Method for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising:
(a) extracting nucleic acid from a slice of a fixed formalin paraffin embedded sample from a subject;
(b) contacting the extracted nucleic acid with a panel of primers designed to target regions of nucleic acid sequences comprising oligonucleotides identified by SEQ ID NOS. 719-748;
(c) amplifying the nucleic acid;
(d) purifying the amplified nucleic acid;
(e) contacting the amplified nucleic acid with reversible dye terminators;
(f) detecting a signal from the dye terminators; and
(g) converting the signal to a nucleic acid sequence and analysing the same for variations, wherein detecting variations from the reference sequence in the nucleic acid indicates the subject has a solid tumour cancer and indicates a cancer treatment regime for the subject.
19. The method according to claim 18, wherein the solid tumour is selected from any one of colorectal cancer; lung cancer; gastrointestinal cancer; poorly differentiated malignant neoplasm likely malignant adrenocortical tumour; mucinous adenocarcinoma; or signet ring cell diffuse adenocarcinoma.
20. The method according to claim 18 or 19, wherein purifying the nucleic acid comprises adding paramagnetic particles to the amplified nucleic acid and removing any nucleic acid not adhering to the paramagnetic particles and removing the paramagnetic particles.
21. The method according to claim 20, further comprising amplifying the purified amplified nucleic acid after removing the paramagnetic particles.
22. The method according to any one of claims 18 to 21, further comprising quantifying the amplified nucleic acid and adjusting its concentration to at least 1000 parts per million.
23. The method according to any one of claims 18 to 22, wherein the panel of primers designed to target regions of nucleic acid sequences comprise oligonucleotides identified by SEQ ID NOS. 1-718.
24. The method according to any one of claims 18 to 23, wherein a variation from the reference sequence SEQ ID NO. 737 (cG436A) indicates that a treatment regime with oxaliplatin is preferred.
25. The method according to any one of claims 18 to 23, wherein a variation from the reference sequence SEQ ID NO. 746 indicates that a treatment regime with everolimus is preferred.
26. A system for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising: a chamber for extracting and amplifying nucleic acid; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; a sensor for detecting signals; an analyser for converting the signals to nucleic acid sequence details; and a display.
27. The system according to claim 26, further comprising paramagnetic particles; and a magnetic plate for placing under the chamber.
28. The system according to claim 26 or 27, wherein the panel of primers comprise oligonucleotides identified by SEQ ID NOS. 1-718.
29. A kit for diagnosing or prognosing a cancer treatment regime in a solid tumour cancer comprising: polymerase chain reaction reagents; a panel of primers designed to target regions of nucleic acid sequences identified by SEQ ID NOS. 719-748; and fluorescent nucleotides.
30. The kit according to claim 29, further comprising paramagnetic particles.
31. The kit according to claim 29 or 30, wherein the panel of primers comprise oligonucleotides identified by SEQ ID NOS. 1-718.
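As an illustration only, the analysis recited in step (g) and the treatment indications of claims 24 and 25 can be sketched in Python. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names, the `Variation` type and the toy sequences are hypothetical, the real reference sequences SEQ ID NOS. 737 and 746 are not reproduced here, and only single-base substitutions between equal-length sequences are handled (no alignment, insertions or deletions).

```python
from typing import List, NamedTuple, Optional


class Variation(NamedTuple):
    position: int   # 1-based position in the reference sequence
    ref_base: str   # base in the reference
    alt_base: str   # base observed in the sample


# Treatment indications as stated in claims 24 and 25, keyed by reference SEQ ID NO.
TREATMENT_BY_SEQ_ID = {737: "oxaliplatin", 746: "everolimus"}


def find_substitutions(reference: str, observed: str) -> List[Variation]:
    """Return single-base substitutions between two equal-length sequences."""
    if len(reference) != len(observed):
        raise ValueError("this sketch only compares equal-length sequences")
    return [
        Variation(i + 1, r, o)
        for i, (r, o) in enumerate(zip(reference.upper(), observed.upper()))
        if r != o
    ]


def preferred_treatment(seq_id: int, variations: List[Variation]) -> Optional[str]:
    """Return the treatment named in the claims if any variation was detected."""
    if not variations:
        return None
    return TREATMENT_BY_SEQ_ID.get(seq_id)


# Toy sequences for illustration; these are NOT the sequences of SEQ ID NO. 737.
toy_reference = "ACGTGGCA"
toy_observed = "ACGTAGCA"

variations = find_substitutions(toy_reference, toy_observed)
print(variations)
print(preferred_treatment(737, variations))
```

In practice, sequencing reads would first be aligned to the reference and filtered for quality before any comparison of this kind; the sketch omits those steps to keep the mapping from detected variation to indicated treatment visible.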
PCT/SG2017/050195 2016-04-06 2017-04-06 System and method for detecting variations in nucleic acid sequence for use in next-generation sequencing WO2017176214A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SG10201602713P 2016-04-06
SG10201602713P 2016-04-06

Publications (1)

Publication Number Publication Date
WO2017176214A1 (en) 2017-10-12

Family

ID=60001413

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SG2017/050195 WO2017176214A1 (en) 2016-04-06 2017-04-06 System and method for detecting variations in nucleic acid sequence for use in next-generation sequencing

Country Status (1)

Country Link
WO (1) WO2017176214A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111763730A (en) * 2019-04-01 2020-10-13 长沙金域医学检验实验室有限公司 Multiplex PCR primer composition, reagent and control system for BRAF gene whole exon next generation sequencing
WO2022220141A1 (en) * 2021-04-12 2022-10-20 タカラバイオ株式会社 Method for detecting mutant sars-cov-2

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120208706A1 (en) * 2010-12-30 2012-08-16 Foundation Medicine, Inc. Optimization of multigene analysis of tumor samples
WO2013138510A1 (en) * 2012-03-13 2013-09-19 Patel Abhijit Ajit Measurement of nucleic acid variants using highly-multiplexed error-suppressed deep sequencing

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
BURGHEL G. J. ET AL.: "Towards a Next-Generation Sequencing Diagnostic Service for Tumour Genotyping: A Comparison of Panels and Platforms", BIOMED RES INT., vol. 2015, 17 August 2015 (2015-08-17), pages 1 - 7, XP055430537, [retrieved on 20170719] *
CHEVRIER S. ET AL.: "Next-generation sequencing analysis of lung and colon carcinomas reveals a variety of genetic alterations", INT. J. ONCOL., vol. 45, no. 3, 27 June 2014 (2014-06-27), pages 1167 - 1174, XP009183644, [retrieved on 20170719] *
HADD A. G. ET AL.: "Targeted, high-depth, next-generation sequencing of cancer genes in formalin-fixed, paraffin-embedded and fine-needle aspiration tumor specimens", J MOL DIAGN., vol. 15, no. 2, 14 January 2013 (2013-01-14), pages 234 - 247, XP055286254, [retrieved on 20170719] *
HAGEMANN I. S. ET AL.: "Clinical next-generation sequencing in patients with non-small cell lung cancer", CANCER, vol. 121, no. 4, 15 February 2015 (2015-02-15), pages 631 - 639, XP055430532, [retrieved on 20170719] *
SIMEN B. B. ET AL.: "Validation of a next-generation-sequencing cancer panel for use in the clinical laboratory", ARCH PATHOL LAB MED., vol. 139, no. 4, April 2015 (2015-04-01), pages 508 - 517, XP055430546, [retrieved on 20170719] *

Similar Documents

Publication Publication Date Title
Zaliova et al. ETV6/RUNX1‐like acute lymphoblastic leukemia: a novel B‐cell precursor leukemia subtype associated with the CD27/CD44 immunophenotype
US11142798B2 (en) Systems and methods for monitoring lifelong tumor evolution field of invention
EP3198026B1 (en) Method of determining pik3ca mutational status in a sample
US20200385817A1 (en) Compositions and methods for screening solid tumors
Simen et al. Validation of a next-generation–sequencing cancer panel for use in the clinical laboratory
Kotoula et al. Targeted KRAS mutation assessment on patient tumor histologic material in real time diagnostics
Gonzalez-Bosquet et al. Detection of somatic mutations by high-resolution DNA melting (HRM) analysis in multiple cancers
Graham et al. Gene expression profiles of estrogen receptor–positive and estrogen receptor–negative breast cancers are detectable in histologically normal breast epithelium
CN106414768B (en) Gene fusions and gene variants associated with cancer
US20170298427A1 (en) Nucleic acids and methods for detecting methylation status
EP3464593B1 (en) Molecular tagging methods and sequencing libraries
EP2982986B1 (en) Method for manufacturing gastric cancer prognosis prediction model
WO2017112738A1 (en) Methods for measuring microsatellite instability
CN106755360B (en) Nucleic acid, kit and method for detecting human CYP2D6 gene polymorphism
Portier et al. Quantitative assessment of mutant allele burden in solid tumors by semiconductor-based next-generation sequencing
WO2017176214A1 (en) System and method for detecting variations in nucleic acid sequence for use in next-generation sequencing
WO2016057852A1 (en) Markers for hematological cancers
Mendez et al. Comprehensive evaluation and validation of targeted next-generation sequencing performance in two clinical laboratories
AU2015246009B2 (en) Methods and kits for identifying pre-cancerous colorectal polyps and colorectal cancer
AU2020392127A1 (en) Methods and compositions for analyses of cancer
WO2022239485A1 (en) Amplicon dna library and kit for acute myeloid leukemia gene panel testing, and use for same
Wu et al. Ensemble of Nucleic Acid Absolute Quantitation Modules for Accurate Copy Number Variation Detection and Targeted RNA Profiling
CA3099612C (en) Method of cancer prognosis by assessing tumor variant diversity by means of establishing diversity indices
Debeljak et al. Validation Strategy for Ultrasensitive Mutation Detection
Ip et al. Molecular Techniques in the Diagnosis and Monitoring of Acute and Chronic Leukaemias

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17779453

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17779453

Country of ref document: EP

Kind code of ref document: A1