US20220213562A1 - Detection and treatment of disease exhibiting disease cell heterogeneity and systems and methods for communicating test results - Google Patents
- Publication number
- US20220213562A1 (Application US17/699,968)
- Authority
- US
- United States
- Prior art keywords
- cancer
- tumor
- sequence
- subject
- polynucleotides
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 233
- 230000001747 exhibiting effect Effects 0.000 title abstract description 10
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 title description 137
- 201000010099 disease Diseases 0.000 title description 133
- 238000012360 testing method Methods 0.000 title description 106
- 238000011282 treatment Methods 0.000 title description 91
- 238000001514 detection method Methods 0.000 title description 30
- 206010028980 Neoplasm Diseases 0.000 claims abstract description 406
- 201000011510 cancer Diseases 0.000 claims abstract description 180
- 102000040430 polynucleotide Human genes 0.000 claims abstract description 141
- 108091033319 polynucleotide Proteins 0.000 claims abstract description 141
- 239000002157 polynucleotide Substances 0.000 claims abstract description 141
- 238000012163 sequencing technique Methods 0.000 claims abstract description 79
- 230000002068 genetic effect Effects 0.000 claims description 166
- 108020004414 DNA Proteins 0.000 claims description 157
- 102000053602 DNA Human genes 0.000 claims description 152
- 239000003814 drug Substances 0.000 claims description 98
- 239000002773 nucleotide Substances 0.000 claims description 44
- 210000004369 blood Anatomy 0.000 claims description 42
- 239000008280 blood Substances 0.000 claims description 42
- 125000003729 nucleotide group Chemical group 0.000 claims description 42
- 206010009944 Colon cancer Diseases 0.000 claims description 24
- 230000004044 response Effects 0.000 claims description 24
- 238000013507 mapping Methods 0.000 claims description 21
- 239000012472 biological sample Substances 0.000 claims description 19
- 238000009877 rendering Methods 0.000 claims description 19
- 230000000392 somatic effect Effects 0.000 claims description 19
- 206010006187 Breast cancer Diseases 0.000 claims description 18
- 208000026310 Breast neoplasm Diseases 0.000 claims description 18
- 108091035707 Consensus sequence Proteins 0.000 claims description 12
- 210000002700 urine Anatomy 0.000 claims description 10
- 208000001333 Colorectal Neoplasms Diseases 0.000 claims description 9
- 238000012217 deletion Methods 0.000 claims description 9
- 230000037430 deletion Effects 0.000 claims description 9
- 208000003174 Brain Neoplasms Diseases 0.000 claims description 6
- 238000003556 assay Methods 0.000 claims description 6
- 230000008859 change Effects 0.000 claims description 6
- 238000009169 immunotherapy Methods 0.000 claims description 6
- 210000002381 plasma Anatomy 0.000 claims description 6
- 206010060862 Prostate cancer Diseases 0.000 claims description 5
- 208000000236 Prostatic Neoplasms Diseases 0.000 claims description 5
- 201000010536 head and neck cancer Diseases 0.000 claims description 5
- 208000014829 head and neck neoplasm Diseases 0.000 claims description 5
- 238000003780 insertion Methods 0.000 claims description 5
- 230000037431 insertion Effects 0.000 claims description 5
- 210000002966 serum Anatomy 0.000 claims description 4
- 229960005386 ipilimumab Drugs 0.000 claims description 2
- 229940124597 therapeutic agent Drugs 0.000 claims 1
- 230000001225 therapeutic effect Effects 0.000 abstract description 84
- 206010069754 Acquired gene mutation Diseases 0.000 abstract description 47
- 230000037439 somatic mutation Effects 0.000 abstract description 47
- 210000004027 cell Anatomy 0.000 description 217
- 239000000523 sample Substances 0.000 description 102
- 229940079593 drug Drugs 0.000 description 99
- 230000004075 alteration Effects 0.000 description 94
- 108090000623 proteins and genes Proteins 0.000 description 93
- 230000035772 mutation Effects 0.000 description 55
- 230000008569 process Effects 0.000 description 38
- 238000004458 analytical method Methods 0.000 description 37
- 239000012634 fragment Substances 0.000 description 37
- 108700028369 Alleles Proteins 0.000 description 36
- 102000052116 epidermal growth factor receptor activity proteins Human genes 0.000 description 30
- 108700015053 epidermal growth factor receptor activity proteins Proteins 0.000 description 30
- 238000002560 therapeutic procedure Methods 0.000 description 30
- YOHYSYJDKVYCJI-UHFFFAOYSA-N n-[3-[[6-[3-(trifluoromethyl)anilino]pyrimidin-4-yl]amino]phenyl]cyclopropanecarboxamide Chemical compound FC(F)(F)C1=CC=CC(NC=2N=CN=C(NC=3C=C(NC(=O)C4CC4)C=CC=3)C=2)=C1 YOHYSYJDKVYCJI-UHFFFAOYSA-N 0.000 description 28
- 101000984753 Homo sapiens Serine/threonine-protein kinase B-raf Proteins 0.000 description 25
- 102100027103 Serine/threonine-protein kinase B-raf Human genes 0.000 description 25
- 210000004881 tumor cell Anatomy 0.000 description 25
- 230000010076 replication Effects 0.000 description 24
- 101001012157 Homo sapiens Receptor tyrosine-protein kinase erbB-2 Proteins 0.000 description 20
- 102100030086 Receptor tyrosine-protein kinase erbB-2 Human genes 0.000 description 20
- 230000015654 memory Effects 0.000 description 20
- 208000002154 non-small cell lung carcinoma Diseases 0.000 description 17
- 150000007523 nucleic acids Chemical class 0.000 description 17
- 230000003321 amplification Effects 0.000 description 16
- 229920002521 macromolecule Polymers 0.000 description 16
- 238000003199 nucleic acid amplification method Methods 0.000 description 16
- 102000039446 nucleic acids Human genes 0.000 description 16
- 108020004707 nucleic acids Proteins 0.000 description 16
- 230000000007 visual effect Effects 0.000 description 16
- 210000004602 germ cell Anatomy 0.000 description 15
- 102100039788 GTPase NRas Human genes 0.000 description 14
- 101000744505 Homo sapiens GTPase NRas Proteins 0.000 description 14
- 101000779641 Homo sapiens ALK tyrosine kinase receptor Proteins 0.000 description 13
- 238000012544 monitoring process Methods 0.000 description 13
- 230000002829 reductive effect Effects 0.000 description 13
- 102100033793 ALK tyrosine kinase receptor Human genes 0.000 description 12
- 101000584612 Homo sapiens GTPase KRas Proteins 0.000 description 12
- 238000006243 chemical reaction Methods 0.000 description 12
- 102100030708 GTPase KRas Human genes 0.000 description 11
- 230000002159 abnormal effect Effects 0.000 description 11
- 210000001124 body fluid Anatomy 0.000 description 11
- 230000003247 decreasing effect Effects 0.000 description 11
- 230000004927 fusion Effects 0.000 description 11
- 230000004077 genetic alteration Effects 0.000 description 11
- 238000003860 storage Methods 0.000 description 11
- 210000001519 tissue Anatomy 0.000 description 11
- 208000032791 BCR-ABL1 positive chronic myelogenous leukemia Diseases 0.000 description 10
- 101000997832 Homo sapiens Tyrosine-protein kinase JAK2 Proteins 0.000 description 10
- 102100033444 Tyrosine-protein kinase JAK2 Human genes 0.000 description 10
- 230000032823 cell division Effects 0.000 description 10
- 238000002512 chemotherapy Methods 0.000 description 10
- 102000004169 proteins and genes Human genes 0.000 description 10
- 210000001082 somatic cell Anatomy 0.000 description 10
- 208000010833 Chronic myeloid leukaemia Diseases 0.000 description 9
- 206010068052 Mosaicism Diseases 0.000 description 9
- 208000033761 Myelogenous Chronic BCR-ABL Positive Leukemia Diseases 0.000 description 9
- 102000001759 Notch1 Receptor Human genes 0.000 description 9
- 208000006265 Renal cell carcinoma Diseases 0.000 description 9
- 210000000349 chromosome Anatomy 0.000 description 9
- 238000004891 communication Methods 0.000 description 9
- BFSMGDJOXZAERB-UHFFFAOYSA-N dabrafenib Chemical compound S1C(C(C)(C)C)=NC(C=2C(=C(NS(=O)(=O)C=3C(=CC=CC=3F)F)C=CC=2)F)=C1C1=CC=NC(N)=N1 BFSMGDJOXZAERB-UHFFFAOYSA-N 0.000 description 9
- 238000003745 diagnosis Methods 0.000 description 9
- 102200055464 rs113488022 Human genes 0.000 description 9
- 102100028286 Proto-oncogene tyrosine-protein kinase receptor Ret Human genes 0.000 description 8
- 108020005091 Replication Origin Proteins 0.000 description 8
- 229960002465 dabrafenib Drugs 0.000 description 8
- 230000000694 effects Effects 0.000 description 8
- 230000006870 function Effects 0.000 description 8
- 201000001441 melanoma Diseases 0.000 description 8
- 238000005192 partition Methods 0.000 description 8
- 102100034540 Adenomatous polyposis coli protein Human genes 0.000 description 7
- 230000004544 DNA amplification Effects 0.000 description 7
- 239000005517 L01XE01 - Imatinib Substances 0.000 description 7
- 239000002144 L01XE18 - Ruxolitinib Substances 0.000 description 7
- 206010035226 Plasma cell myeloma Diseases 0.000 description 7
- 102100033177 Vascular endothelial growth factor receptor 2 Human genes 0.000 description 7
- 238000001574 biopsy Methods 0.000 description 7
- 230000004545 gene duplication Effects 0.000 description 7
- 231100000118 genetic alteration Toxicity 0.000 description 7
- KTUFNOKKBVMGRW-UHFFFAOYSA-N imatinib Chemical compound C1CN(C)CCN1CC1=CC=C(C(=O)NC=2C=C(NC=3N=C(C=CN=3)C=3C=NC=CC=3)C(C)=CC=2)C=C1 KTUFNOKKBVMGRW-UHFFFAOYSA-N 0.000 description 7
- 238000005259 measurement Methods 0.000 description 7
- 206010061289 metastatic neoplasm Diseases 0.000 description 7
- 230000011987 methylation Effects 0.000 description 7
- 238000007069 methylation reaction Methods 0.000 description 7
- 238000012545 processing Methods 0.000 description 7
- 108700024394 Exon Proteins 0.000 description 6
- 101000823316 Homo sapiens Tyrosine-protein kinase ABL1 Proteins 0.000 description 6
- 239000005551 L01XE03 - Erlotinib Substances 0.000 description 6
- 108091028043 Nucleic acid sequence Proteins 0.000 description 6
- 208000036878 aneuploidy Diseases 0.000 description 6
- 231100001075 aneuploidy Toxicity 0.000 description 6
- 238000004422 calculation algorithm Methods 0.000 description 6
- 239000013068 control sample Substances 0.000 description 6
- AAKJLRGGTJKAMG-UHFFFAOYSA-N erlotinib Chemical compound C=12C=C(OCCOC)C(OCCOC)=CC2=NC=NC=1NC1=CC=CC(C#C)=C1 AAKJLRGGTJKAMG-UHFFFAOYSA-N 0.000 description 6
- 201000011243 gastrointestinal stromal tumor Diseases 0.000 description 6
- 230000036541 health Effects 0.000 description 6
- 229960002411 imatinib Drugs 0.000 description 6
- 208000015181 infectious disease Diseases 0.000 description 6
- 210000000056 organ Anatomy 0.000 description 6
- 238000002360 preparation method Methods 0.000 description 6
- 230000000284 resting effect Effects 0.000 description 6
- 229960000215 ruxolitinib Drugs 0.000 description 6
- HFNKQEVNSGCOJV-OAHLLOKOSA-N ruxolitinib Chemical compound C1([C@@H](CC#N)N2N=CC(=C2)C=2C=3C=CNC=3N=CN=2)CCCC1 HFNKQEVNSGCOJV-OAHLLOKOSA-N 0.000 description 6
- 238000005070 sampling Methods 0.000 description 6
- LIRYPHYGHXZJBZ-UHFFFAOYSA-N trametinib Chemical compound CC(=O)NC1=CC=CC(N2C(N(C3CC3)C(=O)C3=C(NC=4C(=CC(I)=CC=4)F)N(C)C(=O)C(C)=C32)=O)=C1 LIRYPHYGHXZJBZ-UHFFFAOYSA-N 0.000 description 6
- 208000024893 Acute lymphoblastic leukemia Diseases 0.000 description 5
- 229940124602 FDA-approved drug Drugs 0.000 description 5
- 101000605639 Homo sapiens Phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic subunit alpha isoform Proteins 0.000 description 5
- 101001126417 Homo sapiens Platelet-derived growth factor receptor alpha Proteins 0.000 description 5
- 239000002146 L01XE16 - Crizotinib Substances 0.000 description 5
- 108700020796 Oncogene Proteins 0.000 description 5
- 206010061902 Pancreatic neoplasm Diseases 0.000 description 5
- 102100038332 Phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic subunit alpha isoform Human genes 0.000 description 5
- 102100030485 Platelet-derived growth factor receptor alpha Human genes 0.000 description 5
- 208000020584 Polyploidy Diseases 0.000 description 5
- 206010039491 Sarcoma Diseases 0.000 description 5
- 208000005718 Stomach Neoplasms Diseases 0.000 description 5
- 102100022596 Tyrosine-protein kinase ABL1 Human genes 0.000 description 5
- 208000007097 Urinary Bladder Neoplasms Diseases 0.000 description 5
- 108091092240 circulating cell-free DNA Proteins 0.000 description 5
- 230000000295 complement effect Effects 0.000 description 5
- 238000004590 computer program Methods 0.000 description 5
- 230000000875 corresponding effect Effects 0.000 description 5
- KTEIFNKAUNYNJU-GFCCVEGCSA-N crizotinib Chemical compound O([C@H](C)C=1C(=C(F)C=CC=1Cl)Cl)C(C(=NC=1)N)=CC=1C(=C1)C=NN1C1CCNCC1 KTEIFNKAUNYNJU-GFCCVEGCSA-N 0.000 description 5
- 229940000406 drug candidate Drugs 0.000 description 5
- 230000001973 epigenetic effect Effects 0.000 description 5
- 229960001433 erlotinib Drugs 0.000 description 5
- 239000003777 experimental drug Substances 0.000 description 5
- 230000014509 gene expression Effects 0.000 description 5
- 208000032839 leukemia Diseases 0.000 description 5
- 239000000463 material Substances 0.000 description 5
- 229960001972 panitumumab Drugs 0.000 description 5
- 230000005855 radiation Effects 0.000 description 5
- 210000003296 saliva Anatomy 0.000 description 5
- 230000035945 sensitivity Effects 0.000 description 5
- 239000007787 solid Substances 0.000 description 5
- 229960004066 trametinib Drugs 0.000 description 5
- 229960000575 trastuzumab Drugs 0.000 description 5
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 4
- 208000014697 Acute lymphocytic leukaemia Diseases 0.000 description 4
- 208000031261 Acute myeloid leukaemia Diseases 0.000 description 4
- 206010005003 Bladder cancer Diseases 0.000 description 4
- 102100025064 Cellular tumor antigen p53 Human genes 0.000 description 4
- 108010009392 Cyclin-Dependent Kinase Inhibitor p16 Proteins 0.000 description 4
- 238000001712 DNA sequencing Methods 0.000 description 4
- 208000000461 Esophageal Neoplasms Diseases 0.000 description 4
- 230000010558 Gene Alterations Effects 0.000 description 4
- 101000692455 Homo sapiens Platelet-derived growth factor receptor beta Proteins 0.000 description 4
- 101000932478 Homo sapiens Receptor-type tyrosine-protein kinase FLT3 Proteins 0.000 description 4
- KFZMGEQAYNKOFK-UHFFFAOYSA-N Isopropanol Chemical compound CC(C)O KFZMGEQAYNKOFK-UHFFFAOYSA-N 0.000 description 4
- 206010025323 Lymphomas Diseases 0.000 description 4
- 241001465754 Metazoa Species 0.000 description 4
- 208000034578 Multiple myelomas Diseases 0.000 description 4
- 206010030155 Oesophageal carcinoma Diseases 0.000 description 4
- 108010011536 PTEN Phosphohydrolase Proteins 0.000 description 4
- 102000014160 PTEN Phosphohydrolase Human genes 0.000 description 4
- 102100026547 Platelet-derived growth factor receptor beta Human genes 0.000 description 4
- 208000006664 Precursor Cell Lymphoblastic Leukemia-Lymphoma Diseases 0.000 description 4
- 102100033810 RAC-alpha serine/threonine-protein kinase Human genes 0.000 description 4
- 102100033479 RAF proto-oncogene serine/threonine-protein kinase Human genes 0.000 description 4
- 102100020718 Receptor-type tyrosine-protein kinase FLT3 Human genes 0.000 description 4
- 102100026715 Serine/threonine-protein kinase STK11 Human genes 0.000 description 4
- 108010078814 Tumor Suppressor Protein p53 Proteins 0.000 description 4
- 102100033254 Tumor suppressor ARF Human genes 0.000 description 4
- 230000008901 benefit Effects 0.000 description 4
- 230000008236 biological pathway Effects 0.000 description 4
- 229960005395 cetuximab Drugs 0.000 description 4
- 238000007385 chemical modification Methods 0.000 description 4
- 230000002759 chromosomal effect Effects 0.000 description 4
- 229960005061 crizotinib Drugs 0.000 description 4
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 4
- 210000001840 diploid cell Anatomy 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 201000004101 esophageal cancer Diseases 0.000 description 4
- 239000012530 fluid Substances 0.000 description 4
- 206010017758 gastric cancer Diseases 0.000 description 4
- 230000037442 genomic alteration Effects 0.000 description 4
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 4
- 230000003902 lesion Effects 0.000 description 4
- 208000015486 malignant pancreatic neoplasm Diseases 0.000 description 4
- 239000000203 mixture Substances 0.000 description 4
- 238000007481 next generation sequencing Methods 0.000 description 4
- 201000002528 pancreatic cancer Diseases 0.000 description 4
- 208000008443 pancreatic carcinoma Diseases 0.000 description 4
- 230000036961 partial effect Effects 0.000 description 4
- 210000004214 philadelphia chromosome Anatomy 0.000 description 4
- 229920000642 polymer Polymers 0.000 description 4
- 230000004043 responsiveness Effects 0.000 description 4
- 239000004065 semiconductor Substances 0.000 description 4
- 201000011549 stomach cancer Diseases 0.000 description 4
- 238000012706 support-vector machine Methods 0.000 description 4
- 238000003786 synthesis reaction Methods 0.000 description 4
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 4
- 230000005945 translocation Effects 0.000 description 4
- 238000002604 ultrasonography Methods 0.000 description 4
- 201000005112 urinary bladder cancer Diseases 0.000 description 4
- GPXBXXGIAQBQNI-UHFFFAOYSA-N vemurafenib Chemical compound CCCS(=O)(=O)NC1=CC=C(F)C(C(=O)C=2C3=CC(=CN=C3NC=2)C=2C=CC(Cl)=CC=2)=C1F GPXBXXGIAQBQNI-UHFFFAOYSA-N 0.000 description 4
- 108010004586 Ataxia Telangiectasia Mutated Proteins Proteins 0.000 description 3
- 208000010839 B-cell chronic lymphocytic leukemia Diseases 0.000 description 3
- 201000009030 Carcinoma Diseases 0.000 description 3
- 206010061818 Disease progression Diseases 0.000 description 3
- 206010059866 Drug resistance Diseases 0.000 description 3
- 102100027842 Fibroblast growth factor receptor 3 Human genes 0.000 description 3
- 101710182396 Fibroblast growth factor receptor 3 Proteins 0.000 description 3
- 102100029974 GTPase HRas Human genes 0.000 description 3
- 101000584633 Homo sapiens GTPase HRas Proteins 0.000 description 3
- 208000031422 Lymphocytic Chronic B-Cell Leukemia Diseases 0.000 description 3
- 208000003445 Mouth Neoplasms Diseases 0.000 description 3
- 208000033776 Myeloid Acute Leukemia Diseases 0.000 description 3
- 102000043276 Oncogene Human genes 0.000 description 3
- 201000000582 Retinoblastoma Diseases 0.000 description 3
- 208000000453 Skin Neoplasms Diseases 0.000 description 3
- 208000000102 Squamous Cell Carcinoma of Head and Neck Diseases 0.000 description 3
- 108091008605 VEGF receptors Proteins 0.000 description 3
- 230000002411 adverse Effects 0.000 description 3
- 150000001413 amino acids Chemical class 0.000 description 3
- 238000013528 artificial neural network Methods 0.000 description 3
- 239000011324 bead Substances 0.000 description 3
- 201000007455 central nervous system cancer Diseases 0.000 description 3
- 239000003086 colorant Substances 0.000 description 3
- 238000013500 data storage Methods 0.000 description 3
- 230000005750 disease progression Effects 0.000 description 3
- 208000035475 disorder Diseases 0.000 description 3
- 238000009826 distribution Methods 0.000 description 3
- XGALLCVXEZPNRQ-UHFFFAOYSA-N gefitinib Chemical compound C=12C=C(OCCCN3CCOCC3)C(OC)=CC2=NC=NC=1NC1=CC=C(F)C(Cl)=C1 XGALLCVXEZPNRQ-UHFFFAOYSA-N 0.000 description 3
- 230000007614 genetic variation Effects 0.000 description 3
- 208000005017 glioblastoma Diseases 0.000 description 3
- 201000000459 head and neck squamous cell carcinoma Diseases 0.000 description 3
- 238000012165 high-throughput sequencing Methods 0.000 description 3
- 238000009396 hybridization Methods 0.000 description 3
- 210000002865 immune cell Anatomy 0.000 description 3
- 238000000338 in vitro Methods 0.000 description 3
- 238000009533 lab test Methods 0.000 description 3
- 206010024627 liposarcoma Diseases 0.000 description 3
- 208000020816 lung neoplasm Diseases 0.000 description 3
- 239000002829 mitogen activated protein kinase inhibitor Substances 0.000 description 3
- 201000000050 myeloid neoplasm Diseases 0.000 description 3
- 238000005457 optimization Methods 0.000 description 3
- 230000037361 pathway Effects 0.000 description 3
- 229960002087 pertuzumab Drugs 0.000 description 3
- 238000001959 radiotherapy Methods 0.000 description 3
- 230000008707 rearrangement Effects 0.000 description 3
- 230000009467 reduction Effects 0.000 description 3
- FNHKPVJBJVTLMP-UHFFFAOYSA-N regorafenib Chemical compound C1=NC(C(=O)NC)=CC(OC=2C=C(F)C(NC(=O)NC=3C=C(C(Cl)=CC=3)C(F)(F)F)=CC=2)=C1 FNHKPVJBJVTLMP-UHFFFAOYSA-N 0.000 description 3
- 102200124923 rs121913254 Human genes 0.000 description 3
- 102200087780 rs77375493 Human genes 0.000 description 3
- 238000000638 solvent extraction Methods 0.000 description 3
- 238000006467 substitution reaction Methods 0.000 description 3
- 238000002626 targeted therapy Methods 0.000 description 3
- 230000001131 transforming effect Effects 0.000 description 3
- 229960003862 vemurafenib Drugs 0.000 description 3
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 2
- 102000010400 1-phosphatidylinositol-3-kinase activity proteins Human genes 0.000 description 2
- 229930024421 Adenine Natural products 0.000 description 2
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 2
- 108091093088 Amplicon Proteins 0.000 description 2
- 206010005949 Bone cancer Diseases 0.000 description 2
- 208000018084 Bone neoplasm Diseases 0.000 description 2
- 101001042041 Bos taurus Isocitrate dehydrogenase [NAD] subunit beta, mitochondrial Proteins 0.000 description 2
- 102100028914 Catenin beta-1 Human genes 0.000 description 2
- ZEOWTGPWHLSLOG-UHFFFAOYSA-N Cc1ccc(cc1-c1ccc2c(n[nH]c2c1)-c1cnn(c1)C1CC1)C(=O)Nc1cccc(c1)C(F)(F)F Chemical compound Cc1ccc(cc1-c1ccc2c(n[nH]c2c1)-c1cnn(c1)C1CC1)C(=O)Nc1cccc(c1)C(F)(F)F ZEOWTGPWHLSLOG-UHFFFAOYSA-N 0.000 description 2
- 108091007854 Cdh1/Fizzy-related Proteins 0.000 description 2
- 102000038594 Cdh1/Fizzy-related Human genes 0.000 description 2
- 208000037051 Chromosomal Instability Diseases 0.000 description 2
- 208000005443 Circulating Neoplastic Cells Diseases 0.000 description 2
- ZBNZXTGUTAYRHI-UHFFFAOYSA-N Dasatinib Chemical compound C=1C(N2CCN(CCO)CC2)=NC(C)=NC=1NC(S1)=NC=C1C(=O)NC1=C(C)C=CC=C1Cl ZBNZXTGUTAYRHI-UHFFFAOYSA-N 0.000 description 2
- 101150029707 ERBB2 gene Proteins 0.000 description 2
- 102000004190 Enzymes Human genes 0.000 description 2
- 108090000790 Enzymes Proteins 0.000 description 2
- 102100038595 Estrogen receptor Human genes 0.000 description 2
- HKVAMNSJSFKALM-GKUWKFKPSA-N Everolimus Chemical compound C1C[C@@H](OCCO)[C@H](OC)C[C@@H]1C[C@@H](C)[C@H]1OC(=O)[C@@H]2CCCCN2C(=O)C(=O)[C@](O)(O2)[C@H](C)CC[C@H]2C[C@H](OC)/C(C)=C/C=C/C=C/[C@@H](C)C[C@@H](C)C(=O)[C@H](OC)[C@H](O)/C(C)=C/[C@@H](C)C(=O)C1 HKVAMNSJSFKALM-GKUWKFKPSA-N 0.000 description 2
- 101710105178 F-box/WD repeat-containing protein 7 Proteins 0.000 description 2
- 102100028138 F-box/WD repeat-containing protein 7 Human genes 0.000 description 2
- 102100023593 Fibroblast growth factor receptor 1 Human genes 0.000 description 2
- 101710182386 Fibroblast growth factor receptor 1 Proteins 0.000 description 2
- 102100023600 Fibroblast growth factor receptor 2 Human genes 0.000 description 2
- 101710182389 Fibroblast growth factor receptor 2 Proteins 0.000 description 2
- 206010051066 Gastrointestinal stromal tumour Diseases 0.000 description 2
- 102100025334 Guanine nucleotide-binding protein G(q) subunit alpha Human genes 0.000 description 2
- 102100032610 Guanine nucleotide-binding protein G(s) subunit alpha isoforms XLas Human genes 0.000 description 2
- 102100036738 Guanine nucleotide-binding protein subunit alpha-11 Human genes 0.000 description 2
- 102100022057 Hepatocyte nuclear factor 1-alpha Human genes 0.000 description 2
- 102100038970 Histone-lysine N-methyltransferase EZH2 Human genes 0.000 description 2
- 101000916173 Homo sapiens Catenin beta-1 Proteins 0.000 description 2
- 101000738568 Homo sapiens G1/S-specific cyclin-E1 Proteins 0.000 description 2
- 101000857888 Homo sapiens Guanine nucleotide-binding protein G(q) subunit alpha Proteins 0.000 description 2
- 101001014590 Homo sapiens Guanine nucleotide-binding protein G(s) subunit alpha isoforms XLas Proteins 0.000 description 2
- 101001014594 Homo sapiens Guanine nucleotide-binding protein G(s) subunit alpha isoforms short Proteins 0.000 description 2
- 101001072407 Homo sapiens Guanine nucleotide-binding protein subunit alpha-11 Proteins 0.000 description 2
- 101001045751 Homo sapiens Hepatocyte nuclear factor 1-alpha Proteins 0.000 description 2
- 101000882127 Homo sapiens Histone-lysine N-methyltransferase EZH2 Proteins 0.000 description 2
- 101000960234 Homo sapiens Isocitrate dehydrogenase [NADP] cytoplasmic Proteins 0.000 description 2
- 101000599886 Homo sapiens Isocitrate dehydrogenase [NADP], mitochondrial Proteins 0.000 description 2
- 101000916644 Homo sapiens Macrophage colony-stimulating factor 1 receptor Proteins 0.000 description 2
- 101001014610 Homo sapiens Neuroendocrine secretory protein 55 Proteins 0.000 description 2
- 101001109719 Homo sapiens Nucleophosmin Proteins 0.000 description 2
- 101000797903 Homo sapiens Protein ALEX Proteins 0.000 description 2
- 101000779418 Homo sapiens RAC-alpha serine/threonine-protein kinase Proteins 0.000 description 2
- 101000628562 Homo sapiens Serine/threonine-protein kinase STK11 Proteins 0.000 description 2
- 101000934996 Homo sapiens Tyrosine-protein kinase JAK3 Proteins 0.000 description 2
- 101001087416 Homo sapiens Tyrosine-protein phosphatase non-receptor type 11 Proteins 0.000 description 2
- 101000851018 Homo sapiens Vascular endothelial growth factor receptor 1 Proteins 0.000 description 2
- 101000851007 Homo sapiens Vascular endothelial growth factor receptor 2 Proteins 0.000 description 2
- 102100039905 Isocitrate dehydrogenase [NADP] cytoplasmic Human genes 0.000 description 2
- 102100037845 Isocitrate dehydrogenase [NADP], mitochondrial Human genes 0.000 description 2
- 208000008839 Kidney Neoplasms Diseases 0.000 description 2
- 101150105104 Kras gene Proteins 0.000 description 2
- 239000005411 L01XE02 - Gefitinib Substances 0.000 description 2
- 239000002147 L01XE04 - Sunitinib Substances 0.000 description 2
- 239000002067 L01XE06 - Dasatinib Substances 0.000 description 2
- 239000002136 L01XE07 - Lapatinib Substances 0.000 description 2
- 239000005536 L01XE08 - Nilotinib Substances 0.000 description 2
- 239000003798 L01XE11 - Pazopanib Substances 0.000 description 2
- 239000002118 L01XE12 - Vandetanib Substances 0.000 description 2
- 239000002145 L01XE14 - Bosutinib Substances 0.000 description 2
- 239000002138 L01XE21 - Regorafenib Substances 0.000 description 2
- 239000002176 L01XE26 - Cabozantinib Substances 0.000 description 2
- 206010058467 Lung neoplasm malignant Diseases 0.000 description 2
- 102100028198 Macrophage colony-stimulating factor 1 receptor Human genes 0.000 description 2
- 241000124008 Mammalia Species 0.000 description 2
- 208000009018 Medullary thyroid cancer Diseases 0.000 description 2
- 208000000172 Medulloblastoma Diseases 0.000 description 2
- 206010027476 Metastases Diseases 0.000 description 2
- 102100025725 Mothers against decapentaplegic homolog 4 Human genes 0.000 description 2
- 101710143112 Mothers against decapentaplegic homolog 4 Proteins 0.000 description 2
- 208000033383 Neuroendocrine tumor of pancreas Diseases 0.000 description 2
- 102100022678 Nucleophosmin Human genes 0.000 description 2
- 108091007960 PI3Ks Proteins 0.000 description 2
- 206010067517 Pancreatic neuroendocrine tumour Diseases 0.000 description 2
- 108091008611 Protein Kinase B Proteins 0.000 description 2
- 238000003559 RNA-seq method Methods 0.000 description 2
- 102220467236 Receptor tyrosine-protein kinase erbB-2_I655V_mutation Human genes 0.000 description 2
- 102100029981 Receptor tyrosine-protein kinase erbB-4 Human genes 0.000 description 2
- 101710100963 Receptor tyrosine-protein kinase erbB-4 Proteins 0.000 description 2
- 208000015634 Rectal Neoplasms Diseases 0.000 description 2
- 102100038042 Retinoblastoma-associated protein Human genes 0.000 description 2
- 108700028341 SMARCB1 Proteins 0.000 description 2
- 101150008214 SMARCB1 gene Proteins 0.000 description 2
- 102100025746 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily B member 1 Human genes 0.000 description 2
- 101710181599 Serine/threonine-protein kinase STK11 Proteins 0.000 description 2
- 102100023085 Serine/threonine-protein kinase mTOR Human genes 0.000 description 2
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 2
- 208000021712 Soft tissue sarcoma Diseases 0.000 description 2
- 108010065917 TOR Serine-Threonine Kinases Proteins 0.000 description 2
- 206010043515 Throat cancer Diseases 0.000 description 2
- 208000024770 Thyroid neoplasm Diseases 0.000 description 2
- 208000037280 Trisomy Diseases 0.000 description 2
- 102100029823 Tyrosine-protein kinase BTK Human genes 0.000 description 2
- 102100025387 Tyrosine-protein kinase JAK3 Human genes 0.000 description 2
- 102100033019 Tyrosine-protein phosphatase non-receptor type 11 Human genes 0.000 description 2
- 108010053099 Vascular Endothelial Growth Factor Receptor-2 Proteins 0.000 description 2
- 102100033178 Vascular endothelial growth factor receptor 1 Human genes 0.000 description 2
- 241000700605 Viruses Species 0.000 description 2
- 229960000643 adenine Drugs 0.000 description 2
- RITAVMQDGBJQJZ-FMIVXFBMSA-N axitinib Chemical compound CNC(=O)C1=CC=CC=C1SC1=CC=C(C(\C=C\C=2N=CC=CC=2)=NN2)C2=C1 RITAVMQDGBJQJZ-FMIVXFBMSA-N 0.000 description 2
- 238000007622 bioinformatic analysis Methods 0.000 description 2
- 239000013060 biological fluid Substances 0.000 description 2
- 239000000090 biomarker Substances 0.000 description 2
- 230000015572 biosynthetic process Effects 0.000 description 2
- 239000010839 body fluid Substances 0.000 description 2
- UBPYILGKFZZVDX-UHFFFAOYSA-N bosutinib Chemical compound C1=C(Cl)C(OC)=CC(NC=2C3=CC(OC)=C(OCCCN4CCN(C)CC4)C=C3N=CC=2C#N)=C1Cl UBPYILGKFZZVDX-UHFFFAOYSA-N 0.000 description 2
- JJWKPURADFRFRB-UHFFFAOYSA-N carbonyl sulfide Chemical compound O=C=S JJWKPURADFRFRB-UHFFFAOYSA-N 0.000 description 2
- 108091092259 cell-free RNA Proteins 0.000 description 2
- 208000032852 chronic lymphocytic leukemia Diseases 0.000 description 2
- 230000002596 correlated effect Effects 0.000 description 2
- 229940104302 cytosine Drugs 0.000 description 2
- 230000003111 delayed effect Effects 0.000 description 2
- 230000001419 dependent effect Effects 0.000 description 2
- 238000002405 diagnostic procedure Methods 0.000 description 2
- 239000003596 drug target Substances 0.000 description 2
- 238000000605 extraction Methods 0.000 description 2
- 239000003925 fat Substances 0.000 description 2
- 150000004665 fatty acids Chemical class 0.000 description 2
- 238000001914 filtration Methods 0.000 description 2
- 238000007672 fourth generation sequencing Methods 0.000 description 2
- 229960002584 gefitinib Drugs 0.000 description 2
- 102000054767 gene variant Human genes 0.000 description 2
- 238000011331 genomic analysis Methods 0.000 description 2
- 230000012010 growth Effects 0.000 description 2
- 150000002500 ions Chemical class 0.000 description 2
- BCFGMOOMADDAQU-UHFFFAOYSA-N lapatinib Chemical compound O1C(CNCCS(=O)(=O)C)=CC=C1C1=CC=C(N=CN=C2NC=3C=C(Cl)C(OCC=4C=C(F)C=CC=4)=CC=3)C2=C1 BCFGMOOMADDAQU-UHFFFAOYSA-N 0.000 description 2
- 238000012417 linear regression Methods 0.000 description 2
- 201000005202 lung cancer Diseases 0.000 description 2
- 229940124302 mTOR inhibitor Drugs 0.000 description 2
- 238000012423 maintenance Methods 0.000 description 2
- 239000003628 mammalian target of rapamycin inhibitor Substances 0.000 description 2
- 238000004519 manufacturing process Methods 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 208000023356 medullary thyroid gland carcinoma Diseases 0.000 description 2
- 230000001394 metastastic effect Effects 0.000 description 2
- 150000002772 monosaccharides Chemical class 0.000 description 2
- 201000005962 mycosis fungoides Diseases 0.000 description 2
- JTSLALYXYSRPGW-UHFFFAOYSA-N n-[5-(4-cyanophenyl)-1h-pyrrolo[2,3-b]pyridin-3-yl]pyridine-3-carboxamide Chemical compound C=1C=CN=CC=1C(=O)NC(C1=C2)=CNC1=NC=C2C1=CC=C(C#N)C=C1 JTSLALYXYSRPGW-UHFFFAOYSA-N 0.000 description 2
- HHZIURLSWUIHRB-UHFFFAOYSA-N nilotinib Chemical compound C1=NC(C)=CN1C1=CC(NC(=O)C=2C=C(NC=3N=C(C=CN=3)C=3C=NC=CC=3)C(C)=CC=2)=CC(C(F)(F)F)=C1 HHZIURLSWUIHRB-UHFFFAOYSA-N 0.000 description 2
- 229960003347 obinutuzumab Drugs 0.000 description 2
- 201000008968 osteosarcoma Diseases 0.000 description 2
- 208000021010 pancreatic neuroendocrine tumor Diseases 0.000 description 2
- 244000052769 pathogen Species 0.000 description 2
- 230000001575 pathological effect Effects 0.000 description 2
- 239000013610 patient sample Substances 0.000 description 2
- CUIHSIWYWATEQL-UHFFFAOYSA-N pazopanib Chemical compound C1=CC2=C(C)N(C)N=C2C=C1N(C)C(N=1)=CC=NC=1NC1=CC=C(C)C(S(N)(=O)=O)=C1 CUIHSIWYWATEQL-UHFFFAOYSA-N 0.000 description 2
- 230000002093 peripheral effect Effects 0.000 description 2
- 238000000513 principal component analysis Methods 0.000 description 2
- 238000004393 prognosis Methods 0.000 description 2
- 239000003197 protein kinase B inhibitor Substances 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 238000012175 pyrosequencing Methods 0.000 description 2
- 229960004836 regorafenib Drugs 0.000 description 2
- 230000008261 resistance mechanism Effects 0.000 description 2
- 102200108201 rs1042522 Human genes 0.000 description 2
- 102200048955 rs121434569 Human genes 0.000 description 2
- 102200003067 rs1670283 Human genes 0.000 description 2
- 102200003068 rs1881420 Human genes 0.000 description 2
- 102200003069 rs1881421 Human genes 0.000 description 2
- 102200015965 rs459552 Human genes 0.000 description 2
- 238000007480 sanger sequencing Methods 0.000 description 2
- 210000000582 semen Anatomy 0.000 description 2
- 238000007841 sequencing by ligation Methods 0.000 description 2
- 208000000649 small cell carcinoma Diseases 0.000 description 2
- 239000007858 starting material Substances 0.000 description 2
- 210000004243 sweat Anatomy 0.000 description 2
- 229940113082 thymine Drugs 0.000 description 2
- 238000012546 transfer Methods 0.000 description 2
- 238000002054 transplantation Methods 0.000 description 2
- 229960001612 trastuzumab emtansine Drugs 0.000 description 2
- 238000011269 treatment regimen Methods 0.000 description 2
- 230000007306 turnover Effects 0.000 description 2
- UHTHHESEBZOYNR-UHFFFAOYSA-N vandetanib Chemical compound COC1=CC(C(/N=CN2)=N/C=3C(=CC(Br)=CC=3)F)=C2C=C1OCC1CCN(C)CC1 UHTHHESEBZOYNR-UHFFFAOYSA-N 0.000 description 2
- WEVYNIUIFUYDGI-UHFFFAOYSA-N 3-[6-[4-(trifluoromethoxy)anilino]-4-pyrimidinyl]benzamide Chemical compound NC(=O)C1=CC=CC(C=2N=CN=C(NC=3C=CC(OC(F)(F)F)=CC=3)C=2)=C1 WEVYNIUIFUYDGI-UHFFFAOYSA-N 0.000 description 1
- 101150023956 ALK gene Proteins 0.000 description 1
- 102100034580 AT-rich interactive domain-containing protein 1A Human genes 0.000 description 1
- 102000000872 ATM Human genes 0.000 description 1
- ULXXDDBFHOBEHA-ONEGZZNKSA-N Afatinib Chemical compound N1=CN=C2C=C(OC3COCC3)C(NC(=O)/C=C/CN(C)C)=CC2=C1NC1=CC=C(F)C(Cl)=C1 ULXXDDBFHOBEHA-ONEGZZNKSA-N 0.000 description 1
- 108010029445 Agammaglobulinaemia Tyrosine Kinase Proteins 0.000 description 1
- 206010061424 Anal cancer Diseases 0.000 description 1
- 208000007860 Anus Neoplasms Diseases 0.000 description 1
- BFYIZQONLCFLEV-DAELLWKTSA-N Aromasine Chemical compound O=C1C=C[C@]2(C)[C@H]3CC[C@](C)(C(CC4)=O)[C@@H]4[C@@H]3CC(=C)C2=C1 BFYIZQONLCFLEV-DAELLWKTSA-N 0.000 description 1
- 238000012935 Averaging Methods 0.000 description 1
- 102000036365 BRCA1 Human genes 0.000 description 1
- 108700020463 BRCA1 Proteins 0.000 description 1
- 101150072950 BRCA1 gene Proteins 0.000 description 1
- 102000052609 BRCA2 Human genes 0.000 description 1
- 108700020462 BRCA2 Proteins 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 206010004146 Basal cell carcinoma Diseases 0.000 description 1
- 206010004593 Bile duct cancer Diseases 0.000 description 1
- 206010006143 Brain stem glioma Diseases 0.000 description 1
- 101150008921 Brca2 gene Proteins 0.000 description 1
- 208000011691 Burkitt lymphomas Diseases 0.000 description 1
- 101710098191 C-4 methylsterol oxidase ERG25 Proteins 0.000 description 1
- 102000008203 CTLA-4 Antigen Human genes 0.000 description 1
- 108010021064 CTLA-4 Antigen Proteins 0.000 description 1
- 229940045513 CTLA4 antagonist Drugs 0.000 description 1
- 206010065305 Cancer in remission Diseases 0.000 description 1
- 206010007275 Carcinoid tumour Diseases 0.000 description 1
- 102000000844 Cell Surface Receptors Human genes 0.000 description 1
- 108010001857 Cell Surface Receptors Proteins 0.000 description 1
- 206010057248 Cell death Diseases 0.000 description 1
- 206010008342 Cervix carcinoma Diseases 0.000 description 1
- 201000009047 Chordoma Diseases 0.000 description 1
- 206010065163 Clonal evolution Diseases 0.000 description 1
- 208000035473 Communicable disease Diseases 0.000 description 1
- 208000009798 Craniopharyngioma Diseases 0.000 description 1
- 108010058546 Cyclin D1 Proteins 0.000 description 1
- 108010025464 Cyclin-Dependent Kinase 4 Proteins 0.000 description 1
- 108010025468 Cyclin-Dependent Kinase 6 Proteins 0.000 description 1
- 102000009512 Cyclin-Dependent Kinase Inhibitor p15 Human genes 0.000 description 1
- 108010009356 Cyclin-Dependent Kinase Inhibitor p15 Proteins 0.000 description 1
- 102100036252 Cyclin-dependent kinase 4 Human genes 0.000 description 1
- 102100026804 Cyclin-dependent kinase 6 Human genes 0.000 description 1
- 102100032218 Cytokine-inducible SH2-containing protein Human genes 0.000 description 1
- 208000008334 Dermatofibrosarcoma Diseases 0.000 description 1
- 206010057070 Dermatofibrosarcoma protuberans Diseases 0.000 description 1
- 101100310856 Drosophila melanogaster spri gene Proteins 0.000 description 1
- 102100031480 Dual specificity mitogen-activated protein kinase kinase 1 Human genes 0.000 description 1
- 102100023266 Dual specificity mitogen-activated protein kinase kinase 2 Human genes 0.000 description 1
- 101150039808 Egfr gene Proteins 0.000 description 1
- 206010014733 Endometrial cancer Diseases 0.000 description 1
- 206010014759 Endometrial neoplasm Diseases 0.000 description 1
- 108010067770 Endopeptidase K Proteins 0.000 description 1
- 201000008228 Ependymoblastoma Diseases 0.000 description 1
- 206010014967 Ependymoma Diseases 0.000 description 1
- 206010014968 Ependymoma malignant Diseases 0.000 description 1
- 208000006168 Ewing Sarcoma Diseases 0.000 description 1
- 206010053717 Fibrous histiocytoma Diseases 0.000 description 1
- 102100024165 G1/S-specific cyclin-D1 Human genes 0.000 description 1
- 102100024185 G1/S-specific cyclin-D2 Human genes 0.000 description 1
- 102100037858 G1/S-specific cyclin-E1 Human genes 0.000 description 1
- 102100027541 GTP-binding protein Rheb Human genes 0.000 description 1
- 102100025477 GTP-binding protein Rit1 Human genes 0.000 description 1
- 208000022072 Gallbladder Neoplasms Diseases 0.000 description 1
- 201000003741 Gastrointestinal carcinoma Diseases 0.000 description 1
- 208000002966 Giant Cell Tumor of Bone Diseases 0.000 description 1
- 208000032612 Glial tumor Diseases 0.000 description 1
- 206010018338 Glioma Diseases 0.000 description 1
- 208000002250 Hematologic Neoplasms Diseases 0.000 description 1
- 208000017604 Hodgkin disease Diseases 0.000 description 1
- 208000021519 Hodgkin lymphoma Diseases 0.000 description 1
- 208000010747 Hodgkins lymphoma Diseases 0.000 description 1
- 101000924266 Homo sapiens AT-rich interactive domain-containing protein 1A Proteins 0.000 description 1
- 101000924577 Homo sapiens Adenomatous polyposis coli protein Proteins 0.000 description 1
- 101000943420 Homo sapiens Cytokine-inducible SH2-containing protein Proteins 0.000 description 1
- 101000967216 Homo sapiens Eosinophil cationic protein Proteins 0.000 description 1
- 101000882584 Homo sapiens Estrogen receptor Proteins 0.000 description 1
- 101000980741 Homo sapiens G1/S-specific cyclin-D2 Proteins 0.000 description 1
- 101000574654 Homo sapiens GTP-binding protein Rit1 Proteins 0.000 description 1
- 101000596894 Homo sapiens High affinity nerve growth factor receptor Proteins 0.000 description 1
- 101001030211 Homo sapiens Myc proto-oncogene protein Proteins 0.000 description 1
- 101000686031 Homo sapiens Proto-oncogene tyrosine-protein kinase ROS Proteins 0.000 description 1
- 101000579425 Homo sapiens Proto-oncogene tyrosine-protein kinase receptor Ret Proteins 0.000 description 1
- 101000712530 Homo sapiens RAF proto-oncogene serine/threonine-protein kinase Proteins 0.000 description 1
- 101000742859 Homo sapiens Retinoblastoma-associated protein Proteins 0.000 description 1
- 101000771237 Homo sapiens Serine/threonine-protein kinase A-Raf Proteins 0.000 description 1
- 101000799466 Homo sapiens Thrombopoietin receptor Proteins 0.000 description 1
- 101000819111 Homo sapiens Trans-acting T-cell-specific transcription factor GATA-3 Proteins 0.000 description 1
- 101000997835 Homo sapiens Tyrosine-protein kinase JAK1 Proteins 0.000 description 1
- 101001125402 Homo sapiens Vitamin K-dependent protein C Proteins 0.000 description 1
- 206010021042 Hypopharyngeal cancer Diseases 0.000 description 1
- 206010056305 Hypopharyngeal neoplasm Diseases 0.000 description 1
- 208000026350 Inborn Genetic disease Diseases 0.000 description 1
- 208000037396 Intraductal Noninfiltrating Carcinoma Diseases 0.000 description 1
- 206010073094 Intraductal proliferative breast lesion Diseases 0.000 description 1
- 206010061252 Intraocular melanoma Diseases 0.000 description 1
- 208000007766 Kaposi sarcoma Diseases 0.000 description 1
- 239000002177 L01XE27 - Ibrutinib Substances 0.000 description 1
- 206010023825 Laryngeal cancer Diseases 0.000 description 1
- 206010061523 Lip and/or oral cavity cancer Diseases 0.000 description 1
- 206010062038 Lip neoplasm Diseases 0.000 description 1
- 108010068342 MAP Kinase Kinase 1 Proteins 0.000 description 1
- 108010068353 MAP Kinase Kinase 2 Proteins 0.000 description 1
- 208000006644 Malignant Fibrous Histiocytoma Diseases 0.000 description 1
- 208000032271 Malignant tumor of penis Diseases 0.000 description 1
- 208000025205 Mantle-Cell Lymphoma Diseases 0.000 description 1
- 238000007476 Maximum Likelihood Methods 0.000 description 1
- 108090000744 Mitogen-Activated Protein Kinase Kinases Proteins 0.000 description 1
- 102000004232 Mitogen-Activated Protein Kinase Kinases Human genes 0.000 description 1
- 102000013609 MutL Protein Homolog 1 Human genes 0.000 description 1
- 108010026664 MutL Protein Homolog 1 Proteins 0.000 description 1
- 102100038895 Myc proto-oncogene protein Human genes 0.000 description 1
- 201000003793 Myelodysplastic syndrome Diseases 0.000 description 1
- 108010071382 NF-E2-Related Factor 2 Proteins 0.000 description 1
- 206010028729 Nasal cavity cancer Diseases 0.000 description 1
- 206010028767 Nasal sinus cancer Diseases 0.000 description 1
- 208000001894 Nasopharyngeal Neoplasms Diseases 0.000 description 1
- 206010061306 Nasopharyngeal cancer Diseases 0.000 description 1
- 206010029260 Neuroblastoma Diseases 0.000 description 1
- 102000007530 Neurofibromin 1 Human genes 0.000 description 1
- 108010085793 Neurofibromin 1 Proteins 0.000 description 1
- 101100401106 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) met-7 gene Proteins 0.000 description 1
- 208000015914 Non-Hodgkin lymphomas Diseases 0.000 description 1
- 208000010505 Nose Neoplasms Diseases 0.000 description 1
- 108010029755 Notch1 Receptor Proteins 0.000 description 1
- 102100031701 Nuclear factor erythroid 2-related factor 2 Human genes 0.000 description 1
- 108020005187 Oligonucleotide Probes Proteins 0.000 description 1
- 108700026244 Open Reading Frames Proteins 0.000 description 1
- 206010031096 Oropharyngeal cancer Diseases 0.000 description 1
- 206010057444 Oropharyngeal neoplasm Diseases 0.000 description 1
- 206010033128 Ovarian cancer Diseases 0.000 description 1
- 206010061535 Ovarian neoplasm Diseases 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 239000012828 PI3K inhibitor Substances 0.000 description 1
- 239000012823 PI3K/mTOR inhibitor Substances 0.000 description 1
- 206010061332 Paraganglion neoplasm Diseases 0.000 description 1
- 208000003937 Paranasal Sinus Neoplasms Diseases 0.000 description 1
- 208000000821 Parathyroid Neoplasms Diseases 0.000 description 1
- 208000002471 Penile Neoplasms Diseases 0.000 description 1
- 206010034299 Penile cancer Diseases 0.000 description 1
- 208000037581 Persistent Infection Diseases 0.000 description 1
- 208000009565 Pharyngeal Neoplasms Diseases 0.000 description 1
- 206010034811 Pharyngeal cancer Diseases 0.000 description 1
- 108091000080 Phosphotransferase Proteins 0.000 description 1
- 208000007641 Pinealoma Diseases 0.000 description 1
- 208000007913 Pituitary Neoplasms Diseases 0.000 description 1
- 241000288906 Primates Species 0.000 description 1
- 206010036790 Productive cough Diseases 0.000 description 1
- 102100025803 Progesterone receptor Human genes 0.000 description 1
- 102100023347 Proto-oncogene tyrosine-protein kinase ROS Human genes 0.000 description 1
- 102000014128 RANK Ligand Human genes 0.000 description 1
- 108010025832 RANK Ligand Proteins 0.000 description 1
- 101150020518 RHEB gene Proteins 0.000 description 1
- 101150111584 RHOA gene Proteins 0.000 description 1
- 206010038389 Renal cancer Diseases 0.000 description 1
- 230000018199 S phase Effects 0.000 description 1
- 102000001332 SRC Human genes 0.000 description 1
- 108060006706 SRC Proteins 0.000 description 1
- 208000004337 Salivary Gland Neoplasms Diseases 0.000 description 1
- 206010061934 Salivary gland cancer Diseases 0.000 description 1
- 238000012300 Sequence Analysis Methods 0.000 description 1
- 102100029437 Serine/threonine-protein kinase A-Raf Human genes 0.000 description 1
- 208000009359 Sezary Syndrome Diseases 0.000 description 1
- 208000021388 Sezary disease Diseases 0.000 description 1
- 102000039471 Small Nuclear RNA Human genes 0.000 description 1
- 108020004688 Small Nuclear RNA Proteins 0.000 description 1
- 102000013380 Smoothened Receptor Human genes 0.000 description 1
- 101710090597 Smoothened homolog Proteins 0.000 description 1
- 208000031673 T-Cell Cutaneous Lymphoma Diseases 0.000 description 1
- CBPNZQVSJQDFBE-FUXHJELOSA-N Temsirolimus Chemical compound C1C[C@@H](OC(=O)C(C)(CO)CO)[C@H](OC)C[C@@H]1C[C@@H](C)[C@H]1OC(=O)[C@@H]2CCCCN2C(=O)C(=O)[C@](O)(O2)[C@H](C)CC[C@H]2C[C@H](OC)/C(C)=C/C=C/C=C/[C@@H](C)C[C@@H](C)C(=O)[C@H](OC)[C@H](O)/C(C)=C/[C@@H](C)C(=O)C1 CBPNZQVSJQDFBE-FUXHJELOSA-N 0.000 description 1
- 208000024313 Testicular Neoplasms Diseases 0.000 description 1
- 206010057644 Testis cancer Diseases 0.000 description 1
- 102100034196 Thrombopoietin receptor Human genes 0.000 description 1
- 102100021386 Trans-acting T-cell-specific transcription factor GATA-3 Human genes 0.000 description 1
- 102100022387 Transforming protein RhoA Human genes 0.000 description 1
- 206010066901 Treatment failure Diseases 0.000 description 1
- 208000026911 Tuberous sclerosis complex Diseases 0.000 description 1
- 102000044209 Tumor Suppressor Genes Human genes 0.000 description 1
- 108700025716 Tumor Suppressor Genes Proteins 0.000 description 1
- 102100033438 Tyrosine-protein kinase JAK1 Human genes 0.000 description 1
- 208000015778 Undifferentiated pleomorphic sarcoma Diseases 0.000 description 1
- 206010046431 Urethral cancer Diseases 0.000 description 1
- 206010046458 Urethral neoplasms Diseases 0.000 description 1
- 208000006105 Uterine Cervical Neoplasms Diseases 0.000 description 1
- 208000002495 Uterine Neoplasms Diseases 0.000 description 1
- 201000005969 Uveal melanoma Diseases 0.000 description 1
- 102000005789 Vascular Endothelial Growth Factors Human genes 0.000 description 1
- 108010019530 Vascular Endothelial Growth Factors Proteins 0.000 description 1
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 1
- 102100029477 Vitamin K-dependent protein C Human genes 0.000 description 1
- 206010047741 Vulval cancer Diseases 0.000 description 1
- 208000004354 Vulvar Neoplasms Diseases 0.000 description 1
- 208000016025 Waldenstroem macroglobulinemia Diseases 0.000 description 1
- 208000033559 Waldenström macroglobulinemia Diseases 0.000 description 1
- 208000008383 Wilms tumor Diseases 0.000 description 1
- 238000009825 accumulation Methods 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 208000020990 adrenal cortex carcinoma Diseases 0.000 description 1
- 208000007128 adrenocortical carcinoma Diseases 0.000 description 1
- 229960001686 afatinib Drugs 0.000 description 1
- ULXXDDBFHOBEHA-CWDCEQMOSA-N afatinib Chemical compound N1=CN=C2C=C(O[C@@H]3COCC3)C(NC(=O)/C=C/CN(C)C)=CC2=C1NC1=CC=C(F)C(Cl)=C1 ULXXDDBFHOBEHA-CWDCEQMOSA-N 0.000 description 1
- 229940042992 afinitor Drugs 0.000 description 1
- 238000013019 agitation Methods 0.000 description 1
- 230000000735 allogeneic effect Effects 0.000 description 1
- 238000010171 animal model Methods 0.000 description 1
- 238000011123 anti-EGFR therapy Methods 0.000 description 1
- 238000011394 anticancer treatment Methods 0.000 description 1
- 239000000427 antigen Substances 0.000 description 1
- 108091007433 antigens Proteins 0.000 description 1
- 102000036639 antigens Human genes 0.000 description 1
- 239000002246 antineoplastic agent Substances 0.000 description 1
- 229940041181 antineoplastic drug Drugs 0.000 description 1
- 201000011165 anus cancer Diseases 0.000 description 1
- 229940120638 avastin Drugs 0.000 description 1
- 229960003005 axitinib Drugs 0.000 description 1
- 210000003719 b-lymphocyte Anatomy 0.000 description 1
- 229910052788 barium Inorganic materials 0.000 description 1
- DSAJWYNOEDNPEQ-UHFFFAOYSA-N barium atom Chemical compound [Ba] DSAJWYNOEDNPEQ-UHFFFAOYSA-N 0.000 description 1
- 230000037429 base substitution Effects 0.000 description 1
- 230000006399 behavior Effects 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 208000001119 benign fibrous histiocytoma Diseases 0.000 description 1
- 229960000397 bevacizumab Drugs 0.000 description 1
- 208000026900 bile duct neoplasm Diseases 0.000 description 1
- 238000003766 bioinformatics method Methods 0.000 description 1
- 229960000074 biopharmaceutical Drugs 0.000 description 1
- 229920001222 biopolymer Polymers 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 238000004820 blood count Methods 0.000 description 1
- 201000011143 bone giant cell tumor Diseases 0.000 description 1
- 210000001185 bone marrow Anatomy 0.000 description 1
- 238000010322 bone marrow transplantation Methods 0.000 description 1
- 238000007469 bone scintigraphy Methods 0.000 description 1
- 229940083476 bosulif Drugs 0.000 description 1
- 229960003736 bosutinib Drugs 0.000 description 1
- 210000000481 breast Anatomy 0.000 description 1
- 229960001292 cabozantinib Drugs 0.000 description 1
- ONIQOQHATWINJY-UHFFFAOYSA-N cabozantinib Chemical compound C=12C=C(OC)C(OC)=CC2=NC=CC=1OC(C=C1)=CC=C1NC(=O)C1(C(=O)NC=2C=CC(F)=CC=2)CC1 ONIQOQHATWINJY-UHFFFAOYSA-N 0.000 description 1
- HFCFMRYTXDINDK-WNQIDUERSA-N cabozantinib malate Chemical compound OC(=O)[C@@H](O)CC(O)=O.C=12C=C(OC)C(OC)=CC2=NC=CC=1OC(C=C1)=CC=C1NC(=O)C1(C(=O)NC=2C=CC(F)=CC=2)CC1 HFCFMRYTXDINDK-WNQIDUERSA-N 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 229940056434 caprelsa Drugs 0.000 description 1
- 208000002458 carcinoid tumor Diseases 0.000 description 1
- 230000022131 cell cycle Effects 0.000 description 1
- 230000030833 cell death Effects 0.000 description 1
- 230000010261 cell growth Effects 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 210000001175 cerebrospinal fluid Anatomy 0.000 description 1
- 201000010881 cervical cancer Diseases 0.000 description 1
- 210000003756 cervix mucus Anatomy 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 208000006990 cholangiocarcinoma Diseases 0.000 description 1
- 238000007635 classification algorithm Methods 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- 229940121657 clinical drug Drugs 0.000 description 1
- 238000011281 clinical therapy Methods 0.000 description 1
- 208000029742 colonic neoplasm Diseases 0.000 description 1
- 238000002648 combination therapy Methods 0.000 description 1
- 229940034568 cometriq Drugs 0.000 description 1
- 238000002591 computed tomography Methods 0.000 description 1
- 239000000356 contaminant Substances 0.000 description 1
- 201000007241 cutaneous T cell lymphoma Diseases 0.000 description 1
- 238000002574 cystoscopy Methods 0.000 description 1
- 229960002448 dasatinib Drugs 0.000 description 1
- 238000003066 decision tree Methods 0.000 description 1
- 230000006735 deficit Effects 0.000 description 1
- 229960001251 denosumab Drugs 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000002059 diagnostic imaging Methods 0.000 description 1
- 239000005546 dideoxynucleotide Substances 0.000 description 1
- 230000037213 diet Effects 0.000 description 1
- 235000005911 diet Nutrition 0.000 description 1
- 235000014113 dietary fatty acids Nutrition 0.000 description 1
- 230000004069 differentiation Effects 0.000 description 1
- 230000029087 digestion Effects 0.000 description 1
- 208000022602 disease susceptibility Diseases 0.000 description 1
- 208000028715 ductal breast carcinoma in situ Diseases 0.000 description 1
- 201000007273 ductal carcinoma in situ Diseases 0.000 description 1
- 230000008030 elimination Effects 0.000 description 1
- 238000003379 elimination reaction Methods 0.000 description 1
- 238000001839 endoscopy Methods 0.000 description 1
- 108700021358 erbB-1 Genes Proteins 0.000 description 1
- 229940082789 erbitux Drugs 0.000 description 1
- 108010038795 estrogen receptors Proteins 0.000 description 1
- 229960005167 everolimus Drugs 0.000 description 1
- 229960000255 exemestane Drugs 0.000 description 1
- 238000010195 expression analysis Methods 0.000 description 1
- 208000024519 eye neoplasm Diseases 0.000 description 1
- 229930195729 fatty acid Natural products 0.000 description 1
- 239000000194 fatty acid Substances 0.000 description 1
- 230000001605 fetal effect Effects 0.000 description 1
- 239000007850 fluorescent dye Substances 0.000 description 1
- 201000010175 gallbladder cancer Diseases 0.000 description 1
- 238000012252 genetic analysis Methods 0.000 description 1
- 208000016361 genetic disease Diseases 0.000 description 1
- 210000002980 germ line cell Anatomy 0.000 description 1
- 229940087158 gilotrif Drugs 0.000 description 1
- 229940080856 gleevec Drugs 0.000 description 1
- 150000004676 glycans Chemical class 0.000 description 1
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 1
- 201000009277 hairy cell leukemia Diseases 0.000 description 1
- 201000010235 heart cancer Diseases 0.000 description 1
- 208000024348 heart neoplasm Diseases 0.000 description 1
- 201000005787 hematologic cancer Diseases 0.000 description 1
- 208000024200 hematopoietic and lymphoid system neoplasm Diseases 0.000 description 1
- 208000006454 hepatitis Diseases 0.000 description 1
- 231100000283 hepatitis Toxicity 0.000 description 1
- 206010073071 hepatocellular carcinoma Diseases 0.000 description 1
- 231100000844 hepatocellular carcinoma Toxicity 0.000 description 1
- 229940022353 herceptin Drugs 0.000 description 1
- 238000013537 high throughput screening Methods 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 229940088597 hormone Drugs 0.000 description 1
- 201000006866 hypopharynx cancer Diseases 0.000 description 1
- 229960001507 ibrutinib Drugs 0.000 description 1
- XYFPWWZEPKGCCK-GOSISDBHSA-N ibrutinib Chemical compound C1=2C(N)=NC=NC=2N([C@H]2CN(CCC2)C(=O)C=C)N=C1C(C=C1)=CC=C1OC1=CC=CC=C1 XYFPWWZEPKGCCK-GOSISDBHSA-N 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 230000001900 immune effect Effects 0.000 description 1
- 238000007901 in situ hybridization Methods 0.000 description 1
- 238000012296 in situ hybridization assay Methods 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 239000003112 inhibitor Substances 0.000 description 1
- 238000011221 initial treatment Methods 0.000 description 1
- 229940005319 inlyta Drugs 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 201000002313 intestinal cancer Diseases 0.000 description 1
- 238000001990 intravenous administration Methods 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 229940045773 jakafi Drugs 0.000 description 1
- 201000010982 kidney cancer Diseases 0.000 description 1
- 229960004891 lapatinib Drugs 0.000 description 1
- 206010023841 laryngeal neoplasm Diseases 0.000 description 1
- 208000012987 lip and oral cavity carcinoma Diseases 0.000 description 1
- 201000006721 lip cancer Diseases 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- 239000004973 liquid crystal related substance Substances 0.000 description 1
- 210000004185 liver Anatomy 0.000 description 1
- 208000014018 liver neoplasm Diseases 0.000 description 1
- 238000002624 low-dose chemotherapy Methods 0.000 description 1
- 238000009593 lumbar puncture Methods 0.000 description 1
- 230000001926 lymphatic effect Effects 0.000 description 1
- 238000002595 magnetic resonance imaging Methods 0.000 description 1
- 230000036210 malignancy Effects 0.000 description 1
- 208000026045 malignant tumor of parathyroid gland Diseases 0.000 description 1
- 238000004949 mass spectrometry Methods 0.000 description 1
- 230000008774 maternal effect Effects 0.000 description 1
- 238000002483 medication Methods 0.000 description 1
- 229940083118 mekinist Drugs 0.000 description 1
- 229960005558 mertansine Drugs 0.000 description 1
- 108020004999 messenger RNA Proteins 0.000 description 1
- 230000002503 metabolic effect Effects 0.000 description 1
- 108091070501 miRNA Proteins 0.000 description 1
- 239000002679 microRNA Substances 0.000 description 1
- 235000013336 milk Nutrition 0.000 description 1
- 210000004080 milk Anatomy 0.000 description 1
- 239000008267 milk Substances 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000003990 molecular pathway Effects 0.000 description 1
- 231100000150 mutagenicity / genotoxicity testing Toxicity 0.000 description 1
- 206010028537 myelofibrosis Diseases 0.000 description 1
- LBWFXVZLPYTWQI-IPOVEDGCSA-N n-[2-(diethylamino)ethyl]-5-[(z)-(5-fluoro-2-oxo-1h-indol-3-ylidene)methyl]-2,4-dimethyl-1h-pyrrole-3-carboxamide;(2s)-2-hydroxybutanedioic acid Chemical compound OC(=O)[C@@H](O)CC(O)=O.CCN(CC)CCNC(=O)C1=C(C)NC(\C=C/2C3=CC(F)=CC=C3NC\2=O)=C1C LBWFXVZLPYTWQI-IPOVEDGCSA-N 0.000 description 1
- 201000008026 nephroblastoma Diseases 0.000 description 1
- 229960001346 nilotinib Drugs 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 239000002417 nutraceutical Substances 0.000 description 1
- 235000021436 nutraceutical agent Nutrition 0.000 description 1
- 201000008106 ocular cancer Diseases 0.000 description 1
- 201000002575 ocular melanoma Diseases 0.000 description 1
- 239000003921 oil Substances 0.000 description 1
- 239000002751 oligonucleotide probe Substances 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 201000005443 oral cavity cancer Diseases 0.000 description 1
- 150000002894 organic compounds Chemical class 0.000 description 1
- 201000006958 oropharynx cancer Diseases 0.000 description 1
- 208000003154 papilloma Diseases 0.000 description 1
- 208000029211 papillomatosis Diseases 0.000 description 1
- 208000007312 paraganglioma Diseases 0.000 description 1
- 201000007052 paranasal sinus cancer Diseases 0.000 description 1
- 238000010238 partial least squares regression Methods 0.000 description 1
- 230000001717 pathogenic effect Effects 0.000 description 1
- 230000007170 pathology Effects 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6813—Hybridisation assays
- C12Q1/6827—Hybridisation assays for detection of mutation or polymorphism
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/118—Prognosis of disease development
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
Definitions
- Cancer refers to any disorder of various malignant neoplasms characterized by the proliferation of anaplastic cells that tend to invade surrounding tissue and metastasize to new body sites and the pathological conditions characterized by such growths.
- Cancers can accumulate genetic variants through, e.g., somatic cell mutation. Such variants include, for example, sequence variants and copy number variants. Analysis of tumors has indicated that different cells in a tumor can bear different genetic variants. Such differentiation between tumor cells has been referred to as tumor heterogeneity.
- Cancers can evolve over time, becoming resistant to a therapeutic intervention. Certain variants are known to correlate with responsiveness or resistance to specific therapeutic interventions. More effective treatments for cancers exhibiting tumor heterogeneity would be beneficial. Such cancers may be treated with a second, different, therapeutic intervention to which the cancer responds.
- DNA sequencing methods allow detection of genetic variants in DNA from tumor cells. Cancer tumors continually shed their unique genomic material into the bloodstream. Unfortunately, these telltale genomic “signals” are so weak that current genomic analysis technologies, including next-generation sequencing, may only detect such signals sporadically or in patients with terminally high tumor burden. The main reason for this is that such technologies are plagued by error rates and bias that can be orders of magnitude higher than what is required to reliably detect de novo genomic alterations associated with cancer.
- a method comprising: (a) sequencing polynucleotides from cancer cells from a biological sample of a subject; (b) identifying and quantifying somatic mutations in the polynucleotides; (c) developing a profile of tumor heterogeneity in the subject indicating the presence and relative quantity of a plurality of the somatic mutations in the polynucleotides, wherein different relative quantities indicate tumor heterogeneity; and (d) determining a therapeutic intervention for a cancer exhibiting the tumor heterogeneity, wherein the therapeutic intervention is effective against a cancer having the profile of tumor heterogeneity determined.
- the cancer cells are spatially distinct.
- the therapeutic intervention is more effective against a cancer presenting with the plurality of somatic mutations than it is against a cancer presenting with any one, but not all, of the somatic mutations.
- the method further comprises: (e) monitoring changes in tumor heterogeneity in the subject over time and determining different therapeutic interventions over time based on the changes.
- the method further comprises: (e) displaying the therapeutic intervention.
- the method further comprises: (e) implementing the therapeutic intervention.
- the method further comprises: (e) generating a phylogeny of tumor evolution based on the tumor profile; wherein determining the therapeutic intervention takes into account the phylogeny.
- determining is performed with the aid of a computer-executed algorithm.
- sequence reads generated by sequencing are subject to noise reduction before identifying and quantifying.
- noise reduction comprises molecular tracking of sequences generated from a single polynucleotide in the sample.
- determining a therapeutic intervention takes into account the relative frequencies of the tumor-related genetic alterations.
- the therapeutic intervention comprises administering, in combination or in series, a plurality of drugs, wherein each drug is relatively more effective against a cancer presenting with a different one of the somatic mutations, which occur at different relative frequencies.
- a drug that is relatively more effective against a cancer presenting with a somatic mutation occurring at higher relative frequency is administered in higher amount.
- the drugs are delivered at doses that are stratified to reflect the relative amounts of the variants in the DNA.
- a cancer presenting with at least one of the genetic variants is resistant to at least one of the drugs.
- determining a therapeutic intervention takes into account the tissue of origin of the cancer. In some embodiments, the therapeutic intervention is determined based on a database of interventions shown to be therapeutic for cancers having tumor heterogeneity characterized by each of the somatic mutations.
- the polynucleotides comprise cfDNA from a blood sample. In some embodiments, the polynucleotides comprise polynucleotides from spatially distinct cancer cells. In some embodiments, the polynucleotides comprise polynucleotides from different metastatic tumor sites. In some embodiments, the polynucleotides comprise polynucleotides from a solid tumor or a diffuse tumor. In some embodiments, the polynucleotides are comprised in a blood sample or in a solid tumor biopsy.
- identifying comprises generating a plurality of sequence reads for parent polynucleotides from the sample, and collapsing the sequence reads to generate consensus calls for bases in each parent polynucleotide.
- quantifying comprises determining frequency at which the somatic mutations are detected in the population of polynucleotides from the biological sample.
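As a rough illustration of the collapsing and quantifying steps described above, the following Python sketch groups sequence reads by a molecular tag, takes a per-position majority vote to form one consensus per parent polynucleotide, and then computes the fraction of consensus molecules carrying a non-reference base. The tag-based grouping, the majority-vote rule and the names `collapse_reads` and `variant_frequency` are assumptions made for illustration; the source does not prescribe a particular algorithm.

```python
from collections import Counter, defaultdict


def collapse_reads(reads):
    """Collapse reads into one consensus sequence per parent molecule.

    `reads` is an iterable of (molecule_tag, sequence) pairs. The tag is a
    hypothetical identifier linking reads to their parent polynucleotide.
    """
    families = defaultdict(list)
    for tag, sequence in reads:
        families[tag].append(sequence)

    consensus = {}
    for tag, sequences in families.items():
        # Only positions covered by every read in the family are called.
        length = min(len(s) for s in sequences)
        consensus[tag] = "".join(
            Counter(s[i] for s in sequences).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


def variant_frequency(consensus, position, reference_base):
    """Fraction of consensus molecules whose base at `position` is non-reference."""
    calls = [seq[position] for seq in consensus.values() if len(seq) > position]
    if not calls:
        return 0.0
    return sum(1 for base in calls if base != reference_base) / len(calls)


reads = [
    ("m1", "ACGT"), ("m1", "ACGT"), ("m1", "ACTT"),  # one read carries an error
    ("m2", "ACTT"), ("m2", "ACTT"),                  # molecule with a real variant
]
consensus = collapse_reads(reads)
frequency = variant_frequency(consensus, position=2, reference_base="G")
```

In this toy input the single discordant read in family `m1` is outvoted, so only molecule `m2` contributes to the variant frequency.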
- the biological sample comprises biological molecules from non-disease cells.
- the biological sample comprises biological molecules from a plurality of different tissues.
- the biomolecules are comprised in one biological sample.
- the biomolecules are comprised in a plurality of biological samples.
- the plurality of biological samples are tumors from a plurality of metastases.
- sequencing comprises sequencing all or part of a subset of genes in the subject's genome.
- the somatic mutations are selected from single nucleotide variations (SNVs), insertions, deletions, inversions, transversions, translocations, copy number variations (CNVs) (e.g., aneuploidy, partial aneuploidy, polyploidy), chromosomal instability, chromosomal structure alterations, gene fusions, chromosome fusions, gene truncations, gene amplification, gene duplications, chromosomal lesions, DNA lesions, abnormal changes in nucleic acid chemical modifications, abnormal changes in epigenetic patterns and abnormal changes in nucleic acid methylation.
- genetic loci are selected from single nucleotides, genes and chromosomes.
- the cancer is selected from carcinomas, sarcomas, leukemias, lymphomas, myelomas and central nervous system cancers (e.g., breast cancer, prostate cancer, colorectal cancer, brain cancer, esophageal cancer, head and neck cancer, bladder cancer, gynecological cancer, liposarcoma, and multiple myeloma).
- cancer cells of the tumor are derived from a common parent disease cell.
- cancer cells of the tumor are derived from different parent cancer cells of the same or different cancer type.
- the method further comprises determining a measure of the somatic mutations relative to one or more control references to determine the relative quantity.
- the polynucleotides are sourced from both circulating cancer polynucleotides and from a solid tumor biopsy. In some embodiments, profiles are separately developed for polynucleotides sourced from the circulating cancer polynucleotides and from the solid tumor biopsy.
- a method comprising providing a therapeutic intervention for a subject having a cancer having a tumor profile from which tumor heterogeneity can be inferred, wherein the therapeutic intervention is effective against cancers with the tumor profile.
- the tumor profile indicates the relative frequency of a plurality of somatic mutations.
- the method further comprises monitoring changes in the relative frequencies in the subject over time and determining different therapeutic interventions over time based on the changes.
- the therapeutic intervention is more effective against a cancer presenting with each of the somatic mutations than it is against a cancer presenting with any one, but not all, of the somatic mutations.
- the therapeutic intervention comprises administering, in combination or in series, a plurality of drugs, wherein each drug is relatively more effective against a cancer presenting with a different one of the somatic mutations, which occur at different relative frequencies.
- a drug that is relatively more effective against a cancer presenting with a somatic mutation occurring at higher relative frequency is administered in higher amount.
- the drugs are delivered at doses that are stratified to reflect the relative amounts of the variants in the DNA.
- a cancer presenting with at least one of the genetic variants is resistant to at least one of the drugs.
- the cancer is selected from carcinomas, sarcomas, leukemias, lymphomas, myelomas and central nervous system cancers (e.g., breast cancer, prostate cancer, colorectal cancer, brain cancer, esophageal cancer, head and neck cancer, bladder cancer, gynecological cancer, liposarcoma, and multiple myeloma).
- a method comprising administering to a subject a therapeutic intervention that is effective against a tumor exhibiting tumor heterogeneity, wherein the therapeutic intervention is based on a profile of tumor heterogeneity in the subject indicating the presence and relative quantity of a plurality of the somatic mutations in the polynucleotides, wherein different relative quantities indicate tumor heterogeneity.
- a system comprising a computer readable medium comprising machine-executable code that, upon execution by a computer processor, implements a method comprising: (a) receiving into memory sequence reads of polynucleotides mapping to a genetic locus; (b) determining, among said sequence reads, the identity of bases that differ from a base of a reference sequence at the locus, out of the total number of sequence reads mapping to the locus; (c) reporting the identity and relative quantity of the determined bases and their location in the genome; and (d) inferring heterogeneity of a given sample based on information in (c).
- the method implemented further comprises receiving into memory sequence reads derived from samples at a plurality of different times and calculating a difference in relative amount and identity of a plurality of bases between the two samples.
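The per-locus reporting and the comparison between time points can be sketched as follows. This is a minimal Python illustration, assuming the bases observed at one locus have already been extracted from the mapped reads; the function names `non_reference_fractions` and `change_between_samples` are hypothetical and not taken from the source.

```python
from collections import Counter


def non_reference_fractions(bases_at_locus, reference_base):
    """Relative quantity of each non-reference base among all reads at a locus."""
    total = len(bases_at_locus)
    if total == 0:
        return {}
    counts = Counter(b for b in bases_at_locus if b != reference_base)
    return {base: n / total for base, n in counts.items()}


def change_between_samples(earlier, later):
    """Difference in relative quantity (later minus earlier) for each base seen."""
    return {
        base: later.get(base, 0.0) - earlier.get(base, 0.0)
        for base in set(earlier) | set(later)
    }


# Toy data: bases observed at one locus (reference base "A") at two time points.
first = non_reference_fractions(list("AAAG"), "A")
second = non_reference_fractions(list("AGGT"), "A")
delta = change_between_samples(first, second)
```

Here a base that is absent at one time point is treated as having relative quantity 0.0, so a newly appearing base shows up as a positive change.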
- kits comprising a first pharmaceutical drug and a second pharmaceutical drug, wherein a combination of the first drug and the second drug is more therapeutically effective against a cancer presenting with a first and a second somatic mutation than it is against a cancer presenting with any one, but not all, of the somatic mutations.
- the combination is contained in a mixture or each drug is contained in a separate container.
- a method comprising: (a) performing biomolecular analysis of biomolecular polymers from disease cells (e.g., spatially distinct disease cells) from a subject; (b) identifying and quantifying biomolecular variants in the biomolecular macromolecules; (c) developing a profile of disease cell heterogeneity in the subject indicating the presence and relative quantity of a plurality of the variants in the biomolecular macromolecules, wherein different relative quantities indicate disease cell heterogeneity; and (d) determining a therapeutic intervention for a disease exhibiting the disease cell heterogeneity, wherein the therapeutic intervention is effective against a disease having the profile of disease cell heterogeneity determined.
- the disease cells are spatially distinct disease cells.
- the therapeutic intervention is determined based on a database of interventions shown to be therapeutic for cancers having tumor heterogeneity characterized by each of the somatic mutations.
- a method of detecting disease cell heterogeneity in a subject comprising: a) quantifying polynucleotides that bear a sequence variant at each of a plurality of genetic loci in polynucleotides from a sample from the subject, wherein the sample comprises polynucleotides from somatic cells and from disease cells; b) determining for each locus a measure of copy number variation (CNV) for polynucleotides bearing the sequence variant; c) determining for each locus a weighted measure of quantity of polynucleotides bearing a sequence variant at the locus as a function of CNV at the locus; and d) comparing the weighted measures at each of the plurality of loci, wherein different weighted measures indicate disease cell heterogeneity.
- the disease cells are tumor cells.
- polynucleotides comprise cfDNA.
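The source does not specify how the weighted measure depends on CNV. One possible reading, shown here purely as an assumption, divides the variant fraction at each locus by the locus copy number relative to a normal diploid copy number, and flags heterogeneity when the weighted measures differ by more than a tolerance. The names `weighted_variant_measure` and `indicates_heterogeneity`, the diploid baseline and the tolerance value are all hypothetical.

```python
def weighted_variant_measure(variant_fraction, copy_number, normal_copy_number=2.0):
    """Variant fraction adjusted for copy number at the locus (assumed weighting)."""
    if copy_number <= 0:
        raise ValueError("copy_number must be positive")
    return variant_fraction / (copy_number / normal_copy_number)


def indicates_heterogeneity(weighted_measures, tolerance=0.05):
    """True when weighted measures across loci differ by more than `tolerance`."""
    values = list(weighted_measures.values())
    return max(values) - min(values) > tolerance


# Locus A is amplified (4 copies), so its raw fraction of 0.2 is halved.
amplified = {
    "locus_A": weighted_variant_measure(0.2, 4.0),
    "locus_B": weighted_variant_measure(0.1, 2.0),
}
# Both loci are diploid but carry variants at clearly different fractions.
mixed = {
    "locus_A": weighted_variant_measure(0.2, 2.0),
    "locus_B": weighted_variant_measure(0.05, 2.0),
}
```

Under this weighting, the first profile is consistent with a single clone carrying an amplification, while the second suggests variants present in different proportions of cells.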
- a method comprising: a) subjecting a subject to one or more pulsed therapy cycles, each pulsed therapy cycle comprising: (i) a first period during which one or more drugs is administered at a first amount and (ii) a second period during which the one or more drugs is administered at a second, reduced (e.g., completely not administered) amount; wherein: (A) the first period is characterized by a tumor burden detected above a first clinical level; and (B) the second period is characterized by a tumor burden detected below a second clinical level.
- tumor burden is measured as a function of a quantity of a selected somatic variant in tumor polynucleotides.
- one or more drugs is a plurality of drugs and each amount of each drug in each cycle is determined as a function of tumor burden measured as a function of a quantity of each of a plurality of different selected somatic variants in tumor polynucleotides.
- the method comprises subjecting the subject to a plurality of pulsed therapy cycles.
- the method further comprises: b) when the subject exhibits resistance to the one or more drugs, subjecting the subject to one or more pulsed therapy cycles, each pulsed therapy cycle comprising: (i) a first period during which a different one or more drugs is administered at a first amount and (ii) a second period during which the different one or more drugs is administered at a second, reduced (e.g., completely not administered) amount; wherein: (A) the first period is characterized by a tumor burden detected above a first clinical level; and (B) the second period is characterized by a tumor burden detected below a second clinical level.
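The two clinical levels in a pulsed therapy cycle behave like a two-threshold switch. The sketch below labels each burden measurement with the period it would fall in, assuming a simple rule: the first (full amount) period starts when the burden rises above the first level, the second (reduced amount) period starts when it falls below the second level, and the current period continues otherwise. This switching rule, the function name `assign_periods` and the numeric levels are illustrative assumptions only, not a dosing protocol from the source.

```python
def assign_periods(burdens, first_level, second_level, start="reduced"):
    """Label each tumor burden measurement with a hypothetical therapy period.

    "full"    : period in which the drug is given at the first amount
    "reduced" : period in which the drug is given at the second, reduced amount
    """
    if second_level > first_level:
        raise ValueError("second_level is expected not to exceed first_level")
    period = start
    periods = []
    for burden in burdens:
        if burden > first_level:
            period = "full"
        elif burden < second_level:
            period = "reduced"
        # Between the two levels the current period simply continues.
        periods.append(period)
    return periods


# Burden values in arbitrary units, e.g. a quantity of a selected somatic variant.
schedule = assign_periods(
    [0.5, 2.0, 1.2, 0.3, 0.8, 1.6], first_level=1.5, second_level=0.5
)
```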
- a method comprising: (a) sequencing polynucleotides from cancer cells from a subject; (b) identifying and quantifying somatic mutations in the polynucleotides; and (c) developing a profile of tumor heterogeneity in the subject for use in determining a therapeutic intervention effective for a cancer exhibiting tumor heterogeneity, wherein the profile indicates the presence and relative quantity of a plurality of the somatic mutations in the polynucleotides, wherein different relative quantities indicate tumor heterogeneity.
- a method comprising providing a therapeutic intervention for a subject wherein the therapeutic intervention is determined from a profile of disease cell heterogeneity in the subject, wherein the profile indicates the presence and relative quantity of a plurality of the somatic mutations in the polynucleotides, wherein different relative quantities indicate disease cell heterogeneity; and wherein the therapeutic intervention is effective against a disease having the profile of disease cell heterogeneity determined, e.g., more effective against a disease presenting with the plurality of somatic mutations than it is against a disease presenting with any one, but not all, of the somatic mutations.
- a method comprising: a) determining a measure of deviation from a value of central tendency (e.g., standard deviation, variance) of copy number in polynucleotides in a sample across a region of at least 1 kb, at least 10 kb, at least 100 kb, at least 1 Mb, at least 10 Mb or at least 100 Mb of a genome; and b) inferring a measure of burden of DNA from cells undergoing cell division in the sample based on the measure of deviation.
- the value of central tendency is mean, median or mode.
- determining comprises partitioning the region into a plurality of non-overlapping intervals, determining a measure of copy number at each interval and determining the measure of deviation based on measures of copy number at each interval.
- the interval is no more than any of 1 base, 10 bases, 100 bases, 1 kb or 10 kb.
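The partition-and-deviation step can be sketched as below. The example assumes that copy number at each interval is approximated by counting fragment start positions that fall in the interval, and it uses the population standard deviation as the measure of deviation from the mean. The counting proxy and the names `interval_counts` and `copy_number_deviation` are assumptions for illustration.

```python
from statistics import pstdev


def interval_counts(fragment_starts, region_start, region_end, interval_size):
    """Count fragments per non-overlapping interval across [region_start, region_end)."""
    n_intervals = (region_end - region_start + interval_size - 1) // interval_size
    counts = [0] * n_intervals
    for position in fragment_starts:
        if region_start <= position < region_end:
            counts[(position - region_start) // interval_size] += 1
    return counts


def copy_number_deviation(counts):
    """Standard deviation of per-interval counts around their mean."""
    return pstdev(counts)


# Toy region of 40 bases split into four 10-base intervals.
counts = interval_counts([0, 5, 10, 15, 25, 35], 0, 40, 10)
deviation = copy_number_deviation(counts)
```

How the deviation maps onto a burden of DNA from dividing cells is not stated in the source, so the sketch stops at the measure of deviation.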
- a method of inferring a measure of burden of DNA from cells undergoing cell division in a sample comprising measuring copy number variation induced by proximity of one or more genomic loci to cells' origins of replication, wherein increased CNV indicates cells undergoing cell division.
- the burden is measured in cell-free DNA.
- the measure of burden relates to the fraction of tumor cells or genome-equivalents of DNA from tumor cells in the sample.
- CNV due to proximity to origins of replication is inferred from a set of control samples or cell-lines.
- a hidden Markov model, regression model, principal component analysis-based model, or genotype-modified model is used to approximate variations due to origins of replication.
- the measure of burden is presence or absence of cells undergoing cell division. In some embodiments, proximity is within 1 kb of an origin of replication.
- the method comprises measuring CNV at a locus, determining amount of CNV due to proximity of the locus to an origin of replication, and correcting the measured CNV to reflect genomic CNV, e.g., by subtracting amount of CNV attributable to cell division.
- the genomic data is obtained from cell-free DNA.
- the measure of burden relates to the fraction of tumor cells or genome-equivalents of DNA in a sample.
- variations due to origins of replication are inferred from a set of control samples or cell-lines.
- a hidden Markov model, regression model, principal component analysis-based model, or genotype-modified model is used to approximate variations due to origins of replication.
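The correction by subtraction mentioned above can be illustrated with a deliberately simple estimator. The sketch assumes that copy number at a locus is expressed as a coverage ratio with an expected baseline of 1.0, and that the replication-associated component is estimated as the mean excess over that baseline in control samples. This is a stand-in for the models listed above, not an implementation of any of them; `replication_component` and `corrected_cnv` are hypothetical names.

```python
from statistics import mean


def replication_component(control_ratios, baseline=1.0):
    """Mean excess coverage at a locus in controls, attributed to replication."""
    return mean(control_ratios) - baseline


def corrected_cnv(measured_ratio, control_ratios, baseline=1.0):
    """Measured coverage ratio minus the replication-associated excess."""
    return measured_ratio - replication_component(control_ratios, baseline)


# Controls show 25% excess coverage near an origin of replication at this locus.
corrected = corrected_cnv(1.5, [1.25, 1.25])
```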
- a method comprising: a) determining a baseline measure of copies of DNA molecules at one or more loci from one or more control samples, each containing DNA from cells undergoing a predetermined level of cell division, wherein one or more of the loci includes an origin of replication; b) determining a test measure of DNA molecules in a test sample, wherein the measure in the test sample is from one or more loci partitioned into one or more partitions and wherein one or more of the loci includes an origin of replication; and c) comparing the test measure and the baseline measure, wherein a test measure above the baseline measure indicates DNA in the test sample from cells dividing at a rate faster than cells providing DNA to the control sample.
- the measure is selected from molecule count, a measure of central tendency of molecule count across partitions or a measure of variation of molecule count across partitions.
- a method comprising: (a) administering to a subject an intervention that increases an amount of tumor-derived DNA in the subject's circulation; and (b) when said amount is increased, collecting from the subject a sample containing tumor-derived DNA.
- the intervention preferentially kills tumor cells.
- the intervention comprises exposing the subject or suspected diseased areas of the subject to radiation.
- the intervention comprises exposing the subject or suspected diseased areas of the subject to ultrasound.
- the intervention comprises exposing the subject or suspected diseased areas of the subject to physical agitation.
- the intervention comprises administering to the subject a low dose of chemotherapy.
- the method comprises administering the intervention to the subject within 1 week before collecting the sample.
- the sample is selected from blood, plasma, serum, urine, saliva, cerebrospinal fluid, vaginal secretion, mucus and semen.
- a method comprising compiling a database, wherein the database includes, for each of a plurality of subjects having cancer, tumor genomic testing data, including somatic alterations, collected at two or more time intervals per subject, one or more therapeutic interventions administered to each of the subjects at one or more times and efficacy of the therapeutic interventions, wherein the database is useful to infer efficacy of the therapeutic interventions in subjects with a tumor genomic profile.
- the plurality is at least 50, at least 500 or at least 5000.
- the tumor genomic testing data is collected via serial biopsy, cell-free DNA, cell-free RNA or circulating tumor cells.
- relative frequencies of detected genetic variants are used to classify treatment efficacy.
- additional information is used to help classify treatment efficacy, including but not limited to, weight, adverse treatment effects, histological testing, blood testing, radiographic information, prior treatments, and cancer type.
- treatment response per patient is collected and classified quantitatively through additional testing.
- the additional testing is blood or urine based testing.
- a method comprising use of a database to identify one or more effective therapeutic interventions for a subject having cancer, wherein the database includes, for each of a plurality of subjects having cancer, tumor genomic testing data, including somatic alterations, collected at two or more time intervals per subject, one or more therapeutic interventions administered to each of the subjects at one or more times and efficacy of the therapeutic interventions.
- identified therapeutic interventions are stratified by efficacy.
- quantitative bounds on the predicted efficacy of the therapeutic interventions, or lack thereof, are reported.
- the therapeutic interventions use information of predicted tumor genomic evolution or acquired resistance mechanisms in similar patients in response to treatment.
- the method comprises classifying effectiveness of treatment using a classification algorithm, e.g., linear regression processes (e.g., multiple linear regression (MLR), partial least squares (PLS) regression and principal components regression (PCR)), binary decision trees (e.g., recursive partitioning processes such as CART, i.e., classification and regression trees), artificial neural networks such as back propagation networks, discriminant analyses (e.g., Bayesian classifier or Fisher analysis), logistic classifiers, and support vector classifiers (e.g., support vector machines).
- linear regression processes e.g., multiple linear regression (MLR), partial least squares (PLS) regression and principal components regression (PCR)
- binary decision trees e.g., recursive partitioning processes such as CART—classification and regression trees
- artificial neural networks such as back propagation networks
- discriminant analyses e.g., Bayesian classifier or Fischer analysis
- logistic classifiers e.g., support vector machines
- a method to report results of one or more genetic tests comprising: capturing genetic information including genetic variants and quantitative measures thereof over one or more test points using a genetic analyzer; normalizing the quantitative measures for rendering with the one or more test points and generating a scaling factor; applying the scaling factor to render a tumor response map; and generating a summary of genetic variants.
- the method comprises analyzing non-CNV (copy number variation) mutant allele frequencies.
- the method comprises transforming an absolute value into a relative metric for rendering the tumor response map.
- the method comprises multiplying a mutant allele frequency by a predetermined value and taking a log thereof.
- the method comprises: multiplying the scaling factor by a transformed value for each gene to determine a quantity indicator to be rendered on the tumor response map; and assigning a unique visual indicator for each alteration in a visual panel.
- the method comprises Y-centering or vertically centering the quantity indicator in a contiguously placed panel that indicates continuity.
- the assigning further comprises providing a unique color for each alteration.
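- The normalization and scaling steps above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the multiplier of 100, the base-10 log, the offset of 1 that keeps the log non-negative, and the choice of scaling the largest transformed value to the panel height are all assumptions, and the function names are hypothetical.

```python
import math


def transform(maf_percent, multiplier=100.0):
    """Turn an absolute mutant allele frequency into a relative metric.

    Assumption: multiply by a predetermined value, add 1 so the log is
    never negative, then take log10.
    """
    return math.log10(maf_percent * multiplier + 1.0)


def quantity_indicators(mafs, panel_height=100.0):
    """Return a rendered height per alteration.

    Assumption: the scaling factor maps the largest transformed value
    to the full panel height.
    """
    transformed = {name: transform(value) for name, value in mafs.items()}
    largest = max(transformed.values(), default=0.0)
    scale = panel_height / largest if largest > 0 else 0.0
    return {name: scale * t for name, t in transformed.items()}


heights = quantity_indicators({"high": 10.0, "low": 0.1})
```

- With a log transform, an alteration at 0.1% remains visible next to one at 10%, which a linear scale would reduce to a sliver.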
- the method comprises analyzing genetic information from another test point or test time. In some embodiments, wherein a new test result does not differ from a prior test result, the method comprises rendering the prior visual panel. In some embodiments, wherein if alterations remain the same, but quantities have changed, the method comprises: maintaining the order and unique visual indicator for each alteration; and determining a new quantity indicator and generating a new visual panel for all test points. In some embodiments, the method comprises determining a new alteration in the genetic information and adding the alteration to the top of existing alterations. In some embodiments, the method comprises determining a new alteration in the genetic information and determining new transform values and scaling factor and assigning a unique visual indicator for each new alteration.
- the method comprises determining a new alteration in the genetic information and re-generating the tumor response map including alterations from a prior test point that are still detected in the current test point and the new alteration. In some embodiments, the method comprises determining if a prior alteration is no longer present and, if so, using a height of zero when rendering the quantity of the prior alteration for subsequent test points. In some embodiments, the method comprises determining if a prior alteration is no longer present and, if so, reserving the unique visual indicator associated with the prior alteration from future use.
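- The bookkeeping described above (stable order, new alterations added at the top, zero height for alterations that disappear, and visual indicators that are never reused) could be sketched as below. The class and attribute names are hypothetical, and an integer id stands in for a color or other visual indicator.

```python
from itertools import count


class ResponseMapState:
    """Tracks alteration order and visual indicators across test points."""

    def __init__(self):
        self.order = []          # bottom-to-top; new alterations are appended at the top
        self.indicator = {}      # alteration -> unique visual indicator (integer id here)
        self._next_id = count()  # ids are never reused, even if an alteration disappears
        self.heights = []        # one dict of quantities per test point

    def add_test_point(self, quantities):
        # Register alterations detected for the first time.
        for name, quantity in quantities.items():
            if quantity > 0 and name not in self.indicator:
                self.indicator[name] = next(self._next_id)
                self.order.append(name)
        # Alterations no longer detected are rendered with a height of zero.
        panel = {name: quantities.get(name, 0.0) for name in self.order}
        self.heights.append(panel)
        return panel


state = ResponseMapState()
state.add_test_point({"A": 5.0, "B": 2.0})
second = state.add_test_point({"B": 3.0, "C": 1.0})
```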
- the method comprises analyzing CNV mutant allele frequencies and methylation mutant allele frequencies. In some embodiments, the method comprises grouping of maximum mutant allele frequencies for rendering first on the tumor response map. In some embodiments, the method comprises rendering alterations for the gene in decreasing mutant allele frequency order of alterations. In some embodiments, the method comprises rendering alterations for the gene in a decreasing order. In some embodiments, the method comprises selecting a next gene with next highest mutant allele frequency.
- the method comprises generating a trend indicator for the alteration over the different test points. In some embodiments, the method comprises generating a summary of alterations. In some embodiments, the method comprises generating a summary of treatment options. In some embodiments, the method comprises generating a summary of mutant allele frequency, cell free amplification, clinical approval indication, and clinical trial. In some embodiments, the method comprises generating a panel based on a biological pathway. In some embodiments, the method comprises generating a panel based on an evidence level. In some embodiments, the genetic information includes one or more of single-nucleotide variations, copy number variations, insertions and deletions, and gene rearrangements. In some embodiments, the method comprises generating a clinical relevance report on detected alterations. In some embodiments, the method comprises generating a therapy result summary.
- a method to generate a genetic report comprising: generating non-copy number variation (CNV) data using a genetic analyzer; determining a scaling factor for each non-CNV mutant allele frequency; for a first test, generating a visual panel for each non-CNV alteration using the scaling factor; and for each subsequent test, generating changes in the non-CNV alteration for the visual panel using the scaling factor.
- the method comprises transforming an absolute value into a relative metric for rendering. In some embodiments, the method comprises multiplying a mutant allele frequency by a predetermined value and taking a log of the result. In some embodiments, the method comprises determining a scaling factor using a maximum observed value. In some embodiments, for each non-CNV alteration, the method comprises multiplying a scaling factor by a transformed value for each gene variant as a quantity indicator for visualizing the gene variant.
- the method comprises assigning a unique visual indicator for each alteration.
- the method comprises using the visual panel if the test result is unchanged.
- the method comprises maintaining the order and unique visual indicator for each alteration; and recomputing a quantity indicator for visualizing that variant and re-rendering updated values in existing panel(s) and new panel for the latest test.
- the method comprises adding the alterations to the top of all existing alterations; computing transform values and the scaling factor; and assigning a unique visual indicator for each new alteration.
- the method comprises: re-rendering alterations in the prior test point and the new alteration; and vertically centering an image of the alterations in a contiguously placed panel that indicates continuity.
- the method comprises using a height of zero as the quantity of the alteration for a subsequent rendering.
- the method comprises rendering subject or intervention information associated with alteration changes.
- the method comprises identifying an alteration with the maximum Mutant Allele Frequency.
- the method comprises: reporting alterations for that gene in decreasing mutant allele frequency order of non-CNV alterations; and reporting CNV alterations for that gene in decreasing order of CNV value. In some embodiments, the method comprises selecting the next gene with next highest non-CNV mutant allele frequency and reporting alterations for that gene in decreasing mutant allele frequency order of non-CNV alterations; and reporting CNV alterations for that gene in decreasing order of CNV value.
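- The reporting order described above can be sketched as a two-level sort. The record layout (`gene`, `name`, `is_cnv`, `value`) is hypothetical, and placing genes that have only CNV alterations last is an assumption the text does not address.

```python
from collections import defaultdict


def report_order(alterations):
    """Order alterations gene by gene for the report.

    Genes are ranked by their highest non-CNV mutant allele frequency.
    Within a gene, non-CNV alterations come first in decreasing frequency,
    followed by CNV alterations in decreasing CNV value.
    """
    by_gene = defaultdict(lambda: {"non_cnv": [], "cnv": []})
    for alt in alterations:
        bucket = "cnv" if alt["is_cnv"] else "non_cnv"
        by_gene[alt["gene"]][bucket].append(alt)

    def max_non_cnv(gene):
        # Assumption: a gene with no non-CNV alteration ranks as 0.
        return max((a["value"] for a in by_gene[gene]["non_cnv"]), default=0.0)

    ordered = []
    for gene in sorted(by_gene, key=max_non_cnv, reverse=True):
        groups = by_gene[gene]
        ordered += sorted(groups["non_cnv"], key=lambda a: a["value"], reverse=True)
        ordered += sorted(groups["cnv"], key=lambda a: a["value"], reverse=True)
    return ordered


example = [
    {"gene": "GENE_B", "name": "v1", "is_cnv": False, "value": 1.2},
    {"gene": "GENE_A", "name": "amp", "is_cnv": True, "value": 4.0},
    {"gene": "GENE_A", "name": "v1", "is_cnv": False, "value": 8.5},
    {"gene": "GENE_A", "name": "v2", "is_cnv": False, "value": 2.0},
]
ordered_names = [(a["gene"], a["name"]) for a in report_order(example)]
```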
- the method comprises rendering a trend indicator for an alteration over different test dates.
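- One simple way to derive a trend indicator is to compare the two most recent test dates. The sketch below assumes that rule; the labels and the `tolerance` parameter are hypothetical and not taken from the disclosure.

```python
def trend_indicator(series, tolerance=0.0):
    """Classify the change between the two most recent quantities.

    Assumption: differences within `tolerance` count as no change.
    """
    if len(series) < 2:
        return "n/a"  # a trend needs at least two test dates
    delta = series[-1] - series[-2]
    if delta > tolerance:
        return "up"
    if delta < -tolerance:
        return "down"
    return "flat"
```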
- the method comprises grouping of maximum mutant allele frequencies and generating annotations including biological pathways or evidence level.
- the method comprises generating a panel based on an evidence level.
- the method comprises generating a panel based on a biological pathway.
- the genetic information includes one or more of single-nucleotide variations, copy number variations, insertions and deletions, and gene rearrangements.
- a method comprising: a) providing a plurality of nucleic acid samples from a subject, the samples collected at serial time points; b) sequencing polynucleotides from the samples to generate sequences; c) determining a quantitative measure of each of a plurality of genetic variants among the polynucleotides in each sample; and d) graphically representing by computer relative quantities of genetic variants at each serial time point for those somatic mutations present at a non-zero quantity at at least one of the serial time points.
- the quantitative measure is the frequency of the genetic variant among all sequences mapping to the same genetic locus.
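- As a worked example of this measure, the frequency is the number of sequences carrying the variant divided by all sequences mapping to the same locus. The function name and input checks below are illustrative only.

```python
def variant_frequency(variant_sequences, total_sequences_at_locus):
    """Fraction of sequences at a locus that carry the variant."""
    if total_sequences_at_locus <= 0:
        raise ValueError("no sequences map to this locus")
    if not 0 <= variant_sequences <= total_sequences_at_locus:
        raise ValueError("variant count must lie between 0 and the locus total")
    return variant_sequences / total_sequences_at_locus


# 25 variant-bearing sequences out of 1,000 at the locus give a frequency of 2.5%.
frequency = variant_frequency(25, 1000)
```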
- the relative quantities are represented as a stacked area graph.
- the relative quantities are stacked, at the earliest time point, highest to lowest from the bottom to the top of the graph, and wherein a genetic variant first appearing at a non-zero quantity at a later time point is stacked at the top of the graph.
- the areas are represented by different colors.
- the graphical representation further indicates, for each time point, the quantitative measure of the predominant genetic variant.
- the graphical representation further includes a key identifying genetic variants represented on the graph.
- graphically representing comprises normalizing and scaling the quantitative measures.
- the polynucleotides comprise cfDNA.
- the loci are located in oncogenes.
- the plurality of the genetic variants maps to a different gene in the genome.
- the plurality of the genetic variants maps to the same gene in the genome. In some embodiments, at least 10 different oncogenes are sequenced.
- determining comprises receiving the sequences into computer memory and using a computer processor to execute software to determine the quantitative measurement.
- graphically representing comprises using a computer processor to execute software that transforms the quantitative measures into a graphical format and representing the graphical format on an electronic graphical user interface, e.g., a display screen.
- a method to generate a paper or electronic patient test report from data generated by a genetic analyzer comprising: a) summarizing data from two or more testing time points, whereby a union of all non-zero testing results are reported at each subsequent test point after the first test; and b) rendering the testing results on the paper or electronic patient test report.
- summarizing and rendering are performed on a computer by executing code with a computer processor to (i) identify all non-zero testing results, (ii) generate the test report and (iii) display the test report on a graphical user interface.
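- One reading of the union rule is that every report lists each variant that was non-zero at any test so far, with a zero where it is not currently detected. The sketch below assumes that interpretation; the function name and data layout are hypothetical.

```python
def summarize(test_points):
    """Build one report row per test point.

    test_points: list of dicts (variant -> quantity) in chronological order.
    Each row contains every variant seen at a non-zero quantity up to and
    including that test point.
    """
    seen = []  # variants in order of first non-zero detection
    rows = []
    for results in test_points:
        for variant, quantity in results.items():
            if quantity > 0 and variant not in seen:
                seen.append(variant)
        rows.append({variant: results.get(variant, 0.0) for variant in seen})
    return rows


rows = summarize([{"A": 1.0, "B": 0.0}, {"B": 2.0}, {"C": 0.5}])
```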
- a method of graphically representing evolution of genetic variants of a tumor in a subject from data generated by a genetic analyzer comprising: a) generating by computer a stacked representation of genetic variants detected at each of a plurality of time points in the subject, wherein a height or width of each layer in the stack that corresponds to a genetic variant represents a quantitative contribution of the genetic variant to a total quantity of genetic variants at each time point; and b) displaying the stacked representation on a computer monitor or a paper report.
- the method further comprises using a combination of a magnitude of detected genetic variants in a body-fluid based test to infer a disease burden.
- the method further comprises using allele fractions of detected mutations, allelic imbalances, gene-specific coverage to infer the disease burden.
- an overall stack height is representative of overall disease burden or a disease burden score in the subject.
- a distinct color is used to represent each genetic variant.
- only a subset of detected genetic variants is plotted. In some embodiments, the subset is chosen based on likelihood of being a driver alteration or association with increased or reduced response to treatment.
- the method comprises producing a test report for a genomic test.
- a non-linear scale is used for representing the heights or widths of each represented genetic variant.
- a plot of previous test points is depicted on the report.
- the method comprises estimating a disease progression or remission based on rate of change and/or quantitative precision of each testing result.
- the method comprises displaying a therapeutic intervention between intervening testing points.
- displaying comprises: a) receiving data representing the detected tumor genetic variants into computer memory; b) executing code with a computer processor to graphically represent the quantitative contribution of each genetic variant at a time point as a line or area proportional to the relative contribution; and c) displaying the graphical representation on a graphical user interface.
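- The layer geometry for one time point can be sketched as cumulative bounds, where each layer's thickness is the variant's share of the total. This is an illustrative linear version; the function name is hypothetical, and a non-linear scale as mentioned above would replace the simple share calculation.

```python
def stacked_bands(quantities, order):
    """Return (lower, upper) bounds per variant as fractions of the total.

    quantities: variant -> quantity at one time point.
    order: bottom-to-top stacking order of the variants.
    """
    total = sum(quantities.get(variant, 0.0) for variant in order)
    bands = {}
    lower = 0.0
    for variant in order:
        share = quantities.get(variant, 0.0) / total if total > 0 else 0.0
        bands[variant] = (lower, lower + share)
        lower += share
    return bands


bands = stacked_bands({"A": 3.0, "B": 1.0}, ["A", "B"])
```

- Multiplying the bounds by an overall stack height, for example a disease burden score, would make the total height vary between time points.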
- FIG. 1 shows a flow chart of an exemplary method of determination and use of a therapeutic intervention.
- FIG. 2 shows a flow chart of an exemplary method of determining frequency of variants in a sample corrected based on CNV at a locus.
- FIG. 3 shows a flow chart of an exemplary method of providing pulsed therapy cycles which can delay drug resistance.
- FIG. 4 shows a flow chart of an exemplary method of detecting tumor burden using CNV at origins of replication to detect DNA from dividing cells.
- FIG. 5 shows an exemplary computer system.
- FIG. 6 shows an exemplary scan of CNV across a region of a genome from samples containing cells in a resting state and in a state of cell division.
- No genomic CNV is seen in loci a and b, but locus c shows gene duplication.
- copy number is relatively equal in all intervals in the region, except those intervals overlapping the locus of gene duplication.
- copy number appears to increase immediately after origins of replication, providing variance in CNV over the region. Deviation is particularly dramatic at a locus exhibiting CNV at an origin of replication (c).
- FIG. 7 shows an exemplary course of monitoring and treatment of disease in a subject.
- FIG. 8 shows an exemplary panel of 70 genes that exhibit genetic variation in cancer.
- FIG. 9A shows an exemplary system for communicating cancer test results.
- FIG. 9B shows an exemplary process to reduce error rates and bias in DNA sequence readings and generate genetic reports for users.
- FIGS. 10A-10C show exemplary processes for reporting genetic test results to users.
- FIGS. 10D-10I-2 show pages from an exemplary genetic test report.
- FIGS. 10J-10P show various exemplary modified streamgraphs.
- FIGS. 11A-11B show exemplary processes for detecting mutation and reporting test results to users.
- Methods of the present disclosure can detect biomolecular mosaicism (e.g., genetic mosaicism) in a biological sample, such as a heterogeneous genomic population of cells or deoxyribonucleic acid (DNA).
- Genetic mosaicism can exist at the organismal level. For example, genetic variants that arise early in development can result in different somatic cells having different genomes.
- An individual can be a chimera, e.g., produced by the fusion of two zygotes.
- Organ transplant from an allogeneic donor can result in genetic mosaics, which also can be detected by examining polynucleotides shed into the blood from the transplanted organ.
- Disease cell heterogeneity, in which diseased cells have different genetic variants, is another form of genetic mosaicism.
- Methods provided herein can detect mosaicism and, in the case of disease, provide therapeutic intervention.
- this disclosure provides methods for performing body-wide profiling of biomolecular mosaicism through the use of circulating polynucleotides, which may derive or otherwise originate from cells in diverse locations of the body of a subject.
- Diseased cells may evolve over time, resulting in different clonal sub-populations having new genetic and phenotypic characteristics. This may result from natural mutations as the cells divide, or it may be driven by treatments that target certain clonal sub-populations, allowing clones more resistant to the treatment to proliferate by negative selection.
- the existence of sub-populations of diseased cells that bear different genotypic or phenotypic characteristics is referred to herein as disease cell heterogeneity, or, in the case of cancer, tumor heterogeneity.
- cancers are treated based on mutant forms found in a cancer biopsy.
- the finding of Her2+ in even small amounts of breast cancer cells may be indicative of breast cancer, which may be followed through with a treatment using an anti-Her2+ therapy.
- a colorectal cancer in which a KRAS mutant is found in small amounts may be treated with a therapy for which KRAS is responsive.
- This information can be used by a health care provider, e.g., a physician, to develop therapeutic interventions.
- a subject that has a heterogeneous tumor can be treated as if they had two tumors, and a therapeutic intervention can treat each of the tumors.
- the therapeutic intervention could include, for example, a combination therapy including a first drug effective against the first tumor type and a second drug effective against the second tumor type.
- the drugs can be given in amounts that reflect the relative amounts of the mutant forms detected. For example, a drug to treat the mutant form that is found in higher relative amounts can be delivered at a greater dose than a drug to treat the mutant form in lesser relative amount. Or, treatment for the mutant in the lesser relative amount can be delayed or staggered with respect to the mutant in greater amount.
- therapeutic intervention can be calibrated to an evolving tumor. For example, analysis may show increasing amounts of polynucleotides bearing drug resistance mutants.
- the therapeutic intervention can be modified to decrease the amount of drug effective to treat a tumor that does not bear the resistance mutant and increase administration of a drug that does treat a tumor bearing the resistance marker.
- Therapeutic interventions can be determined by a healthcare provider or by a computer algorithm, or a combination of the two.
- a database can contain the results of therapeutic interventions against diseases having various profiles of disease cell heterogeneity. The database can be consulted in determining a therapeutic intervention for a disease with a particular profile.
- The present disclosure provides, among other things, methods of determining a therapeutic intervention for a subject having a disease, such as cancer, that exhibits disease cell heterogeneity, e.g., tumor heterogeneity.
- the method involves analyzing biological macromolecules (e.g., sequencing polynucleotides) of disease cells (e.g., spatially distinct disease cells) from a subject having the disease.
- a profile of disease cell heterogeneity is developed that indicates the existence of genetic variants specific to the disease cells and the amount of these variants relative to each other. This information, in turn, is used to determine a therapeutic intervention that takes the profile into account.
- a subject of the methods of this disclosure is any multicellular organism. More specifically, the subject can be a plant or an animal, a vertebrate, a mammal, a mouse, a primate, a simian or a human. Animals include, but are not limited to, farm animals, sport animals, and pets.
- a subject can be a healthy individual, an individual that has or is suspected of having a disease or a pre-disposition to the disease, or an individual that is in need of therapy or suspected of needing therapy.
- a subject can be a patient, e.g., a subject under the care of a professional healthcare provider.
- the subject can have a pathological condition (disease).
- Cells exhibiting the pathology of a disease are referred to herein as disease cells.
- the disease can be a cancer.
- Cancer is a condition characterized by abnormal cells that divide out of control. Cancers include, without limitation, carcinomas, sarcomas, leukemias, lymphomas, myelomas and central nervous system cancers. More specific examples of cancers are breast cancer, prostate cancer, colorectal cancer, brain cancer, esophageal cancer, head and neck cancer, bladder cancer, gynecological cancer, liposarcoma, and multiple myeloma.
- cancers include, for example, acute lymphoblastic leukemia (ALL), acute myeloid leukemia (AML), adrenocortical carcinoma, Kaposi Sarcoma, anal cancer, basal cell carcinoma, bile duct cancer, bladder cancer, bone cancer, osteosarcoma, malignant fibrous histiocytoma, brain stem glioma, brain cancer, craniopharyngioma, ependymoblastoma, ependymoma, medulloblastoma, medulloepithelioma, pineal parenchymal tumor, breast cancer, bronchial tumor, Burkitt lymphoma, Non-Hodgkin lymphoma, carcinoid tumor, cervical cancer, chordoma, chronic lymphocytic leukemia (CLL), chronic myelogenous leukemia (CML), colon cancer, colorectal cancer, cutaneous T-cell lymphoma, ductal carcinoma in situ, endometrial cancer
- a tumor is a collection of cancer cells (cancer disease cells). This includes, for example, a collection of cells in a single mass of cells (e.g., a solid tumor), a collection of cells from different metastatic tumor sites (metastatic tumors), and diffuse tumors (e.g., circulating tumor cells).
- a tumor can include cells of a single cancer (e.g., colorectal cancer), or multiple cancers (e.g., colorectal cancer and pancreatic cancer).
- a tumor can include cells originating from a single original somatic cell or from different somatic cells.
- disease cells in the subject are spatially distinct.
- Disease cells are spatially distinct if the cells are located at least 1 cm, at least 2 cm, at least 5 cm or at least 10 cm apart in a body, e.g., in different tissues or organs, or the same tissue or organ.
- examples of spatially distinct cancer cells include cancer cells from diffuse cancers (such as leukemias), cancer cells at different metastatic sites, and cancer cells from the same mass of tumor cells that are separated by at least 1 cm.
- Disease cell burden is a quantitative measure of the amount of disease cells in a subject.
- One measure of disease cell burden is the fraction of total biological macromolecules in a sample that are disease biological macromolecules, e.g., the relative amount of tumor polynucleotides in a sample of cell free polynucleotides. For example, if cfDNA from a first subject has 10% cancer polynucleotides, the subject may be said to have a cell-free tumor burden of 10%. If cfDNA from a second subject has 5% cancer polynucleotides, the second subject may be said to have half the cell-free tumor burden of the first subject.
- Polynucleotides to be sequenced can be sourced from spatially distinct sites. This includes polynucleotides sourced from biopsies of different locations in a single tumor mass. It also includes polynucleotides sourced from cells at different metastatic tumor sites. Cells shed polynucleotides into the blood, where they are detectable as cell free polynucleotides (e.g., circulating tumor DNA). Cell free polynucleotides also can be found in other bodily fluids such as urine. Therefore, cfDNA provides a more accurate profile of tumor heterogeneity across the entire disease cell population than DNA sourced from a single tumor location. DNA sampled from cells across the disease cell population in a body is referred to as “disease burden DNA” or, in the case of cancer, “tumor burden DNA”.
- tumors may share one, two, three or more genetic variants.
- Such variants may share the same stratification, for example highest frequency, second highest frequency, etc.
- Profiles can also share similar disease cell burdens, e.g., cfDNA burdens, e.g., within 15%, within 10%, within 5% or within 2%.
- a macromolecule is a molecule formed from monomeric subunits.
- Monomeric subunits forming biological macromolecules include, for example, nucleotides, amino acids, monosaccharides and fatty acids.
- Biological macromolecules include, for example, biopolymers and non-polymeric macromolecules.
- a polynucleotide is a macromolecule comprising a polymer of nucleotides.
- Polynucleotides include, for example, polydeoxyribonucleotides (DNA) and polyribonucleotides (RNA).
- a polypeptide is a macromolecule comprising a polymer of amino acids.
- a polysaccharide is a macromolecule comprising a polymer of monosaccharides.
- Lipids are a diverse group of organic compounds including, for example, fats, oils and hormones that share the functional characteristic of not interacting appreciably with water.
- a triglyceride is a fat formed from three fatty acid chains.
- a cancer polynucleotide is a polynucleotide (e.g., DNA) derived from a cancer cell.
- Cancer DNA and/or RNA can be extracted from tumors, from isolated cancer cells or from biological fluids (e.g., saliva, serum, blood or urine) in the form of cell free DNA (cfDNA) or cell free RNA.
- Cell free DNA is DNA located outside of a cell in a bodily fluid, e.g., in blood or urine.
- Circulating nucleic acids are nucleic acids found in the blood stream.
- Cell free DNA in the blood is a form of circulating nucleic acid.
- Cell free DNA is believed to arise from dying cells that shed their DNA into the blood. Because spatially distinct cancer cells will shed DNA into bodily fluids, such as blood, cfDNA of cancer subjects typically comprises cancer DNA from spatially distinct cancer cells.
- Analytes for analysis in the methods of this disclosure can derive from a biological sample, e.g., a sample comprising a biological macromolecule.
- a biological sample can be derived from any organ, tissue or biological fluid.
- a biological sample can comprise, for example, a bodily fluid or a solid tissue sample.
- An example of a solid tissue sample is a tumor sample, e.g., from a solid tumor biopsy.
- Bodily fluids include, for example, blood, serum, tumor cells, saliva, urine, lymphatic fluid, prostatic fluid, seminal fluid, milk, sputum, stool and tears.
- Bodily fluids are particularly good sources of biological macromolecules from spatially distinct disease cells, as such cells from many locations in a body can shed these molecules into the bodily fluid.
- blood and urine are good sources of cell free polynucleotides. Macromolecules from such sources can provide a more accurate profile of the diseased cells than macromolecules derived from a localized disease cell mass.
- Amounts of disease polynucleotides in a bodily fluid sample can be increased. Such increases can increase sensitivity of detection of disease polynucleotides.
- an intervention, such as a therapeutic intervention, that causes disease cells to lyse is administered to a subject, emptying their DNA into the surrounding fluid.
- Such interventions can include administration of chemotherapy. It also can include administering radiation or ultrasound to the whole body of a subject, or to a portion of the body of a subject, such as being directed to a tumor or a diseased organ.
- a fluid sample is collected for analysis. The interval between administration of the intervention and collection can be long enough for the disease polynucleotides to increase, but not so long that they are cleared from the body. For example, a low dose of chemotherapy can be administered about a week before collection of the sample.
- Genomic analysis can be performed by, for example, a genetic analyzer, e.g., using DNA sequencing. Methylation analysis can be performed by, for example, conversion of methylated bases followed by DNA sequencing. RNA expression analysis can be performed by, for example, polynucleotide array hybridization. Proteomic analysis can be performed by, for example, mass spectrometry.
- the term “genetic analyzer” refers to a system including a DNA sequencer for generating DNA sequence information and a computer comprising software that performs bioinformatic analysis on the DNA sequence information.
- Bioinformatic analysis can include, without limitation, assembling sequence data, detecting and quantifying genetic variants in a sample, including either of germline variants (e.g., heterozygosity) and somatic cell variants (e.g., cancer cell variants).
- Analytic methods can include generating and capturing genetic information.
- Genetic information can include genetic sequence information, ploidy states, the identity of one or more genetic variants, as well as a quantitative measure of the variants.
- quantitative measure refers to any measure of quantity including absolute and relative measures.
- a quantitative measure can be, for example, a number (e.g., a count), a percentage, a frequency, a degree or a threshold amount.
- Polynucleotides can be analyzed by any method known in the art.
- the DNA sequencer will employ next generation sequencing (e.g., Illumina, 454, Ion torrent, SOLiD).
- Sequence analysis can be performed by massively parallel sequencing, that is, simultaneously (or in rapid succession) sequencing any of at least 100,000, 1 million, 10 million, 100 million, or 1 billion polynucleotide molecules.
- Sequencing methods may include, but are not limited to: high-throughput sequencing, pyrosequencing, sequencing-by-synthesis, single-molecule sequencing, nanopore sequencing, semiconductor sequencing, sequencing-by-ligation, sequencing-by-hybridization, RNA-Seq (Illumina), Digital Gene Expression (Helicos), Next generation sequencing, Single Molecule Sequencing by Synthesis (SMSS) (Helicos), massively-parallel sequencing, Clonal Single Molecule Array (Solexa), shotgun sequencing, Maxam-Gilbert or Sanger sequencing, primer walking, sequencing using PacBio, SOLiD, Ion Torrent, Genius (GenapSys) or Nanopore (e.g., Oxford Nanopore) platforms and any other sequencing methods known in the art.
- the DNA sequencer can apply Gilbert's sequencing method based on chemical modification of DNA followed by cleavage at specific bases, or it can apply Sanger's technique which is based on dideoxynucleotide chain termination.
- the Sanger method became popular due to its increased efficiency and low radioactivity.
- the DNA sequencer can use techniques that do not require DNA amplification (polymerase chain reaction—PCR), which speeds up the sample preparation before sequencing and reduces errors.
- sequencing data is collected from the reactions caused by the addition of nucleotides in the complementary strand in real time.
- the DNA sequencers can utilize a method called Single-molecule real-time (SMRT), where sequencing data is produced by light (captured by a camera) emitted when a nucleotide is added to the complementary strand by enzymes containing fluorescent dyes.
- Sequencing of the genome can be selective, e.g., directed to portions of the genome of interest. For example, many genes (and mutant forms of these genes) are known to be associated with various cancers. Sequencing of select genes, or portions of genes may suffice for the analysis desired. Polynucleotides mapping to specific loci in the genome that are the subject of interest can be isolated for sequencing by, for example, sequence capture or site-specific amplification.
- a nucleotide sequence (e.g., DNA sequence) can refer to raw sequence reads or processed sequence reads, such as unique molecular counts inferred from raw sequence reads.
- Sequence reads generated from sequencing are subject to analysis including, for example, identifying genetic variants. This can include identifying sequence variants and quantifying numbers of base calls at each locus. Quantifying can involve, for example, counting the number of reads mapping to a particular genetic locus. Different numbers of reads at different loci can indicate copy number variation (CNV).
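- Read counting per locus, and a crude copy-number signal derived from it, can be sketched as below. Using the median count across loci as the baseline is an assumption made for illustration, the function names are hypothetical, and the input is assumed to be non-empty.

```python
from collections import Counter
from statistics import median


def reads_per_locus(read_loci):
    """Count reads mapping to each locus; read_loci holds one locus label per read."""
    return Counter(read_loci)


def relative_copy_ratio(counts):
    """Divide each locus count by the median count across loci.

    Assumption: most loci are copy-neutral, so the median is a usable baseline.
    A ratio near 2 would be consistent with a duplication.
    """
    baseline = median(counts.values())
    return {locus: n / baseline for locus, n in counts.items()}


counts = reads_per_locus(["a"] * 100 + ["b"] * 100 + ["c"] * 200)
ratios = relative_copy_ratio(counts)
```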
- Sequencing and bioinformatics methods that reduce noise and distortion are particularly useful when the number of target polynucleotides in a sample is small compared with non-target polynucleotides.
- the signal from the target may be weak. This can be the case, for example, in the case of cell free DNA, where a small number of tumor polynucleotides may be mixed with a much larger number of polynucleotides from healthy cells.
- Molecular tracking methods can be useful in such situations. Molecular tracking involves tracking sequence reads from a sequencing protocol back to molecules in an original sample (e.g., before amplification and/or sequencing) from which the reads are derived.
- Certain methods involve tagging molecules in such a way that multiple sequence reads produced from original molecules can be grouped into families of sequences derived from original molecules. In this way, base calls representing noise can be filtered out.
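One way to picture the family-based filtering described above is a per-position majority vote within each tag family. The sketch below is an assumption-laden illustration: reads are given as hypothetical `(tag, sequence)` pairs, and the `min_agreement` threshold of 0.6 is an arbitrary illustrative value, not one taken from the cited methods.

```python
from collections import Counter, defaultdict


def consensus_by_family(reads, min_agreement=0.6):
    """Group reads by molecular tag and call a consensus base at each position.

    reads: iterable of (tag, sequence) pairs.
    Positions where the most common base falls below min_agreement are
    reported as "N" rather than called.
    """
    families = defaultdict(list)
    for tag, seq in reads:
        families[tag].append(seq)

    consensus = {}
    for tag, seqs in families.items():
        length = min(len(s) for s in seqs)  # compare only the shared prefix
        calls = []
        for i in range(length):
            base, n = Counter(s[i] for s in seqs).most_common(1)[0]
            calls.append(base if n / len(seqs) >= min_agreement else "N")
        consensus[tag] = "".join(calls)
    return consensus
```

A base call that appears in only one read of a family is outvoted, which is the sense in which noise is filtered out.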
- Such methods are described in more detail in, for example, WO 2013/142389 (Schmitt et al.), US 2014/0227705 (Vogelstein et al.) and WO 2014/149134 (Talasaz et al.). Up-sampling methods also are useful to more accurately determine counts of molecules in a sample.
- up-sampling methods involve determining a quantitative measure of individual DNA molecules for which both strands (Watson and Crick strands) are detected; determining a quantitative measure of individual DNA molecules for which only one of the DNA strands is detected; inferring from these measures a quantitative measure of individual DNA molecules for which neither strand was detected; and using these measures to determine the quantitative measure indicative of a number of individual double-stranded DNA molecules in the sample.
- This method is described in more detail in PCT/US2014/072383, filed Dec. 24, 2014.
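The inference of unseen molecules can be illustrated with a simple model. The sketch below assumes, purely for illustration, that each strand of a molecule is detected independently with the same probability p; the referenced application may use a different model. Under that assumption, with N original molecules, the expected counts are N·p² (both strands), 2·N·p·(1−p) (one strand) and N·(1−p)² (neither strand). The function name is hypothetical.

```python
def estimate_total_molecules(both_strands, one_strand):
    """Estimate unseen and total double-stranded molecules.

    both_strands: molecules for which both strands were detected.
    one_strand:   molecules for which exactly one strand was detected.
    Assumes independent, equal per-strand detection probability p.
    Returns (p, unseen, total).
    """
    if both_strands <= 0:
        raise ValueError("need at least one molecule seen on both strands")
    strands_seen = 2 * both_strands + one_strand
    p = 2 * both_strands / strands_seen            # per-strand detection rate
    unseen = one_strand ** 2 / (4 * both_strands)  # N * (1 - p)^2
    total = both_strands + one_strand + unseen
    return p, unseen, total
```

For instance, 640 molecules seen on both strands and 320 seen on one strand give p = 0.8, about 40 unseen molecules, and about 1000 molecules in total.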
- Methods of the present disclosure can be used in the detection of genetic variants (also referred to as “gene alterations”).
- Genetic variants are alternative forms at a genetic locus. In the human genome, approximately 0.1% of nucleotide positions are polymorphic, that is, exist in a second genetic form occurring in at least 1% of the population. Mutations can introduce genetic variants into the germ line, and also into disease cells, such as cancer.
- Reference sequences, such as hg19 or NCBI Build 37 or Build 38, are intended to represent a “wild type” or “normal” genome. However, to the extent they have a single sequence, they do not identify common polymorphisms which may also be considered normal.
- Genetic variants include sequence variants, copy number variants and nucleotide modification variants.
- a sequence variant is a variation in a genetic nucleotide sequence.
- a copy number variant is a deviation from wild type in the number of copies of a portion of a genome.
- Genetic variants include, for example, single nucleotide variations (SNPs), insertions, deletions, inversions, transversions, translocations, gene fusions, chromosome fusions, gene truncations, copy number variations (e.g., aneuploidy, partial aneuploidy, polyploidy, gene amplification), abnormal changes in nucleic acid chemical modifications, abnormal changes in epigenetic patterns and abnormal changes in nucleic acid methylation.
- Genetic variants can be detected by comparing sequences from polynucleotides in a sample to a reference, e.g., to a reference genome sequence, to an index or to a database of known mutations.
- the reference sequence is a publicly available reference sequence, such as the human genome sequence hg19 or NCBI Build 37.
- the reference sequence is a sequence in a non-public database.
- the reference sequence is a germ line sequence of an organism inferred or determined from sequencing polynucleotides from the organism.
- a somatic mutation or somatic alteration is a genetic variant that arises in a somatic cell.
- Somatic mutations are distinguished from mutations that arise in the genome of a germ line cell (i.e., sperm or egg) or a zygote, of an individual.
- Somatic mutations, e.g., those found in cancer cells, are distinguishable from the germ line genome of a subject in which the cancer arose. They also can be detected by comparing the cancer genome with the germ line genome or with a reference genome.
- a database of SNVs in human cancer can be found at the website: cancer.sanger.ac.uk/cancergenome/projects/cosmic/.
- FIG. 8 shows genes known, in cancer, to exhibit point mutations, amplifications, fusions and indels.
- a diploid cell having 2N chromosomes with replicated DNA may correspond to about 4× DNA content, whereas a diploid cell having 2N chromosomes without replicated DNA may correspond to about 2× DNA content.
- Replication proceeds from origins of replication. In mammals, origins of replication are spaced at intervals of about 15 kb to 300 kb. During this period, portions of the genome exist in polyploid form. Those areas between origins of replication and the position of the polymerase are duplicated, while those areas beyond the position of the polymerase (or just before the origin of replication) are still in single copy number in the strand undergoing replication.
- One method to detect tumor burden involves determining copy-number variation due to proximity of examined locus or loci to various origins of replication. Regions that include a replication origin will have very close to 4 copies of DNA in that locus (in a diploid cell), while regions that are far removed from a replication origin will have closer to 2 copies (in a diploid cell).
- the examined locus or loci include at least 1 kb, at least 10 kb, at least 100 kb, at least 1 Mb, at least 10 Mb, or at least 100 Mb, or extend across an entire chromosome or across an entire genome.
- a measure of replication origin CNV (ROCNV) across the region is determined. This can be, for example, a measure of deviation in copy number from a value of central tendency.
- the value of central tendency can be, for example, mean, median or mode.
- the measure of deviation can be for example, variance or standard deviation. This measure can be compared with a measure of ROCNVs across the same region in a control sample, e.g., from a healthy individual or cells in resting state.
- ROCNVs can be determined by partitioning the region or regions analyzed into non-overlapping partitions of various lengths and taking a measure of CNV in this partition. This measure of CNV can be derived from the number of reads or fragments determined to map to those regions after sequencing.
- the partitions can have various sizes, to produce various levels of resolution, e.g., a single base level (base-per-base), 10 bases, 100 bases, 1 kb, 10 kb or 100 kb.
- Deviations that are greater than a control indicate the presence of DNA undergoing replication, which, in turn, indicates malignancy. The greater the degree of deviation, the greater the amount of DNA from cells undergoing cell division in the sample.
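The ROCNV comparison described above can be sketched as follows. This is a hedged illustration only: it assumes read start positions within a single region, uses equal-size non-overlapping partitions, and divides the standard deviation by the mean so that samples of different depth are comparable. That last normalization is an assumption of this sketch, and all function names are hypothetical.

```python
from statistics import mean, pstdev


def partition_counts(read_positions, region_start, region_end, partition_size):
    """Count reads in non-overlapping partitions of [region_start, region_end)."""
    n_bins = (region_end - region_start + partition_size - 1) // partition_size
    counts = [0] * n_bins
    for pos in read_positions:
        if region_start <= pos < region_end:
            counts[(pos - region_start) // partition_size] += 1
    return counts


def rocnv_score(counts):
    """Deviation from central tendency: standard deviation divided by the mean."""
    m = mean(counts)
    if m == 0:
        return 0.0
    return pstdev(counts) / m


def exceeds_control(test_counts, control_counts):
    """True when the test sample deviates more than the control sample."""
    return rocnv_score(test_counts) > rocnv_score(control_counts)
```

A flat control profile scores 0, so any test sample with uneven depth across partitions would exceed it; a real comparison would need a threshold that accounts for sampling noise.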
- Various methods can be used to calculate true genetic copy number variations that differ from replication origin based distortion.
- heterozygous SNP positions at affected CNV loci can be used to infer copy number variation by calculating the deviation from 50% or the allelic imbalance at those loci. Distortion due to replication origin proximity should not affect this imbalance, since both copies would generally be copied at similar time intervals and are thus self-normalizing (although allelic changes could conceivably change the origin of replication between the two allelic variants).
- duplication of a chromosome segment containing a SNP could be detected in around 67% of reads, while duplication resulting from ROCNV would be detected in about 50% of reads.
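The 67% and 50% figures follow from simple allele counting, as the short sketch below shows. The function name is hypothetical, and the model assumes reads sample each allele copy equally.

```python
from fractions import Fraction


def expected_variant_fraction(variant_copies, reference_copies):
    """Expected fraction of reads carrying the variant allele at a heterozygous SNP."""
    return Fraction(variant_copies, variant_copies + reference_copies)


# True duplication of the segment carrying the variant allele: 2 variant, 1 reference.
true_duplication = expected_variant_fraction(2, 1)      # 2/3, about 67%

# Replication-related doubling copies both alleles: 2 variant, 2 reference.
replication_doubling = expected_variant_fraction(2, 2)  # 1/2, i.e., 50%
```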
- counting-based techniques that use the density of detected fragments or reads at a certain locus are used to calculate relative copy number. These techniques are generally limited by Poisson noise and systematic bias due to DNA sample preparation and sequencing bias. A combination of these methods may also be used to obtain even greater accuracy.
- ROCNV can be calculated for a given sample and be used to give a value on cell-free tumor burden despite lack of detection of traditional somatic variants, such as, SNVs, gene-specific CNVs, genomic rearrangements, epigenetic variants, loss of heterozygosity, etc.
- ROCNVs can also be used to subtract distortion for a given sample to increase sensitivity and/or specificity of a given CNV detection/estimation method by removing variation that is related to replication origin proximity rather than due to true copy number changes in a cell.
- Cell-lines with known or no copy number changes over a reference can also be used as a reference of ROCNVs for use in estimating its contribution to a given sample.
- the method involves determining a baseline level of copies of DNA molecules at one or more loci from one or more control samples, each containing DNA from cells undergoing a predetermined level of cell division, e.g., cells in resting state or rapidly dividing tumor cells.
- a measure of copies of DNA molecules in a test sample is also determined.
- the measure in test samples can be from one or more loci partitioned into one or more partitions. In each case, a plurality of loci each include an origin of replication.
- the measure of copies from the test sample can be an average across all partitions, or a level of variance across loci.
- a measure that is greater in a test sample than in a control of cells in resting state, or slowly dividing indicates that cells generating the DNA in the test sample are dividing more rapidly than cells providing DNA to the control sample, e.g., are cancerous.
- measures that are similar between a test sample and a control of cells in actively dividing state indicate that cells generating the DNA in the test sample are dividing at a rate similar to the rapidly dividing cells, e.g., are cancerous.
- Disease cell heterogeneity is the occurrence of diseased cells having different genetic variants.
- Disease cell heterogeneity can be determined by examination of polynucleotides isolated from diseased cells and detection of differences in their genomes.
- Disease cell heterogeneity also can be inferred from examination of polynucleotides from a sample containing polynucleotides from both diseased and healthy cells based on differences in relative frequency of somatic mutations.
- cancer is characterized by changes at the genetic level, e.g., through the accumulation of somatic mutations in different clonal groups of cells. These changes can contribute to unregulated growth of the cancer cells, or function as markers of responsiveness or non-responsiveness to various therapeutic interventions.
- Tumor heterogeneity is a condition in which a tumor is characterized by cancer cells containing different combinations of genetic variants, e.g., different combinations of somatic mutations. That is, the tumor can have different cells containing alterations in different genes, or containing different alterations in the same gene.
- a first cell could include a mutant form of BRAF
- a second cell could include mutant forms of both BRAF and ERBB2.
- a first cancer cell could include the single nucleotide polymorphism EGFR 55249063 G>A
- a second cell could include the single nucleotide polymorphism EGFR 55238874 T>A. (Numbers refer to nucleotide position in genomic reference sequence.)
- an original tumor cell can include a genetic variant in a gene, e.g., an oncogene.
- some progeny cells, which carry the original mutation may independently develop genetic variants in other genes or in different parts of the same gene. In subsequent divisions, tumor cells can accumulate still more genetic variants.
- the profile includes information from polynucleotides from spatially distinct disease cells.
- the profile is a whole body profile containing information from cells distributed throughout the body. Analysis of polynucleotides in cfDNA allows sampling of DNA across the entire geographic extent of a tumor, in contrast with sampling of a localized area of a tumor. In particular, it allows sampling of diffuse and metastatic tumors. This contrasts with methods that detect the mere existence of tumor heterogeneity through the localized sampling of a tumor.
- the profile can indicate the exact nucleotide sequence of the variant, or may simply indicate a gene bearing the somatic mutation.
- the profile identifies genetic variations and the relative amounts of each variant. From this information, one can infer possible distributions of the variants in different cell sub-population. For example, a cancer may begin with a cell bearing somatic mutation X. As a result of clonal evolution, some progeny of this cell may develop variant Y. Other progeny may develop variant Z.
- the tumor may be characterized as 50% X, 35% XY and 15% XZ.
- the profile may indicate 100% X, 35% Y and 15% Z.
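The relationship between the sub-clone composition (50% X, 35% XY, 15% XZ) and the per-variant profile (100% X, 35% Y, 15% Z) can be sketched as a simple aggregation. The data layout and function name below are assumptions made for illustration.

```python
def variant_profile(subclones):
    """Sum sub-clone percentages into a per-variant profile.

    subclones: mapping from a tuple of variants carried by a sub-clone
    to that sub-clone's share of tumor cells, in percent.
    """
    profile = {}
    for variants, percent in subclones.items():
        for variant in variants:
            profile[variant] = profile.get(variant, 0) + percent
    return profile


tumor = {("X",): 50, ("X", "Y"): 35, ("X", "Z"): 15}
profile = variant_profile(tumor)  # X: 100, Y: 35, Z: 15
```

The reverse inference, from profile to sub-clones, is not unique in general, which is why the text speaks of possible distributions.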
- Tumor heterogeneity can be detected from analysis of sequences of cancer polynucleotides, based on the existence of genomic variations at different loci occurring at different frequencies. For example, in a sample of cell free DNA (which is likely to contain germ line DNA as well as cancer DNA), it may be found that a sequence variant of BRAF occurs at a frequency of 17%, a sequence variant of CDKN2A occurs at a frequency of 6%, a sequence variant of ERBB2 occurs at a frequency of 3% and a sequence variant of ATM occurs at a frequency of 1%. These different frequencies of sequence variants indicate tumor heterogeneity. Similarly, genetic sequences exhibiting different amounts of copy number variation also indicate tumor heterogeneity. For example, analysis of a sample may show different levels of amplification for the EGFR and CCNE1 genes. This also indicates tumor heterogeneity.
- somatic mutations can be identified by comparing base calls in the sample to a reference sequence or, internally, by comparing less frequent base calls to more common base calls, which are presumed to be in the germ line sequence. In either case, the existence of sub-dominant forms (e.g., less than 40% of total base calls) at different loci and at different frequencies indicates disease cell heterogeneity.
- Cell free DNA typically comprises a preponderance of DNA from normal cells having the germ line genome sequence and, in the case of a disease, such as cancer, a small percentage of DNA from cancer cells and having a cancer genome sequence.
- Sequences generated from polynucleotides in a sample of cfDNA can be compared with a reference sequence to detect differences between the reference sequence and the polynucleotides in the cfDNA.
- all or nearly all of the polynucleotides from a test sample may be identical to a nucleotide in the reference sequence.
- a nucleotide detected at nearly 100% frequency in a sample may be different than a nucleotide in the reference sequence.
- if a first nucleotide that matches a reference nucleotide is detected at about 50% and a second nucleotide that is different than a reference nucleotide is detected at about 50%, this most likely indicates normal heterozygosity.
- Heterozygosity may present at allele ratios divergent from 50:50, e.g., 60:40 or even 70:30.
- if the sample comprises a nucleotide detectable above noise at a frequency below (or above) an unambiguously heterozygous range (for example, less than about 45%, less than 40%, less than 30%, less than 20%, less than 10% or less than 5%), this can be attributed to the existence of somatic mutations in a percentage of the cells contributing DNA to the cfDNA population. These may come from disease cells, e.g., cancer cells. (The exact percentage is a function of tumor load.) If the frequencies of somatic mutations at two different genetic loci are different, e.g., 16% at one locus and 5% at another locus, this indicates that the disease cells, e.g., the cancer cells, are heterogeneous.
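The frequency-based reasoning above can be sketched as a simple classifier. All thresholds here (noise floor, heterozygous range, homozygous cutoff, tolerance) are illustrative assumptions chosen for the example, not values prescribed by the text, and the function names are hypothetical.

```python
def classify_frequency(freq, noise_floor=0.005, het_low=0.40,
                       het_high=0.60, hom_low=0.95):
    """Label a variant allele frequency (0..1) by its most likely origin."""
    if freq < noise_floor:
        return "noise"
    if freq >= hom_low:
        return "homozygous"
    if het_low <= freq <= het_high:
        return "heterozygous"
    return "candidate somatic"  # outside the unambiguous germ line ranges


def indicates_heterogeneity(freqs, tolerance=0.02):
    """True when at least two candidate somatic variants differ in frequency."""
    somatic = [f for f in freqs if classify_frequency(f) == "candidate somatic"]
    return len(somatic) >= 2 and max(somatic) - min(somatic) > tolerance
```

With these assumed thresholds, frequencies of 16% and 5% at two loci are both candidate somatic and differ by more than the tolerance, so the sketch reports heterogeneity.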
- In the case of DNA from solid tumors, which is expected to predominantly comprise tumor DNA, somatic mutations also can be detected by comparison to a reference sequence. Detection of somatic mutations that exist in 100% of the tumor cells may require reference to a standard sequence or information about known mutants. However, the existence of sub-dominant sequences among the polynucleotide pool at different loci and at different relative frequencies indicates tumor heterogeneity.
- the profile may include genetic variants in genes that are known to be actionable. Knowledge of such variants can contribute to selecting therapeutic interventions, as therapies can be targeted to such variants. In the case of cancer, many actionable genetic variants are already known.
- the copy number state of a gene should be reflected in the frequency of a genetic form of the gene in the sample.
- a sequence variant may be detected at a frequency consistent with homozygosity or heterozygosity (e.g., about 100% or about 50%, respectively) with no copy number variation. This is consistent with a germ line polymorphism or mutation.
- if a sequence variant is detected at a level consistent with homozygosity (e.g., about 100%) but at amounts consistent with copy number variation, this is more likely to reflect the presence of disease cell polynucleotides having undergone gene amplification.
- if a sequence variant is detected at a level not inconsistent with heterozygosity (e.g., deviating somewhat from 50%) but at amounts consistent with copy number variation, this also is more likely to reflect the presence of disease cell polynucleotides; the diseased polynucleotides create some level of imbalance in allele frequency away from 50:50.
- sequence variants in a gene detected at levels ideally consistent with heterozygosity in the germ line are more probably the product of a somatic mutation in disease cells if copy number variation also is detected in that gene.
- tumor heterogeneity can be inferred when two genes are detected at different frequency but their copy number is relatively equal.
- tumor homogeneity can be inferred when the difference in frequency between two sequence variants is consistent with difference in copy number for the two genes.
- if an EGFR variant is detected at 11% and a KRAS variant is detected at 5%, and no CNV is detected at these genes, the difference in frequency likely reflects tumor heterogeneity (e.g., all tumor cells carry an EGFR mutant and half the tumor cells also carry a KRAS mutant).
- if, instead, the EGFR gene carrying the mutant is detected at increased copy number, one consistent interpretation is a homogeneous population of tumor cells, each cell carrying a mutant in the EGFR and KRAS genes, but in which the EGFR gene is duplicated. Accordingly, both the frequency of a sequence variant and a measure of CNV at the locus of the sequence variant in a sample can be determined. The frequency can then be corrected to reflect the relative number of cells bearing the variant by weighting the frequency based on the dose per cell determined from the measure of CNV. This result is now more comparable, in terms of number of cells carrying the variant, to a sequence variant that does not vary in copy number.
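The dose-per-cell correction can be sketched as a simple rescaling. This is a deliberate simplification: it assumes the measured frequency scales linearly with the number of mutant copies per cell and ignores the effect of amplification on total read depth at the locus. The function and parameter names are hypothetical.

```python
def cell_adjusted_frequency(variant_frequency, mutant_copies_per_cell,
                            baseline_copies=1):
    """Rescale a variant frequency to approximate the share of cells bearing it."""
    if mutant_copies_per_cell <= 0:
        raise ValueError("mutant_copies_per_cell must be positive")
    return variant_frequency * baseline_copies / mutant_copies_per_cell


# An 11% variant at a locus with two mutant copies per cell becomes about 5.5%,
# which is comparable with a 5% variant at a locus with one mutant copy per cell.
amplified = cell_adjusted_frequency(0.11, 2)
unamplified = cell_adjusted_frequency(0.05, 1)
```

After this adjustment the two variants look alike in terms of cells carrying them, which is consistent with the homogeneous interpretation.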
- a report of results from genetic variant analysis may be provided by a report generator, for example to a healthcare practitioner, e.g., a physician, to aid the interpretation of the test results (e.g., data) and selection of treatment options.
- a report generated by a report generator may provide additional information, such as clinical lab results, that may be useful for diagnosing disease and selecting treatment options.
- the report generator system can be a central data processing system configured to establish communications directly with: a remote data site or lab 2 , a medical practice/healthcare provider (treating professional) 4 , and/or a patient/subject 6 through communication links.
- the lab 2 can be a medical laboratory, diagnostic laboratory, medical facility, medical practice, point-of-care testing device, or any other remote data site capable of generating subject clinical information.
- Subject clinical information includes, but is not limited to, laboratory test data, e.g., analysis of genetic variants; imaging and X-ray data; examination results; and diagnosis.
- the healthcare provider or practice 4 may include medical services providers, such as doctors, nurses, home health aides, technicians and physician's assistants, and the practice may be any medical care facility staffed with healthcare providers. In certain instances the healthcare provider/practice is also a remote data site. Where cancer is a disease to be treated, the subject may be afflicted with cancer, among other possible diseases or disorders.
- Other clinical information for a cancer subject 6 can include the results of laboratory tests, e.g., analysis of genetic variants, metabolic panel, complete blood count, etc.; medical imaging data; and/or medical procedures directed to diagnosing the condition, providing a prognosis, monitoring the progression of the disease, determining relapse or remission, or combinations thereof.
- the list of appropriate sources of clinical information for cancer includes, but is not limited to, CT scans, MRI scans, ultrasound scans, bone scans, PET scans, bone marrow tests, barium X-rays, endoscopies, lymphangiograms, IVU (intravenous urogram) or IVP (IV pyelogram), lumbar punctures, cystoscopy, immunological tests (anti-malignin antibody screen), and cancer marker tests.
- the subject 6 's clinical information may be obtained from the lab 2 manually or automatically. Where simplicity of the system is desired, the information may be obtained automatically at predetermined or regular time intervals.
- a regular time interval can refer to a time interval at which the collection of the laboratory data is carried out automatically by the methods and systems described herein based on a measurement of time such as hours, days, weeks, months, years etc.
- the collection of data and processing is carried out at least once a day.
- the transfer and collection of data is carried out about any of monthly, biweekly, weekly, several times a week or daily.
- the retrieval of information may be carried out at predetermined time intervals, which may not be regular time intervals. For instance, a first retrieval step may occur after one week and a second retrieval step may occur after one month.
- the transfer and collection of data can be customized according to the nature of the disorder that is being managed and the frequency of required testing and medical examinations of the subjects.
- FIG. 9B shows an exemplary process to generate genetic reports, including a tumor response map and associated summary of alterations.
- a tumor response map is a graphical representation of genetic information indicating changes over time in genetic information from a tumor, e.g., qualitative and quantitative changes. Such changes can reflect response of a subject to a therapeutic intervention.
- This process can reduce error rates and bias that may be orders of magnitude higher than what is required to reliably detect de novo genetic variants associated with cancer.
- the process can comprise first capturing genetic information by collecting body fluid samples as sources of genetic material (e.g., blood, saliva, sweat, urine, etc). Then, the process can comprise sequencing the materials ( 11 ). For example, polynucleotides in a sample can be sequenced, producing a plurality of sequence reads.
- the tumor burden in a sample that comprises polynucleotides can be estimated as the relative number of sequence reads bearing a variant to the total number of sequence reads generated from the sample. Where copy number variants are analyzed, the tumor burden can be estimated as the relative excess (e.g., in the case of gene duplication) or relative deficit (e.g., in the case of gene elimination) of the total number of sequence reads at test and control loci. For example, a run may produce 1000 reads mapping to an oncogene locus of which 900 correspond to wild type and 100 correspond to a cancer mutant, indicating a copy number variant at this gene. More details on exemplary specimen collection and sequencing of the genetic materials are discussed below in FIGS. 10-11 .
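The two estimates described above can be sketched as follows, using the read counts from the example (100 variant reads out of 1000). The function names are hypothetical, and the copy number calculation assumes the test and control loci are sequenced with equal efficiency.

```python
def tumor_burden_from_variant_reads(variant_reads, total_reads):
    """Fraction of reads at a locus that carry the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def relative_copy_change(test_locus_reads, control_locus_reads):
    """Relative excess (positive) or deficit (negative) of reads at a test locus."""
    if control_locus_reads <= 0:
        raise ValueError("control_locus_reads must be positive")
    return (test_locus_reads - control_locus_reads) / control_locus_reads


burden = tumor_burden_from_variant_reads(100, 1000)  # 0.1
excess = relative_copy_change(1200, 1000)            # relative excess of 0.2
deficit = relative_copy_change(800, 1000)            # relative deficit of 0.2
```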
- genetic information can be processed ( 12 ). Genetic variants can then be identified.
- the process can comprise determining the frequency of genetic variants in the sample containing the genetic material.
- the process can comprise separating information from noise ( 13 ) if this process is noisy.
- the sequencing methods for genetic analysis may have error rates.
- the MiSeq system of Illumina can produce percent error rates in the low single digits.
- about 50 reads (about 5%) may be expected to include errors.
- Certain methodologies, such as those described in WO 2014/149134 can significantly reduce the error rate. Errors create noise that can obscure signals from cancer present at low levels in a sample. For example, if a sample has a tumor burden at a level around the sequencing system error rate, e.g., around 0.1%-5%, it may be difficult to distinguish a signal corresponding to a genetic variant due to cancer from one due to noise.
- Analysis of genetic variants may be used for diagnosing in the presence of noise.
- the analysis can be based on the frequency of Sequence Variants or Level of CNV ( 14 ) and a diagnosis confidence indication or level for detecting genetic variants in the noise range can be established ( 15 ).
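One possible way to express a confidence indication for a variant in the noise range is to ask how likely the observed number of variant reads would be if they arose from sequencing error alone. The sketch below uses a binomial tail probability for this; the choice of a binomial model, the treatment of every error as producing the same variant base, and the function name are all assumptions of this illustration rather than the disclosed method.

```python
from math import comb


def prob_at_least_k_errors(n_reads, k_variant_reads, error_rate):
    """P(X >= k) for X ~ Binomial(n_reads, error_rate).

    A small value means the observed variant reads are unlikely to be
    explained by sequencing error alone.
    """
    below = sum(
        comb(n_reads, i) * error_rate ** i * (1 - error_rate) ** (n_reads - i)
        for i in range(k_variant_reads)
    )
    return max(0.0, 1.0 - below)  # guard against tiny negative rounding error
```

With 1000 reads and a 1% error rate, 100 variant reads would be very unlikely under noise alone, whereas 10 variant reads would be unremarkable.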
- the process can comprise increasing the diagnosis confidence. This can be done using a plurality of measurements to increase confidence of diagnosis ( 16 ), or alternatively using measurements at a plurality of time points to determine whether cancer is advancing, in remission or stabilized ( 17 ).
- the diagnostic confidence can be used to identify disease states.
- cell free polynucleotides taken from a subject can include polynucleotides derived from normal cells, as well as polynucleotides derived from diseased cells, such as cancer cells. Polynucleotides from cancer cells may bear genetic variants, such as somatic cell mutations and copy number variants. When cell free polynucleotides from a sample from a subject are sequenced, these cancer polynucleotides are detected as sequence variants or as copy number variants.
- Measurements of a parameter may be provided with a confidence interval. Tested over time, one can determine whether a cancer is advancing, stabilized or in remission by comparing confidence intervals over time. When confidence intervals overlap, one may not be able to tell whether disease is increasing or decreasing, because there is no statistically significant difference between the measures. However, where the confidence intervals do not overlap, this indicates the direction of disease. For example, comparing the lowest point on a confidence interval at one time point and the highest point on a confidence interval at a second time point indicates the direction.
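The interval comparison described above can be sketched as follows. Intervals are assumed to be `(low, high)` pairs for a measure that rises with disease, such as variant frequency; the function name and labels are hypothetical.

```python
def disease_direction(earlier_ci, later_ci):
    """Compare confidence intervals from two time points.

    Returns "advancing" if the later interval lies entirely above the earlier
    one, "in remission" if it lies entirely below, and "indeterminate" if the
    intervals overlap.
    """
    earlier_low, earlier_high = earlier_ci
    later_low, later_high = later_ci
    if later_low > earlier_high:
        return "advancing"
    if later_high < earlier_low:
        return "in remission"
    return "indeterminate"
```

An "indeterminate" result does not by itself show that the disease is stable; it only means the two measurements cannot be distinguished.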
- the process can comprise generating genetic Report/Diagnosis.
- the process can comprise generating a genetic graph for a plurality of measurements showing mutation trend ( 18 ) and generating a report showing treatment results and options ( 19 ).
- FIGS. 10A-10C show in more detail one embodiment for generating genetic reports and diagnosis (e.g., Report/Diagnosis).
- FIG. 10C shows an exemplary pseudo-code executed by the system of FIG. 9A to process non-CNV reported mutant allele frequencies.
- the system can process CNV reported mutant allele frequencies as well.
- Samples comprising genetic material can be collected from a subject at a plurality of time points, that is, serially.
- the genetic material can be sequenced, e.g., using a high-throughput sequencing system. Sequencing can target loci of interest to detect genetic variants, such as genes bearing somatic mutations, genes that undergo copy number variation, or genes involved in gene fusions, for example, in cancer.
- a quantitative measure of the genetic variants found can be determined.
- the quantitative measure can be the frequency or percentage of a genetic variant among polynucleotides mapping to a locus, or the absolute number of sequence reads or polynucleotides mapping to a locus.
- variants having a non-zero quantity at at least one time point can then be represented graphically through all time points. For example, in a collection of 1000 sequences, variant 1 may be found at time points 1, 2 and 3 in amounts of 50, 30 and 0, respectively. Variant 2 may be found in amounts 0, 10 and 20 at these time points. These amounts can be normalized, for variant 1, to 5%, 3% and 0%, and, for variant 2, 0%, 1% and 2%. A graphical representation showing the union of all non-zero results can indicate these amounts for both variants at all of the time points. The normalized amounts can be scaled so that each percentage is represented by a layer, for example, having height 1 mm.
- the heights would be at time point 1: heights 5 mm (variant 1) and 0 mm (variant 2); at time point 2: heights 3 mm (variant 1) and 1 mm (variant 2), at time point 3: heights 0 mm (variant 1) and 2 mm (variant 2).
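The normalization and scaling in this example can be sketched as follows, using the counts given above (1000 sequences per time point, 1 mm per percent). The function names are hypothetical.

```python
def normalize_counts(counts_by_variant, total_sequences):
    """Convert per-time-point counts to percentages, keeping only variants
    that are non-zero at one or more time points."""
    return {
        variant: [100 * c / total_sequences for c in counts]
        for variant, counts in counts_by_variant.items()
        if any(c > 0 for c in counts)
    }


def layer_heights(percent_by_variant, mm_per_percent=1.0):
    """Scale percentages to layer heights in millimetres."""
    return {
        variant: [p * mm_per_percent for p in percents]
        for variant, percents in percent_by_variant.items()
    }


counts = {"variant 1": [50, 30, 0], "variant 2": [0, 10, 20]}
percents = normalize_counts(counts, 1000)
heights = layer_heights(percents)
```

Here `heights` holds 5, 3 and 0 mm for variant 1 and 0, 1 and 2 mm for variant 2 across the three time points.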
- the graphical representation can be in the form of a stacked area graph, such as a streamgraph. A “zero” time point (before the first time point) can be represented by a point, with all values at 0.
- the height of the quantity of the variants in the graphical representation can be, for example, relative or proportional to each other.
- a variant frequency 5% at one time point could be represented with a height of twice that of a variant with frequency of 2.5% at the same time point.
- the order of stacking can be chosen for ease of understanding.
- variants can be stacked in order of quantity high to low from bottom to top. Or, they can be stacked in a streamgraph with the variant of largest initial amount in the middle, and other variants of decreasing quantity on either side.
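The two stacking orders can be sketched as follows. The streamgraph ordering shown is one simple way to place the largest initial variant in the middle with smaller variants alternating outward; it is an assumption of this sketch, and the function names are hypothetical.

```python
def stack_high_to_low(quantities):
    """Order variants from largest to smallest quantity (bottom to top)."""
    return sorted(quantities, key=quantities.get, reverse=True)


def streamgraph_order(initial_quantities):
    """Place the largest variant in the middle and alternate the rest outward."""
    ranked = sorted(initial_quantities, key=initial_quantities.get, reverse=True)
    order = []
    for i, name in enumerate(ranked):
        if i % 2 == 0:
            order.append(name)     # even ranks go to one side
        else:
            order.insert(0, name)  # odd ranks go to the other side
    return order
```

For five variants a to e with initial quantities 5, 4, 3, 2 and 1, `streamgraph_order` yields d, b, a, c, e, with the largest variant in the center.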
- the areas can be color coded based on variant.
- Variants in the same gene can be shown in different hues of the same color.
- KRAS mutants can be shown in different shades of blue, EGFR mutants in different shades of red.
- the process can comprise receiving genetic information from a DNA sequencer ( 30 ).
- the process can then comprise determining specific gene alterations and quantities thereof ( 32 ).
- a tumor response map is generated.
- the process can comprise normalizing the quantities for each gene alteration for rendering across all test points and then generating a scaling factor ( 34 ).
- the term “normalize” generally refers to adjusting values measured on different scales to a notionally common scale. For example, data measured at different points are converted/adjusted so that all values can be resized to a common scale.
- the process can comprise rendering information on a tumor response map ( 36 ).
- the process can comprise rendering alterations and relative heights using the determined scaling factor ( 38 ) and assigning a unique visual indicator for each alteration ( 40 ).
- the process can comprise generating a summary of alterations and treatment options ( 42 ). Also, information from clinical trials that may be relevant to the particular genetic alterations, together with other helpful treatment suggestions, explanations of terminology, test methodology, and other information, is added to the report and rendered for the user.
- the copy number variation may be reported as a graph, indicating various positions in the genome and a corresponding increase or decrease or maintenance of copy number variation at each respective position. Additionally, copy number variation may be used to report a percentage score indicating how much disease material (or nucleic acids having a copy number variation) exists in the cell free polynucleotide sample.
- the report includes annotations to help physicians interpret the results and recommend treatment options.
- the annotating can include annotating a report for a condition in the NCCN Clinical Practice Guidelines in Oncology™ or the American Society of Clinical Oncology (ASCO) clinical practice guidelines.
- the annotating can include listing one or more FDA-approved drugs for off-label use, one or more drugs listed in a Centers for Medicare and Medicaid Services (CMS) anti-cancer treatment compendia, and/or one or more experimental drugs found in scientific literature, in the report.
- the annotating can include connecting a listed drug treatment option to a reference containing scientific information regarding the drug treatment option.
- the scientific information can be from a peer-reviewed article from a medical journal.
- the annotating can include providing a link to information on a clinical trial for a drug treatment option in the report.
- the annotating can include presenting information in a pop-up box or fly-over box near provided drug treatment options in an electronic based report.
- the annotating can include adding information to a report selected from the group consisting of one or more drug treatment options, scientific information concerning one or more drug treatment options, one or more links to scientific information regarding one or more drug treatment options, one or more links to citations for scientific information regarding one or more drug treatment options, and clinical trial information regarding one or more drug treatment options.
- FIG. 10B shows an exemplary process to generate a tumor response map pathway which may be used by a healthcare practitioner, e.g., physician, for example to make patient care decisions.
- the process can comprise first determining a global scaling factor ( 43 ).
- the process can comprise transforming the absolute value into a relative metric/scale that may be more amenable to plotting (e.g., multiplying the mutant allele frequency by 100 and taking the log of that value) and determining a global scaling factor using the maximum observed value.
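- the transformation and global scaling factor can be sketched as follows. Several details are assumptions made for illustration and are not stated in the disclosure: the mutant allele frequency is taken to be expressed in percent, the logarithm is taken to be base 10, negative transformed values are clamped to zero, and the names `transform`, `global_scaling_factor`, and `quantity_indicators` are hypothetical.

```python
import math


def transform(maf_percent):
    """Multiply the mutant allele frequency by 100 and take log10.

    Values that would transform to a negative number are clamped to 0.0
    (an assumption, so that plotted heights are never negative).
    """
    if maf_percent <= 0:
        return 0.0
    return max(0.0, math.log10(maf_percent * 100.0))


def global_scaling_factor(tests, max_height=1.0):
    """Derive one factor from the maximum transformed value over all tests."""
    peak = max((transform(v) for t in tests for v in t.values()), default=0.0)
    return max_height / peak if peak > 0 else 0.0


def quantity_indicators(test, factor):
    """Scaling factor times transformed value, per variant, for one test."""
    return {name: factor * transform(v) for name, v in test.items()}


tests = [{"EGFR T790M": 1.0}, {"EGFR T790M": 10.0, "KRAS G12D": 0.1}]
factor = global_scaling_factor(tests)
latest = quantity_indicators(tests[1], factor)
```

- with these assumptions, the variant with the maximum observed value is plotted at the full height, and every other variant is plotted relative to it.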
- the process then involves visualizing information from the earliest test dataset ( 44 ).
- Visualizing can comprise graphically representing the information on a user interface (e.g., a computer screen) or in tangible form (e.g., on a piece of paper).
- the process can comprise multiplying the scaling factor by a transformed value for each gene, using the result as a quantity indicator for plotting that variant, and then assigning a color/unique visual indicator for each alteration. Then the process can comprise visualizing information for subsequent test points ( 45 ) using the following pseudo-code:
- Each subsequent panel denoting a test date may also include additional patient or intervention information that may correlate with the alteration changes seen in the remainder of the map. Similar scaling, plotting, and transformation may be also implemented on CNV and other types of DNA alterations (e.g. methylation) to display these quantities in separate or combined charts. These additional annotations may themselves also be quantifiable and similarly plotted on the map.
- the process can then comprise determining a summary of alterations and treatment options ( 46 ).
- the following actions are done:
- Grouping of maximum mutant allele frequencies may also extend beyond just the genes they are harbored in to greater encapsulating annotations such as biological pathways, evidence level, etc.
- FIGS. 10D-10I show one exemplary report generated by the system of FIG. 9A .
- a patient identification section 52 provides patient information, reporting date, and physician contact information.
- a tumor response map 54 includes a modified streamgraph 56 that shows tumor activities with unique colors for each mutant gene.
- the graph 56 has accompanying summary explanation textbox 58 . More details are provided in a summary of alterations and treatment option section 60 .
- the alterations 62 and 64 are presented in section 60 , along with mutation trend, mutant allele frequency, cell-free amplification, FDA Approved Drug Indication, FDA Approved Drugs with other Indications, and Clinical Drug Trial information.
- FIGS. 10D-1, 10D-2, and 10D-3 provide enlarged views of FIG. 10D .
- FIG. 10E shows an exemplary report section providing definitions, comments, and interpretation of the tests.
- FIGS. 10E-1 and 10E-2 provide enlarged views of FIG. 10E .
- FIG. 10F shows an exemplary detailed therapy result portion of the report.
- FIGS. 10F-1 and 10F-2 provide enlarged views of FIG. 10F .
- FIG. 10G shows an exemplary discussion of the clinical relevance of detected alterations.
- FIGS. 10G-1 and 10G-2 provide enlarged views of FIG. 10G .
- FIG. 10H shows potentially available medications that are going through clinical trials.
- FIG. 10I shows the test methods and limitations thereof.
- FIGS. 10I-1 and 10I-2 provide enlarged views of FIG. 10I .
- FIGS. 10J-10P show various exemplary modified streamgraphs 56 .
- a streamgraph, or stream graph is a type of stacked area graph which is displaced around a central axis, resulting in a flowing, organic shape.
- Streamgraphs are a generalization of stacked area graphs where the baseline is free. By shifting the baseline, it is possible to minimize the change in slope (or “wiggle”) in individual series, thereby making it easier to perceive the thickness of any given layer across the data.
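- the effect of a free baseline can be illustrated with the simplest centered layout, in which the baseline at each time point is set to minus one half of the total height so that the stack is symmetric about the central axis. The sketch below implements only this symmetric layout; it does not implement the wiggle-minimizing baseline, nor the modified, non-reflective streamgraph of the disclosure. The function name `symmetric_stream` is hypothetical.

```python
def symmetric_stream(layers):
    """Compute lower/upper boundaries for layers stacked around a central axis.

    layers is a list of per-layer value lists, all of the same length
    (one value per time point). Returns one (lower, upper) pair of lists
    per layer, in stacking order.
    """
    n_points = len(layers[0])
    # Baseline: shift the whole stack down by half of its total height.
    baseline = [-0.5 * sum(layer[t] for layer in layers) for t in range(n_points)]
    bands, lower = [], baseline
    for layer in layers:
        upper = [lo + v for lo, v in zip(lower, layer)]
        bands.append((lower, upper))
        lower = upper
    return bands


bands = symmetric_stream([[1.0, 2.0], [3.0, 2.0]])
```

- the thickness of each band (upper minus lower) still equals the layer's value at that time point; only the position of the baseline changes.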
- FIG. 10J shows seven layers representing at least 8 mutants over three time periods, and a “0” time point (all values “0”).
- FIG. 10K shows a single mutant over 4 time periods. No mutants are detected at the second, third and fourth time points.
- FIG. 10L indicates frequency of dominant allele at each time point.
- FIG. 10M shows a single time point with a total of four mutants in two genes. Mutants are identified by the amino acid changed at a position (e.g., EGFR T790M).
- One embodiment renders a streamgraph so that it is not x-axis reflective.
- the modified graph applies a unique scaling to denote proportional attributes.
- the graph can indicate the addition of new attributes over time.
- the presence or absence of a mutation may be reflected in graphical form, indicating various positions in the genome and a corresponding increase or decrease or maintenance of a frequency of mutation at each respective position.
- mutations may be used to report a percentage score indicating how much disease material exists in the cell free polynucleotide sample.
- a confidence score may accompany each detected mutation, given known statistics of typical variances at reported positions in non-disease reference sequences. Mutations may also be ranked in order of abundance in the subject or ranked by clinically actionable importance.
- the mapping of genome positions and copy number variation for the subject with cancer can indicate that a particular cancer is aggressive and resistant to treatment.
- the subject may be monitored for a period and retested. If at the end of the period, the copy number variation profile, e.g., as depicted in a tumor response map, begins to increase dramatically, this may indicate that the current treatment is not working.
- a comparison can also be done with genetic profiles of other subjects. For example, if it is determined that this increase in copy number variation indicates that the cancer is advancing, then the original treatment regimen as prescribed is no longer treating the cancer and a new treatment is prescribed.
- These reports can be submitted and accessed electronically via the internet. Analysis of sequence data may occur at a site other than the location of the subject. The report can be generated and transmitted to the subject's location. Via an internet enabled computer, the subject may access the reports reflecting his tumor burden.
- an exemplary process receives genetic materials from a blood sample or other body samples ( 1102 ).
- the process can comprise converting the polynucleotides from the genetic materials into tagged parent nucleotides ( 1104 ).
- the tagged parent nucleotides are amplified to produce amplified progeny polynucleotides ( 1106 ).
- a subset of the amplified polynucleotides is sequenced to produce sequence reads ( 1108 ), which are grouped into families, each generated from a unique tagged parent nucleotide ( 1110 ).
- the process can comprise assigning a confidence score to each family ( 1112 ).
- a consensus is determined using prior readings. This is done by reviewing the prior confidence scores for each family, and if consistent prior confidence scores exist, then the current confidence score is increased ( 1114 ). If there are prior confidence scores, but they are inconsistent, the current confidence score is not modified in one embodiment ( 1116 ). In other embodiments, the confidence score is adjusted in a predetermined manner for inconsistent prior confidence scores. If this is the first time the family is detected, the current confidence score can be reduced as it may be a false reading ( 1118 ). The process can comprise inferring the frequency of the family at the locus in the set of tagged parent polynucleotides based on the confidence score. Then genetic test reports are generated as discussed above ( 1120 ).
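- the three branches of this adjustment ( 1114, 1116, 1118 ) can be sketched as follows. The disclosure does not define what makes prior scores "consistent" or by how much a score is raised or lowered; the `tolerance`, `boost`, and `penalty` parameters and the function name `adjust_confidence` are therefore assumptions made for illustration.

```python
def adjust_confidence(current, prior_scores, tolerance=0.1, boost=1.1, penalty=0.9):
    """Adjust a family's current confidence score using its prior scores.

    Assumed rule: prior scores are "consistent" when they all lie within
    `tolerance` of one another.
    """
    if not prior_scores:
        # First time the family is detected: reduce the score (1118).
        return current * penalty
    if max(prior_scores) - min(prior_scores) <= tolerance:
        # Consistent prior scores: increase the score, capped at 1.0 (1114).
        return min(1.0, current * boost)
    # Inconsistent prior scores: leave the score unchanged (1116).
    return current


first_seen = adjust_confidence(0.8, [])
consistent = adjust_confidence(0.8, [0.80, 0.82])
inconsistent = adjust_confidence(0.8, [0.2, 0.9])
```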
- the historical comparison can be used in conjunction with other consensus sequences mapping to a particular reference sequence to detect instances of genetic variation.
- Consensus sequences mapping to particular reference sequences can be measured and normalized against control samples. Measures of molecules mapping to reference sequences can be compared across a genome to identify areas in the genome in which copy number varies, or heterozygosity is lost.
- Consensus methods include, for example, linear or non-linear methods of building consensus sequences (e.g., voting, averaging, statistical, maximum a posteriori or maximum likelihood detection, dynamic programming, Bayesian, hidden Markov or support vector machine methods, etc.) derived from digital communication theory, information theory, or bioinformatics.
- a stochastic modeling algorithm is applied to convert the normalized nucleic acid sequence read coverage for each window region to the discrete copy number states.
- this algorithm may comprise one or more of the following: Hidden Markov Model, dynamic programming, support vector machine, Bayesian network, trellis decoding, Viterbi decoding, expectation maximization, Kalman filtering methodologies and neural networks.
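- as one example from this list, Viterbi decoding of a hidden Markov model can convert normalized window coverage into discrete copy number states. The sketch below is illustrative only. The candidate states, the Gaussian emission model (mean coverage equal to copy number divided by two), the `stay` probability, and `sigma` are all assumptions and are not values from the disclosure.

```python
import math


def viterbi_copy_number(coverage, states=(1, 2, 3), stay=0.9, sigma=0.25):
    """Decode the most likely copy number state for each window.

    coverage holds normalized read coverage per window, where 1.0 is assumed
    to correspond to two copies. Requires at least two candidate states.
    """
    if not coverage:
        return []
    n = len(states)
    switch = (1.0 - stay) / (n - 1)

    def log_emit(x, s):
        # Gaussian log-likelihood up to a constant shared by all states.
        return -0.5 * ((x - s / 2.0) / sigma) ** 2

    def log_trans(i, j):
        return math.log(stay if i == j else switch)

    # Uniform prior over states for the first window.
    score = [math.log(1.0 / n) + log_emit(coverage[0], s) for s in states]
    back = []
    for x in coverage[1:]:
        prev, pointers, score = score, [], []
        for j, s in enumerate(states):
            best = max(range(n), key=lambda i: prev[i] + log_trans(i, j))
            pointers.append(best)
            score.append(prev[best] + log_trans(best, j) + log_emit(x, s))
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [states[i] for i in path]


calls = viterbi_copy_number([1.0] * 3 + [1.5] * 5 + [1.0] * 3)
```

- the transition penalty makes the decoder favor contiguous segments, so a single outlying window is less likely to be called as a separate copy number state than a sustained run of elevated coverage.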
- cell free DNAs are extracted and isolated from a readily accessible bodily fluid such as blood, sweat, saliva, urine, etc.
- cell free DNAs can be extracted using a variety of methods known in the art, including but not limited to isopropanol precipitation and/or silica based purification.
- Cell free DNAs may be extracted from any number of subjects, such as subjects without cancer, subjects at risk for cancer, or subjects known to have cancer (e.g. through other means).
- any of a number of different sequencing operations may be performed on the cell free polynucleotide sample.
- Samples may be processed before sequencing with one or more reagents (e.g., enzymes, unique identifiers (e.g., barcodes), probes, etc.).
- the samples or fragments of samples may be tagged individually or in subgroups with the unique identifier. The tagged sample may then be used in a downstream application such as a sequencing reaction and individual molecules may be tracked to parent molecules.
- the cell free polynucleotides can be tagged or tracked in order to permit subsequent identification and origin of the particular polynucleotide.
- the assignment of an identifier to individual or subgroups of polynucleotides may allow for a unique identity to be assigned to individual sequences or fragments of sequences. This may allow acquisition of data from individual samples and is not limited to averages of samples.
- nucleic acids or other molecules derived from a single strand may share a common tag or identifier and therefore may be later identified as being derived from that strand.
- all of the fragments from a single strand of nucleic acid may be tagged with the same identifier or tag, thereby permitting subsequent identification of fragments from the parent strand.
- gene expression products may be tagged in order to quantify expression.
- a barcode or barcode in combination with sequence to which it is attached can be counted.
- the systems and methods can be used as a PCR amplification control. In such cases, multiple amplification products from a PCR reaction can be tagged with the same tag or identifier. If the products are later sequenced and demonstrate sequence differences, differences among products with the same identifier can then be attributed to PCR error. Additionally, individual sequences may be identified based upon characteristics of sequence data for the reads themselves.
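- the attribution of differences to PCR error can be sketched with a simple per-position majority vote among reads that share a tag. Majority voting is only one of many possible consensus methods; the function name `consensus_by_tag` and the assumption that all reads sharing a tag have the same length and alignment are introduced here for illustration.

```python
from collections import Counter, defaultdict


def consensus_by_tag(reads):
    """Build a consensus per tag and count bases that disagree with it.

    reads is an iterable of (tag, sequence) pairs. Sequences that share a tag
    are assumed to be aligned and of equal length.
    Returns {tag: (consensus_sequence, number_of_mismatching_bases)}.
    """
    families = defaultdict(list)
    for tag, seq in reads:
        families[tag].append(seq)

    result = {}
    for tag, seqs in families.items():
        # Majority vote at each position across the family.
        consensus = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
        # Bases that differ from the consensus are attributed to error.
        errors = sum(a != b for s in seqs for a, b in zip(s, consensus))
        result[tag] = (consensus, errors)
    return result


reads = [("AAC", "ACGT"), ("AAC", "ACGT"), ("AAC", "ACTT"), ("GGT", "TTTT")]
families = consensus_by_tag(reads)
```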
- the detection of unique sequence data at the beginning (start) and end (stop) portions of individual sequencing reads may be used, alone or in combination, with the length, or number of base pairs of each sequence read to assign unique identities to individual molecules. Fragments from a single strand of nucleic acid, having been assigned a unique identity, may thereby permit subsequent identification of fragments from the parent strand. This can be used in conjunction with bottlenecking the initial starting genetic material to limit diversity.
- unique sequence data at the beginning (start) and end (stop) portions of individual sequencing reads and sequencing read length may be used, alone or in combination, with the use of barcodes.
- the barcodes may be unique as described herein. In other cases, the barcodes themselves may not be unique. In this case, the use of non-unique barcodes, in combination with sequence data at the beginning (start) and end (stop) portions of individual sequencing reads and sequencing read length may allow for the assignment of a unique identity to individual sequences. Similarly, fragments from a single strand of nucleic acid having been assigned a unique identity may thereby permit subsequent identification of fragments from the parent strand.
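- combining a non-unique barcode with start and stop information to form a unique identity can be sketched as follows. The sketch assumes each read has already been mapped, so that its start and end are reference coordinates (which also fix its length); the dictionary keys `barcode`, `start`, and `end` and the function name `group_into_families` are hypothetical.

```python
from collections import defaultdict


def group_into_families(reads):
    """Group reads by the composite identity (barcode, start, end).

    A barcode alone may be shared by unrelated molecules; adding the mapped
    start and end positions distinguishes them.
    """
    families = defaultdict(list)
    for read in reads:
        key = (read["barcode"], read["start"], read["end"])
        families[key].append(read)
    return dict(families)


reads = [
    {"barcode": "AC", "start": 100, "end": 250},
    {"barcode": "AC", "start": 100, "end": 250},
    {"barcode": "AC", "start": 130, "end": 250},  # same barcode, other molecule
    {"barcode": "GT", "start": 100, "end": 250},
]
families = group_into_families(reads)
```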
- Sequencing methods may include, but are not limited to: high-throughput sequencing, pyrosequencing, sequencing-by-synthesis, single-molecule sequencing, nanopore sequencing, semiconductor sequencing, sequencing-by-ligation, sequencing-by-hybridization, RNA-Seq (Illumina), Digital Gene Expression (Helicos), Next generation sequencing, Single Molecule Sequencing by Synthesis (SMSS) (Helicos), massively-parallel sequencing, Clonal Single Molecule Array (Solexa), shotgun sequencing, Maxam-Gilbert sequencing, primer walking, and any other sequencing methods known in the art.
- Sequencing methods typically involve sample preparation, sequencing of polynucleotides in the prepared sample to produce sequence reads and bioinformatic manipulation of the sequence reads to produce quantitative and/or qualitative genetic information about the sample.
- Sample preparation typically involves converting polynucleotides in a sample into a form compatible with the sequencing platform used. This conversion can involve tagging polynucleotides.
- the tags comprise polynucleotide sequence tags. Conversion methodologies used in sequencing may not be 100% efficient. For example, it is not uncommon to convert polynucleotides in a sample with a conversion efficiency of about 1-5%, that is, about 1-5% of the polynucleotides in a sample are converted into tagged polynucleotides.
- Polynucleotides that are not converted into tagged molecules are not represented in a tagged library for sequencing. Accordingly, polynucleotides having genetic variants represented at low frequency in the initial genetic material may not be represented in the tagged library and, therefore, may not be sequenced or detected. By increasing conversion efficiency, the probability that a polynucleotide in the initial genetic material will be represented in the tagged library and, consequently, detected by sequencing is increased. Furthermore, rather than directly addressing the low conversion efficiency of library preparation, most protocols to date call for greater than 1 microgram of DNA as input material. However, when input sample material is limited or detection of polynucleotides with low representation is desired, high conversion efficiency is needed to efficiently sequence the sample and/or to adequately detect such polynucleotides.
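- the effect of conversion efficiency on representation can be made concrete with a short calculation. Assuming, for illustration only, that each molecule is converted independently with the same probability, a variant present in a given number of copies is represented in the tagged library with probability 1 - (1 - efficiency)^copies. The function name `prob_represented` is hypothetical.

```python
def prob_represented(copies, efficiency):
    """Probability that at least one of `copies` molecules is tagged.

    Assumes independent conversion of each molecule with probability
    `efficiency` (a simplifying assumption).
    """
    return 1.0 - (1.0 - efficiency) ** copies


low = prob_represented(10, 0.05)   # ten variant copies at 5% efficiency
high = prob_represented(10, 0.5)   # the same ten copies at 50% efficiency
```

- under this assumption, ten variant copies are represented about 40% of the time at 5% efficiency and more than 99.9% of the time at 50% efficiency.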
- mutation detection may be performed on selectively enriched regions of the genome or transcriptome purified and isolated ( 1302 ).
- specific regions which may include but are not limited to genes, oncogenes, tumor suppressor genes, promoters, regulatory sequence elements, non-coding regions, miRNAs, snRNAs and the like may be selectively amplified from a total population of cell free polynucleotides. This may be performed as herein described.
- multiplex sequencing may be used, with or without barcode labels for individual polynucleotide sequences.
- sequencing may be performed using any nucleic acid sequencing platforms known in the art. This step generates a plurality of genomic fragment sequence reads ( 1304 ).
- a reference sequence is obtained from a control sample, taken from another subject.
- the control subject may be a subject known not to have genetic aberrations or disease.
- these sequence reads may contain barcode information. In other examples, barcodes are not utilized.
- a quality score may be a representation of reads that indicates whether those reads may be useful in subsequent analysis based on a threshold. In some cases, some reads are not of sufficient quality or length to perform the subsequent mapping step. Sequencing reads with a quality score of less than 90%, 95%, 99%, 99.9%, 99.99% or 99.999% may be filtered out of the data set. In step 1306, the genomic fragment reads that meet a specified quality score threshold are mapped to a reference genome, or a reference sequence that is known not to contain mutations.
- a mapping score may be a representation of reads mapped back to the reference sequence indicating whether each position is or is not uniquely mappable.
- reads may be sequences unrelated to mutation analysis. For example, some sequence reads may originate from contaminant polynucleotides. Sequencing reads with a mapping score of less than 90%, 95%, 99%, 99.9%, 99.99% or 99.999% may be filtered out of the data set.
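- threshold-based filtering of this kind can be sketched as follows, keeping only reads that meet both a quality score threshold and a mapping score threshold. Scores are represented as fractions between 0 and 1; the dictionary keys `quality` and `mapping`, the default thresholds, and the function name `filter_reads` are assumptions made for illustration.

```python
def filter_reads(reads, min_quality=0.99, min_mapping=0.99):
    """Keep reads whose quality and mapping scores both meet the thresholds."""
    return [
        read
        for read in reads
        if read["quality"] >= min_quality and read["mapping"] >= min_mapping
    ]


reads = [
    {"id": "r1", "quality": 0.999, "mapping": 0.999},
    {"id": "r2", "quality": 0.950, "mapping": 0.999},  # fails quality
    {"id": "r3", "quality": 0.999, "mapping": 0.900},  # fails mapping
]
kept = filter_reads(reads)
```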
- bases that do not meet the minimum threshold for mappability, or low quality bases may be replaced by the corresponding bases as found in the reference sequence.
- the frequency of variant bases may be calculated as the number of reads containing the variant divided by the total number of reads ( 1308 ) after ascertaining read coverage and identifying variant bases relative to the control sequence in each read. This may be expressed as a ratio for each mappable position in the genome.
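- this ratio can be computed per position as sketched below. For simplicity, the sketch assumes reads are given as a start position plus a sequence aligned to the reference without gaps; the function name `variant_frequencies` and this input format are hypothetical.

```python
from collections import Counter


def variant_frequencies(reads, reference):
    """Return {(position, base): variant reads / total reads at position}.

    reads is a list of (start, sequence) pairs aligned to `reference`
    without insertions or deletions (a simplifying assumption).
    """
    coverage = Counter()   # total reads covering each position
    variants = Counter()   # reads carrying a non-reference base
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(reference):
                break
            coverage[pos] += 1
            if base != reference[pos]:
                variants[(pos, base)] += 1
    return {key: count / coverage[key[0]] for key, count in variants.items()}


reference = "ACGTAC"
reads = [(0, "ACGT"), (0, "ACTT"), (2, "GTAC"), (2, "TTAC")]
freqs = variant_frequencies(reads, reference)
```

- in this example, two of the four reads covering position 2 carry a T where the reference has a G, giving a variant frequency of 0.5 at that position.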
- the frequencies of all four nucleotides, cytosine, guanine, thymine, adenine can be analyzed in comparison to the reference sequence.
- a stochastic or statistical modeling algorithm can be applied to convert the normalized ratios for each mappable position to reflect frequency states for each base variant.
- this algorithm may comprise one or more of the following: Hidden Markov Model, dynamic programming, support vector machine, Bayesian or probabilistic modeling, trellis decoding, Viterbi decoding, expectation maximization, Kalman filtering methodologies, and neural networks.
- the discrete mutation states of each base position can be utilized to identify a base variant with high frequency of variance as compared to the baseline of the reference sequence.
- the baseline might represent a frequency of at least 0.0001%, 0.001%, 0.01%, 0.1%, 1.0%, 2.0%, 3.0%, 4.0%, 5.0%, 10%, or 25%.
- all adjacent base positions with the base variant or mutation can be merged into a segment to report the presence or absence of a mutation.
- various positions can be filtered before they are merged with other segments.
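- merging adjacent variant positions into segments can be sketched as follows. The input is assumed to be a collection of integer base positions that have already passed any filtering; the function name `merge_adjacent` is hypothetical.

```python
def merge_adjacent(positions):
    """Merge runs of consecutive positions into (start, end) segments.

    Both ends are inclusive. Duplicate and unsorted positions are tolerated.
    """
    segments = []
    for pos in sorted(set(positions)):
        if segments and pos == segments[-1][1] + 1:
            # Extend the current segment by one base.
            segments[-1] = (segments[-1][0], pos)
        else:
            # Start a new segment at this position.
            segments.append((pos, pos))
    return segments


segments = merge_adjacent([5, 3, 4, 10, 11, 20])
```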
- the variant with largest deviation for a specific position in the sequence derived from the subject as compared to the reference sequence can be identified as a mutation.
- a mutation may be a cancer mutation.
- a mutation might be correlated with a disease state.
- a mutation or variant may comprise a genetic aberration that includes, but is not limited to a single base substitution, or small indels, transversions, translocations, inversion, deletions, truncations or gene truncations.
- a mutation may be at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15 or 20 nucleotides in length. In other cases, a mutation may be at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15 or 20 nucleotides in length.
- a consensus is determined using prior readings. This is done by reviewing the prior confidence scores for the corresponding bases, and if consistent prior confidence scores exist, then the current confidence score is increased ( 1314 ). If there are prior confidence scores, but they are inconsistent, the current confidence score is not modified in one embodiment ( 1316 ). In other embodiments, the confidence score is adjusted in a predetermined manner for inconsistent prior confidence scores. If this is the first time the family is detected, the current confidence score can be reduced as it may be a false reading ( 1318 ). The process can then comprise converting the frequency of variance for each base into discrete variant states for each base position ( 1320 ).
- Cancer cells, like most cells, can be characterized by a rate of turnover, in which old cells die and are replaced by newer cells. Generally, dead cells in contact with vasculature in a given subject may release DNA or fragments of DNA into the blood stream. This is also true of cancer cells during various stages of the disease. Cancer cells may also be characterized, dependent on the stage of the disease, by various genetic aberrations such as copy number variation as well as mutations. This phenomenon may be used to detect the presence or absence of cancers in individuals using the methods and systems described herein.
- blood from subjects at risk for cancer may be drawn and prepared as described herein to generate a population of cell free polynucleotides.
- this might be cell free DNA.
- the systems and methods of the disclosure may be employed to detect mutations or copy number variations that may exist in certain cancers present. The method may help detect the presence of cancerous cells in the body, despite the absence of symptoms or other hallmarks of disease.
- the types and number of cancers that may be detected may include but are not limited to blood cancers, brain cancers, lung cancers, skin cancers, nose cancers, throat cancers, liver cancers, bone cancers, lymphomas, pancreatic cancers, bowel cancers, rectal cancers, thyroid cancers, bladder cancers, kidney cancers, mouth cancers, stomach cancers, solid state tumors, heterogeneous tumors, homogenous tumors and the like.
- the system and methods may be used to detect any number of genetic aberrations that may cause or result from cancers. These may include but are not limited to mutations, indels, copy number variations, transversions, translocations, inversion, deletions, aneuploidy, partial aneuploidy, polyploidy, chromosomal instability, chromosomal structure alterations, gene fusions, chromosome fusions, gene truncations, gene amplification, gene duplications, chromosomal lesions, DNA lesions, abnormal changes in nucleic acid chemical modifications, abnormal changes in epigenetic patterns, abnormal changes in nucleic acid methylation, infection and cancer.
- the systems and methods described herein may also be used to help characterize certain cancers.
- Genetic data produced from the system and methods of this disclosure may allow practitioners to help better characterize a specific form of cancer. Often times, cancers are heterogeneous in both composition and staging. Genetic profile data may allow characterization of specific sub-types of cancer that may be important in the diagnosis or treatment of that specific sub-type. This information may also provide a subject or practitioner clues regarding the prognosis of a specific type of cancer.
- the systems and methods provided herein may be used to monitor already known cancers, or other diseases in a particular subject. This may allow either a subject or practitioner to adapt treatment options in accord with the progress of the disease.
- the systems and methods described herein may be used to construct genetic profiles of a particular subject over the course of the disease. In some instances, cancers can progress, becoming more aggressive and genetically unstable. In other examples, cancers may remain benign, inactive or dormant. The system and methods of this disclosure may be useful in determining disease progression.
- the systems and methods described herein may be useful in determining the efficacy of a particular treatment option.
- successful treatment options may actually increase the amount of copy number variation or mutations detected in a subject's blood, as more cancer cells may die and shed DNA. In other examples, this may not occur.
- certain treatment options may be correlated with genetic profiles of cancers over time. This correlation may be useful in selecting a therapy.
- the systems and methods described herein may be useful in monitoring residual disease or recurrence of disease.
- the methods and systems described herein may not be limited to detection of mutations and copy number variations associated with only cancers.
- Various other diseases and infections may result in other types of conditions that may be suitable for early detection and monitoring.
- genetic disorders or infectious diseases may cause a certain genetic mosaicism within a subject. This genetic mosaicism may cause copy number variation and mutations that could be observed.
- the system and methods of the disclosure may also be used to monitor the genomes of immune cells within the body. Immune cells, such as B cells, may undergo rapid clonal expansion in the presence of certain diseases. Clonal expansions may be monitored using copy number variation detection and certain immune states may be monitored. In this example, copy number variation analysis may be performed over time to produce a profile of how a particular disease may be progressing.
- systems and methods of this disclosure may also be used to monitor systemic infections themselves, as may be caused by a pathogen such as a bacterium or virus.
- Copy number variation or even mutation detection may be used to determine how a population of pathogens is changing during the course of infection. This may be particularly important during chronic infections, such as HIV/AIDS or Hepatitis infections, whereby viruses may change life cycle state and/or mutate into more virulent forms during the course of infection.
- transplanted tissue undergoes a certain degree of rejection by the body upon transplantation.
- the methods of this disclosure may be used to determine or profile rejection activities of the host body, as immune cells attempt to destroy transplanted tissue. This may be useful in monitoring the status of transplanted tissue as well as altering the course of treatment or prevention of rejection.
- a disease may be heterogeneous. Disease cells may not be identical.
- some tumors are known to comprise different types of tumor cells, some cells in different stages of the cancer.
- heterogeneity may comprise multiple foci of disease. Again, in the example of cancer, there may be multiple tumor foci, perhaps where one or more foci are the result of metastases that have spread from a primary site.
- the methods of this disclosure may be used to generate a profile, fingerprint or set of data that is a summation of genetic information derived from different cells in a heterogeneous disease.
- This set of data may comprise copy number variation and mutation analyses alone or in combination.
- systems and methods of the disclosure may be used to diagnose, prognose, monitor or observe cancers or other diseases of fetal origin. That is, these methodologies may be employed in a pregnant subject to diagnose, prognose, monitor or observe cancers or other diseases in an unborn subject whose DNA and other polynucleotides may co-circulate with maternal molecules.
- these reports are submitted and accessed electronically via the internet. Analysis of sequence data occurs at a site other than the location of the subject. The report is generated and transmitted to the subject's location. Via an internet enabled computer, the subject accesses the reports reflecting his tumor burden.
- the annotated information can be used by a health care provider to select other drug treatment options and/or provide information about drug treatment options to an insurance company.
- the method can include annotating the drug treatment options for a condition in, for example, the NCCN Clinical Practice Guidelines in Oncology™ or the American Society of Clinical Oncology (ASCO) clinical practice guidelines.
- the drug treatment options that are stratified in a report can be annotated in the report by listing additional drug treatment options.
- An additional drug treatment can be an FDA-approved drug for an off-label use.
- the drugs used for annotating lists can be found in CMS approved compendia, including the National Comprehensive Cancer Network (NCCN) Drugs and Biologics Compendium™, Thomson Micromedex DrugDex®, Elsevier Gold Standard's Clinical Pharmacology compendium, and American Hospital Formulary Service-Drug Information Compendium®.
- the drug treatment options can be annotated by listing an experimental drug that may be useful in treating a cancer with one or more molecular markers of a particular status.
- the experimental drug can be a drug for which in vitro data, in vivo data, animal model data, pre-clinical trial data, or clinical-trial data are available.
- the drug treatment options can be annotated by providing a link on an electronic based report connecting a listed drug to scientific information regarding the drug.
- a link can be provided to information regarding a clinical trial for a drug (clinicaltrials.gov). If the report is provided via a computer or computer website, the link can be a footnote, a hyperlink to a website, a pop-up box, or a fly-over box with information, etc.
- the report and the annotated information can be provided on a printed form, and the annotations can be, for example, a footnote to a reference.
- the information for annotating one or more drug treatment options in a report can be provided by a commercial entity that stores scientific information.
- a health care provider can treat a subject, such as a cancer patient, with an experimental drug listed in the annotated information, and the health care provider can access the annotated drug treatment option, retrieve the scientific information (e.g., print a medical journal article) and submit it (e.g., a printed journal article) to an insurance company along with a request for reimbursement for providing the drug treatment.
- Physicians can use any of a variety of Diagnosis-related group (DRG) codes to enable reimbursement.
- a drug treatment option in a report can also be annotated with information regarding other molecular components in a pathway that a drug affects (e.g., information on a drug that targets a kinase downstream of a cell-surface receptor that is a drug target).
- the drug treatment option can be annotated with information on drugs that target one or more other molecular pathway components.
- the identification and/or annotation of information related to pathways can be outsourced or subcontracted to another company.
- the annotated information can be, for example, a drug name (e.g., an FDA approved drug for off-label use; a drug found in a CMS approved compendium, and/or a drug described in a scientific (medical) journal article), scientific information concerning one or more drug treatment options, one or more links to scientific information regarding one or more drugs, clinical trial information regarding one or more drugs (e.g., information from clinicaltrials.gov/), one or more links to citations for scientific information regarding drugs, etc.
- the annotated information can be inserted into any location in a report.
- Annotated information can be inserted in multiple locations on a report.
- Annotated information can be inserted in a report near a section on stratified drug treatment options.
- Annotated information can be inserted into a report on a separate page from stratified drug treatment options.
- a report that does not contain stratified drug treatment options can be annotated with information.
- the system can also include reports on the effects of drugs on a sample (e.g., tumor cells) isolated from a subject (e.g., a cancer patient).
- An in vitro culture using a tumor from a cancer patient can be established using techniques known to those skilled in the art.
- the system can also include high-throughput screening of FDA approved off-label drugs or experimental drugs using said in vitro culture and/or xenograft model.
- the system can also include monitoring tumor antigen for recurrence detection.
- the system can provide internet enabled access of reports of a subject with cancer.
- the system can use a handheld DNA sequencer or a desktop DNA sequencer.
- the DNA sequencer is a scientific instrument used to automate the DNA sequencing process. Given a sample of DNA, a DNA sequencer is used to determine the order of the four bases: adenine, guanine, cytosine, and thymine. The order of the DNA bases is reported as a text string, called a read.
- Some DNA sequencers can also be considered optical instruments, as they analyze light signals originating from fluorochromes attached to nucleotides.
- the data is sent by the DNA sequencers over a direct connection or over the internet to a computer for processing.
- the data processing aspects of the system can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them.
- Data processing apparatus of the invention can be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a programmable processor; and data processing method steps of the invention can be performed by a programmable processor executing a program of instructions to perform functions of the invention by operating on input data and generating output.
- the data processing aspects of the invention can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from and to transmit data and instructions to a data storage system, at least one input device, and at least one output device.
- Each computer program can be implemented in a high-level procedural or object-oriented programming language, or in assembly or machine language, if desired; and, in any case, the language can be a compiled or interpreted language.
- Suitable processors include, by way of example, both general and special purpose microprocessors. Generally, a processor will receive instructions and data from a read-only memory and/or a random access memory.
- Storage devices suitable for tangibly embodying computer program instructions and data include all forms of nonvolatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM disks. Any of the foregoing can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
- the invention can be implemented using a computer system having a display device such as a monitor or LCD (liquid crystal display) screen for displaying information to the user and input devices by which the user can provide input to the computer system such as a keyboard, a two-dimensional pointing device such as a mouse or a trackball, or a three-dimensional pointing device such as a data glove or a gyroscopic mouse.
- the computer system can be programmed to provide a graphical user interface through which computer programs interact with users.
- the computer system can be programmed to provide a virtual reality, three-dimensional display interface.
- the methods of this disclosure allow one to provide therapeutic interventions more precisely directed to the form of a disease in a subject, and to calibrate these therapeutic interventions over time.
- This precision reflects, in part, the precision by which one is able to profile the whole body tumor status of a subject as reflected in tumor heterogeneity.
- the therapeutic intervention is more effective against cancers with this profile than against cancers with any single one of these variants.
- a therapeutic intervention is an intervention that produces a therapeutic effect (e.g., is therapeutically effective).
- Therapeutically effective interventions prevent, slow the progression of, improve the condition of (e.g., cause remission of), or cure a disease, such as a cancer.
- a therapeutic intervention can include, for example, administration of a treatment, such as chemotherapy, radiation therapy, surgery, immunotherapy, administration of a pharmaceutical or a nutraceutical, or a change in behavior, such as diet.
- One measure of therapeutic effectiveness is effectiveness for at least 90% of subjects undergoing the intervention over at least 100 subjects.
- a therapeutic intervention is determined that takes into account both the type of genetic variants found in the disease cells and their relative amounts (e.g., proportion).
- the therapeutic intervention can treat the subject as if each clonal variant were a different cancer to be treated independently.
- these variants may be left out of the therapeutic intervention until they rise to a clinical threshold or significant relative frequency (e.g., greater than the threshold stated above).
- a therapeutic intervention can include treatments effective against diseases with each of the genetic variants.
- genetic variants, such as mutant forms of a gene or gene amplification, may be detected in several genes (e.g., a major clone and a minor clone).
- Each of these forms may be actionable, that is, a treatment may be known for which cancers with the particular variant are responsive.
- the profile of tumor heterogeneity may indicate that one of the variants is present in the polynucleotides at, for example, five times the level of each of the other two variants.
- a therapeutic intervention can be determined that involves delivering three different drugs to the subject, each drug relatively more effective against cancers bearing each of the variants.
- the drugs can be delivered as a cocktail, or sequentially.
- the drugs can be administered in doses stratified to reflect the relative amounts of the variants in the DNA.
- a drug effective against the most common variant can be administered in greater amount than drugs effective against the two less common variants.
- the profile of tumor heterogeneity can show the presence of a sub-population of cancer cells bearing a genetic variant that is resistant to a drug to which the disease typically responds.
- the therapeutic intervention can involve including both a first drug effective against tumor cells without the resistance variant and a second drug effective against tumor cells with the resistant variant.
- doses can be stratified to reflect relative amounts of each variant detected in the profile.
- changes in the profile of tumor heterogeneity are examined over time, and therapeutic interventions are developed to treat the changing tumor.
- disease heterogeneity can be determined at a plurality of different times.
- Using the profiling methods of this disclosure, more precise inferences can be made about tumor evolution. This allows the practitioner to monitor the evolution of the disease, in particular as new clonal sub-populations emerge after remission effected by a first wave of therapy.
- therapeutic interventions can be calibrated over time to treat the changing tumor.
- a profile may show that a cancer has a form that is responsive to a certain treatment. The treatment is delivered and the tumor burden is seen to decrease over time.
- a genetic variant is found in the tumor indicating the presence of a population of cancer cells that is not responsive to the treatment.
- a new therapeutic intervention is determined that targets the cells bearing the marker of non-responsiveness.
- a dominant tumor form can eventually give way through Darwinian selection to cancer cells carrying mutants that render the cancer unresponsive to the therapy regimen. Appearance of these resistance mutants can be delayed through methods of this disclosure.
- a subject is subjected to one or more pulsed therapy cycles, each pulsed therapy cycle comprising a first period during which a drug is administered at a first amount and a second period during which the drug is administered at a second, reduced amount.
- the first period is characterized by a tumor burden detected above a first clinical level.
- the second period is characterized by a tumor burden detected below a second clinical level.
- First and second clinical levels can be different in different pulsed therapy cycles. So, for example, the first clinical level can be lower in succeeding cycles.
- a plurality of cycles can include at least 2, 3, 4, 5, 6, 7, 8 or more cycles.
- the BRAF mutant V600E may be detected in disease cell polynucleotides at an amount indicating a tumor burden of 5% in cfDNA.
- Chemotherapy can commence with dabrafenib. Subsequent testing can show that the amount of the BRAF mutant in the cfDNA falls below 0.5% or to undetectable levels. At this point, dabrafenib therapy can stop or be significantly curtailed. Further subsequent testing may find that DNA bearing the BRAF mutation has risen to 2.5% of polynucleotides in cfDNA. At this point, dabrafenib therapy is re-started, e.g., at the same level as the initial treatment. Subsequent testing may find that DNA bearing the BRAF mutation has decreased to 0.5% of polynucleotides in cfDNA. Again, dabrafenib therapy is stopped or reduced. The cycle can be repeated a number of times.
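- The pulsed cycle described above behaves like a two-threshold (hysteresis) rule: therapy starts when the measured mutant fraction reaches a first clinical level and pauses when it falls to a second, lower level. The following is a minimal sketch of that logic only. The function name `pulsed_therapy_schedule`, the fixed thresholds, and the "treat"/"pause" labels are illustrative assumptions, and the sketch keeps the levels constant even though the text allows them to change between cycles.

```python
def pulsed_therapy_schedule(burden_series, start_level, stop_level):
    """Return one 'treat' or 'pause' decision per serial cfDNA measurement.

    burden_series: mutant fraction in cfDNA (percent) at each blood draw.
    start_level:   hypothetical first clinical level that (re)starts therapy.
    stop_level:    hypothetical second clinical level that pauses therapy.
    """
    decisions = []
    treating = False
    for burden in burden_series:
        if not treating and burden >= start_level:
            treating = True          # burden rose to the first clinical level
        elif treating and burden <= stop_level:
            treating = False         # burden fell to the second clinical level
        decisions.append("treat" if treating else "pause")
    return decisions


# Mutant fractions (%) taken from the BRAF V600E example in the text.
measurements = [5.0, 0.5, 2.5, 0.5]
schedule = pulsed_therapy_schedule(measurements, start_level=2.5, stop_level=0.5)
print(schedule)
```

- Keeping two separate levels avoids toggling therapy on small fluctuations around a single cutoff.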
- FIG. 7 shows an exemplary course of monitoring and treatment of disease in a subject.
- a subject tested at the time of blood draw 1 has a tumor burden of 1.4% and presents with genetic alterations in genes 1, 2 and 3.
- the subject is treated with Drug A. After a time, treatment is discontinued.
- a second blood draw shows the cancer in remission.
- a third blood draw indicates that the cancer has recurred, in this instance, presenting with a genetic variant in Gene 4.
- the subject is now put on a course of Drug B, to which cancers having this variant are responsive.
- a therapeutic intervention can be changed upon detection of the rise of a mutant form resistant to an original drug.
- cancers with the EGFR mutation L858R respond to therapy with erlotinib.
- cancers with the EGFR mutation T790M are resistant to erlotinib.
- they are responsive to ruxolitinib.
- a method of this disclosure involves monitoring changes in tumor profile and changing a therapeutic intervention when a genetic variant associated with drug resistance rises to a predetermined clinical level.
- a database is built in which genetic information from serial samples collected from cancer patients is recorded.
- This database may also contain intervening treatment and other clinically relevant information, such as weight, adverse effects, histological testing, blood testing, radiographic information, prior treatments, cancer type, etc.
- Serial test results can be used to infer efficacy of treatment, especially when used with blood samples, which can give a more unbiased estimate of tumor burden than self-reporting or radiographic reporting by a medical practitioner.
- Treatment efficacy can be clustered by those with similar genomic profiles and vice versa. Genomic profiles can be organized around, for example, primary genetic alteration, secondary genetic alteration(s), relative amounts of these genetic alterations, and tumor load. This database can be used for decision support for subsequent patients.
- Both germline and somatic alterations can be used for determining treatment efficacy as well.
- Acquired resistance alterations can also be inferred from the database when treatments that were effective initially begin to fail. This failure can be detected through radiographic, blood or other means.
- the primary data used for inference of acquired resistance mechanisms are genomic tumor profiles collected after treatment per patient. This data can also be used to place quantitative bounds on likely treatment response as well as predict time to treatment failure. Based on likely acquired resistance alterations for a given treatment and tumor genomic profile, a treatment regimen can be modified to suppress acquisition of most likely resistance alterations.
- FIG. 5 shows a computer system 1501 that is programmed or otherwise configured to implement the methods of the present disclosure.
- the computer system 1501 includes a central processing unit (CPU, also “processor” and “computer processor” herein) 1505.
- the computer system 1501 also includes memory or memory location 1510 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 1515 (e.g., hard disk), communication interface 1520 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 1525, such as cache, other memory, data storage and/or electronic display adapters.
- the memory 1510, storage unit 1515, interface 1520 and peripheral devices 1525 are in communication with the CPU 1505 through a communication bus (solid lines).
- the storage unit 1515 can be a data storage unit (or data repository) for storing data.
- the computer system 1501 can be operatively coupled to a computer network (“network”) 1530 with the aid of the communication interface 1520.
- the network 1530 can be the Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the Internet.
- the network 1530 in some cases is a telecommunication and/or data network.
- the network 1530 can include one or more computer servers, which can enable distributed computing, such as cloud computing.
- the CPU 1505 can execute a sequence of machine-readable instructions, which can be embodied in a program or software.
- the instructions may be stored in a memory location, such as the memory 1510.
- the storage unit 1515 can store files, such as drivers, libraries and saved programs.
- the computer system 1501 can communicate with one or more remote computer systems through the network 1530.
- Methods as described herein can be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the computer system 1501, such as, for example, on the memory 1510 or electronic storage unit 1515.
- the machine executable or machine readable code can be provided in the form of software. Aspects of the systems and methods provided herein, such as the computer system 1501, can be embodied in programming.
- the computer system 1501 can include or be in communication with an electronic display that comprises a user interface (UI) for providing, for example, one or more results of sample analysis.
- Nucleotide positions (e.g., loci) in the genome can be designated by number, as depicted in FIG. 2. Positions at which about 100% of the base calls are identical to the reference sequence or at which about 100% of the base calls are different than the reference sequence are inferred to represent homozygosity of the cfDNA (presumed normal). Positions at which about 50% of the base calls are identical to the reference sequence are inferred to represent heterozygosity of the cfDNA (also presumed normal). Positions at which the percentage of non-reference base calls is substantially below 50% and above the detection limit of the base calling system are inferred to represent tumor-associated genetic variants.
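- The inference rules above can be sketched as a simple classifier over the fraction of non-reference base calls at a locus. This is an illustrative sketch only: the function name `classify_locus` and every numeric cutoff (detection limit, heterozygous band, homozygous minimum, somatic maximum) are assumptions chosen to stand in for "about 100%", "about 50%" and "substantially below 50%", which the text does not quantify.

```python
def classify_locus(variant_fraction,
                   detection_limit=0.001,
                   het_band=(0.40, 0.60),
                   hom_min=0.98,
                   somatic_max=0.30):
    """Classify a locus from the fraction of base calls that differ from the reference.

    All thresholds are hypothetical placeholders, not values from the source.
    """
    if not 0.0 <= variant_fraction <= 1.0:
        raise ValueError("variant_fraction must be between 0 and 1")
    if variant_fraction >= hom_min:
        return "homozygous (non-reference)"   # about 100% differ from reference
    if het_band[0] <= variant_fraction <= het_band[1]:
        return "heterozygous"                 # about 50% match the reference
    if variant_fraction < detection_limit:
        return "homozygous (reference)"       # about 100% match the reference
    if variant_fraction <= somatic_max:
        return "tumor-associated variant"     # well below 50%, above detection limit
    return "indeterminate"                    # falls between the assumed bands


for fraction in (1.0, 0.5, 0.089, 0.0, 0.35):
    print(fraction, classify_locus(fraction))
```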
- the sample is subjected to proteinase K digestion.
- DNA is precipitated with isopropanol.
- DNA is captured on a DNA purification column (e.g., a QIAamp DNA Blood Mini Kit) and eluted in 100 µl solution.
- DNAs below 500 bp are selected with Ampure SPRI magnetic bead capture (PEG/salt).
- HGE haploid genome equivalents
- High-efficiency DNA tagging (>80%) is performed by end repair, A-tailing and sticky-end ligation with 2 different octamers (i.e., 4 combinations) with overloaded hairpin adaptors.
- 2.5 ng DNA i.e. approximately 800 HGE
- Each hairpin adaptor comprises a random sequence on its non-complementary portion. Both ends of each DNA fragment are attached with hairpin adaptors.
- Each tagged fragment can be identified by a combination of the octamer sequence on the hairpin adaptors and endogenous portions of the insert sequence.
- Tagged DNA is amplified by 12 cycles of PCR to produce about 1-7 µg DNA that contains approximately 500 copies of each of the 800 HGE in the starting material.
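- The quantities in this step can be checked with simple arithmetic. The sketch below assumes a haploid human genome mass of roughly 3.3 pg, a commonly used approximation that is not stated in the text, and compares the mass-based fold amplification (2.5 ng input, 1-7 µg output) with the theoretical ceiling for 12 doubling cycles. Variable names are illustrative.

```python
PG_PER_HAPLOID_GENOME = 3.3   # assumed approximate mass of one haploid human genome, in pg

input_ng = 2.5                # starting DNA mass from the text
output_ng_low, output_ng_high = 1000.0, 7000.0   # 1-7 micrograms, expressed in ng
pcr_cycles = 12

# Haploid genome equivalents in the input: mass (pg) / mass per genome (pg).
hge = input_ng * 1000.0 / PG_PER_HAPLOID_GENOME

# Fold amplification implied by the input and output masses.
fold_low = output_ng_low / input_ng
fold_high = output_ng_high / input_ng

# Upper bound if every cycle doubled every molecule.
max_fold = 2 ** pcr_cycles

print(f"HGE in input: about {hge:.0f}")
print(f"Mass-based fold amplification: {fold_low:.0f} to {fold_high:.0f}")
print(f"Theoretical maximum for {pcr_cycles} cycles: {max_fold}")
```

- Under these assumptions the input corresponds to roughly 760 HGE, consistent with the approximately 800 HGE stated, and the stated figure of about 500 copies per HGE lies within the mass-based range and below the doubling ceiling.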
- Buffer optimization, polymerase optimization and cycle reduction may be performed to optimize the PCR reactions.
- Amplification biases (e.g., non-specific bias, GC bias, and/or size bias) are also reduced by optimization.
- Noise (e.g., polymerase-introduced errors) is reduced by using high-fidelity polymerases.
- Sequences may be enriched as follows: DNAs with regions of interest (ROI) are captured using biotin-labeled beads with probes to the ROIs. The ROIs are amplified with 12 cycles of PCR to generate a 2000-fold amplification.
- 0.1 to 1% of the sample is used for sequencing.
- the resulting DNA is then denatured and diluted to 8 pM and loaded into an Illumina sequencer.
- Sequence reads are grouped into families, with about 10 sequence reads in each family. Families are collapsed into consensus sequences by voting (e.g., biased voting) each position in a family. A base is called for consensus sequence if 8 or 9 members agree. A base is not called for consensus sequence if no more than 60% of the members agree.
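- Collapsing a family into a consensus sequence by per-position voting can be sketched as follows. The sketch assumes the reads in a family are already aligned and of equal length, uses a plain (unbiased) majority vote, and treats 80% agreement as the calling threshold. The text states only that a base is called when 8 or 9 of about 10 members agree and is not called at 60% or less, so the 0.8 cutoff, the `N` no-call symbol and the function name `collapse_family` are assumptions.

```python
from collections import Counter


def collapse_family(reads, min_agreement=0.8, no_call="N"):
    """Collapse aligned, equal-length reads from one family into a consensus string."""
    if not reads:
        raise ValueError("family must contain at least one read")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads in a family must have equal length")

    consensus = []
    for column in zip(*reads):                      # one tuple of bases per position
        base, count = Counter(column).most_common(1)[0]
        agreement = count / len(reads)
        consensus.append(base if agreement >= min_agreement else no_call)
    return "".join(consensus)


# 9 of 10 members agree at the third position: the base is called.
family_a = ["ACGT"] * 9 + ["ACTT"]
# Only 6 of 10 agree at the third position: no base is called there.
family_b = ["ACGT"] * 6 + ["ACTT"] * 4

print(collapse_family(family_a))
print(collapse_family(family_b))
```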
- the resulting consensus sequences are mapped to a reference genome, such as hg19.
- Each base in a consensus sequence is covered by about 3000 different families.
- a quality score for each sequence is calculated and sequences are filtered based on their quality scores.
- Base calls at each position in a consensus sequence are compared with the hg19 reference sequence. At each position at which a base call differs from the reference sequence, the identity of the different base or bases, and their percentage as a function of total base calls at the locus, are determined and reported.
- Sequence variation is detected by counting distribution of bases at each locus. If 98% of the reads have the same base (homozygous) and 2% have a different base, the locus is likely to have a sequence variant, presumably from cancer DNA.
- CNV is detected by counting the total number of sequences (bases) mapping to a locus and comparing with a control locus.
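- The two counting steps above can be sketched in a few lines: a per-locus base distribution for sequence variation, and a read-count ratio against a control locus for CNV. The function names and the example counts are illustrative assumptions, and the ratio shown is a raw count ratio with no correction for capture efficiency, GC content or locus length.

```python
from collections import Counter


def base_distribution(base_calls):
    """Return the fraction of each base among the calls at one locus."""
    counts = Counter(base_calls)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no base calls at this locus")
    return {base: n / total for base, n in counts.items()}


def copy_ratio(target_count, control_count):
    """Raw ratio of sequences mapping to a target locus versus a control locus."""
    if control_count <= 0:
        raise ValueError("control_count must be positive")
    return target_count / control_count


# Hypothetical locus at which 98% of calls are one base and 2% are another.
calls = "A" * 98 + "T" * 2
distribution = base_distribution(calls)
print(distribution)

# Hypothetical counts: twice as many sequences map to the target as to the control.
ratio = copy_ratio(target_count=6000, control_count=3000)
print(ratio)
```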
- CNV analysis is performed on specific regions, including regions on ALK, APC, BRAF, CDKN2A, EGFR, ERBB2, FBXW7, KRAS, MYC, NOTCH1, NRAS, PIK3CA, PTEN, RB1, TP53, MET, AR, ABL1, AKT1, ATM, CDH1, CSF1R, CTNNB1, ERBB4, EZH2, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK2, JAK3, KDR, KIT, MLH1, MPL, NPM1, PDGFRA, PROC, PTPN11, RET, SMAD4, SMARCB1, SMO, SRC, STK11, VHL, TERT, CCND1, CDK4, CDKN
- After fragments are amplified and the sequences of amplified fragments are read and aligned, the fragments are subjected to base calling. Variations in the number of amplified fragments and unseen amplified fragments can introduce errors in base calling. These variations are corrected by calculating the number of unseen amplified fragments.
- sequence readouts can come from two types of fragments: double-strand fragments and single-strand fragments. The following is a theoretical example of calculating the total number of unseen molecules in a sample.
- N is the total number of molecules in the sample.
- P is the probability of seeing a strand.
- Q is the probability of not detecting a strand.
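- The calculation itself is not reproduced in this passage, so the following is only one possible model built on the symbols N, P and Q defined above. It assumes each of the two strands of a molecule is detected independently with probability P (so Q = 1 - P). Under that assumption, about N·P² molecules are seen with both strands, about 2·N·P·Q with one strand only, and about N·Q² are not seen at all. The function name `estimate_unseen` and the example counts are hypothetical.

```python
def estimate_unseen(paired, singles):
    """Estimate P, N and the unseen-molecule count from observed strand counts.

    paired:  molecules observed with both strands (expected N * P**2)
    singles: molecules observed with exactly one strand (expected 2 * N * P * Q)
    Assumes independent detection of the two strands, which is a modelling
    assumption and not a statement of the source's method.
    """
    if paired <= 0:
        raise ValueError("at least one paired observation is required")
    if singles < 0:
        raise ValueError("singles cannot be negative")
    # singles / paired = 2Q / P, which rearranges to P = 2*paired / (2*paired + singles).
    p = 2 * paired / (2 * paired + singles)
    q = 1 - p
    n_total = paired / p ** 2      # from paired = N * P**2
    unseen = n_total * q ** 2      # molecules with neither strand detected
    return p, n_total, unseen


# Hypothetical counts consistent with N = 1000 and P = 0.8.
p, n_total, unseen = estimate_unseen(paired=640, singles=320)
print(p, n_total, unseen)
```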
- An assay is used to analyze a panel of genes to identify cancer-associated somatic variants with high sensitivity.
- Cell-free DNA is extracted from plasma of a patient and amplified by PCR. Genetic variants are analyzed by massively parallel sequencing of the amplified target genes. For one set of genes, all exons are sequenced, as such sequencing coverage has been shown to have clinical utility (Table 3). For another set of genes, sequencing coverage included those exons with a previously reported somatic mutation (Table 4). The minimum detectable mutant allele (limit of detection) is dependent on the patient's sample cell-free DNA concentration, which varied from less than 10 to over 1,000 genomic equivalents per mL of peripheral blood. Amplification may not be detected in samples with lower amounts of cell-free DNA and/or low-level gene copy amplification. Certain sample or variant characteristics resulted in reduced analytic sensitivity, such as low sample quality or improper collection.
- the percentage of genetic variants found in cell-free DNA circulating in blood is related to the unique tumor biology of this patient. Factors that affected the amount/percentages of detected genetic variants in circulating cell-free DNA in blood include tumor growth, turnover, size, heterogeneity, vascularization, disease progression or treatment. Table 5 annotates the percentage, or allele frequency, of altered circulating cell-free DNA (% cfDNA) detected in this patient. Some of the detected genetic variants are listed in descending order by % cfDNA.
- Genetic variants are detected in the circulating cell-free DNA isolated from this patient's blood specimen. These genetic variants are cancer-associated somatic variants, some of which have been associated with either increased or reduced clinical response to specific treatment. “Minor Alterations” are defined as those alterations detected at less than 10% of the allele frequency of “Major Alterations”. A Major Alteration is the predominant alteration at a locus. The detected allele frequencies of these alterations (Table 5) and associated treatments for this patient are annotated.
- a nucleotide detected at at least 98.8% frequency in the sample is different than a nucleotide in the reference sequence, indicating homozygosity at these loci.
- T was detected rather than reference nucleotide C in 100% of cases.
- a nucleotide detected at between 41.4% and 55% frequency in the sample is different than a nucleotide in the reference sequence, indicating heterozygosity at these loci.
- G was detected rather than reference nucleotide A in 50% of cases.
- a nucleotide detected at less than 9% frequency is different than a nucleotide in the reference sequence.
- BRAF (140453136 A>T, 8.9%)
- NRAS (115256530 G>T, 2.6%)
- JAK2 (5073770 G>T, 1.5%). They are presumed to be somatic mutations from cancer DNA.
- the relative amounts of tumor-associated genetic variants are calculated.
- the ratio of amounts of BRAF:NRAS:JAK2 is 8.9:2.6:1.5, or 1:0.29:0.17. From this result one can infer the presence of tumor heterogeneity. For example, one possible interpretation is that 100% of tumor cells contain a variant in BRAF, 83% contain variants in BRAF and NRAS, and 17% contain variants in BRAF, NRAS and JAK2. However, analysis of CNV may show amplification of BRAF, in which case 100% of tumor cells may have variants in both BRAF and NRAS.
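- The normalized ratio quoted above follows from dividing each detected allele fraction by the largest one. A minimal sketch, with the function name `relative_amounts` as an illustrative assumption:

```python
def relative_amounts(allele_fractions):
    """Scale each variant's allele fraction to the most abundant variant."""
    top = max(allele_fractions.values())
    if top <= 0:
        raise ValueError("at least one allele fraction must be positive")
    return {gene: round(value / top, 2) for gene, value in allele_fractions.items()}


# Allele fractions (%) reported for the three variants in the example.
detected = {"BRAF": 8.9, "NRAS": 2.6, "JAK2": 1.5}
ratios = relative_amounts(detected)
print(ratios)
```

- The ratio alone does not fix a clonal structure; as the text notes, copy number changes at a locus can shift these fractions, so the interpretation remains one of several possibilities.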
- Double-stranded cell-free DNA is isolated from the plasma of a patient.
- the cell-free DNA fragments are tagged using 16 different bubble-containing adaptors, each of which comprises a distinctive barcode.
- the bubble-containing adaptors are attached to both ends of each cell-free DNA fragment by ligation. After ligation, each cell-free DNA fragment can be distinctly identified by the sequence of the distinct barcodes and two 20 bp endogenous sequences at each end of the cell-free DNA fragment.
- the tagged cell-free DNA fragments are amplified by PCR.
- the amplified fragments are enriched using beads comprising oligonucleotide probes that specifically bind to a group of cancer-associated genes. Therefore, cell-free DNA fragments from the group of cancer-associated genes are selectively enriched.
- Sequencing adaptors, each of which comprises a sequencing primer binding site, a sample barcode, and a flow-cell sequence, are attached to the enriched DNA molecules.
- the resulting molecules are amplified by PCR.
- each bubble-containing adaptor comprises a non-complementary portion (e.g., the bubble)
- the sequence of one strand of the bubble-containing adaptor is different from the sequence of the other strand (complement). Therefore, the sequence reads of amplicons derived from the Watson strand of an original cell-free DNA can be distinguished from amplicons from the Crick strand of the original cell-free DNA by the attached bubble-containing adaptor sequences.
- sequence reads from a strand of an original cell-free DNA fragment are compared to the sequence reads from the other strand of the original cell-free DNA fragment. If a variant occurs in only the sequence reads from one strand, but not the other strand, of the original cell-free DNA fragment, this variant will be identified as an error (e.g., resulting from PCR and/or amplification), rather than a true genetic variant.
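- The strand comparison can be sketched with set operations: a variant supported by reads from both strands of the original fragment is kept, and a variant seen on only one strand is flagged as a likely error. The representation of a variant as a `(position, reference_base, alternate_base)` tuple and the function name `split_by_strand_support` are illustrative assumptions, and the example positions are made up.

```python
def split_by_strand_support(watson_variants, crick_variants):
    """Separate variants seen on both strands from those seen on one strand only.

    Each variant is a hashable tuple such as (position, ref_base, alt_base).
    Returns (supported_on_both, single_strand_only).
    """
    watson = set(watson_variants)
    crick = set(crick_variants)
    supported = watson & crick     # present in reads from both strands
    suspect = watson ^ crick       # present in reads from exactly one strand
    return supported, suspect


# Hypothetical calls for one original cfDNA fragment.
watson_calls = [(101, "A", "T"), (250, "G", "C")]
crick_calls = [(101, "A", "T")]

supported, suspect = split_by_strand_support(watson_calls, crick_calls)
print("kept as candidate variants:", supported)
print("flagged as likely errors:", suspect)
```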
- sequence reads are grouped into families. Errors in the sequence reads are corrected.
- the consensus sequence of each family is generated by collapsing.
- a therapeutic intervention is determined to treat the cancer.
- Cancers with BRAF mutants respond to treatment with vemurafenib, regorafenib, trametinib and dabrafenib.
- Cancers with NRAS mutants respond to treatment with trametinib.
- Cancers with JAK2 mutants respond to treatment with ruxolitinib.
- a therapeutic intervention including administration of trametinib and ruxolitinib is determined to be more effective against this cancer than treatment with any one of the aforementioned drugs alone.
- the subject is treated with a combination of trametinib and ruxolitinib at a dose ratio of 5:1.
- the cfDNA from the subject is tested again for the presence of tumor heterogeneity.
- Results show that the ratio of the BRAF:NRAS:JAK2 is now about 4:2:1.5. This indicates that the therapeutic intervention has reduced the number of cells with the BRAF and NRAS mutants, and has halted growth of cells with JAK2 mutants.
- a second therapeutic intervention is determined in which trametinib and ruxolitinib are determined to be effective in a dose ratio of 1:1.
- the subject is given a course of chemotherapy at amounts at this ratio.
- Subsequent testing shows that BRAF, NRAS and JAK2 mutants are present in cfDNA at amounts below 1%.
- a blood sample is collected from an individual with melanoma pre-treatment and the patient is determined to have a BRAF V600E mutation at a concentration of 2.8% and no detectable NRAS mutations using cell-free DNA analysis.
- the patient is put on an anti-BRAF therapy (dabrafenib). After 3 weeks, another blood sample is collected and tested.
- the BRAF V600E level is determined to have dropped to 0.1%.
- the therapy is stopped and the test repeated every 2 weeks.
- the BRAF V600E level rises again and therapy is reinitiated when the BRAF V600E level rises to 1.5%. Therapy is again stopped when the level drops down to 0.1% again. This cycle is repeated.
- Copy number variations in a patient sample are determined. Methods for determining can include molecular tracking and upsampling, as described above.
- a hidden Markov model based on expected locations of origins of replication is used to remove the effect of replication origin proximity from the estimated copy number variations in the patient sample. The standard deviation of copy-number variations for each gene is subsequently reduced by 40%.
- the replication origin proximity model is also used to infer cell-free tumor burden in the patient.
- the level of cell-free tumor-derived DNA may be low or below the detection limit of a particular technology. This can be the case when the number of human genome equivalents of tumor-derived DNA in plasma is below 1 copy per 5 mL.
- Radiation and chemotherapies have been shown to affect rapidly dividing cells more than stable, healthy cells, hence their efficacy in treating advanced cancer patients.
- a procedure with minimal adverse effects is administered to a patient pre-blood collection to preferentially increase the fraction of tumor-derived DNA collected.
- a low dose of chemotherapy could be administered to the patient and a blood sample could be collected within 24 hours, 48 hours, 72 hours or less than 1 week.
- this blood sample contains higher concentrations of cell-free tumor-derived DNA due to potentially higher rates of cell-death of cancer cells.
- low-dose radiation therapy is applied via a whole-body radiographic instrument or locally to the affected regions instead of low-dose chemotherapy.
- Other procedures are envisioned, including subjecting a patient to ultrasound, sound waves, exercise, stress, etc.
Abstract
Description
- This application is a continuation of U.S. patent application Ser. No. 17/462,906 filed Aug. 31, 2021, which is a continuation of U.S. patent application Ser. No. 17/000,010, filed Aug. 20, 2020, which is a continuation of U.S. patent application Ser. No. 15/431,395, filed Feb. 13, 2017, which is a continuation of International Patent Application No. PCT/US2015/067717, filed Dec. 28, 2015, which claims the benefit of U.S. Provisional Application No. 62/098,426, filed Dec. 31, 2014 and U.S. Provisional Application No. 62/155,763, filed on May 1, 2015, each of which is incorporated entirely herein by reference.
- Health care is just now starting to effectively use information from the human genome to diagnose and treat disease. Nowhere is this more crucial than in the treatment of cancer, from which 7.6 million people worldwide die each year, and for which the U.S. spends $87 billion a year on treatment. Cancer refers to any disorder of various malignant neoplasms characterized by the proliferation of anaplastic cells that tend to invade surrounding tissue and metastasize to new body sites and the pathological conditions characterized by such growths.
- One of the reasons cancer is difficult to treat is that current testing methods may not help doctors match specific cancers with effective drug treatments. And it is a moving target: cancer cells are constantly changing and mutating. Cancers can accumulate genetic variants through, e.g., somatic cell mutation. Such variants include, for example, sequence variants and copy number variants. Analysis of tumors has indicated that different cells in a tumor can bear different genetic variants. Such differentiation between tumor cells has been referred to as tumor heterogeneity.
- Cancers can evolve over time, becoming resistant to a therapeutic intervention. Certain variants are known to correlate with responsiveness or resistance to specific therapeutic interventions. More effective treatments for cancers exhibiting tumor heterogeneity would be beneficial. Such cancers may be treated with a second, different, therapeutic intervention to which the cancer responds.
- DNA sequencing methods allow detection of genetic variants in DNA from tumor cells. Cancer tumors continually shed their unique genomic material into the bloodstream. Unfortunately, these telltale genomic “signals” are so weak that current genomic analysis technologies, including next-generation sequencing, may only detect such signals sporadically or in patients with terminally high tumor burden. The main reason for this is that such technologies are plagued by error rates and bias that can be orders of magnitude higher than what is required to reliably detect de novo genomic alterations associated with cancer.
- In a parallel trend, to understand the clinical significance of a genetic test, treating professionals must have a working knowledge of basic principles of genetic inheritance and reasonable facility with the interpretation of probabilistic data. Some studies suggest that many treating professionals are not adequately prepared to interpret genetic tests for disease susceptibility. Some physicians have difficulty interpreting probabilistic data related to the clinical utility of diagnostic tests, such as the positive or negative predictive value of a laboratory test.
- The error rates and bias in detecting de novo genomic alterations associated with cancer, along with inadequate explanation of the implications of the genetic tests for cancer, have lowered the quality of care for cancer patients. Professional societies, such as the College of American Pathologists (CAP) and the American College of Medical Genetics (ACMG), have published standards or guidelines for laboratories that provide genetic testing, which require that reports containing genetic information include interpretive content that is understandable by generalist physicians.
- In an aspect provided herein is a method comprising: (a) sequencing polynucleotides from cancer cells from a biological sample of a subject; (b) identifying and quantifying somatic mutations in the polynucleotides; (c) developing a profile of tumor heterogeneity in the subject indicating the presence and relative quantity of a plurality of the somatic mutations in the polynucleotides, wherein different relative quantities indicate tumor heterogeneity; and (d) determining a therapeutic intervention for a cancer exhibiting the tumor heterogeneity, wherein the therapeutic intervention is effective against a cancer having the profile of tumor heterogeneity determined. In some embodiments, the cancer cells are spatially distinct. In some embodiments, the therapeutic intervention is more effective against a cancer presenting with the plurality of somatic mutations than it is against a cancer presenting with any one, but not all, of the somatic mutations. In some embodiments, the method further comprises: (e) monitoring changes in tumor heterogeneity in the subject over time and determining different therapeutic interventions over time based on the changes. In some embodiments, the method further comprises: (e) displaying the therapeutic intervention. In some embodiments, the method further comprises: (e) implementing the therapeutic intervention. In some embodiments, the method further comprises: (e) generating a phylogeny of tumor evolution based on the tumor profile; wherein determining the therapeutic intervention takes into account the phylogeny.
- In some embodiments, determining is performed with the aid of a computer-executed algorithm. In some embodiments, sequence reads generated by sequencing are subject to noise reduction before identifying and quantifying. In some embodiments, noise reduction comprises molecular tracking of sequences generated from a single polynucleotide in the sample.
- In some embodiments, determining a therapeutic intervention takes into account the relative frequencies of the tumor-related genetic alterations. In some embodiments, the therapeutic intervention comprises administering, in combination or in series, a plurality of drugs, wherein each drug is relatively more effective against a cancer presenting with a different one of somatic mutations that occur at different relative frequency. In some embodiments, a drug that is relatively more effective against a cancer presenting with a somatic mutation occurring at higher relative frequency is administered in higher amount. In some embodiments, the drugs are delivered at doses that are stratified to reflect the relative amounts of the variants in the DNA. In some embodiments, cancers presenting with at least one of the genetic variants are resistant to at least one of the drugs. In some embodiments, determining a therapeutic intervention takes into account the tissue of origin of the cancer. In some embodiments, the therapeutic intervention is determined based on a database of interventions shown to be therapeutic for cancers having tumor heterogeneity characterized by each of the somatic mutations.
- In some embodiments, the polynucleotides comprise cfDNA from a blood sample. In some embodiments, the polynucleotides comprise polynucleotides from spatially distinct cancer cells. In some embodiments, the polynucleotides comprise polynucleotides from different metastatic tumor sites. In some embodiments, the polynucleotides comprise polynucleotides from a solid tumor or a diffuse tumor. In some embodiments, the polynucleotides are comprised in a blood sample or in a solid tumor biopsy.
- In some embodiments, identifying comprises generating a plurality of sequence reads for parent polynucleotides from the sample, and collapsing the sequence reads to generate consensus calls for bases in each parent polynucleotide. In some embodiments, quantifying comprises determining frequency at which the somatic mutations are detected in the population of polynucleotides from the biological sample. In some embodiments, the biological sample comprises biological molecules from non-disease cells. In some embodiments, the biological sample comprises biological molecules from a plurality of different tissues. In some embodiments, the biomolecules are comprised in one biological sample. In some embodiments, the biomolecules are comprised in a plurality of biological samples. In some embodiments, the plurality of biological samples are tumors from a plurality of metastases.
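The collapsing and quantifying steps described above can be sketched as follows. This is a minimal sketch under stated assumptions: reads are already grouped by a hypothetical parent-molecule tag, reads within a family are aligned and of equal length, and a simple per-position majority vote stands in for whatever consensus rule is actually used. The names `collapse_reads` and `variant_frequency` are illustrative.

```python
from collections import Counter, defaultdict


def collapse_reads(reads):
    """Collapse (molecule_tag, sequence) reads into one consensus sequence per tag.

    Assumes reads in a family are aligned and equally long. Ties are broken
    by first occurrence, which is an arbitrary choice made for this sketch.
    """
    families = defaultdict(list)
    for tag, seq in reads:
        families[tag].append(seq)
    consensus = {}
    for tag, seqs in families.items():
        # Majority vote over each aligned column of the read family.
        consensus[tag] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


def variant_frequency(consensus, position, reference_base):
    """Fraction of consensus molecules whose base at `position` differs from the reference."""
    calls = [seq[position] for seq in consensus.values()]
    return sum(base != reference_base for base in calls) / len(calls)


# Hypothetical reads: molecule m1 has one read with a sequencing error (G read as T).
reads = [("m1", "ACGT"), ("m1", "ACGT"), ("m1", "ACTT"),
         ("m2", "ACTT"), ("m2", "ACTT")]
molecules = collapse_reads(reads)
print(molecules)
print(variant_frequency(molecules, position=2, reference_base="G"))
```

Counting consensus molecules rather than raw reads means a single erroneous read in a family does not inflate the reported frequency.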
- In some embodiments, sequencing comprises sequencing all or part of a subset of genes in the subject's genome. In some embodiments, the somatic mutations are selected from single nucleotide variations (SNVs), insertions, deletions, inversions, transversions, translocations, copy number variations (CNVs) (e.g., aneuploidy, partial aneuploidy, polyploidy), chromosomal instability, chromosomal structure alterations, gene fusions, chromosome fusions, gene truncations, gene amplification, gene duplications, chromosomal lesions, DNA lesions, abnormal changes in nucleic acid chemical modifications, abnormal changes in epigenetic patterns and abnormal changes in nucleic acid methylation. In some embodiments, genetic loci are selected from single nucleotides, genes and chromosomes.
- In some embodiments, the cancer is selected from carcinomas, sarcomas, leukemias, lymphomas, myelomas and central nervous system cancers (e.g., breast cancer, prostate cancer, colorectal cancer, brain cancer, esophageal cancer, head and neck cancer, bladder cancer, gynecological cancer, liposarcoma, and multiple myeloma). In some embodiments, cancer cells of the tumor are derived from a common parent disease cell. In some embodiments, cancer cells of the tumor are derived from different parent cancer cells of the same or different cancer type. In some embodiments, the method further comprises determining a measure of the somatic mutations to one or more control references to determine the relative quantity.
- In some embodiments, the polynucleotides are sourced from both circulating cancer polynucleotides and from solid tumor biopsy. In some embodiments, profiles are separately developed for polynucleotides sourced from the circulating cancer polynucleotides and from the solid tumor biopsy.
- In an aspect provided herein is a method comprising providing a therapeutic intervention for a subject having a cancer having a tumor profile from which tumor heterogeneity can be inferred, wherein the therapeutic intervention is effective against cancers with the tumor profile. In some embodiments, the tumor profile indicates relative frequency of a plurality of somatic mutations. In some embodiments, the method further comprises monitoring changes in the relative frequencies in the subject over time and determining different therapeutic interventions over time based on the changes. In some embodiments, the therapeutic intervention is more effective against a cancer presenting with each of the somatic mutations than it is against a cancer presenting with any one, but not all, of the somatic mutations. In some embodiments, the therapeutic intervention comprises administering, in combination or in series, a plurality of drugs, wherein each drug is relatively more effective against a cancer presenting with a different one of somatic mutations that occur at different relative frequency. In some embodiments, a drug that is relatively more effective against a cancer presenting with a somatic mutation occurring at higher relative frequency is administered in higher amount. In some embodiments, the drugs are delivered at doses that are stratified to reflect the relative amounts of the variants in the DNA. In some embodiments, cancers presenting with at least one of the genetic variants are resistant to at least one of the drugs. In some embodiments, the cancer is selected from carcinomas, sarcomas, leukemias, lymphomas, myelomas and central nervous system cancers (e.g., breast cancer, prostate cancer, colorectal cancer, brain cancer, esophageal cancer, head and neck cancer, bladder cancer, gynecological cancer, liposarcoma, and multiple myeloma).
- In an aspect provided herein is a method comprising administering to a subject a therapeutic intervention that is effective against a tumor exhibiting tumor heterogeneity, wherein the therapeutic intervention is based on a profile of tumor heterogeneity in the subject indicating the presence and relative quantity of a plurality of the somatic mutations in the polynucleotides, wherein different relative quantities indicate tumor heterogeneity.
- In an aspect provided herein is a system comprising a computer readable medium comprising machine-executable code that, upon execution by a computer processor, implements a method comprising: (a) receiving into memory sequence reads of polynucleotides mapping to a genetic locus; (b) determining, among said sequence reads, the identity of bases that differ from the base of a reference sequence at the locus, out of the total number of sequence reads mapping to the locus; (c) reporting the identity and relative quantity of the determined bases and their location in the genome; and (d) inferring heterogeneity of a given sample based on information in (c). In some embodiments, the method implemented further comprises receiving into memory sequence reads derived from samples at a plurality of different times and calculating a difference in relative amount and identity of a plurality of bases between the two samples.
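Steps (b) and (c), and the comparison between samples taken at different times, could look roughly like the sketch below. It assumes the reads have already been mapped and reduced to one observed base per read at the locus; the function names and the example pileups are hypothetical.

```python
from collections import Counter


def non_reference_fractions(read_bases, reference_base):
    """Relative quantity of each non-reference base among all reads at one locus."""
    total = len(read_bases)
    if total == 0:
        return {}
    counts = Counter(b for b in read_bases if b != reference_base)
    return {base: n / total for base, n in counts.items()}


def fraction_changes(earlier, later):
    """Difference in relative amount of each non-reference base between two samples."""
    bases = set(earlier) | set(later)
    return {b: later.get(b, 0.0) - earlier.get(b, 0.0) for b in sorted(bases)}


# Hypothetical pileups at a single locus whose reference base is "A".
time_1 = non_reference_fractions("AAAAAAATTG", "A")
time_2 = non_reference_fractions("AAAAATTTTT", "A")
print(time_1)
print(time_2)
print(fraction_changes(time_1, time_2))
```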
- In an aspect provided herein is a kit comprising a first pharmaceutical drug and a second pharmaceutical drug, wherein a combination of the first drug and the second drug is more therapeutically effective against a cancer presenting with a first and a second somatic mutation than it is against a cancer presenting with any one, but not all, of the somatic mutations. In some embodiments, the combination is contained in a mixture or each drug is contained in a separate container.
- In an aspect provided herein is a method comprising: (a) performing biomolecular analysis of biomolecular polymers from disease cells (e.g., spatially distinct disease cells) from a subject; (b) identifying and quantifying biomolecular variants in the biomolecular macromolecules; (c) developing a profile of disease cell heterogeneity in the subject indicating the presence and relative quantity of a plurality of the variants in the biomolecular macromolecules, wherein different relative quantities indicate disease cell heterogeneity; and (d) determining a therapeutic intervention for a disease exhibiting the disease cell heterogeneity, wherein the therapeutic intervention is effective against a disease having the profile of disease cell heterogeneity determined. In some embodiments, the disease cells are spatially distinct disease cells. In some embodiments, the therapeutic intervention is determined based on a database of interventions shown to be therapeutic for cancers having tumor heterogeneity characterized by each of the somatic mutations.
- In an aspect provided herein is a method of detecting disease cell heterogeneity in a subject comprising: a) quantifying polynucleotides that bear a sequence variant at each of a plurality of genetic loci in polynucleotides from a sample from the subject, wherein the sample comprises polynucleotides from somatic cells and from disease cells; b) determining for each locus a measure of copy number variation (CNV) for polynucleotides bearing the sequence variant; c) determining for each locus a weighted measure of quantity of polynucleotides bearing a sequence variant at the locus as a function of CNV at the locus; and d) comparing the weighted measures at each of the plurality of loci, wherein different weighted measures indicate disease cell heterogeneity. In some embodiments, the disease cells are tumor cells. In some embodiments, polynucleotides comprise cfDNA.
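Steps c) and d) can be illustrated with a short sketch. The text does not specify the weighting function, so dividing the variant fraction by the copy-number ratio relative to a diploid baseline is purely an assumption made here for illustration, as are the tolerance, the function names, and the example loci.

```python
def weighted_variant_measures(loci, baseline_copies=2.0):
    """Weight each locus's variant fraction by its copy number.

    loci maps a locus name to (variant_fraction, copy_number). Dividing by
    copy_number / baseline_copies is an assumed weighting, not one given
    in the text.
    """
    return {
        name: fraction / (copies / baseline_copies)
        for name, (fraction, copies) in loci.items()
    }


def indicates_heterogeneity(weighted, tolerance=0.05):
    """True when weighted measures differ by more than a (hypothetical) tolerance."""
    values = list(weighted.values())
    return max(values) - min(values) > tolerance


# Hypothetical loci: A is amplified to 4 copies, B and C are at 2 copies.
loci = {"locus_A": (0.20, 4.0), "locus_B": (0.10, 2.0), "locus_C": (0.02, 2.0)}
weighted = weighted_variant_measures(loci)
print(weighted)
print(indicates_heterogeneity(weighted))
```

In this example loci A and B give the same weighted measure once amplification is accounted for, while locus C remains lower, which is the kind of difference step d) treats as a sign of heterogeneity.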
- In an aspect provided herein is a method comprising: a) subjecting a subject to one or more pulsed therapy cycles, each pulsed therapy cycle comprising: (i) a first period during which one or more drugs is administered at a first amount and (ii) a second period during which the one or more drugs is administered at a second, reduced (e.g., completely not administered) amount; wherein: (A) the first period is characterized by a tumor burden detected above a first clinical level; and (B) the second period is characterized by a tumor burden detected below a second clinical level. In some embodiments, tumor burden is measured as a function of a quantity of a selected somatic variant in tumor polynucleotides. In some embodiments, one or more drugs is a plurality of drugs and each amount of each drug in each cycle is determined as a function of tumor burden measured as a function of a quantity of each of a plurality of different selected somatic variants in tumor polynucleotides. In some embodiments, the method comprises subjecting the subject to a plurality of pulsed therapy cycles. In some embodiments, the method further comprises: b) when the subject exhibits resistance to the one or more drugs, subjecting the subject to one or more pulsed therapy cycles, each pulsed therapy cycle comprising: (i) a first period during which a different one or more drugs is administered at a first amount and (ii) a second period during which the different one or more drugs is administered at a second, reduced (e.g., completely not administered) amount; wherein: (A) the first period is characterized by a tumor burden detected above a first clinical level; and (B) the second period is characterized by a tumor burden detected below a second clinical level.
- In an aspect provided herein is a method comprising: (a) sequencing polynucleotides from cancer cells from a subject; (b) identifying and quantifying somatic mutations in the polynucleotides; and (c) developing a profile of tumor heterogeneity in the subject for use in determining a therapeutic intervention effective for a cancer exhibiting tumor heterogeneity, wherein the profile indicates the presence and relative quantity of a plurality of the somatic mutations in the polynucleotides, wherein different relative quantities indicate tumor heterogeneity.
- In an aspect provided herein is a method comprising providing a therapeutic intervention for a subject wherein the therapeutic intervention is determined from a profile of disease cell heterogeneity in the subject, wherein the profile indicates the presence and relative quantity of a plurality of the somatic mutations in the polynucleotides, wherein different relative quantities indicate disease cell heterogeneity; and wherein the therapeutic intervention is effective against a disease having the profile of disease cell heterogeneity determined, e.g., more effective against a disease presenting with the plurality of somatic mutations than it is against a disease presenting with any one, but not all, of the somatic mutations.
- In an aspect provided herein is a method comprising: a) determining a measure of deviation from a value of central tendency (e.g., standard deviation, variance) of copy number in polynucleotides in a sample across a region of at least 1 kb, at least 10 kb, at least 100 kb, at least 1 Mb, at least 10 Mb or at least 100 Mb of a genome; b) inferring a measure of burden of DNA from cells undergoing cell division in the sample based on the measure of deviation. In some embodiments, the value of central tendency is mean, median or mode. In some embodiments, determining comprises partitioning the region into a plurality of non-overlapping intervals, determining a measure of copy number at each interval and determining the measure of deviation based on measures of copy number at each interval. In some embodiments, the interval is no more than any of 1 base, 10 bases, 100 bases, 1 kb or 10 kb.
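The partition-and-measure step can be sketched as below. The sketch assumes that the number of molecules starting in each interval serves as the per-interval measure of copy number, and it uses the population standard deviation about the mean as the measure of deviation; the function name, coordinates, and interval size are hypothetical, and the mapping from deviation to burden is left out because the text does not specify it.

```python
import statistics


def copy_number_deviation(positions, region_start, region_end, interval):
    """Count molecules per non-overlapping interval and return (counts, std dev).

    positions are molecule start coordinates; using raw counts as the
    per-interval copy number measure is an assumption of this sketch.
    """
    n_bins = (region_end - region_start + interval - 1) // interval
    counts = [0] * n_bins
    for pos in positions:
        if region_start <= pos < region_end:
            counts[(pos - region_start) // interval] += 1
    # Population standard deviation of the counts about their mean.
    return counts, statistics.pstdev(counts)


# Hypothetical molecule start positions in a 400-base region, 100-base intervals.
positions = [5, 15, 25, 105, 205, 215, 305]
counts, deviation = copy_number_deviation(positions, 0, 400, 100)
print(counts, round(deviation, 3))
```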
- In an aspect provided herein is a method of inferring a measure of burden of DNA from cells undergoing cell division in a sample comprising measuring copy number variation induced by proximity of one or more genomic loci to cells' origins of replication, wherein increased CNV indicates cells undergoing cell division. In some embodiments, the burden is measured in cell-free DNA. In some embodiments, the measure of burden relates to the fraction of tumor cells or genome-equivalents of DNA from tumor cells in the sample. In some embodiments, CNV due to proximity to origins of replication is inferred from a set of control samples or cell-lines. In some embodiments, a hidden Markov model, regression model, principal component analysis-based model, or genotype-modified model is used to approximate variations due to origins of replication. In some embodiments, the measure of burden is presence or absence of cells undergoing cell division. In some embodiments, proximity is within 1 kb of an origin of replication.
- In an aspect provided herein is a method of increasing sensitivity and/or specificity of determining gene-related copy-number variations by ameliorating the effect of variations due to proximity to origins of replication. In some embodiments, the method comprises measuring CNV at a locus, determining amount of CNV due to proximity of the locus to an origin of replication, and correcting the measured CNV to reflect genomic CNV, e.g., by subtracting amount of CNV attributable to cell division. In some embodiments, the genomic data is obtained from cell-free DNA. In some embodiments, the measure of burden relates to the fraction of tumor cells or genome-equivalents of DNA in a sample. In some embodiments, variations due to origins of replication are inferred from a set of control samples or cell-lines. In some embodiments, a hidden Markov model, regression model, principal component analysis-based model, or genotype-modified model is used to approximate variations due to origins of replication.
- In an aspect provided herein is a method comprising: a) determining a baseline measure of copies of DNA molecules at one or more loci from one or more control samples, wherein one or more of the loci includes an origin of replication, each containing DNA from cells undergoing a predetermined level of cell division; b) determining a test measure of DNA molecules in a test sample; wherein the measure in test sample is from one or more loci partitioned into one or more partitions and wherein one or more of the loci includes an origin of replication; c) comparing the test measure and the baseline measure, wherein a test measure above a baseline measure indicates DNA in the test sample from cells dividing at a rate faster than cells providing DNA to the control sample. In some embodiments, the measure is selected from molecule count, a measure of central tendency of molecule count across partitions or a measure of variation of molecule count across partitions.
- In an aspect provided herein is a method comprising: (a) administering to a subject an intervention that increases an amount of tumor-derived DNA in the subject's circulation; and (b) when said amount is increased, collecting from the subject a sample containing tumor-derived DNA. In some embodiments, the intervention preferentially kills tumor cells. In some embodiments, the intervention comprises exposing the subject or suspected diseased areas of the subject to radiation. In some embodiments, the intervention comprises exposing the subject or suspected diseased areas of the subject to ultrasound. In some embodiments, the intervention comprises exposing the subject or suspected diseased areas of the subject to physical agitation. In some embodiments, the intervention comprises administering to the subject a low dose of chemotherapy. In some embodiments, the method comprises administering the intervention to the subject within 1 week before collecting the sample. In some embodiments, the sample is selected from blood, plasma, serum, urine, saliva, cerebrospinal fluid, vaginal secretion, mucous and semen.
- In an aspect provided herein is a method comprising compiling a database, wherein the database includes, for each of a plurality of subjects having cancer, tumor genomic testing data, including somatic alterations, collected at two or more time intervals per subject, one or more therapeutic interventions administered to each of the subjects at one or more times and efficacy of the therapeutic interventions, wherein the database is useful to infer efficacy of the therapeutic interventions in subjects with a tumor genomic profile. In some embodiments, the plurality is at least 50, at least 500 or at least 5000. In some embodiments, the tumor genomic testing data is collected via serial biopsy, cell-free DNA, cell-free RNA or circulating tumor cells. In some embodiments, relative frequencies of detected genetic variants are used to classify treatment efficacy. In some embodiments, additional information is used to help classify treatment efficacy, including but not limited to, weight, adverse treatment effects, histological testing, blood testing, radiographic information, prior treatments, and cancer type. In some embodiments, treatment response per patient is collected and classified quantitatively through additional testing. In some embodiments, the additional testing is blood or urine based testing.
- In an aspect provided herein is a method comprising use of a database to identify one or more effective therapeutic interventions for a subject having cancer, wherein the database includes, for each of a plurality of subjects having cancer, tumor genomic testing data, including somatic alterations, collected at two or more time intervals per subject, one or more therapeutic interventions administered to each of the subjects at one or more times and efficacy of the therapeutic interventions. In some embodiments, identified therapeutic interventions are stratified by efficacy. In some embodiments, quantitative bounds on predicted therapeutic interventions efficacy or lack thereof are reported. In some embodiments, the therapeutic interventions use information of predicted tumor genomic evolution or acquired resistance mechanisms in similar patients in response to treatment.
- In some embodiments, the method comprises classifying effectiveness of treatment using a classification algorithm, e.g., linear regression processes (e.g., multiple linear regression (MLR), partial least squares (PLS) regression and principal components regression (PCR)), binary decision trees (e.g., recursive partitioning processes such as CART, i.e., classification and regression trees), artificial neural networks such as back propagation networks, discriminant analyses (e.g., Bayesian classifier or Fisher analysis), logistic classifiers, and support vector classifiers (e.g., support vector machines).
- In an aspect disclosed herein is a method to report results of one or more genetic tests comprising: capturing genetic information including genetic variants and quantitative measures thereof over one or more test points using a genetic analyzer; normalizing the quantitative measures for rendering with the one or more test points and generating a scaling factor; applying the scaling factor to render a tumor response map; and generating a summary of genetic variants. In some embodiments, the method comprises analyzing non-CNV (copy number variation) mutant allele frequencies. In some embodiments, the method comprises transforming an absolute value into a relative metric for rendering the tumor response map. In some embodiments, the method comprises multiplying a mutant allele frequency by a predetermined value and taking a log thereof. In some embodiments, the method comprises: multiplying the scaling factor by a transformed value for each gene to determine a quantity indicator to be rendered on the tumor response map; and assigning a unique visual indicator for each alteration in a visual panel. In some embodiments, the method comprises Y-centering or vertically centering the quantity indicator in a contiguously placed panel that indicates continuity. In some embodiments, the assigning further comprises providing a unique color for each alteration.
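The transform and scaling steps can be illustrated with a small sketch. Several choices here are assumptions not stated in the text: mutant allele frequencies are given in percent, the predetermined multiplier is 100, the logarithm is base 10, and the scaling factor maps the largest transformed value to a hypothetical panel height of 60 units. The alteration names are illustrative only.

```python
import math


def quantity_indicators(maf_percent, multiplier=100.0, panel_height=60.0):
    """Map each alteration's mutant allele frequency to a rendered height.

    The transform is log10(MAF * multiplier); the scaling factor is chosen
    so the largest transformed value fills panel_height. Both parameter
    values are assumptions. Requires MAF * multiplier > 1 for at least one
    alteration so that the scaling factor is positive.
    """
    transformed = {
        alteration: math.log10(maf * multiplier)
        for alteration, maf in maf_percent.items()
    }
    scale = panel_height / max(transformed.values())
    return {alteration: scale * value for alteration, value in transformed.items()}


# Hypothetical test point with three alterations (MAF in percent).
test_point = {"EGFR T790M": 10.0, "TP53 R273H": 1.0, "KRAS G12D": 0.1}
for alteration, height in quantity_indicators(test_point).items():
    print(f"{alteration}: {height:.1f}")
```

The logarithm compresses the range so that a low-frequency alteration remains visible next to one that is a hundred times more abundant.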
- In some embodiments, the method comprises analyzing genetic information from another test point or test time. In some embodiments, wherein a new test result does not differ from a prior test result, the method comprises rendering the prior visual panel. In some embodiments, wherein alterations remain the same but quantities have changed, the method comprises: maintaining the order and unique visual indicator for each alteration; and determining a new quantity indicator and generating a new visual panel for all test points. In some embodiments, the method comprises determining a new alteration in the genetic information and adding the alteration to the top of existing alterations. In some embodiments, the method comprises determining a new alteration in the genetic information and determining new transform values and scaling factor and assigning a unique visual indicator for each new alteration. In some embodiments, the method comprises determining a new alteration in the genetic information and re-generating the tumor response map including alterations from a prior test point that are still detected in current test point and the new alteration. In some embodiments, the method comprises determining if a prior alteration is no longer present and, if so, using a height of zero when rendering the quantity of the prior alteration for subsequent test points. In some embodiments, the method comprises determining if a prior alteration is no longer present and, if so, reserving the unique visual indicator associated with the prior alteration from future use.
- In some embodiments, the method comprises analyzing CNV mutant allele frequencies and methylation mutant allele frequencies. In some embodiments, the method comprises grouping of maximum mutant allele frequencies for rendering first on the tumor response map. In some embodiments, the method comprises rendering alterations for the gene in decreasing mutant allele frequency order of alterations. In some embodiments, the method comprises rendering alterations for the gene in a decreasing order. In some embodiments, the method comprises selecting a next gene with next highest mutant allele frequency.
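The rendering order described above (gene with the highest mutant allele frequency first, that gene's alterations in decreasing frequency, then the gene with the next highest frequency) can be sketched as a two-level sort. The function name and the example alterations and frequencies are hypothetical.

```python
def render_order(alterations):
    """Order (gene, alteration, maf) records for rendering.

    Genes are ranked by their maximum MAF, highest first; within a gene,
    alterations are listed in decreasing MAF order.
    """
    by_gene = {}
    for gene, alteration, maf in alterations:
        by_gene.setdefault(gene, []).append((alteration, maf))
    ranked_genes = sorted(
        by_gene,
        key=lambda gene: max(maf for _, maf in by_gene[gene]),
        reverse=True,
    )
    return [
        (gene, alteration, maf)
        for gene in ranked_genes
        for alteration, maf in sorted(by_gene[gene], key=lambda item: item[1], reverse=True)
    ]


# Hypothetical alterations with MAF in percent.
detected = [
    ("TP53", "R273H", 2.0),
    ("EGFR", "L858R", 5.0),
    ("EGFR", "T790M", 0.5),
    ("KRAS", "G12D", 1.0),
]
for row in render_order(detected):
    print(row)
```

Note that EGFR T790M is rendered before TP53 R273H despite its lower frequency, because alterations are grouped under the gene carrying the maximum frequency.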
- In some embodiments, for each reported alteration, the method comprises generating a trend indicator for the alteration over the different test points. In some embodiments, the method comprises generating a summary of alterations. In some embodiments, the method comprises generating a summary of treatment options. In some embodiments, the method comprises generating a summary of mutant allele frequency, cell free amplification, clinical approval indication, and clinical trial. In some embodiments, the method comprises generating a panel based on a biological pathway. In some embodiments, the method comprises generating a panel based on an evidence level. In some embodiments, the genetic information includes one or more of single-nucleotide variations, copy number variations, insertions and deletions, and gene rearrangements. In some embodiments, the method comprises generating a clinical relevance report on detected alterations. In some embodiments, the method comprises generating a therapy result summary.
- In an aspect provided herein is a method to generate a genetic report comprising: generating non-copy number variation (CNV) data using a genetic analyzer; determining a scaling factor for each non-CNV mutant allele frequency; for a first test, generating a visual panel for each non-CNV alteration using the scaling factor; and for each subsequent test, generating changes in the non-CNV alteration for the visual panel using the scaling factor.
- In some embodiments, the method comprises transforming an absolute value into a relative metric for rendering. In some embodiments, the method comprises multiplying a mutant allele frequency by a predetermined value and taking a log of the predetermined value. In some embodiments, the method comprises determining a scaling factor using a maximum observed value. In some embodiments, for each non-CNV alteration, the method comprises multiplying a scaling factor by a transformed value for each gene variant as a quantity indicator for visualizing the gene variant.
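- The transform and scaling steps above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the constants `PREDETERMINED_VALUE` and `PANEL_HEIGHT`, the use of `log10(1 + x)` applied to the product, and the alteration labels are all assumptions made for the example.

```python
import math

# Hypothetical constants; neither value comes from the disclosure.
PREDETERMINED_VALUE = 100.0  # multiplier applied to a frequency before the log
PANEL_HEIGHT = 200.0         # drawing height given to the largest observed value


def transform(maf: float) -> float:
    """Turn an absolute mutant allele frequency (a fraction) into a relative metric."""
    # Assumption: the log is taken of the product, and 1 is added so the
    # result stays non-negative for very small frequencies.
    return math.log10(1.0 + maf * PREDETERMINED_VALUE)


def quantity_indicators(mafs: dict[str, float]) -> dict[str, float]:
    """Scale each transformed value so the maximum observed value fills the panel."""
    transformed = {name: transform(value) for name, value in mafs.items()}
    max_value = max(transformed.values())
    scaling_factor = PANEL_HEIGHT / max_value if max_value > 0 else 0.0
    return {name: scaling_factor * t for name, t in transformed.items()}


# Illustrative labels and frequencies only.
heights = quantity_indicators({"variant A": 0.12, "variant B": 0.03, "variant C": 0.004})
```

- Because of the log transform, a low-frequency alteration still receives a visible band instead of being dwarfed by the dominant one.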
- In some embodiments, the method comprises assigning a unique visual indicator for each alteration. In some embodiments, for the subsequent test, the method comprises using the visual panel if the test result is unchanged. In some embodiments, if alterations remain the same in the subsequent test, the method comprises maintaining the order and unique visual indicator for each alteration; and recomputing a quantity indicator for visualizing that variant and re-rendering updated values in existing panel(s) and a new panel for the latest test. In some embodiments, if a new alteration is found in the subsequent test, the method comprises adding the alteration to the top of all existing alterations; computing transform values and the scaling factor; and assigning a unique visual indicator for each new alteration.
- In some embodiments, the method comprises: re-rendering alterations in the prior test point and the new alteration; and vertically centering an image of the alterations in a contiguously placed panel that indicates continuity. In some embodiments, if a prior alteration is not present in a subsequent test, the method comprises using a height of zero as the quantity of the alteration for a subsequent rendering. In some embodiments, the method comprises rendering subject or intervention information associated with alteration changes. In some embodiments, the method comprises identifying an alteration with the maximum Mutant Allele Frequency.
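- The bookkeeping described above (stable order, unique indicators, new alterations on top, zero height for alterations no longer detected) can be sketched as follows. The class name `ResponseMap`, the integer indicators, and the sample labels are hypothetical and chosen only for illustration.

```python
from itertools import count


class ResponseMap:
    """Tracks drawing order and visual indicators across test points."""

    def __init__(self):
        self.order = []       # bottom-to-top drawing order of alterations
        self.indicator = {}   # alteration -> unique visual indicator (an int here)
        self.history = []     # detected quantities, one dict per test point
        self._next_id = count()

    def add_test_point(self, result):
        detected = {name: q for name, q in result.items() if q > 0}
        new = sorted((n for n in detected if n not in self.indicator),
                     key=lambda n: detected[n], reverse=True)
        for name in new:
            self.order.append(name)                      # new alterations go on top
            self.indicator[name] = next(self._next_id)   # indicators are never reused
        self.history.append(detected)

    def heights(self):
        # An alteration absent from a test point is rendered with height zero.
        return [[tp.get(name, 0.0) for name in self.order] for tp in self.history]


rm = ResponseMap()
rm.add_test_point({"A": 5.0, "B": 2.0})
rm.add_test_point({"A": 1.0, "C": 3.0})  # B no longer detected, C is new
```

- After the second test point, `rm.heights()` holds one row per test point in the shared order, so earlier panels can be re-rendered alongside the latest one.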
- In some embodiments, the method comprises: reporting alterations for that gene in decreasing mutant allele frequency order of non-CNV alterations; and reporting CNV alterations for that gene in decreasing order of CNV value. In some embodiments, the method comprises selecting the next gene with next highest non-CNV mutant allele frequency and reporting alterations for that gene in decreasing mutant allele frequency order of non-CNV alterations; and reporting CNV alterations for that gene in decreasing order of CNV value.
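- One way to sketch this reporting order is shown below. The record layout (`gene`, `name`, `kind`, `value`) and the sample records are assumptions for illustration; placing genes that have only CNV alterations last is also an assumption, since the text does not address that case.

```python
from collections import defaultdict


def report_order(alterations):
    """Order alterations gene by gene, highest non-CNV frequency first."""
    by_gene = defaultdict(list)
    for alteration in alterations:
        by_gene[alteration["gene"]].append(alteration)

    def max_maf(gene):
        # Highest non-CNV mutant allele frequency seen for the gene.
        return max((a["value"] for a in by_gene[gene] if a["kind"] != "CNV"),
                   default=0.0)

    ordered = []
    for gene in sorted(by_gene, key=max_maf, reverse=True):
        non_cnv = sorted((a for a in by_gene[gene] if a["kind"] != "CNV"),
                         key=lambda a: a["value"], reverse=True)
        cnv = sorted((a for a in by_gene[gene] if a["kind"] == "CNV"),
                     key=lambda a: a["value"], reverse=True)
        ordered.extend(non_cnv + cnv)  # non-CNV first, then CNV, per gene
    return ordered


# Illustrative records only.
alts = [
    {"gene": "G1", "name": "G1 v1", "kind": "non-CNV", "value": 0.02},
    {"gene": "G2", "name": "G2 v1", "kind": "non-CNV", "value": 0.10},
    {"gene": "G2", "name": "G2 amp", "kind": "CNV", "value": 4.0},
    {"gene": "G2", "name": "G2 v2", "kind": "non-CNV", "value": 0.05},
]
ordered_names = [a["name"] for a in report_order(alts)]
```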
- In some embodiments, the method comprises rendering a trend indicator for an alteration over different test dates. In some embodiments, the method comprises grouping of maximum mutant allele frequencies and generating annotations including biological pathways or evidence level. In some embodiments, the method comprises generating a panel based on an evidence level. In some embodiments, the method comprises generating a panel based on a biological pathway. In some embodiments, the genetic information includes one or more of single-nucleotide variations, copy number variations, insertions and deletions, and gene rearrangements.
- In an aspect provided herein is a method comprising: a) providing a plurality of nucleic acid samples from a subject, the samples collected at serial time points; b) sequencing polynucleotides from the samples to generate sequences; c) determining a quantitative measure of each of a plurality of genetic variants among the polynucleotides in each sample; d) graphically representing by computer relative quantities of genetic variants at each serial time point for those somatic mutations present at a non-zero quantity at at least one of the serial time points. In some embodiments, the quantitative measure is the frequency of the genetic variant among all sequences mapping to the same genetic locus. In some embodiments, the relative quantities are represented as a stacked area graph. In some embodiments, the relative quantities are stacked, at the earliest time point, highest to lowest from the bottom to the top of the graph, and wherein a genetic variant first appearing at a non-zero quantity at a later time point is stacked at the top of the graph. In some embodiments, the areas are represented by different colors. In some embodiments, the graphical representation further indicates, for each time point, the quantitative measure of the predominant genetic variant. In some embodiments, the graphical representation further includes a key identifying genetic variants represented on the graph. In some embodiments, graphically representing comprises normalizing and scaling the quantitative measures.
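- The stacking rule described above can be sketched as follows. The function names and the sample series are hypothetical. Normalizing each time point to a total of 1 is an assumption about what "relative quantities" means here, and the sketch assumes every variant has one value per time point.

```python
def stacking_order(series):
    """Bottom-to-top order: earliest appearance first, larger quantity lower."""
    def first_seen(name):
        return next(i for i, v in enumerate(series[name]) if v > 0)

    present = [n for n in series if any(v > 0 for v in series[n])]
    return sorted(present, key=lambda n: (first_seen(n), -series[n][first_seen(n)]))


def stacked_bounds(series):
    """Return the order and, per variant, (bottom, top) bounds at each time point."""
    order = stacking_order(series)
    n_points = len(next(iter(series.values())))
    bounds = {name: [] for name in order}
    for t in range(n_points):
        total = sum(series[name][t] for name in order)
        base = 0.0
        for name in order:
            share = series[name][t] / total if total > 0 else 0.0
            bounds[name].append((base, base + share))
            base += share
    return order, bounds


# Illustrative quantities at three serial time points.
series = {"v1": [2, 1, 0], "v2": [6, 3, 1], "v3": [0, 4, 3]}
order, bounds = stacked_bounds(series)
```

- Here `v3` first appears at the second time point, so it is stacked above `v2` and `v1`, whose order was fixed at the earliest time point.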
- In some embodiments, the polynucleotides comprise cfDNA. In some embodiments, the loci are located in oncogenes. In some embodiments, the plurality of the genetic variants maps to a different gene in the genome. In some embodiments, the plurality of the genetic variants maps to the same gene in the genome. In some embodiments, at least 10 different oncogenes are sequenced.
- In some embodiments, determining comprises receiving the sequences into computer memory and using a computer processor to execute software to determine the quantitative measurement. In some embodiments, graphically representing comprises using a computer processor to execute software that transforms the quantitative measures into a graphical format and representing the graphical format on an electronic graphical user interface, e.g., a display screen.
- In an aspect provided herein is a method to generate a paper or electronic patient test report from data generated by a genetic analyzer comprising: a) summarizing data from two or more testing time points, whereby a union of all non-zero testing results is reported at each subsequent test point after the first test; and b) rendering the testing results on the paper or electronic patient test report. In some embodiments, summarizing and rendering are performed on a computer by executing code with a computer processor to (i) identify all non-zero testing results, (ii) generate the test report and (iii) display the test report on a graphical user interface.
- In an aspect provided herein is a method of graphically representing evolution of genetic variants of a tumor in a subject from data generated by a genetic analyzer comprising: a) generating by computer a stacked representation of genetic variants detected at each of a plurality of time points in the subject, wherein a height or width of each layer in the stack that corresponds to a genetic variant represents a quantitative contribution of the genetic variant to a total quantity of genetic variants at each time point; and b) displaying the stacked representation on a computer monitor or a paper report. In some embodiments, the method further comprises using a combination of a magnitude of detected genetic variants in a body-fluid based test to infer a disease burden. In some embodiments, the method further comprises using allele fractions of detected mutations, allelic imbalances, and gene-specific coverage to infer the disease burden.
- In some embodiments, an overall stack height is representative of overall disease burden or a disease burden score in the subject. In some embodiments, a distinct color is used to represent each genetic variant. In some embodiments, only a subset of detected genetic variants is plotted. In some embodiments, the subset is chosen based on likelihood of being a driver alteration or association with increased or reduced response to treatment.
- In some embodiments, the method comprises producing a test report for a genomic test. In some embodiments, a non-linear scale is used for representing the heights or widths of each represented genetic variant. In some embodiments, a plot of previous test points is depicted on the report. In some embodiments, the method comprises estimating a disease progression or remission based on rate of change and/or quantitative precision of each testing result. In some embodiments, the method comprises displaying a therapeutic intervention between intervening testing points. In some embodiments, displaying comprises: a) receiving data representing the detected tumor genetic variants into computer memory; b) executing code with a computer processor to graphically represent the quantitative contribution of each genetic variant at a time point as a line or area proportional to the relative contribution; and c) displaying the graphical representation on a graphical user interface.
- FIG. 1 shows a flow chart of an exemplary method of determination and use of a therapeutic intervention.
- FIG. 2 shows a flow chart of an exemplary method of determining frequency of variants in a sample corrected based on CNV at a locus.
- FIG. 3 shows a flow chart of an exemplary method of providing pulsed therapy cycles which can delay drug resistance.
- FIG. 4 shows a flow chart of an exemplary method of detecting tumor burden using CNV at origins of replication to detect DNA from dividing cells.
- FIG. 5 shows an exemplary computer system.
- FIG. 6 shows an exemplary scan of CNV across a region of a genome from samples containing cells in a resting state and in a state of cell division. No genomic CNV is seen in loci a and b, but locus c shows gene duplication. In the resting state cells, copy number is relatively equal in all intervals in the region, except those intervals overlapping the locus of gene duplication. In the sample containing DNA from tumor cells, which are undergoing cell division, copy number appears to increase immediately after origins of replication, providing variance in CNV over the region. Deviation is particularly dramatic at a locus exhibiting CNV at an origin of replication (c).
- FIG. 7 shows an exemplary course of monitoring and treatment of disease in a subject.
- FIG. 8 shows an exemplary panel of 70 genes that exhibit genetic variation in cancer.
- FIG. 9A shows an exemplary system for communicating cancer test results.
- FIG. 9B shows an exemplary process to reduce error rates and bias in DNA sequence readings and generate genetic reports for users.
- FIG. 10A-10C show exemplary processes for reporting genetic test results to users.
- FIG. 10D-10I-2 show pages from an exemplary genetic test report.
- FIG. 10J-10P show various exemplary modified streamgraphs.
- FIG. 11A-11B show exemplary processes for detecting mutation and reporting test results to users.
- Methods of the present disclosure can detect biomolecular mosaicism (e.g., genetic mosaicism) in a biological sample, such as a heterogeneous genomic population of cells or deoxyribonucleic acid (DNA). Genetic mosaicism can exist at the organismal level. For example, genetic variants that arise early in development can result in different somatic cells having different genomes. An individual can be a chimera, e.g., produced by the fusion of two zygotes. Organ transplant from an allogeneic donor can result in genetic mosaics, which also can be detected by examining polynucleotides shed into the blood from the transplanted organ. Disease cell heterogeneity, in which diseased cells have different genetic variants, is another form of genetic mosaicism. Methods provided herein can detect mosaicism and, in the case of disease, provide therapeutic intervention. In certain embodiments, this disclosure provides methods for performing body-wide profiling of biomolecular mosaicism through the use of circulating polynucleotides, which may derive or otherwise originate from cells in diverse locations of the body of a subject.
- Diseased cells, such as tumors, may evolve over time, resulting in different clonal sub-populations having new genetic and phenotypic characteristics. This may result from natural mutations as the cells divide, or it may be driven by treatments that target certain clonal sub-populations, allowing clones more resistant to the treatment to proliferate by negative selection. The existence of sub-populations of diseased cells that bear different genotypic or phenotypic characteristics is referred to herein as disease cell heterogeneity, or, in the case of cancer, tumor heterogeneity.
- Presently, cancers are treated based on mutant forms found in a cancer biopsy. For example, the finding of Her2+ in even small amounts of breast cancer cells may be indicative of breast cancer, which may be followed through with a treatment using an anti-Her2+ therapy. As another example, a colorectal cancer in which a KRAS mutant is found in small amounts may be treated with a therapy for which KRAS is responsive.
- Tools for fine analysis of diseased cells (e.g., tumors) allow detection of disease cell heterogeneity. Furthermore, the analysis of polynucleotides sourced from diseased cells located throughout the body allows for a whole-body profile of disease cell heterogeneity. The use of cell-free DNA, or circulating DNA, is particularly powerful because polynucleotides in the blood are not sourced from physically localized cells. Rather, they include polynucleotides from cells at metastatic sites throughout the body. For example, analysis may show that a population of breast-cancer cells includes 90% that are Her2+ and 10% that are Her2-. This may be determined, for example, by quantifying DNA for each form in a sample, e.g., cell free DNA (cfDNA), thereby detecting heterogeneity in the tumor.
- This information can be used by a health care provider, e.g., a physician, to develop therapeutic interventions. For example, a subject that has a heterogeneous tumor can be treated as if they had two tumors, and a therapeutic intervention can treat each of the tumors. The therapeutic intervention could include, for example, a combination therapy including a first drug effective against the first tumor type and a second drug effective against the second tumor type. The drugs can be given in amounts that reflect the relative amounts of the mutant forms detected. For example, a drug to treat the mutant form that is found in higher relative amounts can be delivered at greater dose than a drug to treat the mutant form in lesser relative amount. Or, treatment for the mutant in the lesser relative amount can be delayed or staggered with respect to the mutant in greater amount.
- Monitoring changes in the profile of disease cell heterogeneity over time allows therapeutic intervention to be calibrated to an evolving tumor. For example, analysis may show increasing amounts of polynucleotides bearing drug resistance mutants. In this case, the therapeutic intervention can be modified to decrease the amount of drug effective to treat a tumor that does not bear the resistance mutant and increase administration of a drug that does treat a tumor bearing the resistance marker.
- Therapeutic interventions can be determined by a healthcare provider or by a computer algorithm, or a combination of the two. A database can contain the results of therapeutic interventions against diseases having various profiles of disease cell heterogeneity. The database can be consulted in determining a therapeutic intervention for a disease with a particular profile.
- The present disclosure provides, among other things, methods of determining a therapeutic intervention for a subject having a disease, such as cancer, that exhibits disease cell heterogeneity, e.g., tumor heterogeneity. In one embodiment, the method involves analyzing biological macromolecules (e.g., sequencing polynucleotides) of disease cells (e.g., spatially distinct disease cells) from a subject having the disease. A profile of disease cell heterogeneity is developed that indicates the existence of genetic variants specific to the disease cells and the amount of these variants relative to each other. This information, in turn, is used to determine a therapeutic intervention that takes the profile into account.
- A subject of the methods of this disclosure is any multicellular organism. More specifically, the subject can be a plant or an animal, a vertebrate, a mammal, a mouse, a primate, a simian or a human. Animals include, but are not limited to, farm animals, sport animals, and pets. A subject can be a healthy individual, an individual that has or is suspected of having a disease or a pre-disposition to the disease, or an individual that is in need of therapy or suspected of needing therapy. A subject can be a patient, e.g., a subject under the care of a professional healthcare provider.
- The subject can have a pathological condition (disease). Cells exhibiting pathology of disease are referred to herein as disease cells.
- In particular, the disease can be a cancer. Cancer is a condition characterized by abnormal cells that divide out of control. Cancers include, without limitation, carcinomas, sarcomas, leukemias, lymphomas, myelomas and central nervous system cancers. More specific examples of cancers are breast cancer, prostate cancer, colorectal cancer, brain cancer, esophageal cancer, head and neck cancer, bladder cancer, gynecological cancer, liposarcoma, and multiple myeloma.
- Other cancers include, for example, acute lymphoblastic leukemia (ALL), acute myeloid leukemia (AML), adrenocortical carcinoma, Kaposi Sarcoma, anal cancer, basal cell carcinoma, bile duct cancer, bladder cancer, bone cancer, osteosarcoma, malignant fibrous histiocytoma, brain stem glioma, brain cancer, craniopharyngioma, ependymoblastoma, ependymoma, medulloblastoma, medulloepithelioma, pineal parenchymal tumor, breast cancer, bronchial tumor, Burkitt lymphoma, Non-Hodgkin lymphoma, carcinoid tumor, cervical cancer, chordoma, chronic lymphocytic leukemia (CLL), chronic myelogenous leukemia (CML), colon cancer, colorectal cancer, cutaneous T-cell lymphoma, ductal carcinoma in situ, endometrial cancer, esophageal cancer, Ewing Sarcoma, eye cancer, intraocular melanoma, retinoblastoma, fibrous histiocytoma, gallbladder cancer, gastric cancer, glioma, hairy cell leukemia, head and neck cancer, heart cancer, hepatocellular (liver) cancer, Hodgkin lymphoma, hypopharyngeal cancer, kidney cancer, laryngeal cancer, lip cancer, oral cavity cancer, lung cancer, non-small cell carcinoma, small cell carcinoma, melanoma, mouth cancer, myelodysplastic syndromes, multiple myeloma, medulloblastoma, nasal cavity cancer, paranasal sinus cancer, neuroblastoma, nasopharyngeal cancer, oral cancer, oropharyngeal cancer, osteosarcoma, ovarian cancer, pancreatic cancer, papillomatosis, paraganglioma, parathyroid cancer, penile cancer, pharyngeal cancer, pituitary tumor, plasma cell neoplasm, prostate cancer, rectal cancer, renal cell cancer, rhabdomyosarcoma, salivary gland cancer, Sezary syndrome, skin cancer, nonmelanoma, small intestine cancer, soft tissue sarcoma, squamous cell carcinoma, testicular cancer, throat cancer, thymoma, thyroid cancer, urethral cancer, uterine cancer, uterine sarcoma, vaginal cancer, vulvar cancer, Waldenstrom macroglobulinemia, and/or Wilms Tumor.
- A tumor is a collection of cancer cells (cancer disease cells). This includes, for example, a collection of cells in a single mass of cells (e.g., a solid tumor), a collection of cells from different metastatic tumor sites (metastatic tumors), and diffuse tumors (e.g., circulating tumor cells). A tumor can include cells of a single cancer (e.g., colorectal cancer), or multiple cancers (e.g., colorectal cancer and pancreatic cancer). A tumor can include cells originating from a single original somatic cell or from different somatic cells.
- In certain embodiments, disease cells in the subject are spatially distinct. Disease cells are spatially distinct if the cells are located at least 1 cm, at least 2 cm, at least 5 cm or at least 10 cm apart in a body, e.g., in different tissues or organs, or the same tissue or organ. In the case of cancer, examples of spatially distinct cancer cells include cancer cells from diffuse cancers (such as leukemias), cancer cells at different metastatic sites, and cancer cells from the same mass of tumor cells that are separated by at least 1 cm.
- Disease cell burden (e.g., “tumor burden”) is a quantitative measure of the amount of disease cells in a subject. One measure of disease cell burden is the fraction of total biological macromolecules in a sample that are disease biological macromolecules, e.g., the relative amount of tumor polynucleotides in a sample of cell free polynucleotides. For example, if cfDNA from a first subject has 10% cancer polynucleotides, the subject may be said to have a cell-free tumor burden of 10%. If cfDNA from a second subject has 5% cancer polynucleotides, the second subject may be said to have half the cell-free tumor burden of the first subject. These measures are much more relevant on an intra-subject basis than on an inter-subject basis, as cell-free tumor burdens in one individual can be much higher or lower than another individual despite differing levels of disease burden. However, these measures can be used quite effectively for monitoring disease burden within an individual, e.g., an increase from a 5% to a 15% cell-free DNA tumor burden may indicate significant progression of disease, while a decrease from 10% to 1% may indicate partial response to treatment.
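- The arithmetic in the examples above can be written out as a short sketch. The function names and molecule counts are hypothetical, and the sketch makes no claim about what fold change is clinically significant.

```python
def cell_free_tumor_burden(tumor_molecules: int, total_molecules: int) -> float:
    """Fraction of cell free polynucleotides in a sample that are tumor-derived."""
    if total_molecules <= 0:
        raise ValueError("total_molecules must be positive")
    return tumor_molecules / total_molecules


def fold_change(earlier_burden: float, later_burden: float) -> float:
    """Within-subject change in burden between two time points."""
    return later_burden / earlier_burden


first_subject = cell_free_tumor_burden(100, 1000)   # 10% burden
second_subject = cell_free_tumor_burden(50, 1000)   # 5% burden
progression = fold_change(0.05, 0.15)               # about a threefold increase
response = fold_change(0.10, 0.01)                  # about a tenfold decrease
```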
- Polynucleotides to be sequenced can be sourced from spatially distinct sites. This includes polynucleotides sourced from biopsies of different locations in a single tumor mass. It also includes polynucleotides sourced from cells at different metastatic tumor sites. Cells shed polynucleotides into the blood where they are detectable as cell free polynucleotides (e.g., circulating tumor DNA). Cell free polynucleotides also can be found in other bodily fluids such as urine. Therefore, cfDNA provides a more accurate profile of tumor heterogeneity across the entire disease cell population than DNA sourced from a single tumor location. DNA sampled from cells across the disease cell population in a body is referred to as “disease burden DNA” or, in the case of cancer, “tumor burden DNA”.
- Disease cells, such as tumors, can share the same or similar biomolecular profiles. For example, tumors may share one, two, three or more genetic variants. Such variants may share the same stratification, for example highest frequency, second highest frequency, etc. Profiles can also share similar disease cell burdens, e.g., cfDNA burdens, e.g., within 15%, within 10%, within 5% or within 2%.
- As used herein, a macromolecule is a molecule formed from monomeric subunits. Monomeric subunits forming biological macromolecules include, for example, nucleotides, amino acids, monosaccharides and fatty acids. Biological macromolecules include, for example, biopolymers and non-polymeric macromolecules.
- A polynucleotide is a macromolecule comprising a polymer of nucleotides. Polynucleotides include, for example, polydeoxyribonucleotides (DNA) and polyribonucleotides (RNA). A polypeptide is a macromolecule comprising a polymer of amino acids. A polysaccharide is a macromolecule comprising a polymer of monosaccharides. Lipids are a diverse group of organic compounds including, for example, fats, oils and hormones that share the functional characteristic of not interacting appreciably with water. For example, a triglyceride is a fat formed from three fatty acid chains.
- A cancer polynucleotide (e.g., cancer DNA) is a polynucleotide (e.g., DNA) derived from a cancer cell. Cancer DNA and/or RNA can be extracted from tumors, from isolated cancer cells or from biological fluids (e.g., saliva, serum, blood or urine) in the form of cell free DNA (cfDNA) or cell free RNA.
- Cell free DNA is DNA located outside of a cell in a bodily fluid, e.g., in blood or urine. Circulating nucleic acids (CNA) are nucleic acids found in the blood stream. Cell free DNA in the blood is a form of circulating nucleic acid. Cell free DNA is believed to arise from dying cells that shed their DNA into the blood. Because spatially distinct cancer cells will shed DNA into bodily fluids, such as blood, cfDNA of cancer subjects typically comprises cancer DNA from spatially distinct cancer cells.
- Analytes for analysis in the methods of this disclosure can derive from a biological sample, e.g., a sample comprising a biological macromolecule. A biological sample can be derived from any organ, tissue or biological fluid. A biological sample can comprise, for example, a bodily fluid or a solid tissue sample. An example of a solid tissue sample is a tumor sample, e.g., from a solid tumor biopsy. Bodily fluids include, for example, blood, serum, tumor cells, saliva, urine, lymphatic fluid, prostatic fluid, seminal fluid, milk, sputum, stool and tears. Bodily fluids are particularly good sources of biological macromolecules from spatially distinct disease cells, as such cells from many locations in a body can shed these molecules into the bodily fluid. For example, blood and urine are good sources of cell free polynucleotides. Macromolecules from such sources can provide a more accurate profile of the diseased cells than macromolecules derived from a localized disease cell mass.
- Amounts of disease polynucleotides in a bodily fluid sample can be increased. Such increases can increase sensitivity of detection of disease polynucleotides. In one method, an intervention, such as a therapeutic intervention, is administered to a subject that causes disease cells to lyse, emptying their DNA into the surrounding fluid. Such interventions can include administration of chemotherapy. It also can include administering radiation or ultrasound to the whole body of a subject, or to a portion of the body of a subject, such as being directed to a tumor or a diseased organ. After administration of the intervention and when the amount of disease polynucleotides in the fluid is increased, a fluid sample is collected for analysis. The interval between administration of the intervention and collection can be long enough for the disease polynucleotides to increase, but not so long that they are cleared from the body. For example, a low dose of chemotherapy can be administered about a week before collection of the sample.
- This disclosure contemplates several types of biomolecular analysis including, for example, genomic, epigenetic (e.g., methylation), RNA expression and proteomic. Genomic analysis can be performed by, for example, a genetic analyzer, e.g., using DNA sequencing. Methylation analysis can be performed by, for example, conversion of methylated bases followed by DNA sequencing. RNA expression analysis can be performed by, for example, polynucleotide array hybridization. Proteomic analysis can be performed by, for example, mass spectrometry.
- As used herein, the term “genetic analyzer” refers to a system including a DNA sequencer for generating DNA sequence information and a computer comprising software that performs bioinformatic analysis on the DNA sequence information. Bioinformatic analysis can include, without limitation, assembling sequence data, detecting and quantifying genetic variants in a sample, including either of germline variants (e.g., heterozygosity) and somatic cell variants (e.g., cancer cell variants).
- Analytic methods can include generating and capturing genetic information. Genetic information can include genetic sequence information, ploidy states, the identity of one or more genetic variants, as well as a quantitative measure of the variants. The term “quantitative measure” refers to any measure of quantity including absolute and relative measures. A quantitative measure can be, for example, a number (e.g., a count), a percentage, a frequency, a degree or a threshold amount.
- Polynucleotides can be analyzed by any method known in the art. Typically, the DNA sequencer will employ next generation sequencing (e.g., Illumina, 454, Ion torrent, SOLiD). Sequence analysis can be performed by massively parallel sequencing, that is, simultaneously (or in rapid succession) sequencing any of at least 100,000, 1 million, 10 million, 100 million, or 1 billion polynucleotide molecules. Sequencing methods may include, but are not limited to: high-throughput sequencing, pyrosequencing, sequencing-by-synthesis, single-molecule sequencing, nanopore sequencing, semiconductor sequencing, sequencing-by-ligation, sequencing-by-hybridization, RNA-Seq (Illumina), Digital Gene Expression (Helicos), Next generation sequencing, Single Molecule Sequencing by Synthesis (SMSS) (Helicos), massively-parallel sequencing, Clonal Single Molecule Array (Solexa), shotgun sequencing, Maxam-Gilbert or Sanger sequencing, primer walking, sequencing using PacBio, SOLiD, Ion Torrent, Genius (GenapSys) or Nanopore (e.g., Oxford Nanopore) platforms and any other sequencing methods known in the art.
- The DNA sequencer can apply Gilbert's sequencing method based on chemical modification of DNA followed by cleavage at specific bases, or it can apply Sanger's technique which is based on dideoxynucleotide chain termination. The Sanger method became popular due to its increased efficiency and low radioactivity. The DNA sequencer can use techniques that do not require DNA amplification (polymerase chain reaction—PCR), which speeds up the sample preparation before sequencing and reduces errors. In addition, sequencing data is collected from the reactions caused by the addition of nucleotides in the complementary strand in real time. For example, the DNA sequencers can utilize a method called Single-molecule real-time (SMRT), where sequencing data is produced by light (captured by a camera) emitted when a nucleotide is added to the complementary strand by enzymes containing fluorescent dyes.
- Sequencing of the genome can be selective, e.g., directed to portions of the genome of interest. For example, many genes (and mutant forms of these genes) are known to be associated with various cancers. Sequencing of select genes, or portions of genes may suffice for the analysis desired. Polynucleotides mapping to specific loci in the genome that are the subject of interest can be isolated for sequencing by, for example, sequence capture or site-specific amplification.
- A nucleotide sequence (e.g., DNA sequence) can refer to raw sequence reads or processed sequence reads, such as unique molecular counts inferred from raw sequence reads.
- Sequence reads generated from sequencing are subject to analysis including, for example, identifying genetic variants. This can include identifying sequence variants and quantifying numbers of base calls at each locus. Quantifying can involve, for example, counting the number of reads mapping to a particular genetic locus. Different numbers of reads at different loci can indicate copy number variation (CNV).
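- Counting reads per locus and comparing the counts to a baseline can be sketched as follows. The baseline counts, the median-based normalization, and the function names are assumptions for illustration; the text above does not specify how read counts are normalized.

```python
from collections import Counter
from statistics import median


def count_reads(read_loci):
    """Count the reads mapping to each genetic locus."""
    return Counter(read_loci)


def relative_copy_number(sample_counts, baseline_counts):
    """Ratio of sample to baseline counts, centered on the median locus.

    Assumes every baseline count is positive and that most loci are copy-neutral,
    so the median ratio represents the normal copy state.
    """
    raw = {locus: sample_counts[locus] / baseline_counts[locus]
           for locus in baseline_counts}
    center = median(raw.values())
    return {locus: ratio / center for locus, ratio in raw.items()}


# Illustrative data: locus "c" has twice the reads of loci "a" and "b".
sample_counts = count_reads(["a"] * 100 + ["b"] * 100 + ["c"] * 200)
baseline_counts = {"a": 50, "b": 50, "c": 50}  # hypothetical normal baseline
ratios = relative_copy_number(sample_counts, baseline_counts)
```

- A ratio near 1 suggests no copy number change at that locus, while a ratio near 2 at locus "c" is consistent with a duplication.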
- Sequencing and bioinformatics methods that reduce noise and distortion are particularly useful when the number of target polynucleotides in a sample is small compared with non-target polynucleotides. When the target molecules are few in number, the signal from the target may be weak. This can be the case, for example, in the case of cell free DNA, where a small number of tumor polynucleotides may be mixed with a much larger number of polynucleotides from healthy cells. Molecular tracking methods can be useful in such situations. Molecular tracking involves tracking sequence reads from a sequencing protocol back to molecules in an original sample (e.g., before amplification and/or sequencing) from which the reads are derived. Certain methods involve tagging molecules in such a way that multiple sequence reads produced from original molecules can be grouped into families of sequences derived from original molecules. In this way, base calls representing noise can be filtered out. Such methods are described in more detail in, for example, WO 2013/142389 (Schmitt et al.), US 2014/0227705 (Vogelstein et al.) and WO 2014/149134 (Talasaz et al.). Up-sampling methods also are useful to more accurately determine counts of molecules in a sample. In some embodiments, up-sampling methods involve determining a quantitative measure of individual DNA molecules for which both strands (Watson and Crick strands) are detected; determining a quantitative measure of individual DNA molecules for which only one of the DNA strands is detected; inferring from these measures a quantitative measure of individual DNA molecules for which neither strand was detected; and using these measures to determine the quantitative measure indicative of a number of individual double-stranded DNA molecules in the sample. This method is described in more detail in PCT/US2014/072383, filed Dec. 24, 2014.
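- The up-sampling inference can be illustrated with a simple model. The sketch below assumes that each strand of an original double-stranded molecule is detected independently with the same probability p; this model is an assumption made here and is not necessarily the one used in the referenced application. Under it, with N original molecules, the expected counts are N·p² with both strands seen, 2·N·p·(1−p) with one strand seen, and N·(1−p)² with neither seen, so the unseen count can be estimated as (one strand)² / (4 × both strands).

```python
def estimate_molecules(both_strands: int, one_strand: int):
    """Estimate unseen and total double-stranded molecules in the sample.

    Assumes independent, equally likely detection of each strand
    (an illustrative model, not taken from the disclosure).
    """
    if both_strands <= 0:
        raise ValueError("both_strands must be positive to estimate detection rate")
    unseen = one_strand ** 2 / (4 * both_strands)
    total = both_strands + one_strand + unseen
    return unseen, total


# Example: 810 molecules seen on both strands and 180 seen on one strand only.
unseen, total = estimate_molecules(810, 180)
```

- In this example the counts correspond to 1000 original molecules detected with p = 0.9 per strand, of which about 10 would be missed entirely.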
- Methods of the present disclosure can be used in the detection of genetic variants (also referred to as “gene alterations”). Genetic variants are alternative forms at a genetic locus. In the human genome, approximately 0.1% of nucleotide positions are polymorphic, that is, exist in a second genetic form occurring in at least 1% of the population. Mutations can introduce genetic variants into the germ line, and also into disease cells, such as cancer cells. Reference sequences, such as hg19 or NCBI Build 37 or Build 38, are intended to represent a “wild type” or “normal” genome. However, to the extent they have a single sequence, they do not identify common polymorphisms which may also be considered normal.
- Genetic variants include sequence variants, copy number variants and nucleotide modification variants. A sequence variant is a variation in a genetic nucleotide sequence. A copy number variant is a deviation from wild type in the number of copies of a portion of a genome. Genetic variants include, for example, single nucleotide variations (SNVs), insertions, deletions, inversions, transversions, translocations, gene fusions, chromosome fusions, gene truncations, copy number variations (e.g., aneuploidy, partial aneuploidy, polyploidy, gene amplification), abnormal changes in nucleic acid chemical modifications, abnormal changes in epigenetic patterns and abnormal changes in nucleic acid methylation.
- Genetic variants can be detected by comparing sequences from polynucleotides in a sample to a reference, e.g., to a reference genome sequence, to an index or to a database of known mutations. In one embodiment, the reference sequence is a publicly available reference sequence, such as the human genome sequence HG-19 or NCBI Build 37. In another embodiment, the reference sequence is a sequence in a non-public database. In another embodiment, the reference sequence is a germ line sequence of an organism inferred or determined from sequencing polynucleotides from the organism.
- A somatic mutation or somatic alteration is a genetic variant that arises in a somatic cell. Somatic mutations are distinguished from mutations that arise in the genome of a germ line cell (i.e., sperm or egg) or a zygote, of an individual. Somatic mutations, e.g., those found in cancer cells, are distinguishable from the germ line genome of a subject in which the cancer arose. They also can be detected by comparing the cancer genome with the germ line genome or with a reference genome. There also are known genetic variants that are common in cancer cells. A database of SNVs in human cancer can be found at the website: cancer.sanger.ac.uk/cancergenome/projects/cosmic/.
- FIG. 8 shows genes known, in cancer, to exhibit point mutations, amplifications, fusions and indels.
- During the S phase of the cell cycle, the cell replicates DNA. A diploid cell having 2N chromosomes with replicated DNA may correspond to about 4× DNA content, whereas a diploid cell having 2N chromosomes without replicated DNA may correspond to about 2× DNA content. Replication proceeds from origins of replication. In mammals, origins of replication are spaced at intervals of about 15 kb to 300 kb. During this period, portions of the genome exist in polyploid form. Those areas between origins of replication and the position of the polymerase are duplicated, while those areas beyond the position of the polymerase (or just before the origin of replication) are still in single copy number in the strand undergoing replication. When scanned across the genome, copy number appears uneven or distorted, having regions that exist in polyploid form and regions that exist in diploid form. Such a scan appears noisy. This is true even for cells that do not bear copy number variations in the genome in the resting state. In contrast, a scan of CNV in cells in G0 shows a profile in which copy number is relatively flat or undistorted across the genome. Because cancer cells divide rapidly, their CNV profile across the genome exhibits distortion, whether or not the genome also bears CNVs at certain loci.
- One can take advantage of this fact to detect tumor burden in DNA from samples comprising heterogeneous DNA, e.g., a mixture of disease DNA and healthy DNA, such as cfDNA. One method to detect tumor burden involves determining copy-number variation due to proximity of the examined locus or loci to various origins of replication. Regions that include a replication origin will have very close to 4 copies of DNA in that locus (in a diploid cell), while regions that are far removed from a replication origin will have closer to 2 copies (in a diploid cell). In certain embodiments, the examined locus or loci span at least 1 kb, at least 10 kb, at least 100 kb, at least 1 Mb, at least 10 Mb or at least 100 Mb, or extend across an entire chromosome or across an entire genome. A measure of replication origin CNV (ROCNV) across the region is determined. This can be, for example, a measure of deviation in copy number from a value of central tendency. The value of central tendency can be, for example, mean, median or mode. The measure of deviation can be, for example, variance or standard deviation. This measure can be compared with a measure of ROCNVs across the same region in a control sample, e.g., from a healthy individual or cells in resting state. ROCNVs can be determined by partitioning the region or regions analyzed into non-overlapping partitions of various lengths and taking a measure of CNV in each partition. This measure of CNV can be derived from the number of reads or fragments determined to map to those regions after sequencing. The partitions can have various sizes, to produce various levels of resolution, e.g., a single base level (base-per-base), 10 bases, 100 bases, 1 kb, 10 kb or 100 kb. Deviations that are greater than a control indicate the presence of DNA undergoing replication, which, in turn, indicates malignancy. The greater the degree of deviation, the greater the amount of DNA from cells undergoing cell division in the sample.
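- The partition-and-compare procedure described above can be sketched in Python as follows. This is a minimal illustration only: it uses the standard deviation of mean-scaled per-partition read counts as the measure of deviation, which is one of several options the text mentions, and the function names, read positions and counts are hypothetical.

```python
from statistics import pstdev


def bin_counts(read_starts, region_start, region_end, bin_size):
    """Count reads in non-overlapping partitions of [region_start, region_end)."""
    n_bins = (region_end - region_start + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for pos in read_starts:
        if region_start <= pos < region_end:
            counts[(pos - region_start) // bin_size] += 1
    return counts


def rocnv_deviation(counts):
    """Standard deviation of per-partition counts, each divided by the mean count."""
    mean = sum(counts) / len(counts)
    if mean == 0:
        raise ValueError("no reads mapped to the region")
    return pstdev([c / mean for c in counts])


# Hypothetical read start positions in a 30-base region, 10-base partitions.
example_bins = bin_counts([0, 5, 12, 25, 29], 0, 30, 10)

# Hypothetical per-partition counts for a test sample and a resting-state control.
test_dev = rocnv_deviation([10, 30, 10, 30])
control_dev = rocnv_deviation([20, 20, 20, 20])
more_distorted_than_control = test_dev > control_dev
```

Dividing by the mean count is a design choice made here so that samples sequenced to different depths can be compared; it is an assumption of this sketch rather than a requirement of the method.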
- Various methods can be used to calculate true genetic copy number variations that differ from replication origin based distortion. For example, heterozygous SNP positions at affected CNV loci can be used to infer copy number variation by calculating the deviation from 50% or the allelic imbalance at those loci. Distortion due to replication origin proximity should not affect this imbalance since both copies would generally be copied at similar time intervals and are thus self-normalizing (although allelic changes could conceivably change the origin of replication between the two allelic variants). For example, duplication of a chromosome segment containing a SNP could be detected in around 67% of reads, while duplication resulting from ROCNV would be detected in about 50% of reads. In another method, counting-based techniques that use the density of detected fragments or reads at a certain locus are used to calculate relative copy number. These techniques are generally limited by Poisson noise and systematic bias due to DNA sample preparation and sequencing bias. A combination of these methods may also be used to obtain even greater accuracy.
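- The 67% and 50% figures above follow from simple allele counting, as the short Python sketch below illustrates. The function name is hypothetical, and the sketch assumes a single cell population in which reads are sampled in proportion to allele copies.

```python
def expected_variant_fraction(variant_copies: int, other_copies: int) -> float:
    """Expected fraction of reads carrying the SNP allele at a locus."""
    return variant_copies / (variant_copies + other_copies)


# Heterozygous locus, no copy number change: one copy of each allele.
balanced = expected_variant_fraction(1, 1)

# True duplication of the segment carrying the SNP allele: 2 copies vs 1.
duplicated = expected_variant_fraction(2, 1)

# Replication-related doubling affects both alleles alike: 2 copies vs 2.
replicating = expected_variant_fraction(2, 2)
```

Here `balanced` and `replicating` are both 0.5, while `duplicated` is about 0.67, which is why allelic imbalance can separate true copy number change from replication-related distortion.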
- ROCNV can be calculated for a given sample and be used to give a value on cell-free tumor burden despite lack of detection of traditional somatic variants, such as, SNVs, gene-specific CNVs, genomic rearrangements, epigenetic variants, loss of heterozygosity, etc. ROCNVs can also be used to subtract distortion for a given sample to increase sensitivity and/or specificity of a given CNV detection/estimation method by removing variation that is related to replication origin proximity rather than due to true copy number changes in a cell. Cell-lines with known or no copy number changes over a reference can also be used as a reference of ROCNVs for use in estimating its contribution to a given sample.
- In one embodiment, the method involves determining a baseline level of copies of DNA molecules at one or more loci from one or more control samples, each containing DNA from cells undergoing a predetermined level of cell division, e.g., cells in resting state or rapidly dividing tumor cells. A measure of copies of DNA molecules in a test sample is also determined. The measure in test samples can be from one or more loci partitioned into one or more partitions. In each case, a plurality of loci each include an origin of replication. The measure of copies from the test sample can be an average across all partitions, or a level of variance across loci. A measure of central tendency or of variation (e.g., variance or standard deviation) in copy number in the test sample is compared to the control sample. A measure that is greater in a test sample than in a control of cells in resting state, or slowly dividing, indicates that cells generating the DNA in the test sample are dividing more rapidly than cells providing DNA to the control sample, e.g., are cancerous. Similarly, measures that are similar between a test sample and a control of cells in actively dividing state indicate that cells generating the DNA in the test sample are dividing at a rate similar to the rapidly dividing cells, e.g., are cancerous.
- Disease cell heterogeneity, e.g., tumor heterogeneity, is the occurrence of diseased cells having different genetic variants. Disease cell heterogeneity can be determined by examination of polynucleotides isolated from diseased cells and detection of differences in their genomes. Disease cell heterogeneity also can be inferred from examination of polynucleotides from a sample containing polynucleotides from both diseased and healthy cells based on differences in relative frequency of somatic mutations. For example, cancer is characterized by changes at the genetic level, e.g., through the accumulation of somatic mutations in different clonal groups of cells. These changes can contribute to unregulated growth of the cancer cells, or function as markers of responsiveness or non-responsiveness to various therapeutic interventions.
- Tumor heterogeneity is a condition in which a tumor is characterized by cancer cells containing different combinations of genetic variants, e.g., different combinations of somatic mutations. That is, the tumor can have different cells containing alterations in different genes, or containing different alterations in the same gene. For example, a first cell could include a mutant form of BRAF, while a second cell could include mutant forms of both BRAF and ERBB2. Alternatively, a first cancer cell could include the single nucleotide polymorphism EGFR 55249063 G>A, while a second cell could include the single nucleotide polymorphism EGFR 55238874 T>A. (Numbers refer to nucleotide position in genomic reference sequence.)
- For example, an original tumor cell can include a genetic variant in a gene, e.g., an oncogene. As the cells continue to divide, some progeny cells, which carry the original mutation, may independently develop genetic variants in other genes or in different parts of the same gene. In subsequent divisions, tumor cells can accumulate still more genetic variants.
- Methods of this disclosure allow quantitative as well as qualitative profiling of disease mosaicism, e.g., tumor heterogeneity. In one embodiment, the profile includes information from polynucleotides from spatially distinct disease cells. In one embodiment, the profile is a whole body profile containing information from cells distributed throughout the body. Analysis of polynucleotides in cfDNA allows sampling of DNA across the entire geographic extent of a tumor, in contrast with sampling of a localized area of a tumor. In particular, it allows sampling of diffuse and metastatic tumors. This contrasts with methods that detect the mere existence of tumor heterogeneity through the localized sampling of a tumor. The profile can indicate the exact nucleotide sequence of the variant, or may simply indicate a gene bearing the somatic mutation.
- In one embodiment of a profile of disease cell heterogeneity, such as tumor cell heterogeneity, the profile identifies genetic variations and the relative amounts of each variant. From this information, one can infer possible distributions of the variants in different cell sub-populations. For example, a cancer may begin with a cell bearing somatic mutation X. As a result of clonal evolution, some progeny of this cell may develop variant Y. Other progeny may develop variant Z. At the cellular level, after analysis, the tumor may be characterized as 50% X, 35% XY and 15% XZ. At the DNA level (and considering DNA from tumor cells only), the profile may indicate 100% X, 35% Y and 15% Z. One may also detect both CNV at a first locus and sequence variants at a second locus.
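- The relation between the cellular-level and DNA-level profiles in the example above can be reproduced with a short Python sketch. It assumes, for illustration only, that every cell contributes equally to the DNA pool and that each variant is present at the same copy number in every cell that carries it; the function name is hypothetical.

```python
def dna_level_profile(cell_fractions: dict) -> dict:
    """Sum clone percentages per variant to get the tumor-DNA-level profile.

    cell_fractions maps a tuple of variants carried by a clone to the
    percentage of tumor cells belonging to that clone.
    """
    profile = {}
    for variants, percent in cell_fractions.items():
        for variant in variants:
            profile[variant] = profile.get(variant, 0) + percent
    return profile


# Cellular-level characterization from the example: 50% X, 35% XY, 15% XZ.
clones = {("X",): 50, ("X", "Y"): 35, ("X", "Z"): 15}
profile = dna_level_profile(clones)
```

For these inputs the result is 100 for X, 35 for Y and 15 for Z, matching the DNA-level profile given in the text.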
- Tumor heterogeneity can be detected from analysis of sequences of cancer polynucleotides, based on the existence of genomic variations at different loci occurring at different frequencies. For example, in a sample of cell free DNA (which is likely to contain germ line DNA as well as cancer DNA), it may be found that a sequence variant of BRAF occurs at a frequency of 17%, a sequence variant of CDKN2A occurs at a frequency of 6%, a sequence variant of ERBB2 occurs at a frequency of 3% and a sequence variant of ATM occurs at a frequency of 1%. These different frequencies of sequence variants indicate tumor heterogeneity. Similarly, genetic sequences exhibiting different amounts of copy number variation also indicate tumor heterogeneity. For example, analysis of a sample may show different levels of amplification for the EGFR and CCNE1 genes. This also indicates tumor heterogeneity.
- In the case of cell free DNA, detection of somatic mutations can be made by comparing base calls in the sample to a reference sequence or, internally, as less frequent base calls to more common base calls, presumed to be in the germ line sequence. In either case, the existence of sub-dominant forms (e.g., less than 40% of total base calls) at different loci and at different frequency indicates disease cell heterogeneity.
- Cell free DNA typically comprises a preponderance of DNA from normal cells having the germ line genome sequence and, in the case of a disease, such as cancer, a small percentage of DNA from cancer cells and having a cancer genome sequence. Sequences generated from polynucleotides in a sample of cfDNA can be compared with a reference sequence to detect differences between the reference sequence and the polynucleotides in the cfDNA. At any locus, all or nearly all of the polynucleotides from a test sample may be identical to a nucleotide in the reference sequence. Alternatively, a nucleotide detected at nearly 100% frequency in a sample may be different than a nucleotide in the reference sequence. This most likely indicates a normal polymorphic form at this locus. If a first nucleotide that matches a reference nucleotide is detected at about 50% and a second nucleotide that is different than a reference nucleotide is detected at about 50%, this most likely indicates normal heterozygosity. Heterozygosity may present at allele ratios divergent from 50:50, e.g., 60:40 or even 70:30. However, if the sample comprises a nucleotide detectable above noise at a frequency below (or above) an unambiguously heterozygote range (for example, less than about 45%, less than 40%, less than 30%, less than 20%, less than 10% or less than 5%), this can be attributed to the existence of somatic mutations in a percentage of the cells contributing DNA to the cfDNA population. These may come from disease cells, e.g., cancer cells. (The exact percentage is a function of tumor load.) If the frequencies of somatic mutations at two different genetic loci are different, e.g., 16% at one locus and 5% at another locus, this indicates that the disease cells, e.g., the cancer cells, are heterogeneous.
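- The interpretation rules described above can be sketched as a simple classifier in Python. The cut-off values used here (a 0.1% noise floor, a 30% to 70% heterozygote range and a 95% threshold for homozygosity) are assumptions chosen for illustration from the ranges mentioned in the text, not values prescribed by it, and the function names are hypothetical.

```python
def classify_allele_fraction(fraction, noise_floor=0.001,
                             het_low=0.30, het_high=0.70, hom_min=0.95):
    """Label a non-reference allele fraction observed in cfDNA."""
    if fraction < noise_floor:
        return "noise"
    if fraction >= hom_min:
        return "homozygous polymorphism"
    if het_low <= fraction <= het_high:
        return "heterozygous"
    # Detectable, but outside the unambiguous heterozygote range.
    return "possible somatic"


def suggests_heterogeneity(somatic_fractions, tolerance=0.02):
    """True if somatic variant fractions at different loci clearly differ."""
    return max(somatic_fractions) - min(somatic_fractions) > tolerance


labels = [classify_allele_fraction(f) for f in (0.99, 0.50, 0.16, 0.05)]
heterogeneous = suggests_heterogeneity([0.16, 0.05])
```

With the example frequencies from the text, the 16% and 5% variants are both labeled as possible somatic mutations, and their difference is flagged as suggesting heterogeneity.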
- In the case of DNA from solid tumors, which is expected to predominantly comprise tumor DNA, somatic mutations also can be detected by comparison to a reference sequence. Detection of somatic mutations that exist in 100% of the tumor cells may require reference to a standard sequence or information about known mutants. However, the existence of sub-dominant sequences among the polynucleotide pool at different loci and at different relative frequencies indicates tumor heterogeneity.
- The profile may include genetic variants in genes that are known to be actionable. Knowledge of such variants can contribute to selecting therapeutic interventions, as therapies can be targeted to such variants. In the case of cancer, many actionable genetic variants are already known.
- In general, the copy number state of a gene should be reflected in the frequency of a genetic form of the gene in the sample. For example, a sequence variant may be detected at a frequency consistent with homozygosity or heterozygosity (e.g., about 100% or about 50%, respectively) with no copy number variation. This is consistent with a germ line polymorphism or mutation. A sequence variant may be detected at a frequency of about 67% (or, alternatively, at about 33%) of polynucleotides at a locus, and also in a gene measured at increased copy number (generally, n=2). This is consistent with gene duplication in the germ line. For example, a trisomy would present in this fashion. However, if a sequence variant is detected at a level consistent with homozygosity (e.g., about 100%) but at amounts consistent with copy number variation, this is more likely to reflect the presence of disease cell polynucleotides having undergone gene amplification. Similarly, if a sequence variant is detected at a level not inconsistent with heterozygosity (e.g., deviating somewhat from 50%) but at amounts consistent with copy number variation, this also is more likely to reflect the presence of disease cell polynucleotides; the diseased polynucleotides create some level of imbalance in allele frequency away from 50:50.
- This observation can be used to infer whether a sequence variant is more likely present in the germ line level or resulted from a somatic cell mutation, e.g., in a cancer cell. For example, a sequence variant in a gene detected at levels arguably consistent with heterozygosity in the germ line is more probably the product of a somatic mutation in disease cells if copy number variation also is detected in that gene.
- Also, to the extent we expect that a gene duplication in the germ line should bear a variant consistent with increased genetic dose (e.g., about 67% for trisomy at a locus), detection of gene amplification with a sequence variant dose that deviates significantly from this expected amount indicates that the CNV is more likely present as a result of somatic cell mutation.
- The fact that somatic mutations at different loci may be present at single or multiple copy number in the same disease cell also can be used to infer tumor heterogeneity. More specifically, tumor heterogeneity can be inferred when two genes are detected at different frequency but their copy number is relatively equal. Alternatively, tumor homogeneity can be inferred when the difference in frequency between two sequence variants is consistent with the difference in copy number for the two genes. Thus, if an EGFR variant is detected at 11% and a KRAS variant is detected at 5%, and no CNV is detected at these genes, the difference in frequency likely reflects tumor heterogeneity (e.g., all tumor cells carry an EGFR mutant and half the tumor cells also carry a KRAS mutant). Alternatively, if the EGFR gene carrying the mutant is detected at increased copy number, one consistent interpretation is a homogeneous population of tumor cells, each cell carrying a mutant in the EGFR and KRAS genes, but in which the EGFR gene is duplicated. Accordingly, both the frequency of a sequence variant and a measure of CNV at the locus of the sequence variant in a sample can be determined. The frequency can then be corrected to reflect the relative number of cells bearing the variant by weighting the frequency based on dose per cell determined from the measure of CNV. This result is now more comparable, in terms of number of cells carrying the variant, to a sequence variant that does not vary in copy number.
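- The copy-number correction described above can be illustrated with a minimal Python sketch. It divides the observed variant frequency by the number of variant copies per cell, a deliberately simplified weighting that ignores the effect of amplification on the total number of molecules at the locus. The function names, the assumed dose of 2 copies and the 1-point tolerance are hypothetical.

```python
def cell_weighted_frequency(variant_percent: float, copies_per_cell: float) -> float:
    """Scale a variant frequency (in %) by its per-cell dose from CNV analysis."""
    if copies_per_cell <= 0:
        raise ValueError("copies_per_cell must be positive")
    return variant_percent / copies_per_cell


def consistent_with_one_population(freq_a: float, freq_b: float,
                                   tolerance: float = 1.0) -> bool:
    """True if two corrected frequencies (in %) agree within the tolerance."""
    return abs(freq_a - freq_b) <= tolerance


# Example from the text: EGFR variant at 11%, KRAS variant at 5%.
egfr_no_cnv = cell_weighted_frequency(11.0, 1)      # no CNV detected
egfr_duplicated = cell_weighted_frequency(11.0, 2)  # assumed 2 variant copies per cell
kras = cell_weighted_frequency(5.0, 1)

heterogeneous_if_no_cnv = not consistent_with_one_population(egfr_no_cnv, kras)
homogeneous_if_duplicated = consistent_with_one_population(egfr_duplicated, kras)
```

Without CNV the two frequencies differ by more than the tolerance, which points to heterogeneity; with the assumed EGFR duplication the corrected EGFR value of 5.5% is close to the KRAS value, which is consistent with a single population.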
- A report of results from genetic variant analysis (e.g., sequence variants, CNV, disease cell heterogeneity, and combinations thereof) may be provided by a report generator, for example to a healthcare practitioner, e.g., a physician, to aid the interpretation of the test results (e.g., data) and selection of treatment options. A report generated by a report generator may provide additional information, such as clinical lab results, that may be useful for diagnosing disease and selecting treatment options.
- Referring now to FIG. 9A, a system with a report generator 1 for reporting on, e.g., cancer test results and treatment options therefrom is schematically illustrated. The report generator system can be a central data processing system configured to establish communications directly with: a remote data site or lab 2, a medical practice/healthcare provider (treating professional) 4, and/or a patient/subject 6 through communication links. The lab 2 can be a medical laboratory, diagnostic laboratory, medical facility, medical practice, point-of-care testing device, or any other remote data site capable of generating subject clinical information. Subject clinical information includes but is not limited to laboratory test data, e.g., analysis of genetic variants; imaging and X-ray data; examination results; and diagnosis. The healthcare provider or practice 4 may include medical services providers, such as doctors, nurses, home health aides, technicians and physician's assistants, and the practice may be any medical care facility staffed with healthcare providers. In certain instances the healthcare provider/practice is also a remote data site. Where cancer is a disease to be treated, the subject may be afflicted with cancer, among other possible diseases or disorders.
- Other clinical information for a cancer subject 6 can include the results of laboratory tests, e.g., analysis of genetic variants, metabolic panel, complete blood count, etc.; medical imaging data; and/or medical procedures directed to diagnosing the condition, providing a prognosis, monitoring the progression of the disease, determining relapse or remission, or combinations thereof. The list of appropriate sources of clinical information for cancer includes, but is not limited to, CT scans, MRI scans, ultrasound scans, bone scans, PET scans, bone marrow tests, barium X-rays, endoscopies, lymphangiograms, IVU (intravenous urogram) or IVP (IV pyelogram), lumbar punctures, cystoscopy, immunological tests (anti-malignin antibody screen), and cancer marker tests.
- The subject 6's clinical information may be obtained from the lab 2 manually or automatically. Where simplicity of the system is desired, the information may be obtained automatically at predetermined or regular time intervals. A regular time interval can refer to a time interval at which the collection of the laboratory data is carried out automatically by the methods and systems described herein based on a measurement of time such as hours, days, weeks, months, years etc. In one embodiment, the collection of data and processing is carried out at least once a day. In one embodiment, the transfer and collection of data is carried out about any of monthly, biweekly, weekly, several times a week or daily. Alternatively the retrieval of information may be carried out at predetermined time intervals, which may not be regular time intervals. For instance, a first retrieval step may occur after one week and a second retrieval step may occur after one month. The transfer and collection of data can be customized according to the nature of the disorder that is being managed and the frequency of required testing and medical examinations of the subjects.
- FIG. 9B shows an exemplary process to generate genetic reports, including a tumor response map and associated summary of alterations. A tumor response map is a graphical representation of genetic information indicating changes over time in genetic information from a tumor, e.g., qualitative and quantitative changes. Such changes can reflect response of a subject to a therapeutic intervention. This process can reduce error rates and bias that may be orders of magnitude higher than what is required to reliably detect de novo genetic variants associated with cancer. The process can comprise first capturing genetic information by collecting body fluid samples as sources of genetic material (e.g., blood, saliva, sweat, urine, etc.). Then, the process can comprise sequencing the materials (11). For example, polynucleotides in a sample can be sequenced, producing a plurality of sequence reads. The tumor burden in a sample that comprises polynucleotides can be estimated as the relative number of sequence reads bearing a variant to the total number of sequence reads generated from the sample. Where copy number variants are analyzed, the tumor burden can be estimated as the relative excess (e.g., in the case of gene duplication) or relative deficit (e.g., in the case of gene elimination) of the total number of sequence reads at test and control loci. For example, a run may produce 1000 reads mapping to an oncogene locus of which 900 correspond to wild type and 100 correspond to a cancer mutant, indicating a copy number variant at this gene. More details on exemplary specimen collection and sequencing of the genetic materials are discussed below in FIGS. 10-11.
- Next, genetic information can be processed (12). Genetic variants can then be identified. The process can comprise determining the frequency of genetic variants in the sample containing the genetic material. The process can comprise separating information from noise (13) if this process is noisy.
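- The two read-count estimates of tumor burden described above for the process of FIG. 9B can be sketched in Python as follows. The function names are hypothetical, and the test and control read counts in the copy number example are invented for illustration.

```python
def variant_read_fraction(variant_reads: int, total_reads: int) -> float:
    """Reads bearing a variant relative to all reads mapping to the locus."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def relative_copy_change(test_reads: int, control_reads: int) -> float:
    """Relative excess (positive) or deficit (negative) of reads at a test locus."""
    if control_reads <= 0:
        raise ValueError("control_reads must be positive")
    return (test_reads - control_reads) / control_reads


# Example from the text: 1000 reads at a locus, 100 of them mutant.
burden_from_variant = variant_read_fraction(100, 1000)

# Hypothetical counts: 1500 reads at a test locus vs 1000 at a control locus.
burden_from_cnv = relative_copy_change(1500, 1000)
```

For the example in the text the variant-based estimate is 0.1, that is, 10% of reads at the locus.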
- The sequencing methods for genetic analysis may have error rates. For example, the MiSeq system of Illumina can produce percent error rates in the low single digits. For 1000 sequence reads mapping to a locus, about 50 reads (about 5%) may be expected to include errors. Certain methodologies, such as those described in WO 2014/149134, can significantly reduce the error rate. Errors create noise that can obscure signals from cancer present at low levels in a sample. For example, if a sample has a tumor burden at a level around the sequencing system error rate, e.g., around 0.1%-5%, it may be difficult to distinguish a signal corresponding to a genetic variant due to cancer from one due to noise.
- Analysis of genetic variants may be used for diagnosing in the presence of noise. The analysis can be based on the frequency of Sequence Variants or Level of CNV (14) and a diagnosis confidence indication or level for detecting genetic variants in the noise range can be established (15).
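- One possible way to attach a confidence indication to a variant observed in the noise range is sketched below in Python. It models sequencing errors at a locus as a binomial process, which is an assumption made for this illustration and not a model stated in the text, and asks how likely it is that errors alone would produce at least the observed number of variant reads. The function name and the read counts in the example are hypothetical.

```python
from math import comb


def prob_at_least(k: int, n: int, error_rate: float) -> float:
    """P(X >= k) for X ~ Binomial(n, error_rate), by summing the lower tail."""
    lower_tail = sum(
        comb(n, i) * error_rate ** i * (1 - error_rate) ** (n - i)
        for i in range(k)
    )
    return 1.0 - lower_tail


# With 1000 reads and a 5% error rate, about 50 erroneous reads are expected.
expected_error_reads = 1000 * 0.05

# How surprising would 60 or 100 variant reads be if only errors were present?
p_60 = prob_at_least(60, 1000, 0.05)
p_100 = prob_at_least(100, 1000, 0.05)
```

A small probability suggests that the observation is unlikely to be explained by noise alone. Treating every error as producing the same alternate base overstates the noise at any one variant, so this sketch is conservative.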
- Next, the process can comprise increasing the diagnosis confidence. This can be done using a plurality of measurements to increase confidence of diagnosis (16), or alternatively using measurements at a plurality of time points to determine whether cancer is advancing, in remission or stabilized (17). The diagnostic confidence can be used to identify disease states. For example, cell free polynucleotides taken from a subject can include polynucleotides derived from normal cells, as well as polynucleotides derived from diseased cells, such as cancer cells. Polynucleotides from cancer cells may bear genetic variants, such as somatic cell mutations and copy number variants. When cell free polynucleotides from a sample from a subject are sequenced, these cancer polynucleotides are detected as sequence variants or as copy number variants.
- Measurements of a parameter, whether or not they are in the noise range, may be provided with a confidence interval. Tested over time, one can determine whether a cancer is advancing, stabilized or in remission by comparing confidence intervals over time. When confidence intervals overlap, one may not be able to tell whether disease is increasing or decreasing, because there is no statistically significant difference between the measures. However, where the confidence intervals do not overlap, this indicates the direction of disease. For example, comparing the lowest point on a confidence interval at one time point and the highest point on a confidence interval at a second time point indicates the direction.
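- The comparison of confidence intervals over time can be sketched in Python as follows. The text does not specify how the intervals are computed; this illustration uses the Wilson score interval for a proportion as one common choice, and the function names and read counts are hypothetical.

```python
from math import sqrt


def wilson_interval(variant_reads: int, total_reads: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a variant read fraction."""
    n = total_reads
    p = variant_reads / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


def direction(earlier, later):
    """Compare two (low, high) intervals from successive time points."""
    if later[0] > earlier[1]:
        return "increasing"
    if later[1] < earlier[0]:
        return "decreasing"
    return "indeterminate"  # intervals overlap


# Hypothetical counts: 50 of 1000 reads at time 1, then 10 or 55 of 1000 at time 2.
first = wilson_interval(50, 1000)
clear_drop = direction(first, wilson_interval(10, 1000))
unclear = direction(first, wilson_interval(55, 1000))
```

As in the text, the direction is called only when the lowest point of one interval lies above the highest point of the other; overlapping intervals are reported as indeterminate.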
- Next, the process can comprise generating a genetic report/diagnosis. The process can comprise generating a genetic graph for a plurality of measurements showing the mutation trend (18) and generating a report showing treatment results and options (19).
-
FIGS. 10A-10C show in more details one embodiment for generating genetic reports and diagnosis (e.g., Report/Diagnosis). In one implementation,FIG. 10C shows an exemplary pseudo-code executed by the system ofFIG. 9A to process non-CNV reported mutant allele frequencies. However, the system can process CNV reported mutant allele frequencies as well. - Samples comprising genetic material, such as cfDNA, can be collected from a subject at a plurality of time points, that is, serially. The genetic material can be sequenced, e.g., using a high-throughput sequencing system. Sequencing can target loci of interest to detect genetic variants, such genes bearing somatic mutations, genes that undergo copy number variation, or genes involved in gene fusions, for example, in cancer. At each time point, a quantitative measure of the genetic variants found can be determined. For example, in the case of cfDNA, the quantitative measure can be the frequency or percentage of a genetic variant among polynucleotides mapping to a locus, or the absolute number of sequence reads or polynucleotides mapping to a locus. Genetic variants having a non-zero quantity at at least one time point can then be represented graphically through all time points. For example, in a collection of 1000 sequences,
variant 1 may be found attime points Variant 2 may be found inamounts variant 1, to 5%, 3% and 0%, and, forvariant height 1 mm. So, for example, in this case the heights would be at time point 1: heights 5 mm (variant 1) and 0 mm (variant 2); at time point 2:heights 3 mm (variant 1) and 1 mm (variant 2), at time point 3:heights 0 mm (variant 1) and 2 mm (variant 2). The graphical representation can be in the form of a stacked area graph, such as a streamgraph. A “zero” time point (before the first time point) can be represented by a point, with all values at 0. The height of the quantity of the variants in the graphical representation can be, for example, relative or proportional to each other. For example, a variant frequency 5% at one time point could be represented with a height of twice that of a variant with frequency of 2.5% at the same time point. The order of stacking can be chosen for ease of understanding. For example, variants can be stacked in order of quantity high to low from bottom to top. Or, they can be stacked in a streamgraph with the variant of largest initial amount in the middle, and other variants of decreasing quantity on either side. In certain embodiments, the areas can be color coded based on variant. Variants in the same gene can be shown in different hues of the same color. For example, KRAS mutants can be shown in different shades of blue, EGFR mutants in different shades of red. - Turning now to
FIG. 10A , the process can comprise receiving genetic information from a DNA sequencer (30). The process can then comprise determining specific gene alterations and quantities thereof (32). - Next, a tumor response map is generated. To generate the map, the process can comprise normalizing the quantities for each gene alteration for rendering across all test points and then generates a scaling factor (34). As used herein, the term “normalize” generally refers to means adjusting values measured on different scales to a notionally common scale. For example, data measured at different points are converted/adjusted so that all values can be resized to a common scale. As used herein, the term “scaling factor” generally refers to a number which scales, or multiplies, some quantity. For example, in the equation y=Cx, C is the scale factor for x. C is also the coefficient of x, and may be called the constant of proportionality of y to x. The values are normalized to allow plotting on a common scale that is visually-friendly. And the scaling factor is used to know the exact heights that correspond to the values to be plotted (e.g. 10% mutant allele frequency may represent 1 cm on the report wherein the total height is 10 cm). The scaling factor is applied to all test points and thus is considered to be a universal scaling factor. For each test point, the process can comprise rendering information on a tumor response map (36). In
operation 36, the process can comprise rendering alterations and relative heights using the determined scaling factor (38) and assigning a unique visual indicator for each alteration (40). In addition to the response map, the process can comprise generating a summary of alterations and treatment options (42). Also, information from clinical trials that may be relevant to the particular genetic alterations, other helpful treatment suggestions, explanations of terminology, test methodology, and other information are added to the report and rendered for the user. - In one implementation, the copy number variation may be reported as a graph, indicating various positions in the genome and a corresponding increase, decrease or maintenance of copy number at each respective position. Additionally, copy number variation may be used to report a percentage score indicating how much disease material (or nucleic acids having a copy number variation) exists in the cell free polynucleotide sample.
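- As a minimal sketch of the universal scaling factor described for FIG. 10A, the Python below converts mutant allele frequencies at several test points into plot heights. The function names, the dictionary layout and the choice of 100% as full scale are illustrative assumptions; only the correspondence of a 10% frequency to 1 cm on a 10 cm chart comes from the description.

```python
from typing import Dict, List


def universal_scaling_factor(chart_height_cm: float,
                             full_scale_percent: float = 100.0) -> float:
    """Return centimeters of chart height per percentage point of frequency.

    Assumption: the full chart height corresponds to `full_scale_percent`.
    """
    if full_scale_percent <= 0:
        raise ValueError("full_scale_percent must be positive")
    return chart_height_cm / full_scale_percent


def heights_cm(test_points: List[Dict[str, float]],
               factor: float) -> List[Dict[str, float]]:
    """Apply the same factor to every test point (y = C * x)."""
    return [{name: pct * factor for name, pct in point.items()}
            for point in test_points]


# Frequencies in percent at three test points (illustrative values).
test_points = [
    {"variant 1": 5.0, "variant 2": 0.0},
    {"variant 1": 3.0, "variant 2": 1.0},
    {"variant 1": 0.0, "variant 2": 2.0},
]
factor = universal_scaling_factor(chart_height_cm=10.0)
heights = heights_cm(test_points, factor)
```

Because one factor is shared by every test point, heights remain comparable from panel to panel, which is the reason the description calls the factor universal.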
- In another embodiment, the report includes annotations to help physicians interpret the results and recommend treatment options. The annotating can include annotating a report for a condition in the NCCN Clinical Practice Guidelines in Oncology™ or the American Society of Clinical Oncology (ASCO) clinical practice guidelines. The annotating can include listing one or more FDA-approved drugs for off-label use, one or more drugs listed in a Centers for Medicare and Medicaid Services (CMS) anti-cancer treatment compendia, and/or one or more experimental drugs found in scientific literature, in the report. The annotating can include connecting a listed drug treatment option to a reference containing scientific information regarding the drug treatment option. The scientific information can be from a peer-reviewed article from a medical journal. The annotating can include providing a link to information on a clinical trial for a drug treatment option in the report. The annotating can include presenting information in a pop-up box or fly-over box near provided drug treatment options in an electronic based report. The annotating can include adding information to a report selected from the group consisting of one or more drug treatment options, scientific information concerning one or more drug treatment options, one or more links to scientific information regarding one or more drug treatment options, one or more links to citations for scientific information regarding one or more drug treatment options, and clinical trial information regarding one or more drug treatment options.
-
FIG. 10B shows an exemplary process to generate a tumor response map pathway which may be used by a healthcare practitioner, e.g., a physician, for example to make patient care decisions. In this embodiment, the process can comprise first determining a global scaling factor (43). In one embodiment, for all non-CNV (copy number variation) reported mutant allele frequencies, the process can comprise transforming the absolute value into a relative metric/scale that may be more amenable for plotting (e.g., multiplying the mutant allele frequency by 100 and taking the log of that value) and determining a global scaling factor using the maximum observed value. The process then involves visualizing information from the earliest test dataset (44). Visualizing can comprise graphically representing the information on a user interface (e.g., a computer screen) or in tangible form (e.g., on a piece of paper). For each non-CNV alteration, the process can comprise multiplying the scaling factor by the transformed value for each gene, using the result as a quantity indicator for plotting that variant, and then assigning a color or other unique visual indicator to each alteration. Then the process can comprise visualizing information for subsequent test points (45) using the following pseudo-code: - If the composition of test results is unchanged, continue the prior panel date visual in the new panel
- If alterations remain the same, but quantities have changed
-
- Recompute the quantity indicator for plotting that variant and re-plot all updated values in existing panel(s) and new panel for the latest test date.
- If new alterations are added
-
- Add the alterations to the top of all existing alterations
- Compute transform values
- Recompute scaling factor
- Re-draw the response map, re-plotting alterations in the prior test date that are still detected in current test date as well as newly emerging alterations
- If prior existing alteration is not among the set of detected alterations
-
- Use a height of zero and plot the quantity of the alteration for all subsequent test dates
- Still include its color in the set of unavailable colors
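- The pseudo-code above can be sketched in Python as follows. This is a simplified illustration rather than the disclosed implementation: it re-draws every panel from the full test history instead of updating panels incrementally, it assumes frequencies are given in percent, it assumes a base-10 logarithm, and it clamps negative transformed values to zero. The names `transform` and `build_response_map` are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def transform(maf_percent: float) -> float:
    """Log-scale a mutant allele frequency given in percent (base 10 assumed)."""
    if maf_percent <= 0:
        return 0.0  # an undetected alteration is plotted with a height of zero
    # Assumption: clamp at zero so tiny frequencies never plot as negative heights.
    return max(0.0, math.log10(maf_percent * 100.0))


def build_response_map(tests: List[Dict[str, float]],
                       chart_height: float = 10.0
                       ) -> Tuple[List[str], Dict[str, int], List[Dict[str, float]]]:
    """Re-draw every panel from the full test history (hypothetical helper)."""
    order: List[str] = []
    for test in tests:
        for alteration in test:
            if alteration not in order:
                order.append(alteration)  # new alterations stack on top
    # A color index stays reserved even after its alteration disappears.
    colors = {alteration: index for index, alteration in enumerate(order)}
    # Global scaling factor from the maximum transformed value in any test.
    peak = max((transform(v) for t in tests for v in t.values()), default=0.0)
    factor = chart_height / peak if peak > 0 else 0.0
    panels = [{a: factor * transform(t.get(a, 0.0)) for a in order}
              for t in tests]
    return order, colors, panels


# Illustrative frequencies in percent for three test dates.
tests = [
    {"alteration A": 5.0},
    {"alteration A": 3.0, "alteration B": 1.0},
    {"alteration B": 2.0},
]
order, colors, panels = build_response_map(tests)
```

In this sketch an alteration that is no longer detected keeps its place in the stack and its color, and is simply drawn with zero height at later test dates.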
- Each subsequent panel denoting a test date may also include additional patient or intervention information that may correlate with the alteration changes seen in the remainder of the map. Similar scaling, plotting, and transformation may be also implemented on CNV and other types of DNA alterations (e.g. methylation) to display these quantities in separate or combined charts. These additional annotations may themselves also be quantifiable and similarly plotted on the map.
- The process can then comprise determining a summary of alterations and treatment options (46). In one embodiment, for the alteration with the maximum mutant allele frequencies, the following actions are done:
-
- Report all alterations for that gene in decreasing mutant allele frequency order of non-CNV alterations
- Report all CNV alterations for that gene in decreasing order of CNV value
- Repeat for next gene with next highest non-CNV mutant allele frequency not yet reported
- For each reported alteration, the process can comprise including a trend indicator for that alteration over the different test date points.
- Grouping of maximum mutant allele frequencies may also extend beyond just the genes they are harbored in to greater encapsulating annotations such as biological pathways, evidence level, etc.
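- The ordering of the summary described above can be sketched as a two-level sort. The `Alteration` record and `summary_order` function are hypothetical names, the gene labels and values are illustrative, and placing genes that have only CNV alterations at the end is an assumption the description does not address.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Alteration:
    gene: str
    name: str
    is_cnv: bool
    value: float  # mutant allele frequency (non-CNV) or CNV value (CNV)


def summary_order(alterations: List[Alteration]) -> List[Alteration]:
    """Order alterations gene by gene, led by the highest non-CNV frequency."""

    def top_maf(gene: str) -> float:
        # Highest non-CNV frequency in the gene; -inf if the gene has none.
        return max((a.value for a in alterations
                    if a.gene == gene and not a.is_cnv),
                   default=float("-inf"))

    genes = sorted({a.gene for a in alterations},
                   key=lambda g: (-top_maf(g), g))
    report: List[Alteration] = []
    for gene in genes:
        members = [a for a in alterations if a.gene == gene]
        # Non-CNV alterations first, in decreasing frequency order.
        report += sorted((a for a in members if not a.is_cnv),
                         key=lambda a: -a.value)
        # Then CNV alterations, in decreasing CNV value order.
        report += sorted((a for a in members if a.is_cnv),
                         key=lambda a: -a.value)
    return report


found = [
    Alteration("EGFR", "T790M", False, 2.0),
    Alteration("KRAS", "variant b", False, 5.0),
    Alteration("EGFR", "variant a", False, 8.0),
    Alteration("EGFR", "amplification", True, 3.5),
    Alteration("GENE_X", "amplification", True, 4.0),
]
ordered = [(a.gene, a.name) for a in summary_order(found)]
```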
-
FIGS. 10D-10I show one exemplary report generated by the system of FIG. 9A. In FIG. 10D, a patient identification section 52 provides patient information, reporting date, and physician contact information. A tumor response map 54 includes a modified streamgraph 56 that shows tumor activities with unique colors for each mutant gene. The graph 56 has an accompanying summary explanation textbox 58. More details are provided in a summary of alterations and treatment option section 60. The alterations section 60 lists each alteration along with mutation trend, mutant allele frequency, cell-free amplification, FDA Approved Drug Indication, FDA Approved Drugs with other Indications, and Clinical Drug Trial information. FIGS. 10D-1, 10D-2, and 10D-3 provide enlarged views of FIG. 10D. -
FIG. 10E shows an exemplary report section providing definitions, comments, and interpretation of the tests. FIGS. 10E-1 and 10E-2 provide enlarged views of FIG. 10E. FIG. 10F shows an exemplary detailed therapy result portion of the report. FIGS. 10F-1 and 10F-2 provide enlarged views of FIG. 10F. FIG. 10G shows an exemplary discussion of the clinical relevance of detected alterations. FIGS. 10G-1 and 10G-2 provide enlarged views of FIG. 10G. FIG. 10H shows potentially available medications that are going through clinical trials. FIG. 10I shows the test methods and limitations thereof. FIGS. 10I-1 and 10I-2 provide enlarged views of FIG. 10I. -
FIGS. 10J-10P show various exemplary modified streamgraphs 56. A streamgraph, or stream graph, is a type of stacked area graph which is displaced around a central axis, resulting in a flowing, organic shape. Streamgraphs are a generalization of stacked area graphs where the baseline is free. By shifting the baseline, it is possible to minimize the change in slope (or “wiggle”) in individual series, thereby making it easier to perceive the thickness of any given layer across the data. - For example,
FIG. 10J shows seven layers representing at least 8 mutants over three time periods, and a “0” time point (all values “0”). FIG. 10K shows a single mutant over 4 time periods. No mutants are detected at the second, third and fourth time points. FIG. 10L indicates the frequency of the dominant allele at each time point. FIG. 10M shows a single time point with a total of four mutants in two genes. Mutants are identified by the amino acid changed at a position (e.g., EGFR T790M). - One embodiment renders a streamgraph so that it is not x-axis reflective. The modified graph applies a unique scaling to denote proportional attributes. The graph can indicate the addition of new attributes over time. The presence or absence of a mutation may be reflected in graphical form, indicating various positions in the genome and a corresponding increase or decrease or maintenance of a frequency of mutation at each respective position. Additionally, mutations may be used to report a percentage score indicating how much disease material exists in the cell free polynucleotide sample. A confidence score may accompany each detected mutation, given known statistics of typical variances at reported positions in non-disease reference sequences. Mutations may also be ranked in order of abundance in the subject or ranked by clinically actionable importance.
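- To illustrate how the choice of baseline distinguishes a conventional streamgraph from a graph that is not reflected about the x-axis, the sketch below computes the lower and upper boundary of each layer at each time point. The function name `stack_layers` is hypothetical, and treating the modified streamgraph as a zero-baseline stack is an interpretation, not a statement of the disclosed rendering.

```python
from typing import List, Tuple


def stack_layers(layers: List[List[float]],
                 centered: bool = False) -> List[List[Tuple[float, float]]]:
    """Return (lower, upper) boundaries per layer and time point.

    centered=False stacks upward from zero, so nothing is drawn below the
    x-axis. centered=True shifts the baseline to minus half the total,
    which makes the silhouette symmetric about the x-axis.
    """
    n_points = len(layers[0])
    totals = [sum(layer[t] for layer in layers) for t in range(n_points)]
    cursor = [(-total / 2 if centered else 0.0) for total in totals]
    bounds: List[List[Tuple[float, float]]] = []
    for layer in layers:
        bounds.append([(cursor[t], cursor[t] + layer[t])
                       for t in range(n_points)])
        cursor = [cursor[t] + layer[t] for t in range(n_points)]
    return bounds


# Heights in mm for two variants at three time points (illustrative values).
layers = [[5.0, 3.0, 0.0], [0.0, 1.0, 2.0]]
zero_based = stack_layers(layers)
symmetric = stack_layers(layers, centered=True)
```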
- The mapping of genome positions and copy number variation for the subject with cancer can indicate that a particular cancer is aggressive and resistant to treatment. The subject may be monitored for a period and retested. If at the end of the period, the copy number variation profile, e.g., as depicted in a tumor response map, begins to increase dramatically, this may indicate that the current treatment is not working. A comparison can also be done with genetic profiles of other subjects. For example, if it is determined that this increase in copy number variation indicates that the cancer is advancing, then the original treatment regimen as prescribed is no longer treating the cancer and a new treatment is prescribed.
- These reports can be submitted and accessed electronically via the internet. Analysis of sequence data may occur at a site other than the location of the subject. The report can be generated and transmitted to the subject's location. Via an internet enabled computer, the subject may access the reports reflecting his tumor burden.
- Next, details of exemplary gene testing processes are disclosed. Turning now to
FIG. 11A, an exemplary process receives genetic materials from a blood sample or other body samples (1102). The process can comprise converting the polynucleotides from the genetic materials into tagged parent polynucleotides (1104). The tagged parent polynucleotides are amplified to produce amplified progeny polynucleotides (1106). A subset of the amplified polynucleotides is sequenced to produce sequence reads (1108), which are grouped into families, each generated from a unique tagged parent polynucleotide (1110). At a selected locus, the process can comprise assigning each family a confidence score (1112). Next, a consensus is determined using prior readings. This is done by reviewing prior confidence scores for each family; if consistent prior confidence scores exist, the current confidence score is increased (1114). If there are prior confidence scores, but they are inconsistent, the current confidence score is not modified in one embodiment (1116). In other embodiments, the confidence score is adjusted in a predetermined manner for inconsistent prior confidence scores. If this is the first time the family is detected, the current confidence score can be reduced, as it may be a false reading (1118). The process can comprise inferring the frequency of the family at the locus in the set of tagged parent polynucleotides based on the confidence score. Then genetic test reports are generated as discussed above (1120). - While temporal information has been used in
FIGS. 11A-11B to enhance the information for mutation or copy number variation detection, other consensus methods can be applied. In other embodiments, the historical comparison can be used in conjunction with other consensus sequences mapping to a particular reference sequence to detect instances of genetic variation. Consensus sequences mapping to particular reference sequences can be measured and normalized against control samples. Measures of molecules mapping to reference sequences can be compared across a genome to identify areas in the genome in which copy number varies, or heterozygosity is lost. Consensus methods include, for example, linear or non-linear methods of building consensus sequences (e.g., voting, averaging, statistical, maximum a posteriori or maximum likelihood detection, dynamic programming, Bayesian, hidden Markov or support vector machine methods, etc.) derived from digital communication theory, information theory, or bioinformatics. After the sequence read coverage has been determined, a stochastic modeling algorithm is applied to convert the normalized nucleic acid sequence read coverage for each window region to the discrete copy number states. In some cases, this algorithm may comprise one or more of the following: Hidden Markov Model, dynamic programming, support vector machine, Bayesian network, trellis decoding, Viterbi decoding, expectation maximization, Kalman filtering methodologies and neural networks. - As depicted in
FIG. 11B , a comparison of sequence coverage to a control sample or reference sequence may aid in normalization across windows. In this embodiment, cell free DNAs are extracted and isolated from a readily accessible bodily fluid such as blood, sweat, saliva, urine, etc. For example, cell free DNAs can be extracted using a variety of methods known in the art, including but not limited to isopropanol precipitation and/or silica based purification. Cell free DNAs may be extracted from any number of subjects, such as subjects without cancer, subjects at risk for cancer, or subjects known to have cancer (e.g. through other means). - Following the isolation/extraction step, any of a number of different sequencing operations may be performed on the cell free polynucleotide sample. Samples may be processed before sequencing with one or more reagents (e.g., enzymes, unique identifiers (e.g., barcodes), probes, etc.). In some cases if the sample is processed with a unique identifier such as a barcode, the samples or fragments of samples may be tagged individually or in subgroups with the unique identifier. The tagged sample may then be used in a downstream application such as a sequencing reaction and individual molecules may be tracked to parent molecules.
- The cell free polynucleotides can be tagged or tracked in order to permit subsequent identification and origin of the particular polynucleotide. The assignment of an identifier to individual or subgroups of polynucleotides may allow for a unique identity to be assigned to individual sequences or fragments of sequences. This may allow acquisition of data from individual samples and is not limited to averages of samples. In some examples, nucleic acids or other molecules derived from a single strand may share a common tag or identifier and therefore may be later identified as being derived from that strand. Similarly, all of the fragments from a single strand of nucleic acid may be tagged with the same identifier or tag, thereby permitting subsequent identification of fragments from the parent strand. In other cases, gene expression products (e.g., mRNA) may be tagged in order to quantify expression. A barcode or barcode in combination with sequence to which it is attached can be counted. In still other cases, the systems and methods can be used as a PCR amplification control. In such cases, multiple amplification products from a PCR reaction can be tagged with the same tag or identifier. If the products are later sequenced and demonstrate sequence differences, differences among products with the same identifier can then be attributed to PCR error. Additionally, individual sequences may be identified based upon characteristics of sequence data for the read themselves. For example, the detection of unique sequence data at the beginning (start) and end (stop) portions of individual sequencing reads may be used, alone or in combination, with the length, or number of base pairs of each sequence read to assign unique identities to individual molecules. Fragments from a single strand of nucleic acid, having been assigned a unique identity, may thereby permit subsequent identification of fragments from the parent strand. 
This can be used in conjunction with bottlenecking the initial starting genetic material to limit diversity.
- Further, unique sequence data at the beginning (start) and end (stop) portions of individual sequencing reads and sequencing read length may be used, alone or in combination, with barcodes. In some cases, the barcodes may be unique as described herein. In other cases, the barcodes themselves may not be unique. In this case, the use of non-unique barcodes, in combination with sequence data at the beginning (start) and end (stop) portions of individual sequencing reads and sequencing read length, may allow for the assignment of a unique identity to individual sequences. Similarly, fragments from a single strand of nucleic acid having been assigned a unique identity may thereby permit subsequent identification of fragments from the parent strand.
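- As a hedged illustration of combining a non-unique barcode with start and stop positions, the sketch below groups reads into families keyed by (barcode, start, end) and takes a per-position majority vote within each family. The `Read` record, the function names and the majority-vote consensus are assumptions chosen for illustration; read length is implied by the start and end coordinates.

```python
from collections import Counter, defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Read(NamedTuple):
    barcode: str   # may be shared by unrelated molecules
    start: int     # mapped start position
    end: int       # mapped stop position
    sequence: str


FamilyKey = Tuple[str, int, int]


def group_into_families(reads: List[Read]) -> Dict[FamilyKey, List[Read]]:
    """Group reads that share a barcode and the same start/stop positions."""
    families: Dict[FamilyKey, List[Read]] = defaultdict(list)
    for read in reads:
        families[(read.barcode, read.start, read.end)].append(read)
    return dict(families)


def consensus(family: List[Read]) -> str:
    """Majority base at each position; minority bases are treated as errors."""
    length = min(len(read.sequence) for read in family)
    return "".join(
        Counter(read.sequence[i] for read in family).most_common(1)[0][0]
        for i in range(length)
    )


reads = [
    Read("AA", 100, 104, "ACGT"),
    Read("AA", 100, 104, "ACGT"),
    Read("AA", 100, 104, "ACTT"),  # differs at one base, e.g. a PCR error
    Read("AA", 200, 204, "GGCC"),  # same barcode, different molecule
]
families = group_into_families(reads)
calls = {key: consensus(members) for key, members in families.items()}
```

Here the barcode alone would merge two distinct molecules, while the added start and stop coordinates separate them into two families.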
- Generally, the methods and systems provided herein are useful for preparation of cell free polynucleotide sequences for a downstream sequencing reaction. Often, the sequencing method is classic Sanger sequencing. Sequencing methods may include, but are not limited to: high-throughput sequencing, pyrosequencing, sequencing-by-synthesis, single-molecule sequencing, nanopore sequencing, semiconductor sequencing, sequencing-by-ligation, sequencing-by-hybridization, RNA-Seq (Illumina), Digital Gene Expression (Helicos), Next generation sequencing, Single Molecule Sequencing by Synthesis (SMSS) (Helicos), massively-parallel sequencing, Clonal Single Molecule Array (Solexa), shotgun sequencing, Maxam-Gilbert sequencing, primer walking, and any other sequencing methods known in the art.
- Sequencing methods typically involve sample preparation, sequencing of polynucleotides in the prepared sample to produce sequence reads, and bioinformatic manipulation of the sequence reads to produce quantitative and/or qualitative genetic information about the sample. Sample preparation typically involves converting polynucleotides in a sample into a form compatible with the sequencing platform used. This conversion can involve tagging polynucleotides. In certain embodiments of this invention the tags comprise polynucleotide sequence tags. Conversion methodologies used in sequencing may not be 100% efficient. For example, it is not uncommon to convert polynucleotides in a sample with a conversion efficiency of about 1-5%, that is, about 1-5% of the polynucleotides in a sample are converted into tagged polynucleotides. Polynucleotides that are not converted into tagged molecules are not represented in a tagged library for sequencing. Accordingly, polynucleotides having genetic variants represented at low frequency in the initial genetic material may not be represented in the tagged library and, therefore, may not be sequenced or detected. By increasing conversion efficiency, the probability that a polynucleotide in the initial genetic material will be represented in the tagged library and, consequently, detected by sequencing is increased. Furthermore, rather than directly addressing the low conversion efficiency of library preparation, most protocols to date call for greater than 1 microgram of DNA as input material. However, when input sample material is limited or detection of polynucleotides with low representation is desired, high conversion efficiency is needed to sequence the sample efficiently and/or to detect such polynucleotides adequately.
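- The effect of conversion efficiency on low-frequency variants can be made concrete with a simple probability model. The sketch below assumes that each molecule is converted independently with the same probability, which is a simplification introduced here and not a statement from the description; the function name and the copy counts are illustrative.

```python
def representation_probability(variant_copies: int,
                               conversion_efficiency: float) -> float:
    """Probability that at least one variant molecule reaches the tagged library.

    Assumes independent conversion of each molecule with equal probability.
    """
    if not 0.0 <= conversion_efficiency <= 1.0:
        raise ValueError("conversion_efficiency must be between 0 and 1")
    if variant_copies < 0:
        raise ValueError("variant_copies must be non-negative")
    return 1.0 - (1.0 - conversion_efficiency) ** variant_copies


# Ten variant molecules in the input, at low and at high conversion efficiency.
low = representation_probability(10, 0.05)
high = representation_probability(10, 0.50)
```

Under this model, ten variant copies are more likely to be lost than represented at 5% efficiency, while at 50% efficiency they are almost always represented.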
- Generally, mutation detection may be performed on selectively enriched regions of the genome or transcriptome purified and isolated (1302). As described herein, specific regions, which may include but are not limited to genes, oncogenes, tumor suppressor genes, promoters, regulatory sequence elements, non-coding regions, miRNAs, snRNAs and the like may be selectively amplified from a total population of cell free polynucleotides. This may be performed as herein described. In one example, multiplex sequencing may be used, with or without barcode labels for individual polynucleotide sequences. In other examples, sequencing may be performed using any nucleic acid sequencing platforms known in the art. This step generates a plurality of genomic fragment sequence reads (1304). Additionally, a reference sequence is obtained from a control sample, taken from another subject. In some cases, the control subject may be a subject known to not have known genetic aberrations or disease. In some cases, these sequence reads may contain barcode information. In other examples, barcodes are not utilized.
- After sequencing, reads can be assigned a quality score. A quality score may be a representation of reads that indicates whether those reads may be useful in subsequent analysis based on a threshold. In some cases, some reads are not of sufficient quality or length to perform the subsequent mapping step. Sequencing reads with a quality score less than 90%, 95%, 99%, 99.9%, 99.99% or 99.999% may be filtered out of the data set. In
step 1306, the genomic fragment reads that meet a specified quality score threshold are mapped to a reference genome, or a reference sequence that is known not to contain mutations. After mapping alignment, sequence reads are assigned a mapping score. A mapping score may be a representation of reads mapped back to the reference sequence indicating whether each position is or is not uniquely mappable. In some instances, reads may be sequences unrelated to mutation analysis. For example, some sequence reads may originate from contaminant polynucleotides. Sequencing reads with a mapping score less than 90%, 95%, 99%, 99.9%, 99.99% or 99.999% may be filtered out of the data set. - For each mappable base, bases that do not meet the minimum threshold for mappability, or low quality bases, may be replaced by the corresponding bases as found in the reference sequence.
- The frequency of variant bases may be calculated as the number of reads containing the variant divided by the total number of reads (1308), after ascertaining read coverage and identifying variant bases relative to the control sequence in each read. This may be expressed as a ratio for each mappable position in the genome. - For each base position, the frequencies of all four nucleotides (cytosine, guanine, thymine and adenine) can be analyzed in comparison to the reference sequence. A stochastic or statistical modeling algorithm can be applied to convert the normalized ratios for each mappable position to reflect frequency states for each base variant. In some cases, this algorithm may comprise one or more of the following: Hidden Markov Model, dynamic programming, support vector machine, Bayesian or probabilistic modeling, trellis decoding, Viterbi decoding, expectation maximization, Kalman filtering methodologies, and neural networks.
- The discrete mutation states of each base position can be utilized to identify a base variant with high frequency of variance as compared to the baseline of the reference sequence. In some cases, the baseline might represent a frequency of at least 0.0001%, 0.001%, 0.01%, 0.1%, 1.0%, 2.0%, 3.0%, 4.0%, 5.0%, 10%, or 25%. In some cases, all adjacent base positions with the base variant or mutation can be merged into a segment to report the presence or absence of a mutation. In some cases, various positions can be filtered before they are merged with other segments.
- After calculation of frequencies of variance for each base position, the variant with largest deviation for a specific position in the sequence derived from the subject as compared to the reference sequence can be identified as a mutation. In some cases, a mutation may be a cancer mutation. In other cases, a mutation might be correlated with a disease state.
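- A minimal sketch of the frequency calculation and the selection of the variant with the largest deviation at one position is given below. It works on a string of the bases observed at that position (a hypothetical, simplified pileup), and the function names, the example counts and the baseline value are illustrative assumptions. The stochastic modeling step is not shown.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def base_frequencies(pileup: str) -> Dict[str, float]:
    """Frequency of each of the four nucleotides among reads at one position."""
    counts = Counter(pileup)
    total = sum(counts[base] for base in "ACGT")
    return {base: (counts[base] / total if total else 0.0) for base in "ACGT"}


def call_variant(pileup: str, reference_base: str,
                 baseline: float) -> Optional[Tuple[str, float]]:
    """Return the non-reference base with the highest frequency, if it
    reaches the baseline; otherwise return None."""
    frequencies = base_frequencies(pileup)
    candidates = {base: freq for base, freq in frequencies.items()
                  if base != reference_base}
    base, freq = max(candidates.items(), key=lambda item: item[1])
    return (base, freq) if freq >= baseline else None


# 100 reads cover the position: 95 match the reference base A.
pileup = "A" * 95 + "T" * 4 + "G"
called = call_variant(pileup, reference_base="A", baseline=0.01)
rejected = call_variant(pileup, reference_base="A", baseline=0.05)
```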
- A mutation or variant may comprise a genetic aberration that includes, but is not limited to, a single base substitution, small indels, transversions, translocations, inversions, deletions, truncations or gene truncations. In some cases, a mutation may be at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15 or 20 nucleotides in length. In other cases, a mutation may be at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15 or 20 nucleotides in length.
- Next, a consensus is determined using prior readings. This is done by reviewing prior confidence scores for the corresponding bases; if consistent prior confidence scores exist, the current confidence score is increased (1314). If there are prior confidence scores, but they are inconsistent, the current confidence score is not modified in one embodiment (1316). In other embodiments, the confidence score is adjusted in a predetermined manner for inconsistent prior confidence scores. If this is the first time the family is detected, the current confidence score can be reduced, as it may be a false reading (1318). The process can then comprise converting the frequency of variance for each base into discrete variant states for each base position (1320).
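- The three-way adjustment of the confidence score using prior readings can be sketched as below. The description does not define what makes prior scores consistent or by how much a score changes, so the spread tolerance, the boost and the penalty are all illustrative assumptions, as is the function name.

```python
from typing import List


def adjust_confidence(current: float, prior: List[float],
                      boost: float = 0.1, penalty: float = 0.1,
                      tolerance: float = 0.05) -> float:
    """Adjust a confidence score in [0, 1] using scores from earlier tests.

    Assumption: prior scores count as consistent when their spread
    (max minus min) is within `tolerance`.
    """
    if not prior:
        adjusted = current - penalty   # first detection may be a false reading
    elif max(prior) - min(prior) <= tolerance:
        adjusted = current + boost     # consistent history raises confidence
    else:
        adjusted = current             # inconsistent history leaves it unchanged
    return min(1.0, max(0.0, adjusted))


first_seen = adjust_confidence(0.8, [])
consistent = adjust_confidence(0.8, [0.80, 0.82])
inconsistent = adjust_confidence(0.8, [0.20, 0.90])
```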
- Numerous cancers may be detected using the methods and systems described herein. Cancer cells, like most cells, can be characterized by a rate of turnover, in which old cells die and are replaced by newer cells. Generally, dead cells in contact with vasculature in a given subject may release DNA or fragments of DNA into the blood stream. This is also true of cancer cells during various stages of the disease. Cancer cells may also be characterized, dependent on the stage of the disease, by various genetic aberrations such as copy number variation as well as mutations. This phenomenon may be used to detect the presence or absence of cancers in individuals using the methods and systems described herein.
- For example, blood from subjects at risk for cancer may be drawn and prepared as described herein to generate a population of cell free polynucleotides. In one example, this might be cell free DNA. The systems and methods of the disclosure may be employed to detect mutations or copy number variations that may exist in certain cancers present. The method may help detect the presence of cancerous cells in the body, despite the absence of symptoms or other hallmarks of disease.
- The types and number of cancers that may be detected may include but are not limited to blood cancers, brain cancers, lung cancers, skin cancers, nose cancers, throat cancers, liver cancers, bone cancers, lymphomas, pancreatic cancers, bowel cancers, rectal cancers, thyroid cancers, bladder cancers, kidney cancers, mouth cancers, stomach cancers, solid state tumors, heterogeneous tumors, homogenous tumors and the like.
- The system and methods may be used to detect any number of genetic aberrations that may cause or result from cancers. These may include but are not limited to mutations, indels, copy number variations, transversions, translocations, inversions, deletions, aneuploidy, partial aneuploidy, polyploidy, chromosomal instability, chromosomal structure alterations, gene fusions, chromosome fusions, gene truncations, gene amplification, gene duplications, chromosomal lesions, DNA lesions, abnormal changes in nucleic acid chemical modifications, abnormal changes in epigenetic patterns, abnormal changes in nucleic acid methylation, infection and cancer.
- Additionally, the systems and methods described herein may also be used to help characterize certain cancers. Genetic data produced from the system and methods of this disclosure may allow practitioners to help better characterize a specific form of cancer. Often times, cancers are heterogeneous in both composition and staging. Genetic profile data may allow characterization of specific sub-types of cancer that may be important in the diagnosis or treatment of that specific sub-type. This information may also provide a subject or practitioner clues regarding the prognosis of a specific type of cancer.
- The systems and methods provided herein may be used to monitor already known cancers, or other diseases in a particular subject. This may allow either a subject or practitioner to adapt treatment options in accord with the progress of the disease. In this example, the systems and methods described herein may be used to construct genetic profiles of a particular subject over the course of the disease. In some instances, cancers can progress, becoming more aggressive and genetically unstable. In other examples, cancers may remain benign, inactive or dormant. The system and methods of this disclosure may be useful in determining disease progression.
- Further, the systems and methods described herein may be useful in determining the efficacy of a particular treatment option. In one example, successful treatment options may actually increase the amount of copy number variation or mutations detected in a subject's blood, as more cancer cells may die and shed DNA. In other examples, this may not occur. In another example, certain treatment options may be correlated with genetic profiles of cancers over time. This correlation may be useful in selecting a therapy. Additionally, if a cancer is observed to be in remission after treatment, the systems and methods described herein may be useful in monitoring residual disease or recurrence of disease.
- The methods and systems described herein may not be limited to detection of mutations and copy number variations associated with only cancers. Various other diseases and infections may result in other types of conditions that may be suitable for early detection and monitoring. For example, in certain cases, genetic disorders or infectious diseases may cause a certain genetic mosaicism within a subject. This genetic mosaicism may cause copy number variation and mutations that could be observed. In another example, the system and methods of the disclosure may also be used to monitor the genomes of immune cells within the body. Immune cells, such as B cells, may undergo rapid clonal expansion in the presence of certain diseases. Clonal expansions may be monitored using copy number variation detection and certain immune states may be monitored. In this example, copy number variation analysis may be performed over time to produce a profile of how a particular disease may be progressing.
- Further, the systems and methods of this disclosure may also be used to monitor systemic infections themselves, as may be caused by a pathogen such as a bacterium or virus. Copy number variation or even mutation detection may be used to determine how a population of pathogens is changing during the course of infection. This may be particularly important during chronic infections, such as HIV/AIDS or hepatitis infections, whereby viruses may change life cycle state and/or mutate into more virulent forms during the course of infection.
- Yet another example that the system and methods of this disclosure may be used for is the monitoring of transplant subjects. Generally, transplanted tissue undergoes a certain degree of rejection by the body upon transplantation. The methods of this disclosure may be used to determine or profile rejection activities of the host body, as immune cells attempt to destroy transplanted tissue. This may be useful in monitoring the status of transplanted tissue as well as altering the course of treatment or prevention of rejection.
- Further, the methods of the disclosure may be used to characterize the heterogeneity of an abnormal condition in a subject, the method comprising generating a genetic profile of extracellular polynucleotides in the subject, wherein the genetic profile comprises a plurality of data resulting from copy number variation and mutation analyses. In some cases, including but not limited to cancer, a disease may be heterogeneous. Disease cells may not be identical. In the example of cancer, some tumors are known to comprise different types of tumor cells, some cells in different stages of the cancer. In other examples, heterogeneity may comprise multiple foci of disease. Again, in the example of cancer, there may be multiple tumor foci, perhaps where one or more foci are the result of metastases that have spread from a primary site.
- The methods of this disclosure may be used to generate a profile, fingerprint or set of data that is a summation of genetic information derived from different cells in a heterogeneous disease. This set of data may comprise copy number variation and mutation analyses alone or in combination.
- Additionally, the systems and methods of the disclosure may be used to diagnose, prognose, monitor or observe cancers or other diseases of fetal origin. That is, these methodologies may be employed in a pregnant subject to diagnose, prognose, monitor or observe cancers or other diseases in an unborn subject whose DNA and other polynucleotides may co-circulate with maternal molecules.
- Further, these reports are submitted and accessed electronically via the internet. Analysis of sequence data occurs at a site other than the location of the subject. The report is generated and transmitted to the subject's location. Via an internet enabled computer, the subject accesses the reports reflecting his tumor burden.
- The annotated information can be used by a health care provider to select other drug treatment options and/or provide information about drug treatment options to an insurance company. The method can include annotating the drug treatment options for a condition in, for example, the NCCN Clinical Practice Guidelines in Oncology™ or the American Society of Clinical Oncology (ASCO) clinical practice guidelines.
- The drug treatment options that are stratified in a report can be annotated in the report by listing additional drug treatment options. An additional drug treatment can be an FDA-approved drug for an off-label use. A provision in the 1993 Omnibus Budget Reconciliation Act (OBRA) requires Medicare to cover off-label uses of anticancer drugs that are included in standard medical compendia. The drugs used for annotating lists can be found in CMS approved compendia, including the National Comprehensive Cancer Network (NCCN) Drugs and Biologics Compendium™, Thomson Micromedex DrugDex®, Elsevier Gold Standard's Clinical Pharmacology compendium, and American Hospital Formulary Service-Drug Information Compendium®.
- The drug treatment options can be annotated by listing an experimental drug that may be useful in treating a cancer with one or more molecular markers of a particular status. The experimental drug can be a drug for which in vitro data, in vivo data, animal model data, pre-clinical trial data, or clinical-trial data are available. The data can be published in peer-reviewed medical literature found in journals listed in the CMS Medicare Benefit Policy Manual, including, for example, American Journal of Medicine, Annals of Internal Medicine, Annals of Oncology, Annals of Surgical Oncology, Biology of Blood and Marrow Transplantation, Blood, Bone Marrow Transplantation, British Journal of Cancer, British Journal of Hematology, British Medical Journal, Cancer, Clinical Cancer Research, Drugs, European Journal of Cancer (formerly the European Journal of Cancer and Clinical Oncology), Gynecologic Oncology, International Journal of Radiation, Oncology, Biology, and Physics, The Journal of the American Medical Association, Journal of Clinical Oncology, Journal of the National Cancer Institute, Journal of the National Comprehensive Cancer Network (NCCN), Journal of Urology, Lancet, Lancet Oncology, Leukemia, The New England Journal of Medicine, and Radiation Oncology.
- The drug treatment options can be annotated by providing a link on an electronic based report connecting a listed drug to scientific information regarding the drug. For example, a link can be provided to information regarding a clinical trial for a drug (clinicaltrials.gov). If the report is provided via a computer or computer website, the link can be a footnote, a hyperlink to a website, a pop-up box, or a fly-over box with information, etc. The report and the annotated information can be provided on a printed form, and the annotations can be, for example, a footnote to a reference.
- The information for annotating one or more drug treatment options in a report can be provided by a commercial entity that stores scientific information. A health care provider can treat a subject, such as a cancer patient, with an experimental drug listed in the annotated information, and the health care provider can access the annotated drug treatment option, retrieve the scientific information (e.g., print a medical journal article) and submit it (e.g., a printed journal article) to an insurance company along with a request for reimbursement for providing the drug treatment. Physicians can use any of a variety of Diagnosis-related group (DRG) codes to enable reimbursement.
- A drug treatment option in a report can also be annotated with information regarding other molecular components in a pathway that a drug affects (e.g., information on a drug that targets a kinase downstream of a cell-surface receptor that is a drug target). The drug treatment option can be annotated with information on drugs that target one or more other molecular pathway components. The identification and/or annotation of information related to pathways can be outsourced or subcontracted to another company.
- The annotated information can be, for example, a drug name (e.g., an FDA approved drug for off-label use; a drug found in a CMS approved compendium, and/or a drug described in a scientific (medical) journal article), scientific information concerning one or more drug treatment options, one or more links to scientific information regarding one or more drugs, clinical trial information regarding one or more drugs (e.g., information from clinicaltrials.gov/), one or more links to citations for scientific information regarding drugs, etc.
- The annotated information can be inserted into any location in a report. Annotated information can be inserted in multiple locations on a report. Annotated information can be inserted in a report near a section on stratified drug treatment options. Annotated information can be inserted into a report on a separate page from stratified drug treatment options. A report that does not contain stratified drug treatment options can be annotated with information.
- The system can also include reports on the effects of drugs on a sample (e.g., tumor cells) isolated from a subject (e.g., a cancer patient). An in vitro culture using a tumor from a cancer patient can be established using techniques known to those skilled in the art. The system can also include high-throughput screening of FDA approved off-label drugs or experimental drugs using said in vitro culture and/or a xenograft model. The system can also include monitoring tumor antigen for recurrence detection.
- The system can provide internet enabled access of reports of a subject with cancer. The system can use a handheld DNA sequencer or a desktop DNA sequencer. The DNA sequencer is a scientific instrument used to automate the DNA sequencing process. Given a sample of DNA, a DNA sequencer is used to determine the order of the four bases: adenine, guanine, cytosine, and thymine. The order of the DNA bases is reported as a text string, called a read. Some DNA sequencers can be also considered optical instruments as they analyze light signals originating from fluorochromes attached to nucleotides.
- The data is sent by the DNA sequencers over a direct connection or over the internet to a computer for processing. The data processing aspects of the system can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them. Data processing apparatus of the invention can be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a programmable processor; and data processing method steps of the invention can be performed by a programmable processor executing a program of instructions to perform functions of the invention by operating on input data and generating output. The data processing aspects of the invention can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from and to transmit data and instructions to a data storage system, at least one input device, and at least one output device. Each computer program can be implemented in a high-level procedural or object-oriented programming language, or in assembly or machine language, if desired; and, in any case, the language can be a compiled or interpreted language. Suitable processors include, by way of example, both general and special purpose microprocessors. Generally, a processor will receive instructions and data from a read-only memory and/or a random access memory. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of nonvolatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM disks. Any of the foregoing can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
- To provide for interaction with a user, the invention can be implemented using a computer system having a display device such as a monitor or LCD (liquid crystal display) screen for displaying information to the user and input devices by which the user can provide input to the computer system such as a keyboard, a two-dimensional pointing device such as a mouse or a trackball, or a three-dimensional pointing device such as a data glove or a gyroscopic mouse. The computer system can be programmed to provide a graphical user interface through which computer programs interact with users. The computer system can be programmed to provide a virtual reality, three-dimensional display interface.
- The methods of this disclosure allow one to provide therapeutic interventions more precisely directed to the form of a disease in a subject, and to calibrate these therapeutic interventions over time. This precision reflects, in part, the precision by which one is able to profile the whole body tumor status of a subject as reflected in tumor heterogeneity. Thus, the therapeutic intervention is more effective against cancers with this profile than against cancers with any single one of these variants.
- A therapeutic intervention is an intervention that produces a therapeutic effect (e.g., is therapeutically effective). Therapeutically effective interventions prevent, slow the progression of, improve the condition of (e.g., cause remission of), or cure a disease, such as a cancer. A therapeutic intervention can include, for example, administration of a treatment, such as chemotherapy, radiation therapy, surgery, immunotherapy, administration of a pharmaceutical or a nutraceutical, or a change in behavior, such as diet. One measure of therapeutic effectiveness is effectiveness for at least 90% of subjects undergoing the intervention over at least 100 subjects.
- Drug targets in cancer and drugs efficacious against these targets are set forth in Tables 1 and 2 (taken from Bailey et al., Discovery Medicine, v. 18 #92, 2/7/14).
TABLE 1. Selected Examples of Commercially Available Diagnostic Tests, Associated Therapy Implication, and Relevant Cancer Type.

| Therapy Implications | Test | Cancer Type | Drug-Biomarker Clinical Association |
|---|---|---|---|
| IHC Assays | | | |
| Cetuximab; Panitumumab | EGFR | CRC | Established |
| Imatinib | C-KIT | GIST | Established |
| Trastuzumab | HER2 | Breast Cancer; Gastric Cancer | Established |
| Resistance to PI3K, AKT, and MEK inhibitors | LKB1 | NSCLC | Investigational (Mahoney et al., 2009) |
| Crizotinib | C-MET | NSCLC | Investigational (Sadiq & Salgia, 2013) |
| Akt/mTOR Inhibitors; resistance to anti-EGFR therapies | PTEN | CRC, NSCLC | Investigational (Di Nicolantonio et al., 2010; Sos et al., 2009; Wang et al., 2012) |
| In Situ Hybridization Assays | | | |
| Crizotinib | ALK Fusion FISH | NSCLC | Established |
| Trastuzumab; Pertuzumab | HER2 FISH | Breast Cancer, Gastric Cancer | Established |
| Trastuzumab | HER2 CISH | Breast Cancer | Established |
| Trastuzumab | HER2 ISH | Breast Cancer | Established |
| Mutation Assays | | | |
| Cetuximab, Panitumumab | KRAS | CRC, NSCLC, Pancreatic Cancer | Established |
| Erlotinib, Gefitinib | EGFR | NSCLC, CRC | Established |
| Vemurafenib, Trametinib, Dabrafenib; resistance to anti-EGFR therapies | BRAF | CRC, Thyroid Cancer, Melanoma | Established |
| Imatinib; 2nd Generation TKIs | BCR-ABL | CML, Ph+ AML | Established |
| Crizotinib | ALK | NSCLC | Established |
| RAF and MEK inhibitors; resistance to anti-EGFR therapies | NRAS | Melanoma, CRC, NSCLC | Investigational (Ascierto et al., 2013; De Mattos-Arruda et al., 2011; De Roock et al., 2010; Huang et al., 2013) |
| Imatinib | PDGFRA | GIST | Established |
| PI3K/mTOR Inhibitors | PIK3CA | Breast Cancer, CRC, Lung Cancer | Investigational (Di Nicolantonio et al., 2010; Janku et al., 2013) |
| Akt/mTOR Inhibitors; resistance to anti-EGFR therapies | PTEN | CRC, NSCLC, Breast | Investigational (Di Nicolantonio et al., 2010; Jerusalem et al., 2013; Sos et al., 2009; Wang et al., 2012) |
| Resistance to PI3K, AKT, and MEK inhibitors | LKB1 | NSCLC | Investigational (Averette-Byers et al., 2012) |
| Other | | | |
| Imatinib | BCR-ABL1 Quantitative Transcript Analysis | CML, Ph+ AML | Established |
| Resistance to Imatinib | BCR-ABL1 Copy Number | CML, Ph+ AML | Investigational (Hochhaus et al., 2002) |
| PI3K Inhibitors | PIK3CA Amplification | Multiple Cancer Types | Investigational (Rodon et al., 2013) |
| Erlotinib; Gefitinib; Cetuximab; Panitumumab | EGFR Amplification | NSCLC, CRC | Investigational (Gupta et al., 2009) |

Note: The drug-biomarker clinical associations denoted ‘Established’ reflect well known drug FDA indications. The ones denoted ‘Investigational’ are associations that are hypothesized and demonstrated by scientific literature.
TABLE 2. US FDA Approved Targeted Therapies and Indications.

| Agent | Trade Name | Target(s) | FDA-approved Indication(s) | Company |
|---|---|---|---|---|
| Monoclonal Antibodies | | | | |
| Ado-trastuzumab emtansine (T-DM1)* | Kadcyla | HER2 | Breast cancer (HER2+)* | Genentech |
| Bevacizumab | Avastin | VEGF | CRC; GBM; NSCLC; RCC | Genentech |
| Cetuximab* | Erbitux | EGFR | CRC (KRAS wild-type)*; HNSCC | Eli Lilly |
| Ipilimumab | Yervoy | CTLA-4 | Melanoma | Bristol-Myers Squibb |
| Obinutuzumab | Gazyva | CD-20 | CLL | Genentech |
| Panitumumab* | Vectibix | EGFR | CRC (KRAS wild-type)* | Amgen |
| Pertuzumab | Perjeta | HER2 | Breast cancer (HER2+)* | Genentech |
| Trastuzumab* | Herceptin | HER2 | Breast cancer (HER2+)*; Gastric cancer (HER2+)* | Genentech |
| Small Molecule Inhibitors | | | | |
| Afatinib* | Gilotrif | EGFR, HER2 | NSCLC (with EGFR exon 19 deletions or L858R substitution)* | Boehringer Ingelheim |
| Axitinib | Inlyta | KIT, PDGFRβ, VEGFR1/2/3 | RCC | Pfizer |
| Bosutinib* | Bosulif | ABL | CML (Philadelphia chromosome positive)* | Pfizer |
| Cabozantinib | Cometriq | FLT3, KIT, MET, RET, VEGFR2 | Medullary thyroid cancer | Exelixis |
| Crizotinib* | Xalkori | ALK, MET | NSCLC (with ALK fusion)* | Pfizer |
| Dabrafenib* | Tafinlar | BRAF | Melanoma (with BRAF V600E mutation)* | GlaxoSmithKline |
| Dasatinib* | Sprycel | ABL | CML (Philadelphia chromosome positive)*; ALL (Philadelphia chromosome positive)* | Bristol-Myers Squibb |
| Denosumab | Xgeva | RANKL | Giant cell tumor of bone | Amgen |
| Erlotinib* | Tarceva | EGFR | NSCLC (with exon 19 deletions or L858R substitutions)*; Pancreatic cancer | Genentech & OSI |
| Everolimus* | Afinitor | mTOR | Pancreatic neuroendocrine tumor; RCC; Breast cancer (ER/PR+) in combination with exemestane*; Nonresectable subependymal giant cell astrocytoma associated with tuberous sclerosis | Novartis |
| Gefitinib | Iressa | EGFR | NSCLC with known prior benefit from gefitinib (limited approval) | AstraZeneca |
| Ibrutinib | Imbruvica | BTK | Mantle cell lymphoma | Pharmacyclics |
| Imatinib* | Gleevec | KIT, PDGFR, ABL | GI stromal tumor; Dermatofibrosarcoma protuberans; Multiple hematologic malignancies including Philadelphia chromosome-positive ALL and CML* | Novartis |
| Lapatinib* | Tykerb | HER2, EGFR | Breast cancer (HER2+)* | GlaxoSmithKline |
| Nilotinib* | Tasigna | ABL | CML (Philadelphia chromosome positive)* | Novartis |
| Pazopanib | Votrient | VEGFR, PDGFR, KIT | RCC; Soft tissue sarcoma | GlaxoSmithKline |
| Regorafenib | Stivarga | KIT, PDGFRβ, RAF, RET, VEGFR1/2/3 | CRC; Gastrointestinal stromal tumors | Bayer |
| Ruxolitinib | Jakafi | JAK1/2 | Myelofibrosis | Incyte |
| Sorafenib | Nexavar | VEGFR, PDGFR, KIT, RAF | Hepatocellular carcinoma; RCC | Bayer |
| Sunitinib | Sutent | VEGFR, PDGFR, KIT, RET | GIST; Pancreatic neuroendocrine tumor; RCC | Pfizer |
| Temsirolimus | Torisel | mTOR | RCC | Wyeth |
| Trametinib* | Mekinist | MEK | Melanoma (with BRAF V600E or V600K mutations)* | GlaxoSmithKline |
| Vandetanib | Caprelsa | EGFR, RET, VEGFR2 | Medullary thyroid cancer | AstraZeneca |
| Vemurafenib* | Zelboraf | BRAF | Melanoma (with BRAF V600 mutation)* | Roche |

Note: ALL, acute lymphoblastic leukemia; CML, chronic myeloid leukemia; GIST, gastrointestinal stromal tumor; ER, estrogen receptor; PR, progesterone receptor; NSCLC, non-small cell lung cancer; CRC, colorectal cancer; GBM, glioblastoma; RCC, renal cell carcinoma; HNSCC, head and neck squamous cell carcinoma; CLL, chronic lymphocytic leukemia; BTK, Bruton's tyrosine kinase. *Targeted therapy that is associated with a molecular-specific cancer subtype alteration. There are approximately 17 targeted therapies that are associated with 10 molecular-specific subtypes of cancer.
- In one embodiment, based on the profile of disease heterogeneity, a therapeutic intervention is determined that takes into account both the type of genetic variants found in the disease cells and their relative amounts (e.g., proportion). The therapeutic intervention can treat the subject as if each clonal variant were a different cancer to be treated independently. In some cases, when one or more genetic variants are detected at sub-clinical amounts, e.g., at least 5× lower, at least 10× lower, or at least 100× lower than the dominant detected clones, these variants may be left out of the therapeutic intervention until they rise to a clinical threshold or significant relative frequency (e.g., greater than the threshold stated above).
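- The fold-difference rule above can be illustrated with a minimal sketch. It assumes each variant has already been quantified as a fraction of cfDNA; the function name, the variant labels and the default 10× cutoff are hypothetical choices made for illustration and are not prescribed by the disclosure.

```python
from typing import Dict


def select_actionable_variants(
    fractions: Dict[str, float], fold_cutoff: float = 10.0
) -> Dict[str, float]:
    """Keep variants that are NOT at least `fold_cutoff`-fold lower than the dominant clone.

    `fractions` maps a variant label to its measured fraction (e.g., percent of cfDNA).
    The cutoff (5x, 10x or 100x in the text) is an assumption supplied by the caller.
    """
    if not fractions:
        return {}
    dominant = max(fractions.values())
    # A variant is "at least fold_cutoff lower" when fraction * fold_cutoff <= dominant.
    return {
        name: frac
        for name, frac in fractions.items()
        if frac * fold_cutoff > dominant
    }


# Hypothetical profile: percent of cfDNA carrying each variant.
profile = {"variant_A": 5.0, "variant_B": 1.2, "variant_C": 0.3}
kept = select_actionable_variants(profile, fold_cutoff=10.0)
```

With these illustrative numbers, variant_C (0.3% against a dominant clone at 5.0%) is more than 10× lower and would be deferred until it rises past the threshold on a later test.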
- When a plurality of different genetic variants is found in different quantities, e.g., different numbers or different relative amounts, a therapeutic intervention can include treatments effective against diseases with each of the genetic variants. For example, in the case of cancer, genetic variants, such as mutant forms of a gene or gene amplification, may be detected in several genes (e.g., a major clone and a minor clone). Each of these forms may be actionable, that is, a treatment may be known for which cancers with the particular variant are responsive. However, the profile of tumor heterogeneity may indicate that one of the variants is present in the polynucleotides at, for example, five times the level of each of the other two variants. A therapeutic intervention can be determined that involves delivering three different drugs to the subject, each drug relatively more effective against cancers bearing each of the variants. The drugs can be delivered as a cocktail, or sequentially.
- In a further embodiment, the drugs can be administered in doses stratified to reflect the relative amounts of the variants in the DNA. For example, a drug effective against the most common variant can be administered in greater amount than drugs effective against the two less common variants.
- Alternatively, the profile of tumor heterogeneity can show the presence of a sub-population of cancer cells bearing a genetic variant that is resistant to a drug to which the disease typically responds. In this case, the therapeutic intervention can involve including both a first drug effective against tumor cells without the resistance variant and a second drug effective against tumor cells with the resistant variant. Again, doses can be stratified to reflect relative amounts of each variant detected in the profile.
- In another embodiment, changes in the profile of tumor heterogeneity are examined over time, and therapeutic interventions are developed to treat the changing tumor. For example, disease heterogeneity can be determined at a plurality of different times. Using the profiling methods of this disclosure, more precise inferences can be made about tumor evolution. This allows the practitioner to monitor the evolution of the disease, in particular as new clonal sub-populations emerge after remission effected by a first wave of therapy. In this case, therapeutic interventions can be calibrated over time to treat the changing tumor. For example, a profile may show that a cancer has a form that is responsive to a certain treatment. The treatment is delivered and the tumor burden is seen to decrease over time. At some point, a genetic variant is found in the tumor indicating the presence of a population of cancer cells that is not responsive to the treatment. A new therapeutic intervention is determined that targets the cells bearing the marker of non-responsiveness.
- In response to chemotherapy, a dominant tumor form can eventually give way through Darwinian selection to cancer cells carrying mutants that render the cancer unresponsive to the therapy regimen. Appearance of these resistance mutants can be delayed through methods of this disclosure. In one embodiment of this method, a subject is subjected to one or more pulsed therapy cycles, each pulsed therapy cycle comprising a first period during which a drug is administered at a first amount and a second period during which the drug is administered at a second, reduced amount. The first period is characterized by a tumor burden detected above a first clinical level. The second period is characterized by a tumor burden detected below a second clinical level. First and second clinical levels can be different in different pulsed therapy cycles. So, for example, the first clinical level can be lower in succeeding cycles. A plurality of cycles can include at least 2, 3, 4, 5, 6, 7, 8 or more cycles. For example, the BRAF mutant V600E may be detected in disease cell polynucleotides at an amount indicating a tumor burden of 5% in cfDNA. Chemotherapy can commence with dabrafenib. Subsequent testing can show that the amount of the BRAF mutant in the cfDNA falls below 0.5% or to undetectable levels. At this point, dabrafenib therapy can stop or be significantly curtailed. Further subsequent testing may find that DNA bearing the BRAF mutation has risen to 2.5% of polynucleotides in cfDNA. At this point, dabrafenib therapy is re-started, e.g., at the same level as the initial treatment. Subsequent testing may find that DNA bearing the BRAF mutation has decreased to 0.5% of polynucleotides in cfDNA. Again, dabrafenib therapy is stopped or reduced. The cycle can be repeated a number of times.
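- The pulsed cycle described above behaves like a two-threshold (hysteresis) switch. The sketch below is illustrative only: it assumes fixed first and second clinical levels across cycles (the text allows them to differ), and the function name and the 2.5%/0.5% levels are taken from or modeled on the dabrafenib example rather than being recommended values.

```python
from typing import Iterable, List


def pulsed_schedule(
    burdens: Iterable[float],
    start_level: float,
    stop_level: float,
    on_initially: bool = False,
) -> List[bool]:
    """Return, for each serial test, whether full-dose therapy is on (True) or off/reduced (False).

    Therapy switches on when tumor burden reaches `start_level` (first clinical level)
    and switches off when it falls to `stop_level` (second clinical level).
    Between the two levels the previous state is kept.
    """
    if stop_level >= start_level:
        raise ValueError("stop_level must be below start_level")
    dosing = on_initially
    schedule = []
    for burden in burdens:
        if burden >= start_level:
            dosing = True
        elif burden <= stop_level:
            dosing = False
        schedule.append(dosing)
    return schedule


# Percent of cfDNA bearing the mutation at successive blood draws (illustrative).
serial_burden = [5.0, 1.0, 0.4, 1.0, 2.5, 0.5]
plan = pulsed_schedule(serial_burden, start_level=2.5, stop_level=0.5)
```

Here therapy starts at 5%, continues at 1.0% because the stop level has not yet been reached, stops at 0.4%, stays off at 1.0%, restarts at 2.5%, and stops again at 0.5%.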
- FIG. 7 shows an exemplary course of monitoring and treatment of disease in a subject. A subject tested at the time of blood draw 1 has a tumor burden of 1.4% and presents with genetic alterations in genes Gene 4. The subject is now put on a course of Drug B, to which cancers having this variant are responsive.
- In another embodiment, a therapeutic intervention can be changed upon detection of the rise of a mutant form resistant to an original drug. For example, cancers with the EGFR mutation L858R respond to therapy with erlotinib. However, cancers with the EGFR mutation T790M are resistant to erlotinib, although they are responsive to third-generation EGFR inhibitors such as osimertinib. A method of this disclosure involves monitoring changes in tumor profile and changing a therapeutic intervention when a genetic variant associated with drug resistance rises to a predetermined clinical level.
- In another embodiment, a database is built in which genetic information from serial samples collected from cancer patients is recorded. This database may also contain intervening treatment and other clinically relevant information, such as weight, adverse effects, histological testing, blood testing, radiographic information, prior treatments, cancer type, etc. Serial test results can be used to infer efficacy of treatment, especially when used with blood samples, which can give a less biased estimate of tumor burden than self-reporting or radiographic reporting by a medical practitioner. Treatment efficacy can be clustered by those with similar genomic profiles and vice versa. Genomic profiles can be organized around, for example, primary genetic alteration, secondary genetic alteration(s), relative amounts of these genetic alterations, and tumor load. This database can be used for decision support for subsequent patients. Both germline and somatic alterations can be used for determining treatment efficacy as well. Acquired resistance alterations can also be inferred from the database when treatments that were effective initially begin to fail. This failure can be detected through radiographic, blood or other means. The primary data used for inference of acquired resistance mechanisms are genomic tumor profiles collected after treatment per patient. These data can also be used to place quantitative bounds on likely treatment response as well as to predict time to treatment failure. Based on likely acquired resistance alterations for a given treatment and tumor genomic profile, a treatment regimen can be modified to suppress acquisition of the most likely resistance alterations.
- Methods of the present disclosure can be implemented using, or with the aid of, computer systems.
FIG. 5 shows a computer system 1501 that is programmed or otherwise configured to implement the methods of the present disclosure. The computer system 1501 includes a central processing unit (CPU, also “processor” and “computer processor” herein) 1505. The computer system 1501 also includes memory or memory location 1510 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 1515 (e.g., hard disk), communication interface 1520 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 1525, such as cache, other memory, data storage and/or electronic display adapters. The memory 1510, storage unit 1515, interface 1520 and peripheral devices 1525 are in communication with the CPU 1505 through a communication bus (solid lines). The storage unit 1515 can be a data storage unit (or data repository) for storing data. The computer system 1501 can be operatively coupled to a computer network (“network”) 1530 with the aid of the communication interface 1520. The network 1530 can be the Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the Internet. The network 1530 in some cases is a telecommunication and/or data network. The network 1530 can include one or more computer servers, which can enable distributed computing, such as cloud computing. The CPU 1505 can execute a sequence of machine-readable instructions, which can be embodied in a program or software. The instructions may be stored in a memory location, such as the memory 1510. The storage unit 1515 can store files, such as drivers, libraries and saved programs. The computer system 1501 can communicate with one or more remote computer systems through the network 1530. Methods as described herein can be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the computer system 1501, such as, for example, on the memory 1510 or electronic storage unit 1515. The machine executable or machine readable code can be provided in the form of software. Aspects of the systems and methods provided herein, such as the computer system 1501, can be embodied in programming. Various aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium. Machine-executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk. “Storage” type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. The computer system 1501 can include or be in communication with an electronic display that comprises a user interface (UI) for providing, for example, one or more results of sample analysis.
- Nucleotide positions (e.g., loci) in the genome can be designated by number, as depicted in FIG. 2. Positions at which about 100% of the base calls are identical to the reference sequence or at which about 100% of the base calls are different than the reference sequence are inferred to represent homozygosity of the cfDNA (presumed normal). Positions at which about 50% of the base calls are identical to the reference sequence are inferred to represent heterozygosity of the cfDNA (also presumed normal). Positions at which the percentage of base calls at a locus is substantially below 50% and above the detection limit of the base calling system are inferred to represent tumor-associated genetic variants.
- 10-30 mL blood samples are collected at room temperature. The samples are centrifuged to remove cells. Plasma is collected after centrifugation.
- cfDNA Extraction
- The sample is subjected to proteinase K digestion. DNA is precipitated with isopropanol. DNA is captured on a DNA purification column (e.g., a QIAamp DNA Blood Mini Kit) and eluted in 100 μl solution. DNAs below 500 bp are selected with Ampure SPRI magnetic bead capture (PEG/salt). The resulting product is suspended in 30 μl H2O. Size distribution is checked (major peak=166 nucleotides; minor peak=330 nucleotides) and the DNA is quantified. 5 ng of extracted DNA contains approximately 1700 haploid genome equivalents (“HGE”). The general correlation between the amount of DNA and HGE is as follows: 3 pg DNA=1 HGE; 3 ng DNA=1K HGE; 3 μg DNA=1M HGE; 10 pg DNA=3 HGE; 10 ng DNA=3K HGE; 10 μg DNA=3M HGE.
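- The mass-to-HGE correlation above reduces to a single division by roughly 3 pg per haploid genome. The following sketch shows the arithmetic; the function and constant names are illustrative, and the 3 pg figure is the rounded value used in the text.

```python
PG_PER_HGE = 3.0  # approximate mass of one haploid human genome, per the text


def haploid_genome_equivalents(mass_pg: float) -> float:
    """Convert a DNA mass in picograms to haploid genome equivalents (HGE)."""
    if mass_pg < 0:
        raise ValueError("mass must be non-negative")
    return mass_pg / PG_PER_HGE


hge_5ng = haploid_genome_equivalents(5_000)    # 5 ng expressed in pg
hge_2_5ng = haploid_genome_equivalents(2_500)  # 2.5 ng expressed in pg
```

Under this approximation 5 ng corresponds to about 1667 HGE (quoted as approximately 1700) and 2.5 ng to about 833 HGE (quoted as approximately 800).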
- High-efficiency DNA tagging (>80%) is performed by end repair, A-tailing and sticky-end ligation with 2 different octamers (i.e., 4 combinations) with overloaded hairpin adaptors. 2.5 ng DNA (i.e., approximately 800 HGE) is used as the starting material. Each hairpin adaptor comprises a random sequence on its non-complementary portion. Hairpin adaptors are attached to both ends of each DNA fragment. Each tagged fragment can be identified by a combination of the octamer sequence on the hairpin adaptors and endogenous portions of the insert sequence.
- Tagged DNA is amplified by 12 cycles of PCR to produce about 1-7 μg DNA that contains approximately 500 copies of each of the 800 HGE in the starting material.
- Buffer optimization, polymerase optimization and cycle reduction may be performed to optimize the PCR reactions. Amplification bias, e.g., non-specific bias, GC bias, and/or size bias, is also reduced by optimization. Noise (e.g., polymerase-introduced errors) is reduced by using high-fidelity polymerases.
- Sequences may be enriched as follow: DNAs with regions of interest (ROI) are captured using biotin-labeled bead with probe to ROIs. The ROIs are amplified with 12 cycles of PCR to generate a 2000 times amplification.
- 0.1 to 1% of the sample (approximately 100 pg) are used for sequencing. The resulting DNA is then denatured and diluted to 8 pM and loaded into an Illumina sequencer.
- Sequence reads are grouped into families, with about 10 sequence reads in each family. Families are collapsed into consensus sequences by voting (e.g., biased voting) at each position in a family. A base is called for the consensus sequence if 8 or 9 members agree. A base is not called for the consensus sequence if no more than 60% of the members agree.
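The family-collapsing step can be sketched as per-position voting. This is a minimal illustration, not the assay's actual implementation: it uses plain (unbiased) majority voting, assumes all reads in a family are already aligned to the same length, and uses a hypothetical `min_agreement` threshold of 0.8 (8 of 10 members). The text does not say how agreement between 60% and 80% is handled, so here such positions are simply left uncalled.

```python
from collections import Counter


def consensus(family: list[str], min_agreement: float = 0.8) -> str:
    """Collapse equal-length reads into one consensus; 'N' marks uncalled positions."""
    if not family:
        raise ValueError("family must contain at least one read")
    length = len(family[0])
    if any(len(read) != length for read in family):
        raise ValueError("reads in a family must have equal length")

    called = []
    for column in zip(*family):  # iterate over positions across all reads
        base, count = Counter(column).most_common(1)[0]
        # Call the majority base only when enough family members agree.
        called.append(base if count / len(family) >= min_agreement else "N")
    return "".join(called)


if __name__ == "__main__":
    reads = ["ACGT"] * 8 + ["ACTT"] * 2
    print(consensus(reads))
```

With eight of ten reads agreeing at the third position, the majority base is called; with only six of ten agreeing, the position is reported as `N`.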
- The resulting consensus sequences are mapped to a reference genome, such as hg19. Each base in a consensus sequence is covered by about 3000 different families. A quality score for each sequence is calculated and sequences are filtered based on their quality scores. Base calls at each position in a consensus sequence are compared with the hg19 reference sequence. At each position at which a base call differs from the reference sequence, the identity of the different base or bases and their percentage as a function of total base calls at the locus are determined and reported.
- Sequence variation is detected by counting the distribution of bases at each locus. If 98% of the reads have the same base (homozygous) and 2% have a different base, the locus is likely to have a sequence variant, presumably from cancer DNA.
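The interpretation rule described above (near 100% or 0% non-reference calls for homozygous loci, near 50% for heterozygous loci, and a low but detectable fraction for a presumed tumor-associated variant) can be sketched as a simple classifier. All cutoffs below are illustrative assumptions; the text gives only approximate percentages, and the function name and labels are hypothetical.

```python
def classify_locus(
    alt_fraction: float,
    detection_limit: float = 0.001,  # assumed limit of detection (0.1%)
    het_band: tuple[float, float] = (0.40, 0.60),  # assumed window around 50%
    hom_cutoff: float = 0.95,  # assumed cutoff for "about 100%"
) -> str:
    """Label a locus from the fraction of base calls that differ from the reference."""
    if not 0.0 <= alt_fraction <= 1.0:
        raise ValueError("alt_fraction must be between 0 and 1")
    if alt_fraction >= hom_cutoff:
        return "homozygous variant"
    if het_band[0] <= alt_fraction <= het_band[1]:
        return "heterozygous"
    if alt_fraction < detection_limit:
        return "homozygous reference"
    if alt_fraction < het_band[0]:
        return "candidate tumor-associated variant"
    return "unclassified"  # between the heterozygous band and the homozygous cutoff


if __name__ == "__main__":
    for fraction in (1.0, 0.5, 0.02, 0.0):
        print(fraction, classify_locus(fraction))
```

A locus where 2% of base calls differ from the reference, as in the example above, falls into the candidate tumor-associated class under these assumed cutoffs.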
- CNV is detected by counting the total number of sequences (bases) mapping to a locus and comparing with a control locus. To improve CNV detection, CNV analysis is performed on specific regions, including regions on ALK, APC, BRAF, CDKN2A, EGFR, ERBB2, FBXW7, KRAS, MYC, NOTCH1, NRAS, PIK3CA, PTEN, RB1, TP53, MET, AR, ABL1, AKT1, ATM, CDH1, CSF1R, CTNNB1, ERBB4, EZH2, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK2, JAK3, KDR, KIT, MLH1, MPL, NPM1, PDGFRA, PROC, PTPN11, RET, SMAD4, SMARCB1, SMO, SRC, STK11, VHL, TERT, CCND1, CDK4, CDKN2B, RAF1, BRCA1, CCND2, CDK6, NF1, TP53, ARID1A, BRCA2, CCNE1, ESR1, RIT1, GATA3, MAP2K1, RHEB, ROS1, ARAF, MAP2K2, NFE2L2, RHOA, or NTRK1 genes.
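The comparison against a control locus can be sketched as a ratio of read densities. This is a simplified illustration under stated assumptions: counts are normalized by region length only, the control locus is taken to be copy-neutral, and no correction for GC content or capture efficiency is applied. The function name and inputs are hypothetical.

```python
def copy_ratio(
    target_reads: int, target_bp: int, control_reads: int, control_bp: int
) -> float:
    """Ratio of read density at a target locus to read density at a control locus."""
    if target_bp <= 0 or control_bp <= 0:
        raise ValueError("region lengths must be positive")
    if control_reads <= 0:
        raise ValueError("control locus must have at least one read")
    target_density = target_reads / target_bp
    control_density = control_reads / control_bp
    return target_density / control_density


if __name__ == "__main__":
    ratio = copy_ratio(target_reads=3000, target_bp=1000, control_reads=1500, control_bp=1000)
    # A ratio well above 1 suggests a gain relative to the control; well below 1 suggests a loss.
    print(f"copy ratio: {ratio:.2f}")
```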
- After fragments are amplified and the sequences of amplified fragments are read and aligned, the fragments are subjected to base calling. Variations in the number of amplified fragments and unseen amplified fragments can introduce errors in base calling. These variations are corrected by calculating the number of unseen amplified fragments.
- When base calling for locus A (an arbitrary locus), it is first assumed that there are N amplified fragments. The sequence readouts can come from two types of fragments: double-strand fragments and single-strand fragments. The following is a theoretical example of calculating the total number of unseen molecules in a sample.
- N is the total number of molecules in the sample.
- Assume that 1000 duplexes are detected.
- Assume that 500 single-stranded molecules are detected.
- P is the probability of detecting a strand.
- Q is the probability of not detecting a strand, so $Q = 1 - P$.
- $1000 = NP^2$.
- $500 = 2NPQ$.
- $N = 1000/P^2$.
- $N = 500/(2PQ)$.
- $1000/P^2 = 500/(2PQ)$.
- $1000 \cdot 2PQ = 500P^2$.
- $2000PQ = 500P^2$.
- $2000Q = 500P$.
- $2000(1 - P) = 500P$.
- $2000 - 2000P = 500P$.
- $2000 = 500P + 2000P$.
- $2000 = 2500P$.
- $P = 2000/2500$.
- $P = 0.8$.
- $N = 1000/P^2$.
- $N = 1000/0.64$.
- $N \approx 1562$.
- Number of unseen fragments = N − 1000 − 500 ≈ 62.
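The derivation above generalizes to any pair of duplex and single-strand counts. Solving the two equations (duplexes = N·P², singles = 2·N·P·Q, with Q = 1 − P) gives P = 2·duplexes / (singles + 2·duplexes). The sketch below is a minimal illustration of that algebra; the function name is hypothetical, and the model assumes each strand is detected independently with the same probability.

```python
def estimate_unseen(duplexes: int, singles: int) -> tuple[float, float, float]:
    """Return (P, N, unseen) from counts of duplex and single-strand detections.

    Model: duplexes = N * P**2 and singles = 2 * N * P * (1 - P).
    """
    if duplexes <= 0 or singles < 0:
        raise ValueError("duplexes must be positive and singles non-negative")
    p = 2 * duplexes / (singles + 2 * duplexes)  # probability of detecting a strand
    n_total = duplexes / p**2  # estimated total molecules
    unseen = n_total - duplexes - singles  # molecules with neither strand detected
    return p, n_total, unseen


if __name__ == "__main__":
    p, n_total, unseen = estimate_unseen(duplexes=1000, singles=500)
    print(f"P = {p:.2f}, N = {n_total:.1f}, unseen = {unseen:.1f}")
```

For 1000 duplexes and 500 single strands this reproduces the worked example: P is 0.8, N is about 1562.5, and about 62.5 molecules are unseen, which the example truncates to 1562 and 62.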
- An assay is used to analyze a panel of genes to identify cancer-associated somatic genetic variants with high sensitivity.
- Cell-free DNA is extracted from plasma of a patient and amplified by PCR. Genetic variants are analyzed by massively parallel sequencing of the amplified target genes. For one set of genes, all exons are sequenced, as such sequencing coverage has been shown to have clinical utility (Table 3). For another set of genes, sequencing coverage includes those exons with a previously reported somatic mutation (Table 4). The minimum detectable mutant allele frequency (limit of detection) depends on the cell-free DNA concentration of the patient's sample, which varies from less than 10 to over 1,000 genomic equivalents per mL of peripheral blood. Amplification may not be detected in samples with lower amounts of cell-free DNA and/or low-level gene copy amplification. Certain sample or variant characteristics, such as low sample quality or improper collection, result in reduced analytic sensitivity.
- The percentage of genetic variants found in cell-free DNA circulating in blood is related to the unique tumor biology of this patient. Factors that affect the amount or percentage of detected genetic variants in circulating cell-free DNA in blood include tumor growth, turnover, size, heterogeneity, vascularization, disease progression and treatment. Table 5 annotates the percentage, or allele frequency, of altered circulating cell-free DNA (% cfDNA) detected in this patient. Some of the detected genetic variants are listed in descending order by % cfDNA.
- Genetic variants are detected in the circulating cell-free DNA isolated from this patient's blood specimen. These genetic variants are cancer-associated somatic variants, some of which have been associated with either increased or reduced clinical response to specific treatment. “Minor Alterations” are defined as those alterations detected at less than 10% of the allele frequency of “Major Alterations”. A Major Alteration is the predominant alteration at a locus. The detected allele frequencies of these alterations (Table 5) and associated treatments for this patient are annotated.
- All genes listed in Tables 3 and 4 are analyzed as part of the test. Amplification is not detected for ERBB2, EGFR, or MET in the circulating cell-free DNA isolated from this patient's blood specimen.
- Patient test results comprising the genetic variants are listed in Table 6.
- Referring to Table 6, at 13 positions, a nucleotide detected at a frequency of at least 98.8% in the sample is different from the nucleotide in the reference sequence, indicating homozygosity at these loci. For example, in the KRAS gene, at position 25368462, T was detected rather than reference nucleotide C in 100% of cases.
- At 35 positions, a nucleotide detected at between 41.4% and 55% frequency in the sample is different than a nucleotide in the reference sequence, indicating heterozygosity at these loci. For example, in the ALK gene, at position 29455267, G was detected rather than reference nucleotide A in 50% of cases.
- At 3 positions a nucleotide detected at less than 9% frequency is different than a nucleotide in the reference sequence. These include variants in BRAF (140453136 A>T, 8.9%), NRAS (115256530 G>T 2.6%) and JAK2 (5073770 G>T 1.5%). They are presumed to be somatic mutations from cancer DNA.
- The relative amounts of tumor-associated genetic variants are calculated. The ratio of amounts of BRAF:NRAS:JAK2 is 8.9:2.6:1.5, or 1:0.29:0.17. From this result one can infer the presence of tumor heterogeneity. For example, one possible interpretation is that 100% of tumor cells contain a variant in BRAF, 83% contain variants in BRAF and NRAS, and 17% contain variants in BRAF, NRAS and JAK2. However, analysis of CNV may show amplification of BRAF, in which case 100% of tumor cells may have variants in both BRAF and NRAS.
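The normalization of variant amounts to the most abundant variant can be sketched as follows. This is a minimal illustration using the percentages quoted in the paragraph above (8.9, 2.6 and 1.5); the function name is hypothetical, and the sketch makes no attempt to infer clonal structure, which would also require copy number information as the text notes.

```python
def relative_amounts(allele_fractions: dict[str, float]) -> dict[str, float]:
    """Express each variant's allele fraction relative to the largest one."""
    if not allele_fractions:
        raise ValueError("at least one variant is required")
    top = max(allele_fractions.values())
    if top <= 0:
        raise ValueError("the largest allele fraction must be positive")
    return {gene: round(value / top, 2) for gene, value in allele_fractions.items()}


if __name__ == "__main__":
    ratios = relative_amounts({"BRAF": 8.9, "NRAS": 2.6, "JAK2": 1.5})
    print(ratios)
```

For these inputs the normalized values are 1.0, 0.29 and 0.17, matching the 1:0.29:0.17 ratio stated in the example.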
TABLE 3 Genes in which all exons are sequenced
Gene | LOD | Gene | LOD |
---|---|---|---|
ALK | <0.1% | APC | <0.1% |
AR | <0.1% | BRAF | <0.1% |
CDKN2A | <0.1% | EGFR | <0.1% |
ERBB2 | <0.1% | FBXW7 | <0.1% |
KRAS | <0.1% | MET | <0.1% |
MYC | <0.1% | NOTCH1 | <0.1% |
NRAS | <0.1% | PIK3CA | <0.1% |
PTEN | <0.1% | PROC | <0.1% |
RB1 | <0.1% | TP53 | <0.1% |
LOD: Limit of Detection. The minimum detectable mutant allele frequency for this specimen in which 80% of somatic variants is detected.
TABLE 4 Genes in which exons with a previously reported somatic mutation are sequenced
Gene | LOD | Gene | LOD |
---|---|---|---|
ABL1 | <0.1% | AKT1 | <0.1% |
ATM | <0.1% | CDH1 | <0.1% |
CSF1R | <0.1% | CTNNB1 | <0.1% |
ERBB4 | <0.1% | EZH2 | <0.1% |
FGFR1 | <0.1% | FGFR2 | <0.1% |
FGFR3 | <0.1% | FLT3 | <0.1% |
GNA11 | <0.1% | GNAQ | <0.1% |
GNAS | <0.1% | HNF1A | <0.1% |
HRAS | <0.1% | IDH1 | <0.1% |
IDH2 | <0.1% | JAK2 | <0.1% |
JAK3 | <0.1% | KDR | <0.1% |
KIT | <0.1% | MLH1 | <0.1% |
MPL | <0.1% | NPM1 | <0.1% |
PDGFRA | <0.1% | PTPN11 | <0.1% |
RET | <0.1% | SMAD4 | <0.1% |
SMARCB1 | <0.1% | SMO | <0.1% |
SRC | <0.1% | STK11 | <0.1% |
TERT | <0.1% | VHL | <0.1% |
LOD: Limit of Detection. The minimum detectable mutant allele frequency for this specimen in which 80% of somatic variants is detected.
TABLE 5 Allele frequency of altered circulating cell-free DNA detected in this patient
Gene | cfDNA with alterations (%) | cfDNA without alterations (%) |
---|---|---|
BRAF V600E | 8.9 | 91.1 |
NRAS Q61K | 6.2 | 93.8 |
JAK2 V617F | 1.5 | 98.6 |
TABLE 6 Genomic alterations detected in selected genes
Detected: 51 Genomic Alterations
Gene | Chromosome | Position (nt) | Mutation (nt) | Mutation (AA) | Percentage | Cosmic ID | DBSNP ID |
---|---|---|---|---|---|---|---|
KRAS | 12 | 25368462 | C > T | | 100.0% | | rs4362222 |
ALK | 2 | 29416572 | T > C | I1461V | 100.0% | | rs1670283 |
ALK | 2 | 29444095 | C > T | | 100.0% | | rs1569156 |
ALK | 2 | 29543663 | T > C | Q500Q | 100.0% | | rs2293564 |
ALK | 2 | 29940529 | A > T | P234P | 100.0% | | rs2246745 |
APC | 5 | 112176756 | T > A | V1822D | 100.0% | | rs459552 |
CDKN2A | 9 | 21968199 | C > G | | 100.0% | COSM14251 | rs11515 |
FGFR3 | 4 | 1807894 | G > A | T651T | 100.0% | | rs7688609 |
NOTCH1 | 9 | 139410424 | A > G | | 100.0% | | rs3125006 |
PDGFRA | 4 | 55141055 | A > G | P567P | 100.0% | | rs1873778 |
HRAS | 11 | 534242 | A > G | H27H | 100.0% | COSM249860 | rs12628 |
EGFR | 7 | 55214348 | C > T | N158N | 99.9% | COSM42978 | rs2072454 |
TP53 | 17 | 7579472 | G > C | P72R | 99.8% | | rs1042522 |
APC | 5 | 112162854 | T > C | Y486Y | 55.0% | | rs2229992 |
APC | 5 | 112177171 | G > A | P1960P | 53.8% | | rs465899 |
EGFR | 7 | 55266417 | T > C | T903T | 53.6% | | rs1140475 |
APC | 5 | 112176325 | G > A | G1678G | 53.2% | | rs42427 |
APC | 5 | 112176559 | T > G | S1756S | 53.0% | | rs866006 |
EGFR | 7 | 55229255 | G > A | R521K | 53.0% | | |
MET | 7 | 116397572 | A > G | Q648Q | 52.7% | | |
APC | 5 | 112175770 | G > A | T1493T | 52.7% | | rs41115 |
EGFR | 7 | 55249063 | G > A | Q787Q | 52.6% | | rs1050171 |
NOTCH1 | 9 | 139411714 | T > C | | 52.4% | | rs11145767 |
EGFR | 7 | 55238874 | T > A | T629T | 52.0% | | rs2227984 |
ERBB2 | 17 | 37879588 | A > G | I655V | 51.6% | | rs1136201 |
NOTCH1 | 9 | 139397707 | G > A | D1698D | 51.3% | COSM33747 | rs10521 |
ALK | 2 | 30143499 | G > C | L9L | 51.0% | | rs4358080 |
APC | 5 | 112164561 | G > A | A545A | 51.0% | | rs351771 |
FLT3 | 13 | 28610183 | A > G | | 50.8% | | rs2491231 |
NOTCH1 | 9 | 139418260 | A > G | N104N | 50.5% | | rs4489420 |
ALK | 2 | 29444076 | G > T | | 50.4% | | rs1534545 |
PIK3CA | 3 | 178917005 | A > G | | 50.3% | | rs3729674 |
NOTCH1 | 9 | 139412197 | G > A | | 50.2% | | rs9411208 |
ALK | 2 | 29455267 | A > G | G845G | 50.0% | COSM148825 | rs2256740 |
KIT | 4 | 55593464 | A > C | M541L | 49.9% | COSM28026 | |
NOTCH1 | 9 | 139391636 | G > A | D2185D | 48.9% | | rs2229974 |
PDGFRA | 4 | 55152040 | C > T | V824V | 48.9% | COSM22413 | rs2228230 |
ALK | 2 | 29416481 | T > C | K1491R | 48.9% | COSM1130802 | rs1881420 |
ALK | 2 | 29445458 | G > T | G1125G | 48.6% | | rs3795850 |
NOTCH1 | 9 | 139410177 | T > C | | 48.5% | | rs3124603 |
RET | 10 | 43613843 | G > T | L769L | 48.2% | | rs1800861 |
EGFR | 7 | 55214443 | G > A | | 48.0% | | rs7801956 |
ALK | 2 | 29416366 | G > C | D1529E | 47.2% | | rs1881421 |
EGFR | 7 | 55238087 | C > T | | 45.5% | | rs10258429 |
RET | 10 | 43615633 | C > G | S904S | 44.8% | | rs1800863 |
BRAF | 7 | 140453136 | A > T | V600E | 8.9% | COSM476 | |
NRAS | 1 | 115256530 | G > T | Q61K | 6.2% | COSM580 | rs121913254 |
JAK2 | 9 | 5073770 | G > T | V617F | 1.5% | COSM12600 | rs77375493 |
- Using the method of Example 3, genetic alterations in cell-free DNA of a patient are detected. The sequence reads of these genes include exon and/or intron sequences.
- Double-stranded cell-free DNA is isolated from the plasma of a patient. The cell-free DNA fragments are tagged using 16 different bubble-containing adaptors, each of which comprises a distinctive barcode. The bubble-containing adaptors are attached to both ends of each cell-free DNA fragment by ligation. After ligation, each cell-free DNA fragment can be distinctly identified by the sequence of the distinct barcodes and the two 20 bp endogenous sequences at the ends of the cell-free DNA fragment.
- The tagged cell-free DNA fragments are amplified by PCR. The amplified fragments are enriched using beads comprising oligonucleotide probes that specifically bind to a group of cancer-associated genes. Therefore, cell-free DNA fragments from the group of cancer-associated genes are selectively enriched.
- Sequencing adaptors, each of which comprises a sequencing primer binding site, a sample barcode, and a flow cell sequence, are attached to the enriched DNA molecules. The resulting molecules are amplified by PCR.
- Both strands of the amplified fragments are sequenced. Because each bubble-containing adaptor comprises a non-complementary portion (e.g., the bubble), the sequence of the one strand of the bubble-containing adaptor is different from the sequence of the other strand (complement). Therefore, the sequence reads of amplicons derived from the Watson strand of an original cell-free DNA can be distinguished from amplicons from the Crick strand of the original cell-free DNA by the attached bubble-containing adaptor sequences.
- The sequence reads from one strand of an original cell-free DNA fragment are compared to the sequence reads from the other strand of the original cell-free DNA fragment. If a variant occurs in the sequence reads from only one strand, but not the other strand, of the original cell-free DNA fragment, this variant is identified as an error (e.g., resulting from PCR and/or amplification) rather than a true genetic variant.
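The strand-comparison rule can be sketched with set operations. This is a minimal illustration that assumes variants from each strand have already been expressed in a common coordinate frame as `(position, reference base, alternate base)` tuples; the function name and the tuple layout are hypothetical.

```python
Variant = tuple[int, str, str]  # (position, reference base, alternate base)


def confirm_variants(
    watson: set[Variant], crick: set[Variant]
) -> tuple[set[Variant], set[Variant]]:
    """Split variant calls into those seen on both strands and those seen on one only."""
    confirmed = watson & crick  # present on both strands: treated as true variants
    errors = watson ^ crick  # present on exactly one strand: treated as errors
    return confirmed, errors


if __name__ == "__main__":
    watson_calls = {(140453136, "A", "T"), (140453200, "C", "G")}
    crick_calls = {(140453136, "A", "T")}
    confirmed, errors = confirm_variants(watson_calls, crick_calls)
    print("confirmed:", confirmed)
    print("errors:", errors)
```

Here the call seen on both strands is kept, while the call seen only in Watson-derived reads is flagged as a likely amplification or sequencing error.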
- The sequence reads are grouped into families. Errors in the sequence reads are corrected. The consensus sequence of each family is generated by collapsing.
- A therapeutic intervention is determined to treat the cancer. Cancers with BRAF mutants respond to treatment with vemurafenib, regorafenib, trametinib and dabrafenib. Cancers with NRAS mutants respond to treatment with trametinib. Cancers with JAK2 mutants respond to treatment with ruxolitinib. A therapeutic intervention including administration of trametinib and ruxolitinib is determined to be more effective against this cancer than treatment with any one of the aforementioned drugs alone. The subject is treated with a combination of trametinib and ruxolitinib at a dose ratio of 5:1.
- After several rounds of treatment, the cfDNA from the subject is tested again for the presence of tumor heterogeneity. Results show that the BRAF:NRAS:JAK2 ratio is now about 4:2:1.5. This indicates that the therapeutic intervention has reduced the number of cells with the BRAF and NRAS mutants, and has halted growth of cells with JAK2 mutants. A second therapeutic intervention is determined in which trametinib and ruxolitinib are determined to be effective in a dose ratio of 1:1. The subject is given a course of chemotherapy at amounts at this ratio. Subsequent testing shows that BRAF, NRAS and JAK2 mutants are present in cfDNA at amounts below 1%.
- A blood sample is collected from an individual with melanoma pre-treatment and the patient is determined to have a BRAF V600E mutation at a concentration of 2.8% and no detectable NRAS mutations using cell-free DNA analysis. The patient is put on an anti-BRAF therapy (dabrafenib). After 3 weeks, another blood sample is collected and tested. The BRAF V600E level is determined to have dropped to 0.1%. The therapy is stopped and the test repeated every 2 weeks. The BRAF V600E level rises again and therapy is reinitiated when the BRAF V600E level rises to 1.5%. Therapy is again stopped when the level drops down to 0.1% again. This cycle is repeated.
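The stop-and-restart cycle in this example behaves like a two-threshold (hysteresis) rule. The sketch below only restates that rule in code, using the 0.1% and 1.5% levels from the example; the function name, the input series, and the convention that each returned flag describes the therapy state after the corresponding measurement are all assumptions made for illustration.

```python
def therapy_schedule(
    levels: list[float],
    stop_at: float = 0.1,  # mutant level (%) at which therapy is paused in the example
    restart_at: float = 1.5,  # mutant level (%) at which therapy is resumed in the example
    on_therapy: bool = True,
) -> list[bool]:
    """Return, for each measured level, whether therapy is on after that measurement."""
    states = []
    for level in levels:
        if on_therapy and level <= stop_at:
            on_therapy = False  # level has dropped to the stop threshold
        elif not on_therapy and level >= restart_at:
            on_therapy = True  # level has risen back to the restart threshold
        states.append(on_therapy)
    return states


if __name__ == "__main__":
    measured = [2.8, 0.1, 0.6, 1.5, 0.4, 0.1]  # hypothetical BRAF V600E levels (%)
    print(therapy_schedule(measured))
```

Using two separate thresholds rather than one avoids toggling therapy on every small fluctuation around a single cutoff.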
- Copy number variations in a patient sample are determined. Methods for determining copy number variations can include molecular tracking and upsampling, as described above. A hidden Markov model based on expected locations of origins of replication is used to remove the effect of replication origin proximity from the estimated copy number variations in the patient sample. The standard deviation of copy number variations for each gene is subsequently reduced by 40%. The replication origin proximity model is also used to infer cell-free tumor burden in the patient.
- In many cases, the level of cell-free tumor-derived DNA may be low or below the detection limit of a particular technology. This can be the case when the number of human genome equivalents of tumor-derived DNA in plasma is below 1 copy per 5 mL. Radiation and chemotherapies have been shown to affect rapidly dividing cells more than stable, healthy cells, hence their efficacy in treating advanced cancer patients. Hence, a procedure with minimal adverse effects is administered to a patient before blood collection to preferentially increase the fraction of tumor-derived DNA collected. For example, a low dose of chemotherapy could be administered to the patient and a blood sample could be collected within 24 hours, 48 hours, 72 hours or less than 1 week. For effective chemotherapies, this blood sample contains higher concentrations of cell-free tumor-derived DNA due to potentially higher rates of cell death of cancer cells. Alternatively, low-dose radiation therapy is applied via a whole-body radiographic instrument or locally to the affected regions instead of low-dose chemotherapy. Other procedures are envisioned, including subjecting a patient to ultrasound, sound waves, exercise, stress, etc.
- All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.
- While preferred embodiments of the present disclosure have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the disclosure. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/699,968 US20240141432A9 (en) | 2014-12-31 | 2022-03-21 | Detection and treatment of disease exhibiting disease cell heterogeneity and systems and methods for communicating test results |
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201462098426P | 2014-12-31 | 2014-12-31 | |
US201562155763P | 2015-05-01 | 2015-05-01 | |
PCT/US2015/067717 WO2016109452A1 (en) | 2014-12-31 | 2015-12-28 | Detection and treatment of disease exhibiting disease cell heterogeneity and systems and methods for communicating test results |
US15/431,395 US20170260590A1 (en) | 2014-12-31 | 2017-02-13 | Detection and treatment of disease exhibiting disease cell heterogeneity and systems and methods for communicating test results |
US17/000,010 US20210040565A1 (en) | 2014-12-31 | 2020-08-21 | Detection and treatment of disease exhibiting disease cell heterogeneity and systems and methods for communicating test results |
US17/462,906 US20210395837A1 (en) | 2014-12-31 | 2021-08-31 | Detection and treatment of disease exhibiting disease cell heterogeneity and systems and methods for communicating test results |
US17/699,968 US20240141432A9 (en) | 2014-12-31 | 2022-03-21 | Detection and treatment of disease exhibiting disease cell heterogeneity and systems and methods for communicating test results |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/462,906 Continuation US20210395837A1 (en) | 2014-12-31 | 2021-08-31 | Detection and treatment of disease exhibiting disease cell heterogeneity and systems and methods for communicating test results |
Publications (2)
Publication Number | Publication Date |
---|---|
US20220213562A1 (en) | 2022-07-07 |
US20240141432A9 (en) | 2024-05-02 |
Family
ID=56284976
Family Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/431,395 Pending US20170260590A1 (en) | 2014-12-31 | 2017-02-13 | Detection and treatment of disease exhibiting disease cell heterogeneity and systems and methods for communicating test results |
US17/000,010 Pending US20210040565A1 (en) | 2014-12-31 | 2020-08-21 | Detection and treatment of disease exhibiting disease cell heterogeneity and systems and methods for communicating test results |
US17/462,906 Pending US20210395837A1 (en) | 2014-12-31 | 2021-08-31 | Detection and treatment of disease exhibiting disease cell heterogeneity and systems and methods for communicating test results |
US17/699,968 Pending US20240141432A9 (en) | 2014-12-31 | 2022-03-21 | Detection and treatment of disease exhibiting disease cell heterogeneity and systems and methods for communicating test results |
Country Status (9)
Country | Link |
---|---|
US (4) | US20170260590A1 (en) |
EP (3) | EP3766986B1 (en) |
JP (4) | JP6783768B2 (en) |
CN (2) | CN107406876B (en) |
AU (3) | AU2015374259B2 (en) |
CA (1) | CA2972433A1 (en) |
ES (2) | ES2828279T3 (en) |
GB (1) | GB2552267B (en) |
WO (1) | WO2016109452A1 (en) |
- 2015
  - 2015-12-28 CN CN201580077268.8A patent/CN107406876B/en active Active
  - 2015-12-28 EP EP20179648.9A patent/EP3766986B1/en active Active
  - 2015-12-28 EP EP15876120.5A patent/EP3240911B1/en active Active
  - 2015-12-28 CN CN202111138455.6A patent/CN113930507A/en active Pending
  - 2015-12-28 ES ES15876120T patent/ES2828279T3/en active Active
  - 2015-12-28 CA CA2972433A patent/CA2972433A1/en active Pending
  - 2015-12-28 JP JP2017535008A patent/JP6783768B2/en active Active
  - 2015-12-28 ES ES20179648T patent/ES2923602T3/en active Active
  - 2015-12-28 GB GB1712299.5A patent/GB2552267B/en active Active
  - 2015-12-28 EP EP22176398.0A patent/EP4123032A1/en active Pending
  - 2015-12-28 AU AU2015374259A patent/AU2015374259B2/en active Active
  - 2015-12-28 WO PCT/US2015/067717 patent/WO2016109452A1/en active Application Filing
- 2017
  - 2017-02-13 US US15/431,395 patent/US20170260590A1/en active Pending
- 2020
  - 2020-04-17 JP JP2020073957A patent/JP7145907B2/en active Active
  - 2020-08-21 US US17/000,010 patent/US20210040565A1/en active Pending
  - 2020-11-04 AU AU2020264326A patent/AU2020264326B2/en active Active
- 2021
  - 2021-08-31 US US17/462,906 patent/US20210395837A1/en active Pending
  - 2021-11-11 JP JP2021184052A patent/JP7458360B2/en active Active
- 2022
  - 2022-03-21 US US17/699,968 patent/US20240141432A9/en active Pending
- 2023
  - 2023-07-20 AU AU2023206196A patent/AU2023206196A1/en active Pending
- 2024
  - 2024-01-04 JP JP2024000141A patent/JP2024029174A/en active Pending
Also Published As
Publication number | Publication date |
---|---|
US20210395837A1 (en) | 2021-12-23 |
AU2023206196A1 (en) | 2023-08-10 |
EP3240911A4 (en) | 2018-11-21 |
EP3766986B1 (en) | 2022-06-01 |
AU2020264326A1 (en) | 2020-11-26 |
CN107406876A (en) | 2017-11-28 |
EP3240911A1 (en) | 2017-11-08 |
EP3766986A1 (en) | 2021-01-20 |
ES2923602T3 (en) | 2022-09-28 |
JP2024029174A (en) | 2024-03-05 |
AU2015374259A1 (en) | 2017-08-10 |
WO2016109452A1 (en) | 2016-07-07 |
GB2552267A (en) | 2018-01-17 |
JP7458360B2 (en) | 2024-03-29 |
JP2018507682A (en) | 2018-03-22 |
US20240141432A9 (en) | 2024-05-02 |
CN113930507A (en) | 2022-01-14 |
AU2020264326B2 (en) | 2023-05-18 |
JP2020127408A (en) | 2020-08-27 |
EP4123032A1 (en) | 2023-01-25 |
US20210040565A1 (en) | 2021-02-11 |
GB2552267B (en) | 2020-06-10 |
JP7145907B2 (en) | 2022-10-03 |
ES2828279T3 (en) | 2021-05-25 |
US20170260590A1 (en) | 2017-09-14 |
JP2022024040A (en) | 2022-02-08 |
AU2015374259B2 (en) | 2020-08-13 |
CA2972433A1 (en) | 2016-07-07 |
EP3240911B1 (en) | 2020-08-26 |
GB201712299D0 (en) | 2017-09-13 |
JP6783768B2 (en) | 2020-11-11 |
CN107406876B (en) | 2021-09-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2020264326B2 (en) | | Detection and treatment of disease exhibiting disease cell heterogeneity and systems and methods for communicating test results |
Pleasance et al. | | Pan-cancer analysis of advanced patient tumors reveals interactions between therapy and genomic landscapes |
JP2022532897A (en) | | Systems and methods for multi-label cancer classification |
US20220367006A1 (en) | | Methods and systems for dynamic variant thresholding in a liquid biopsy assay |
JP5955557B2 (en) | | Pathways underlying hereditary pancreatic tumorigenesis and hereditary pancreatic oncogenes |
AU2016293025A1 (en) | | System and methodology for the analysis of genomic data obtained from a subject |
CN113151474A (en) | | Plasma DNA mutation analysis for cancer detection |
JP2022533137A (en) | | Systems and methods for assessing tumor fractions |
US20230162815A1 (en) | | Methods and systems for accurate genotyping of repeat polymorphisms |
US20240071628A1 (en) | | Database for therapeutic interventions |
US20240052419A1 (en) | | Methods and systems for detecting genetic variants |
KR20200044123A (en) | | Comprehensive genomic transcriptomic tumor-normal gene panel analysis for enhanced precision in patients with cancer |
Williams et al. | | Tracking clonal evolution of drug resistance in ovarian cancer patients by exploiting structural variants in cfDNA |
WO2023183750A1 (en) | | Methods and systems for determining tumor heterogeneity |
WO2023183751A1 (en) | | Characterization of tumor heterogeneity as a prognostic biomarker |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AS | Assignment | Owner name: GUARDANT HEALTH, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ELTOUKHY, HELMY;TALASAZ, AMIRALI;KERMANI, BAHRAM GHAFFARZADEH;AND OTHERS;SIGNING DATES FROM 20161116 TO 20170428;REEL/FRAME:059583/0661 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |