WO2023107570A1 - Expression-weighted tumor mutational burden as an oncology biomarker - Google Patents
Expression-weighted tumor mutational burden as an oncology biomarker
- Publication number
- WO2023107570A1 (PCT/US2022/052153)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cancer
- tmb
- genes
- rna
- sequencing
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/40—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for data related to laboratory analysis, e.g. patient specimen analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/10—Gene or protein expression profiling; Expression-ratio estimation or normalisation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H15/00—ICT specially adapted for medical reports, e.g. generation or transmission thereof
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H20/00—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
- G16H20/10—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H20/00—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
- G16H20/40—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to mechanical, radiation or invasive therapies, e.g. surgery, laser therapy, dialysis or acupuncture
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
Definitions
- The present disclosure relates to the field of determining tumor mutational burden (TMB) using sequencing data. More particularly, it relates to methods for expression-weighted tumor mutational burden (EW-TMB) analysis and to the use of EW-TMB as an oncology biomarker.
- TMB: tumor mutational burden
- EW-TMB: expression-weighted tumor mutational burden
- ICIs: immune checkpoint inhibitors
- New methods of EW-TMB analysis can be used to prognosticate patient outcomes and to determine treatment prescriptions for patients.
- This disclosure provides a method comprising: obtaining an RNA sample; performing RNA sequencing (RNA Seq); analyzing the RNA sequencing data to determine an expression level E for each gene i (Ei) within a set of genes X; analyzing the RNA sequencing data to determine a number of coding mutations M for each gene i (Mi) within the same set of genes X; and determining an EW-TMB based on Ei and Mi for each gene i within the set of genes X.
- RNA Seq: RNA sequencing
- Figure 1 depicts an illustration of the genome with mutations in coding regions (exons), introns, and intergenic regions. Only mutations in coding regions result in potential neoantigens.
- Figure 2 comprises panels A and B.
- Figure 2 depicts an illustration of a comparison of low EW-TMB samples (panel A) and high EW-TMB samples (panel B), showing E (expression analysis) and M (mutation analysis).
- Low EW-TMB samples present few mutations on a limited amount of genes.
- FIG. 3 depicts an illustration of the EW-TMB analysis workflow.
- the RNA is extracted and either used as input for next-generation sequencing (NGS) or long-read sequencing.
- NGS next-generation sequencing
- the RNA is reverse transcribed, and adapters are attached throughout library preparation steps.
- the adapters are attached to the crude RNA molecules.
- the resulting reads are aligned and assembled as necessary, and the data set is further analyzed to establish E (analysis of the expression level) and M (analysis of the presence of mutations). From E and M, the EW-TMB is computed.
- Figure 4 depicts an illustration of RNA library preparation methods for NGS read-out.
- RNA Seq methods routinely used can be adapted to collect the dataset necessary for the EW-TMB calculation.
- Figure 5 depicts an illustration of the EW-TMB Logistic and Heaviside functions.
- composition provided herein is specifically envisioned for use with any applicable method provided herein.
- any and all combinations of the members that make up that grouping of alternatives is specifically envisioned. For example, if an item is selected from a group consisting of A, B, C, and D, the inventors specifically envision each alternative individually (e.g., A alone, B alone, etc.), as well as combinations such as A, B, and D; A and C; B and C; etc.
- the term “and/or” when used in a list of two or more items means any one of the listed items by itself or in combination with any one or more of the other listed items.
- the expression “A and/or B” is intended to mean either or both of A and B - i.e., A alone, B alone, or A and B in combination.
- the expression “A, B and/or C” is intended to mean A alone, B alone, C alone, A and B in combination, A and C in combination, B and C in combination, or A, B, and C in combination.
- range is understood to be inclusive of the edges of the range as well as any number between the defined edges of the range.
- “between 1 and 10” includes any number between 1 and 10, as well as the number 1 and the number 10.
- a compound or “at least one compound” may include a plurality of compounds, including mixtures thereof.
- plural refers to any number greater than one.
- DNA refers to deoxyribonucleic acid. DNA can be either single-stranded or double-stranded. DNA typically comprises four nucleotides: cytosine (C), guanine (G), adenine (A), and thymine (T). In an aspect, the sequence of a DNA molecule provided herein comprises one or more degenerate nucleotides. As used herein, a “degenerate nucleotide” refers to a nucleotide that can perform the same function or yield the same output as a structurally different nucleotide.
- Non-limiting examples of degenerate nucleotides include a C, G, or T nucleotide (B); an A, G, or T nucleotide (D); an A, C, or T nucleotide (H); a G or T nucleotide (K); an A or C nucleotide (M); any nucleotide (N); an A or G nucleotide (R); a G or C nucleotide (S); an A, C, or G nucleotide (V); an A or T nucleotide (W), and a C or T nucleotide (Y).
- RNA refers to ribonucleic acid. RNA can be either single-stranded or double-stranded. RNA typically comprises four nucleotides: cytosine (C), guanine (G), adenine (A), and uracil (U). In an aspect, the sequence of an RNA molecule provided herein comprises one or more degenerate nucleotides. As used herein, a “degenerate nucleotide” refers to a nucleotide that can perform the same function or yield the same output as a structurally different nucleotide.
- Non-limiting examples of degenerate nucleotides include a C, G, or U nucleotide (B); an A, G, or U nucleotide (D); an A, C, or U nucleotide (H); a G or U nucleotide (K); an A or C nucleotide (M); any nucleotide (N); an A or G nucleotide (R); a G or C nucleotide (S); an A, C, or G nucleotide (V); an A or U nucleotide (W), and a C or U nucleotide (Y).
- B C, G, or U nucleotide
- D A, G, or U nucleotide
- H A, C, or U nucleotide
- K G or U nucleotide
- M A or C nucleotide
- N any nucleotide
- R A or G nucleotide
- S G or C nucleotide
- V A, C, or G nucleotide
- W A or U nucleotide
- Y C or U nucleotide
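The degenerate RNA nucleotide codes listed above can be held in a small lookup table. The following is a minimal sketch rather than anything taken from the disclosure; the names `RNA_DEGENERATE_CODES` and `expand_degenerate` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Each code maps to the bases it can stand for, as listed in the text above.
# The four plain bases are added so that ordinary sequences pass through unchanged.
RNA_DEGENERATE_CODES = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "B": "CGU", "D": "AGU", "H": "ACU", "K": "GU",
    "M": "AC", "N": "ACGU", "R": "AG", "S": "CG",
    "V": "ACG", "W": "AU", "Y": "CU",
}


def expand_degenerate(sequence):
    """Return every concrete RNA sequence matched by a degenerate sequence."""
    choices = [RNA_DEGENERATE_CODES[base] for base in sequence.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

The number of expansions grows multiplicatively with each degenerate position, so this helper is only practical for short sequences such as primers.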
- RNA is present in, or obtained from, a sample (e.g., a “biological sample”).
- a “biological sample” refers to any biological material that is capable of being analyzed by or subjected to the methods and/or compositions provided herein. Any suitable method known in the art can be used to obtain a nucleic acid (e.g., an RNA molecule) from a sample.
- a sample comprises RNA.
- a sample comprises RNA and DNA.
- a sample comprises cells.
- a sample comprises cell-free nucleic acids.
- “sample” and “biological sample” are intended to be interchangeable.
- a sample can include at least one cell, fetal cell, cell culture, tissue specimen, blood, serum, plasma, saliva, urine, tear, vaginal secretion, sweat, lymph fluid, cerebrospinal fluid, mucosa secretion, peritoneal fluid, ascites fluid, fecal matter, body exudates, umbilical cord blood, chorionic villi, amniotic fluid, embryonic tissue, multicellular embryo, lysate, extract, solution, or reaction mixture suspected of containing nucleic acids.
- a sample can comprise DNA, RNA, or both.
- a sample is from a healthy organism.
- a sample is from a diseased organism.
- a sample is from a mutagenized sample.
- the nucleic acids obtained from a sample can be converted to cDNA.
- a sample includes one or more cells associated with a tumor.
- a sample includes one or more circulating tumor cells (CTC).
- CTC circulating tumor cells
- a sample includes one or more cells not associated with a tumor.
- a cell is a prokaryotic cell.
- a cell is a eukaryotic cell.
- a cell is an animal cell.
- a cell is a plant cell.
- a cell is a fungal cell.
- a cell is a mammal cell.
- a cell is a primate cell.
- a cell is a human cell.
- a cell is a human cancer cell.
- a sample is obtained from a subject.
- a “subject” refers to an animal (e.g., without being limiting, a mammal, reptile, bird, fish, amphibian) or other organism, such as, without being limiting, a plant or fungus.
- a subject can be a healthy individual, an individual that has or is suspected of having a disease or a predisposition to the disease, or an individual that is in need of therapy or suspected of needing therapy.
- the terms “individual” and “subject” are intended to be interchangeable.
- a subject is a eukaryote.
- a subject is a prokaryote.
- a subject is a virus. In an aspect, a subject is an animal. In an aspect, a subject is a plant. In an aspect, a subject is a fungus. In an aspect, a subject is a mammal. In an aspect, a subject is a rodent. In an aspect, a subject is a mouse. In an aspect, a subject is a rat. In an aspect, a subject is a rabbit. In an aspect, a subject is a cat. In an aspect, a subject is a dog. In an aspect, a subject is a horse. In an aspect, a subject is a cow. In an aspect, a subject is a pig. In an aspect, a subject is a primate.
- a subject is a monkey. In an aspect, a subject is a chimpanzee. In an aspect, a subject is a human. In an aspect, a subject is a bird. In an aspect, a subject is a chicken. In an aspect, a subject is a fish. In an aspect, a subject is a reptile. In an aspect, a subject is an amphibian. In an aspect, a subject is an insect. In an aspect, a subject is an arachnid. In an aspect, a subject is a crustacean. In an aspect, a subject is a mollusk. In an aspect, a subject is a nematode. In an aspect, a subject is an annelid.
- a subject has, or is suspected of having, cancer.
- a subject has, or is suspected of having, colorectal cancer.
- a subject has, or is suspected of having, breast cancer.
- a subject has, or is suspected of having, gastric cancer.
- a subject has, or is suspected of having, endometrial cancer.
- a subject has, or is suspected of having, bladder cancer.
- a subject has, or is suspected of having, cervical cancer.
- a subject has, or is suspected of having, colon cancer.
- a subject has, or is suspected of having, head and neck cancer.
- a subject has, or is suspected of having, liver cancer.
- a subject has, or is suspected of having, renal cell cancer. In an aspect, a subject has, or is suspected of having, skin cancer. In an aspect, a subject has, or is suspected of having, stomach cancer. In an aspect, a subject has, or is suspected of having, rectal cancer. In an aspect, a subject has, or is suspected of having, lymphoma. In an aspect, a subject has, or is suspected of having, non-Hodgkin lymphoma. In an aspect, a subject has, or is suspected of having, Hodgkin lymphoma. In an aspect, a subject has, or is suspected of having, solid tumors. In an aspect, a subject has, or is suspected of having, non-solid tumors. In an aspect, a subject has, or is suspected of having, a genetic-based disease, disorder, or condition.
- RNA can originate from and/or be isolated from any types of cancer for use with the methods and compositions provided herein.
- Samples can be obtained from any type of cancer.
- cancers include biliary tract cancer, bladder cancer, transitional cell carcinoma, urothelial carcinoma, brain cancer, gliomas, astrocytomas, breast carcinoma, metaplastic carcinoma, cervical cancer, cervical squamous cell carcinoma, rectal cancer, colorectal carcinoma, colon cancer, hereditary nonpolyposis colorectal cancer, colorectal adenocarcinomas, gastrointestinal stromal tumors (GISTs), endometrial carcinoma, endometrial stromal sarcomas, esophageal cancer, esophageal squamous cell carcinoma, esophageal adenocarcinoma, ocular melanoma, uveal melanoma, gallbladder carcinomas, gallbladder adenocarcinoma, renal cell carcinoma
- prostate cancer, prostate adenocarcinoma, skin cancer, melanoma, malignant melanoma, cutaneous melanoma, small intestine carcinomas, stomach cancer, gastric carcinoma, gastrointestinal stromal tumor (GIST), uterine cancer, or uterine sarcoma.
- a sample comprises a cell.
- a sample comprises a tissue.
- a sample comprises an organ.
- a sample comprises blood.
- a sample comprises plasma.
- a sample comprises urine.
- a sample comprises feces. Additional non-limiting examples of samples include serum, sputum, semen, vaginal fluid, synovial fluid, spinal fluid, cerebral spinal fluid, amniotic fluid, peritoneal fluid, interstitial fluid, bone marrow aspirate, and saliva.
- a sample provided herein is obtained from a source selected from the group consisting of formalin-fixed paraffin-embedded tissue, whole blood, peripheral blood mononuclear cell, plasma, buffy coat, fresh frozen tissue, fresh tissue, biopsy, and tissue or cells from a surgical margin.
- a sample provided herein is obtained from formalin-fixed paraffin-embedded tissue.
- a sample provided herein is obtained from whole blood.
- a sample provided herein is obtained from peripheral blood mononuclear cell.
- a sample provided herein is obtained from plasma.
- a sample provided herein is obtained from buffy coat.
- a sample provided herein is obtained from fresh frozen tissue.
- a sample provided herein is obtained from fresh tissue. In an aspect, a sample provided herein is obtained from a biopsy. In an aspect, a sample provided herein is obtained from tissue or cells from a surgical margin. In an aspect, a sample provided herein is a human sample. In an aspect, a sample provided herein is an animal sample.
- a sample provided herein comprises less than or equal to 100 ng of RNA. In an aspect, a sample provided herein comprises less than or equal to 75 ng of RNA. In an aspect, a sample provided herein comprises less than or equal to 50 ng of RNA. In an aspect, a sample provided herein comprises less than or equal to 25 ng of RNA. In an aspect, a sample provided herein comprises less than or equal to 20 ng of RNA. In an aspect, a sample provided herein comprises less than or equal to 15 ng of RNA. In an aspect, a sample provided herein comprises less than or equal to 10 ng of RNA. In an aspect, a sample provided herein comprises less than or equal to 5 ng of RNA. In an aspect, a sample provided herein comprises less than or equal to 1 ng of RNA.
- the sample is a tumor sample or a sample derived from a tumor.
- the sample is acquired from a solid tumor.
- the sample is acquired from a non-solid tumor.
- the sample is obtained from a subject having a cancer.
- the sample is obtained from a healthy subject.
- the sample is obtained from a subject who is receiving a therapy or has received a therapy.
- the sample is obtained from a subject who is receiving immune checkpoint blockade therapy.
- the sample is obtained from a subject who has received immune checkpoint blockade therapy.
- the sample is obtained from a subject who is receiving neoadjuvant therapy.
- the sample is obtained from a subject who has received neoadjuvant therapy. In an aspect, the sample is obtained from a subject who is receiving adjuvant therapy. In an aspect, the sample is obtained from a subject who has received adjuvant therapy.
- cancer refers to malignant cancers, premalignant cancers, solid tumors, non-solid tumors, liquid tumors, soft tissue tumors, blood cancers, metastatic lesions, carcinoma, sarcoma, myeloma, leukemia, and lymphoma.
- “cancer” and “tumor” are used interchangeably herein.
- a coding mutation can include, but is not limited to, a single nucleotide polymorphism (SNP), a base substitution, a point mutation, an indel, an inversion, a duplication, an amplification, a translocation, an inter- or intra-chromosomal rearrangement, a missense mutation, a nonsense mutation, a frameshift mutation, and a silent mutation.
- SNP single nucleotide polymorphism
- a coding mutation is a SNP.
- a coding mutation is a base substitution.
- a coding mutation is a point mutation.
- a coding mutation is an indel.
- a coding mutation is an insertion.
- a coding mutation is a deletion. In an aspect, a coding mutation is an inversion. In an aspect, a coding mutation is a duplication. In an aspect, a coding mutation is an amplification. In an aspect, a coding mutation is a translocation. In an aspect, a coding mutation is an inter-chromosomal rearrangement. In an aspect, a coding mutation is an intra-chromosomal rearrangement. In an aspect, a coding mutation is a missense mutation. In an aspect, a coding mutation is a nonsense mutation. In an aspect, a coding mutation is a frameshift mutation.
- a coding mutation is a silent mutation.
- an “indel” refers to an insertion, a deletion, or both, of one or more nucleotides in a nucleic acid of a cell.
- an indel includes both an insertion and a deletion of one or more nucleotides, where both the insertion and deletion are nearby on the nucleic acid.
- the indel results in a net change in the total number of nucleotides.
- the indel results in no net change in the total number of nucleotides.
- the indel results in a net change of about 1 to 10 nucleotides.
- the indel results in a net change of about 1 to 20 nucleotides. In an aspect, the indel results in a net change of about 1 to 30 nucleotides. In an aspect, the indel results in a net change of about 1 to 40 nucleotides. In an aspect, the indel results in a net change of about 1 to 50 nucleotides. In an aspect, the indel results in a net change of about 1 to 75 nucleotides. In an aspect, the indel results in a net change of about 1 to 100 nucleotides. In an aspect, the coding mutation results in a net change in the total number of nucleotides.
- the coding mutation results in a net change of about 1 to 10 nucleotides. In an aspect, the coding mutation results in a net change of about 1 to 25 nucleotides. In an aspect, the coding mutation results in a net change of about 1 to 50 nucleotides. In an aspect, the coding mutation results in a net change of about 1 to 75 nucleotides. In an aspect, the coding mutation results in a net change of about 1 to 100 nucleotides. In an aspect, the coding mutation results in no net change in the total number of nucleotides.
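The distinction between an indel with a net change and one with no net change can be made concrete with a short calculation. This is an illustrative sketch only; the function name `net_length_change` and the convention that a positive value means a net gain are assumptions, not part of the disclosure.

```python
def net_length_change(inserted, deleted):
    """Net change in nucleotide count for an indel.

    `inserted` and `deleted` are the inserted and deleted nucleotide strings
    (either may be empty). A positive result is a net gain, a negative result
    a net loss, and zero means the insertion and deletion cancel in length.
    """
    return len(inserted) - len(deleted)
```

For example, an event that deletes three nucleotides and inserts three different ones changes the sequence but yields a net change of zero, whereas a pure insertion of two nucleotides yields a net change of two.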
- a set of genes X provided herein comprises between 1 gene and 10,000 genes. In an aspect, a set of genes X provided herein comprises between 1 gene and 100,000 genes. In an aspect, a set of genes X provided herein comprises between 1 gene and 1000 genes. In an aspect, a set of genes X provided herein comprises between 1 gene and 500 genes. In an aspect, a set of genes X provided herein comprises between 1 gene and 100 genes. In an aspect, a set of genes X provided herein comprises between 1 gene and 10 genes. In an aspect, a set of genes X provided herein comprises between 5 genes and 10 genes. In an aspect, a set of genes X provided herein comprises between 5 genes and 50 genes. In an aspect, a set of genes X provided herein comprises between 5 genes and 100 genes. In an aspect, a set of genes X provided herein comprises between 10 genes and 50 genes.
- a set of genes X provided herein comprises at least 1 gene or fragment thereof. In an aspect, a set of genes X provided herein comprises at least 2 genes or fragments thereof. In an aspect, a set of genes X provided herein comprises at least 5 genes or fragments thereof. In an aspect, a set of genes X provided herein comprises at least 10 genes or fragments thereof. In an aspect, a set of genes X provided herein comprises at least 25 genes or fragments thereof. In an aspect, a set of genes X provided herein comprises at least 50 genes or fragments thereof. In an aspect, a set of genes X provided herein comprises at least 100 genes or fragments thereof.
- a set of genes X provided herein comprises at least 1000 genes or fragments thereof. In an aspect, a set of genes X provided herein comprises at least 10,000 genes or fragments thereof. In an aspect, a set of genes X provided herein comprises at least 100,000 genes or fragments thereof. In an aspect, a set of genes X provided herein comprises a pre-determined number of genes or fragments thereof.
- a set of genes X comprises substantially all exons in a genome. In an aspect, a set of genes X comprises a subset of exome. In an aspect, a set of genes X comprises one or more exons, or fragments thereof. In an aspect, a set of genes X comprises one or more genes or fragments thereof associated with cancer or cancerous phenotypes. In an aspect, a set of genes X comprises one or more genes or fragments thereof not associated with cancer or cancerous phenotypes.
- a set of genes X comprises at least one exon, or fragment thereof, which includes one or more mutations selected from the group consisting of single nucleotide polymorphisms (SNPs), base substitutions, point mutations, indels, inversions, duplications, amplifications, translocations, inter- and intra-chromosomal rearrangements, missense mutations, nonsense mutations, frameshift mutations, and silent mutations.
- SNPs single nucleotide polymorphisms
- a set of genes X comprises one or more exons or fragments thereof with mutations that are associated with cancer.
- a set of genes X comprises one or more exons or fragments thereof with mutations in which the gene products are associated with cancer.
- TMB tumor mutation burden
- Expression-Weighted TMB determines the level of mutations in a pre-determined set of genes from factors including, but not limited to, the expression level of each gene within the pre-determined set of genes, and the number of coding mutations for each gene within the pre-determined set of genes.
- EW-TMB can be determined for the whole exome.
- EW-TMB can be determined for a subset of exome.
- EW-TMB determined from a subset of exome can be extrapolated to determine the whole exome TMB.
- EW-TMB determined from a subset of exome can be extrapolated to determine the whole genome TMB. In an aspect, EW-TMB determined from a sample can be extrapolated to the subject from which the sample is obtained. In an aspect, the EW-TMB determined from a subset of exome correlates with the whole exome TMB. In an aspect, the EW-TMB determined from a subset of exome correlates with the whole genome TMB.
- the EW-TMB determined from a sample or a subject can be compared to EW-TMB determined from a reference sample or a reference subject.
- the reference sample is obtained from subjects in a reference population.
- the reference population comprises patients having the same type of cancer as the subject.
- the reference population comprises patients who are receiving, or have received, the same type of therapy, as the subject.
- the reference population comprises patients who are in the same category as the subject based on one or more demographics.
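The text does not specify how a sample's EW-TMB is compared with a reference population. One simple possibility, shown here purely as an assumption, is to report the percentile rank of the sample within the reference values; `percentile_rank` is a hypothetical helper name.

```python
def percentile_rank(value, reference_values):
    """Percentage of reference values that are less than or equal to `value`."""
    if not reference_values:
        raise ValueError("reference population is empty")
    at_or_below = sum(1 for ref in reference_values if ref <= value)
    return 100.0 * at_or_below / len(reference_values)
```

Restricting `reference_values` to patients with the same cancer type, therapy, or demographic category corresponds to the reference populations described above.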
- the expression level E is expressed linearly or in log format.
- an “expression level E” refers to the RNA expression level for one gene within a pre-defined set of genes.
- RNA sequencing data is used to determine expression level E for each gene within a pre-defined set of genes.
- RNA sequencing data is used to determine expression level E for each gene within a set of genes X.
- the expression level E is expressed linearly.
- the expression level E is expressed in log format.
- the function f(E) is selected from the group consisting of a linear function, a super-linear function, and a sub-linear function.
- the function f(E) is linear in E.
- f(E) = a*E + b, where a and b are constants.
- the function f(E) is super-linear in E.
- f(E) = a*E^2.
- the function f(E) is sublinear in E.
- f(E) = log(E).
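The linear, super-linear, and sub-linear choices for f(E) can be written as ordinary functions. This is a minimal sketch with assumptions: the default constants are arbitrary, the function names are hypothetical, and the natural logarithm is used because the text does not state a base.

```python
import math


def f_linear(E, a=1.0, b=0.0):
    """Linear weighting: a*E + b."""
    return a * E + b


def f_super_linear(E, a=1.0):
    """Super-linear weighting: a*E^2."""
    return a * E ** 2


def f_sub_linear(E):
    """Sub-linear weighting: log(E). Requires E > 0 (natural log assumed)."""
    return math.log(E)
```

Because log(E) is undefined at zero, a practical pipeline would need some convention for unexpressed genes, which the text does not specify.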
- the function f(E) is selected from the group consisting of a sigmoidal function, a logistic function, and a hyperbolic function.
- the function f(E) is sigmoidal in E.
- f(E) = erf(E), where erf is the Gauss error function.
- the function f(E) is a logistic function in E.
- f(E) = a / (1 + e^(-b(E - c))).
- the function f(E) is a hyperbolic function in E.
- a hyperbolic tangent (tanh) function is a non-limiting example of a hyperbolic function.
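The sigmoidal, logistic, and hyperbolic options can be sketched with Python's standard `math` module. The function names and default constants below are assumptions made for illustration; a, b, and c are treated as constants.

```python
import math


def f_erf(E):
    """Sigmoidal weighting using the Gauss error function."""
    return math.erf(E)


def f_logistic(E, a=1.0, b=1.0, c=0.0):
    """Logistic weighting: a / (1 + e^(-b(E - c))).

    `a` is the upper plateau, `b` the steepness, and `c` the midpoint.
    math.exp can overflow when -b*(E - c) is very large.
    """
    return a / (1.0 + math.exp(-b * (E - c)))


def f_tanh(E):
    """Hyperbolic-tangent weighting."""
    return math.tanh(E)
```

At E = c the logistic function returns exactly half of its plateau a, which makes c a natural place to put an expression threshold.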
- the function f(E) is selected from the group consisting of a Heaviside step function, and a rectified linear activation function.
- the function f(E) is a Heaviside step function in E.
- f(Ei) = a if Ei > T
- f(Ei) = b if Ei ≤ T
- T is a pre-determined threshold that may be different for each value of i.
- the function f(E) is a rectified linear activation function in E.
- f(Ei) = a * max(0, Ei - Ti) + b, where a and b are constants, and where Ti is a pre-determined threshold constant.
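The two threshold-based choices for f(E) can be sketched as follows. The function names and defaults are hypothetical. The second branch of the step function is taken to cover every Ei that is not above the threshold, which is an assumption because the comparison symbol is not legible in the source text.

```python
def f_heaviside(E_i, T_i, a=1.0, b=0.0):
    """Step weighting: a when E_i is above the threshold T_i, otherwise b."""
    return a if E_i > T_i else b


def f_relu(E_i, T_i, a=1.0, b=0.0):
    """Rectified linear weighting: a * max(0, E_i - T_i) + b."""
    return a * max(0.0, E_i - T_i) + b
```

Passing a per-gene `T_i` reflects the statement that the threshold may differ for each value of i.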
- the number of coding mutations M is expressed linearly.
- a “number of coding mutations M” refers to the number of coding mutations for one gene within a pre-defined set of genes.
- RNA sequencing data is used to determine the number of coding mutations M for each gene within a pre-defined set of genes.
- RNA sequencing data is used to determine the number of coding mutations M for each gene within a set of genes X.
- the function g(M) is selected from the group consisting of a linear function, and a Heaviside step function.
- the function g(M) is a linear function in M.
- g(M) = a*M + b, where a and b are constants.
- the function g(M) is a Heaviside step function in M.
- g(Mi) = a if Mi > T
- g(Mi) = b if Mi ≤ T
- T is a pre-determined threshold that may be different for each value of i.
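This section defines f(E) and g(M) but does not state how they are combined into a single EW-TMB value. The sketch below assumes, for illustration only, a sum over the genes in X of f(Ei) multiplied by g(Mi). The function name `ew_tmb`, the gene labels, and the default identity weightings are all hypothetical.

```python
def ew_tmb(expression, mutations, f=lambda E: E, g=lambda M: M):
    """Assumed combination: sum of f(E_i) * g(M_i) over every gene i in X.

    `expression` maps gene -> expression level E_i.
    `mutations` maps gene -> number of coding mutations M_i.
    Both mappings must cover the same set of genes X.
    """
    if expression.keys() != mutations.keys():
        raise ValueError("E and M must be defined for the same set of genes X")
    return sum(f(expression[gene]) * g(mutations[gene]) for gene in expression)


# Toy data with placeholder gene names.
example_expression = {"GENE_A": 10.0, "GENE_B": 0.0, "GENE_C": 2.0}
example_mutations = {"GENE_A": 1, "GENE_B": 5, "GENE_C": 2}
```

With a step function for f, mutations in genes below the expression threshold contribute nothing, so the five mutations in the unexpressed `GENE_B` are ignored under this assumed combination.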
- the determination of the expression level E for each gene within a set of genes X is performed via a method selected from the group consisting of a de novo transcriptome assembly approach and a genome-guided transcriptome assembly. In an aspect, the determination of the expression level E for each gene within a set of genes X is performed via a de novo transcriptome assembly approach, where contiguous sequences (contigs) are assembled in silico from pair-wise overlapping reads. In an aspect, the determination of the expression level E for each gene within a set of genes X is performed via a genome-guided transcriptome assembly, where reads are aligned to reference genomes or transcript sequences.
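Once reads have been assigned to genes by either approach, the per-gene counts still have to be turned into an expression level E. The text does not name a normalization, so the sketch below substitutes transcripts per million (TPM), a commonly used measure, as an assumption; `tpm` and the input dictionaries are hypothetical.

```python
def tpm(read_counts, gene_lengths):
    """Transcripts per million from per-gene read counts and gene lengths.

    `read_counts` maps gene -> number of reads assigned to that gene.
    `gene_lengths` maps gene -> transcript length in nucleotides.
    """
    # Length-normalize first so long genes are not favored.
    rates = {gene: read_counts[gene] / gene_lengths[gene] for gene in read_counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in rates}
    return {gene: 1e6 * rate / total for gene, rate in rates.items()}
```

The resulting values sum to one million across the genes supplied, so E for a targeted panel is relative to that panel rather than to the whole transcriptome.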
- a method comprises the use of a reverse transcription primer.
- a reverse transcription primer refers to a primer used in a reverse transcription reaction, where RNA in an RNA sample is converted to complementary DNA (cDNA).
- “complementary DNA” or “cDNA” refers to a DNA copy of a messenger RNA (mRNA) molecule produced by a reverse transcriptase.
- a reverse transcription primer comprises at least 1 degenerate nucleotide.
- a reverse transcription primer comprises at least 2 degenerate nucleotides.
- a reverse transcription primer comprises at least 3 degenerate nucleotides.
- a reverse transcription primer comprises at least 4 degenerate nucleotides. In an aspect, a reverse transcription primer comprises at least 5 degenerate nucleotides. In an aspect, a reverse transcription primer comprises at least 6 degenerate nucleotides. In an aspect, a reverse transcription primer comprises at least 7 degenerate nucleotides. In an aspect, a reverse transcription primer comprises at least 8 degenerate nucleotides. In an aspect, a reverse transcription primer comprises at least 9 degenerate nucleotides. In an aspect, a reverse transcription primer comprises at least 10 degenerate nucleotides. In an aspect, a reverse transcription primer comprises between 1 and 5 degenerate nucleotides.
- a reverse transcription primer comprises between 1 and 10 degenerate nucleotides. In an aspect, a reverse transcription primer comprises between 1 and 15 degenerate nucleotides. In an aspect, a reverse transcription primer comprises a random hexamer. As is known in the art, a “hexamer” comprises six nucleotides. In an aspect, a reverse transcription primer comprises a polyT string.
- a “polyT string” refers to at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 25, or at least 30 consecutive thymine nucleobases.
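The polyT definition amounts to a run-length check on the primer sequence. The following is a minimal sketch; the name `has_poly_t` is hypothetical, and the default minimum of 2 mirrors the smallest run length listed above.

```python
import re


def has_poly_t(primer, min_length=2):
    """True if `primer` contains at least `min_length` consecutive thymines."""
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    return re.search("T{%d,}" % min_length, primer.upper()) is not None
```

Raising `min_length` to 20 or 30 corresponds to the longer polyT strings enumerated in the definition.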
- this disclosure provides at least one DNA polymerase.
- a “DNA polymerase” refers to an enzyme that is capable of catalyzing the synthesis of a DNA molecule from nucleoside triphosphates.
- DNA polymerases add a nucleotide to the 3' end of a DNA strand one nucleotide at a time, creating an antiparallel DNA strand as compared to a template DNA strand. DNA polymerases are unable to begin a new DNA molecule de novo; they require a primer to which they can add a first new nucleotide.
- a “reagent” refers to any substance or compound added to a mixture to cause a chemical reaction or to test if a chemical reaction occurs.
- a reagent comprises a component selected from the group consisting of magnesium, at least one dNTP, phosphatase, betaine, dimethyl sulfoxide (DMSO), and tetramethylammonium chloride (TMAC).
- Non-limiting examples of reagents and buffers needed for DNA polymerase extension include Tris-HCl, potassium chloride, magnesium chloride, oligonucleotide primers, deoxynucleotides (dNTPs), betaine, and dimethyl sulfoxide.
- dNTPs deoxynucleotides
- DNA polymerases can extend primers at different temperatures, depending on the DNA polymerase.
- a DNA polymerase extends primers at a temperature of at least 40°C.
- a DNA polymerase extends primers at a temperature of at least 50°C.
- a DNA polymerase extends primers at a temperature of at least 55°C.
- a DNA polymerase extends primers at a temperature of at least 60°C.
- a DNA polymerase extends primers at a temperature of at least 65°C.
- a DNA polymerase extends primers at a temperature of at least 70°C.
- a DNA polymerase extends primers at a temperature of at least 75°C.
- a DNA polymerase extends primers at a temperature of at least 80°C.
- Primers can bind, or anneal, to a complementary sequence at a variety of temperatures, depending on the structure and length of the sequences involved.
- primer binding occurs at a temperature of at least 35°C.
- primer binding occurs at a temperature of at least 40°C.
- primer binding occurs at a temperature of at least 45°C.
- primer binding occurs at a temperature of at least 50°C.
- primer binding occurs at a temperature of at least 55°C.
- primer binding occurs at a temperature of at least 60°C.
- primer binding occurs at a temperature of at least 65°C.
- primer binding occurs at a temperature of at least 70°C.
- DNA polymerase extension and primer binding occur at different temperatures. In an aspect, DNA polymerase extension and primer binding occur at the same temperature.
- a DNA polymerase is a thermostable DNA polymerase.
- a “thermostable DNA polymerase” refers to DNA polymerases that can function at high temperatures (e.g., greater than 65°C) and can survive higher temperatures (e.g., up to about 100°C). Thermostable DNA polymerases often have maximal catalytic activity at temperatures between 70°C and 80°C.
- a thermostable DNA polymerase is selected from the group consisting of Taq DNA polymerase, Phusion® DNA polymerase, Q5® DNA polymerase, and KAPA High Fidelity DNA polymerase.
- a DNA polymerase is a non-thermostable DNA polymerase.
- a “non-thermostable DNA polymerase” refers to DNA polymerases that cannot function at high temperatures.
- a non-thermostable DNA polymerase is selected from the group consisting of phi29 DNA polymerase and Bst DNA polymerase.
- an “amplicon” refers to a copy of DNA made via PCR.
- a method comprises preparing a sequencing library.
- a “sequencing library” refers to a pool of nucleic acid molecules.
- the sequencing library comprises a pool of whole genomic sequences, subgenomic fragments, cDNA, cDNA fragments, RNA, mRNA, RNA fragments, or a combination thereof.
- the sequencing library comprises a pool of whole genomic sequences.
- the sequencing library comprises a pool of subgenomic fragments.
- the sequencing library comprises a pool of cDNA.
- the sequencing library comprises a pool of cDNA fragments.
- the sequencing library comprises a pool of RNA.
- the sequencing library comprises a pool of mRNA. In an aspect, the sequencing library comprises a pool of RNA fragments. In an aspect, the sequencing library comprises a pool of nucleic acid molecules with adapter sequences attached. In an aspect, a sequencing library comprises a pool of DNA with adapter sequences attached. In an aspect, a sequencing library comprises a pool of RNA with adapter sequences attached. In an aspect, the sequencing library comprises a pool of whole genomic sequences with adapter sequences attached. In an aspect, the sequencing library comprises a pool of subgenomic fragments with adapter sequences attached. In an aspect, the sequencing library comprises a pool of cDNA with adapter sequences attached.
- the sequencing library comprises a pool of cDNA fragments with adapter sequences attached. In an aspect, the sequencing library comprises a pool of mRNA with adapter sequences attached. In an aspect, the sequencing library comprises a pool of RNA fragments with adapter sequences attached. In an aspect, the adapter sequence can be located at one or both ends. In an aspect, a portion or all of the sequencing library comprises an adapter sequence. In an aspect, the adapter sequence can be useful for a sequencing method. In an aspect, the adapter sequence is useful for NGS. In an aspect, the adapter sequence is useful for amplification. In an aspect, the adapter sequence is useful for reverse transcription. In an aspect, the adapter sequence is useful for cloning into a vector.
- a method comprises at least one enrichment step of targeted RNA or cDNA molecules during preparation of sequencing libraries.
- the enrichment step is selected from the group consisting of pulldown probes, and amplification by target specific primers.
- the enrichment of targeted RNA or cDNA molecules during preparation of sequencing libraries is performed via pulldown probes.
- the enrichment of targeted RNA or cDNA molecules during preparation of sequencing libraries is performed via amplification by target specific primers.
- preparing the sequencing library comprises a step selected from the group consisting of reverse transcription, PCR amplification, ligation, ribosomal RNA depletion, and labeling with a unique molecular identifier (UMI).
- preparing the sequencing library comprises a reverse transcription step.
- preparing the sequencing library comprises a reverse transcription step with random primers.
- preparing the sequencing library comprises a PCR amplification step.
- preparing the sequencing library comprises a PCR amplification step with random primers.
- preparing the sequencing library comprises a ligation step.
- preparing the sequencing library comprises a ribosomal RNA depletion step.
- preparing the sequencing library comprises a UMI labeling step.
- a method comprises high-throughput sequencing.
- a method comprises subjecting a sequencing library to high-throughput sequencing.
- “high-throughput sequencing” refers to any sequencing method that is capable of sequencing multiple (e.g., tens, hundreds, thousands, millions, hundreds of millions) nucleic acid molecules in parallel.
- Sanger sequencing is not high-throughput sequencing.
- high-throughput sequencing comprises the use of a sequencing-by-synthesis (SBS) flow cell.
- SBS flow cell is selected from the group consisting of an Illumina SBS flow cell and a Pacific Biosciences (PacBio) SBS flow cell.
- high-throughput sequencing is performed via electrical current measurements in conjunction with an Oxford nanopore.
- a method comprises sequencing-by-synthesis.
- a method comprises nanopore-based sequencing.
- a method comprises targeted RNA sequencing for a pre-specified set of genes.
- a method comprises long-read sequencing.
- a “sequencing read” or “read” refers to a nucleotide sequence of a single nucleic acid molecule generated via a high-throughput sequencing method.
- sequencing reads are provided in a FASTX or FASTQ file type.
- a sequencing read comprises a UMI sequence.
- a sequencing read comprises a sequence from a gene.
- a sequencing read comprises a UMI sequence and a sequence from a gene.
- the terms “sequencing read” and “read” are intended to be interchangeable.
- RNA sequencing data comprises reads, and the minimal length of each read is selected from the group consisting of 25 nucleotides long, 50 nucleotides long, 75 nucleotides long, 150 nucleotides long, 200 nucleotides long, 300 nucleotides long, 1000 nucleotides long, 5000 nucleotides long, 10000 nucleotides long, and 50000 nucleotides long.
- the minimal length of each read from RNA sequencing data is 25 nucleotides long.
- the minimal length of each read from RNA sequencing data is 50 nucleotides long.
- the minimal length of each read from RNA sequencing data is 75 nucleotides long.
- the minimal length of each read from RNA sequencing data is 150 nucleotides long. In an aspect, the minimal length of each read from RNA sequencing data is 200 nucleotides long. In an aspect, the minimal length of each read from RNA sequencing data is 300 nucleotides long. In an aspect, the minimal length of each read from RNA sequencing data is 1000 nucleotides long. In an aspect, the minimal length of each read from RNA sequencing data is 5000 nucleotides long. In an aspect, the minimal length of each read from RNA sequencing data is 10000 nucleotides long. In an aspect, the minimal length of each read from RNA sequencing data is 50000 nucleotides long. In an aspect, the minimal length of each read from RNA sequencing data is variable. In an aspect, the minimal length of each read from RNA sequencing data is identical.
- a sequencing read comprises at least 10 nucleotides. In an aspect, a sequencing read comprises at least 25 nucleotides. In an aspect, a sequencing read comprises at least 50 nucleotides. In an aspect, a sequencing read comprises at least 100 nucleotides. In an aspect, a sequencing read comprises at least 250 nucleotides. In an aspect, a sequencing read comprises at least 500 nucleotides. In an aspect, a sequencing read comprises at least 1000 nucleotides. In an aspect, a sequencing read comprises between 10 nucleotides and 10,000 nucleotides. In an aspect, a sequencing read comprises between 10 nucleotides and 5000 nucleotides.
- a sequencing read comprises between 10 nucleotides and 1000 nucleotides. In an aspect, a sequencing read comprises between 10 nucleotides and 500 nucleotides. In an aspect, a sequencing read comprises between 10 nucleotides and 100 nucleotides. In an aspect, a sequencing read comprises between 25 nucleotides and 150 nucleotides.
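The read-length limits above can be applied as a simple filter over FASTQ records. The following is a minimal sketch, assuming uncompressed four-line FASTQ records; the function names `read_fastq` and `filter_by_min_length` and the example reads are hypothetical and are not taken from the source.

```python
from io import StringIO
from typing import Iterator, List, TextIO, Tuple

Record = Tuple[str, str, str]  # (read_id, sequence, quality)


def read_fastq(handle: TextIO) -> Iterator[Record]:
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        yield header.rstrip("\n").lstrip("@"), sequence, quality


def filter_by_min_length(records, min_length: int = 25) -> List[Record]:
    """Keep only reads whose sequence has at least min_length nucleotides."""
    return [rec for rec in records if len(rec[1]) >= min_length]


# Hypothetical input: one 30-nt read and one 10-nt read.
example = StringIO(
    "@read1\n" + "ACGT" * 7 + "AC\n+\n" + "I" * 30 + "\n"
    "@read2\n" + "ACGTACGTAC\n+\n" + "I" * 10 + "\n"
)
kept = filter_by_min_length(read_fastq(example), min_length=25)
```

With a minimal length of 25 nucleotides, only the 30-nt read is retained; passing a different `min_length` selects any of the other listed limits.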
- the determination of the expression level E for each gene within a set of genes X comprises considering the total amount of reads mapped to one particular transcript. In an aspect, the determination of the expression level E for each gene within a set of genes X comprises a function L of the total amount of reads mapped to one particular transcript and the total length of the transcripts.
- the determination of the expression level E for each gene within a set of genes X comprises normalizing the value.
- a set of genes Y is used to normalize or is compared to the expression level E for each gene within the set of genes X.
- the expression level E for each gene within a set of genes X from one particular sample or sample type is normalized or compared to the expression level E for each gene within the same set of genes X from a different sample or sample type.
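One common way to combine the number of mapped reads with transcript length is reads per kilobase followed by rescaling to transcripts per million (TPM). The sketch below uses that choice as an assumption; the source only states that E may be a function L of read count and transcript length and does not name a specific function. Gene names, counts, and lengths are hypothetical.

```python
from typing import Dict


def reads_per_kilobase(read_count: int, transcript_length_nt: int) -> float:
    """Length-normalized read count: reads mapped per 1000 nt of transcript."""
    return read_count / (transcript_length_nt / 1000.0)


def tpm(read_counts: Dict[str, int], lengths_nt: Dict[str, int]) -> Dict[str, float]:
    """Transcripts per million: per-gene RPK values rescaled to sum to 1e6."""
    rpk = {g: reads_per_kilobase(read_counts[g], lengths_nt[g]) for g in read_counts}
    total = sum(rpk.values())
    if total == 0:
        return {g: 0.0 for g in rpk}
    return {g: value / total * 1e6 for g, value in rpk.items()}


# Hypothetical read counts and transcript lengths for a three-gene set X.
counts = {"GENE_A": 100, "GENE_B": 300, "GENE_C": 0}
lengths = {"GENE_A": 1000, "GENE_B": 3000, "GENE_C": 2000}
expression = tpm(counts, lengths)
```

GENE_B has three times the reads of GENE_A but is three times as long, so both receive the same normalized expression level.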
- the number of coding mutations M is estimated by the number of reads that are homologous (align) to a reference genome or transcriptome with at least one gap, one base change, or one nucleotide change, and in which up to 10% of the total bases or total nucleotides are different.
- the number of coding mutations M is estimated by the number of contiguous sequences (contigs) assembled in silico from pair-wise overlapping reads that are homologous to a reference genome or transcriptome with at least one gap, one base change, or one nucleotide change, and in which up to 10% of the total bases or total nucleotides are different.
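The counting rule above (at least one difference, at most 10% of positions different) can be sketched as follows. This assumes the reads or contigs have already been aligned by an external aligner and are supplied as equal-length gapped strings; the helper names and example sequences are hypothetical.

```python
def count_differences(aligned_read: str, aligned_ref: str) -> int:
    """Number of alignment columns where read and reference differ (mismatch or gap)."""
    if len(aligned_read) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(aligned_read, aligned_ref) if a != b)


def is_variant_read(aligned_read: str, aligned_ref: str, max_fraction: float = 0.10) -> bool:
    """True if the read differs at one or more columns but at no more than max_fraction of them."""
    diffs = count_differences(aligned_read, aligned_ref)
    return diffs >= 1 and diffs <= max_fraction * len(aligned_ref)


# Hypothetical pre-aligned (read, reference) pairs; '-' marks a gap.
ref = "ACGTACGTACGTACGTACGT"  # 20 columns
pairs = [
    ("ACGTACGTACGTACGTACGT", ref),  # identical: not counted
    ("ACGTACGTACTTACGTACGT", ref),  # 1 of 20 columns differs (5%): counted
    ("ACGTAC-TACGTACGTACGT", ref),  # one gap (5%): counted
    ("TTTTACGTACGTACGTACGT", ref),  # 3 of 20 columns differ (15%): not counted
]
m_estimate = sum(is_variant_read(read, reference) for read, reference in pairs)
```

Here `m_estimate` counts the two reads that carry a change while staying within the 10% limit.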
- sequencing an amplicon, or a plurality of amplicons, is performed using next-generation sequencing technologies.
- sequencing an amplicon, or a plurality of amplicons, is performed using a sequencing instrument selected from the group consisting of an Oxford Nanopore sequencer, a PacBio sequencer, an Illumina MiSeq sequencer, an Illumina MiniSeq sequencer, an Illumina NextSeq sequencer, an Ion Torrent sequencer, and an Illumina HiSeq sequencer to generate at least one sequencing read.
- a sequencing read is aligned to a reference sequence to identify an RNA splicing variant.
- next-generation sequencing refers to any high-throughput sequencing technology.
- next-generation sequencing includes single-molecule real-time sequencing (e.g., Pacific Biosciences), Ion Torrent sequencing, sequencing-by-synthesis (e.g., Illumina), sequencing by ligation (SOLiD sequencing), nanopore sequencing, and GenapSys sequencing.
- high-throughput sequencing refers to any sequencing method that is capable of sequencing multiple (e.g., tens, hundreds, thousands, millions, hundreds of millions) DNA molecules in parallel. In an aspect, Sanger sequencing is not high-throughput sequencing. In an aspect, high-throughput sequencing comprises the use of a sequencing-by-synthesis (SBS) flow cell.
- SBS: sequencing-by-synthesis
- an SBS flow cell is selected from the group consisting of an Illumina SBS flow cell and a Pacific Biosciences (PacBio) SBS flow cell.
- high-throughput sequencing is performed via electrical current measurements in conjunction with an Oxford nanopore.
- one nucleic acid molecule can be complementary to a second nucleic acid molecule (e.g., a Target RNA Region subsequence).
- the sequence of a nucleic acid molecule need not be 100% complementary to that of its target nucleic acid molecule to be specifically hybridizable or hybridizable.
- a polynucleotide may hybridize over one or more segments such that intervening or adjacent segments are not involved in the hybridization event (e.g., a loop structure or hairpin structure).
- an antisense nucleic acid molecule in which 18 of 20 nucleotides of the antisense compound are complementary to a target region, and would therefore specifically hybridize would represent 90 percent complementarity.
- the remaining noncomplementary nucleotides may be clustered or interspersed with complementary nucleotides and need not be contiguous to each other or to complementary nucleotides. Percent complementarity between particular stretches of nucleic acid sequences within nucleic acids can be determined routinely using BLAST® programs (basic local alignment search tools) and PowerBLAST programs known in the art (see Altschul et al., J. Mol.
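The 18-of-20 example above corresponds to 18/20 = 90 percent complementarity. The sketch below computes that figure for two equal-length strands paired in antiparallel orientation. It is a simplified position-by-position count under that assumption and is not a substitute for the BLAST programs mentioned above; the sequences and function names are hypothetical.

```python
# Watson-Crick pairs, including A-U for RNA strands.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


def percent_complementarity(antisense: str, target: str) -> float:
    """Percent of positions at which antisense pairs with target, read antiparallel."""
    if len(antisense) != len(target):
        raise ValueError("sequences must have equal length")
    paired = sum(
        (a, t) in WATSON_CRICK
        for a, t in zip(reversed(antisense.upper()), target.upper())
    )
    return 100.0 * paired / len(target)


# Hypothetical 20-nt target region.
target = "AAGGCCTTAGCTAGGCTTAC"
perfect = reverse_complement(target)
# Introduce two mismatches: a base never pairs with an identical base.
antisense = target[-1] + target[-2] + perfect[2:]
score = percent_complementarity(antisense, target)
```

The two altered positions are adjacent here, but the count is the same whether noncomplementary nucleotides are clustered or interspersed.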
- a DNA molecule comprises a UMI.
- an RNA molecule comprises a UMI.
- a primer comprises a UMI.
- a “unique molecular identifier” refers to a unique nucleotide sequence that serves as a molecular barcode for an individual molecule. UMIs are often attached to DNA molecules in a sample library to uniquely tag each molecule. UMIs enable error correction and increased accuracy during sequencing of DNA molecules.
- a UMI sequence comprises between 7 nucleotides and 30 nucleotides. In an aspect, a UMI sequence comprises between 5 nucleotides and 40 nucleotides. In an aspect, a UMI sequence comprises between 10 nucleotides and 20 nucleotides. In an aspect, a UMI sequence comprises at least 5 nucleotides. In an aspect, a UMI sequence comprises at least 7 nucleotides. In an aspect, a UMI sequence comprises at least 10 nucleotides. In an aspect, a UMI sequence comprises at least 15 nucleotides. In an aspect, a UMI sequence comprises fewer than 50 nucleotides. In an aspect, a UMI sequence comprises fewer than 40 nucleotides. In an aspect, a UMI sequence comprises fewer than 30 nucleotides. In an aspect, a UMI sequence comprises fewer than 20 nucleotides.
- a UMI sequence comprises between 7 degenerate nucleotides and 30 degenerate nucleotides. In an aspect, a UMI sequence comprises between 5 degenerate nucleotides and 40 degenerate nucleotides. In an aspect, a UMI sequence comprises between 10 degenerate nucleotides and 20 degenerate nucleotides. In an aspect, a UMI sequence comprises at least 5 degenerate nucleotides. In an aspect, a UMI sequence comprises at least 7 degenerate nucleotides. In an aspect, a UMI sequence comprises at least 10 degenerate nucleotides.
- a UMI sequence comprises at least 15 degenerate nucleotides. In an aspect, a UMI sequence comprises fewer than 50 degenerate nucleotides. In an aspect, a UMI sequence comprises fewer than 40 degenerate nucleotides. In an aspect, a UMI sequence comprises fewer than 30 degenerate nucleotides. In an aspect, a UMI sequence comprises fewer than 20 degenerate nucleotides.
- each degenerate nucleotide in a UMI sequence is individually selected from the group consisting of N, B, D, H, V, S, W, Y, R, M, and K.
- a UMI sequence comprises between 7 degenerate nucleotides and 30 degenerate nucleotides, where each degenerate nucleotide is selected from the group consisting of N, B, D, H, V, S, W, Y, R, M, and K.
- a method comprises removal of sequencing reads where the UMI sequence of the sequencing reads does not comprise a predefined UMI degenerate base design pattern.
- a “predefined UMI degenerate base design pattern” refers to a UMI sequence comprising the expected number of degenerate bases and the expected type of degenerate bases for a given method. Non-limiting examples of inappropriate degenerate base designs would include UMI sequences comprising too many degenerate bases or too few degenerate bases.
- a method comprises removal of at least one sequencing read where the UMI sequence of the at least one sequencing read does not comprise a predefined UMI degenerate base design pattern.
- a method comprises removal of at least two sequencing reads where the UMI sequence of each of the at least two sequencing reads does not comprise a predefined UMI degenerate base design pattern. In an aspect, a method comprises removal of at least three sequencing reads where the UMI sequence of each of the at least three sequencing reads does not comprise a predefined UMI degenerate base design pattern. In an aspect, a method comprises removal of at least four sequencing reads where the UMI sequence of each of the at least four sequencing reads does not comprise a predefined UMI degenerate base design pattern. In an aspect, a method comprises removal of at least five sequencing reads where the UMI sequence of each of the at least five sequencing reads does not comprise a predefined UMI degenerate base design pattern.
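Removing reads whose UMI does not fit the design pattern can be sketched by expanding each degenerate code into the bases it permits, using the standard IUPAC nucleotide codes. The design string `NNNWNNNW`, the example reads, and the function names are hypothetical; the source does not specify a particular design.

```python
import re

# Standard IUPAC degenerate nucleotide codes and the bases each one permits.
IUPAC_DEGENERATE = {
    "N": "ACGT", "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "S": "CG", "W": "AT", "Y": "CT", "R": "AG", "M": "AC", "K": "GT",
}


def compile_umi_pattern(design: str) -> re.Pattern:
    """Translate a degenerate-base design string into a compiled regular expression."""
    try:
        classes = [f"[{IUPAC_DEGENERATE[code]}]" for code in design]
    except KeyError as err:
        raise ValueError(f"unsupported degenerate code: {err.args[0]}") from None
    return re.compile("".join(classes))


def keep_reads_matching_design(reads, design: str):
    """Drop (umi, sequence) pairs whose UMI does not match the design over its full length."""
    pattern = compile_umi_pattern(design)
    return [(umi, seq) for umi, seq in reads if pattern.fullmatch(umi)]


# Hypothetical 8-position design and three reads.
design = "NNNWNNNW"
reads = [
    ("ACGAACGT", "GATTACA"),  # matches the design
    ("ACGCACGT", "GATTACA"),  # 'C' at a W (A/T) position: removed
    ("ACGAACG", "GATTACA"),   # too few bases: removed
]
kept_reads = keep_reads_matching_design(reads, design)
```

Using `fullmatch` rejects UMIs with too many or too few bases as well as UMIs with a disallowed base at a given position.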
- the determination of the expression level E for each gene within a set of genes X comprises considering the total amount of similar molecules harboring the same UMI. In an aspect, the determination of the expression level E for each gene within a set of genes X comprises a function L of the total amount of similar molecules harboring the same UMI mapping one particular transcript and the total length of the transcripts.
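Counting molecules by UMI means that reads sharing a UMI on the same transcript are collapsed to a single molecule before any length normalization. A minimal sketch follows, assuming exact UMI matching (no error-tolerant clustering) and per-kilobase scaling as the function L; both choices, and all names and values, are assumptions rather than details from the source.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def umi_counts(alignments: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count distinct UMIs per transcript from (transcript_id, umi) pairs."""
    seen = defaultdict(set)
    for transcript_id, umi in alignments:
        seen[transcript_id].add(umi)
    return {transcript_id: len(umis) for transcript_id, umis in seen.items()}


def umi_expression(alignments, lengths_nt: Dict[str, int]) -> Dict[str, float]:
    """Distinct-UMI count per 1000 nt of transcript."""
    counts = umi_counts(alignments)
    return {t: n / (lengths_nt[t] / 1000.0) for t, n in counts.items()}


# Hypothetical alignments: TX1 has two reads sharing one UMI plus one other UMI.
alignments = [("TX1", "AAAC"), ("TX1", "AAAC"), ("TX1", "GGGT"), ("TX2", "AAAC")]
lengths = {"TX1": 2000, "TX2": 500}
molecule_counts = umi_counts(alignments)
expression_by_umi = umi_expression(alignments, lengths)
```

The duplicated `AAAC` read on TX1 contributes once, so TX1 is counted as two molecules rather than three reads.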
- the prognostication of an outcome for a patient comprises differentially prognosticating the outcome of the patient based on the value of the EW-TMB.
- the EW-TMB is used to determine the high tumor mutation burden (TMB-H) for solid and non-solid tumors selected from the group consisting of breast cancer, bladder cancer, cervical cancer, colon cancer, head and neck cancer, liver cancer, renal cell cancer, skin cancer, stomach cancer, rectal cancer, lymphomas, Non-Hodgkin lymphoma, Hodgkin lymphoma, and any other type of cancer.
- the EW-TMB is used to determine the TMB-H for breast cancer.
- the EW-TMB is used to determine the TMB-H for bladder cancer. In an aspect, the EW-TMB is used to determine the TMB-H for cervical cancer. In an aspect, the EW-TMB is used to determine the TMB-H for colon cancer. In an aspect, the EW-TMB is used to determine the TMB-H for head and neck cancer. In an aspect, the EW-TMB is used to determine the TMB-H for liver cancer. In an aspect, the EW-TMB is used to determine the TMB-H for renal cell cancer. In an aspect, the EW-TMB is used to determine the TMB-H for skin cancer. In an aspect, the EW-TMB is used to determine the TMB-H for stomach cancer.
- the EW-TMB is used to determine the TMB-H for rectal cancer. In an aspect, the EW-TMB is used to determine the TMB-H for lymphomas. In an aspect, the EW-TMB is used to determine the TMB-H for Non-Hodgkin lymphoma. In an aspect, the EW-TMB is used to determine the TMB-H for Hodgkin lymphoma.
- the EW-TMB is used to determine the minimum residual disease (MRD) for solid and non-solid tumors selected from the group consisting of breast cancer, bladder cancer, cervical cancer, colon cancer, head and neck cancer, liver cancer, renal cell cancer, skin cancer, stomach cancer, rectal cancer, lymphomas, Non-Hodgkin lymphoma, Hodgkin lymphoma, and any other type of cancer.
- MRD: minimum residual disease
- the EW-TMB is used to determine the MRD for breast cancer.
- the EW-TMB is used to determine the MRD for bladder cancer.
- the EW-TMB is used to determine the MRD for cervical cancer.
- the EW-TMB is used to determine the MRD for colon cancer.
- the EW-TMB is used to determine the MRD for head and neck cancer. In an aspect, the EW-TMB is used to determine the MRD for liver cancer. In an aspect, the EW-TMB is used to determine the MRD for renal cell cancer. In an aspect, the EW-TMB is used to determine the MRD for skin cancer. In an aspect, the EW-TMB is used to determine the MRD for stomach cancer. In an aspect, the EW-TMB is used to determine the MRD for rectal cancer. In an aspect, the EW-TMB is used to determine the MRD for lymphomas. In an aspect, the EW-TMB is used to determine the MRD for Non-Hodgkin lymphoma. In an aspect, the EW-TMB is used to determine the MRD for Hodgkin lymphoma.
- determining treatment prescription for a patient comprises differentially prescribing treatments for the patient based on the value of the EW-TMB.
- the EW-TMB is used as a predictive biomarker of response to immune checkpoint blockade therapy, with drugs targeting one or more specific sites selected from the group consisting of CTLA-4, PD-1 and PD-L1.
- the EW-TMB is used as a predictive biomarker of response to drugs targeting CTLA-4.
- the EW-TMB is used as a predictive biomarker of response to drugs targeting PD-1.
- the EW-TMB is used as a predictive biomarker of response to drugs targeting PD-L1.
- the EW-TMB is used as a predictive biomarker for solid and non-solid tumors selected from the group consisting of breast cancer, bladder cancer, cervical cancer, colon cancer, head and neck cancer, liver cancer, renal cell cancer, skin cancer, stomach cancer, rectal cancer, lymphomas, Non-Hodgkin lymphoma, Hodgkin lymphoma, and any other type of cancer.
- the EW-TMB is used as a predictive biomarker for solid cancer.
- the EW-TMB is used as a predictive biomarker for non-solid cancer.
- the EW-TMB is used as a predictive biomarker for breast cancer.
- the EW-TMB is used as a predictive biomarker for bladder cancer. In an aspect, the EW-TMB is used as a predictive biomarker for cervical cancer. In an aspect, the EW-TMB is used as a predictive biomarker for colon cancer. In an aspect, the EW-TMB is used as a predictive biomarker for head and neck cancer. In an aspect, the EW-TMB is used as a predictive biomarker for liver cancer. In an aspect, the EW-TMB is used as a predictive biomarker for renal cell cancer. In an aspect, the EW-TMB is used as a predictive biomarker for skin cancer. In an aspect, the EW-TMB is used as a predictive biomarker for stomach cancer.
- the EW-TMB is used as a predictive biomarker for rectal cancer.
- the EW-TMB is used as a predictive biomarker for lymphomas.
- the EW-TMB is used as a predictive biomarker for Non-Hodgkin lymphoma.
- the EW-TMB is used as a predictive biomarker for Hodgkin lymphoma.
- the EW-TMB is used as a predictive biomarker of response to neoadjuvant therapy (before the primary treatment).
- the neoadjuvant therapy is selected from the group consisting of chemotherapy, radiation therapy, hormone therapy, targeted therapy, and immune therapy.
- the EW-TMB is used as a predictive biomarker of response to neoadjuvant chemotherapy.
- the EW-TMB is used as a predictive biomarker of response to neoadjuvant radiation therapy.
- the EW-TMB is used as a predictive biomarker of response to neoadjuvant hormone therapy.
- the EW-TMB is used as a predictive biomarker of response to neoadjuvant targeted therapy. In an aspect, the EW-TMB is used as a predictive biomarker of response to neoadjuvant immune therapy.
- the EW-TMB is used as a predictive biomarker of response to adjuvant therapy (after the primary treatment).
- the adjuvant therapy is selected from the group consisting of chemotherapy, radiation therapy, hormone therapy, targeted therapy, and immune therapy.
- the EW-TMB is used as a predictive biomarker of response to adjuvant chemotherapy.
- the EW-TMB is used as a predictive biomarker of response to adjuvant radiation therapy.
- the EW-TMB is used as a predictive biomarker of response to adjuvant hormone therapy.
- the EW-TMB is used as a predictive biomarker of response to adjuvant targeted therapy.
- the EW-TMB is used as a predictive biomarker of response to adjuvant immune therapy.
- a method further comprises classifying the sample or the subject from which the sample was obtained as responsive to immune checkpoint blockade therapy. In an aspect, a method further comprises classifying the sample as responsive to immune checkpoint blockade therapy. In an aspect, a method further comprises classifying the subject from which the sample was obtained as responsive to immune checkpoint blockade therapy.
- a method further comprises classifying the sample or the subject from which the sample was obtained as responsive to neoadjuvant therapy. In an aspect, a method further comprises classifying the sample as responsive to neoadjuvant therapy. In an aspect, a method further comprises classifying the subject from which the sample was obtained as responsive to neoadjuvant therapy.
- a method further comprises classifying the sample or the subject from which the sample was obtained as responsive to adjuvant therapy. In an aspect, a method further comprises classifying the sample as responsive to adjuvant therapy. In an aspect, a method further comprises classifying the subject from which the sample was obtained as responsive to adjuvant therapy.
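One simple way to turn an EW-TMB value into a classification is a cutoff comparison. The sketch below is illustrative only: the source does not state a cutoff or a decision rule, so `HYPOTHETICAL_CUTOFF`, the sample scores, and the label strings are placeholders, and any real cutoff would need clinical validation.

```python
from typing import Dict


def classify_responsive(ew_tmb_value: float, cutoff: float) -> bool:
    """True when the EW-TMB value meets or exceeds a caller-supplied cutoff."""
    return ew_tmb_value >= cutoff


# Placeholder value; the source does not specify a cutoff.
HYPOTHETICAL_CUTOFF = 10.0

# Hypothetical per-sample EW-TMB values.
scores: Dict[str, float] = {"sample_1": 14.2, "sample_2": 3.7}
labels = {
    sample: "classified as responsive"
    if classify_responsive(value, HYPOTHETICAL_CUTOFF)
    else "not classified as responsive"
    for sample, value in scores.items()
}
```

The same comparison could be applied per therapy type (immune checkpoint blockade, neoadjuvant, adjuvant) with a separate cutoff for each.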
- a method further comprises generating and delivering a report to the subject, or another person or entity.
- the report is selected from the group consisting of an electronic report, a web-based report, a digital report, and a paper report.
- the other person or entity is selected from the group consisting of a caregiver, a physician, an oncologist, a database, a server, a hospital, a clinic, a third-party payer, an insurance company, and a government office.
- the report contains information that is selected from the group consisting of an EW-TMB, a TMB-
- the report comprises the EW-TMB. In an aspect, the report comprises the TMB-H. In an aspect, the report comprises the results derived from the methods described herein. In an aspect, the report comprises information on the likelihood of patient response to a therapy. In an aspect, the report comprises information on the likelihood of patient response to immune checkpoint blockade therapy. In an aspect, the report comprises information on the prognostication of patient outcome. In an aspect, the report comprises information on the determination of treatment prescription for the patient.
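An electronic or digital report of the kind described above could be serialized as JSON. The following is a minimal sketch; the field names (`sample_id`, `ew_tmb`, `tmb_h`) and the example values are hypothetical and do not reflect a report schema from the source.

```python
import json


def build_report(sample_id: str, ew_tmb_value: float, tmb_high: bool) -> str:
    """Serialize a minimal report as a JSON string with stable key order."""
    report = {
        "sample_id": sample_id,
        "ew_tmb": ew_tmb_value,
        "tmb_h": tmb_high,
    }
    return json.dumps(report, indent=2, sort_keys=True)


# Hypothetical sample values.
report_text = build_report("sample_1", 14.2, True)
```

The resulting string can be written to a file, sent to a server, or rendered for printing as a paper report.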
- a method for calculating an Expression-Weighted Tumor Mutational Burden (EW-TMB) from a biological sample comprising: extracting RNA from the biological sample, performing RNA sequencing (RNA Seq) on the RNA, analyzing the RNA sequencing data to determine an expression level E for each gene i within a set of genes X, analyzing the RNA sequencing data to determine a number of coding mutations M for each gene i within the same set of genes X, and calculating an EW-TMB as the sum of the product of a function f(·) of Ei and a function g(·) of Mi for each gene i within the set of genes X, mathematically expressed as:
  $$\text{EW-TMB} = \sum_{i \in X} f(E_i) \cdot g(M_i)$$
- RNA sequencing is targeted RNA sequencing for a pre-specified set of genes.
- RNA sequencing data comprises reads
- the minimal length of each read is selected from the group consisting of 25 nucleotides long, 50 nucleotides long, 75 nucleotides long, 150 nucleotides long, 200 nucleotides long, 300 nucleotides long, 1000 nucleotides long, 5000 nucleotides long, 10000 nucleotides long, and 50000 nucleotides long.
- determination of the expression level E for each gene within the set of genes X comprises considering the total amount of reads mapped to one particular transcript.
- determination of the expression level E for each gene within the set of genes X comprises a function L of the total amount of reads mapped to one particular transcript and the total length of the transcripts.
- FFPE: Formalin-Fixed Paraffin-Embedded
- a method for prognosticating the outcome of a patient comprising: collecting a biological sample from the patient, performing the method of any one of embodiments 1 to 33 to determine an EW-TMB, and differentially prognosticating the outcome of the patient based on the value of the EW-TMB.
- the EW-TMB is used to determine the high tumor mutation burden (TMB-H) for solid and non-solid tumors selected from the group consisting of breast cancer, bladder cancer, cervical cancer, colon cancer, head and neck cancer, liver cancer, renal cell cancer, skin cancer, stomach cancer, rectal cancer, lymphomas, Non-Hodgkin lymphoma, and Hodgkin lymphoma.
- EW-TMB is used to determine minimum residual disease (MRD) for solid and non-solid tumors selected from the group consisting of lymphomas, breast cancer, bladder cancer, cervical cancer, colon cancer, head and neck cancer, liver cancer, renal cell cancer, skin cancer, stomach cancer, rectal cancer, Non-Hodgkin lymphoma, and Hodgkin lymphoma.
- a method for determining treatment prescription for a patient comprising: collecting a biological sample from the patient, performing the method of any one of embodiments 1 to 33 to determine an EW-TMB, and differentially prescribing treatments for the patient based on the value of the EW-TMB.
- RNA Seq: RNA sequencing
- Ei: expression level of each gene i
- Mi: number of coding mutations M for each gene i within the same set of genes X
- EW-TMB: based on Ei and Mi for each gene i within the set of genes X.
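The EW-TMB is described as the sum over genes i in X of f(Ei) multiplied by g(Mi). The sketch below implements that sum with f and g passed in as parameters. The defaults, f(E) = log2(E + 1) and g(M) = M, are assumptions chosen for illustration, since the source leaves both functions open; the gene names and values are hypothetical.

```python
import math
from typing import Callable, Dict


def ew_tmb(
    expression: Dict[str, float],
    mutations: Dict[str, int],
    f: Callable[[float], float] = lambda e: math.log2(e + 1.0),  # assumed weighting
    g: Callable[[float], float] = float,                         # assumed identity
) -> float:
    """Sum of f(E_i) * g(M_i) over every gene i in the shared gene set X."""
    if expression.keys() != mutations.keys():
        raise ValueError("expression and mutation data must cover the same gene set X")
    return sum(f(expression[i]) * g(mutations[i]) for i in expression)


# Hypothetical expression levels E_i and coding-mutation counts M_i.
E = {"GENE_A": 3.0, "GENE_B": 0.0, "GENE_C": 7.0}
M = {"GENE_A": 2, "GENE_B": 5, "GENE_C": 1}
score = ew_tmb(E, M)  # log2(4)*2 + log2(1)*5 + log2(8)*1
```

With these defaults, mutations in GENE_B contribute nothing because the gene is not expressed, which is the effect expression weighting is meant to capture. Other choices of f and g can be supplied without changing the function.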
Landscapes
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Medical Informatics (AREA)
- General Health & Medical Sciences (AREA)
- Public Health (AREA)
- Primary Health Care (AREA)
- Epidemiology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Physics & Mathematics (AREA)
- Genetics & Genomics (AREA)
- Biophysics (AREA)
- Chemical & Material Sciences (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Theoretical Computer Science (AREA)
- Molecular Biology (AREA)
- Analytical Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Biomedical Technology (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Pathology (AREA)
- Medicinal Chemistry (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Surgery (AREA)
- Urology & Nephrology (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The present invention relates to methods for calculating an expression-weighted tumor mutational burden from a biological sample. RNA is extracted from the biological sample, followed by RNA sequencing and expression-weighted analysis of the RNA sequencing data. The expression-weighted tumor mutational burden calculated by these methods can be used to prognosticate patient outcomes and to determine treatment prescriptions for patients.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202163287455P | 2021-12-08 | 2021-12-08 | |
US63/287,455 | 2021-12-08 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023107570A1 (fr) | 2023-06-15 |
Family
ID=86731165
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2022/052153 WO2023107570A1 (fr) | 2021-12-08 | 2022-12-07 | Expression-weighted tumor mutational burden as an oncology biomarker |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2023107570A1 (fr) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190018926A1 (en) * | 2017-07-14 | 2019-01-17 | Cofactor Genomics, Inc. | Immuno-oncology applications using next generation sequencing |
WO2020136133A1 (fr) * | 2018-12-23 | 2020-07-02 | F. Hoffmann-La Roche Ag | Tumor classification based on predicted tumor mutational burden |
-
2022
- 2022-12-07 WO PCT/US2022/052153 patent/WO2023107570A1/fr unknown
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22905085 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |