CN116129998A - Esophageal squamous cell carcinoma data processing method and system - Google Patents
- Publication number
- CN116129998A CN202310062863.0A
- Authority
- CN
- China
- Prior art keywords
- ddr
- subtype
- silent
- data
- sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
- 208000036765 Squamous cell carcinoma of the esophagus Diseases 0.000 title claims abstract description 93
- 208000007276 esophageal squamous cell carcinoma Diseases 0.000 title claims abstract description 93
- 206010061534 Oesophageal squamous cell carcinoma Diseases 0.000 title claims abstract description 36
- 238000003672 processing method Methods 0.000 title abstract description 10
- 238000000034 method Methods 0.000 claims abstract description 46
- 238000012163 sequencing technique Methods 0.000 claims abstract description 40
- 102100029822 B- and T-lymphocyte attenuator Human genes 0.000 claims abstract description 30
- 101000864344 Homo sapiens B- and T-lymphocyte attenuator Proteins 0.000 claims abstract description 30
- 101000801234 Homo sapiens Tumor necrosis factor receptor superfamily member 18 Proteins 0.000 claims abstract description 29
- 102100033728 Tumor necrosis factor receptor superfamily member 18 Human genes 0.000 claims abstract description 28
- 238000012545 processing Methods 0.000 claims abstract description 28
- 238000013145 classification model Methods 0.000 claims abstract description 17
- 239000003112 inhibitor Substances 0.000 claims abstract description 17
- 238000011269 treatment regimen Methods 0.000 claims abstract description 11
- 238000009169 immunotherapy Methods 0.000 claims abstract description 10
- 238000003860 storage Methods 0.000 claims abstract description 8
- 230000014509 gene expression Effects 0.000 claims description 70
- 230000004083 survival effect Effects 0.000 claims description 40
- 108090000623 proteins and genes Proteins 0.000 claims description 36
- 230000037361 pathway Effects 0.000 claims description 34
- 206010028980 Neoplasm Diseases 0.000 claims description 29
- 101150072950 BRCA1 gene Proteins 0.000 claims description 23
- 108700020463 BRCA1 Proteins 0.000 claims description 20
- 101000843497 Homo sapiens Probable ATP-dependent DNA helicase HFM1 Proteins 0.000 claims description 20
- 102100030730 Probable ATP-dependent DNA helicase HFM1 Human genes 0.000 claims description 20
- 238000004393 prognosis Methods 0.000 claims description 19
- 238000012549 training Methods 0.000 claims description 17
- 238000011282 treatment Methods 0.000 claims description 17
- 238000003559 RNA-seq method Methods 0.000 claims description 16
- 230000001394 metastastic effect Effects 0.000 claims description 12
- 206010061289 metastatic neoplasm Diseases 0.000 claims description 12
- 238000007621 cluster analysis Methods 0.000 claims description 11
- 238000004422 calculation algorithm Methods 0.000 claims description 8
- 230000002708 enhancing effect Effects 0.000 claims description 7
- 101150027186 Hfm1 gene Proteins 0.000 claims description 6
- 230000006044 T cell activation Effects 0.000 claims description 5
- 230000000694 effects Effects 0.000 claims description 5
- 229940124650 anti-cancer therapies Drugs 0.000 claims description 4
- 238000011319 anticancer therapy Methods 0.000 claims description 4
- 238000004590 computer program Methods 0.000 claims description 4
- 108091008042 inhibitory receptors Proteins 0.000 claims description 4
- 238000000491 multivariate analysis Methods 0.000 claims description 4
- 238000012315 univariate regression analysis Methods 0.000 claims description 4
- 210000000662 T-lymphocyte subset Anatomy 0.000 claims description 3
- 108091008034 costimulatory receptors Proteins 0.000 claims description 3
- 230000009545 invasion Effects 0.000 claims description 3
- 239000003446 ligand Substances 0.000 claims description 3
- 241001529453 unidentified herpesvirus Species 0.000 claims description 3
- 102000036365 BRCA1 Human genes 0.000 claims 1
- 230000011559 double-strand break repair via nonhomologous end joining Effects 0.000 claims 1
- 230000000903 blocking effect Effects 0.000 abstract description 9
- 230000005971 DNA damage repair Effects 0.000 description 44
- 210000004027 cell Anatomy 0.000 description 34
- 102100021429 DNA-directed RNA polymerase II subunit RPB1 Human genes 0.000 description 22
- 101001106401 Homo sapiens DNA-directed RNA polymerase II subunit RPB1 Proteins 0.000 description 22
- 102100025401 Breast cancer type 1 susceptibility protein Human genes 0.000 description 19
- 101710089372 Programmed cell death protein 1 Proteins 0.000 description 15
- 238000002474 experimental method Methods 0.000 description 13
- 238000004458 analytical method Methods 0.000 description 12
- 210000001519 tissue Anatomy 0.000 description 12
- RFC1 Proteins 0.000 description 9
- 230000006801 homologous recombination Effects 0.000 description 8
- 238000002744 homologous recombination Methods 0.000 description 8
- 230000005778 DNA damage Effects 0.000 description 7
- 231100000277 DNA damage Toxicity 0.000 description 7
- 210000001744 T-lymphocyte Anatomy 0.000 description 7
- 230000000875 corresponding effect Effects 0.000 description 7
- 230000003902 lesion Effects 0.000 description 6
- 230000033607 mismatch repair Effects 0.000 description 6
- 230000006780 non-homologous end joining Effects 0.000 description 6
- 208000007433 Lymphatic Metastasis Diseases 0.000 description 5
- 108020004459 Small interfering RNA Proteins 0.000 description 5
- 238000000692 Student's t-test Methods 0.000 description 5
- 210000003289 regulatory T cell Anatomy 0.000 description 5
- 238000012353 t test Methods 0.000 description 5
- 238000001262 western blot Methods 0.000 description 5
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 4
- 101000650600 Homo sapiens DNA-directed RNA polymerase I subunit RPA2 Proteins 0.000 description 4
- 101001092206 Homo sapiens Replication protein A 32 kDa subunit Proteins 0.000 description 4
- 102100035525 Replication protein A 32 kDa subunit Human genes 0.000 description 4
- 238000002648 combination therapy Methods 0.000 description 4
- 238000010276 construction Methods 0.000 description 4
- 201000010099 disease Diseases 0.000 description 4
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 238000011156 evaluation Methods 0.000 description 4
- 210000000822 natural killer cell Anatomy 0.000 description 4
- 238000011002 quantification Methods 0.000 description 4
- 108700040618 BRCA1 Genes Proteins 0.000 description 3
- 102100025570 Cancer/testis antigen 1 Human genes 0.000 description 3
- 101000856237 Homo sapiens Cancer/testis antigen 1 Proteins 0.000 description 3
- 101001096355 Homo sapiens Replication factor C subunit 3 Proteins 0.000 description 3
- 108060003951 Immunoglobulin Proteins 0.000 description 3
- 108010061593 Member 14 Tumor Necrosis Factor Receptors Proteins 0.000 description 3
- 102100037855 Replication factor C subunit 3 Human genes 0.000 description 3
- 102100028785 Tumor necrosis factor receptor superfamily member 14 Human genes 0.000 description 3
- 238000009825 accumulation Methods 0.000 description 3
- 230000004913 activation Effects 0.000 description 3
- 239000000556 agonist Substances 0.000 description 3
- 239000000090 biomarker Substances 0.000 description 3
- 238000002512 chemotherapy Methods 0.000 description 3
- 230000008878 coupling Effects 0.000 description 3
- 238000010168 coupling process Methods 0.000 description 3
- 238000005859 coupling reaction Methods 0.000 description 3
- 238000007405 data analysis Methods 0.000 description 3
- 210000003162 effector t lymphocyte Anatomy 0.000 description 3
- 230000006870 function Effects 0.000 description 3
- 102000018358 immunoglobulin Human genes 0.000 description 3
- 238000001959 radiotherapy Methods 0.000 description 3
- 238000000611 regression analysis Methods 0.000 description 3
- 238000011160 research Methods 0.000 description 3
- 230000002195 synergetic effect Effects 0.000 description 3
- 102000008203 CTLA-4 Antigen Human genes 0.000 description 2
- 108010021064 CTLA-4 Antigen Proteins 0.000 description 2
- 229940045513 CTLA4 antagonist Drugs 0.000 description 2
- 208000005623 Carcinogenesis Diseases 0.000 description 2
- 102100028907 Cullin-4A Human genes 0.000 description 2
- 108050006400 Cyclin Proteins 0.000 description 2
- 108020004414 DNA Proteins 0.000 description 2
- 102100021122 DNA damage-binding protein 2 Human genes 0.000 description 2
- 102100029995 DNA ligase 1 Human genes 0.000 description 2
- 102100024829 DNA polymerase delta catalytic subunit Human genes 0.000 description 2
- 102100024823 DNA polymerase delta subunit 2 Human genes 0.000 description 2
- 102100020782 DNA polymerase delta subunit 3 Human genes 0.000 description 2
- 102100023877 E3 ubiquitin-protein ligase RBX1 Human genes 0.000 description 2
- 101710095156 E3 ubiquitin-protein ligase RBX1 Proteins 0.000 description 2
- 102100037114 Elongin-C Human genes 0.000 description 2
- 208000000461 Esophageal Neoplasms Diseases 0.000 description 2
- 102100034533 Histone H2AX Human genes 0.000 description 2
- 101000916245 Homo sapiens Cullin-4A Proteins 0.000 description 2
- 101001041466 Homo sapiens DNA damage-binding protein 2 Proteins 0.000 description 2
- 101000863770 Homo sapiens DNA ligase 1 Proteins 0.000 description 2
- 101000909198 Homo sapiens DNA polymerase delta catalytic subunit Proteins 0.000 description 2
- 101000909189 Homo sapiens DNA polymerase delta subunit 2 Proteins 0.000 description 2
- 101000932004 Homo sapiens DNA polymerase delta subunit 3 Proteins 0.000 description 2
- 101000932009 Homo sapiens DNA polymerase delta subunit 4 Proteins 0.000 description 2
- 101000881731 Homo sapiens Elongin-C Proteins 0.000 description 2
- 101001067891 Homo sapiens Histone H2AX Proteins 0.000 description 2
- 101000619640 Homo sapiens Leucine-rich repeats and immunoglobulin-like domains protein 1 Proteins 0.000 description 2
- 101001096365 Homo sapiens Replication factor C subunit 2 Proteins 0.000 description 2
- 101000582404 Homo sapiens Replication factor C subunit 4 Proteins 0.000 description 2
- 101000582412 Homo sapiens Replication factor C subunit 5 Proteins 0.000 description 2
- 101000709305 Homo sapiens Replication protein A 14 kDa subunit Proteins 0.000 description 2
- 101000709341 Homo sapiens Replication protein A 30 kDa subunit Proteins 0.000 description 2
- 108010002350 Interleukin-2 Proteins 0.000 description 2
- 102000015335 Ku Autoantigen Human genes 0.000 description 2
- 108010025026 Ku Autoantigen Proteins 0.000 description 2
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 2
- 206010030155 Oesophageal carcinoma Diseases 0.000 description 2
- 102100024168 Polymerase delta-interacting protein 2 Human genes 0.000 description 2
- 102100036691 Proliferating cell nuclear antigen Human genes 0.000 description 2
- 101710178916 RING-box protein 1 Proteins 0.000 description 2
- 102100037851 Replication factor C subunit 2 Human genes 0.000 description 2
- 102100030542 Replication factor C subunit 4 Human genes 0.000 description 2
- 102100030541 Replication factor C subunit 5 Human genes 0.000 description 2
- 102100034372 Replication protein A 14 kDa subunit Human genes 0.000 description 2
- 102100034373 Replication protein A 30 kDa subunit Human genes 0.000 description 2
- 108060008683 Tumor Necrosis Factor Receptor Proteins 0.000 description 2
- 238000001772 Wald test Methods 0.000 description 2
- 239000000427 antigen Substances 0.000 description 2
- 102000036639 antigens Human genes 0.000 description 2
- 108091007433 antigens Proteins 0.000 description 2
- 230000033590 base-excision repair Effects 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 201000011510 cancer Diseases 0.000 description 2
- 230000036952 cancer formation Effects 0.000 description 2
- 231100000504 carcinogenesis Toxicity 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 229960004316 cisplatin Drugs 0.000 description 2
- DQLATGHUWYMOKM-UHFFFAOYSA-L cisplatin Chemical compound N[Pt](N)(Cl)Cl DQLATGHUWYMOKM-UHFFFAOYSA-L 0.000 description 2
- 238000004891 communication Methods 0.000 description 2
- 238000010835 comparative analysis Methods 0.000 description 2
- 210000001151 cytotoxic T lymphocyte Anatomy 0.000 description 2
- 230000007547 defect Effects 0.000 description 2
- 230000007812 deficiency Effects 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 201000004101 esophageal cancer Diseases 0.000 description 2
- 206010073071 hepatocellular carcinoma Diseases 0.000 description 2
- 231100000844 hepatocellular carcinoma Toxicity 0.000 description 2
- 230000028993 immune response Effects 0.000 description 2
- 230000001506 immunosuppressive effect Effects 0.000 description 2
- 230000006698 induction Effects 0.000 description 2
- 230000002401 inhibitory effect Effects 0.000 description 2
- 230000005764 inhibitory process Effects 0.000 description 2
- 210000004185 liver Anatomy 0.000 description 2
- 238000001325 log-rank test Methods 0.000 description 2
- 238000004519 manufacturing process Methods 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 230000001404 mediated effect Effects 0.000 description 2
- 210000003071 memory t lymphocyte Anatomy 0.000 description 2
- 108020004999 messenger RNA Proteins 0.000 description 2
- 230000020520 nucleotide-excision repair Effects 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 210000000056 organ Anatomy 0.000 description 2
- 238000003068 pathway analysis Methods 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 230000035755 proliferation Effects 0.000 description 2
- 230000001105 regulatory effect Effects 0.000 description 2
- 230000008439 repair process Effects 0.000 description 2
- 238000002271 resection Methods 0.000 description 2
- 238000012174 single-cell RNA sequencing Methods 0.000 description 2
- 230000004614 tumor growth Effects 0.000 description 2
- 102000003298 tumor necrosis factor receptor Human genes 0.000 description 2
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 2
- 230000003827 upregulation Effects 0.000 description 2
- 238000012795 verification Methods 0.000 description 2
- KIAPWMKFHIKQOZ-UHFFFAOYSA-N 2-[[(4-fluorophenyl)-oxomethyl]amino]benzoic acid methyl ester Chemical compound COC(=O)C1=CC=CC=C1NC(=O)C1=CC=C(F)C=C1 KIAPWMKFHIKQOZ-UHFFFAOYSA-N 0.000 description 1
- 102100023990 60S ribosomal protein L17 Human genes 0.000 description 1
- 102100026882 Alpha-synuclein Human genes 0.000 description 1
- 101100339431 Arabidopsis thaliana HMGB2 gene Proteins 0.000 description 1
- 108700020462 BRCA2 Proteins 0.000 description 1
- 102000052609 BRCA2 Human genes 0.000 description 1
- 108091032955 Bacterial small RNA Proteins 0.000 description 1
- 101150008921 Brca2 gene Proteins 0.000 description 1
- 206010006187 Breast cancer Diseases 0.000 description 1
- 208000026310 Breast neoplasm Diseases 0.000 description 1
- 102100025277 C-X-C motif chemokine 13 Human genes 0.000 description 1
- 102100027207 CD27 antigen Human genes 0.000 description 1
- 101150013553 CD40 gene Proteins 0.000 description 1
- 102100030933 CDK-activating kinase assembly factor MAT1 Human genes 0.000 description 1
- 102100037631 Centrin-2 Human genes 0.000 description 1
- 102100033674 Centromere protein X Human genes 0.000 description 1
- 108091006146 Channels Proteins 0.000 description 1
- 206010053138 Congenital aplastic anaemia Diseases 0.000 description 1
- 102100025525 Cullin-5 Human genes 0.000 description 1
- 102100026810 Cyclin-dependent kinase 7 Human genes 0.000 description 1
- 102000012698 DDB1 Human genes 0.000 description 1
- 102100035186 DNA excision repair protein ERCC-1 Human genes 0.000 description 1
- 102100031866 DNA excision repair protein ERCC-5 Human genes 0.000 description 1
- 108010035476 DNA excision repair protein ERCC-5 Proteins 0.000 description 1
- 102100031867 DNA excision repair protein ERCC-6 Human genes 0.000 description 1
- 102100031868 DNA excision repair protein ERCC-8 Human genes 0.000 description 1
- 108090000133 DNA helicases Proteins 0.000 description 1
- 102000003844 DNA helicases Human genes 0.000 description 1
- 102100033195 DNA ligase 4 Human genes 0.000 description 1
- 102100028849 DNA mismatch repair protein Mlh3 Human genes 0.000 description 1
- 102100034157 DNA mismatch repair protein Msh2 Human genes 0.000 description 1
- 102100037700 DNA mismatch repair protein Msh3 Human genes 0.000 description 1
- 102100021147 DNA mismatch repair protein Msh6 Human genes 0.000 description 1
- 102100029910 DNA polymerase epsilon subunit 2 Human genes 0.000 description 1
- 102100029905 DNA polymerase epsilon subunit 3 Human genes 0.000 description 1
- 102100036948 DNA polymerase epsilon subunit 4 Human genes 0.000 description 1
- 102100029094 DNA repair endonuclease XPF Human genes 0.000 description 1
- 102100039116 DNA repair protein RAD50 Human genes 0.000 description 1
- 102100022477 DNA repair protein complementing XP-C cells Human genes 0.000 description 1
- 102100033072 DNA replication ATP-dependent helicase DNA2 Human genes 0.000 description 1
- 102100040401 DNA topoisomerase 3-alpha Human genes 0.000 description 1
- 101100226017 Dictyostelium discoideum repD gene Proteins 0.000 description 1
- 101100170004 Dictyostelium discoideum repE gene Proteins 0.000 description 1
- 101100170005 Drosophila melanogaster pic gene Proteins 0.000 description 1
- 101150105460 ERCC2 gene Proteins 0.000 description 1
- 102100030208 Elongin-A Human genes 0.000 description 1
- 102100030209 Elongin-B Human genes 0.000 description 1
- 102100029075 Exonuclease 1 Human genes 0.000 description 1
- 108010067741 Fanconi Anemia Complementation Group N protein Proteins 0.000 description 1
- 201000004939 Fanconi anemia Diseases 0.000 description 1
- 102100029347 Fanconi anemia core complex-associated protein 100 Human genes 0.000 description 1
- 102100022352 Fanconi anemia core complex-associated protein 24 Human genes 0.000 description 1
- 102100034553 Fanconi anemia group J protein Human genes 0.000 description 1
- 102100036089 Fascin Human genes 0.000 description 1
- 102100027581 Forkhead box protein P3 Human genes 0.000 description 1
- 102100033962 GTP-binding protein RAD Human genes 0.000 description 1
- 102100031885 General transcription and DNA repair factor IIH helicase subunit XPB Human genes 0.000 description 1
- 102100035184 General transcription and DNA repair factor IIH helicase subunit XPD Human genes 0.000 description 1
- 102100038308 General transcription factor IIH subunit 1 Human genes 0.000 description 1
- 102100032864 General transcription factor IIH subunit 2 Human genes 0.000 description 1
- 102100032863 General transcription factor IIH subunit 3 Human genes 0.000 description 1
- 102100032862 General transcription factor IIH subunit 4 Human genes 0.000 description 1
- 102100032865 General transcription factor IIH subunit 5 Human genes 0.000 description 1
- 208000031448 Genomic Instability Diseases 0.000 description 1
- 102100021186 Granulysin Human genes 0.000 description 1
- 102100030386 Granzyme A Human genes 0.000 description 1
- 102100030385 Granzyme B Human genes 0.000 description 1
- 102100031150 Growth arrest and DNA damage-inducible protein GADD45 alpha Human genes 0.000 description 1
- 108700010013 HMGB1 Proteins 0.000 description 1
- 101150021904 HMGB1 gene Proteins 0.000 description 1
- 102100022536 Helicase POLQ-like Human genes 0.000 description 1
- 102100034458 Hepatitis A virus cellular receptor 2 Human genes 0.000 description 1
- 101710083479 Hepatitis A virus cellular receptor 2 homolog Proteins 0.000 description 1
- 102100037907 High mobility group protein B1 Human genes 0.000 description 1
- 102100022893 Histone acetyltransferase KAT5 Human genes 0.000 description 1
- 101000834898 Homo sapiens Alpha-synuclein Proteins 0.000 description 1
- 101000858064 Homo sapiens C-X-C motif chemokine 13 Proteins 0.000 description 1
- 101000914511 Homo sapiens CD27 antigen Proteins 0.000 description 1
- 101000583935 Homo sapiens CDK-activating kinase assembly factor MAT1 Proteins 0.000 description 1
- 101000880516 Homo sapiens Centrin-2 Proteins 0.000 description 1
- 101000944476 Homo sapiens Centromere protein X Proteins 0.000 description 1
- 101000851684 Homo sapiens Chimeric ERCC6-PGBD3 protein Proteins 0.000 description 1
- 101000765038 Homo sapiens Class E basic helix-loop-helix protein 40 Proteins 0.000 description 1
- 101000856414 Homo sapiens Cullin-5 Proteins 0.000 description 1
- 101000911952 Homo sapiens Cyclin-dependent kinase 7 Proteins 0.000 description 1
- 101000876529 Homo sapiens DNA excision repair protein ERCC-1 Proteins 0.000 description 1
- 101000920783 Homo sapiens DNA excision repair protein ERCC-6 Proteins 0.000 description 1
- 101000920778 Homo sapiens DNA excision repair protein ERCC-8 Proteins 0.000 description 1
- 101000927847 Homo sapiens DNA ligase 3 Proteins 0.000 description 1
- 101000927810 Homo sapiens DNA ligase 4 Proteins 0.000 description 1
- 101000577867 Homo sapiens DNA mismatch repair protein Mlh3 Proteins 0.000 description 1
- 101001134036 Homo sapiens DNA mismatch repair protein Msh2 Proteins 0.000 description 1
- 101001027762 Homo sapiens DNA mismatch repair protein Msh3 Proteins 0.000 description 1
- 101000968658 Homo sapiens DNA mismatch repair protein Msh6 Proteins 0.000 description 1
- 101000864190 Homo sapiens DNA polymerase epsilon subunit 2 Proteins 0.000 description 1
- 101000864175 Homo sapiens DNA polymerase epsilon subunit 3 Proteins 0.000 description 1
- 101000804960 Homo sapiens DNA polymerase epsilon subunit 4 Proteins 0.000 description 1
- 101001094659 Homo sapiens DNA polymerase kappa Proteins 0.000 description 1
- 101000743929 Homo sapiens DNA repair protein RAD50 Proteins 0.000 description 1
- 101000618535 Homo sapiens DNA repair protein complementing XP-C cells Proteins 0.000 description 1
- 101000927313 Homo sapiens DNA replication ATP-dependent helicase DNA2 Proteins 0.000 description 1
- 101000611068 Homo sapiens DNA topoisomerase 3-alpha Proteins 0.000 description 1
- 101000670537 Homo sapiens E3 ubiquitin-protein ligase RNF168 Proteins 0.000 description 1
- 101001107071 Homo sapiens E3 ubiquitin-protein ligase RNF8 Proteins 0.000 description 1
- 101001011859 Homo sapiens Elongin-A Proteins 0.000 description 1
- 101001011846 Homo sapiens Elongin-B Proteins 0.000 description 1
- 101000918264 Homo sapiens Exonuclease 1 Proteins 0.000 description 1
- 101001062402 Homo sapiens Fanconi anemia core complex-associated protein 100 Proteins 0.000 description 1
- 101000824568 Homo sapiens Fanconi anemia core complex-associated protein 24 Proteins 0.000 description 1
- 101000848171 Homo sapiens Fanconi anemia group J protein Proteins 0.000 description 1
- 101000914689 Homo sapiens Fanconi-associated nuclease 1 Proteins 0.000 description 1
- 101000861452 Homo sapiens Forkhead box protein P3 Proteins 0.000 description 1
- 101001132495 Homo sapiens GTP-binding protein RAD Proteins 0.000 description 1
- 101000920748 Homo sapiens General transcription and DNA repair factor IIH helicase subunit XPB Proteins 0.000 description 1
- 101000666405 Homo sapiens General transcription factor IIH subunit 1 Proteins 0.000 description 1
- 101000655398 Homo sapiens General transcription factor IIH subunit 2 Proteins 0.000 description 1
- 101000655391 Homo sapiens General transcription factor IIH subunit 3 Proteins 0.000 description 1
- 101000655406 Homo sapiens General transcription factor IIH subunit 4 Proteins 0.000 description 1
- 101000655402 Homo sapiens General transcription factor IIH subunit 5 Proteins 0.000 description 1
- 101001002170 Homo sapiens Glutamine amidotransferase-like class 1 domain-containing protein 3, mitochondrial Proteins 0.000 description 1
- 101001040751 Homo sapiens Granulysin Proteins 0.000 description 1
- 101001009599 Homo sapiens Granzyme A Proteins 0.000 description 1
- 101001009603 Homo sapiens Granzyme B Proteins 0.000 description 1
- 101001066158 Homo sapiens Growth arrest and DNA damage-inducible protein GADD45 alpha Proteins 0.000 description 1
- 101000899334 Homo sapiens Helicase POLQ-like Proteins 0.000 description 1
- 101001046996 Homo sapiens Histone acetyltransferase KAT5 Proteins 0.000 description 1
- 101000998139 Homo sapiens Interleukin-32 Proteins 0.000 description 1
- 101000968674 Homo sapiens MutS protein homolog 4 Proteins 0.000 description 1
- 101000968663 Homo sapiens MutS protein homolog 5 Proteins 0.000 description 1
- 101000578059 Homo sapiens Non-homologous end-joining factor 1 Proteins 0.000 description 1
- 101000738901 Homo sapiens PMS1 protein homolog 1 Proteins 0.000 description 1
- 101000611936 Homo sapiens Programmed cell death protein 1 Proteins 0.000 description 1
- 101000652359 Homo sapiens Spermatogenesis-associated protein 2 Proteins 0.000 description 1
- 101000831007 Homo sapiens T-cell immunoreceptor with Ig and ITIM domains Proteins 0.000 description 1
- 101000716102 Homo sapiens T-cell surface glycoprotein CD4 Proteins 0.000 description 1
- 101000914514 Homo sapiens T-cell-specific surface glycoprotein CD28 Proteins 0.000 description 1
- 101000735431 Homo sapiens Terminal nucleotidyltransferase 4A Proteins 0.000 description 1
- 101000648265 Homo sapiens Thymocyte selection-associated high mobility group box protein TOX Proteins 0.000 description 1
- 101000843556 Homo sapiens Transcription factor HES-1 Proteins 0.000 description 1
- 101000851370 Homo sapiens Tumor necrosis factor receptor superfamily member 9 Proteins 0.000 description 1
- 101000717428 Homo sapiens UV excision repair protein RAD23 homolog A Proteins 0.000 description 1
- 101000717424 Homo sapiens UV excision repair protein RAD23 homolog B Proteins 0.000 description 1
- 101000607909 Homo sapiens Ubiquitin carboxyl-terminal hydrolase 1 Proteins 0.000 description 1
- 101000814276 Homo sapiens WD repeat-containing protein 48 Proteins 0.000 description 1
- 229940076838 Immune checkpoint inhibitor Drugs 0.000 description 1
- 108091008036 Immune checkpoint proteins Proteins 0.000 description 1
- 102000037982 Immune checkpoint proteins Human genes 0.000 description 1
- 102000037984 Inhibitory immune checkpoint proteins Human genes 0.000 description 1
- 108091008026 Inhibitory immune checkpoint proteins Proteins 0.000 description 1
- 102100037850 Interferon gamma Human genes 0.000 description 1
- 108010074328 Interferon-gamma Proteins 0.000 description 1
- 102000000588 Interleukin-2 Human genes 0.000 description 1
- 102100033501 Interleukin-32 Human genes 0.000 description 1
- 102100033284 Leucine-rich repeats and immunoglobulin-like domains protein 3 Human genes 0.000 description 1
- 102000046961 MRE11 Homologue Human genes 0.000 description 1
- 108700019589 MRE11 Homologue Proteins 0.000 description 1
- 229910015837 MSH2 Inorganic materials 0.000 description 1
- 108010074346 Mismatch Repair Endonuclease PMS2 Proteins 0.000 description 1
- 102100037480 Mismatch repair endonuclease PMS2 Human genes 0.000 description 1
- 102000013609 MutL Protein Homolog 1 Human genes 0.000 description 1
- 108010026664 MutL Protein Homolog 1 Proteins 0.000 description 1
- 102100021157 MutS protein homolog 4 Human genes 0.000 description 1
- 102100021156 MutS protein homolog 5 Human genes 0.000 description 1
- 102100028156 Non-homologous end-joining factor 1 Human genes 0.000 description 1
- 206010033128 Ovarian cancer Diseases 0.000 description 1
- 206010061535 Ovarian neoplasm Diseases 0.000 description 1
- 102100037482 PMS1 protein homolog 1 Human genes 0.000 description 1
- 102100040884 Partner and localizer of BRCA2 Human genes 0.000 description 1
- 102000004160 Phosphoric Monoester Hydrolases Human genes 0.000 description 1
- 108090000608 Phosphoric Monoester Hydrolases Proteins 0.000 description 1
- 108091007744 Programmed cell death receptors Proteins 0.000 description 1
- 102100020949 Putative glutamine amidotransferase-like class 1 domain-containing protein 3B, mitochondrial Human genes 0.000 description 1
- 102000004909 RNF168 Human genes 0.000 description 1
- 102000004910 RNF8 Human genes 0.000 description 1
- 108050003452 SH2 domains Proteins 0.000 description 1
- 102000014400 SH2 domains Human genes 0.000 description 1
- 230000006052 T cell proliferation Effects 0.000 description 1
- 229940126547 T-cell immunoglobulin mucin-3 Drugs 0.000 description 1
- 102100024834 T-cell immunoreceptor with Ig and ITIM domains Human genes 0.000 description 1
- 102100036011 T-cell surface glycoprotein CD4 Human genes 0.000 description 1
- 102100027213 T-cell-specific surface glycoprotein CD28 Human genes 0.000 description 1
- 102100034939 Terminal nucleotidyltransferase 4A Human genes 0.000 description 1
- 102100028788 Thymocyte selection-associated high mobility group box protein TOX Human genes 0.000 description 1
- 102000000504 Tumor Suppressor p53-Binding Protein 1 Human genes 0.000 description 1
- 108010041385 Tumor Suppressor p53-Binding Protein 1 Proteins 0.000 description 1
- 102100022153 Tumor necrosis factor receptor superfamily member 4 Human genes 0.000 description 1
- 101710165473 Tumor necrosis factor receptor superfamily member 4 Proteins 0.000 description 1
- 102100040245 Tumor necrosis factor receptor superfamily member 5 Human genes 0.000 description 1
- 102100036856 Tumor necrosis factor receptor superfamily member 9 Human genes 0.000 description 1
- 102100033019 Tyrosine-protein phosphatase non-receptor type 11 Human genes 0.000 description 1
- 101710116241 Tyrosine-protein phosphatase non-receptor type 11 Proteins 0.000 description 1
- 102100020845 UV excision repair protein RAD23 homolog A Human genes 0.000 description 1
- 102100020779 UV excision repair protein RAD23 homolog B Human genes 0.000 description 1
- 102100039865 Ubiquitin carboxyl-terminal hydrolase 1 Human genes 0.000 description 1
- 102100039414 WD repeat-containing protein 48 Human genes 0.000 description 1
- 108010000443 X-ray Repair Cross Complementing Protein 1 Proteins 0.000 description 1
- 102000002258 X-ray Repair Cross Complementing Protein 1 Human genes 0.000 description 1
- 108700031763 Xeroderma Pigmentosum Group D Proteins 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 238000011226 adjuvant chemotherapy Methods 0.000 description 1
- 230000004931 aggregating effect Effects 0.000 description 1
- 230000002776 aggregation Effects 0.000 description 1
- 238000004220 aggregation Methods 0.000 description 1
- 230000006023 anti-tumor response Effects 0.000 description 1
- 239000002246 antineoplastic agent Substances 0.000 description 1
- 229940041181 antineoplastic drug Drugs 0.000 description 1
- 230000005975 antitumor immune response Effects 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 210000003719 b-lymphocyte Anatomy 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000031018 biological processes and functions Effects 0.000 description 1
- 230000017531 blood circulation Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- 238000003759 clinical diagnosis Methods 0.000 description 1
- 238000012398 clinical drug development Methods 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 230000016396 cytokine production Effects 0.000 description 1
- 101150077768 ddb1 gene Proteins 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 210000004443 dendritic cell Anatomy 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 230000009274 differential gene expression Effects 0.000 description 1
- 238000012172 direct RNA sequencing Methods 0.000 description 1
- 238000002224 dissection Methods 0.000 description 1
- 230000005782 double-strand break Effects 0.000 description 1
- 230000035622 drinking Effects 0.000 description 1
- 239000012636 effector Substances 0.000 description 1
- 238000010201 enrichment analysis Methods 0.000 description 1
- 230000002349 favourable effect Effects 0.000 description 1
- 238000002825 functional assay Methods 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 238000012165 high-throughput sequencing Methods 0.000 description 1
- 102000047758 human TNFRSF18 Human genes 0.000 description 1
- 239000012274 immune-checkpoint protein inhibitor Substances 0.000 description 1
- 238000010166 immunofluorescence Methods 0.000 description 1
- 238000010185 immunofluorescence analysis Methods 0.000 description 1
- 230000001024 immunotherapeutic effect Effects 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 230000028709 inflammatory response Effects 0.000 description 1
- 238000011221 initial treatment Methods 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 101150062507 kdc gene Proteins 0.000 description 1
- 201000007270 liver cancer Diseases 0.000 description 1
- 208000014018 liver neoplasm Diseases 0.000 description 1
- 230000007774 longterm Effects 0.000 description 1
- 210000001165 lymph node Anatomy 0.000 description 1
- 230000001926 lymphatic effect Effects 0.000 description 1
- 210000002540 macrophage Anatomy 0.000 description 1
- 230000003211 malignant effect Effects 0.000 description 1
- 201000001441 melanoma Diseases 0.000 description 1
- 238000010197 meta-analysis Methods 0.000 description 1
- 238000002493 microarray Methods 0.000 description 1
- 230000009456 molecular mechanism Effects 0.000 description 1
- 238000010172 mouse model Methods 0.000 description 1
- 101150071637 mre11 gene Proteins 0.000 description 1
- 210000004985 myeloid-derived suppressor cell Anatomy 0.000 description 1
- 238000007481 next generation sequencing Methods 0.000 description 1
- 108091027963 non-coding RNA Proteins 0.000 description 1
- 102000042567 non-coding RNA Human genes 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 230000002611 ovarian Effects 0.000 description 1
- 230000001575 pathological effect Effects 0.000 description 1
- 230000000144 pharmacologic effect Effects 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- 238000010837 poor prognosis Methods 0.000 description 1
- 239000002243 precursor Substances 0.000 description 1
- 239000000092 prognostic biomarker Substances 0.000 description 1
- 102000004169 proteins and genes Human genes 0.000 description 1
- 238000011127 radiochemotherapy Methods 0.000 description 1
- 102000005962 receptors Human genes 0.000 description 1
- 108020003175 receptors Proteins 0.000 description 1
- 230000008263 repair mechanism Effects 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 230000004043 responsiveness Effects 0.000 description 1
- 230000007017 scission Effects 0.000 description 1
- 230000011664 signaling Effects 0.000 description 1
- 230000005783 single-strand break Effects 0.000 description 1
- 230000000391 smoking effect Effects 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 230000002269 spontaneous effect Effects 0.000 description 1
- 238000001356 surgical procedure Methods 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 230000004797 therapeutic response Effects 0.000 description 1
- 238000013518 transcription Methods 0.000 description 1
- 230000035897 transcription Effects 0.000 description 1
- 238000011222 transcriptome analysis Methods 0.000 description 1
- 238000001890 transfection Methods 0.000 description 1
- 230000001052 transient effect Effects 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
- 102000035160 transmembrane proteins Human genes 0.000 description 1
- 108091005703 transmembrane proteins Proteins 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
- 108010073629 xeroderma pigmentosum group F protein Proteins 0.000 description 1
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/10—Gene or protein expression profiling; Expression-ratio estimation or normalisation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/30—Detection of binding sites or motifs
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Genetics & Genomics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Molecular Biology (AREA)
- Theoretical Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The invention discloses a method, a system, a device and a computer-readable storage medium for processing esophageal squamous cell carcinoma data. The method comprises: acquiring sequencing data of a sample to be tested; inputting the sequencing data into a constructed classification model to classify the sample as DDR-silent subtype or non-DDR-silent subtype; and, based on the classification result for the DDR-silent subtype, indicating whether a treatment regimen of anti-PD-1 antibody + anti-GITR promoter/anti-BTLA inhibitor is to be administered. The method uses the sequencing data to assign the sample to one of the two subtypes, proposes a combined treatment strategy of PD-1 blockade with GITR triggering or BTLA blockade for the DDR-silent subtype, and verifies the effectiveness of this combined immunotherapy strategy.
Description
Technical Field
The invention relates to the field of data analysis, in particular to a processing method and a processing system of esophageal squamous cell carcinoma data.
Background
Esophageal squamous cell carcinoma (ESCC) is a malignant tumor that threatens human health. The five-year survival rate of ESCC patients is less than 20% in developed countries and less than 5% in many developing countries. Notably, some primary esophageal cancer patients relapse rapidly after esophageal resection, and the prognosis of these patients remains poor. To date, no accurate molecular biomarker can predict disease progression in these primary ESCC patients, which results in inadequate clinical management. Thus, there is an urgent need to identify new prognostic biomarkers for primary ESCC.
A variety of synergistic repair mechanisms can rapidly and properly repair DNA damage in normal cells; DNA double strand breaks are repaired primarily by Homologous Recombination (HR) and non-homologous end joining (NHEJ), DNA single strand breaks are repaired primarily by mismatch repair (MMR) and nucleotide excision repair pathways (NER). DNA Damage Repair (DDR) defects can lead to accumulation of DNA damage and genomic instability, production of neoantigens, and up-regulation of expression of immune checkpoints, ultimately altering immune balance in the Tumor Microenvironment (TME). Interestingly, DDR deficiency becomes an important determinant of anti-tumor immune response by affecting antigenicity, adjuvanticity and responsiveness, which may contribute to the response of immunotherapy. Recent studies have revealed the potential of some DDR-based biomarkers in predicting immune therapeutic responses; however, the value of DDR-related features for prognostic evaluation and personalized immunotherapy has not yet been fully elucidated. Thus, revealing the correlation between the change in tumor DDR pathway and prognosis, and the regimen of personalized immunotherapy based on DDR-specific features is of paramount importance.
ESCC treatment generally involves a variety of modalities including surgery, radiation therapy and chemotherapy. Recently, immune checkpoint inhibitors (e.g., anti-PD-1) have produced significant survival benefits for advanced and metastatic ESCC. For primary esophageal cancer, chemo-radiotherapy or adjuvant chemotherapy after esophageal resection is the primary treatment modality; however, many patients are inherently resistant to the conventional treatments described above and have limited clinical efficacy. To date, no immunotherapy has been approved for the treatment of primary ESCC. Thus, there is an urgent need to fully understand the immune microenvironment and develop optimal immunotherapeutic approaches for primary ESCC patients.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems existing in the prior art. To this end, the invention provides a method for processing esophageal squamous cell carcinoma data that uses sequencing data of a sample to be tested to classify the sample as DDR-silent subtype or non-DDR-silent subtype, proposes a combined treatment strategy of PD-1 blockade with GITR triggering or BTLA blockade for the DDR-silent subtype, and verifies the effectiveness of this combined immunotherapy strategy. In the method, two independent prognostic predictor genes are obtained by analyzing sequencing data, primary ESCC samples to be tested are classified, and the prognosis of primary ESCC patients is effectively improved.
The first aspect of the application discloses a method for processing esophageal squamous cell carcinoma data, which comprises the following steps:
acquiring sequencing data of a sample to be tested;
inputting the sequencing data of the sample to be tested into the constructed classification model to obtain classification results of DDR-silent subtype and non-DDR-silent subtype;
based on the classification result for the DDR-silent subtype, indicating whether a treatment regimen of anti-PD-1 antibody + anti-GITR promoter/anti-BTLA inhibitor is to be administered.
A second aspect of the present application discloses a method for processing esophageal squamous cell carcinoma data, comprising:
acquiring sequencing data of a sample to be tested;
obtaining classification results of DDR-silent subtype and non-DDR-silent subtype based on gene expression data in the sequencing data of the sample to be tested, the gene expression data comprising the expression level of one or more of the following genes: HFM1, BRCA1;
indicating, based on the classification result for the DDR-silent subtype, whether a treatment regimen of anti-PD-1 antibody + anti-GITR promoter/anti-BTLA inhibitor is to be administered;
optionally, the classification result of the DDR-silent subtype corresponds to high HFM1 gene expression.
The sequencing data of the sample to be tested is RNA-seq data of a primary ESCC patient;
optionally, the anti-PD-1 antibody comprises: InVivoMAb anti-mouse PD-1; the anti-PD-1 antibody is preferably clone RMP1-14;
optionally, the anti-GITR promoter comprises: InVivoMAb anti-mouse GITR; the anti-GITR promoter is preferably clone DTA-1; GITR is a co-stimulatory receptor and a potential target for enhancing immunotherapy; it plays a key role in T cell activation, and its activity enhances other anti-cancer therapies through synergy;
optionally, the anti-BTLA inhibitor comprises: InVivoMAb anti-mouse BTLA; the anti-BTLA inhibitor is preferably clone 6A6; BTLA is an inhibitory receptor whose ligand is herpesvirus entry mediator (HVEM).
In a T cell subset of samples classified as the DDR-silent subtype, BTLA and PD-1 are both highly expressed, and/or GITR expression is low while PD-1 is highly expressed.
The construction method of the classification model comprises the following steps:
acquiring sequencing data of the training set samples and the survival information corresponding to each sample;
extracting pathways associated with survival, and their gene expression, from the sequencing data of the training set samples;
performing cluster analysis on the training set samples based on the survival information to obtain two groups, the DDR-silent subtype and the non-DDR-silent subtype, and characterizing the pathways of each group and their gene expression to obtain the classification model;
optionally, the survival-related pathway includes one or more of the following: MMR pathway, NER pathway, FA pathway, and NHEJ pathway.
The construction method further comprises: based on the gene expression of the survival-associated pathways, obtaining (8) DDR gene sets associated with survival outcomes and their corresponding gene expression using univariate regression analysis; and processing the survival-associated DDR gene sets and their gene expression using multivariate analysis to obtain the prognostic predictor genes and their gene expression;
and performing cluster analysis on the training set samples based on the survival information to obtain two groups, the DDR-silent subtype and the non-DDR-silent subtype, and characterizing the prognostic predictor genes of each group and their gene expression to obtain the classification model.
The cluster analysis method comprises: a consensus clustering algorithm;
optionally, the sequencing data of the training set sample includes: RNA-seq data of primary ESCC tumor tissue samples and metastatic ESCC tumor tissue samples.
A third aspect of the present application discloses a system for processing esophageal squamous cell carcinoma data, comprising:
the acquisition unit is used for acquiring sequencing data of the sample to be tested;
the classification unit is used for inputting the sequencing data of the sample to be tested into the constructed classification model to obtain classification results of the DDR-silent subtype and the non-DDR-silent subtype;
and an output unit for indicating, based on the classification result for the DDR-silent subtype, whether a treatment regimen of anti-PD-1 antibody + anti-GITR antibody/anti-BTLA antibody is to be administered.
In a fourth aspect the present application discloses a device for processing esophageal squamous cell carcinoma data, said device comprising: a memory and a processor;
the memory is used for storing program instructions; the processor is used for calling program instructions which, when executed, are used for executing the processing method of esophageal squamous cell carcinoma data.
A fifth aspect of the present application discloses a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described method of processing esophageal squamous cell carcinoma data.
The application has the following beneficial effects:
1. The application creatively discloses a method for processing esophageal squamous cell carcinoma data that exploits the specificity of the DDR-silent subtype: for samples classified as DDR-silent, it proposes a combined treatment strategy of PD-1 blockade with GITR triggering or BTLA blockade, which is of potential clinical significance for the treatment and management of primary ESCC of the DDR-silent subtype.
2. The application creatively classifies primary ESCC patients into DDR-silent and non-DDR-silent subtypes and performs clinical prognostic evaluation based on analysis of the patients' sequencing data or prognostic predictor genes. The application also discloses a model construction method for subtyping primary ESCC according to DDR pathway gene sets and their gene expression. During model construction, two independent prognostic biomarkers, BRCA1 and HFM1, were identified. The classification model can effectively predict the subsequent survival of primary ESCC patients, who frequently relapse rapidly and have a poor prognosis, and can effectively guide the immunotherapy regimen. The identification of a novel DDR molecular subtype provides new clues and perspectives on tumor heterogeneity and reveals the potential clinical significance for the treatment and management of primary ESCC patients of the DDR-silent subtype.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a method for processing esophageal squamous cell carcinoma data provided by a first aspect of an embodiment of the invention;
FIG. 2 is a schematic view of an apparatus for processing and analyzing esophageal squamous cell carcinoma data provided in a fourth aspect of the invention;
FIG. 3 is a schematic flow chart of a processing analysis system for esophageal squamous cell carcinoma data provided by a third aspect of an embodiment of the invention;
FIG. 4 is an ESCC tumor cluster analysis chart based on DDR gene map provided by the embodiment of the invention;
FIG. 5 is a graph showing the results of the BRCA1 and HFM1 mediated DNA damage in ESCC provided in the examples of the present invention;
FIG. 6 is a schematic diagram, provided by an embodiment of the invention, showing that combining PD-1 blockade with BTLA blockade or GITR triggering inhibits ESCC tumor growth more effectively.
Detailed Description
In order to enable those skilled in the art to better understand the present invention, the following description will make clear and complete descriptions of the technical solutions according to the embodiments of the present invention with reference to the accompanying drawings.
In some of the flows described in the specification and claims of the present invention and in the foregoing figures, a plurality of operations occurring in a particular order are included, but it should be understood that the operations may be performed out of order or performed in parallel, with the order of operations such as 101, 102, etc., being merely used to distinguish between the various operations, the order of the operations themselves not representing any order of execution. In addition, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that, the descriptions of "first" and "second" herein are used to distinguish different messages, devices, modules, etc., and do not represent a sequence, and are not limited to the "first" and the "second" being different types.
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments according to the invention without any creative effort, are within the protection scope of the invention.
Fig. 1 is a schematic flow chart of a processing method of esophageal squamous cell carcinoma data provided by a first aspect of an embodiment of the invention, specifically, the method comprises the following steps:
101: acquiring sequencing data of a sample to be tested;
in one embodiment, the sequencing data of the sample to be tested is RNA-seq data of a primary ESCC patient. "Primary" is used in contrast to secondary or metastatic: a disease is primary in the tissue or organ where it first arises. For example, primary hepatocellular carcinoma arises in the liver itself, whereas secondary liver cancer originates in another tissue or organ and spreads to the liver through the bloodstream or the lymphatic system. In this embodiment, a primary ESCC patient is a patient who underwent surgical resection of the primary tumor followed by radiotherapy, with or without chemotherapy.
In one embodiment, RNA-seq (transcriptome sequencing) uses high-throughput sequencing to measure the expression levels of mRNA, small RNA, non-coding RNA and the like, or of a subset of them. Over the last decade, RNA-Seq has developed rapidly and has become an indispensable tool for analyzing differential gene expression and alternative mRNA splicing at the transcriptome level. With the development of next-generation sequencing, its range of application has widened: first, in RNA biology, RNA-Seq can be applied to single-cell gene expression, protein expression and RNA structure analysis; second, spatial transcriptomics is an emerging field. Long-read and direct RNA-Seq technologies, together with better data analysis and computational tools, help biologists use RNA-Seq to gain insight into RNA biology, for example when and where transcription starts, and how in vivo folding and intermolecular interactions influence RNA function.
A transcriptome is a collection of all transcripts produced by a particular species or cell type. Transcriptome research can research gene functions and gene structures from an overall level, reveals specific biological processes and molecular mechanisms in disease occurrence processes, and has been widely applied to the fields of basic research, clinical diagnosis, drug development and the like.
In one embodiment, the sample to be tested comes from a primary ESCC patient undergoing clinical prognostic evaluation.
102: inputting the sequencing data of the sample to be tested into the constructed classification model to obtain classification results of DDR-silent subtype and non-DDR-silent subtype;
in one embodiment, in a T cell subset of samples classified as the DDR-silent subtype, BTLA and PD-1 are both highly expressed, and/or GITR expression is low while PD-1 is highly expressed. Here, PD-1, BTLA and GITR refer to the molecules measured in the transcriptome analysis, whereas anti-PD-1, anti-BTLA and anti-GITR refer to the agents used in the tumor formation experiments.
In one embodiment, the method for constructing the classification model includes:
acquiring sequencing data of the training set samples and the survival information corresponding to each sample;
extracting pathways associated with survival, and their gene expression, from the sequencing data of the training set samples; the training set samples included RNA-seq data from tumor tissue of 82 primary ESCCs and 73 ESCCs with lymph node metastasis; the patients underwent surgical resection of the primary tumor and lymph node dissection followed by radiotherapy, with or without chemotherapy. The data of these 155 patients come from the ESCC cohort of Shanxi Provincial Cancer Hospital (SCH); the RNA-seq data of the SCH cohort are deposited in Gene Expression Omnibus (GEO) under accession number GSE53625, and clinical and pathological data of 97 patients were determined by retrospective review of SCH electronic medical records, the follow-up period ending 2019/06 8 months. RNA-seq data from the Illumina HiSeq platform were collected from the UCSC Xena atlas (https://xenabrowser.net/datapages/); the RNA-seq data are TPM values with log2(x+1) normalization;
and performing cluster analysis on the training set samples based on the survival information to obtain two groups, the DDR-silent subtype and the non-DDR-silent subtype, and characterizing the pathways of each group and their gene expression to obtain the classification model.
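The log2(x+1) normalization of TPM values mentioned above can be illustrated with a minimal Python sketch. The function name and the TPM numbers are hypothetical and chosen only for illustration; the patent does not disclose code for this step.

```python
import math

def log2_plus_one(tpm_by_gene):
    """Apply the log2(x + 1) transform to a mapping of gene -> TPM value."""
    return {gene: math.log2(tpm + 1.0) for gene, tpm in tpm_by_gene.items()}

# Hypothetical TPM values for one sample (illustrative numbers only).
sample_tpm = {"BRCA1": 7.0, "HFM1": 0.0, "MSH2": 15.0}
sample_log = log2_plus_one(sample_tpm)
print(sample_log)  # {'BRCA1': 3.0, 'HFM1': 0.0, 'MSH2': 4.0}
```

Adding 1 before taking the logarithm keeps genes with zero TPM defined (they map to 0) while compressing the range of highly expressed genes.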
Optionally, the survival-related pathway includes one or more of the following: MMR pathway, NER pathway, FA pathway, and NHEJ pathway; the gene expression conditions of the survival rate-related pathway comprise gene expression profiles of one or more of the following genes: POLD1, POLD2, POLD3, POLD4, MSH2, MSH3, MSH6, MLH1, MLH3, PMS1, PMS2, MSH4, MSH5, EXO1, HMGB1, LIG1, PCNA, RFC2, RFC4, RFC3, RFC5, RFC1, RPA2, RPA3, RPA4, POLD1, POLD2, POLD3, POLD4, PCNA, RFC1, RFC2, RFC3, RFC4, RFC1, RFC3 RFC5, POLE2, POLE3, POLE4, POLK, CUL4A, DDB, DDB2, RBX1, CUL4A, DDB1, DDB2, RBX1, CETN2, RAD23B, XPC, POLR2A, POLR2A, POLR2A, POLR2A, POLR2A, POLR2A, POLR2A, POLR2A, POLR2A, POLR2A, POLR 3, CUL5, ERCC1, ERCC4, ERCC5, LIG1, TCEB2, TCEB3, A, POLR, RPA2, TCEB1, RPA2, RED 2A, POLR2A, POLR2A, POLR2A, POLR2A, POLR2A, POLR2A, POLR2A, POLR2A, POLR 3, TCEB1, RPA2 RPA3, RPA4, CDK7, ERCC2, ERCC3, GTF2H1, GTF2H2, GTF2H3, GTF2H4, GTF2H5, MNAT1, ERCC6, ERCC8, LIG3, RAD23A, POLR2, XRCC1, GADD45A, POLR 45A, POLR2, TOP3A, POLR 3A, POLR 1, BRCA2, BRIP1, PALB2, FAAP100, FAAP24, A, POLR 1, HES1, STRA13, UBE 2A, POLR2, A, POLR, DNA2, FAN1, HELQ, KAT5, RAD 51A, POLR2, USP1, WDR48, A, POLR 1, MRE11A, POLR 3, RAD50, RNF168, RNF8, TP53BP1, DCLRE 1A, POLR 4, NHEJ1, kdc, XRCC5, brcc 6, LIG4, mbe 2A, POLR, prcc 37, XRCC 37, A, POLR, XRCC5, XRCC 37.
Optionally, the survival-related pathways are obtained as follows: extracting the DDR pathway gene sets and their gene expression from the sequencing data of the training set samples; performing univariate Cox regression analysis on the DDR pathway gene sets to obtain the pathways associated with survival and their gene expression; the DDR pathway gene sets comprise: BER pathway (base excision repair, n=43), MMR pathway (mismatch repair, n=27), NER pathway (nucleotide excision repair, n=70), FA pathway (Fanconi anemia, n=36), HR pathway (homologous recombination, n=55) and NHEJ pathway (non-homologous end joining, n=37);
in one embodiment, to characterize the DDR subtypes, differential expression (DE) analysis was first performed between the DDR subtypes using the R package limma (v3.50.3) to determine subtype-specific genes. Differentially expressed genes (DEGs) were defined as those with a log fold change (logFC) <= -1 or >= 1 and an adjusted P value < 0.05. Pathway enrichment analysis was then performed on the DEGs against a curated set of hallmark pathways from MSigDB (the Molecular Signatures Database; see https://www.jianshu.com/p/99369b2f7a7d) to identify pathways enriched in each DDR subtype, as implemented in the R package clusterProfiler (version 4.2.2).
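The DEG cut-offs described above (logFC <= -1 or >= 1, adjusted P value < 0.05) can be sketched in Python as follows. This is not the limma workflow itself: the sketch assumes the P values are adjusted with the Benjamini-Hochberg procedure, which the text does not state, and the gene names and numbers are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

def select_deg(results, lfc_cutoff=1.0, alpha=0.05):
    """results: list of (gene, logFC, raw p-value); returns genes passing both cut-offs."""
    adj = benjamini_hochberg([p for _, _, p in results])
    return [gene for (gene, lfc, _), q in zip(results, adj)
            if abs(lfc) >= lfc_cutoff and q < alpha]

# Hypothetical per-gene statistics (assumed adjustment method: BH).
results = [
    ("GENE_A", 1.8, 0.001),
    ("GENE_B", -1.2, 0.01),
    ("GENE_C", 0.4, 0.02),
    ("GENE_D", 2.5, 0.6),
]
deg = select_deg(results)
print(deg)  # ['GENE_A', 'GENE_B']
```

GENE_C is excluded because its fold change is too small, and GENE_D because its adjusted P value is too large, so both criteria must hold at once.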
In one embodiment, the construction method further comprises: based on the gene expression of the survival-associated pathways, obtaining (8) DDR gene sets associated with survival outcomes and their corresponding gene expression using univariate regression analysis; and processing the survival-associated DDR gene sets and their gene expression using multivariate analysis to obtain the prognostic predictor genes and their gene expression; sex, grade, smoking history and drinking history are controlled for in the multivariate analysis;
and performing cluster analysis on the training set samples based on the survival information to obtain two groups, the DDR-silent subtype and the non-DDR-silent subtype, and characterizing the prognostic predictor genes of each group and their gene expression to obtain the classification model. The DDR-silent subtype corresponds to the gene expression pattern of the pathways associated with low survival. No specific survival threshold is used; the conclusion is drawn from a statistical comparison between the DDR-active subtype and the DDR-silent subtype.
In one embodiment, the cluster analysis method is a consensus clustering algorithm. Consensus clustering (also called consistency clustering, cluster ensembles or aggregation of clusterings) aggregates the results of multiple clustering runs. Given a number of different (input) clusterings of a particular dataset, it seeks a single (consensus) clustering that is, in some sense, a better fit than the existing ones. Consensus clustering is thus the problem of reconciling clustering information about the same dataset obtained from different sources or from different runs of the same algorithm. The clustering procedure was performed using the R package ConsensusClusterPlus, with 1000 iterations and 90% resampling. The core algorithm is k-means based on Euclidean distance; the result cannot be obtained from a single run of the algorithm.
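The idea of consensus clustering over repeated 90% subsamples with Euclidean k-means can be sketched in pure Python. This is a simplified illustration, not the ConsensusClusterPlus implementation: the function names, the toy data, the reduced repetition count (200 instead of 1000) and the final grouping rule (thresholding consensus with the first sample) are all assumptions made for the example.

```python
import random

def kmeans(points, k, rng, n_iter=20):
    """Plain k-means (Euclidean distance); returns one cluster label per point."""
    centers = rng.sample(points, k)  # random initial centres
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous centre if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def consensus_matrix(points, k=2, reps=1000, frac=0.9, seed=0):
    """Fraction of subsamples in which each pair of samples shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(k, round(frac * n))
    for _ in range(reps):
        idx = rng.sample(range(n), size)  # resample without replacement
        labels = kmeans([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]

# Toy data: two well-separated groups of five samples with two features each.
points = [[0.1 * i, 0.1 * i] for i in range(5)] + \
         [[10 + 0.1 * i, 10 + 0.1 * i] for i in range(5)]
cm = consensus_matrix(points, k=2, reps=200, frac=0.9, seed=0)
# Simplified final grouping: samples that co-cluster with sample 0 form group 0.
groups = [0 if cm[0][j] > 0.5 else 1 for j in range(len(points))]
print(groups)
```

Pairs with a consensus value near 1 almost always cluster together across subsamples and pairs near 0 almost never do, which is why the consensus matrix gives a more stable grouping than any single k-means run.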
Optionally, the univariate regression analysis is univariate Cox regression analysis;
optionally, the sequencing data of the training set sample includes: RNA-seq data of primary ESCC tumor tissue samples and metastatic ESCC tumor tissue samples. By analyzing the RNA-seq data of primary and metastatic ESCC tumor tissue samples, DDR pathway analysis determined that the DDR active subtype and DDR silent subtype have independent prognostic value in primary ESCC, but not in metastatic ESCC.
In one example, the non-DDR-silent subtype was the DDR-active subtype, and the correlation between DDR subtype and survival of ESCC with and without LNM (lymph node metastasis) was studied using stratified analysis. The results showed that primary ESCC tumors of the DDR-silent subtype had the worst survival rate (log-rank p = 0.032) compared with metastatic ESCC tumors, whereas no significant difference in survival was observed between DDR subtypes in metastatic ESCC tumors (log-rank p = 0.34). DDR pathway analysis established that the DDR-active subtype and the DDR-silent subtype have independent prognostic value in primary ESCC, but not in metastatic ESCC.
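The log-rank comparison of survival between two subtypes can be sketched as below. This is a textbook two-group log-rank test written in pure Python under stated assumptions: the survival times and event indicators are hypothetical, and the patent does not specify which software computed its log-rank p values.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value), 1 d.f.

    events are 1 for an observed death and 0 for a censored observation.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)] + \
           [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        at_risk = [g for (u, _, g) in data if u >= t]
        n = len(at_risk)
        n_a = at_risk.count(0)
        d = sum(1 for (u, e, _) in data if u == t and e)
        d_a = sum(1 for (u, e, g) in data if u == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n  # observed minus expected deaths in group A
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical follow-up times (months) and event flags for two subtypes.
silent_t, silent_e = [5, 8, 12, 14, 20], [1, 1, 1, 1, 1]
active_t, active_e = [18, 25, 30, 32, 40], [1, 0, 1, 0, 0]
chi2, p_value = logrank_test(silent_t, silent_e, active_t, active_e)
print(round(chi2, 3), round(p_value, 4))
```

In a stratified analysis such as the one described, this comparison would be run separately within each stratum (for example, tumors with and without LNM).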
In one embodiment, to further verify the association between DDR subtype and survival outcomes, the DDR subtypes of 74 tumors in the TCGA-ESCC cohort and 117 tumors in the Chen cohort were also assessed. Consistent with the findings in the present cohort, the DDR subtype aided survival prediction only for primary ESCC tumors, where it identified subsets of patients with good or poor outcome (TCGA-ESCC cohort: HR = 0.075, 95% CI 0.008-0.674, log-rank p = 0.004; Chen cohort: HR = 0.430, 95% CI 0.186-0.995, log-rank p = 0.042), and it failed to stratify survival of ESCC tumors with LNM. Multivariate Cox regression analysis showed that the DDR subtype was a powerful predictor of survival outcome independent of clinical variables, which underscores the value and robustness of the DDR subtype in predicting the survival outcome of primary ESCC patients. Stratified analysis separates the population into different strata (subgroups) according to a certain characteristic, such as gender or age, and analyzes the association between exposure and disease in each stratum separately. Its objective is to control confounding factors, adjusting for their interference and estimating the magnitude of their impact on the relationship between exposure and outcome. Stratified analysis is suited to situations in which an overall average is misleading. The TCGA-ESCC cohort comprises RNA-seq data of 74 patients, collected from UCSC Xena (https://xenabrowser.net/datapages/); the Chen cohort comprises microarray data of 117 ESCC patients from the Chinese Academy of Medical Sciences and Peking Union Medical College, with clinical data obtained from the Gene Expression Omnibus (GEO, www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE53624).
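The stratified survival comparison described above can be illustrated with a small log-rank test applied separately within each stratum. The sketch below is plain Python under stated assumptions: the function names and the toy survival times are invented for illustration and are not data from any of the cohorts, and the hazard ratios reported in the text come from Cox regression, which is not reproduced here.

```python
import math


def logrank_chi2(times_a, events_a, times_b, events_b):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_a = expected_a = variance = 0.0
    for t in sorted({t for t, e, _ in data if e}):
        n = sum(1 for t2, _, _ in data if t2 >= t)                  # at risk, both groups
        n_a = sum(1 for t2, _, g in data if t2 >= t and g == 0)     # at risk, group A
        d = sum(1 for t2, e2, _ in data if t2 == t and e2)          # events at time t
        d_a = sum(1 for t2, e2, g in data if t2 == t and e2 and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    return (observed_a - expected_a) ** 2 / variance if variance > 0 else 0.0


def chi2_1df_pvalue(chi2):
    """Upper-tail p-value of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical toy cohort: (months, event flag with 1 = death, 0 = censored)
# per DDR subtype, stratified by lymph node metastasis (LNM) status.
strata = {
    "no LNM": {
        "active": ([30, 42, 50, 55, 60, 60], [0, 1, 0, 0, 0, 0]),
        "silent": ([5, 8, 12, 15, 20, 24], [1, 1, 1, 1, 1, 0]),
    },
    "LNM": {
        "active": ([6, 10, 14, 18, 22, 30], [1, 1, 0, 1, 1, 0]),
        "silent": ([5, 11, 13, 19, 21, 28], [1, 1, 1, 0, 1, 0]),
    },
}
p_values = {}
for name, groups in strata.items():
    chi2 = logrank_chi2(*groups["active"], *groups["silent"])
    p_values[name] = chi2_1df_pvalue(chi2)
```

Running the test once per stratum, rather than once on the pooled cohort, is what lets a subtype effect that exists only in one stratum become visible instead of being averaged away.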
Optionally, the prognostic prediction genes include the BRCA1 gene and the HFM1 gene; the DDR-active subtype corresponds to high BRCA1 gene expression, and the DDR-silent subtype corresponds to high HFM1 gene expression. No specific threshold is set for the gene expression level; high expression is determined by statistical comparison between the DDR-active subtype and the DDR-silent subtype.
In one embodiment, meta-analysis of three independent cohorts was used to evaluate the prognostic value of DDR genes in primary and metastatic ESCC tumor tissues separately. BRCA1 and HFM1 were predictors of the survival outcome of primary ESCC patients but did not contribute to the prognosis of metastatic ESCC. BRCA1 was identified as a favorable prognostic factor, with high expression associated with improved survival (pooled HR of 0.22), whereas HFM1 was a risk factor, with increased expression associated with poor survival (pooled HR of 4.41).
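One common way to combine per-cohort hazard ratios into a pooled HR is fixed-effect inverse-variance weighting on the log scale. The text does not state which meta-analysis model was used, so the sketch below is an assumption; the function name `pooled_hazard_ratio` and the three input estimates are hypothetical placeholders, not the per-cohort results behind the reported pooled values.

```python
import math


def pooled_hazard_ratio(estimates, z=1.96):
    """Fixed-effect inverse-variance pooling of (HR, CI lower, CI upper) tuples."""
    weighted_sum = total_weight = 0.0
    for hr, lower, upper in estimates:
        # Standard error of log(HR), recovered from the width of the 95% CI.
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(hr)
        total_weight += weight
    log_pooled = weighted_sum / total_weight
    half_width = z / math.sqrt(total_weight)
    return (
        math.exp(log_pooled),
        math.exp(log_pooled - half_width),
        math.exp(log_pooled + half_width),
    )


# Hypothetical per-cohort estimates for one gene: (HR, 95% CI lower, 95% CI upper).
cohort_estimates = [(0.20, 0.08, 0.50), (0.25, 0.10, 0.62), (0.22, 0.07, 0.69)]
pooled, ci_low, ci_high = pooled_hazard_ratio(cohort_estimates)
```

Cohorts with narrower confidence intervals receive larger weights, so the pooled estimate lies between the smallest and largest input HR and has a tighter interval than any single cohort.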
In one embodiment, in the presence of BRCA1, cells sense and repair DNA damage, maintain genomic integrity and prevent tumorigenesis. BRCA1 deficiency can disrupt normal DDR and lead to accumulation of DNA damage. However, the role of HFM1 (an ATP-dependent DNA helicase homolog) in DDR has not been studied. To determine the roles of BRCA1 and HFM1 in DDR of ESCC cells, in vitro cell models of DNA damage induced by cisplatin (DDP) or X-ray irradiation (X-IR) were constructed: expression of BRCA1 or HFM1 was silenced by transient siRNA transfection, and the cells were then treated with DDP or X-IR. The knockdown efficiency of BRCA1 and HFM1 was examined by Western blotting. To directly assess DDR, γH2AX (a well-established marker of DNA double-strand breaks, DSBs) was visualized by immunofluorescence, and spontaneous and DDP- or IR-induced γH2AX foci were counted and analyzed. γH2AX accumulated after DDP or X-IR treatment. Furthermore, immunofluorescence analysis showed a significant increase in endogenous γH2AX accumulation in KYSE410 and KYSE450 cells following BRCA1 knockdown under IR and DDP treatment. In contrast, HFM1 knockdown significantly reduced the number of γH2AX foci in KYSE30 and KYSE450 cells treated with X-IR or DDP. These results indicate that loss of BRCA1 results in DDR defects, supporting the role of BRCA1 as a favorable prognostic factor, whereas loss of HFM1 promotes DDR, supporting the role of HFM1 as a prognostic risk factor.
103: giving a treatment regimen of whether to administer anti-PD-1 antibody + anti-GITR promoter/anti-BTLA inhibitor based on the classification result of the DDR-silent subtype;
in one embodiment, the anti-PD-1 antibody comprises InVivoMAb anti-mouse PD-1, preferably clone RMP1-14; PD-1 is programmed death receptor 1, an important immunosuppressive molecule and a member of the immunoglobulin superfamily; each antibody is identified by its name followed by its clone number;
optionally, the anti-GITR promoter comprises InVivoMAb anti-mouse GITR, preferably clone DTA-1; each antibody is identified by its name followed by its clone number. GITR acts as a co-stimulatory receptor and has become a potential target for enhancing immunotherapy. GITR promoters (agonists) are attractive agents in immunotherapy: GITR promotes activation and proliferation of effector T cells and reduces the level of regulatory T cells. GITR (TNFRSF18/CD357/AITR) is a type 1 transmembrane protein belonging to the TNFR superfamily, whose other members include OX40, CD27, CD40 and 4-1BB. Human GITR is expressed at high levels on CD4+CD25+FOXP3+ Tregs and at low levels on naive and memory T cells. Upon activation of CD8+ and CD4+ effector T cells, GITR expression on Tregs and effector T cells increases rapidly, reaching maximum levels on activated Tregs. GITR is also expressed on natural killer (NK) cells and, at low levels, on B cells, macrophages and dendritic cells, and it can be upregulated by activation, particularly on NK cells. In recent years, GITR has been widely studied as a pharmacological target. Agonist mAbs activate GITR to enhance immune and inflammatory responses, thereby enhancing anti-tumor responses. In contrast, GITR inhibitors inhibit T cell activation and immune responses. GITR agonist mAbs have therefore been further developed as anti-tumor drugs. GITR, like other co-stimulatory molecules, plays a key role in T cell activation, and its activity may enhance other anti-cancer therapies through synergy. Combination therapy with anti-PD-1 and a GITR agonist mAb achieved long-term survival in ovarian and breast cancer mouse models, stimulating IFN-γ-producing conventional T cells and suppressing immunosuppressive Tregs and myeloid-derived suppressor cells. The combination therapy successfully restored the activity of CD8+ T cells and induced proliferation of precursor effector memory T cell phenotypes in a CD226-dependent manner.
Optionally, the anti-BTLA inhibitor comprises InVivoMAb anti-mouse BTLA, preferably clone 6A6; each antibody is identified by its name followed by its clone number. BTLA acts as an inhibitory receptor, and its ligand is herpesvirus entry mediator (HVEM). BTLA is an inhibitory receptor of the immunoglobulin superfamily; it belongs to the CD28 family and is structurally similar to PD-1 and CTLA-4. It has an extracellular immunoglobulin domain, an immunoreceptor tyrosine-based inhibitory motif (ITIM), and an immunoreceptor tyrosine-based switch motif (ITSM). BTLA signaling involves phosphorylation of the ITIM and binding of SH2 domain-containing phosphatase 1 (SHP-1)/SHP-2, thereby inhibiting T cell proliferation and cytokine production. BTLA expression is higher in malignant tissue than in normal tissue, and higher BTLA expression is positively correlated with higher HVEM expression. Furthermore, BTLA expression affects prognosis: the 5-year overall survival (OS) was 48.3% in the low BTLA expression group and decreased to 17.9% when BTLA was highly expressed. Higher BTLA expression is also associated with lymph node metastasis. BTLA is associated with the expression of other co-inhibitory receptors. In advanced melanoma, Fourcade et al. demonstrated that 42% of NY-ESO-1-specific CD8+ T lymphocytes co-express BTLA and PD-1, and that these cells have a partially dysfunctional phenotype. TIM-3 and PD-1 are upregulated when NY-ESO-1-specific CD8+ T lymphocytes are stimulated with cognate antigen for prolonged periods. BTLA expression follows a different pattern, suggesting that BTLA upregulation depends on different conditions rather than on functional exhaustion driven by high antigen load. Furthermore, blocking BTLA with anti-BTLA antibodies can enhance the production of IFN-γ, TNF-α and IL-2 by NY-ESO-1-specific CD8+ T cells. Interestingly, when anti-BTLA was combined with anti-PD-1, a synergistic effect was observed in the functional assay.
In one embodiment, to characterize the gene expression status of PD-1, BTLA and GITR in the T cell compartment of ESCC at single-cell resolution, scRNA-seq data of tumor tissue from 31 primary ESCC patients were obtained, comprising 32,918 T cells. Seven T cell subsets were identified based on canonical gene markers (T helper 17 cells, cytotoxic T cells, NK T cells, exhausted CD8 T cells, memory CD8 T cells, T cells, and regulatory CD4 T cells). The regulatory CD4 T cells are marked by transcripts including CD4, IL32, FOXP3 and IL2RA; the exhausted CD8 T cells carry typical exhaustion markers, including TOX, CTLA-4, TIGIT and CXCL13; and the cytotoxic T cell subset is characterized by high expression of GNLY, GZMA, GZMB and NKG7.
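Marker-based labeling of T cell clusters can be sketched as scoring each cluster against the marker sets named above and picking the best match. This is a simplified illustration, not the annotation procedure used in the embodiment: the function name `annotate_cluster`, the averaging rule and the example expression values are assumptions, and only three of the seven subsets are included because only their markers are listed in the text.

```python
# Marker sets taken from the description; CTLA-4 is written with its gene symbol CTLA4.
MARKERS = {
    "regulatory CD4 T cells": ["CD4", "IL32", "FOXP3", "IL2RA"],
    "exhausted CD8 T cells": ["TOX", "CTLA4", "TIGIT", "CXCL13"],
    "cytotoxic T cells": ["GNLY", "GZMA", "GZMB", "NKG7"],
}


def annotate_cluster(mean_expression, markers=MARKERS):
    """Return the subset whose marker genes have the highest average expression.

    mean_expression maps gene symbol -> mean expression of that gene in the
    cluster; genes missing from the mapping are treated as not expressed.
    """
    def score(genes):
        return sum(mean_expression.get(g, 0.0) for g in genes) / len(genes)

    return max(markers, key=lambda subset: score(markers[subset]))


# Hypothetical cluster profile with high cytotoxic marker expression.
example_cluster = {"GNLY": 5.0, "GZMB": 4.0, "NKG7": 6.0, "FOXP3": 0.1}
label = annotate_cluster(example_cluster)
```

In practice such a score would usually be computed on normalized expression and checked against the cluster's differential expression results before a label is accepted.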
A second aspect of the present application discloses a method for processing esophageal squamous cell carcinoma data, comprising:
acquiring sequencing data of a sample to be tested;
based on gene expression data of sequencing data of the sample to be tested, classification results of DDR-silent subtype and non-DDR-silent subtype are obtained, and the gene expression data comprises the expression quantity of one or more genes: HFM1, BRCA1;
giving a treatment regimen of whether to administer anti-PD-1 antibody + anti-GITR promoter/anti-BTLA inhibitor based on the classification result of the DDR-silent subtype;
optionally, the classification result of the DDR-silent subtype corresponds to high HFM1 gene expression.
Fig. 2 is a processing analysis device for esophageal squamous cell carcinoma data provided by an embodiment of the invention, the device comprising: a memory and a processor; the memory is used for storing program instructions; the processor is used for calling program instructions which, when executed, are used for executing the processing method of esophageal squamous cell carcinoma data.
Fig. 3 is a processing analysis system for esophageal squamous cell carcinoma data provided by an embodiment of the invention, comprising:
an acquiring unit 301, configured to acquire sequencing data of a sample to be tested;
the classification unit 302 is configured to input sequencing data of the sample to be tested into the constructed classification model to obtain classification results of the DDR-silent subtype and the non-DDR-silent subtype;
an output unit 303 that gives a treatment regimen of whether to administer the anti-PD-1 antibody+the anti-GITR antibody/the anti-BTLA antibody based on the classification result of the DDR-silent subtype.
The processing and analyzing system for esophageal squamous cell carcinoma data provided by the embodiment of the invention comprises:
the acquisition unit is used for acquiring gene expression data of the sample to be detected; the gene expression data of the sample to be tested comprises the gene expression data of one or more of the following genes: BRCA1 gene, HFM1 gene;
the classification unit is used for obtaining classification results of the DDR-silent subtype and the non-DDR-silent subtype based on gene expression data of sequencing data of the sample to be detected, wherein the gene expression data comprises the expression quantity of one or more genes: HFM1, BRCA1;
and an output unit for giving a treatment scheme of whether the anti-PD-1 antibody+the anti-GITR promoter/the anti-BTLA inhibitor is given or not based on the classification result of the DDR-silent subtype.
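The three units above can be pictured as three small functions chained together. The sketch below is only one possible reading: the description states that the DDR-silent subtype corresponds to high HFM1 expression but sets no threshold, so the reference-cohort median used as the cutoff, the function names, the mapping from subtype to regimen and the example values are all assumptions made for illustration.

```python
from statistics import median


def acquire(sample_record):
    """Acquisition unit: pull the expression values of the genes of interest."""
    return {gene: sample_record[gene] for gene in ("HFM1", "BRCA1") if gene in sample_record}


def hfm1_cutoff_from_reference(reference_hfm1):
    """Hypothetical cutoff: median HFM1 expression of a reference cohort."""
    return median(reference_hfm1)


def classify(expression, cutoff):
    """Classification unit: 'DDR-silent' when HFM1 expression exceeds the cutoff."""
    return "DDR-silent" if expression["HFM1"] > cutoff else "non-DDR-silent"


def output_regimen(subtype):
    """Output unit: report whether the combination regimen is suggested (assumed mapping)."""
    return {
        "subtype": subtype,
        "combination_suggested": subtype == "DDR-silent",
        "regimen": "anti-PD-1 antibody + anti-GITR promoter / anti-BTLA inhibitor",
    }


# Hypothetical reference cohort and sample to be tested.
cutoff = hfm1_cutoff_from_reference([1.0, 2.0, 3.0, 4.0, 5.0])
result = output_regimen(classify(acquire({"HFM1": 4.2, "BRCA1": 0.8, "TP53": 2.0}), cutoff))
```

Keeping the cutoff as an explicit argument makes it easy to swap in whatever statistical comparison between subtypes is actually used to define high expression.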
A computer readable storage medium having stored thereon a computer program which when executed by a processor implements a method of processing esophageal squamous cell carcinoma data as described above.
FIG. 4 is a chart of ESCC tumor cluster analysis based on DDR gene map provided by the embodiment of the invention, wherein,
(A) Heat map of fold change in DDR gene expression between DDR subtypes. Red bars represent the DDR-active subtype and green bars represent the DDR-silent subtype. DDR subtypes were classified by consensus clustering. (B-D) Kaplan-Meier curves comparing the OS of the DDR-active subtype, DDR-silent subtype and transition subtype groups (log-rank test). HR and 95% CI were calculated by two-sided Wald test using univariate Cox regression. (E) Kaplan-Meier curves comparing the OS of the DDR-active subtype and DDR-silent subtype in primary esophageal squamous cell carcinoma (log-rank test). HR and 95% CI were calculated by two-sided Wald test using univariate Cox regression.
FIG. 5 shows the results of the BRCA1- and HFM1-mediated DNA damage response in ESCC provided in the examples of the present invention, wherein (A, B) KYSE410 and KYSE450 cells were transfected with BRCA1 siRNA, treated with 2 μg/ml DDP, and analyzed by Western blotting for γH2AX. (C, D) KYSE410 and KYSE450 cells were transfected with BRCA1 siRNA, exposed to IR (4 Gy), harvested at the indicated times and analyzed by Western blotting for γH2AX. (E, F) Representative pictures and quantification of γH2AX foci in control and BRCA1 knockdown KYSE410 and KYSE450 cells treated with 2 μg/ml DDP for the indicated times. Data represent three independent experiments. Each dot represents one cell; 50 cells per group were counted with ImageJ. Error bars represent ± SD. The P-value was determined by unpaired two-sided t-test. (G, H) Representative pictures and quantification of γH2AX foci in control and BRCA1 knockdown KYSE410 and KYSE450 cells treated with IR (4 Gy) for the indicated times. Data represent three independent experiments. Each dot represents one cell; 50 cells per group were counted with ImageJ. Error bars represent ± SD. The P-value was determined by unpaired two-sided t-test. (I, J) KYSE30 and KYSE450 cells were transfected with HFM1 siRNA, treated with 2 μg/ml DDP and analyzed by Western blotting for γH2AX. (K, L) KYSE30 and KYSE450 cells were transfected with HFM1 siRNA, exposed to IR (4 Gy), harvested at the indicated times and analyzed by Western blotting for γH2AX. (M, N) Representative pictures and quantification of γH2AX foci in control and HFM1 knockdown KYSE30 and KYSE450 cells treated with 2 μg/ml DDP for the indicated times. Data represent three independent experiments. Each dot represents one cell; 50 cells per group were counted with ImageJ. Error bars represent ± SD. The P-value was determined by unpaired two-sided t-test. (O, P) Representative pictures and quantification of γH2AX foci in control and HFM1 knockdown KYSE30 and KYSE450 cells treated with IR (4 Gy) for the indicated times. Data represent three independent experiments. Each dot represents one cell; 50 cells per group were counted with ImageJ. Error bars represent ± SD. The P-value was determined by unpaired two-sided t-test.
FIG. 6 shows that combined PD-1 and BTLA blockade/GITR triggering induces more effective inhibition of ESCC tumor growth, as provided by an embodiment of the present invention, wherein (A) Schematic of the treatment plan for α-GITR, α-PD-1 or combination therapy. (B, C) Tumor images and statistics of tumor weight from the syngeneic mEC model receiving the indicated treatment. (D) Schematic of the treatment plan for α-BTLA, α-PD-1 or combination therapy. (E, F) Tumor images and statistics of tumor weight from the syngeneic mEC model receiving the indicated treatment. Data in C and E represent mean ± SD and were analyzed by unpaired two-sided t-test.
The results of this verification embodiment show that assigning an inherent weight to an indicator may moderately improve the performance of the present method relative to the default settings.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing related hardware; the program may be stored in a computer-readable storage medium, and the storage medium may include: read-only memory (ROM), random access memory (RAM), magnetic disk, optical disk, and the like.
Although the computer device provided by the present invention has been described in detail above, those skilled in the art will appreciate that the foregoing description is not intended to limit the invention; the scope of the invention is defined by the appended claims.
Claims (10)
1. A method of processing esophageal squamous cell carcinoma data, comprising:
acquiring sequencing data of a sample to be tested;
inputting the sequencing data of the sample to be tested into the constructed classification model to obtain classification results of DDR-silent subtype and non-DDR-silent subtype;
based on the classification results of the DDR-silent subtype, a treatment regimen of whether anti-PD-1 antibody+anti-GITR promoter/anti-BTLA inhibitor is administered is given.
2. A method of processing esophageal squamous cell carcinoma data, comprising:
acquiring sequencing data of a sample to be tested;
based on gene expression data of sequencing data of the sample to be tested, classification results of DDR-silent subtype and non-DDR-silent subtype are obtained, and the gene expression data comprises the expression quantity of one or more genes: HFM1, BRCA1;
giving a treatment regimen of whether to administer anti-PD-1 antibody + anti-GITR promoter/anti-BTLA inhibitor based on the classification result of the DDR-silent subtype;
optionally, the classification result of the DDR-silent subtype corresponds to high HFM1 gene expression.
3. The method for processing esophageal squamous cell carcinoma data according to claim 1 or 2, wherein the sequencing data of the sample to be tested is RNA-seq data of a primary ESCC patient;
optionally, the anti-PD-1 antibody comprises InVivoMAb anti-mouse PD-1; the anti-PD-1 antibody is preferably clone RMP1-14;
optionally, the anti-GITR promoter comprises InVivoMAb anti-mouse GITR; the anti-GITR promoter is preferably clone DTA-1; GITR acts as a co-stimulatory receptor, is a potential target for enhancing immunotherapy, and plays a key role in T cell activation, with its activity enhancing other anti-cancer therapies through synergy;
optionally, the anti-BTLA inhibitor comprises InVivoMAb anti-mouse BTLA; the anti-BTLA inhibitor is preferably clone 6A6; BTLA acts as an inhibitory receptor and its ligand is herpesvirus entry mediator (HVEM).
4. The method of processing esophageal squamous cell carcinoma data according to claim 1 or 2, wherein both BTLA and PD-1 are highly expressed, and/or GITR is low-expressed and PD-1 is highly expressed in a T cell subset of the classification result of the DDR-silent subtype.
5. The method for processing esophageal squamous cell carcinoma data according to claim 1, wherein the method for constructing the classification model comprises:
acquiring sequencing data of training set samples and the survival status corresponding to each sample;
extracting survival-related pathways and their gene expression from the sequencing data of the training set samples; performing cluster analysis on the training set samples based on the survival status to obtain two classes, the DDR-silent subtype and the non-DDR-silent subtype, and characterizing the pathways of each class and their gene expression to obtain the classification model;
optionally, the survival-related pathway includes one or more of the following: MMR pathway, NER pathway, FA pathway, and NHEJ pathway.
6. The method for processing esophageal squamous cell carcinoma data of claim 5, wherein the constructing method further comprises: based on the gene expression of the survival-related pathways, obtaining a DDR gene set related to the survival outcome and the corresponding gene expression by univariate regression analysis; processing the DDR gene set related to the survival outcome and the corresponding gene expression by multivariate analysis to obtain the prognostic prediction genes and their gene expression;
and performing cluster analysis on the training set samples based on the survival status to obtain two classes, the DDR-silent subtype and the non-DDR-silent subtype, and characterizing the prognostic prediction genes of each class and their gene expression to obtain the classification model.
7. The method for processing esophageal squamous cell carcinoma data according to claim 5, wherein the method for cluster analysis is as follows: a consistency clustering algorithm;
optionally, the sequencing data of the training set sample includes: RNA-seq data of primary ESCC tumor tissue samples and metastatic ESCC tumor tissue samples.
8. A system for processing esophageal squamous cell carcinoma data, comprising:
the acquisition unit is used for acquiring sequencing data of the sample to be tested;
the classification unit is used for inputting the sequencing data of the sample to be tested into the constructed classification model to obtain classification results of the DDR-silent subtype and the non-DDR-silent subtype;
and an output unit for giving a treatment scheme of whether the anti-PD-1 antibody+the anti-GITR antibody/the anti-BTLA antibody is given or not based on the classification result of the DDR-silent subtype.
9. A device for processing esophageal squamous cell carcinoma data, the device comprising: a memory and a processor;
the memory is used for storing program instructions; the processor is adapted to invoke program instructions, which when executed, are adapted to carry out the method of processing esophageal squamous cell carcinoma data of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements a method for processing esophageal squamous cell carcinoma data as set forth in any of the preceding claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310062863.0A CN116129998B (en) | 2023-01-19 | 2023-01-19 | Esophageal squamous cell carcinoma data processing method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116129998A | 2023-05-16 |
CN116129998B | 2024-06-11 |
Family
ID=86306086
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202310062863.0A | Active | CN116129998B (en) | 2023-01-19 | 2023-01-19 | Esophageal squamous cell carcinoma data processing method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116129998B (en) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Long et al. | Development and validation of a TP53-associated immune prognostic model for hepatocellular carcinoma | |
CN116129998B (en) | Esophageal squamous cell carcinoma data processing method and system | |
Long et al. | A mutation-based gene set predicts survival benefit after immunotherapy across multiple cancers and reveals the immune response landscape | |
Chen et al. | Identification and validation of an 11-ferroptosis related gene signature and its correlation with immune checkpoint molecules in glioma | |
WO2019178283A1 (en) | Methods and compositions for treating and prognosing colorectal cancer | |
Jiang et al. | GARP correlates with tumor-infiltrating T-cells and predicts the outcome of gastric cancer | |
Emamdoost et al. | The miR-125a-3p inhibits TIM-3 expression in AML cell line HL-60 in vitro | |
Piña‑Sánchez et al. | Circulating microRNAs and their role in the immune response in triple‑negative breast cancer | |
Li et al. | SEMA6B overexpression predicts poor prognosis and correlates with the tumor immunosuppressive microenvironment in colorectal cancer | |
Pullikuth et al. | Bulk and single-cell profiling of breast tumors identifies TREM-1 as a dominant immune suppressive marker associated with poor outcomes | |
Freedman et al. | Biological aspects of cancer health disparities | |
Chen et al. | Identification of prognostic metabolism‐related genes in clear cell renal cell carcinoma | |
Kocher et al. | Multi-omic characterization of pancreatic ductal adenocarcinoma relates CXCR4 mRNA expression levels to potential clinical targets | |
Summerer et al. | Integrative analysis of the microRNA-mRNA response to radiochemotherapy in primary head and neck squamous cell carcinoma cells | |
Huang et al. | The development and validation of a novel senescence-related long-chain non-coding RNA (lncRNA) signature that predicts prognosis and the tumor microenvironment of patients with hepatocellular carcinoma | |
US20230290440A1 (en) | Urothelial tumor microenvironment (tme) types | |
Polcaro et al. | rs822336 binding to C/EBPβ and NFIC modulates induction of PD-L1 expression and predicts anti-PD-1/PD-L1 therapy in advanced NSCLC | |
Kwon et al. | Genetic and immune microenvironment characterization of HER2‐positive gastric cancer: Their association with response to trastuzumab‐based treatment | |
CN110093422A (en) | Application of LINC02159 in the diagnosis and treatment of lung adenocarcinoma | |
Dai et al. | Development of a CD8+ T cell-based molecular classification for predicting prognosis and heterogeneity in triple-negative breast cancer by integrated analysis of single-cell and bulk RNA-sequencing | |
CN114788869A (en) | Medicine for treating recurrent or metastatic nasopharyngeal carcinoma and curative effect evaluation marker thereof | |
Ye et al. | Metabolism-associated molecular classification of gastric adenocarcinoma | |
Lin et al. | LncRNA DIRC1 is a novel prognostic biomarker and correlated with immune infiltrates in stomach adenocarcinoma | |
Chen et al. | Identifying tumor antigens and immune subtypes of renal cell carcinoma for immunotherapy development | |
CN115982644B (en) | Esophageal squamous cell carcinoma classification model construction and data processing method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||