CN111739586B - Model for predicting cell proliferation activity using 87 genes as biomarkers - Google Patents
- Publication number
- CN111739586B (application CN202010554703.4A)
- Authority
- CN
- China
- Prior art keywords
- cell
- genes
- cell proliferation
- gene
- expression
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/30—Unsupervised data analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/10—Gene or protein expression profiling; Expression-ratio estimation or normalisation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B5/00—ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
Abstract
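The invention provides a model that predicts cell proliferation activity using 87 genes as biomarkers. The expression level of the cell proliferation gene set is positively correlated with the proliferative activity of cells. The invention provides a method for assessing cell proliferation activity without in vitro culture. Combined with single-cell sequencing, it allows the proliferative activity of each cell type in vivo to be measured quickly and simply. The invention can help determine whether a cancer tissue contains normal cells that proliferate markedly. When a cancer tissue contains a large number of such cells, treatment and assessment approaches that target cell proliferation markers will be confounded and may fail; when it does not, such approaches are expected to succeed. The invention therefore offers auxiliary guidance for cancer diagnosis and treatment based on cell proliferation mechanisms.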
Description
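Technical Field
The invention belongs to the fields of gene technology and biomedicine, and specifically relates to a method for predicting cell proliferation activity using 87 genes as biomarkers.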
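Background Art
Extensive, uncontrolled proliferation of cancer cells is a key mechanism of tumorigenesis. Therapies such as chemotherapy have been developed to target this mechanism. In addition, several cell proliferation marker genes, such as MKI67, MCM2 and PCNA, have been developed; their mRNA or protein expression levels are used to indicate the proliferative activity of cancer cells and thereby help assess the prognosis of patients after surgery. In particular, for the protein expression of MKI67, the Ki-67 index was developed to record the proportion of Ki-67-positive cells in a pathology sample, and it is used to assess the prognosis of patients with lung cancer, breast cancer, prostate cancer, cervical cancer, colorectal cancer, bladder cancer, lymphoma and other cancers.
Proliferation is not unique to cancer cells. Studies have shown that tissues such as human skin, bone marrow and the gastrointestinal tract contain large numbers of proliferating cells. When cancer arises in these tissues, the expression of proliferation markers such as MKI67 in a patient's postoperative cancer tissue sample comes partly from cancer cells and partly from normally proliferating cells, so it cannot accurately reflect the proliferative activity of the cancer cells. Citing insufficient supporting data, the Tumor Marker Guidelines Committee of the American Society of Clinical Oncology (ASCO) does not recommend the Ki-67 index as a routine prognostic marker for patients newly diagnosed with breast cancer. This is partly because normal immune organs such as bone marrow and lymph nodes also contain large numbers of proliferating cells: in patient pathology samples the Ki-67 index cannot precisely distinguish normally proliferating cells from tumor cells, which lowers the accuracy of the estimated cancer cell proliferative activity and, in turn, its ability to predict prognosis.
In vitro culture can help identify the proliferative capacity of normal cells, but the approach currently faces major difficulties: (1) some cells cannot be cultured in vitro; (2) for some cells the growth environment differs so much that their proliferative capacity in culture does not reflect their true proliferative capacity in vivo.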
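Summary of the Invention
In view of the differences in proliferative activity among human cell types and the difficulty of assessing proliferative activity with current culture methods, the invention provides a method for assessing cell proliferation activity using a cell proliferation gene set of 87 genes as markers. To this end, the invention adopts the following technical solution.
1. Establish the cell proliferation gene set, which consists of 87 genes. The steps are as follows:
(1) Data collection
Single-cell RNA-Seq data for different types of normal cells are obtained from the Tabula Muris database (https://tabula-muris.ds.czbiohub.org/); RNA-Seq data for cancer and adjacent (paracancerous) tissues are obtained from The Cancer Genome Atlas (TCGA) database (http://cancergenome.nih.gov/); tissue RNA-Seq data are obtained from the GTEx (Genotype-Tissue Expression Project) database (https://www.gtexportal.org/); and cell line RNA-Seq data and cell proliferation activity data are obtained from the CCLE (Cancer Cell Line Encyclopedia) database (https://portals.broadinstitute.org/ccle).
(2) Mining the stem/progenitor cell-specific gene set
a) The normal in vivo single cells in the Tabula Muris database are grouped into 81 cell types, and gene expression values are calculated for each type. For a given gene j in a given cell type i, the expression value (Xji) is calculated as follows:
where m is the total number of cells belonging to cell type i and n is the number of cells of type i whose read count for gene j is greater than 0. The expression values of all genes in cell type i are calculated in this way, and then, in turn, those of all genes in all 81 cell types.
b) The 81 cell types are divided into two groups: the stem/progenitor cell group and the other-cell group.
c) Hierarchical clustering is used to find genes that are highly expressed in the stem/progenitor cell group and expressed at very low levels in the other-cell group; these form the stem/progenitor cell-specific gene set.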
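The per-cell-type expression value of step a) can be illustrated with a minimal sketch. The formula for Xji is not reproduced in the text; because only m and n are defined, the sketch assumes Xji = n/m, that is, the fraction of cells of type i with a nonzero read count for gene j. This is an assumption, not a confirmed statement of the patented formula, and the function name and toy data are hypothetical.

```python
from collections import defaultdict

def expression_by_cell_type(cells):
    """cells: iterable of (cell_type, {gene: read_count}) pairs.

    Returns {cell_type: {gene: n / m}}, where m is the number of cells of
    that type and n is the number of those cells with read count > 0.
    ASSUMPTION: X_ji = n / m; the source defines m and n but not the formula.
    Genes never detected in a cell type are simply absent (treat as 0.0).
    """
    totals = defaultdict(int)                        # m per cell type
    nonzero = defaultdict(lambda: defaultdict(int))  # n per cell type and gene
    for cell_type, counts in cells:
        totals[cell_type] += 1
        for gene, reads in counts.items():
            if reads > 0:
                nonzero[cell_type][gene] += 1
    return {
        ct: {gene: n / m for gene, n in nonzero[ct].items()}
        for ct, m in totals.items()
    }

# Hypothetical toy input: two cell types with two cells each.
toy_cells = [
    ("progenitor", {"MKI67": 12, "TOP2A": 3}),
    ("progenitor", {"MKI67": 5, "TOP2A": 0}),
    ("neuron", {"MKI67": 0, "TOP2A": 0}),
    ("neuron", {"MKI67": 1, "TOP2A": 0}),
]
values = expression_by_cell_type(toy_cells)
```

Under this assumption every value lies between 0 and 1, which would be consistent with the 0.5 cut-off used later when selecting candidate genes.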
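(3) Mining the cell proliferation gene set
a) The expression values of the genes in the stem/progenitor cell-specific gene set are obtained for each normal tissue sample in the GTEx database. Because most normal tissues consist mainly of terminally differentiated cells without proliferative activity, hierarchical clustering is performed on these genes to obtain a cluster of 87 genes with low expression in normal tissues (the 87 genes are ANLN, ARHGAP11A, ASF1B, ATAD2, AURKA, AURKB, BIRC5, BRCA2, BUB1, BUB1B, CCNA2, CCNB1, CCNB2, CDC20, CDC45, CDCA2, CDCA5, CDCA8, CDK1, CDT1, CENPA, CENPE, CENPF, CENPH, CENPK, CENPM, CENPW, CEP55, CKAP2, CKAP2L, CLSPN, DBF4, DLGAP5, ECT2, ESCO2, FEN1, FOXM1, HIRIP3, HIST1H2AG, HMMR, KIF11, KIF15, KIF20A, KIF20B, KIF23, KIFC1, LMNB1, LMNB2, LRWD1, MAD2L1, MCM2, MKI67, NCAPG, NCAPG2, NCAPH, NDC80, NEIL3, NUDT1, NUF2, NUSAP1, PBK, PKMYT1, PLK1, PLK4, PRC1, RACGAP1, RAD51, RCC1, RRM2, SHCBP1, SKA1, SMC2, SNRNP25, SPC24, SPC25, SYCE2, TACC3, TK1, TOP2A, TPX2, TRIM59, TRIP13, TYMS, UBE2C, UBE2T, UHRF1 and VRK1).
b) The expression values of these 87 genes are obtained for the cancer and adjacent tissue samples in the TCGA database. For each gene j (1≤j≤87), its Z-score standardized expression value Yj is calculated across all cancer and adjacent tissue samples. For a sample k, the expression vector of its 87 genes is {Y1k,Y2k,…,Y87k}, and the gene set expression value is the median of this vector (median{Y1k,Y2k,…,Y87k}). A t-test is then used to compare the gene set expression values of the samples of each cancer type with those of all adjacent tissue samples. Because most cancer tissue consists of highly proliferative cancer cells, this confirms that the gene set is highly expressed in cancer tissue and weakly expressed in adjacent tissue. The cluster of 87 genes is thereby confirmed as the cell proliferation gene set.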
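The gene set expression value described in step b) (Z-score each gene across samples, then take the per-sample median over the genes) can be sketched as follows. The text does not say whether the population or the sample standard deviation is used; the sketch assumes the population form. The function names and the three-gene toy matrix are hypothetical stand-ins for the 87-gene data.

```python
from statistics import mean, median, pstdev

def zscore_rows(expr):
    """expr: {gene: [value per sample]}. Standardize each gene across samples."""
    out = {}
    for gene, vals in expr.items():
        mu, sd = mean(vals), pstdev(vals)  # ASSUMPTION: population SD
        out[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in vals]
    return out

def gene_set_scores(expr):
    """Per-sample median of the Z-scored genes (the gene set expression value)."""
    z = zscore_rows(expr)
    n_samples = len(next(iter(z.values())))
    return [median([z[g][k] for g in z]) for k in range(n_samples)]

# Hypothetical toy matrix: 3 genes measured in 3 samples.
toy_expr = {
    "G1": [1.0, 2.0, 3.0],
    "G2": [2.0, 4.0, 6.0],
    "G3": [10.0, 10.0, 40.0],
}
scores = gene_set_scores(toy_expr)
```

Taking the median rather than the mean keeps a single outlying gene from dominating the score of a sample.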
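2. Use the cell proliferation gene set to build a model that predicts cell proliferation activity. The steps are as follows:
(1) Predicting the proliferative activity of cancer cell lines cultured in vitro with the cell proliferation gene set
a) The expression values of the genes in the cell proliferation gene set are obtained for each cancer cell line in the CCLE database. As before, for each gene j (1≤j≤87), its Z-score standardized expression value Zj is calculated across all cell line samples. For a cell line sample k, the expression vector of its 87 genes is {Z1k,Z2k,…,Z87k}, and the gene set expression value is the median of these 87 values (median{Z1k,Z2k,…,Z87k}). The cell proliferation gene set expression value is calculated for every cell line sample.
b) The available cell proliferation activity data (doubling time) are obtained from the CCLE database.
c) Pearson correlation analysis is performed between the cell proliferation gene set expression values of the cell lines and their doubling times. This confirms that, in cancer cell lines derived from solid tumors, cell proliferation activity is significantly positively correlated with the expression value of the 87-gene cell proliferation gene set; that is, the expression level of the gene set can predict the proliferative activity of solid tumor-derived cancer cell lines.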
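The correlation in step c) can be sketched with a plain Pearson coefficient. A shorter doubling time means higher proliferative activity, so a negative coefficient against doubling time corresponds to a positive association with proliferative activity. The numbers below are invented for illustration and are not CCLE data.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical gene set scores and doubling times (hours) for five cell lines.
toy_scores = [-1.0, -0.5, 0.0, 0.5, 1.0]
toy_doubling_hours = [80.0, 60.0, 50.0, 35.0, 30.0]
r = pearson_r(toy_scores, toy_doubling_hours)
```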
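(2) Building the cell proliferation activity prediction model
a) The single cells in the Tabula Muris database are grouped into 81 cell types, and the gene expression values of each type are obtained as above.
b) Hierarchical clustering of the 81 cell types is performed using the expression values of the 87 genes, grouping the cell types into 2-3 clusters.
c) For each of the 81 cell types, the expression values of the 87 genes in the cell proliferation gene set are obtained and the gene set expression value is calculated. For a cell type i, with Xji the expression value of gene j (1≤j≤87), the expression vector of its 87 genes is {X1i,X2i,…,X87i}, and the gene set expression value is the median of these 87 values (median{X1i,X2i,…,X87i}).
d) Based on the clustering result, the 81 cell types are grouped into 2-3 cell type clusters. For each cluster, the vector of gene set expression values is obtained, and the values of the different clusters are compared (two-tailed t-test). With P<0.05 as the threshold, it is determined whether the gene set expression value of one cluster is significantly higher than that of the others, thereby identifying the cell types with high and with low expression of cell proliferation genes and assessing the proliferative activity of the 81 cell types.
In this way, the expression level of the cell proliferation gene set is used to assess the proliferative activity of 81 normal cell types in vivo.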
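The group comparison in step d) can be sketched as below. The text specifies only a two-tailed t-test; this sketch assumes Welch's unequal-variance form and obtains the p-value by numerically integrating the Student t density, so that it needs nothing beyond the standard library. The group names and scores are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's t-test (ASSUMPTION: unequal variances)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|) via Simpson's rule on [0, |t|]; steps must be even."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)

# Hypothetical gene set scores for two cell type clusters.
high_group = [1.2, 0.9, 1.5, 1.1]
low_group = [-0.4, -0.6, -0.2, -0.5, -0.3]
t_stat, df = welch_t(high_group, low_group)
p_value = two_tailed_p(t_stat, df)
significant = p_value < 0.05  # threshold taken from the text
```

With three clusters, the same comparison would be repeated for each pair of clusters.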
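Using single-cell RNA-Seq data, the invention identifies a set of 87 cell proliferation marker genes. With this set, the proliferative activity of the different normal cell types in vivo is assessed and the rapidly proliferating normal cell types are comprehensively identified. This helps determine whether a cancer tissue contains normal cells with proliferative activity. When a cancer tissue contains a large number of such cells, treatment and assessment approaches that target cell proliferation markers will be confounded and may fail.
The advantages of the invention are as follows. (1) Culture-based assessment of proliferative activity requires normal tissue cells to be cultured in vitro; at present some tissue cells cannot be cultured, and for others culture conditions cause large differences between proliferative activity in vitro and in vivo. The present method uses single-cell technology to assess the proliferative activity of tissue cells in vivo directly, and can therefore assess it accurately. (2) The normal-cell proliferative activity results obtained by the method help determine whether a cancer tissue contains a large number of proliferation-competent normal cells, thereby guiding cancer treatment and assessment approaches that target cell proliferation mechanisms.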
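Brief Description of the Drawings
Figure 1: Heat map of the cluster analysis of genes highly expressed in the stem/progenitor cell group. Each column is a cell type and each row is a gene. Genes with an expression level >0.5 in any cell type of the stem/progenitor cell group were clustered into 15 gene clusters, revealing one cluster of 162 genes that are markedly expressed in the stem/progenitor cell group and expressed at very low levels in other cell types. In the figure, Epi-SC denotes epidermal stem cells, and the numbers 1-7 denote Slamf1-positive multipotent progenitor cells (1), megakaryocyte-erythroid progenitor cells (2), late pro-B cells (3), granulocyte-monocyte progenitor cells (4), granulocytopoietic cells (5), common lymphoid progenitors (6) and pre-natural killer cells (7); these 8 cell types make up the stem/progenitor cell group.
Figure 2: Heat map of the cluster analysis of the stem/progenitor cell-specific genes across samples of 54 normal human tissues. Each column is a sample and each row is a gene; samples of the same color belong to the same tissue type. The stem/progenitor cell-specific genes were clustered across 17382 samples from 54 normal human tissues into 2 gene clusters. The cluster formed by 87 genes is highly expressed only in (1) cultured skin fibroblasts, (2) EBV-transformed lymphocytes and (3) testis, and weakly expressed in all other tissues.
Figure 3: Box plot of cell proliferation gene set expression levels in different cancers. Gene set expression levels were obtained for 9630 samples of cancer and adjacent tissue from 32 cancer types, and all adjacent tissue samples were pooled (Control). A t-test was used to compare the gene set expression level of each cancer type with that of Control. Cancer types whose gene set expression level is significantly higher than that of the adjacent tissue group (two-tailed P-value<0.05) are highlighted in red.
Figure 4: Correlation between cell proliferation gene set expression level and optimal cell doubling time. Each point is a cell line; the horizontal axis is the gene set expression level of the cell line and the vertical axis is its doubling time (as provided by the vendor). The Pearson correlation coefficient and P-value between the gene set expression level and the optimal doubling time are calculated.
Figure 5: Heat map of the cluster analysis of the 81 normal cell types in Tabula Muris. Each column is a cell type and each row is a gene of the cell proliferation gene set. The 81 normal cell types are grouped into three clusters according to the expression levels of the genes in the set.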
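Detailed Description of the Embodiments
The invention is described in detail below with reference to the drawings and an embodiment. The following is only a preferred embodiment of the invention. It should be noted that a person of ordinary skill in the art may make improvements and additions without departing from the method of the invention, and these should also be regarded as falling within the scope of protection of the invention.
Example 1: A cell proliferation gene set of 87 genes is established using the Tabula Muris, TCGA, GTEx and CCLE databases and used to predict the proliferative activity of the 81 normal cell types collected in vivo in the Tabula Muris database, to help determine whether cancer tissues in the TCGA database contain large numbers of proliferating normal cells, and to guide cancer treatment and assessment approaches that target cell proliferation markers.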
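(1) Data collection
Single-cell RNA-Seq data for 53760 single cells from 81 normal cell types, generated with Smart-Seq2 single-cell sequencing, were obtained from the Tabula Muris database. RNA-Seq data for 9630 cancer and adjacent tissue samples from 32 cancer types were obtained from The Cancer Genome Atlas (TCGA) database, together with prognosis data for 31 of those cancer types. RNA-Seq data for 17382 tissue samples from 54 tissues were obtained from the GTEx database. RNA-Seq data for 1019 cell line samples were obtained from the CCLE database, together with the culture mode (suspension/adherent/semi-adherent) and doubling time of some of the cell lines.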
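(2) Mining the stem/progenitor cell-specific gene set
First, the 53760 normal in vivo single cells collected in the Tabula Muris database were grouped into 81 cell types, and the gene expression values were calculated for each type. For a given gene j in a given cell type i, the expression value (Xji) is calculated as follows: where m is the total number of cells belonging to cell type i and n is the number of cells of type i whose read count for gene j is greater than 0. The expression values of all genes in cell type i were calculated in this way, and then, in turn, those of all genes in all 81 cell types.
Second, the 81 cell types were divided into two groups: the stem/progenitor cell group and the other-cell group (Table 1). The stem/progenitor cell group comprises stem cell of epidermis, Slamf1-positive multipotent progenitor cell, megakaryocyte-erythroid progenitor cell, late pro-B cell, granulocyte monocyte progenitor cell, granulocytopoietic cell, common lymphoid progenitor and pre-natural killer cell; the other-cell group comprises the remaining 73 cell types.
Finally, genes with an expression value >0.5 in any cell type of the stem/progenitor cell group were selected. Hierarchical clustering of these genes revealed a cluster of 162 genes that are highly expressed in the stem/progenitor cell group and expressed at very low levels in the other-cell group; this cluster is the stem/progenitor cell-specific gene set (Figure 1).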
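The pre-filter that precedes the clustering (keep a gene if its expression value exceeds 0.5 in at least one stem/progenitor cell type) can be sketched as follows. Only the filter is shown; the hierarchical clustering that isolates the 162-gene cluster is not reproduced. The function name and toy values are hypothetical.

```python
def candidate_genes(values, stem_types, threshold=0.5):
    """values: {cell_type: {gene: expression value}}.

    Returns the sorted genes whose value is strictly above the threshold in
    at least one of the listed stem/progenitor cell types.
    """
    keep = set()
    for cell_type in stem_types:
        for gene, value in values.get(cell_type, {}).items():
            if value > threshold:
                keep.add(gene)
    return sorted(keep)

# Hypothetical expression values for three cell types.
toy_values = {
    "stem cell of epidermis": {"MKI67": 0.9, "ALB": 0.1},
    "late pro-B cell": {"TOP2A": 0.7, "ALB": 0.0},
    "hepatocyte": {"ALB": 1.0, "MKI67": 0.02},
}
toy_stem_types = ["stem cell of epidermis", "late pro-B cell"]
selected = candidate_genes(toy_values, toy_stem_types)
```

Here ALB is excluded even though it is high in hepatocytes, because the filter looks only at the stem/progenitor cell types.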
Table 1: The 81 normal cell types in the Tabula Muris database
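(3) Mining the cell proliferation gene set
First, the expression values of the 162 genes in the stem/progenitor cell-specific gene set were obtained for each normal tissue sample in the GTEx database. For each gene, its expression values across the 17382 tissue samples were Z-score standardized; this was done for all 162 genes. The genes were then clustered hierarchically. Because most normal tissues consist mainly of terminally differentiated cells without proliferative activity, this yielded a cluster of 87 genes that are highly expressed only in (i) cultured skin fibroblasts, (ii) EBV-transformed lymphocytes and (iii) testis, and weakly expressed in the other 51 tissues (Figure 2). These 87 genes make up the cell proliferation gene set (Table 2).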
Table 2: Genes in the cell proliferation gene set
ANLN | CCNA2 | CENPA | CLSPN | KIF11 | MCM2 | PBK | SKA1 | TRIM59 |
ARHGAP11A | CCNB1 | CENPE | DBF4 | KIF15 | MKI67 | PKMYT1 | SMC2 | TRIP13 |
ASF1B | CCNB2 | CENPF | DLGAP5 | KIF20A | NCAPG | PLK1 | SNRNP25 | TYMS |
ATAD2 | CDC20 | CENPH | ECT2 | KIF20B | NCAPG2 | PLK4 | SPC24 | UBE2C |
AURKA | CDC45 | CENPK | ESCO2 | KIF23 | NCAPH | PRC1 | SPC25 | UBE2T |
AURKB | CDCA2 | CENPM | FEN1 | KIFC1 | NDC80 | RACGAP1 | SYCE2 | UHRF1 |
BIRC5 | CDCA5 | CENPW | FOXM1 | LMNB1 | NEIL3 | RAD51 | TACC3 | VRK1 |
BRCA2 | CDCA8 | CEP55 | HIRIP3 | LMNB2 | NUDT1 | RCC1 | TK1 | |
BUB1 | CDK1 | CKAP2 | HIST1H2AG | LRWD1 | NUF2 | RRM2 | TOP2A | |
BUB1B | CDT1 | CKAP2L | HMMR | MAD2L1 | NUSAP1 | SHCBP1 | TPX2 | |
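Second, the expression values of these 87 genes were obtained for a total of 9630 cancer and adjacent tissue samples from 32 cancer types (Table 3) in the TCGA database. For each gene j (1≤j≤87), its Z-score standardized expression value Yj was calculated across all cancer and adjacent tissue samples. For a sample k, the expression vector of its 87 genes is {Y1k,Y2k,...,Y87k}, and the gene set expression value is the median of this vector (median{Y1k,Y2k,...,Y87k}). A t-test was then used to compare the gene set expression values of the samples of each cancer type with those of all adjacent tissue samples. Because most cancer tissue consists of highly proliferative cancer cells, it was confirmed that in 28 of the 32 cancer types the gene set expression value is higher in cancer tissue than in adjacent tissue (P<0.05, Table 3 and Figure 3).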
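Table 3: Abbreviations and names of the 32 cancer types in the TCGA database
Abbreviation | Cancer type | Abbreviation | Cancer type |
HNSC | Head and neck squamous cell carcinoma | SKCM | Skin melanoma |
KICH | Chromophobe renal cell carcinoma | STAD | Gastric cancer |
KIRC | Clear cell renal cell carcinoma | BLCA | Bladder urothelial carcinoma |
KIRP | Papillary renal cell carcinoma | TGCT | Testicular cancer |
LAML | Acute myeloid leukemia | THCA | Thyroid cancer |
LGG | Brain lower grade glioma | THYM | Thymic cancer |
LIHC | Hepatocellular carcinoma | UCEC | Endometrial cancer |
LUAD | Lung adenocarcinoma | UCS | Uterine sarcoma |
LUSC | Lung squamous cell carcinoma | UVM | Uveal melanoma |
ACC | Adrenocortical carcinoma | BRCA | Breast invasive carcinoma |
MESO | Mesothelioma | CESC | Cervical squamous cell carcinoma and adenocarcinoma |
OV | Ovarian serous cystadenocarcinoma | COAD | Colon cancer |
PAAD | Pancreatic cancer | DLBC | Diffuse large B-cell lymphoma |
PCPG | Pheochromocytoma and paraganglioma | ESCA | Esophageal cancer |
PRAD | Prostate cancer | GBM | Glioblastoma multiforme |
READ | Rectal adenocarcinoma | SARC | Sarcoma |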
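(4) Predicting cancer cell proliferative activity with the cell proliferation gene set
First, the expression values of the 87 genes in the cell proliferation gene set were obtained for all cancer cell lines in the CCLE database.
For each gene j (1≤j≤87), its Z-score standardized expression value Zj was calculated across the 1019 cell line samples. For a cell line sample k, the expression vector of its 87 genes is {Z1k,Z2k,...,Z87k}, and the gene set expression value is the median of these 87 values (median{Z1k,Z2k,...,Z87k}).
The cell proliferation gene set expression value was calculated for every cell line sample.
Second, the CCLE database provides, for some cell lines, the culture mode (suspension/semi-adherent/adherent) and the doubling time (provided by the vendor or measured by CCLE staff). The vendor-provided doubling time was taken to represent the optimal doubling time of a cell line and can serve as an indicator of its proliferative activity. Vendor-provided doubling times were therefore obtained for 99 cell lines grown in semi-adherent or adherent culture and used as the indicator of their proliferative activity. Finally, the gene set expression values of these cell lines were found to be negatively correlated with doubling time (Pearson correlation analysis, Figure 4). Because a shorter doubling time means stronger proliferative activity, this result shows that the gene set expression value is positively correlated with cell proliferation activity. Solid tumors generally grow in semi-adherent or adherent mode, whereas hematological tumors grow in suspension. The result therefore indicates that the gene set expression value of a solid tumor can predict its cell proliferation activity.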
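(5) Assessing cell proliferation activity
First, the single cells in the Tabula Muris database were grouped into 81 cell types and the gene expression values of each type were calculated as above. For each of these 81 cell types, the expression values of the 87 genes in the cell proliferation gene set were obtained.
Second, hierarchical clustering of the 81 normal cell types was performed using the expression values of the 87 genes. The clustering used the R package "factoextra" with the distance metric "euclidean" and the clustering method "ward.D2". Based on the resulting hierarchical clustering tree, the 81 cell types were grouped into three clusters (Figure 5). One cluster is the stem/progenitor cell group; the other two come from the other-cell group (Table 1) and are the marked-proliferation group (cells that markedly express cell proliferation genes and therefore have some proliferative capacity) and the rare-proliferation group (cells that rarely express cell proliferation genes and therefore have very weak proliferative capacity).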
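The clustering step was performed with the R package "factoextra"; the following is only a standard-library Python sketch of the same idea, agglomerative clustering on Euclidean distances with Ward's minimum-variance criterion, cut at a chosen number of clusters. It is not the original R code, and the two-gene toy profiles are hypothetical stand-ins for the 87-gene profiles of the 81 cell types.

```python
def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion; returns a label per point.

    At each step the two clusters whose merge gives the smallest increase in
    within-cluster sum of squares are joined, until k clusters remain.
    """
    clusters = [([i], list(p)) for i, p in enumerate(points)]  # (members, centroid)
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                cost = len(ma) * len(mb) / (len(ma) + len(mb)) * d2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((ma + mb, centroid))
    labels = [0] * len(points)
    for label, (members, _) in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

# Hypothetical profiles: two high, two intermediate and two low cell types.
toy_profiles = [
    [0.90, 0.80], [0.85, 0.90],
    [0.40, 0.50], [0.45, 0.40],
    [0.00, 0.05], [0.05, 0.00],
]
labels = ward_clusters(toy_profiles, k=3)
```

This naive version rescans all cluster pairs at every merge, which is adequate for 81 cell types but would be slow for large inputs.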
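Finally, for each of the 81 cell types, the expression values of the 87 genes in the cell proliferation gene set were obtained and the gene set expression value was calculated. For a cell type i, with Xji the expression value of gene j (1≤j≤87), the expression vector of its 87 genes is {X1i,X2i,…,X87i}, and the gene set expression value is the median of these 87 values (median{X1i,X2i,…,X87i}). The gene set expression values of the three cell type clusters were then compared. Using a t-test with two-tailed P<0.05 as the threshold, the gene set expression value of the stem/progenitor cell group was found to be significantly higher than that of the marked-proliferation group, which in turn was higher than that of the rare-proliferation group. The 81 normal cell types were thus divided into three clusters with different proliferative capacities, giving a graded assessment of the proliferative capacity of each cell type.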
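(6) Using the proliferative activity of normal cell types to guide the clinical use of cell proliferation markers
First, based on the cell type information, it was found that solid tumors may arise in the tissues containing three cell types of the marked-proliferation group: immature B cell, basal cell of epidermis and epithelial cell of large intestine.
Second, analysis of the 31 TCGA solid tumor types with clinical prognosis information showed that DLBC (diffuse large B-cell lymphoma) tissue may contain large numbers of immature B cells; HNSC (head and neck squamous cell carcinoma), LUSC (lung squamous cell carcinoma), ESCA (esophageal cancer) and CESC (cervical squamous cell carcinoma and adenocarcinoma) all include squamous cell carcinomas, whose tissue may contain large numbers of epidermal basal cells; and COAD (colon cancer) and READ (rectal adenocarcinoma) tissue may contain large numbers of large-intestine epithelial cells. Because all 7 of these cancer types contain large numbers of normal cells with marked proliferative activity, treatment and prognosis prediction based on cell proliferation markers may fail.
Finally, TCGA cancer tissue RNA-Seq data and clinical progression-free interval (PFI) data were analyzed with a Cox proportional hazards regression model (continuous-variable approach) to determine whether expression of the proliferation marker MKI67 in postoperative cancer tissue samples from DLBC, HNSC, LUSC, ESCA, CESC, COAD and READ patients predicts their progression-free interval. With P-value<0.05 as the threshold, MKI67 expression did not predict the progression-free interval in any of these 7 cancer types (Table 4). This result is consistent with our prediction.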
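Table 4: Prognostic analysis of the cell proliferation marker MKI67 in the 7 cancer types whose tissue contains large numbers of markedly proliferating normal cell types
Cancer type | Hazard Ratio (95% confidence interval) | Type 3 P-value |
Cervical squamous cell carcinoma and adenocarcinoma | 1.03 (0.77-1.37) | 0.8566 |
Colon cancer | 0.87 (0.54-1.4) | 0.5555 |
Diffuse large B-cell lymphoma | 1 (0.55-1.81) | 0.9902 |
Esophageal cancer | 1.04 (0.89-1.21) | 0.6189 |
Head and neck squamous cell carcinoma | 0.97 (0.84-1.12) | 0.6502 |
Lung squamous cell carcinoma | 1.18 (0.95-1.45) | 0.1277 |
Rectal adenocarcinoma | 1.61 (0.73-3.55) | 0.242 |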
Claims (3)
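1. A model for predicting cell proliferation activity using 87 genes as biomarkers, characterized in that it is implemented by the following steps:
(1) Establishing the cell proliferation gene set:
1) Data collection
Single-cell RNA-Seq data for different types of normal cells are obtained from the Tabula Muris database, RNA-Seq data for cancer and adjacent tissues from The Cancer Genome Atlas database, tissue RNA-Seq data from the GTEx database, and cell line RNA-Seq data and cell proliferation activity data from the CCLE database;
2) Mining the stem/progenitor cell-specific gene set
a) The normal in vivo single cells in the Tabula Muris database are grouped into 81 cell types and gene expression values are calculated for each type; for a given gene j in a given cell type i, the expression value (Xji) is calculated as follows:
where m is the total number of cells belonging to cell type i and n is the number of cells of type i whose read count for gene j is greater than 0; the expression values of all genes in cell type i are calculated, and then, in turn, those of all genes in all 81 cell types;
b) The 81 cell types are divided into two groups: the stem/progenitor cell group and the other-cell group;
c) Hierarchical clustering is used to find genes that are highly expressed in the stem/progenitor cell group and expressed at very low levels in the other-cell group; these form the stem/progenitor cell-specific gene set;
3) Mining the cell proliferation gene set
a) The expression values of the genes in the stem/progenitor cell-specific gene set are obtained for each normal tissue sample in the GTEx database; because most normal tissues consist mainly of terminally differentiated cells without proliferative activity, hierarchical clustering is performed on the genes to obtain a cluster of 87 genes with low expression in normal tissues;
b) The expression values of these 87 genes are obtained for the cancer and adjacent tissue samples in the TCGA database; for each gene j, its Z-score standardized expression value Yj is calculated across all cancer and adjacent tissue samples; for a sample k, the expression vector of its 87 genes is {Y1k,Y2k,…,Y87k}, and the gene set expression value is the median of this vector; a t-test is then used to compare the gene set expression values of the samples of each cancer type with those of all adjacent tissue samples; because most cancer tissue consists of highly proliferative cancer cells, this confirms that the gene set is highly expressed in cancer tissue and weakly expressed in adjacent tissue, and the cluster of 87 genes is confirmed as the cell proliferation gene set;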
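(2) Using the cell proliferation gene set to build a model that predicts cell proliferation activity:
1) Predicting the proliferative activity of cancer cell lines cultured in vitro with the cell proliferation gene set
a) The expression values of the genes in the cell proliferation gene set are obtained for each cancer cell line in the CCLE database; for each gene j, its Z-score standardized expression value Zj is calculated across all cell line samples; for a cell line sample k, the expression vector of its 87 genes is {Z1k,Z2k,…,Z87k}, and the gene set expression value is the median of these 87 values; the cell proliferation gene set expression value is calculated for every cell line sample;
b) The available cell proliferation activity data are obtained from the CCLE database;
c) Pearson correlation analysis is performed between the cell proliferation gene set expression values of the cell lines and their doubling times, confirming that, in cancer cell lines derived from solid tumors, cell proliferation activity is significantly positively correlated with the expression value of the 87-gene cell proliferation gene set; the proliferative activity of solid tumor-derived cancer cell lines is predicted from the expression level of the gene set;
2) Building the cell proliferation activity prediction model
a) The single cells in the Tabula Muris database are grouped into 81 cell types and the gene expression values of each type are obtained;
b) Hierarchical clustering of the 81 cell types is performed using the expression values of the 87 genes, grouping the cell types into 2-3 clusters;
c) For each of the 81 cell types, the expression values of the 87 genes in the cell proliferation gene set are obtained and the gene set expression value is calculated; for a cell type i, with Xji the expression value of gene j, the expression vector of its 87 genes is {X1i,X2i,…,X87i}, and the gene set expression value is the median of these 87 values;
d) Based on the clustering result, the 81 cell types are grouped into 2-3 cell type clusters; for each cluster, the vector of gene set expression values is obtained, and the values of the different clusters are compared; with P<0.05 as the threshold, it is determined whether the gene set expression value of one cluster is significantly higher than that of the others, thereby identifying the cell types with high and with low expression of cell proliferation genes and assessing the proliferative activity of the 81 cell types.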
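2. The model for predicting cell proliferation activity using 87 genes as biomarkers according to claim 1, characterized in that the 87 genes are: ANLN, ARHGAP11A, ASF1B, ATAD2, AURKA, AURKB, BIRC5, BRCA2, BUB1, BUB1B, CCNA2, CCNB1, CCNB2, CDC20, CDC45, CDCA2, CDCA5, CDCA8, CDK1, CDT1, CENPA, CENPE, CENPF, CENPH, CENPK, CENPM, CENPW, CEP55, CKAP2, CKAP2L, CLSPN, DBF4, DLGAP5, ECT2, ESCO2, FEN1, FOXM1, HIRIP3, HIST1H2AG, HMMR, KIF11, KIF15, KIF20A, KIF20B, KIF23, KIFC1, LMNB1, LMNB2, LRWD1, MAD2L1, MCM2, MKI67, NCAPG, NCAPG2, NCAPH, NDC80, NEIL3, NUDT1, NUF2, NUSAP1, PBK, PKMYT1, PLK1, PLK4, PRC1, RACGAP1, RAD51, RCC1, RRM2, SHCBP1, SKA1, SMC2, SNRNP25, SPC24, SPC25, SYCE2, TACC3, TK1, TOP2A, TPX2, TRIM59, TRIP13, TYMS, UBE2C, UBE2T, UHRF1 and VRK1.
3. The model for predicting cell proliferation activity using 87 genes as biomarkers according to claim 1, characterized in that: in step (2), obtaining the available cell proliferation activity data from the CCLE database means obtaining the available cell doubling time data from the CCLE database, and a t-test is used to compare the cell proliferation gene set expression values of the different cell type clusters.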
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
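CN202010554703.4A CN111739586B (zh) | 2020-06-17 | 2020-06-17 | Model for predicting cell proliferation activity using 87 genes as biomarkers |
PCT/CN2020/101544 WO2021253544A1 (zh) | 2020-06-17 | 2020-07-13 | Model for predicting cell proliferation activity using 87 genes as biomarkers |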
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010554703.4A CN111739586B (zh) | 2020-06-17 | 2020-06-17 | Model for predicting cell proliferation activity using 87 genes as biomarkers |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111739586A CN111739586A (zh) | 2020-10-02 |
CN111739586B true CN111739586B (zh) | 2024-04-05 |
Family
ID=72649544
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010554703.4A Active CN111739586B (zh) | 2020-06-17 | 2020-06-17 | Model for predicting cell proliferation activity using 87 genes as biomarkers |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111739586B (zh) |
WO (1) | WO2021253544A1 (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114042161B (zh) * | 2021-11-17 | 2023-05-30 | Zhejiang Provincial People's Hospital | Use of a CENPW inhibitor in the preparation of antitumor drugs |
GB2613386A (en) * | 2021-12-02 | 2023-06-07 | Apis Assay Tech Limited | Diagnostic test |
CN117954097A (zh) * | 2023-03-16 | 2024-04-30 | Air Force Medical University of the Chinese People's Liberation Army | Lung adenocarcinoma prognosis evaluation system and device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
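CN108424969A (zh) * | 2018-06-06 | 2018-08-21 | Shenzhen Yikang Biotechnology Co., Ltd. | Biomarker and method for diagnosing or estimating mortality risk |
CN109852671A (zh) * | 2011-07-19 | 2019-06-07 | Koninklijke Philips N.V. | Assessment of cellular signaling pathway activity using probabilistic modeling of target gene expression |
CN109859801A (zh) * | 2019-02-14 | 2019-06-07 | Liaoning Cancer Hospital | Model using seven genes as biomarkers to predict the prognosis of lung squamous cell carcinoma, and method for establishing it |
CN110441523A (zh) * | 2019-08-09 | 2019-11-12 | Beijing Chaoyang Hospital, Capital Medical University | Use of ATAD2 protein as a marker for assessing the proliferation status of ovarian cancer |
KR20200038660A (ko) * | 2018-10-04 | 2020-04-14 | Samsung Life Public Welfare Foundation | Method for selecting biomarkers and method for providing information for cancer diagnosis using the same |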
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150294062A1 (en) * | 2012-10-29 | 2015-10-15 | Ontario Institute For Cancer Research (Oicr) | Method for Identifying a Target Molecular Profile Associated with a Target Cell Population |
US11427873B2 (en) * | 2018-08-10 | 2022-08-30 | Omniseq, Inc. | Methods and systems for assessing proliferative potential and resistance to immune checkpoint blockade |
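CN109797221A (zh) * | 2019-03-13 | 2019-05-24 | Shanghai Tenth People's Hospital | Biomarker combination for molecular subtyping and/or prognosis prediction of muscle-invasive bladder cancer, and use thereof |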
-
2020
- 2020-06-17 CN CN202010554703.4A patent/CN111739586B/zh active Active
- 2020-07-13 WO PCT/CN2020/101544 patent/WO2021253544A1/zh active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
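CN109852671A (zh) * | 2011-07-19 | 2019-06-07 | Koninklijke Philips N.V. | Assessment of cellular signaling pathway activity using probabilistic modeling of target gene expression |
CN108424969A (zh) * | 2018-06-06 | 2018-08-21 | Shenzhen Yikang Biotechnology Co., Ltd. | Biomarker and method for diagnosing or estimating mortality risk |
KR20200038660A (ko) * | 2018-10-04 | 2020-04-14 | Samsung Life Public Welfare Foundation | Method for selecting biomarkers and method for providing information for cancer diagnosis using the same |
CN109859801A (zh) * | 2019-02-14 | 2019-06-07 | Liaoning Cancer Hospital | Model using seven genes as biomarkers to predict the prognosis of lung squamous cell carcinoma, and method for establishing it |
CN110441523A (zh) * | 2019-08-09 | 2019-11-12 | Beijing Chaoyang Hospital, Capital Medical University | Use of ATAD2 protein as a marker for assessing the proliferation status of ovarian cancer |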
Non-Patent Citations (1)
Title |
---|
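MicroRNA: a novel biomarker for the diagnosis, prediction and treatment of lung cancer; Guan Yaping (review); Wang Jun (reviewer); Wang Baocheng (reviewer); Chinese Journal of Cancer Biotherapy; Vol. 20 (No. 4); 498-505 * |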
Also Published As
Publication number | Publication date |
---|---|
WO2021253544A1 (zh) | 2021-12-23 |
CN111739586A (zh) | 2020-10-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111739586B (zh) | Model for predicting cell proliferation activity using 87 genes as biomarkers | |
Keren et al. | A structured tumor-immune microenvironment in triple negative breast cancer revealed by multiplexed ion beam imaging | |
Santagata et al. | Taxonomy of breast cancer based on normal cell phenotype predicts outcome | |
Vatter et al. | High-dimensional phenotyping identifies age-emergent cells in human mammary epithelia | |
Abreu et al. | Male breast cancer: Looking for better prognostic subgroups | |
Saare et al. | High-throughput sequencing approach uncovers the miRNome of peritoneal endometriotic lesions and adjacent healthy tissues | |
Schwede et al. | Stem cell-like gene expression in ovarian cancer predicts type II subtype and prognosis | |
Liu et al. | Identification of a gene signature for renal cell carcinoma–associated fibroblasts mediating cancer progression and affecting prognosis | |
Yin et al. | Integrative radiomics expression predicts molecular subtypes of primary clear cell renal cell carcinoma | |
Liu et al. | Discovery of microarray-identified genes associated with ovarian cancer progression | |
Kawaguchi et al. | Gene Expression Signature–Based Prognostic Risk Score in Patients with Primary Central Nervous System Lymphoma | |
Amiri Souri et al. | Cancer Grade Model: a multi-gene machine learning-based risk classification for improving prognosis in breast cancer | |
Wang et al. | Single-cell transcriptional regulation and genetic evolution of neuroendocrine prostate cancer | |
Riester et al. | Distance in cancer gene expression from stem cells predicts patient survival | |
Goh et al. | Transcriptomics indicate nuclear division and cell adhesion not recapitulated in MCF7 and MCF10A compared to luminal A breast tumours | |
Wang et al. | A comprehensive understanding of ovarian carcinoma survival prognosis by novel biomarkers. | |
Gross et al. | A multi-omic analysis of MCF10A cells provides a resource for integrative assessment of ligand-mediated molecular and phenotypic responses | |
Bell et al. | PanIN and CAF transitions in pancreatic carcinogenesis revealed with spatial data integration | |
Li et al. | [Retracted] Identification of Tumor Tissue of Origin with RNA‐Seq Data and Using Gradient Boosting Strategy | |
Ouyang et al. | Integrated analysis of mRNA and extrachromosomal circular DNA profiles to identify the potential mRNA biomarkers in breast cancer | |
Armanious et al. | Digital gene expression analysis might aid in the diagnosis of thyroid cancer | |
Bell et al. | Spatial transcriptomics of FFPE pancreatic intraepithelial neoplasias reveals cellular and molecular alterations of progression to pancreatic ductal carcinoma | |
Ajaib et al. | GBMdeconvoluteR accurately infers proportions of neoplastic and immune cell populations from bulk glioblastoma transcriptomics data | |
Diaz‐Romero et al. | Hierarchical clustering of flow cytometry data for the study of conventional central chondrosarcoma | |
CN117954097A (zh) | Lung adenocarcinoma prognosis evaluation system and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||