NZ752676B2 - Gene expression profile algorithm for calculating a recurrence score for a patient with kidney cancer - Google Patents
- Publication number
- NZ752676B2
- Authority
- NZ
- New Zealand
Abstract
The present invention provides algorithm-based molecular assays that involve measurement of expression levels of genes from a biological sample obtained from a kidney cancer patient. The present invention also provides methods of obtaining a quantitative score for a patient with kidney cancer based on measurement of expression levels of genes from a biological sample obtained from a kidney cancer patient. The genes may be grouped into functional gene subsets for calculating the quantitative score and the gene subsets may be weighted according to their contribution to cancer recurrence. In particular, the present invention relates to the gene subset EIF4EBP1, LMNB1 and TUBB2A assigned to the cell growth/division gene group.
Description
GENE EXPRESSION PROFILE ALGORITHM FOR CALCULATING A
RECURRENCE SCORE FOR A PATIENT WITH KIDNEY CANCER
TECHNICAL FIELD
The present disclosure relates to molecular diagnostic assays that provide
information concerning gene expression profiles to determine prognostic information for
cancer patients. Specifically, the present disclosure provides an algorithm comprising genes,
or co-expressed genes, the expression levels of which may be used to determine the
likelihood that a kidney cancer patient will experience a positive or a negative clinical
outcome. The present disclosure provides gene expression information useful for calculating
a recurrence score for a patient with kidney cancer.
INTRODUCTION
The American Cancer Society estimates that in 2013 there will be about
65,150 new cases of kidney cancer and about 13,680 deaths from kidney cancer in the United
States. (American Cancer Society, Kidney Cancer (Adult) Renal Cell Carcinoma Overview,
available online at http://www.cancer.org/acs/groups/cid/documents/webcontent/OO3052-pdf.pdf).
Renal cell carcinoma (RCC), also called renal adenocarcinoma or hypernephroma, is the most
common type of kidney cancer, accounting for more than 9 out of 10 cases of kidney cancer,
and it accounts for approximately 2-3% of all malignancies. (Id.; National Comprehensive
Cancer Network Guidelines (NCCN) Clinical Practice Guidelines in Oncology, Kidney
Cancer, Version 1.2013.) For unknown reasons, the rate of RCC has increased by 2% per
year for the past 65 years. (NCCN Clinical Practice Guidelines in Oncology, Kidney
Cancer.)
There are multiple subtypes of RCC, including clear cell renal cell carcinoma,
papillary renal cell carcinoma, chromophobe renal cell carcinoma, collecting duct renal cell
carcinoma, and unclassified renal cell carcinoma. Clear cell renal cell carcinoma (ccRCC) is
the most common subtype of renal cell carcinoma, with about 7 out of 10 patients with RCC
having ccRCC. (American Cancer Society, Kidney Cancer (Adult) Renal Cell Carcinoma
Overview)
Evaluation and staging of RCC includes visualization via imaging methods,
such as computed tomographic (CT) scan, ultrasound, or magnetic resonance imaging (MRI),
and physical and laboratory evaluations. Needle biopsy may be performed to diagnose RCC
and guide surveillance of disease. Physicians classify tumors based on clinical and
pathological features, such as tumor stage, regional lymph node status, tumor size, nuclear
grade, and histologic necrosis. Such designations can be subjective, and there is a lack of
concordance among pathology laboratories in making such determinations (Al-Ayanti M et al.
(2003) Arch Pathol Lab Med 127, 593-596), highlighting the need for more objective
designations.
Treatment of RCC varies depending on the stage of the cancer, the patient’s
overall health, the likely side effects of treatment, the chances of curing the disease, the
chances of improving survival, and/or relieving symptoms associated with the cancer.
Surgery is the main treatment for RCC that can be removed. (American Cancer Society
Kidney Cancer (Adult) Renal Cell Carcinoma Overview.) Even after surgical excision, 20-
% of patients with localized tumors experience relapse, most of which occur within three
years. (NCCN Clinical Practice Guidelines in Oncology, Kidney Cancer.) The lung
is the most common site of distant relapse, occurring in 50-60% of patients. (Id.)
If a patient has a small tumor, e.g., < 3 cm, however, the physician may not
perform surgery, instead opting to monitor the tumor’s growth. Such active surveillance may
allow some patients to avoid surgery and other treatments. In non-surgical candidates,
particularly the elderly and those with competing health risks, ablative techniques, such as
cryosurgery or radiofrequency ablation, or active surveillance may be used.
Physicians require prognostic information to help them make informed
treatment decisions for patients with RCC and recruit appropriate high risk patients into
clinical trials in order to increase the statistical power of the trial. Existing methods are based
on subjective measures and therefore may provide inaccurate prognostic information.
This application discloses molecular assays that involve measurement of
expression level(s) of one or more genes or gene subsets from a biological sample obtained
from a kidney cancer patient. For example, the likelihood of a clinical outcome may be
described in terms of a quantitative score based on observed clinical features of the disease or
recurrence-free interval.
In addition, this application discloses methods of obtaining a recurrence score
(RS) for a patient with kidney cancer based on measurement of expression level(s) of one or
more genes or gene subsets from a biological sample obtained from a kidney cancer patient.
The present disclosure provides a method for obtaining a recurrence score for
a patient with kidney cancer comprising measuring a level of at least one RNA transcript, or
expression product thereof, in a tumor sample obtained from the patient. The RNA
transcript, or expression product thereof, may be selected from APOLD1, EDNRB, NOS3,
PPA2B, EIF4EBP1, LMNB1, TUBB2A, CCL5, CEACAM1, CX3CL1, and IL-6. The
method comprises normalizing the gene expression level against a level of at least one
reference RNA transcript, or expression product thereof, in the tumor sample. In some
embodiments, normalization may include compression of gene expression measurements for
low expressing genes and/or genes with nonlinear functional forms. The method also
comprises assigning the normalized level to a gene subset. The gene subset may be selected
from a vascular normalization group, a cell growth/division group, and an immune response
group. In some embodiments, APOLD1, EDNRB, NOS3, and PPA2B are assigned to the
vascular normalization group. In various embodiments, EIF4EBP1, LMNB1, and TUBB2A
are assigned to the cell growth/division group. In other embodiments, CCL5, CEACAM1,
and CX3CL1 are assigned to the immune response group. The method also comprises
weighting the gene subset according to its contribution to the assessment of risk of cancer
recurrence. The method further comprises calculating a recurrence score for the patient using
the weighted gene subsets and the normalized levels. The method may further comprise
creating a report comprising the recurrence score.
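The steps above (normalize, assign to gene subsets, weight, combine) can be sketched as follows. This is a minimal illustration only: the group weights, the reference gene names REF1 and REF2, the use of a group mean, and normalization by subtracting the mean reference level on a log-like scale are all assumptions made for the example and are not taken from the disclosure. IL-6 is omitted because the passage above does not assign it to a subset.

```python
from statistics import mean

# Gene-to-subset assignment as listed in the passage above.
GENE_GROUPS = {
    "vascular_normalization": ["APOLD1", "EDNRB", "NOS3", "PPA2B"],
    "cell_growth_division": ["EIF4EBP1", "LMNB1", "TUBB2A"],
    "immune_response": ["CCL5", "CEACAM1", "CX3CL1"],
}

# HYPOTHETICAL weights, chosen only so the example runs; not from the source.
GROUP_WEIGHTS = {
    "vascular_normalization": -0.5,
    "cell_growth_division": 0.5,
    "immune_response": -0.25,
}


def normalize(levels, reference_genes):
    """Subtract the mean reference-gene level from each non-reference gene."""
    reference = mean(levels[g] for g in reference_genes)
    return {g: v - reference for g, v in levels.items() if g not in reference_genes}


def group_scores(normalized):
    """Summarize each gene subset by the mean of its normalized levels."""
    return {
        group: mean(normalized[g] for g in genes)
        for group, genes in GENE_GROUPS.items()
    }


def recurrence_score(levels, reference_genes):
    """Weighted sum of the gene-subset scores (illustrative form only)."""
    scores = group_scores(normalize(levels, reference_genes))
    return sum(GROUP_WEIGHTS[group] * score for group, score in scores.items())
```

A caller would pass a mapping from gene symbol to measured level together with the list of reference genes; any rescaling of the result to a reporting range is left out of this sketch.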
The present disclosure also provides a method of predicting a likelihood of a
clinical outcome for a patient with kidney cancer. The method comprises determining a level
of one or more RNA transcripts, or an expression product thereof, in a tumor sample obtained
from the patient. The one or more RNA transcripts is selected from APOLD1, EDNRB,
NOS3, PPA2B, EIF4EBP1, LMNB1, TUBB2A, CCL5, CEACAM1, CX3CL1, and IL-6.
The method also comprises assigning the one or more RNA transcripts, or an expression
product thereof, to one or more gene subsets. The method also comprises assigning the
normalized level to a gene subset. The gene subset may be selected from a vascular
normalization group, a cell growth/division group, and an immune response group. In some
embodiments, APOLD1, EDNRB, NOS3, and PPA2B are assigned to the vascular
normalization group. In various embodiments, EIF4EBP1, LMNB1, and TUBB2A are
assigned to the cell growth/division group. In other embodiments, CCL5, CEACAM1, and
CX3CL1 are assigned to the immune response group. The method further comprises
calculating a quantitative score for the patient by weighting the level of one or more RNA
transcripts, or an expression product thereof, by their contribution to the assessment of the
likelihood of a clinical outcome. The method additionally comprises predicting a likelihood
of a clinical outcome for the patient based on the quantitative score. In some embodiments,
an increase in the quantitative score correlates with an increased likelihood of a negative
clinical outcome. In some embodiments, the clinical outcome is cancer recurrence.
In some embodiments of the present disclosure, the kidney cancer is renal cell
carcinoma. In other embodiments, the kidney cancer is clear cell renal cell carcinoma.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows predictiveness curves and 95% confidence intervals for
patients with Stage 1 ccRCC (A) and patients with Stage 2 or Stage 3 ccRCC (B) based on
the algorithm described in the Examples.
DETAILED DESCRIPTION
DEFINITIONS
Unless defined otherwise, technical and scientific terms used herein have the
same meaning as commonly understood by one of ordinary skill in the art to which this
invention belongs. Singleton et al., Dictionary of Microbiology and Molecular Biology 2nd
ed., J. Wiley & Sons (New York, NY 1994), and March, Advanced Organic Chemistry
Reactions, Mechanisms and Structure 4th ed., John Wiley & Sons (New York, NY 1992),
provide one skilled in the art with a general guide to many of the terms used in the present
application.
One skilled in the art will recognize many methods and materials similar or
equivalent to those described herein, which could be used in the practice of the present
invention. Indeed, the present invention is in no way limited to the methods and materials
described herein. For purposes of the invention, the following terms are defined below.
The terms “tumor” and “lesion” as used herein, refer to all neoplastic cell
growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous
cells and tissues.
The terms “cancer,” “cancerous,” and “carcinoma” refer to or describe the
physiological condition in mammals that is typically characterized by unregulated cell
growth. Examples of cancer in the present disclosure include cancer of the kidney, such as
renal cell carcinoma (RCC, renal cell cancer, or renal cell adenocarcinoma), clear cell renal
cell carcinoma, papillary renal cell carcinoma, chromophobe renal cell carcinoma, collecting
duct renal cell carcinoma, unclassified renal cell carcinoma, transitional cell carcinoma,
Wilms tumor, and renal sarcoma.
As used herein, the terms “kidney cancer,” “renal cancer,” or “renal cell
carcinoma” refer to cancer that has arisen from the kidney.
The terms “renal cell cancer” or “renal cell carcinoma” (RCC), as used herein,
refer to cancer which originates in the lining of the proximal convoluted tubule. More
specifically, RCC encompasses several relatively common histologic subtypes: clear cell
renal cell carcinoma, papillary (chromophil), chromophobe, collecting duct carcinoma, and
medullary carcinoma. Clear cell renal cell carcinoma (ccRCC) is the most common subtype
of RCC. Incidence of ccRCC is increasing, comprising 80% of localized disease and more
than 90% of metastatic disease.
The “pathology” includes all phenomena that compromise the well-being of
the patient. This includes, without limitation, abnormal or uncontrollable cell growth,
metastasis, interference with the normal functioning of neighboring cells, release of cytokines
or other secretory products at abnormal levels, suppression or aggravation of inflammatory or
immunological response, neoplasia, premalignancy, malignancy, invasion of surrounding or
distant tissues or organs, such as lymph nodes, etc.
The American Joint Committee on Cancer (AJCC) staging system (7th ed.,
2010) (also referred to as the TNM (tumor, node, metastasis) system) for kidney cancer uses
Roman numerals I through IV (1-4) to describe the extent of the disease. (Edge, SB, et al.,
AJCC Cancer Staging Manual, (7th Ed. 2010.)) In general, the lower the number, the less the
cancer has spread. A higher number, such as stage IV, generally indicates a more serious
cancer. The TNM staging system is as follows:
Primary Tumor (T)
TX Primary tumor cannot be assessed
T0 No evidence of primary tumor
T1 Tumor 7 cm or less in greatest dimension, limited to the kidney
T1a Tumor 4 cm or less in greatest dimension, limited to the kidney
T1b Tumor more than 4 cm but not more than 7 cm in greatest dimension, limited to the kidney
T2 Tumor more than 7 cm in greatest dimension, limited to the kidney
T2a Tumor more than 7 cm but less than or equal to 10 cm in the greatest dimension, limited to the kidney
T2b Tumor more than 10 cm, limited to the kidney
T3 Tumor extends into major veins or perinephric tissues but not into the ipsilateral adrenal gland and not beyond Gerota’s fascia
T3a Tumor grossly extends into the renal vein or its segmental (muscle containing) branches, or tumor invades perirenal and/or renal sinus fat but not beyond Gerota’s fascia
T3b Tumor grossly extends into the vena cava below the diaphragm
T3c Tumor grossly extends into the vena cava above the diaphragm or invades the wall of the vena cava
T4 Tumor invades beyond Gerota’s fascia (including contiguous extension into the ipsilateral adrenal gland)
Regional Lymph Nodes (N)
NX Regional lymph nodes cannot be assessed
N0 No regional lymph node metastasis
N1 Metastasis in regional lymph node(s)
Distant Metastasis (M)
M0 No distant metastasis
M1 Distant metastasis
Anatomic Stage/Prognostic Groups
Stage I T1 N0 M0
Stage II T2 N0 M0
Stage III T1 or T2 N1 M0
T3 N0 or N1 M0
Stage IV T4 Any N M0
Any T Any N M1
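The stage grouping above amounts to a small lookup, which can be sketched as below. This is an illustration under stated assumptions: Stage III is taken to be T1 or T2 with N1 and M0, or T3 with N0 or N1 and M0, following the AJCC 7th edition grouping cited above, and the function name and string encoding of the categories are hypothetical.

```python
def anatomic_stage(t, n, m):
    """Map T, N and M categories to an anatomic stage group (I-IV).

    Sub-categories such as "T1a" or "T3b" are reduced to their main
    category before lookup. Raises ValueError when the categories cannot
    be grouped (for example TX, T0 or NX without M1).
    """
    t_main = t.upper()[:2]  # "T1a" -> "T1"
    n, m = n.upper(), m.upper()

    if m == "M1":
        return "IV"  # any T, any N, M1
    if m != "M0":
        raise ValueError(f"unrecognized M category: {m}")
    if t_main == "T4":
        return "IV"  # T4, any N, M0
    if t_main not in ("T1", "T2", "T3") or n not in ("N0", "N1"):
        raise ValueError(f"cannot assign a stage group to {t} {n} {m}")
    if t_main == "T3" or n == "N1":
        return "III"  # T3 N0/N1, or T1/T2 N1
    return "II" if t_main == "T2" else "I"
```

For example, `anatomic_stage("T1a", "N0", "M0")` gives stage I, while any category combination with M1 gives stage IV.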
The term “early stage renal cancer”, as used herein, refers to Stages 1-3.
Reference to tumor “grade” for renal cell carcinoma as used herein refers to a
grading system based on microscopic appearance of tumor cells. According to the TNM
staging system of the AJCC, the various grades of renal cell carcinoma are:
GX (grade of differentiation cannot be assessed);
G1 (well differentiated);
G2 (moderately differentiated); and
G3-G4 (poorly differentiated/undifferentiated).
“Increased grade” as used herein refers to classification of a tumor at a grade
that is more advanced, e.g., Grade 4 (G4) is an increased grade relative to Grades 1, 2, and
3. Tumor grading is an important prognostic factor in renal cell carcinoma. H. Rauschmeier,
et al., World J Urol 2:103-108 (1984).
The terms “necrosis” or “histologic necrosis” as used herein refer to the death
of living cells or tissues. The presence of necrosis may be a prognostic factor in cancer. For
example, necrosis is commonly seen in renal cell carcinoma (RCC) and has been shown to be
an adverse prognostic factor in certain RCC subtypes. V. Foria, et al., J Clin Pathol 58(1):39-
43 (2005).
The terms “nodal invasion” or “node-positive (N+)” as used herein refer to the
presence of cancer cells in one or more lymph nodes associated with the organ (e.g., nodes that drain the
organ) containing a primary tumor. Assessing nodal invasion is part of tumor staging for
most cancers, including renal cell carcinoma.
The term “prognosis” is used herein to refer to the prediction of the likelihood
that a cancer patient will have a cancer-attributable death or progression, including
recurrence, metastatic spread, and drug resistance, of a neoplastic disease, such as kidney
cancer.
The term “prognostic gene” is used herein to refer to a gene, the expression of
which is correlated, positively or negatively, with a likelihood of cancer recurrence in a
cancer patient treated with the standard of care. A gene may be both a prognostic and
predictive gene, depending on the association of the gene expression level with the
corresponding endpoint. For example, using a Cox proportional hazards model, if a gene is
only prognostic, its hazard ratio (HR) does not change when measured in patients treated with
the standard of care or in patients treated with a new intervention.
The term “prediction” is used herein to refer to the likelihood that a cancer
patient will have a particular response to treatment, whether positive (“beneficial response”)
or negative, following surgical removal of the primary tumor. For example, treatment could
include targeted drugs, immunotherapy, or chemotherapy.
The terms “predictive gene” and “response indicator gene” are used
interchangeably herein to refer to a gene, the expression level of which is associated,
positively or negatively, with likelihood of beneficial response to treatment. A gene may be
both a prognostic and predictive gene, and vice versa, depending on the correlation of the
gene expression level with the corresponding endpoint (e.g., likelihood of survival without
recurrence, likelihood of beneficial response to treatment). A predictive gene can be
identified using a Cox proportional hazards model to study the interaction between gene
expression levels and the effect of treatment [comparing patients treated with treatment A to
patients who did not receive treatment A (but may have received standard of care, e.g.
treatment B)]. The hazard ratio (HR) for a predictive gene will change when measured in
untreated/standard of care patients versus patients treated with treatment A.
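As an illustration only, the interaction analysis described above can be written as a Cox model with a gene-by-treatment term. The symbols are assumptions introduced for this sketch and are not notation from the disclosure: g is the normalized expression level of the gene, A equals 1 for patients treated with treatment A and 0 otherwise, and h_0(t) is the baseline hazard.

```latex
h(t \mid g, A) = h_0(t)\,\exp\!\left(\beta_g\, g + \beta_A\, A + \beta_{gA}\, g A\right)

\mathrm{HR}_{g}\big|_{A=0} = e^{\beta_g},
\qquad
\mathrm{HR}_{g}\big|_{A=1} = e^{\beta_g + \beta_{gA}}
```

Under this sketch, a gene that is only prognostic has an interaction coefficient of zero, so its hazard ratio per unit of expression is the same in both groups; a predictive gene has a nonzero interaction coefficient, so the hazard ratio differs between patients who did and did not receive treatment A.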
As used herein, the term “expression level” as applied to a gene refers to the
normalized level of a gene product, e.g., the normalized value determined for the RNA
expression level of a gene or for the polypeptide expression level of a gene.
The terms “gene product” and “expression product” are used herein to refer to
the RNA transcription products (transcripts) of the gene, including mRNA, and the
polypeptide products of such RNA transcripts. A gene product can be, for example, an
unspliced RNA, an mRNA, a splice variant mRNA, a microRNA, a fragmented RNA, a
polypeptide, a post-translationally modified polypeptide, a splice variant polypeptide, etc.
The term “RNA transcript” as used herein refers to the RNA transcription
products of a gene, for example, mRNA, an unspliced RNA, a splice variant mRNA, a
microRNA, and a fragmented RNA.
Unless indicated otherwise, each gene name used herein corresponds to the
Official Symbol assigned to the gene and provided by Entrez Gene (URL:
www.ncbi.nlm.nih.gov/sites/entrez) as of the filing date of this application.
The terms “correlated” and “associated” are used interchangeably herein to
refer to the association between two measurements (or measured entities). The disclosure
provides genes and gene subsets, the expression levels of which are associated with a
particular outcome measure, such as for example the association between the expression level
of a gene and the likelihood of clinical outcome. For example, the increased expression level
of a gene may be positively correlated (positively associated) with an increased likelihood of
good clinical outcome for the patient, such as an increased likelihood of long-term survival
without recurrence of the cancer, and the like. Such a positive correlation may be
demonstrated statistically in various ways, e.g., by a low hazard ratio for cancer recurrence or
death. In another example, the increased expression level of a gene may be negatively
correlated (negatively associated) with an increased likelihood of good clinical outcome for
the patient. In that case, for example, the patient may have a decreased likelihood of long-
term survival without recurrence of the cancer, and the like. Such a negative correlation
indicates that the patient likely has a poor prognosis, and this may be demonstrated
statistically in various ways, e.g., a high hazard ratio for cancer recurrence or death.
“Correlated” is also used herein to refer to the association between the expression levels of
two different genes, such that expression level of a first gene can be substituted with an
expression level of a second gene in a given algorithm in view of their correlation of
expression. Such “correlated expression” of two genes that are substitutable in an algorithm
usually involves gene expression levels that are positively correlated with one another, e.g., if
increased expression of a first gene is positively correlated with an outcome (e.g., increased
likelihood of good clinical outcome), then the second gene that is co-expressed and exhibits
correlated expression with the first gene is also positively correlated with the same outcome.
A “positive clinical outcome” can be assessed using any endpoint indicating a
benefit to the patient, including, without limitation, (1) inhibition, to some extent, of tumor
growth, including slowing down and complete growth arrest; (2) reduction in the number of
tumor cells; (3) reduction in tumor size; (4) inhibition (i.e., reduction, slowing down or
complete stopping) of tumor cell infiltration into adjacent peripheral organs and/or tissues;
(5) inhibition of metastasis; (6) enhancement of anti-tumor immune response, possibly
resulting in regression or rejection of the tumor; (7) relief, to some extent, of one or more
symptoms associated with the tumor; (8) increase in the length of survival following
treatment; and/or (9) decreased mortality at a given point of time following treatment.
Positive clinical response may also be expressed in terms of various measures of clinical
outcome. Positive clinical outcome can also be considered in the context of an individual’s
outcome relative to an outcome of a population of patients having a comparable clinical
diagnosis, and can be assessed using various endpoints such as an increase in the duration of
Recurrence-Free Interval (RFI), an increase in the time of survival as compared to Overall
Survival (OS) in a population, an increase in the time of Disease-Free Survival (DFS), an
increase in the duration of Distant Recurrence-Free Interval (DRFI), and the like. An increase
in the likelihood of positive clinical response corresponds to a decrease in the likelihood of
cancer recurrence.
The term “risk classification” means a level of risk (or likelihood) that a
subject will experience a particular clinical outcome. A subject may be classified into a risk
group or classified at a level of risk based on the methods of the present disclosure, e.g., high,
medium, or low risk. A “risk group” is a group of subjects or individuals with a similar level
of risk for a particular clinical outcome.
The term “long-term” survival is used herein to refer to survival for a
particular period of time, e.g., for at least 3 years, or for at least 5 years.
The terms “recurrence” and “relapse” are used herein, in the context of
potential clinical outcomes of cancer, to refer to local or distant metastases. Identification of
a recurrence could be done by, for example, CT imaging, ultrasound, arteriogram, or X-ray,
biopsy, urine or blood test, physical exam, or research center tumor registry.
The term “Recurrence-Free Interval (RFI)” is used herein to refer to the time
(in years) from randomization to first kidney cancer recurrence or death due to recurrence of
kidney cancer.
The term “Overall Survival (OS)” is used herein to refer to the time (in years)
from randomization to death from any cause.
The term “Disease-Free Survival (DFS)” is used herein to refer to the time (in
years) from randomization to first kidney cancer recurrence or death from any cause.
The calculation of the measures listed above in practice may vary from study
to study depending on the definition of events to be either censored or not censored.
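The three endpoints differ only in which occurrences count as events. The sketch below shows one possible censoring convention, offered as an assumption rather than as the convention used in any particular study: patients without a qualifying event are censored at last contact, and for a patient who died, last contact is taken to equal the time of death. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FollowUp:
    last_contact: float                  # years from randomization to last contact
    recurrence: Optional[float] = None   # years to first kidney cancer recurrence
    death: Optional[float] = None        # years to death from any cause
    death_due_to_recurrence: bool = False


def _first_event(times: List[Optional[float]], last_contact: float) -> Tuple[float, bool]:
    """Return (time, event_observed); censor at last contact if no event."""
    observed = [t for t in times if t is not None]
    if observed:
        return min(observed), True
    return last_contact, False


def recurrence_free_interval(p: FollowUp) -> Tuple[float, bool]:
    """Event: first recurrence, or death due to recurrence of kidney cancer."""
    cancer_death = p.death if p.death_due_to_recurrence else None
    return _first_event([p.recurrence, cancer_death], p.last_contact)


def overall_survival(p: FollowUp) -> Tuple[float, bool]:
    """Event: death from any cause."""
    return _first_event([p.death], p.last_contact)


def disease_free_survival(p: FollowUp) -> Tuple[float, bool]:
    """Event: first recurrence, or death from any cause."""
    return _first_event([p.recurrence, p.death], p.last_contact)
```

Under this convention a patient who dies of an unrelated cause without recurrence counts as an event for OS and DFS but is censored for RFI.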
The term “Hazard Ratio (HR)” as used herein refers to the effect of an
explanatory variable on the hazard or risk of an event (i.e. recurrence or death). In
proportional hazards regression models, the HR is the ratio of the predicted hazard for two
groups (e.g., patients with two different stages of cancer) or for a unit change in a continuous
variable (e.g., one standard deviation change in gene expression).
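For a single explanatory variable x, the definition above can be written out as follows. The notation is an assumption for illustration: h_0(t) is the baseline hazard, beta is the fitted regression coefficient, and s is the standard deviation of x.

```latex
h(t \mid x) = h_0(t)\, e^{\beta x}

\mathrm{HR}_{\text{unit}} = \frac{h(t \mid x + 1)}{h(t \mid x)} = e^{\beta},
\qquad
\mathrm{HR}_{\text{per SD}} = \frac{h(t \mid x + s)}{h(t \mid x)} = e^{\beta s}
```

When x is a 0/1 indicator for membership in one of two groups, the unit-change hazard ratio is the ratio of the predicted hazards of the two groups. The baseline hazard cancels in each ratio, which is why the HR does not depend on t under the proportional hazards assumption.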
The term “microarray” refers to an ordered arrangement of hybridizable array
elements, e.g., oligonucleotide or polynucleotide probes, on a substrate.
The term “polynucleotide,” when used in singular or plural generally refers to
any polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA
or modified RNA or DNA. Thus, for instance, polynucleotides are defined herein to include,
without limitation, single- and double-stranded RNA, and RNA including single- and double-
stranded regions, hybrid molecules comprising DNA and RNA that may be single-stranded
or, more typically, double-stranded or include single- and double-stranded regions. In
addition, the term “polynucleotide” as used herein refers to triple-stranded regions
comprising RNA or DNA or both RNA and DNA. The strands in such regions may be from
the same molecule or from different molecules. The regions may include all of one or more
of the molecules, but more typically involve only a region of some of the molecules. One of
the molecules of a triple-helical region often is an oligonucleotide. The term “polynucleotide”
specifically includes cDNAs. The term includes DNAs (including cDNAs) and RNAs that
contain one or more modified bases. Thus, DNAs or RNAs with backbones modified for
stability or for other reasons, are “polynucleotides” as that term is intended herein. Moreover,
DNAs or RNAs comprising unusual bases, such as inosine, or modified bases, such as
tritiated bases, are included within the term “polynucleotides” as defined herein. In general,
the term “polynucleotide” embraces all chemically, enzymatically and/or metabolically
modified forms of unmodified polynucleotides, as well as the chemical forms of DNA and
RNA characteristic of viruses and cells, including simple and complex cells.
The term “oligonucleotide” refers to a relatively short polynucleotide,
including, without limitation, single-stranded deoxyribonucleotides, single- or double-
stranded ribonucleotides, RNA:DNA hybrids and double-stranded DNAs. Oligonucleotides,
such as single-stranded DNA probe oligonucleotides, are often synthesized by chemical
methods, for example using automated oligonucleotide synthesizers that are commercially
available. However, oligonucleotides can be made by a variety of other methods, including in
vitro recombinant DNA-mediated techniques and by expression of DNAs in cells and
organisms.
As used herein, the term “expression level” as applied to a gene refers to the
level of the expression product of a gene, e.g., the normalized value determined for the RNA
expression product of a gene or for the polypeptide expression level of a gene.
The term “CT” as used herein refers to threshold cycle, the cycle number in
quantitative polymerase chain reaction (qPCR) at which the fluorescence generated within a
reaction well exceeds the defined threshold, i.e. the point during the reaction at which a
sufficient number of amplicons have accumulated to meet the defined threshold.
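A minimal sketch of reading a threshold cycle from an amplification curve is given below. The function name, the 1-indexed cycle convention, and the linear interpolation between adjacent cycles are assumptions for illustration; instrument software may use other baseline and fitting procedures.

```python
def threshold_cycle(fluorescence, threshold):
    """Return the (fractional) cycle at which fluorescence first reaches
    the threshold, or None if it never does.

    fluorescence[0] is the reading for cycle 1, fluorescence[1] for cycle 2,
    and so on. The crossing is located by linear interpolation between the
    last reading below the threshold and the first reading at or above it.
    """
    for i, value in enumerate(fluorescence):
        if value >= threshold:
            if i == 0:
                return 1.0  # already at or above threshold in the first cycle
            previous = fluorescence[i - 1]
            # previous < threshold <= value, so the denominator is positive
            return i + (threshold - previous) / (value - previous)
    return None
```

For readings of 0.0, 1.0 and 3.0 in cycles 1 to 3 and a threshold of 2.0, the crossing falls halfway between cycles 2 and 3.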
The term “Cp” as used herein refers to “crossing point.” The Cp value is
calculated by determining the second derivatives of entire qPCR amplification curves and
their maximum value. The Cp value represents the cycle at which the increase of
fluorescence is highest and where the logarithmic phase of a PCR begins.
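The second-derivative-maximum idea can be sketched with discrete second differences of the fluorescence readings. This is a simplified illustration that returns a whole cycle number; the function name and the use of a central difference are assumptions, and a production implementation would typically fit a smooth curve and interpolate.

```python
def crossing_point(fluorescence):
    """Return the cycle (1-indexed) at which the discrete second
    derivative of the amplification curve is largest.

    fluorescence[0] is the reading for cycle 1. The central second
    difference f[i+1] - 2*f[i] + f[i-1] approximates the second
    derivative at the cycle of reading i.
    """
    if len(fluorescence) < 3:
        raise ValueError("need at least three cycles")
    second_diff = [
        fluorescence[i + 1] - 2 * fluorescence[i] + fluorescence[i - 1]
        for i in range(1, len(fluorescence) - 1)
    ]
    best = max(range(len(second_diff)), key=second_diff.__getitem__)
    return best + 2  # second_diff[0] is centered on cycle 2
```

On a sigmoid amplification curve this lands on the rising shoulder, before the inflection point at which the curve is steepest.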
The terms “threshold” or “thresholding” refer to a procedure used to account
for non-linear relationships between gene expression measurements and clinical response as
well as to further reduce variation in reported gene expression measurements and patient
scores induced by low expressing genes. When thresholding is applied, all measurements
below or above a threshold are set to that threshold value. A non-linear relationship between
gene expression and outcome could be examined using smoothers or cubic splines to model
gene expression in Cox PH regression on recurrence free interval or logistic regression on
recurrence status. Variation in reported patient scores could be examined as a function of
variability in gene expression at the limit of quantitation and/or detection for a particular gene.
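Thresholding as described above is a clamp applied to each measurement. The sketch below is illustrative only; the function name and the choice to support an optional lower and an optional upper threshold are assumptions, and no threshold values from the disclosure are implied.

```python
def apply_threshold(value, lower=None, upper=None):
    """Set measurements below `lower` to `lower` and above `upper` to `upper`.

    Either bound may be omitted, in which case that side is left unchanged.
    """
    if lower is not None and value < lower:
        return lower
    if upper is not None and value > upper:
        return upper
    return value
```

With a lower threshold of 4.0, for instance, normalized values of 2.7 and 3.9 both become 4.0, so differences among low expressing samples no longer contribute variation to the score.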
As used herein, the term “amplicon,” refers to pieces of DNA that have been
synthesized using amplification techniques, such as polymerase chain reactions (PCR) and
ligase chain reactions.
“Stringency” of hybridization reactions is readily determinable by one of
ordinary skill in the art, and generally is an empirical calculation dependent upon probe
length, washing temperature, and salt concentration. In general, longer probes require higher
temperatures for proper annealing, while shorter probes need lower temperatures.
Hybridization generally depends on the ability of denatured DNA to re-anneal when
complementary strands are present in an environment below their melting temperature. The
higher the degree of desired homology between the probe and hybridizable sequence, the
higher the relative temperature which can be used. As a result, it follows that higher relative
temperatures would tend to make the reaction conditions more stringent, while lower
temperatures less so. For additional details and explanation of stringency of hybridization
reactions, see Ausubel et al., Current Protocols in Molecular Biology, Wiley Interscience
Publishers, (1995).
“Stringent conditions” or “high stringency conditions”, as defined herein,
typically: (1) employ low ionic strength and high temperature for washing, for example 0.015
M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate at 50°C; (2)
employ during hybridization a denaturing agent, such as formamide, for example, 50% (v/v)
formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM
sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride, 75 mM sodium citrate at
42°C; or (3) employ 50% formamide, 5 X SSC (0.75 M NaCl, 0.075 M sodium citrate), 50
mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5 X Denhardt's solution,
sonicated salmon sperm DNA (50 µg/ml), 0.1% SDS, and 10% dextran sulfate at 42°C, with
washes at 42°C in 0.2 X SSC (sodium chloride/sodium citrate) and 50% formamide, followed
by a high-stringency wash consisting of 0.1 X SSC containing EDTA at 55°C.
“Moderately stringent conditions” may be identified as described by
Sambrook et al., Molecular Cloning: A Laboratory Manual, New York: Cold Spring Harbor
Press, 1989, and include the use of washing solution and hybridization conditions (e.g.,
temperature, ionic strength and %SDS) less stringent than those described above. An example
of moderately stringent conditions is overnight incubation at 37°C in a solution comprising:
% formamide, 5 X SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium
phosphate (pH 7.6), 5 X Denhardt's solution, 10% dextran sulfate, and 20 mg/ml denatured
sheared salmon sperm DNA, followed by washing the filters in 1 X SSC at about 37-50°C.
The skilled artisan will recognize how to adjust the temperature, ionic strength, etc. as necessary
to accommodate factors such as probe length and the like.
The terms “splicing” and “RNA splicing” are used interchangeably and refer
to RNA processing that removes s and joins eXons to produce mature mRNA with
uous coding sequence that moves into the cytoplasm of a eukaryotic cell.
As used herein, the term “eXon” refers to any segment of an interrupted gene
that is represented in the mature RNA product. As used herein, the term “intron” refers to any
t of DNA that is transcribed but d from within the transcript by splicing
together the eXons on either side of it. “Intronic RNA” refers to mRNA derived from an
ic region of DNA. Operationally, eXonic sequences occur in the mRNA sequence of a
gene as defined by Ref. SEQ ID numbers. Operationally, intron ces are the intervening
sequences within the c DNA of a gene.
The term “co-expressed”, as used herein, refers to a statistical correlation
between the expression level of one gene and the expression level of another gene. Pairwise
co-expression may be calculated by various methods known in the art, e.g., by calculating
Pearson correlation coefficients or Spearman correlation coefficients. Co-expressed gene
cliques may also be identified using graph theory. An analysis of co-expression may be
calculated using normalized expression data.
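As an illustration of the pairwise co-expression calculation described above, the following is a minimal sketch in Python of Pearson and Spearman correlation coefficients for two genes. The function names and the sample values are hypothetical and are not taken from the disclosure; the inputs are assumed to be normalized expression levels measured across the same set of samples.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical normalized expression levels of two genes in five samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [1.0, 4.0, 9.0, 16.0, 25.0]
print(pearson(gene_a, gene_b), spearman(gene_a, gene_b))
```

In this sketch the Spearman coefficient depends only on rank order, so a monotonic but non-linear relationship between two genes still scores as fully correlated, whereas the Pearson coefficient does not.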
A “computer-based system” refers to a system of hardware, software, and data
storage medium used to analyze information. The minimum hardware of a patient computer-
based system comprises a central processing unit (CPU), and hardware for data input, data
output (e.g., display), and data storage. An ordinarily skilled artisan can readily appreciate
that any currently available computer-based systems and/or components thereof are suitable
for use in connection with the methods of the present disclosure. The data storage medium
may comprise any manufacture comprising a recording of the present information as
described above, or a memory access device that can access such a manufacture.
To “record” data, programming or other information on a computer readable
medium refers to a process for storing information, using any such methods as known in the
art. Any convenient data storage structure may be chosen, based on the means used to access
the stored information. A variety of data processor programs and formats can be used for
storage, e.g., word processing text file, database format, etc.
A “processor” or “computing means” references any hardware and/or software
combination that will perform the functions required of it. For example, a suitable processor
may be a programmable digital microprocessor such as available in the form of an electronic
controller, mainframe, server or personal computer (desktop or portable). Where the
processor is programmable, suitable programming can be communicated from a remote
location to the processor, or previously saved in a computer program product (such as a
portable or fixed computer readable storage medium, whether magnetic, optical or solid state
device based). For example, a magnetic medium or optical disk may carry the programming,
and can be read by a suitable reader communicating with each processor at its corresponding
station.
The terms “surgery” or “surgical resection” are used herein to refer to surgical
removal of some or all of a tumor, and usually some of the surrounding tissue. Examples of
surgical techniques include laparoscopic procedures, biopsy, or tumor ablation, such as
cryotherapy, radio frequency ablation, and high intensity ultrasound. In cancer patients, the
extent of tissue removed during surgery depends on the state of the tumor as observed by a
surgeon. For example, a partial nephrectomy indicates that part of one kidney is removed; a
simple nephrectomy entails removal of all of one kidney; a radical nephrectomy, all of one
kidney and neighboring tissue (e.g., adrenal gland, lymph nodes) removed; and bilateral
nephrectomy, both kidneys removed.
ALGORITHM-BASED METHODS AND GENE SUBSETS
The present disclosure provides an algorithm-based molecular diagnostic
assay for determining an expected clinical outcome, e.g., prognosis. The cancer can be, for
example, renal cell carcinoma or clear cell renal cell carcinoma. The present disclosure also
provides a method for obtaining a recurrence score for a patient with kidney cancer. For
example, the expression levels of the prognostic genes may be used to obtain a recurrence
score for a patient with kidney cancer. The algorithm-based assay and associated information
provided by the practice of the methods of the present invention facilitate optimal treatment
decision-making in kidney cancer. For example, such a clinical tool would enable physicians
to identify patients who have a low likelihood of recurrence and therefore may be able to
forgo adjuvant treatment. Similarly, such a tool may also enable physicians to identify
patients who have a high likelihood of recurrence and who may be good candidates for
adjuvant treatment.
As used herein, a “quantitative score” is an arithmetically or mathematically
calculated numerical value for aiding in simplifying or disclosing or informing the analysis of
more complex quantitative information, such as the correlation of certain expression levels of
the disclosed genes or gene subsets to a likelihood of a clinical outcome of a kidney cancer
patient. A quantitative score may be determined by the application of a specific algorithm.
The algorithm used to calculate the quantitative score in the methods disclosed herein may
group the expression level values of genes. The grouping of genes may be performed at least
in part based on knowledge of the relative contribution of the genes according to physiologic
functions or component cellular characteristics, such as in the groups discussed herein. A
quantitative score may be determined for a gene group (“gene group score”). The formation
of groups, in addition, can facilitate the mathematical weighting of the contribution of various
expression levels of genes or gene subsets to the quantitative score. The weighting of a gene
or gene group representing a physiological process or component cellular characteristic can
reflect the contribution of that process or characteristic to the pathology of the cancer and
clinical outcome, such as recurrence or upgrading/upstaging of the cancer. The present
invention provides an algorithm for calculating the quantitative scores, for example, as set
forth in the Examples. In an embodiment of the invention, an increase in the quantitative
score indicates an increased likelihood of a negative clinical outcome.
In an embodiment, a quantitative score is a “recurrence score” which
indicates the likelihood of a cancer recurrence, upgrading or upstaging of a cancer, adverse
pathology, non-organ-confined disease, high-grade disease, and/or high-grade or non-organ-
confined disease. An increase in the recurrence score may correlate with an increase in the
likelihood of cancer recurrence, upgrading or upstaging of a cancer, adverse pathology, non-
organ-confined disease, high-grade disease, and/or high-grade or non-organ-confined disease.
The gene subsets of the present invention include a vascular normalization
gene group, an immune response gene group, a cell growth/division gene group, and IL-6.
The gene subset identified herein as the “vascular normalization group”
includes genes that are involved with vascular and/or angiogenesis functions. The vascular
normalization group includes, for example, APOLD1, EDNRB, NOS3, and PPAP2B.
The gene subset identified herein as the “cell growth/division group” includes
genes that are involved in key cell growth and cell division pathway(s). The cell
growth/division group includes, for example, EIF4EBP1, LMNB1, and TUBB2A.
The gene subset identified herein as the “immune response group” includes
genes that are involved in functions of the immune system. The immune response group
includes, for example, CCL5, CEACAM1, and CX3CL1.
Additionally, expression levels of certain individual genes may be used for
calculating the recurrence score. For example, the expression level of IL-6 may be used to
calculate the recurrence score. Although IL-6 may be involved in immune responses it may
also be involved in other biological processes making it less suitable to be grouped with other
immune related genes.
The present invention also provides methods to determine a threshold
expression level for a particular gene. A threshold expression level may be calculated for a
specific gene. A threshold expression level for a gene may be based on a normalized
expression level. In one example, a CT threshold expression level may be calculated by
assessing functional forms using logistic regression or Cox proportional hazards regression.
The present invention further provides methods to determine genes that co-
express with particular genes identified by, e.g., quantitative RT-PCR (qRT-PCR), as
validated biomarkers relevant to a particular type of cancer. The co-expressed genes are
themselves useful biomarkers. The co-expressed genes may be substituted for the genes with
which they co-express. The methods can include identifying gene cliques from microarray
data, normalizing the microarray data, computing a pairwise Spearman correlation matrix for
the array probes, filtering out significant co-expressed probes across different studies,
building a graph, mapping the probe to genes, and generating a gene clique report. The
expression levels of one or more genes of a gene clique may be used to calculate the
likelihood that a patient with kidney cancer will experience a positive clinical outcome, such
as a reduced likelihood of a cancer recurrence.
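The graph-building and clique-reporting steps listed above can be sketched as follows. This is an illustrative outline only: it assumes the pairwise Spearman correlations have already been computed, and the correlation threshold, probe identifiers, and probe-to-gene mapping are hypothetical placeholders. Maximal cliques are enumerated with the basic Bron-Kerbosch algorithm, which is one common choice and is not stated in the disclosure.

```python
def build_graph(correlations, threshold):
    """Connect two probes when their correlation meets the threshold."""
    graph = {}
    for (probe_a, probe_b), rho in correlations.items():
        graph.setdefault(probe_a, set())
        graph.setdefault(probe_b, set())
        if rho >= threshold:
            graph[probe_a].add(probe_b)
            graph[probe_b].add(probe_a)
    return graph

def maximal_cliques(graph):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))
            return
        for vertex in list(candidates):
            expand(current | {vertex},
                   candidates & graph[vertex],
                   excluded & graph[vertex])
            candidates = candidates - {vertex}
            excluded = excluded | {vertex}

    expand(set(), set(graph), set())
    return sorted(cliques)

# Hypothetical precomputed Spearman correlations between array probes.
correlations = {
    ("p1", "p2"): 0.91, ("p1", "p3"): 0.85, ("p2", "p3"): 0.88,
    ("p3", "p4"): 0.80, ("p1", "p4"): 0.10, ("p2", "p4"): 0.05,
}
# Hypothetical probe-to-gene mapping (placeholder gene labels).
probe_to_gene = {"p1": "GENE_A", "p2": "GENE_B", "p3": "GENE_C", "p4": "GENE_D"}

graph = build_graph(correlations, threshold=0.7)  # threshold is an assumption
probe_cliques = maximal_cliques(graph)
clique_report = [sorted({probe_to_gene[p] for p in clique}) for clique in probe_cliques]
print(clique_report)
```

Under these assumptions, a gene in a reported clique would be a candidate substitute for another gene in the same clique.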
Any one or more combinations of gene groups may be assayed in the method
of the present invention. For example, a vascular normalization gene group may be assayed,
alone or in combination, with a cell growth/division gene group, an immune response gene
group, and/or IL-6. In addition, any number of genes within each gene group may be assayed.
In a specific embodiment of the invention, a method for predicting a clinical
outcome for a patient with kidney cancer comprises measuring an expression level of at least
one gene from a vascular normalization gene group, or a co-expressed gene thereof, and at
least one gene from a cell growth/division gene group, or a co-expressed gene thereof. In
another embodiment, the expression level of at least two genes from a vascular normalization
gene group, or a co-expressed gene thereof, and at least two genes from a cell
growth/division gene group, or a co-expressed gene thereof, are measured. In yet another
embodiment, the expression levels of at least three genes are measured from each of the
vascular normalization gene group and the cell growth/division gene group. In a further
embodiment, the expression levels of at least four genes from the vascular normalization gene
group and at least three genes from the cell growth/differentiation gene group are measured.
In another embodiment of the invention, at least one gene from a vascular
normalization gene group, or a co-expressed gene thereof, and at least one gene from an
immune response gene group, or a co-expressed gene thereof are measured. In another
embodiment, the expression level of at least two genes from a vascular normalization gene
group, or a co-expressed gene thereof, and at least two genes from an immune response gene
group, or a co-expressed gene thereof, are measured. In yet another embodiment, the
expression levels of at least three genes are measured from each of the vascular normalization
gene group and the immune response gene group. In a further embodiment, the expression
levels of at least four genes from the vascular normalization gene group and at least three
genes from the immune response gene group are measured.
In a further embodiment of the invention, an expression level of at least one
gene from a vascular normalization gene group, or a co-expressed gene thereof, and IL-6 are
measured. In another embodiment, the expression level of at least two genes from a vascular
normalization gene group, or a co-expressed gene thereof, and IL-6 are measured. In yet
another embodiment, the expression levels of at least three genes from the vascular
normalization gene group and IL-6 are measured. In a further embodiment, the expression
levels of at least four genes from the vascular normalization gene group and IL-6 are measured.
Additionally, an expression level of at least one gene from a vascular
normalization gene group, or a co-expressed gene thereof, and at least one gene from an
immune response gene group, or a co-expressed gene thereof is measured. In another
embodiment, the expression level of at least two genes from a vascular normalization gene
group, or a co-expressed gene thereof, and at least two genes from an immune response gene
group, or a co-expressed gene thereof, are measured. In yet another embodiment, the
expression levels of at least three genes are measured from each of the vascular normalization
gene group and the immune response gene group. In a further embodiment, the expression
levels of at least four genes from the vascular normalization gene group and at least three
genes from the immune response gene group are measured.
In a specific embodiment of the invention, a method for predicting a clinical
outcome for a patient with kidney cancer comprises measuring an expression level of at least
one gene from a cell growth/division gene group, or a co-expressed gene thereof, and at least
one gene from an immune response gene group, or a co-expressed gene thereof. In another
embodiment, the expression level of at least two genes from a cell growth/division gene
group, or a co-expressed gene thereof, and at least two genes from an immune response gene
group, or a co-expressed gene thereof, are measured. In yet another embodiment, the
expression levels of at least three genes are measured from each of the cell growth/division
gene group and the immune response gene group.
In a further embodiment of the invention, an expression level of at least one
gene from a cell growth/division gene group, or a co-expressed gene thereof, and IL-6 are
measured. In another embodiment, the expression level of at least two genes from a cell
growth/division gene group, or a co-expressed gene thereof, and IL-6 are measured. In yet
another embodiment, the expression levels of at least three genes from the cell
growth/division gene group and IL-6 are measured.
In a further embodiment of the invention, an expression level of at least one
gene from an immune response gene group, or a co-expressed gene thereof, and IL-6 are
measured. In another embodiment, the expression level of at least two genes from an
immune response gene group, or a co-expressed gene thereof, and IL-6 are measured. In yet
another embodiment, the expression levels of at least three genes from the immune response
gene group and IL-6 are measured.
In an additional embodiment of the invention, an expression level of at least
one gene from a vascular normalization gene group, or a co-expressed gene thereof, at least
one gene from a cell growth/division gene group, or a co-expressed gene thereof, and at least
one gene from an immune response gene group are measured. In another embodiment, the
expression level of at least two genes from a vascular normalization gene group, or a co-
expressed gene thereof, at least two genes from a cell growth/division gene group, or a co-
expressed gene thereof, and at least two genes from an immune response gene group are
measured. In yet another embodiment, the expression levels of at least three genes are
measured from each of the vascular normalization gene group, the cell growth/division gene
group, and the immune response gene group. In a further embodiment, the expression levels
of at least four genes from the vascular normalization gene group, at least three genes from
the cell growth/differentiation gene group, and at least three genes from the immune response
gene group are measured.
In another embodiment of the invention, an expression level of at least one
gene from a vascular normalization gene group, or a co-expressed gene thereof, at least one
gene from a cell growth/division gene group, or a co-expressed gene thereof, at least one
gene from an immune response gene group, and IL-6 are measured. In another embodiment,
the expression level of at least two genes from a vascular normalization gene group, or a co-
expressed gene thereof, at least two genes from a cell growth/division gene group, or a co-
expressed gene thereof, at least two genes from an immune response gene group, and IL-6 are
measured. In yet another embodiment, the expression levels of at least three genes are
measured from each of the vascular normalization gene group, the cell growth/division gene
group, and the immune response gene group, and IL-6. In a further embodiment, the
expression levels of at least four genes from the vascular normalization gene group, at least
three genes from the cell growth/differentiation gene group, at least three genes from the
immune response gene group, and IL-6 are measured.
Additionally, expression levels of one or more genes that do not fall within the
gene subsets described herein may be measured with any of the combinations of the gene
subsets described herein. Alternatively, any gene that falls within a gene subset may be
analyzed separately from the gene subset, or in another gene subset.
In a specific embodiment, the method of the invention comprises measuring
the expression levels of the specific combinations of genes and gene subsets shown in the
Examples. In a further embodiment, gene group score(s) and quantitative score(s) are
calculated according to the algorithm(s) shown in the Examples. In certain embodiments, the
method of the invention comprises measuring expression levels of the cancer-related genes
APOLD1, CCL5, CEACAM1, CX3CL1, EDNRB, EIF4EBP1, IL6, LMNB1, NOS3,
PPAP2B, and TUBB2A, and the reference genes AAMP, ARF1, ATP5E, GPX1, and RPLP1,
normalizing the expression levels of one or more of the cancer-related genes against the
expression levels of one or more of the reference genes, assigning the normalized expression
levels to gene subsets, weighting the gene subset according to its contribution to cancer
recurrence, calculating a recurrence score using the weighted gene subset and the normalized
levels, and creating a report comprising the recurrence score.
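The sequence of steps in this embodiment (assign normalized levels to gene subsets, weight each subset, combine into a recurrence score) can be sketched as below. This is not the algorithm of the Examples: the group score is assumed here to be a simple mean, and the weights are arbitrary placeholders whose signs and magnitudes carry no meaning. Only the group memberships are taken from the gene groups described above.

```python
# Group memberships as listed in the description of the gene subsets.
GENE_GROUPS = {
    "vascular_normalization": ["APOLD1", "EDNRB", "NOS3", "PPAP2B"],
    "cell_growth_division": ["EIF4EBP1", "LMNB1", "TUBB2A"],
    "immune_response": ["CCL5", "CEACAM1", "CX3CL1"],
}

# Placeholder weights: purely hypothetical, NOT the coefficients of the assay.
PLACEHOLDER_WEIGHTS = {
    "vascular_normalization": -0.5,
    "cell_growth_division": 0.4,
    "immune_response": -0.3,
    "IL6": 0.2,
}

def gene_group_scores(normalized):
    """Assumed group score: mean normalized expression of the group's genes."""
    return {
        group: sum(normalized[gene] for gene in genes) / len(genes)
        for group, genes in GENE_GROUPS.items()
    }

def recurrence_score(normalized, weights=PLACEHOLDER_WEIGHTS):
    """Weighted sum of the gene group scores plus the individual gene IL6."""
    scores = gene_group_scores(normalized)
    scores["IL6"] = normalized["IL6"]
    return sum(weights[name] * value for name, value in scores.items())

# Hypothetical reference-normalized expression levels for one patient.
patient = {gene: 10.0 for genes in GENE_GROUPS.values() for gene in genes}
patient["IL6"] = 10.0
print({"recurrence_score": recurrence_score(patient)})
```

Keeping the weights in a separate table makes it straightforward to substitute coefficients obtained from a fitted model without changing the scoring code.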
In certain embodiments, the method of the invention comprises measuring
expression levels of certain subgroups of cancer-related genes selected from the group
consisting of: (1) APOLD1, NOS3, and EMCN; (2) APOLD1, NOS3, IL6, IL8, and EMCN;
(3) CEACAM1, CX3CL1, IL6, and IL8; (4) EIF4EBP1 and LMNB1; (5) APOLD1, EDNRB,
and NOS3; (6) APOLD1, EDNRB, and PPAP2B; (7) APOLD1, NOS3, and PPAP2B; (8)
EDNRB, NOS3, and PPAP2B; (9) APOLD1 and NOS3; (10) NOS3 and PPAP2B; (11)
APOLD1, NOS3, PPAP2B, and CEACAM1; (12) APOLD1, NOS3, PPAP2B, and ;
(13) APOLD1, NOS3, CEACAM1, and CX3CL1; (14) APOLD1, , CEACAM1, and
CX3CL1; (15) NOS3, PPAP2B, CEACAM1, and CX3CL1; (16) APOLD1, NOS3,
CEACAM1, CX3CL1, and EIF4EBP1; (17) NOS3, PPAP2B, CEACAM1, CX3CL1, and
EIF4EBP1; (18) APOLD1, NOS3, CEACAM1, CX3CL1, and LMNB1; (19) NOS3,
PPAP2B, CEACAM1, CX3CL1, and LMNB1; (20) APOLD1, NOS3, CEACAM1, CX3CL1,
and TUBB2A; and (21) NOS3, PPAP2B, CEACAM1, CX3CL1, and TUBB2A and the
reference genes AAMP, ARF1, ATP5E, GPX1, and RPLP1, normalizing the expression
levels of one or more of the subgroups of cancer-related genes against the expression levels
of one or more of the reference genes, and creating a report comprising the risk of recurrence.
In certain embodiments, the risk of recurrence is estimated from a hazard ratio calculated
using the normalized expression levels of one or more subgroups of cancer-related genes.
Various technological approaches for determination of expression levels of the
disclosed genes are set forth in this specification, including, without limitation, RT-PCR,
microarrays, high-throughput sequencing, serial analysis of gene expression (SAGE) and
Digital Gene Expression (DGE), which will be discussed in detail below. In particular
aspects, the expression level of each gene may be determined in relation to various features of
the expression products of the gene including exons, introns, protein epitopes and protein
activity.
The expression product that is assayed can be, for example, RNA or a
polypeptide. The expression product may be fragmented. For example, the assay may use
primers that are complementary to target sequences of an expression product and could thus
measure full transcripts as well as those fragmented expression products containing the target
sequence. Further information is provided in Tables A and B.
The RNA expression product may be assayed directly or by detection of a
cDNA product resulting from a PCR-based amplification method, e.g., quantitative reverse
transcription polymerase chain reaction (qRT-PCR). (See e.g., U.S. Patent No. 7,587,279).
Polypeptide expression product may be assayed using immunohistochemistry (IHC) or by
proteomics techniques. Further, both RNA and polypeptide expression products may also be
assayed using microarrays.
CLINICAL UTILITY
Currently, estimation of the expected clinical outcome for RCC patients is based on
subjective determinations of a tumor’s clinical and pathologic features. For example,
physicians make decisions about the appropriate surgical procedures and adjuvant therapy
based on a renal tumor’s stage, grade, and the presence of necrosis. Although there are
standardized measures to guide pathologists in making these decisions, the level of
concordance between pathology laboratories is low. (See Al-Ayanti M et al. (2003) Arch
Pathol Lab Med 127, 6) It would be useful to have a reproducible molecular assay for
determining and/or confirming these tumor characteristics.
In addition, standard clinical criteria, by themselves, have limited ability to
accurately estimate a patient’s prognosis. It would be useful to have a reproducible molecular
assay to assess a patient’s prognosis based on the biology of his or her tumor. Such
information could be used for the purposes of patient counseling, selecting patients for
clinical trials (e.g., adjuvant trials), and understanding the biology of renal cell carcinoma. In
addition, such a test would assist physicians in making surgical and treatment
recommendations based on the biology of each patient’s tumor. For example, a genomic test
could stratify RCC patients based on risk of recurrence and/or likelihood of long-term
survival without recurrence (relapse, metastasis, etc.). There are several ongoing and planned
clinical trials for RCC therapies, including adjuvant radiation and chemotherapies. It would
be useful to have a genomic test able to identify high-risk patients more accurately than
standard clinical criteria, thereby further enriching an adjuvant RCC population for study.
This would reduce the number of patients needed for an adjuvant trial and the time needed for
definitive testing of these new agents in the adjuvant setting.
Finally, it would be useful to have a molecular assay that could predict a
patient’s likelihood to respond to specific treatments. Again, this would facilitate individual
treatment decisions and recruiting patients for clinical trials, and increase physician and
patient confidence in making healthcare decisions after being diagnosed with cancer.
METHODS OF ASSAYING EXPRESSION LEVELS OF A GENE PRODUCT
Methods of expression profiling include methods based on sequencing of
polynucleotides, methods based on hybridization analysis of polynucleotides, and
proteomics-based methods. Representative methods for sequencing-based analysis include
Massively Parallel Sequencing (see e.g., Tucker et al., The American J. Human Genetics
85:142-154, 2009) and Serial Analysis of Gene Expression (SAGE). Exemplary methods
known in the art for the quantification of mRNA expression in a sample include northern
blotting and in situ hybridization (Parker & Barnes, Methods in Molecular Biology 106:247-
283 (1999)); RNase protection assays (Hod, Biotechniques 13:852-854 (1992)); and PCR-
based methods, such as reverse transcription polymerase chain reaction (RT-PCR) (Weis et
al., Trends in Genetics 264 (1992)). Antibodies may be employed that can recognize
sequence-specific duplexes, including DNA duplexes, RNA duplexes, and DNA-RNA hybrid
duplexes or DNA-protein duplexes.
Nucleic Acid Sequencing-Based Methods
Nucleic acid sequencing technologies are suitable methods for expression
analysis. The principle underlying these methods is that the number of times a cDNA
sequence is detected in a sample is directly related to the relative RNA levels corresponding
to that sequence. These methods are sometimes referred to by the term Digital Gene
Expression (DGE) to reflect the discrete numeric property of the resulting data. Early
methods applying this principle were Serial Analysis of Gene Expression (SAGE) and
Massively Parallel Signature Sequencing (MPSS). See, e.g., S. Brenner, et al., Nature
Biotechnology 18(6):630-634 (2000).
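The counting principle behind Digital Gene Expression can be illustrated with a short sketch: each detected sequence is tallied per gene, and the tallies are scaled by the total so that samples of different sequencing depth can be compared. The counts-per-million scaling shown here is a common convention assumed for illustration and is not specified in the disclosure; the read assignments are hypothetical.

```python
from collections import Counter

def counts_per_million(read_assignments):
    """Tally reads per gene and scale each tally to counts per million reads."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}

# Hypothetical gene assignments for four sequenced cDNA fragments.
reads = ["NOS3", "NOS3", "NOS3", "IL6"]
print(counts_per_million(reads))
```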
More recently, the advent of “next-generation” sequencing technologies has
made DGE simpler, higher throughput, and more affordable. As a result, more laboratories
are able to utilize DGE to screen the expression of more nucleic acids in more individual
patient samples than previously possible. See, e.g., J. Marioni, Genome Research 18(9):1509-
1517 (2008); R. Morin, Genome Research 18(4):610-621 (2008); A. Mortazavi, Nature
Methods 5(7):621-628 (2008); N. Cloonan, Nature Methods 5(7):613-619 (2008). Massively
parallel sequencing methods have also enabled whole genome or transcriptome sequencing,
allowing the analysis of not only coding but also non-coding sequences. As reviewed in
Tucker et al., The American J. Human Genetics 85:142-154 (2009), there are several
commercially available massively parallel sequencing platforms, such as the Illumina
Genome Analyzer (Illumina, Inc., San Diego, CA), Applied Biosystems SOLiD™ Sequencer
(Life Technologies, Carlsbad, CA), Roche GS-FLX 454 Genome Sequencer (Roche Applied
Science, Germany), and the Helicos® Genetic Analysis Platform (Helicos Biosciences Corp.,
Cambridge, MA). Other developing technologies may be used.
Reverse Transcription PCR (RT-PCR)
The starting material is typically total RNA isolated from a human tumor,
usually from a primary tumor. Optionally, normal tissues from the same patient can be used
as an internal control. RNA can be extracted from a tissue sample, e.g., from a sample that is
fresh, frozen (e.g., fresh frozen), or paraffin-embedded and fixed (e.g., formalin-fixed).
General methods for RNA extraction are well known in the art and are
disclosed in standard textbooks of molecular biology, including Ausubel et al., Current
Protocols of Molecular Biology, John Wiley and Sons (1997). Methods for RNA extraction
from paraffin embedded tissues are disclosed, for example, in Rupp and Locker, Lab Invest.
56:A67 (1987), and De Andres et al., BioTechniques 18:42-44 (1995). In particular, RNA
isolation can be performed using a purification kit, buffer set and protease from commercial
manufacturers, such as Qiagen, according to the manufacturer’s instructions. For example,
total RNA from cells in culture can be isolated using Qiagen RNeasy mini-columns. Other
commercially available RNA isolation kits include MasterPure™ Complete DNA and RNA
Purification Kit (EPICENTRE®, Madison, WI), and Paraffin Block RNA Isolation Kit
(Ambion, Inc.). Total RNA from tissue samples can be isolated using RNA Stat-60 (Tel-
Test). RNA prepared from a tumor sample can be isolated, for example, by cesium chloride
density gradient centrifugation. The isolated RNA may then be depleted of ribosomal RNA
as described in U.S. Pub. No. 2011/0111409.
The sample containing the RNA is then subjected to reverse transcription to
produce cDNA from the RNA template, followed by exponential amplification in a PCR
reaction. The two most commonly used reverse transcriptases are avian myeloblastosis virus
reverse transcriptase (AMV-RT) and Moloney murine leukemia virus reverse transcriptase
(MMLV-RT). The reverse transcription step is typically primed using specific primers,
random hexamers, or oligo-dT primers, depending on the circumstances and the goal of
expression profiling. For example, extracted RNA can be reverse-transcribed using a
GeneAmp RNA PCR kit (Perkin Elmer, CA, USA), following the manufacturer’s
instructions. The derived cDNA can then be used as a template in the subsequent PCR
reaction.
PCR-based methods use a thermostable DNA-dependent DNA polymerase,
such as a Taq DNA polymerase. For example, TaqMan® PCR typically utilizes the 5’-
nuclease activity of Taq or Tth polymerase to hydrolyze a hybridization probe bound to its
target amplicon, but any enzyme with equivalent 5’ nuclease activity can be used. Two
oligonucleotide primers are used to generate an amplicon typical of a PCR reaction product.
A third oligonucleotide, or probe, can be designed to facilitate detection of a nucleotide
sequence of the amplicon located between the hybridization sites of the two PCR primers.
The probe can be detectably labeled, e.g., with a reporter dye and can further be provided
with both a fluorescent dye, and a quencher fluorescent dye, as in a TaqMan® probe
configuration. Where a TaqMan® probe is used, during the amplification reaction, the Taq
DNA polymerase enzyme cleaves the probe in a template-dependent manner. The resultant
probe fragments disassociate in solution, and signal from the released reporter dye is free
from the quenching effect of the second fluorophore. One molecule of reporter dye is
liberated for each new molecule synthesized, and detection of the unquenched reporter dye
provides the basis for quantitative interpretation of the data.
TaqMan® RT-PCR can be performed using commercially available
equipment, such as, for example, ABI PRISM 7700™ Sequence Detection System™
(Perkin-Elmer-Applied Biosystems, Foster City, CA, USA), or LightCycler (Roche
Molecular Biochemicals, Mannheim, Germany). In a preferred embodiment, the 5' nuclease
procedure is run on a real-time quantitative PCR device such as the ABI PRISM 7700™
Sequence Detection System™. The system consists of a thermocycler, laser,
charge-coupled device (CCD), camera and computer. The system amplifies samples in a
384-well format on a thermocycler. The RT-PCR may be performed in triplicate wells with
an equivalent of 2 ng RNA input per 10 µL reaction volume. During amplification,
laser-induced fluorescent signal is collected in real-time through fiber optics cables for all
wells, and detected at the CCD. The system includes software for running the instrument and
for analyzing the data.
5'-Nuclease assay data are generally initially expressed as a threshold
cycle (“CT”). Fluorescence values are recorded during every cycle and represent the amount
of product amplified to that point in the amplification reaction. The threshold cycle (CT) is
generally described as the point when the fluorescent signal is first recorded as statistically
significant. The Cp value is calculated by determining the second derivatives of entire qPCR
amplification curves and their maximum value. The Cp value represents the cycle at which
the increase of fluorescence is highest and where the logarithmic phase of a PCR begins.
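The second-derivative approach to the Cp value can be illustrated with a minimal sketch. It assumes one fluorescence reading per cycle and approximates the second derivative by a discrete second difference; the sigmoid curve used as input is synthetic and its midpoint and slope are arbitrary. Instrument software typically fits and interpolates the curve, which this sketch does not attempt.

```python
import math

def second_derivative_max_cycle(fluorescence):
    """Return the cycle at which the discrete second difference is largest.

    fluorescence[i] is assumed to be the reading taken at cycle i + 1.
    """
    best_cycle, best_value = None, float("-inf")
    for i in range(1, len(fluorescence) - 1):
        second_difference = (fluorescence[i + 1]
                             - 2 * fluorescence[i]
                             + fluorescence[i - 1])
        if second_difference > best_value:
            best_cycle, best_value = i + 1, second_difference
    return best_cycle

# Synthetic 40-cycle amplification curve (logistic shape, arbitrary parameters).
curve = [1 / (1 + math.exp(-(cycle - 25) / 1.5)) for cycle in range(1, 41)]
cp = second_derivative_max_cycle(curve)
print(cp)
```

For this synthetic curve the maximum of the second difference falls a little before the curve's midpoint at cycle 25, where amplification is accelerating fastest.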
To minimize errors and the effect of sample-to-sample variation, RT-
PCR is usually performed using an internal standard. The ideal internal standard gene (also
referred to as a reference gene) is expressed at a constant level among cancerous and non-
cancerous tissue of the same origin (i.e., a level that is not significantly different among
normal and cancerous tissues), and is not significantly affected by the experimental treatment
(i.e., does not exhibit a significant difference in expression level in the relevant tissue as a
result of exposure to chemotherapy). RNAs most frequently used to normalize patterns of
gene expression are mRNAs for the housekeeping genes glyceraldehyde-3-phosphate-
dehydrogenase (GAPDH) and β-actin. Gene expression measurements can be normalized
relative to the mean of one or more (e.g., 2, 3, 4, 5, or more) reference genes. Reference-
normalized expression measurements can range from 0 to 15, where a one unit increase
generally reflects a 2-fold increase in RNA quantity.
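A minimal sketch of reference normalization on the CT scale follows. It assumes that the normalized value is the mean reference-gene CT minus the target-gene CT plus a constant offset; the offset of 10 is a hypothetical choice made only to keep values positive, and all CT values are invented for illustration. Because a lower CT corresponds to more starting RNA, a one-unit increase in the normalized value corresponds to roughly a 2-fold increase in RNA quantity, consistent with the text above.

```python
from statistics import mean

REFERENCE_GENES = ["AAMP", "ARF1", "ATP5E", "GPX1", "RPLP1"]
ASSUMED_OFFSET = 10.0  # hypothetical constant, not taken from the disclosure

def reference_normalize(ct_values, gene, offset=ASSUMED_OFFSET):
    """Normalize a gene's CT against the mean CT of the reference genes."""
    reference_mean = mean(ct_values[ref] for ref in REFERENCE_GENES)
    return reference_mean - ct_values[gene] + offset

def fold_difference(normalized_a, normalized_b):
    """Approximate RNA quantity ratio implied by two normalized values."""
    return 2.0 ** (normalized_a - normalized_b)

# Hypothetical CT values for one sample.
ct = {ref: 25.0 for ref in REFERENCE_GENES}
ct.update({"NOS3": 27.0, "IL6": 26.0})

nos3 = reference_normalize(ct, "NOS3")
il6 = reference_normalize(ct, "IL6")
print(nos3, il6, fold_difference(il6, nos3))
```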
Real time PCR is compatible both with quantitative competitive PCR,
where an internal competitor for each target sequence is used for normalization, and with
quantitative comparative PCR using a normalization gene contained within the sample, or a
housekeeping gene for RT-PCR. For further details see, e.g., Held et al., Genome Research
6:986-994 (1996).
Design of PCR Primers and Probes
PCR primers and probes can be designed based upon exon, intron, or
intergenic sequences present in the RNA transcript of interest. Primer/probe design can be
performed using publicly available software, such as the DNA BLAT software developed by
Kent, W.J., Genome Res. 12(4):656-64 (2002), or by the BLAST software including its
variations.
Where necessary or desired, repetitive sequences of the target sequence
can be masked to mitigate non-specific signals. Exemplary tools to accomplish this include
the Repeat Masker program available on-line through the Baylor College of Medicine, which
screens DNA sequences against a library of repetitive elements and returns a query sequence
in which the repetitive elements are masked. The masked sequences can then be used to
design primer and probe sequences using any commercially or otherwise publicly available
primer/probe design packages, such as Primer Express (Applied Biosystems); MGB assay-
by-design (Applied Biosystems); Primer3 (Steve Rozen and Helen J. Skaletsky (2000)
Primer3 on the WWW for general users and for biologist programmers. In: Krawetz S,
Misener S (eds) Bioinformatics Methods and Protocols: Methods in Molecular Biology.
Humana Press, Totowa, NJ, pp 365-386).
Other factors that can influence PCR primer design include primer
length, melting temperature (Tm), G/C content, specificity, complementary primer
sequences, and 3'-end sequence. In general, optimal PCR primers are 17-30 bases
in length, contain about 20-80%, such as, for example, about 50-60% G+C bases, and
exhibit Tm's between 50 and 80 °C, e.g. about 50 to 70 °C.
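The general ranges above can be checked programmatically. The following is a minimal sketch, not part of the disclosed method: the function names are hypothetical, and the melting temperature uses a common G+C-content approximation (Tm = 64.9 + 41 x (G+C - 16.4) / N) that is assumed here for illustration only.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def approx_tm(primer: str) -> float:
    """Rough melting temperature in degrees C.

    Assumption: the common GC-content approximation
    Tm = 64.9 + 41 * (G + C - 16.4) / N, which is not taken from this text.
    """
    seq = primer.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def primer_in_general_ranges(primer: str) -> bool:
    """Check length (17-30 bases), G+C content (20-80%), and Tm (50-80 C)."""
    return (
        17 <= len(primer) <= 30
        and 0.20 <= gc_fraction(primer) <= 0.80
        and 50.0 <= approx_tm(primer) <= 80.0
    )


# Hypothetical 20-base candidate primer, used only to exercise the checks.
candidate = "ATGCATGCATGCATGCATGC"
print(len(candidate), gc_fraction(candidate), round(approx_tm(candidate), 2))
print(primer_in_general_ranges(candidate))
```

A dedicated design package such as Primer3 also evaluates specificity, primer complementarity, and the 3'-end sequence, which this sketch does not attempt.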
For further guidelines for PCR primer and probe design see, e.g.
Dieffenbach, C.W. et al., “General Concepts for PCR Primer Design” in: PCR Primer, A
Laboratory Manual, Cold Spring Harbor Laboratory Press, New York, 1995, pp. 133-155;
Innis and Gelfand, “Optimization of PCRs” in: PCR Protocols, A Guide to Methods and
Applications, CRC Press, London, 1994, pp. 5-11; and Plasterer, T.N. Primerselect: Primer
and probe design. Methods Mol. Biol. 70:520-527 (1997), the entire disclosures of which are
hereby expressly incorporated by reference.
Tables A and B provide further information concerning the primer,
probe, and amplicon sequences associated with the Examples disclosed herein.
MassARRAY® System
In MassARRAY-based methods, such as the exemplary method
developed by Sequenom, Inc. (San Diego, CA), following the isolation of RNA and reverse
transcription, the obtained cDNA is spiked with a synthetic DNA molecule (competitor),
which matches the targeted cDNA region in all positions, except a single base, and serves as
an internal standard. The cDNA/competitor mixture is PCR amplified and is subjected to a
post-PCR shrimp alkaline phosphatase (SAP) enzyme treatment, which results in the
dephosphorylation of the remaining nucleotides. After inactivation of the alkaline
phosphatase, the PCR products from the competitor and cDNA are subjected to primer
extension, which generates distinct mass signals for the competitor- and cDNA-derived PCR
products. After purification, these products are dispensed on a chip array, which is pre-
loaded with components needed for analysis with matrix-assisted laser desorption ionization
time-of-flight mass spectrometry (MALDI-TOF MS) analysis. The cDNA present in the
reaction is then quantified by analyzing the ratios of the peak areas in the mass spectrum
generated. For further details see, e. g. Ding and Cantor, Proc. Natl. Acad. Sci. USA
100:3059-3064 (2003).
Other PCR-based Methods
Further PCR-based techniques that can find use in the methods
disclosed herein include, for example, BeadArray® technology (Illumina, San Diego, CA;
Oliphant et al., Discovery of Markers for Disease (Supplement to Biotechniques), June 2002;
Ferguson et al., Analytical Chemistry 72:5618 (2000)); BeadsArray for Detection of Gene
Expression® (BADGE), using the commercially available Luminex100 LabMAP® system
and multiple color-coded microspheres (Luminex Corp., Austin, TX) in a rapid assay for
gene expression (Yang et al., Genome Res. 11:1888-1898 (2001)); and high coverage
expression profiling (HiCEP) analysis (Fukumura et al., Nucl. Acids. Res. 31(16) e94 (2003)).
Microarrays
In this method, polynucleotide sequences of interest (including cDNAs
and oligonucleotides) are arrayed on a substrate. The arrayed sequences are then contacted
under conditions suitable for specific hybridization with detectably labeled cDNA generated
from RNA of a sample. The source of RNA typically is total RNA isolated from a tumor
sample, and optionally from normal tissue of the same patient as an internal control or cell
lines. RNA can be extracted, for example, from frozen or archived paraffin-embedded and
fixed (e. g. formalin-fixed) tissue samples.
For example, PCR amplified inserts of cDNA clones of a gene to be
assayed are applied to a substrate in a dense array. Usually at least 10,000 nucleotide
sequences are applied to the substrate. For example, the microarrayed genes, immobilized on
the microchip at 10,000 elements each, are suitable for hybridization under stringent
conditions. Fluorescently labeled cDNA probes may be generated through incorporation of
fluorescent nucleotides by reverse transcription of RNA extracted from tissues of interest.
Labeled cDNA probes applied to the chip hybridize with specificity to each spot of DNA on
the array. After washing under stringent conditions to remove non-specifically bound probes,
the chip is scanned by confocal laser microscopy or by another detection method, such as a
CCD camera. Quantitation of hybridization of each arrayed element allows for assessment of
corresponding mRNA abundance.
With dual color fluorescence, separately labeled cDNA probes
generated from two sources of RNA are hybridized pair wise to the array. The relative
abundance of the transcripts from the two sources corresponding to each specified gene is
thus determined simultaneously. The miniaturized scale of the hybridization affords a
convenient and rapid evaluation of the expression pattern for large numbers of genes. Such
methods have been shown to have the sensitivity required to detect rare transcripts, which are
expressed at a few copies per cell, and to reproducibly detect at least approximately two-fold
differences in the expression levels (Schena et al., Proc. Natl. Acad. Sci. USA 93(2):106-149
(1996)). Microarray analysis can be performed on commercially available equipment,
following the manufacturer's protocols, such as by using the Affymetrix GeneChip®
technology, or Incyte's microarray technology.
Isolating RNA from Body Fluids
Methods of isolating RNA for expression analysis from blood, plasma
and serum (see for example, Tsui NB et al. (2002) Clin. Chem. 7-53 and references
cited therein) and from urine (see for example, Boom R et al. (1990) J Clin Microbiol. 28,
495-503 and references cited therein) have been described.
Methods of Isolating RNA from Paraffin-Embedded Tissue
The steps of a representative protocol for profiling gene expression
using fixed, paraffin-embedded tissues as the RNA source, including mRNA isolation,
purification, primer extension and amplification are provided in various published journal
articles. (See, e.g., T.E. Godfrey et al., J. Molec. Diagnostics 2: 84-91 (2000); K. Specht et
al., Am. J. Pathol. 158: 419-29 (2001), M. Cronin, et al., Am J Pathol 164:35-42 (2004)).
Immunohistochemistry
Immunohistochemistry methods are also suitable for detecting the
expression levels of genes and can be applied to the method disclosed herein. Antibodies (e.g.,
monoclonal antibodies) that specifically bind a gene product of a gene of interest can be used
in such methods. The antibodies can be detected by direct labeling of the antibodies
themselves, for example, with radioactive labels, fluorescent labels, hapten labels such as
biotin, or an enzyme such as horse radish peroxidase or alkaline phosphatase. Alternatively,
unlabeled primary antibody can be used in conjunction with a labeled secondary antibody
specific for the primary antibody. Immunohistochemistry protocols and kits are well known
in the art and are commercially available.
Proteomics
The term “proteome” is defined as the totality of the proteins present in
a sample (e. g. tissue, organism, or cell culture) at a certain point of time. Proteomics
includes, among other things, study of the global changes of protein expression in a sample
(also referred to as “expression proteomics”). Proteomics typically includes the following
steps: (1) separation of individual proteins in a sample by 2-D gel electrophoresis (2-D
PAGE); (2) identification of the individual proteins recovered from the gel, e.g. by mass
spectrometry or N-terminal sequencing, and (3) analysis of the data using bioinformatics.
General Description of the mRNA Isolation, Purification and Amplification
The steps of a representative protocol for profiling gene expression
using fixed, paraffin-embedded tissues as the RNA source, including mRNA isolation,
purification, primer extension and amplification are provided in various published journal
articles. (See, e.g., T.E. Godfrey, et al., J. Molec. Diagnostics 2: 84-91 (2000); K. Specht et
al., Am. J. Pathol. 158: 419-29 (2001), M. Cronin, et al., Am J Pathol 164:35-42 (2004)).
Briefly, a representative process starts with cutting a tissue sample section (e.g. about 10 µm
thick sections of a paraffin-embedded tumor tissue sample). The RNA is then extracted, and
protein and DNA are removed. After analysis of the RNA concentration, RNA repair is
performed if desired. The sample can then be subjected to analysis, e.g., by reverse
transcription using gene specific promoters followed by RT-PCR.
STATISTICAL ANALYSIS OF GENE EXPRESSION LEVELS IN IDENTIFICATION
OF MARKER GENES FOR USE IN PROGNOSTIC METHODS
One skilled in the art will recognize that there are many statistical
methods that may be used to determine whether there is a significant relationship between an
outcome of interest (e.g., likelihood of survival, likelihood of response to chemotherapy) and
expression levels of a marker gene as described here. This relationship can be presented as a
continuous recurrence score (RS), or patients may be stratified into risk groups (e.g., low,
intermediate, high). For example, a Cox proportional hazards regression model may provide
an adequate fit to a particular clinical endpoint (e.g., RFI, DFS, OS). One assumption of the
Cox proportional hazards regression model is the proportional hazards assumption, i.e. the
assumption that effect parameters multiply the underlying hazard. Assessments of model
adequacy may be performed including, but not limited to, examination of the cumulative sum
of martingale residuals. One skilled in the art would recognize that there are numerous
statistical methods that may be used (e.g., Royston and Parmar (2002), smoothing spline,
etc.) to fit a flexible parametric model using the hazard scale and the Weibull distribution
with natural spline smoothing of the log cumulative hazards function, with effects for
treatment (chemotherapy or observation) and RS allowed to be time-dependent. (See, P.
Royston, M. Parmar, Statistics in Medicine 21(15):2175-2197 (2002).) The relationship
between recurrence risk and (1) recurrence risk groups; and (2) clinical/pathologic covariates
(e. g., number of nodes examined, pathological T stage, tumor grade, lymphatic or vascular
invasion, etc.) may also be tested for significance.
In an exemplary embodiment, power calculations were carried out for the
Cox proportional hazards model with a single non-binary covariate using the method
proposed by F. Hsieh and P. Lavori, Control Clin Trials 21:552-560 (2000) as implemented
in PASS 2008.
GENERAL DESCRIPTION OF EXEMPLARY EMBODIMENTS
This disclosure provides a method for obtaining a recurrence score for
a patient with kidney cancer by assaying expression levels of certain prognostic genes from a
tumor sample obtained from the patient. Such methods involve use of gene subsets that are
created based on similar functions of gene products. For example, prognostic methods
disclosed herein involve assaying expression levels of gene subsets that include at least one
gene from each of a vascular normalization group, an immune response group, and cell
growth/division group, and IL-6, and calculating a recurrence score (RS) for the patient by
weighting the expression levels of each of the gene subsets by their respective contributions
to cancer recurrence. The weighting may be different for each gene subset, and may be either
positive or negative. For example, the vascular normalization gene group score may be
weighted by multiplying by a factor of -0.45, the immune response gene group score may be
weighted by multiplying by a factor of -0.31, the cell growth/division gene group score may be
weighted by a factor of +0.27, and the value for IL-6 may be multiplied by a factor of +0.04.
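The weighting described above can be sketched as a weighted sum. This is a minimal illustration only: the weights are those quoted in the preceding paragraph, while the dictionary keys, function name, and input values are hypothetical and do not come from patient data.

```python
# Weights quoted in the text for each term of the recurrence score.
GROUP_WEIGHTS = {
    "vascular_normalization": -0.45,
    "immune_response": -0.31,
    "cell_growth_division": +0.27,
    "IL6": +0.04,
}


def unscaled_recurrence_score(group_scores: dict) -> float:
    """Sum each group score multiplied by its weight."""
    return sum(GROUP_WEIGHTS[name] * group_scores[name] for name in GROUP_WEIGHTS)


# Hypothetical group scores, chosen only to exercise the arithmetic.
example_scores = {
    "vascular_normalization": 8.0,
    "immune_response": 6.0,
    "cell_growth_division": 5.0,
    "IL6": 4.0,
}
print(round(unscaled_recurrence_score(example_scores), 2))
```

Negative weights mean that higher group scores lower the recurrence score, and positive weights raise it.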
Normalization of Expression Levels
The expression data used in the methods disclosed herein can be
normalized. Normalization refers to a process to correct for (normalize away), for example,
differences in the amount of RNA assayed and variability in the quality of the RNA used, to
remove unwanted sources of systematic variation in CT measurements, and the like. With
respect to RT-PCR experiments involving archived fixed paraffin embedded tissue samples,
sources of systematic variation are known to include the degree of RNA degradation relative
to the age of the patient sample and the type of fixative used to store the sample. Other
sources of systematic variation may be attributable to laboratory processing conditions.
Assays can provide for normalization by incorporating the expression
of certain normalizing genes, which genes are relatively invariant under the relevant
conditions. Exemplary normalization genes include housekeeping genes. Normalization can
be based on the mean or median signal (CT) of all of the assayed genes or a large subset
thereof (global normalization approach). In general, the normalizing genes, also referred to as
reference genes, should be genes that are known to be invariant in kidney cancer as compared
to non-cancerous kidney tissue, and are not significantly affected by various sample and
process conditions, thus provide for normalizing away extraneous effects.
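Reference normalization of this kind can be sketched as follows. The exact formula is not given in this passage, so the sign convention (mean reference CT minus target CT, so that larger values indicate more RNA) is an assumption, as are the function name and the CT values.

```python
from statistics import mean


def reference_normalize(target_ct: float, reference_cts: dict) -> float:
    """Normalize a target-gene CT against the mean CT of the reference genes.

    Assumption: the result is (mean reference CT - target CT), so a one-unit
    increase corresponds to roughly a 2-fold increase in RNA quantity.
    """
    return mean(reference_cts.values()) - target_ct


# Hypothetical CT values for five reference genes in one sample.
reference_cts = {"REF_A": 28.0, "REF_B": 27.0, "REF_C": 29.0, "REF_D": 26.0, "REF_E": 25.0}
print(reference_normalize(30.0, reference_cts))
```

Because every gene in the sample is shifted by the same reference mean, sample-wide effects such as the amount of input RNA cancel out.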
Unless noted otherwise, normalized expression levels for each
mRNA/tested tumor/patient will be expressed as a percentage of the expression level
measured in the reference set. A reference set of a sufficiently high number (e.g., 40) of
tumors yields a distribution of normalized levels of each mRNA species. The level measured
in a particular tumor sample to be analyzed falls at some percentile within this range, which
can be determined by methods well known in the art.
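One simple way to locate a measured level within a reference distribution is sketched below. The percentile definition (share of reference values at or below the measured level) and the reference values are assumptions made for illustration.

```python
def percentile_in_reference(value: float, reference_levels: list) -> float:
    """Percentage of reference-set levels that are at or below `value`."""
    at_or_below = sum(1 for level in reference_levels if level <= value)
    return 100.0 * at_or_below / len(reference_levels)


# Hypothetical reference set of 40 tumors with normalized levels 1.0 .. 40.0.
reference_levels = [float(i) for i in range(1, 41)]
print(percentile_in_reference(10.0, reference_levels))
```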
In exemplary embodiments, one or more of the following genes are
used as references by which the expression data is normalized: AAMP, ARF1, ATP5E,
GPX1, and RPLP1. The calibrated weighted average CT measurements for each of the
prognostic genes may be normalized relative to the mean of five or more reference genes.
Those skilled in the art will recognize that normalization may be
achieved in numerous ways, and the techniques described above are intended only to be
exemplary, not exhaustive.
Standardization of Expression Levels
The expression data used in the methods disclosed herein can be
standardized. Standardization refers to a process to effectively put all the genes on a
comparable scale. This is performed because some genes will exhibit more variation (a
broader range of expression) than others. Standardization is performed by dividing each
expression value by its standard deviation across all samples for that gene. Hazard ratios are
then interpreted as the proportional change in the hazard for the clinical endpoint (clinical
recurrence, biological recurrence, death due to kidney cancer, or death due to any cause) per
1 standard deviation increase in expression.
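Standardization as described above can be sketched in a few lines. The use of the sample standard deviation (rather than the population form) is an assumption, and the expression values are hypothetical.

```python
from statistics import stdev


def standardize(values: list) -> list:
    """Divide each expression value by the gene's standard deviation across samples."""
    sd = stdev(values)  # sample standard deviation (assumed convention)
    return [v / sd for v in values]


# Hypothetical normalized expression values for one gene across three samples.
gene_values = [2.0, 4.0, 6.0]
standardized = standardize(gene_values)
print(standardized)
```

After this step each gene has unit standard deviation, so a hazard ratio fitted on the standardized values refers to a 1 standard deviation increase in expression.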
Bridging Expression Measurements and Calibration
An oligonucleotide set represents a forward primer, reverse primer,
and probe that are used to build a primer and probe (P3) pool and gene specific primer (GSP)
pool. Systematic differences in RT-PCR cycle threshold (CT) measurements can result
between different oligonucleotide sets due to inherent variations in oligonucleotide syntheses.
For example, differences in oligonucleotide sets may exist between development, production
(used for validation), and future production oligonucleotide sets. Thus, use of statistical calibration
procedures to adjust for systematic differences in oligonucleotide sets resulting in translation
in the gene coefficients used in calculating RS may be desirable. For example, for each of the
genes assayed for use in an algorithm, one may use a scatterplot of CT measurements for
production oligonucleotide sets versus CT measurements from a corresponding sample used
in a different oligonucleotide set to create a linear regression model that treats the effect of lot-to-
lot differences as a random effect. Examination of such a plot will reveal that the variance of
CT measurements increases exponentially as a function of the mean CT. The random effects
linear regression model can be evaluated with log-linear variance, to obtain a linear
calibration equation. A calculated mean squared error (MSE) for the scores can be compared
to the MSE if no calibration scheme is used at all.
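The idea of a linear calibration equation and the MSE comparison can be sketched as follows. This sketch substitutes ordinary least squares for the random effects model with log-linear variance described above, purely to keep the example short; the paired CT values and all names are hypothetical.

```python
def fit_line(x: list, y: list) -> tuple:
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(predicted: list, target: list) -> float:
    """Mean squared error between two equally long lists."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


# Hypothetical paired CT measurements of the same samples on two oligonucleotide sets.
new_set_ct = [24.0, 26.0, 28.0, 30.0, 32.0]
reference_set_ct = [24.5, 26.4, 28.6, 30.5, 32.5]

slope, intercept = fit_line(new_set_ct, reference_set_ct)
calibrated_ct = [slope * ct + intercept for ct in new_set_ct]

# Compare the error with calibration against the error with no calibration at all.
print(mse(calibrated_ct, reference_set_ct) <= mse(new_set_ct, reference_set_ct))
```

A fuller treatment would model lot-to-lot differences as a random effect and let the variance grow with the mean CT, as the text describes.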
As another example, a latent variable measurement of CT (e.g. first
principal component) may be derived from various oligonucleotide sets. The latent variable is
a reasonable measure of the “true” underlying CT measurement. Similar to the method
described above, a linear regression model may be fit to the sample pairs treating the effects
of differences as a random effect, and the weighted average CT value adjusted to a calibrated
CT.
Centering and Data Compression/Scaling
Systematic differences in the distribution of patient RS due to analytical
or sample differences may exist between early development, clinical validation and commercial
samples. A constant centering tuning parameter may be used in the algorithm to account for such
differences.
Data compression is a procedure used to reduce the variability in observed
normalized CT values beyond the limit of quantitation (LOQ) of the assay. Specifically, for each
of the kidney cancer assay genes, variance in CT measurements increases exponentially as the
normalized CT for a gene extends beyond the LOQ of the assay. To reduce such variation,
normalized CT values for each gene may be compressed towards the LOQ of the assay.
Additionally, normalized CT values may be rescaled. For example, normalized CT values of the
prognostic and reference genes may be rescaled to a range of 0 to 15, where a one-unit increase
generally reflects a 2-fold increase in RNA quantity.
Threshold Values
The present invention describes a method to determine a threshold
value for expression of a cancer-related gene, comprising measuring an expression level of a
gene, or its expression product, in a tumor section obtained from a cancer patient,
normalizing the expression level to obtain a normalized expression level, calculating a
threshold value for the normalized expression level, and determining a score based on the
likelihood of recurrence or clinically beneficial response to treatment, wherein if the
normalized expression level is less than the threshold value, the threshold value is used to
determine the score, and wherein if the normalized expression level is greater or equal to the
threshold value, the normalized expression level is used to determine the score.
For example, a threshold value for each cancer-related gene may be
determined through examination of the functional form of relationship between gene
expression and outcome. Examples of such analyses are presented for Cox PH regression on
recurrence free interval where gene expression is modeled using natural splines and for
logistic regression on recurrence status where gene expression is modeled using a lowess
smoother.
In some embodiments, if the relationship between the term and the risk
of recurrence is non-linear or expression of the gene is relatively low, a threshold may be
used. In an embodiment, when the expression of IL6 is <4 CT the value is fixed at 4 CT.
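The thresholding rule described in this section amounts to taking the larger of the normalized expression level and the threshold. A minimal sketch follows; the function and constant names are hypothetical, and only the IL6 threshold of 4 CT comes from the text.

```python
def apply_threshold(normalized_level: float, threshold: float) -> float:
    """Use the threshold when the level is below it, otherwise use the level itself."""
    return max(normalized_level, threshold)


IL6_THRESHOLD_CT = 4.0  # threshold for IL6 stated in the text

print(apply_threshold(3.2, IL6_THRESHOLD_CT))  # below threshold: fixed at 4.0
print(apply_threshold(6.5, IL6_THRESHOLD_CT))  # at or above threshold: unchanged
```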
KITS OF THE INVENTION
The materials for use in the methods of the present invention are suited
for preparation of kits produced in accordance with well-known procedures. The present
disclosure thus provides kits comprising agents, which may include gene-specific or gene-
selective probes and/or primers, for quantitating the expression of the disclosed genes for
predicting prognostic outcome or response to treatment. Such kits may optionally contain
reagents for the extraction of RNA from tumor samples, in particular fixed paraffin-
embedded tissue samples and/or reagents for RNA amplification. In addition, the kits may
optionally comprise the reagent(s) with an identifying description or label or instructions
relating to their use in the methods of the present invention. The kits may comprise containers
(including microtiter plates suitable for use in an automated implementation of the method),
each with one or more of the various reagents (typically in concentrated form) utilized in the
methods, including, for example, pre-fabricated microarrays, buffers, the appropriate
nucleotide triphosphates (e.g., dATP, dCTP, dGTP and dTTP; or rATP, rCTP, rGTP and
UTP), reverse transcriptase, DNA polymerase, RNA polymerase, and one or more probes and
primers of the present invention (e.g., appropriate length poly(T) or random primers linked to
a promoter reactive with the RNA polymerase). Mathematical algorithms used to estimate or
quantify prognostic or predictive information are also properly potential components of kits.
REPORTS
The methods of this invention, when practiced for commercial
diagnostic purposes, typically produce a report or summary of information obtained from the
above-described methods. For example, a report may include information concerning
expression levels of prognostic genes, a Recurrence Score, a prediction of the predicted
clinical outcome for a particular patient, or thresholds. The methods and reports of this
invention can further include storing the report in a database. The method can create a record
in a database for the subject and populate the record with data. The report may be a paper
report, an auditory report, or an electronic record. The report may be displayed and/or stored
on a computing device (e.g., handheld device, desktop computer, smart device, website, etc.).
It is contemplated that the report is provided to a physician and/or the patient. The receiving
of the report can further include establishing a network connection to a server computer that
includes the data and report and requesting the data and report from the server computer.
COMPUTER PROGRAM
The values from the assays described above, such as expression data,
recurrence score, treatment score and/or benefit score, can be calculated and stored manually.
Alternatively, the above-described steps can be completely or partially performed by a
computer program product. The present invention thus provides a computer program product
including a computer readable storage medium having a computer program stored on it. The
program can, when read by a computer, execute relevant calculations based on values
obtained from analysis of one or more biological samples from an individual (e.g., gene
expression levels, normalization, thresholding, and conversion of values from assays to a
score and/or graphical depiction of likelihood of recurrence/response to chemotherapy, gene
co-expression or clique analysis, and the like). The computer program product has stored
therein a computer program for performing the calculation.
The present disclosure provides systems for executing the program
described above, which system generally includes: a) a central computing environment; b) an
input device, operatively connected to the computing environment, to receive patient data,
wherein the patient data can include, for example, expression level or other value obtained
from an assay using a biological sample from the patient, or microarray data, as described in
detail above; c) an output device, connected to the computing environment, to provide
information to a user (e.g., medical personnel); and d) an algorithm executed by the central
computing environment (e.g., a processor), where the algorithm is executed based on the data
received by the input device, and wherein the algorithm calculates a RS, risk or benefit group
classification, gene co-expression analysis, thresholding, or other functions described herein.
The methods provided by the present invention may also be automated in whole or in part.
All aspects of the present invention may also be practiced such that a
limited number of additional genes that are co-expressed with the disclosed genes, for
example as evidenced by statistically meaningful Pearson and/or Spearman correlation
coefficients, are included in a prognostic test in addition to and/or in place of disclosed genes.
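Co-expression between two genes can be screened with a Pearson correlation coefficient, sketched below. The expression vectors are hypothetical, and any cutoff for calling a correlation statistically meaningful is left open, since the text does not specify one.

```python
from math import sqrt


def pearson(x: list, y: list) -> float:
    """Pearson correlation coefficient between two equally long expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical normalized expression of a disclosed gene and a candidate gene
# measured across the same four tumor samples.
disclosed_gene = [1.0, 2.0, 3.0, 4.0]
candidate_gene = [2.0, 4.0, 6.0, 8.0]
print(round(pearson(disclosed_gene, candidate_gene), 6))
```

A Spearman coefficient can be obtained the same way by replacing each vector with the ranks of its values before calling the function.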
Having described the invention, the same will be more readily
understood through reference to the following Examples, which are provided by way of
illustration, and are not intended to limit the invention in any way.
EXAMPLES
EXAMPLE 1: SELECTION OF GENES FOR ALGORITHM DEVELOPMENT
A gene identification study to identify genes associated with clinical
recurrence is described in U.S. Provisional Application Nos. 61/294,038, filed January 11,
2010, and 61/346,230, filed May 19, 2010, and in U.S. Application Publication No.
2011/0171633, filed January 7, 2011, and published July 14, 2011 (all of which are hereby
incorporated by reference). Briefly, patients with stage I-III ccRCC who underwent
nephrectomy at Cleveland Clinic between 1985 and 2003 with archived paraffin-embedded
nephrectomy samples were identified. RNA was extracted from 6 x 10 µm dissected tumor
sections and RNA expression quantified for 732 genes (including 5 reference genes) using
RT-PCR. The primary endpoint was recurrence-free interval (RFI), defined as time from
nephrectomy to first recurrence or death due to RCC. 931 patients with complete
clinical/pathology data and tissue blocks were evaluable. Patient characteristics were: 63%
male, median age 61, stage I (68%), II (10%) and III (22%), median follow-up of 5.6 years,
-year recurrence rates in stage I, II, and III were 10%, 29%, and 45% respectively.
Clinical/pathology covariates significantly associated with RFI included microscopic
necrosis, Fuhrman grade, stage, tumor size and lymph node involvement (all p<0.001).
Based on the results of the identification study, 448 genes were
significantly (p<0.05, unadjusted; Cox models) associated with RFI. For the majority of
these genes (366 (82%)), increased expression was associated with better outcome. Many of
the genes were significantly (p<0.05) associated with necrosis (503), Fuhrman grade
(494), stage (482), tumor size (492), and nodal status (183). 300 genes were significantly
(p<0.05, unadjusted) associated with at least 4 of the 5 pathologic and clinical covariates
described above.
A smaller set of 72 genes was selected for developing multi-gene
models as follows: 29 genes associated with RFI after adjustment for disease stage, Fuhrman
grade, tumor size, necrosis and nodal status controlling false discovery rate (FDR) at 10%;
the top 14 genes associated with RFI in univariate analyses; 12 genes that were members of
the vascular endothelial growth factor/mammalian target of rapamycin (VEGF/mTOR)
vascularization pathway; and 17 genes from additional biological pathways that were
identified by principal component analysis (PCA). These data were used to select the final 11
cancer-related genes and 5 reference genes and to develop a multi-gene algorithm to predict
recurrence of ccRCC for patients with stage II renal cancer.
EXAMPLE 2: ALGORITHM DEVELOPMENT
The genes identified in the studies described in Example 1 were
considered for inclusion in the Recurrence Score. A smaller set of 72 genes was selected as
follows:
• 29 genes associated with RFI after covariate adjustment and FDR control at 10% using
Storey's procedure (Storey JD (2002) A direct approach to false discovery rates.
Journal of the Royal Statistical Society: Series B 64:479-498; Storey JD, Taylor JE,
Siegmund D (2004) Strong control, conservative point estimation and simultaneous
conservative consistency of false discovery rates: a unified approach. Journal of the
Royal Statistical Society, Series B 66:187-205).
• 14 genes most significant before covariate adjustment
• 12 genes that are members of VEGF/mTOR pathways
• 17 genes selected by principal component analysis to identify genes from
additional pathways
To determine the association between each of the 72 genes and RFI,
univariate and multivariable analyses were used. Tables 1A (univariate analysis) and 1B
(multivariable analysis) report the Hazard Ratio, 95% confidence interval, Chi-squared, p-
value, and q-value for each of the 72 genes.
Table 1A: Univariate analysis for 72 genes: association with RFI
Association with RFI
Rank by HR  Official Symbol  N  HR  95% CI  Chi-Sq  p-value  q-value
22 A2M
29 ADD1
58 ANGPTL3
26 APOLD1
4 AQP1
34 BUB1*
24 C13orf15
40 CA12*
42 CASP10
73 CCL5 931 0.99 (0.87,1.13) 0.02 0.894 0.455
48 CCNB1*
66 CCR7
69 CD8A
CEACAM1
27 CX3CL1
68 CXCL10 931 0.89 (0.78,1.01) 3.29 0.070 0.049
67 CXCL9
47 CYR61
23 EDNRB
53 EGR1
1 EMCN
56 ENO2*
17 EPAS1
31 FLT1
14 FLT4
62 HIF1AN
45 HLA-DPB1
ICAM2
19 ID1
50 IL6*
36 IL8*
65 ITGB1
72 ITGB5 931 0.97 (0.85,1.11) 0.22 0.640 0.341
JAG1
12 KDR
54 KIT
21 KL
55 KRAS
63 LAMB1*
9 LDB2
52 LMNB1*
49 LOX*
43 MAP2K3
41 MMP14*
60 MTOR
18 NOS3
16 NUDT6
61 PDGFA
33 PDGFB
37 PDGFC
28 PDGFD
57 PDGFRB
3 PPAP2B
32 PRKCH
6 PTPRB
44 PTTG1*
64 RAF1
RGS5
7 SDPR
39 SGK1
11 SHANK3
SNRK
46 SPP1*
2 TEK
13 TGFBR2
TIMP3
TPX2*
8 TSPAN7
70 TUBB2A*
51 UGCG
38 VCAM1
59 VEGFA
Key:
Genes marked with an asterisk (*) are associated such that increased expression is
associated with worse outcome
Genes in bold are the top 10 genes with respect to magnitude of the Hazard
Ratio (HR)
Table 1B: Multivariable analysis for 72 genes: association with RFI
Association with RFI Adjusted for 5 Clin/path Covariates
Rank by HR  Official Symbol  N  HR  95% CI  Chi-Sq  p-value  q-value
22 A2M 928 0.93 (0.80,1.08) 0.99 0.3191 0.5394
29 ADD1
58 ANGPTL3
26 APOLD1
4 AQP1
34 BUB1*
24 C13orf15
40 CA12* 928 1.08 (0.95,1.23) 1.46 0.2267 0.4919
42 CASP10
73 CCL5
48 CCNB1*
66 CCR7
69 CD8A
CEACAM1
27 CX3CL1
68 CXCL10
67 CXCL9
47 CYR61 927 0.93 (0.81,1.08) 0.84 0.3603 0.5581
23 EDNRB
53 EGR1 927 0.91 (0.79,1.04) 1.93 0.1652 0.4234
1 EMCN
56 ENO2*
17 EPAS1
31 FLT1 928 0.91 (0.80,1.05) 1.57 0.2106 0.4919
14 FLT4 926 0.86 (0.73,1.02) 3.11 0.0776 0.3289
62 HIF1AN 928 1.02 (0.90,1.16) 0.10 0.7571 0.7052
45 HLA-DPB1
ICAM2
19 ID1
50 IL6* 928 1.04 (0.92,1.18) 0.46 0.4994 0.6384
36 IL8* 928 1.11 (0.98,1.26) 2.89 0.0890 0.3350
65 ITGB1
72 ITGB5
JAG1 (0.75,1.03)
12 KDR (0.74,1.00)
54 KIT (0.83,1.13)
21 KL
55 KRAS
63 LAMB1*
LDB2
52 LMNB1*
49 LOX* (0.86,1.12)
43 MAP2K3 (0.79,1.06)
41 MMP14*
60 MTOR (0.81,1.07)
18 NOS3
16 NUDT6
61 PDGFA (0.85,1.12)
33 PDGFB (0.81,1.08)
37 PDGFC (0.82,1.06)
28 PDGFD (0.79,1.05)
57 PDGFRB (0.90,1.18)
PPAP2B
32 PRKCH
PTPRB
44 PTTG1* 16)
64 RAF1 (0.92,1.22)
RGS5
SDPR
39 SGK1
11 SHANK3
SNRK
46 SPP1*
13 TGFBR2
TIMP3
TPX2*
TSPAN7
70 TUBB2A*
51 UGCG
928 0.92 (0.81,1.04) 1.88 0.1708 0.4271
928 1.01 (0.88,1.17) 0.03 0.8727 0.7408
Genes marked with an asterisk (*) are associated such that increased expression is associated
with worse outcome
Genes in bold are the top 10 genes with respect to magnitude of the Hazard Ratio (HR)
Analysis in this table is adjusted for stage, necrosis status, tumor size, Fuhrman grade, nodal
status.
The 72-gene set was reduced further to 14 genes by exploring the
contribution of genes to the multi-gene models, consistency of performance across endpoints,
and analytical performance. Selection of the final set of 11 genes was based on multivariable
analyses which considered all possible combinations of the 14 genes and ranked models by
standardized hazard ratio for the multi-gene score (Crager, Journal of Applied Statistics 2012
February; 36(2),399-417) corrected for regression to the mean (RM). This method corrects
for selecting among combinations of genes and considers combinations selected from all 732
genes investigated in the gene identification study. The identified maximum RM-corrected
hazard ratio is ed (Crager, Stat Med. 2010 Jan 15;29(1):33-45.)) and provides a
realistic estimate of the performance of the given multi-gene model on an independent
dataset.
Additional considerations for gene selection included assay
performance of individual genes (heterogeneity) when assessed in fixed paraffin-embedded
tumor tissue, level and variability of gene expression, and functional form of the relationship
with clinical outcome.
The gene expression panel included cancer-related genes and reference
genes, as shown in Table 2.
Table 2: Gene Expression Panel
Cancer-related Gene   Accession Number   Reference Gene   Accession Number
APOLD1                NM_030817          AAMP             NM_001087
CCL5                  NM_002985          ARF1             NM_001658
CEACAM1               NM_001712          ATP5E            886
CX3CL1                NM_002996
                      NM_004095
                      0600
                      NM_005573
                      NM_000603
PPAP2B                NM_003713
TUBB2A                NM_001069
Overview of the Algorithm for Obtaining a Recurrence Score
After using quantitative RT-PCR to determine the mRNA expression
levels of the chosen genes, the genes were grouped into subsets. Genes known to be
associated with vascular and/or angiogenesis functions were grouped in a “vascular
normalization” gene group. Genes known to be associated with immune function were
grouped in an “immune response” gene group. Genes associated with key cell growth and
cell division pathway(s) were grouped in a “cell growth/division” gene group.
The gene expression for some genes may be thresholded if the
relationship between the term and the risk of recurrence is non-linear or expression of the
gene is relatively low. For example, when the expression of IL6 is found at <4 CT the value
is fixed at 4 CT.
In the next step, the measured tumor level of each mRNA in a subset
was multiplied by a coefficient reflecting its relative intra-set contribution to the risk of
cancer recurrence. This product was added to the other products between mRNA levels in
the subset and their coefficients, to yield a term, e.g. a vascular normalization term, a cell
growth/division term, and an immune response term. For example, the immune response
term is (0.5 CCL5 + CEACAM1 + CX3CL1) / 3 (see the Example below).
The contribution of each term to the overall recurrence score was then
weighted by use of a coefficient. For example, the immune response term was multiplied by
-0.31.
The sum of the terms obtained provided the recurrence score (RS).
A relationship between recurrence score (RS) and risk of recurrence
has been found by measuring expression of the test and reference genes in biopsied tumor
specimens from a population of patients with clear cell renal cell carcinoma and applying the
algorithm.
The RS scale generated by the algorithm of the present invention can
be adjusted in various ways. Thus, while the RS scale specifically described above
effectively runs from -3.2 to -0.2, the range could be selected such that the scale runs from 0 to
, 0 to 50, or 0 to 100, for example. For example, in a particular scaling approach, scaled
recurrence score (RS) is calculated on a scale of 0 to 100. For convenience, 10 CT units are
added to each measured CT value, and unscaled RS is calculated as described before. Scaled
recurrence score values are calculated using the equations shown below.
The Recurrence Score (RS) on a scale from 0 to 100 was derived from
the reference-normalized expression measurements as follows:
RSu =
- 0.45 × Vascular Normalization Gene Group Score
- 0.31 × Immune Response Gene Group Score
+ 0.27 × Cell Growth/Division Gene Group Score
+ 0.04 × IL6
where
Vascular Normalization Gene Group Score = (0.5 APOLD1 + 0.5 EDNRB + NOS3 + PPAP2B) / 4
Cell Growth/Division Gene Group Score = (EIF4EBP1 + 1.3 LMNB1 + TUBB2A) / 3
Immune Response Gene Group Score = (0.5 CCL5 + CEACAM1 + CX3CL1) / 3
The RSu (Recurrence Score unscaled) is then rescaled to be between 0 and 100:
RS = (RSu + 3.7) × 26.4.
If (RSu + 3.7) × 26.4 < 0, then RS = 0.
If (RSu + 3.7) × 26.4 > 100, then RS = 100.
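The calculation above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the study: the function names and the dictionary-based input are assumptions, the inputs are assumed to be reference-normalized expression values already on the scale the formula expects (any offset such as the 10 CT units mentioned above is assumed to have been applied upstream), and the IL6 threshold is applied at 4 CT as stated in the text.

```python
from typing import Dict, Mapping

IL6_FLOOR = 4.0  # per the text: IL6 values below 4 CT are fixed at 4 CT


def gene_group_scores(expr: Mapping[str, float]) -> Dict[str, float]:
    """Return the three gene group scores from normalized expression values."""
    vascular = (0.5 * expr["APOLD1"] + 0.5 * expr["EDNRB"]
                + expr["NOS3"] + expr["PPAP2B"]) / 4
    growth = (expr["EIF4EBP1"] + 1.3 * expr["LMNB1"] + expr["TUBB2A"]) / 3
    immune = (0.5 * expr["CCL5"] + expr["CEACAM1"] + expr["CX3CL1"]) / 3
    return {"vascular": vascular, "growth": growth, "immune": immune}


def unscaled_rs(expr: Mapping[str, float]) -> float:
    """Unscaled recurrence score (RSu): weighted sum of group scores and IL6."""
    groups = gene_group_scores(expr)
    il6 = max(expr["IL6"], IL6_FLOOR)  # threshold low IL6 expression
    return (-0.45 * groups["vascular"]
            - 0.31 * groups["immune"]
            + 0.27 * groups["growth"]
            + 0.04 * il6)


def scaled_rs(rsu: float) -> float:
    """Rescale RSu to the 0-100 range, clamping values outside the range."""
    return min(100.0, max(0.0, (rsu + 3.7) * 26.4))


if __name__ == "__main__":
    # Hypothetical expression values, chosen only to exercise the functions.
    genes = ["APOLD1", "EDNRB", "NOS3", "PPAP2B", "EIF4EBP1", "LMNB1",
             "TUBB2A", "CCL5", "CEACAM1", "CX3CL1", "IL6"]
    example = {gene: 8.0 for gene in genes}
    print(round(scaled_rs(unscaled_rs(example)), 2))
```

Keeping the group scores in a separate function mirrors the structure of the formula, so each coefficient can be checked against the text independently.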
EXAMPLE 3: PERFORMANCE OF THE ALGORITHM
The performance of the final genes included in the algorithm with and
without adjustment for correction for regression to the mean with respect to the endpoint of
recurrence is summarized in Table 3.
When using analyses that control the false discovery rate such as
Storey’s procedure, increasing the proportion of genes with little or no association decreases
the identification power even for genes truly associated with outcome. Therefore,
analyzing all of the genes together as one very large set can be expected to produce an
analysis with lower power to identify truly associated genes. To mitigate this issue, a
“separate class” analysis (Efron B. Simultaneous inference: When should hypothesis testing
problems be combined? Ann. Appl. Statist. 2008;2:197-223.) was done. In the separate class
analysis, false discovery rates are calculated within each gene class, using information from
all the genes to improve the accuracy of the calculation. Two gene classes were selected
prospectively on the basis of prior information and/or belief about their association with
cancer recurrence, and the remaining genes were placed in the third class.
Table 3: Performance of the Genes in the Algorithm
N | Class | Official Symbol | Higher expression: more (+)/less (-) risk | ASHR | SHR (95% CI) | p-value | q-value (FDR) | MLB ASHR | RM-Corrected ASHR |
---|---|---|---|---|---|---|---|---|---|
1 | 2 | PPAP2B | (-) | 2.00 | 0.50 (0.45, 0.55) | <0.001 | <0.001 | 1.73 | 1.97 |
2 | 1 | NOS3 | (-) | 1.83 | 0.55 (…, 0.62) | <0.001 | <0.001 | 1.59 | 1.80 |
3 | 2 | EDNRB | (-) | 1.78 | 0.56 (0.50, 0.63) | <0.001 | <0.001 | 1.58 | 1.76 |
4 | 2 | APOLD1 | (-) | 1.74 | 0.57 (0.51, 0.64) | <0.001 | <0.001 | 1.55 | 1.72 |
5 | 3 | CX3CL1 | (-) | 1.72 | 0.58 (0.52, 0.65) | <0.001 | <0.001 | 1.45 | 1.68 |
6 | 3 | CEACAM1 | (-) | 1.70 | 0.59 (0.51, 0.67) | <0.001 | <0.001 | 1.42 | 1.64 |
7 | 3 | IL6* | (+) | 1.38 | 1.38 (1.25, 1.53) | <0.001 | <0.001 | 1.24 | 1.35 |
8 | 3 | LMNB1 | (+) | 1.40 | 1.40 (1.23, 1.60) | <0.001 | <0.001 | 1.22 | 1.34 |
9 | 3 | EIF4EBP1 | (+) | 1.19 | 1.19 (1.04, 1.37) | 0.010 | 0.004 | 1.09 | 1.16 |
10 | 3 | TUBB2A | (+) | 1.09 | 1.09 (0.96, 1.24) | 0.200 | 0.054 | 1.03 | 1.07 |
11 | 1 | CCL5 | (-) | 1.01 | 0.99 (0.87, 1.13) | 0.894 | 0.125 | 1.01 | 1.03 |
Abbreviations: ASHR = absolute standardized hazard ratio, SHR = standardized hazard ratio, RM = regression to the mean
corrected, FDR = false discovery rate.
* IL6 expression thresholded at 4 CT.
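The relationship between the SHR and ASHR columns of Table 3 can be illustrated with a short sketch. It assumes the usual definition of the absolute standardized hazard ratio as the standardized hazard ratio expressed in the direction of increased risk, i.e. exp(|log SHR|), which is consistent with the tabulated values; the function names are hypothetical. Because the table reports rounded SHR values, recomputing ASHR from them may differ from the tabulated ASHR in the last digit.

```python
import math


def absolute_standardized_hr(shr: float) -> float:
    """Express a standardized hazard ratio in the direction of increased risk.

    Equivalent to max(shr, 1 / shr) for any positive shr.
    """
    if shr <= 0:
        raise ValueError("hazard ratio must be positive")
    return math.exp(abs(math.log(shr)))


def risk_direction(shr: float) -> str:
    """Return '(+)' if higher expression means more risk, '(-)' if less."""
    return "(+)" if shr > 1 else "(-)"


if __name__ == "__main__":
    # SHR values as printed in Table 3 for PPAP2B and IL6.
    for symbol, shr in [("PPAP2B", 0.50), ("IL6", 1.38)]:
        print(symbol, risk_direction(shr), round(absolute_standardized_hr(shr), 2))
```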
In the Cox model stratified by stage, the final Recurrence Score
yielded absolute standardized HR = 2.16 (95% CI 1.89, 2.48) and regression to the mean
corrected absolute standardized HR = 1.91 (95% CI 1.38, 2.30) for the association with
recurrence.
Performance of the Recurrence Score can also be demonstrated by the
predictiveness curves (Huang Y, Pepe MS, Feng Z. (2007). Evaluating the predictiveness of a
continuous marker. Biometrics 63:1181-1188.) shown in Figures 1A and 1B. These curves
are plots of the estimated risk of recurrence (vertical axis) against the population quantile
(rank) of the risk. The curve as a whole shows the population distribution of risk. More
effective prognostic scores separate lower risk patients from higher risk patients, which is
reflected by the curve separating from the average risk line. Risk cut-points can then be
applied to describe how many patients fall into various risk groups. For example, the
cut-points can be used to describe how many patients with stage 1 RCC have a risk > 16%.
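As a rough illustration of how a predictiveness curve and a risk cut-point relate, the sketch below orders a set of estimated risks, pairs each with its population quantile, and reports the fraction of patients above a cut-point. The risk values are made up for the example and the function names are hypothetical; the estimation of the risks themselves (e.g. from a Cox model) is outside the scope of this sketch.

```python
from typing import List, Sequence, Tuple


def predictiveness_curve(risks: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (population quantile, estimated risk) pairs, ordered by risk."""
    ordered = sorted(risks)
    n = len(ordered)
    return [((rank + 1) / n, risk) for rank, risk in enumerate(ordered)]


def fraction_above(risks: Sequence[float], cut_point: float) -> float:
    """Fraction of patients whose estimated risk exceeds the cut-point."""
    return sum(risk > cut_point for risk in risks) / len(risks)


if __name__ == "__main__":
    # Hypothetical estimated recurrence risks for four patients.
    example_risks = [0.40, 0.05, 0.20, 0.10]
    for quantile, risk in predictiveness_curve(example_risks):
        print(f"quantile {quantile:.2f}: risk {risk:.2f}")
    print("fraction with risk > 16%:", fraction_above(example_risks, 0.16))
```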
EXAMPLE 4: HETEROGENEITY STUDY
An internal study examining the variability due to tissue heterogeneity
was run on a sample of renal cancer fixed paraffin-embedded tissue (FPET) blocks. Eight (8)
patients with two (2) blocks for each patient and three (3) sections within each block were
assessed using the methods and algorithm provided in the above Examples. Heterogeneity
was measured by assessing between block variability and within block variability. The
between block variability measures the biological variability between FPET blocks within the
same patient. This provides an estimate of the population level variability. The within block
variability captures both the tissue heterogeneity within a block as well as the technical
assay-related variability. The normalized individual gene scores as well as the Recurrence Score
were calculated and within block, between block and between patient variability estimates
were generated. The results of the analysis are listed in Tables 4 and 5 below. The high ratio
of the between patient variability to the between and within block variability is generally
favorable. This indicates that the tissue heterogeneity and technical assay-related variability
is low compared with the clinically informative patient to patient variability in the individual
gene measurements and the Recurrence Score.
Table 4: Recurrence Score Variance Component Estimates
Variance Component SD Lower 95% Upper 95%
.60 10.15 33.32
Table 5: Individual Normalized Gene Variance Component Estimates
Gene | Variance Component | SD | Lower 95% | Upper 95% |
---|---|---|---|---|
AAMP.1 | Between Patient | 0.39 | 0.26 | 0.84 |
AAMP.1 | Between Block | | | |
AAMP.1 | Within Block | | | |
APOLD1.1 | Between Patient | | | |
APOLD1.1 | Between Block | | | |
APOLD1.1 | Within Block | | | |
ARF1.1 | Between Patient | | | |
ARF1.1 | Between Block | | | |
ARF1.1 | Within Block | | | |
ATP5E.1 | Between Patient | | | |
ATP5E.1 | Between Block | | | |
ATP5E.1 | Within Block | | | |
CCL5.2 | Between Patient | | | |
CCL5.2 | Between Block | | | |
CCL5.2 | Within Block | | | |
CEACAM1.1 | Between Patient | | | |
CEACAM1.1 | Between Block | | | |
CEACAM1.1 | Within Block | | | |
CX3CL1.1 | Between Patient | | | |
CX3CL1.1 | Between Block | | | |
CX3CL1.1 | Within Block | | | |
EDNRB.1 | Between Patient | | | |
EDNRB.1 | Between Block | 0.36 | 0.23 | 0.75 |
EDNRB.1 | Within Block | | | |
EIF4EBP1.1 | Between Patient | | | |
EIF4EBP1.1 | Between Block | | | |
EIF4EBP1.1 | Within Block | | | |
 | Between Patient | 0.43 | 0.28 | 0.88 |
 | Between Block | 0.04 | 0.02 | 0.12 |
 | Within Block | 0.04 | 0.04 | 0.06 |
IL6.3 | Between Patient | 1.24 | 0.81 | 2.60 |
IL6.3 | Between Block | 0.28 | 0.18 | 0.62 |
IL6.3 | Within Block | 0.19 | 0.15 | 0.25 |
LMNB1.1 | Within Block | 0.14 | 0.11 | 0.18 |
NOS3.1 | Between Patient | 0.75 | 0.48 | 1.69 |
NOS3.1 | Between Block | 0.32 | 0.21 | 0.70 |
NOS3.1 | Within Block | 0.21 | 0.17 | 0.28 |
PPAP2B.1 | Between Patient | 0.89 | 0.58 | 1.90 |
PPAP2B.1 | Between Block | | | |
PPAP2B.1 | Within Block | | | |
RPLP1.1 | Between Patient | | | |
RPLP1.1 | Between Block | | | |
RPLP1.1 | Within Block | 0.05 | 0.04 | 0.07 |
TUBB.1 | Between Patient | 0.52 | 0.34 | 1.07 |
TUBB.1 | Between Block | 0.00 | . | |
TUBB.1 | Within Block | 0.15 | 0.12 | 0.19 |
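The variance decomposition described in Example 4 (patients, blocks within patients, sections within blocks) can be sketched as follows. The source does not state which estimator was used, so this example substitutes a classical method-of-moments (nested ANOVA) estimator for a balanced design and truncates negative variance estimates at zero; both choices are assumptions, as are the function name and the nested-list input layout. Confidence limits are not computed here.

```python
from math import sqrt
from typing import Dict, List


def nested_variance_components(data: List[List[List[float]]]) -> Dict[str, float]:
    """Estimate SDs for a balanced nested design.

    data[i][j][k] is the score for patient i, block j, section k.
    Uses nested ANOVA mean squares (an assumed estimator, not necessarily
    the one used in the study); negative variance estimates are set to zero.
    """
    a = len(data)        # patients
    b = len(data[0])     # blocks per patient
    n = len(data[0][0])  # sections per block

    block_means = [[sum(block) / n for block in patient] for patient in data]
    patient_means = [sum(means) / b for means in block_means]
    grand_mean = sum(patient_means) / a

    ss_within = sum((y - block_means[i][j]) ** 2
                    for i, patient in enumerate(data)
                    for j, block in enumerate(patient)
                    for y in block)
    ss_block = n * sum((block_means[i][j] - patient_means[i]) ** 2
                       for i in range(a) for j in range(b))
    ss_patient = b * n * sum((m - grand_mean) ** 2 for m in patient_means)

    ms_within = ss_within / (a * b * (n - 1))
    ms_block = ss_block / (a * (b - 1))
    ms_patient = ss_patient / (a - 1)

    var_within = ms_within
    var_block = max(0.0, (ms_block - ms_within) / n)
    var_patient = max(0.0, (ms_patient - ms_block) / (b * n))

    return {
        "between_patient": sqrt(var_patient),
        "between_block": sqrt(var_block),
        "within_block": sqrt(var_within),
    }


if __name__ == "__main__":
    # Hypothetical scores: 8 patients x 2 blocks x 3 sections, with only a
    # patient-level effect, so block and section variability are zero.
    scores = [[[float(i)] * 3 for _ in range(2)] for i in range(8)]
    print(nested_variance_components(scores))
```

A high between-patient SD relative to the between-block and within-block SDs corresponds to the favorable ratio discussed in the text.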
EXAMPLE 5: ADDITIONAL MULTI-GENE COMBINATIONS
A number of alternative multi-gene models were also evaluated, using
either the dataset from the gene identification study or the dataset from the validation study.
Additional representative gene combinations tested on the dataset from the gene
identification study are shown in Table 6. Additional representative gene combinations tested
on the dataset from the validation study are shown in Table 7. Models 1-4 shown in Table 6
were not tested on the dataset from the validation study, and so are omitted from Table 7.
Those Tables both list estimated coefficients reflecting each gene’s relative weight in an
algorithm to predict the risk of cancer recurrence. The measured tumor level of each mRNA
encoding the specific genes used in the various models tested (e.g., model 11 included
APOLD1, NOS3, PPAP2B, and CEACAM1) was multiplied by the listed coefficient to
generate an alternative score. The performance of each alternative score, as measured by
absolute standardized hazard ratios and the corresponding 95% confidence intervals, is also
shown in the Tables. Where two genes are listed in the header row (e.g., APOLD1-EDNRB,
IL6-IL8), that column lists the coefficient of the average measured tumor level of the mRNA
encoding those two genes.
Table 6: Additional gene combinations tested on the dataset from the gene identification study, with estimated coefficients, absolute standardized hazard ratios, and 95% confidence intervals [table body not legible in source]
Table 7: Additional gene combinations tested on the dataset from the validation study, with estimated coefficients, absolute standardized hazard ratios, and 95% confidence intervals [table body not legible in source]
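The scoring rule described in Example 5 (each measured level multiplied by its listed coefficient, with a single coefficient applied to the average of two genes where a column names a gene pair) can be sketched as below. The coefficient values and gene levels are hypothetical placeholders, not values from Tables 6 and 7, and the function name and tuple-keyed input are assumptions made for the illustration.

```python
from typing import Mapping, Tuple


def alternative_score(levels: Mapping[str, float],
                      coefficients: Mapping[Tuple[str, ...], float]) -> float:
    """Sum coefficient x level over model terms.

    Each key of `coefficients` is a tuple of gene symbols. A one-gene tuple
    uses that gene's level; a two-gene tuple uses the average of both levels,
    matching the paired columns described in the text.
    """
    score = 0.0
    for genes, coefficient in coefficients.items():
        mean_level = sum(levels[gene] for gene in genes) / len(genes)
        score += coefficient * mean_level
    return score


if __name__ == "__main__":
    # Hypothetical measured levels and coefficients, for illustration only.
    example_levels = {"APOLD1": 6.0, "EDNRB": 8.0, "IL6": 4.0}
    example_model = {("APOLD1", "EDNRB"): -0.5, ("IL6",): 0.25}
    print(alternative_score(example_levels, example_model))
```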
Claims (13)
1. A method for predicting a risk of recurrence for a patient with kidney cancer, comprising: measuring, in a tumor sample obtained from the patient, a level of an RNA transcript of each of genes EIF4EBP1, LMNB1, and TUBB2A; normalizing the level of each RNA transcript against a level of an RNA transcript in the tumor sample from at least one reference gene, to provide a normalized level of each RNA transcript; calculating a quantitative score for the patient using the normalized RNA transcript levels, wherein the quantitative score comprises (EIF4EBP1 + 1.3(LMNB1) + TUBB2A)/3; predicting a risk of recurrence for the patient based on the quantitative score.
2. The method of claim 1, further comprising measuring a level of an RNA transcript from one or more of APOLD1, EDNRB, NOS3, and PPAP2B.
3. The method of claim 2, wherein the method comprises calculating a further quantitative score comprising: (0.5(APOLD1) + 0.5(EDNRB) + NOS3 + PPAP2B)/4.
4. The method of any one of claims 1-3, further comprising measuring a level of an RNA transcript from one or more of CCL5, CEACAM1, and .
5. The method of claim 4, wherein the method comprises calculating a further quantitative score comprising: (0.5(CCL5) + CEACAM1 + CXCL3)/3.
6. The method of any one of claims 1-5, wherein the levels of the RNA transcripts are determined by quantitative RT-PCR.
7. The method of any one of claims 1-6, wherein measuring the level of the RNA transcripts in the tumor sample comprises extracting RNA from the tumor sample, reverse transcribing the RNA transcript of each gene to produce a cDNA of each gene, amplifying the cDNA to produce amplicons of the RNA transcripts of the genes, and measuring levels of the amplicons of the genes.
8. The method of any one of claims 1-7, further comprising creating a report providing the quantitative score.
9. The method of any one of claims 1-8, wherein the kidney cancer is renal cell carcinoma (RCC).
10. The method of claim 9, wherein the RCC is clear cell renal cell carcinoma (ccRCC).
11. The method of any one of claims 1-10, wherein the tumor sample is to be obtained from a .
12. The method of any one of claims 1-11, wherein the tumor sample is fresh or frozen.
13. The method of any one of claims 1-11, wherein the tumor sample is paraffin-embedded and fixed.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361829100P | 2013-05-30 | 2013-05-30 | |
US61/829,100 | 2013-05-30 | ||
NZ711680A NZ711680B2 (en) | 2013-05-30 | 2014-05-29 | Gene expression profile algorithm for calculating a recurrence score for a patient with kidney cancer |
Publications (2)
Publication Number | Publication Date |
---|---|
NZ752676A NZ752676A (en) | 2021-05-28 |
NZ752676B2 true NZ752676B2 (en) | 2021-08-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7042784B2 (en) | How to Quantify Prostate Cancer Prognosis Using Gene Expression | |
JP7307602B2 (en) | Methods of using gene expression to determine the likelihood of clinical outcome in kidney cancer | |
JP2024037948A (en) | Methods to predict clinical outcome of cancer | |
US11551782B2 (en) | Gene expression profile algorithm for calculating a recurrence score for a patient with kidney cancer | |
WO2010127322A1 (en) | Gene expression profile algorithm and test for likelihood of recurrence of colorectal cancer and response to chemotherapy | |
AU2017268510A1 (en) | Method for using gene expression to determine prognosis of prostate cancer | |
US20110287958A1 (en) | Method for Using Gene Expression to Determine Colorectal Tumor Stage | |
NZ752676B2 (en) | Gene expression profile algorithm for calculating a recurrence score for a patient with kidney cancer | |
AU2015202116B2 (en) | Method to use gene expression to determine likelihood of clinical outcome of renal cancer | |
NZ711680B2 (en) | Gene expression profile algorithm for calculating a recurrence score for a patient with kidney cancer |