WO2021037134A1 - Molecular typing and survival risk gene group for lung adenocarcinoma, and diagnostic products and applications thereof - Google Patents

Molecular typing and survival risk gene group for lung adenocarcinoma, and diagnostic products and applications thereof

Info

Publication number
WO2021037134A1
Authority
WO
WIPO (PCT)
Prior art keywords
related genes
genes
gene
immune
proliferation
Prior art date
Application number
PCT/CN2020/111702
Other languages
English (en)
French (fr)
Inventor
周彤
周伟庆
胡志元
马琳琳
陆俊欢
Original Assignee
上海善准生物科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海善准生物科技有限公司
Priority to CN202080062083.0A (patent CN114341367B)
Priority to US17/753,254 (patent US20220364183A1)
Publication of WO2021037134A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/111 General methods applicable to biologically active non-coding nucleic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 Libraries per se, e.g. arrays, mixtures
    • C40B40/04 Libraries containing only organic compounds
    • C40B40/06 Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/112 Disease subtyping, staging or classification
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/16 Primer sets for multiplex assays

Definitions

  • the invention belongs to the field of biotechnology, and specifically relates to a gene group used for determining the subtype of lung adenocarcinoma and evaluating the survival risk of subjects, as well as in vitro diagnostic products and applications thereof.
  • tissue microarray and immunohistochemistry technologies are used to detect genes that may affect the prognosis of lung cancer; combined with patients' clinicopathological characteristics and prognosis data, statistical methods are used to screen for and construct an individualized prognostic prediction model for lung cancer, which is then validated. After surgery, the model can be used to predict the survival of lung cancer patients at 5 years or more. Patients with a low risk of recurrence may consider forgoing radiotherapy and chemotherapy, which reduces adverse reactions and the economic burden of treatment; patients with a high risk of recurrence are advised to receive chemotherapy, radiotherapy or biological therapy promptly to obtain the greatest clinical benefit.
  • molecular diagnosis based on expression profiles can help identify groups that can benefit from a treatment plan, improve treatment efficiency, and avoid ineffective treatments.
  • studies have shown that a prognostic model combined with genomics can provide better risk stratification and prognosis assessment for lung cancer patients than clinical parameters alone.
  • Lung cancer is mainly divided into two categories: small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC).
  • the latter includes adenocarcinoma, squamous cell carcinoma, large cell carcinoma and other types, accounting for more than 80% of all lung cancers.
  • Lung adenocarcinoma is one of the main types. Its molecular pathogenesis is complex and targeted therapies are relatively abundant.
  • Faruki and Mayhew et al. (Faruki H, et al., Journal of thoracic oncology: official publication of the International Association for the Study of Lung Cancer.
  • lung squamous cell carcinoma found that the expression profile characteristics of lung adenocarcinoma and lung squamous cell carcinoma were significantly different. They divided lung adenocarcinoma into three subtypes (TRU, PP and PI) and found that the subtype of lung adenocarcinoma can serve as a marker of tumor immune cell expression and PD-L1 expression. Chinnaiyan et al. (Shukla S, et al., Journal of the National Cancer Institute. 2017, 109(1)) established a 4-gene combination that divides lung adenocarcinoma patients into high-risk and low-risk groups; the prognosis of the high-risk group was significantly worse than that of the low-risk group.
  • the present invention provides a group of genes for determining molecular typing of lung adenocarcinoma and/or assessing the survival risk of lung adenocarcinoma patients, which includes genes related to molecular typing and survival risk assessment.
  • the gene group further includes a reference gene.
  • the molecular classification of lung adenocarcinoma includes LAD1 type, LAD2 type, LAD3 type, LAD4 type, LAD5 type and mixed type.
  • the present invention also provides reagents for detecting the expression level of genes in the gene group of the present invention.
  • the reagent is a reagent for detecting the amount of RNA, especially mRNA, transcribed from the gene of the present invention; or it is a reagent for detecting the amount of cDNA complementary to mRNA.
  • the reagent is a primer, a probe, or a combination thereof.
  • the present invention also provides a product for molecular typing and/or survival risk assessment of lung adenocarcinoma, which comprises the reagent of the present invention.
  • the present invention also provides the application of the gene group or reagent of the present invention in the preparation of products.
  • the product is used to determine the molecular classification of lung adenocarcinoma and/or to assess the survival risk of patients with lung adenocarcinoma.
  • the product is a second-generation sequencing kit, a real-time fluorescent quantitative PCR detection kit, a gene chip, a protein microarray, an ELISA diagnostic kit, or an immunohistochemistry (IHC) kit.
  • the product is a second-generation sequencing kit or a real-time fluorescent quantitative PCR detection kit.
  • the present invention also provides a method for determining the molecular typing and/or survival risk of lung adenocarcinoma in a subject, the method comprising: (1) providing a sample of the subject; (2) determining the expression level of genes in the gene group of the present invention in the sample; (3) determining the molecular typing and/or survival risk of lung adenocarcinoma of the subject.
  • Figure 1 shows an expression heat map of the lung adenocarcinoma molecular typing and survival risk-related genes (proliferation-related genes, immune-related genes, and intercellular substance-related genes) in the LAD1, LAD2, LAD3, LAD4, LAD5 and mixed (Mixed) subtypes.
  • Figure 2 shows the Kaplan-Meier survival curve, indicating that each subtype of lung adenocarcinoma has a different survival risk.
  • the 5-year survival rate of LAD1 subtype is good
  • the 5-year survival rate of LAD2 subtype and LAD4 subtype is relatively poor
  • the prognosis of LAD3 subtype and LAD5 subtype is moderate.
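
Kaplan-Meier curves such as the one described for Figure 2 are conventionally built with the product-limit estimator. The minimal sketch below illustrates that estimator only; the function name `kaplan_meier` and the follow-up data are made up for illustration and are not taken from the patent.

```python
def kaplan_meier(durations, events):
    """Product-limit estimate of the survival function.

    durations: follow-up time per subject (e.g. months).
    events:    True/1 if death was observed, False/0 if censored.
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += int(bool(data[i][1]))
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set.
        at_risk -= leaving
    return curve


# Toy data (hypothetical): four subjects, the second one censored.
toy_curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
print(toy_curve)
```

Computing one such curve per subtype and plotting them together gives the kind of comparison the figure description refers to.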
  • Figure 3 shows the Kaplan-Meier survival curve, indicating that the proliferation index can indicate the prognosis of lung adenocarcinoma.
  • lung adenocarcinoma cases can be divided into two groups with fast proliferation and slow proliferation. The 5-year survival rate of the fast proliferation group is lower.
  • Figure 4 shows the Kaplan-Meier survival curve, indicating that the immune index can indicate the prognosis of lung adenocarcinoma.
  • lung adenocarcinoma cases can be divided into two groups with strong immune index and weak immune index.
  • the 5-year survival rate of the strong immune index group is higher.
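
This passage does not state how the proliferation index or immune index is computed, nor where the cutoff between the two groups lies. As a purely hypothetical sketch, the code below takes an index to be the mean of normalized expression values over a gene set and splits cases at the cohort median; both choices, the function names and the sample values are assumptions made for illustration.

```python
from statistics import mean, median


def category_index(expression, genes):
    """Hypothetical index: mean normalized expression over a gene set."""
    return mean(expression[g] for g in genes)


def split_by_median(indices):
    """Label each sample 'high' or 'low' relative to the cohort median.

    Assumption: the median is used as cutoff; the source gives no cutoff.
    """
    cutoff = median(indices.values())
    return {s: ("high" if v > cutoff else "low") for s, v in indices.items()}


# Made-up normalized expression values for two proliferation-related genes.
cohort = {
    "S1": {"MKI67": 2.0, "TOP2A": 4.0},
    "S2": {"MKI67": 6.0, "TOP2A": 8.0},
    "S3": {"MKI67": 1.0, "TOP2A": 1.0},
}
indices = {s: category_index(expr, ["MKI67", "TOP2A"]) for s, expr in cohort.items()}
groups = split_by_median(indices)
print(indices, groups)
```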
  • Figure 5 shows the Kaplan-Meier survival curve, which indicates that the lung adenocarcinoma survival risk index calculated according to the subtype, proliferation index, and immune index can indicate survival risk.
  • the 5-year survival rate of the low-risk group (survival risk index 0-35) is higher, that of the medium-risk group (survival risk index 36-70) is intermediate, and that of the high-risk group (survival risk index 71-100) is lower.
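
The three index bands given for Figure 5 can be expressed as a small lookup. This is a sketch that assumes the survival risk index is an integer from 0 to 100; the function name `risk_group` is illustrative, and how the index itself is calculated is not shown here.

```python
def risk_group(index: int) -> str:
    """Map a survival risk index (assumed integer, 0-100) to its risk band."""
    if not 0 <= index <= 100:
        raise ValueError("survival risk index must be between 0 and 100")
    if index <= 35:
        return "low"
    if index <= 70:
        return "medium"
    return "high"


for value in (0, 35, 36, 70, 71, 100):
    print(value, risk_group(value))
```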
  • Figure 6 shows a pie chart of statistical data for molecular typing and risk assessment of 21 lung adenocarcinoma samples.
  • the detection of the gene expression level described herein can be achieved, for example, by detecting the amount of the target nucleic acid (for example, RNA transcript), or by detecting the amount of the target polypeptide (for example, the encoded protein), for example, by measuring the protein expression level with proteomics methods.
  • the amount of the target polypeptide such as the amount of the polypeptide, protein or protein fragment encoded by the target gene, can be normalized to the amount of the total protein in the sample or the amount of the polypeptide encoded by the reference gene.
  • the amount of target nucleic acid, such as the amount of target gene DNA, its RNA transcript, or cDNA complementary to the RNA transcript, can be normalized to the amount of total DNA, total RNA or total cDNA in the sample, or to the amount of DNA, RNA transcript or cDNA complementary to the RNA transcript of a set of reference genes.
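
Normalizing target amounts to a set of reference genes can be sketched as follows. The text does not say how the reference amounts are combined, so dividing by their arithmetic mean is an assumption, as are the function name `normalize_to_reference` and the made-up amounts.

```python
from statistics import mean


def normalize_to_reference(amounts, reference_genes):
    """Divide each target amount by the mean amount of the reference genes.

    Assumption: the arithmetic mean summarizes the reference set.
    Returns normalized values for the non-reference (target) genes only.
    """
    ref_level = mean(amounts[g] for g in reference_genes)
    if ref_level <= 0:
        raise ValueError("reference level must be positive")
    return {g: a / ref_level for g, a in amounts.items() if g not in reference_genes}


# Hypothetical measured amounts (e.g. read counts) for one sample.
sample = {"MKI67": 50.0, "CD4": 200.0, "GAPDH": 150.0, "GUSB": 50.0, "TFRC": 100.0}
normalized = normalize_to_reference(sample, ("GAPDH", "GUSB", "TFRC"))
print(normalized)
```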
  • polypeptide refers to a compound composed of amino acids connected by peptide bonds, including full-length polypeptides or amino acid fragments.
  • polypeptide and protein can be used interchangeably.
  • nucleotide includes deoxyribonucleotides and ribonucleotides.
  • nucleic acid refers to a polymer composed of two or more nucleotides, and encompasses deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and nucleic acid analogs.
  • RNA transcript refers to total RNA, that is, coding or non-coding RNA, including RNA directly derived from tissue or peripheral blood samples, as well as RNA indirectly derived from tissue or blood samples after cell lysis.
  • Total RNA includes tRNA, mRNA, and rRNA, where mRNA includes mRNA transcribed from target genes, as well as mRNA from other non-target genes.
  • mRNA can include precursor mRNA and mature mRNA, and it can be either the full length of the mRNA or its fragments.
  • the RNA that can be used for detection is preferably mRNA, and more preferably mature mRNA.
  • cDNA refers to DNA having a complementary base sequence to RNA. Those skilled in the art can apply methods known in the art to obtain RNA transcripts of genes and/or cDNAs complementary to their RNA transcripts from the DNA of genes, for example, by chemical synthesis methods or molecular cloning methods.
  • the target nucleic acid can be detected and quantified, for example, by hybridization, amplification, or sequencing.
  • the RNA transcript is hybridized with the probe or primer to form a complex, and the amount of the target nucleic acid is obtained by detecting the amount of the complex.
  • hybridization refers to the process of combining two nucleic acid fragments through stable and specific hydrogen bonds to form a double helix complex under appropriate conditions.
  • amplification primer refers to a nucleic acid fragment containing 5-100 nucleotides, preferably 15-30 nucleotides, that is capable of initiating an enzymatic reaction (e.g., an enzymatic amplification reaction).
  • hybridization probe refers to a nucleic acid sequence (which can be DNA or RNA) comprising at least 5 nucleotides, for example, 5 to 100 nucleotides, which can hybridize with a target nucleic acid (for example, the RNA transcript of the target gene, the amplified product of the RNA transcript, or cDNA complementary to the RNA transcript) to form a complex.
  • the hybridization probe may also include a marker for detection.
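
The length ranges in the primer and probe definitions above lend themselves to a simple check. The sketch below encodes only those ranges (5-100 nucleotides, preferably 15-30, for primers; at least 5 nucleotides for probes); the function names and the example sequences are invented for illustration and are not sequences from the patent.

```python
def primer_length_class(sequence: str) -> str:
    """Classify a primer by length against the ranges in the definition."""
    n = len(sequence)
    if 15 <= n <= 30:
        return "preferred"
    if 5 <= n <= 100:
        return "allowed"
    return "out of range"


def probe_length_ok(sequence: str) -> bool:
    """A hybridization probe comprises at least 5 nucleotides."""
    return len(sequence) >= 5


# Made-up sequences, used only to exercise the length checks.
for seq in ("ACGTACGTACGTACGTACGT", "ACGTA", "ACG"):
    print(len(seq), primer_length_class(seq), probe_length_ok(seq))
```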
  • TaqMan probe is a probe based on TaqMan technology.
  • Its 5' end carries a fluorescent group, such as FAM, TET, HEX, NED, VIC or Cy5.
  • its 3' end carries a fluorescent quenching group (such as TAMRA or a BHQ group) or a non-fluorescent quenching group (TaqMan MGB probe). The probe has a nucleotide sequence that can hybridize with the target nucleic acid and, when applied to real-time fluorescent quantitative PCR (RT-PCR), can report the amount of nucleic acid that forms a complex with it.
  • reference gene refers to a gene that can be used as a reference to correct and standardize the expression level of the target gene.
  • the inclusion criteria that can be considered for reference genes include: (1) stable expression in tissues, with an expression level that is unaffected or only slightly affected by pathological conditions or drug treatment; (2) an expression level that is not too high, so that the reference gene does not account for too large a proportion of the expression data (such as data obtained through second-generation sequencing) and thereby affect the accuracy of detection and interpretation of data for other genes. Reagents that can be used to detect the expression level of the reference genes of the present invention are therefore also within the protection scope of the present invention.
  • Reference genes that can be used in the present invention include but are not limited to "housekeeping genes". In this article, “reference gene”, “internal reference gene” and “housekeeping gene” can be used interchangeably.
  • housekeeping gene refers to a type of gene whose product is necessary to maintain the basic life activities of cells, and is continuously expressed in most or almost all tissues at various stages of individual growth, and the expression level is less affected by environmental factors.
  • lung adenocarcinoma refers to a type of lung cancer, which belongs to non-small cell lung cancer, which originates from the bronchial mucosal epithelium, and a few originate from the mucous glands of the large bronchi.
  • the incidence rate is lower than that of squamous cell carcinoma and undifferentiated carcinoma, and the age of onset is younger, and it is relatively more common in women.
  • Most adenocarcinomas originate from smaller bronchi and are peripheral lung cancers. Lung adenocarcinoma is more likely to occur in young women, with a history of smoking, and Asian ethnic groups.
  • lung adenocarcinoma includes, but is not limited to, primary lung adenocarcinoma and metastatic lung adenocarcinoma.
  • the term "molecular classification of lung adenocarcinoma” refers to a lung adenocarcinoma classification method established based on the gene expression profile of lung adenocarcinoma tumor tissue.
  • prognosis refers to the prediction of the course and development of lung adenocarcinoma, including but not limited to the prediction of the survival risk of lung adenocarcinoma. Lung cancer with a lower survival risk has a better prognosis, and vice versa.
  • the "survival risk assessment” in this article refers to the assessment of the possibility of lung adenocarcinoma patients' disease progression or death due to lung adenocarcinoma and related causes within a specified period starting from randomization.
  • disease progression includes but is not limited to tumor cell increase, reappearance and metastasis.
  • relapse risk and “survival risk” can be used interchangeably.
  • the survival risk assessment is carried out by calculating the survival risk score (also called the survival risk index).
  • the present invention provides a set of gene clusters, which include genes related to lung adenocarcinoma molecular typing and survival risk assessment.
  • the lung adenocarcinoma molecular typing and survival risk assessment related genes of the present invention may include: (1) 69 proliferation-related genes, (2) 73 immune-related genes, and (3) 38 intercellular substance-related genes.
  • proliferation-related genes include: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, HMMR, KIF20A, FOXM1, MKI67, KIF14, TK1, HJURP, TPX2, EXO1, KIF11, NEK2, KIF23, CDCA3, CDK1, SPAG5, KIF4A, GTSE1, CDKN3, CDC25C, PRR11, CCNB2, MAD2L1, PKMYT1, CENPE, ASPM, CENPF, BUB1, NDC80, NUSAP1, CEP55, NCAPG, BIRC5, ZWINT, TTK, ESPL1, DEPDC1, MELK, CDC20, CDC6, AURKA, NEIL3, CDT1, KIF2C, KIFC1, NCAPH, KIF18B, AURKB, UBE2C, TYMS, TOP2A, PBK, CDC45, CDCA8, CENPA, MYBL2, SKA1, M
  • (2) 73 immune-related genes include: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FGL2, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, LCP1, SPIB, CD53, CD3E, SLCO2B1, MS4A6A, CYBB, CD4, SH2D1A, TFEC, LYZ, ITGAM, TLR8, CSF1R, CXCL13, GPNMB, CCR5, HK3, CMKLR1, IL2RG, TYROBP, HCK, ITGB2, LAPTM5, SIGLEC1, AOAH, C3AR1, MSR1, IL2RA, CCL5, ADAMDEC1, LILRB4, CXCL11, FPR3, SELL, C
  • (3) 38 intercellular substance-related genes include: LOXL2, SPOCK1, COL1A1, POSTN, ADAM12, COL6A2, COL5A1, COL11A1, COL5A2, COL1A2, MXRA5, THBS2, INHBA, VCAN, ADAMTS12, GREM1, COL3A1, SULF1, ADAMTS2, PRRX1, COL15A1, SPARC, THY1, FAP, DIO2, FN1, COL6A3, FBN1, SYNDIG1, AEBP1, LRRC15, CILP, ISLR, GAS1, COL10A1, ASPN, MMP2 and EPYC.
  • the present invention provides a set of gene groups, which include genes related to lung adenocarcinoma molecular typing and survival risk assessment, that is, (1) one or more of the 69 proliferation-related genes as described above, (2) one or more of the 73 immune-related genes, and (3) one or more of the 38 intercellular substance-related genes.
  • the gene group includes 180 genes related to lung adenocarcinoma molecular typing and survival risk assessment (see Table 1), which include (1) 69 proliferation-related genes as described above; (2) 73 immune-related genes; and (3) 38 intercellular substance-related genes.
  • the gene group includes 70 genes related to lung adenocarcinoma molecular typing and survival risk assessment (see Table 2), which include (1) 23 proliferation-related genes: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, FOXM1, MKI67, KIF14, HJURP, TPX2, NEK2, CDK1, CDKN3, ASPM, CEP55, BIRC5, MELK, CDC20, TYMS, AURKA and TOP2A; (2) 30 immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, SPIB, CD53, CD4 and LYZ; and (3) 17
  • the gene group includes 24 genes related to lung adenocarcinoma molecular typing and survival risk assessment (see Table 3), which include (1) 9 proliferation-related genes: PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20, TYMS and TOP2A; (2) 9 immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R and CD4; and (3) 6 intercellular substance-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2 and COL5A1.
  • the gene group includes 21 genes related to lung adenocarcinoma molecular typing and survival risk assessment (see Table 4), which include (1) 8 proliferation-related genes: PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20 and TOP2A; (2) 7 immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7 and IL7R; and (3) 6 intercellular substance-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2 and COL5A1.
  • the gene group may also include reference genes.
  • the reference gene is a housekeeping gene.
  • Housekeeping genes that can be used in the present invention include but are not limited to one or more of the following: GAPDH, GUSB, MRPL19, PSMC4, SF3A1, TFRC, ACTB, and RPLP0.
  • the gene group of the present invention may also include at least one (for example, 1, 2, 3, 4, 5, 6, 7 or 8), preferably at least 3, and most preferably 6, of the following reference genes: GAPDH, GUSB, MRPL19, PSMC4, SF3A1, TFRC, ACTB and RPLP0.
  • the reference genes include GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC. In another specific embodiment, the reference genes include GAPDH, GUSB and TFRC. In another specific embodiment, the reference gene includes ACTB.
  • the gene group of the present invention includes 180 genes related to molecular typing and survival risk assessment as described above, and reference genes.
  • the reference genes include GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC, and the gene groups are shown in Table 1.
  • the gene group of the present invention includes 70 genes related to molecular typing and survival risk assessment as described above, and reference genes.
  • the reference genes include GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC, and the gene groups are shown in Table 2.
  • the gene group of the present invention includes the 24 molecular typing and survival risk assessment related genes as described above, and reference genes.
  • the reference gene includes ACTB, and the gene group is shown in Table 3.
  • the gene group of the present invention includes 21 molecular typing and survival risk assessment related genes as described above, and reference genes.
  • the reference gene includes three of GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC.
  • the reference genes include GAPDH, GUSB and TFRC, and the gene groups are shown in Table 4.
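
A gene group of this kind maps naturally onto a small data structure. The sketch below encodes the 21-gene group and its three reference genes as listed in the text for Table 4 and checks the stated category sizes; the variable names and the `panel_size` helper are illustrative assumptions.

```python
# Genes copied from the 21-gene embodiment described in the text.
PANEL_21 = {
    "proliferation": ["PLK1", "PRC1", "CCNB1", "MKI67", "TPX2", "MELK", "CDC20", "TOP2A"],
    "immune": ["P2RY13", "CCR2", "PTPRC", "IRF8", "CLEC10A", "TLR7", "IL7R"],
    "intercellular_substance": ["SPOCK1", "COL1A1", "POSTN", "ADAM12", "COL6A2", "COL5A1"],
}
REFERENCE_GENES = ["GAPDH", "GUSB", "TFRC"]


def panel_size(panel):
    """Return the number of distinct genes, rejecting duplicates across categories."""
    genes = [g for members in panel.values() for g in members]
    if len(genes) != len(set(genes)):
        raise ValueError("a gene appears more than once in the panel")
    return len(genes)


print({category: len(members) for category, members in PANEL_21.items()})
print(panel_size(PANEL_21), "typing/risk genes +", len(REFERENCE_GENES), "reference genes")
```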
  • Proliferation-related genes KIF14
  • 13 Proliferation-related genes TK1
  • 14 Proliferation-related genes HJURP
  • 15 Proliferation-related genes TPX2
  • 16 Proliferation-related genes EXO1
  • 17 Proliferation-related genes KIF11
  • 18 Proliferation-related genes NEK2
  • 19 Proliferation-related genes KIF23
  • 20 Proliferation-related genes CDCA3
  • 21 Proliferation-related genes CDK1
  • 22 Proliferation-related genes SPAG5
  • 23 Proliferation-related genes KIF4A
  • 24 Proliferation-related genes GTSE1
  • 25 Proliferation-related genes CDKN3
  • 26 Proliferation-related genes CDC25C
  • 27 Proliferation-related genes PRR11
  • 28 Proliferation-related genes CCNB2
  • 29 Proliferation-related genes MAD2L1
  • 30 Proliferation-related genes PKMYT1
  • 31 Proliferation-related genes CENPE
  • 32 Proliferation-related genes ASPM
  • 33 Proliferation-related genes CENPF
  • 34 Proliferation-related genes BUB1
  • 35 Proliferation-related genes NDC80
  • Housekeeping gene PSMC4
  • Housekeeping gene SF3A1
  • 186 Housekeeping gene TFRC
  • the gene group of the present invention can be used to determine the molecular classification (subtype classification) of lung adenocarcinoma and/or to assess the survival risk of patients with lung adenocarcinoma.
  • the molecular classification of lung adenocarcinoma can include LAD1, LAD2, LAD3, LAD4, LAD5 and mixed types. Survival risks can include low-risk, medium-risk, and high-risk.
  • the gene group of the present invention is not limited to the combinations listed above.
  • those skilled in the art should be able to combine the genes related to molecular typing and survival risk assessment of the present invention and reference genes to obtain a gene group containing a combination of different genes.
  • These gene groups are also included within the protection scope of the present invention.
  • the present invention relates to a reagent for detecting the expression level of genes in the gene group of the present invention and its application in the preparation of detection/diagnostic products.
  • the gene group is as described above.
  • the reagent or the detection/diagnostic product can be used to determine the molecular typing of lung adenocarcinoma and/or to assess the survival risk of patients with lung adenocarcinoma.
  • the selection in the reagent or product may each correspond to a gene in the gene group of the present invention.
  • where the primers of SEQ ID NO. 153-202 or the probes of SEQ ID NO. 203-227 are mentioned, this does not mean that the reagent or product of the present invention must contain all of these primers or probes; it means that the reagent or product will contain those primers or probes corresponding to the genes covered therein.
  • the reagent is used to detect the amount of target nucleic acid (such as DNA, RNA transcript or cDNA complementary to the RNA transcript of genes in the gene group of the present invention); preferably, it is used to detect the amount of RNA transcripts, particularly mRNA, of genes in the gene group of the invention, or the amount of cDNA complementary to the mRNA.
  • the reagent is a reagent for detecting the amount of RNA transcripts, particularly mRNA, of a target gene (ie, a gene in the gene group of the present invention).
  • the reagent is a reagent for detecting the amount of cDNA complementary to the mRNA.
  • the reagent is a probe or primer or a combination thereof, which can hybridize to a partial sequence of a target nucleic acid (for example, a gene of the gene group of the present invention, its RNA transcript or a cDNA complementary to the RNA transcript) to form a complex.
  • the probes and primers are highly specific to the target nucleic acid. Probes and primers can be artificially synthesized.
  • the reagent is a primer.
  • the primer has the sequence shown in SEQ ID NO. 1-152 (see also Table 5).
  • the primer has the sequence shown in SEQ ID NO. 1-6, 17, 18, 23, 24, 37-40, 45-58, 61, 62, 107-118, 141-144, 151 and 152 (see also Table 6).
  • the primer has a sequence as shown in SEQ ID NO. 153-202 (see also Table 7).
  • the primer has the sequence shown in SEQ ID NO. 228-275 (see also Table 8).
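
The SEQ ID NO. lists above mix ranges and single numbers, so expanding them programmatically makes it easy to see how many sequences each embodiment covers. The parser below is a small sketch; the function name `expand_seq_ids` is an assumption, and the input strings are the number lists quoted in the text.

```python
import re


def expand_seq_ids(spec: str) -> list:
    """Expand a list such as '1-6, 17, 18 and 23' into individual SEQ ID numbers."""
    ids = []
    for token in re.split(r",|\band\b", spec):
        token = token.strip()
        if not token:
            continue
        if "-" in token:
            start, end = (int(part) for part in token.split("-"))
            ids.extend(range(start, end + 1))
        else:
            ids.append(int(token))
    return ids


table_6 = expand_seq_ids(
    "1-6, 17, 18, 23, 24, 37-40, 45-58, 61, 62, 107-118, 141-144, 151 and 152"
)
print(len(expand_seq_ids("1-152")), len(table_6))
print(len(expand_seq_ids("153-202")), len(expand_seq_ids("228-275")))
```

The Table 6 list expands to 48 sequences, which would be consistent with one forward and one reverse primer for each of 24 genes, although this passage does not state that pairing explicitly.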
  • the primers are used for next-generation sequencing, preferably for targeted sequencing.
  • the primer is used for targeted sequencing and has the sequence shown in SEQ ID NO. 1-152 (Table 5).
  • the primers are used for targeted sequencing and have the sequences shown in SEQ ID NO. 1-6, 17, 18, 23, 24, 37-40, 45-58, 61, 62, 107-118, 141-144, 151 and 152 (Table 6).
  • the primers are used for quantitative PCR, preferably real-time fluorescent quantitative PCR (RT-PCR), such as SYBR Green RT-PCR based on SYBR Green dye and TaqMan RT-PCR based on TaqMan technology.
  • the primer is used for SYBR Green RT-PCR, and has the sequence shown in SEQ ID No. 153-202 (see also Table 7) or the sequence shown in SEQ ID No. 228-275 (See also Table 8).
  • the primer is used for TaqMan RT-PCR, and has the sequence shown in SEQ ID No. 153-202 (Table 7) or the sequence shown in SEQ ID No.
  • the primers are used for multiplex RT-PCR and have the sequence shown in SEQ ID NO. 153-202 (Table 7). In another specific embodiment, the primers are used for single-plex or multiplex RT-PCR and have the sequence shown in SEQ ID NO. 228-275 (Table 8).
  • the primer is used to prepare a detection/diagnostic product
  • the product is a second-generation sequencing kit based on targeted sequencing or a real-time fluorescent quantitative PCR kit.
  • the reagents are probes, including but not limited to probes for RT-PCR, in situ hybridization (ISH), DNA blotting (Southern blotting) or RNA blotting (Northern blotting), gene chip technology and the like.
  • the probe is a probe that can be used for in situ hybridization.
  • the probe used for in situ hybridization can be, for example, a probe for dual-color silver-stained in situ hybridization (DISH), DNA fluorescence in situ hybridization (DNA-FISH), RNA fluorescence in situ hybridization (RNA-FISH), chromogenic in situ hybridization (CISH) and the like. The probe may carry a label, and the label may be a fluorescent group (for example, Alexa Fluor dye, FITC, Texas Red, Cy3, Cy5, etc.), biotin, digoxigenin, etc.
  • the probe can be used for gene chip detection, and the probe may also have a label, and the label may be a fluorescent group.
  • the probe can be used to prepare a detection/diagnostic product, and the product is a gene chip.
  • the probe is used for RT-PCR. In one embodiment, the probe is used for TaqMan RT-PCR. In one embodiment, the probe is a TaqMan probe. In one embodiment, the probe has the sequence shown in SEQ ID NO. 203-227 (see also Table 7). In a specific embodiment, the probe is a TaqMan probe with the sequence shown in SEQ ID NO. 203-227. In another embodiment, the probe has the sequence shown in SEQ ID NO. 276-299 (see also Table 8). In a specific embodiment, the probe is a TaqMan probe with the sequence shown in SEQ ID NO. 276-299.
  • the probe can be used to prepare a detection/diagnostic product, and the product is a real-time fluorescent quantitative PCR detection kit.
  • the reagent is a combination of primers and probes.
  • the probe is a TaqMan probe.
  • the combination of primers and probes is used for RT-PCR, such as single-plex or multiplex RT-PCR.
  • the primer has the sequence shown in SEQ ID NO. 153-202.
  • the primer has the sequence shown in SEQ ID NO. 228-275.
  • the probe has the sequence shown in SEQ ID NO. 203-227.
  • the probe has the sequence shown in SEQ ID NO. 276-299.
  • the primer has the sequence shown in SEQ ID NO. 153-202, and the probe is a TaqMan probe with the sequence shown in SEQ ID NO. 203-227 (see also Table 7).
  • the primer has the sequence shown in SEQ ID NO. 228-275, and the probe is a TaqMan probe with the sequence shown in SEQ ID NO. 276-299 (see also Table 8).
  • the probes and primers can be used to prepare diagnostic products, which are real-time fluorescent quantitative PCR detection kits, such as multiplex or single-plex real-time fluorescent quantitative PCR detection kits.
  • the reagent is used to detect the amount of a polypeptide encoded by a target gene (a gene in the gene group of the present invention).
  • the reagent is an antibody, antibody fragment or affinity protein, which can specifically bind to the polypeptide encoded by the target gene. More preferably, the reagent is an antibody or antibody fragment that can specifically bind to the polypeptide encoded by the target gene.
  • the antibodies, antibody fragments or affinity proteins may also carry labels for detection, such as enzymes (such as horseradish peroxidase), radioisotopes, fluorescent labels (such as Alexa Fluor dye, FITC, Texas Red, Cy3, Cy5, etc.), chemiluminescent substances (such as luminol), biotin, quantum dot labels (Qdot), etc.
  • the reagent is an antibody or antibody fragment that can specifically bind to the polypeptide encoded by the target gene, and optionally carries a label for detection, the label being selected from the group consisting of enzymes, radioisotopes, fluorescent labels, chemiluminescent substances, biotin and quantum dot labels.
  • the reagent is used to prepare a detection/diagnostic product, and the product is a protein chip (such as a protein microarray), an ELISA diagnostic kit or an immunohistochemistry (IHC) kit.
  • the present invention provides a product that can be used to determine the molecular typing of lung adenocarcinoma and/or assess the survival risk of patients with lung adenocarcinoma.
  • the product contains the reagent of the present invention.
  • the product may be a second-generation sequencing kit based on targeted sequencing, a real-time fluorescent quantitative PCR kit, a gene chip, a protein chip, an ELISA diagnostic kit, an immunohistochemistry (IHC) kit, or a combination thereof.
  • the product is a diagnostic product based on next-generation sequencing (NGS).
  • the product includes a reagent for detecting the expression level of genes of the gene group of the present invention.
  • the gene group includes 186 genes, that is, the above-mentioned 180 molecular typing and survival risk assessment related genes and 6 housekeeping genes (see also Table 1).
  • the gene group includes 76 genes, that is, the above-mentioned 70 molecular typing and survival risk assessment related genes and 6 housekeeping genes (see also Table 2).
  • the gene group of the present invention includes 25 genes, that is, 24 genes related to molecular typing and survival risk assessment as described above and 1 housekeeping gene (see also Table 3).
  • the gene group of the present invention includes 24 genes, that is, the 21 molecular typing and survival risk assessment related genes as described above and 3 housekeeping genes, the 3 housekeeping genes being any three of GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC.
  • the gene group of the present invention includes 24 genes, that is, the 21 molecular typing and survival risk assessment related genes as described above and 3 housekeeping genes (see also Table 4).
  • the diagnostic product based on next-generation sequencing (NGS) includes primers having the sequence shown in SEQ ID NO. 1-152 (see also Table 5).
  • the diagnostic product based on next-generation sequencing (NGS) includes primers having the sequences shown in SEQ ID NO. 1-6, 17, 18, 23, 24, 37-40, 45-58, 61, 62, 107-118, 141-144, 151 and 152 (see also Table 6).
  • the diagnostic product is a diagnostic product based on fluorescent quantitative PCR, preferably real-time fluorescent quantitative PCR (RT-PCR), such as SYBR Green RT-PCR and TaqMan RT-PCR.
  • TaqMan RT-PCR can be, for example, multiplex RT-PCR and single-plex RT-PCR.
  • the diagnostic product includes a reagent for detecting the expression level of genes of the gene group of the present invention.
  • the gene group includes 186 genes, that is, the above-mentioned 180 molecular typing and survival risk assessment related genes and 6 housekeeping genes (see also Table 1).
  • the gene group includes 76 genes, that is, the above-mentioned 70 molecular typing and survival risk assessment related genes and 6 housekeeping genes (see also Table 2). In another embodiment, the gene group includes 25 genes, that is, 24 genes related to molecular typing and survival risk assessment as described above and 1 housekeeping gene (see also Table 3). In another embodiment, the gene group includes 24 genes, that is, the 21 molecular typing and survival risk assessment related genes and 3 housekeeping genes as described above (see also Table 4).
  • the diagnostic product based on fluorescent quantitative PCR includes a primer having a sequence shown in SEQ ID NO. 153-202. In another specific embodiment, the diagnostic product based on fluorescent quantitative PCR comprises a TaqMan probe having the sequence shown in SEQ ID NO.
  • the fluorescent quantitative PCR-based diagnostic product includes a primer having a sequence shown in SEQ ID NO. 153-202, and a TaqMan probe having a sequence shown in SEQ ID NO. 203-227 (see also Table 7).
  • the diagnostic product based on fluorescent quantitative PCR includes a primer having a sequence shown in SEQ ID NO. 228-275.
  • the diagnostic product based on fluorescent quantitative PCR comprises a TaqMan probe having the sequence shown in SEQ ID NO. 276-299.
  • the fluorescent quantitative PCR-based diagnostic product comprises a primer having the sequence shown in SEQ ID NO. 228-275, and a TaqMan probe having the sequence shown in SEQ ID NO. 276-299 (see also Table 8).
  • the product is an in vitro diagnostic product.
  • the product is a diagnostic kit.
  • the product is used to determine the subtype of lung adenocarcinoma and/or to assess the survival risk of patients with lung adenocarcinoma.
  • the product further comprises total RNA extraction reagents, reverse transcription reagents, second-generation sequencing reagents and/or quantitative PCR reagents.
  • the total RNA extraction reagent can be a conventional total RNA extraction reagent in the art. Examples include but are not limited to RNA storm CD201, Qiagen 73504, Invitrogen, and ABI AM1975.
  • the reverse transcription reagent may be a conventional reverse transcription reagent in the art, and preferably includes a dNTP solution and/or RNA reverse transcriptase.
  • Examples of reverse transcription reagents include but are not limited to NEB M0368L, Thermo K1622, ABI 4366596.
  • the second-generation sequencing reagent may be a reagent conventionally used in the art, as long as it can meet the requirements of performing next-generation sequencing on the obtained sequence.
  • the second-generation sequencing reagents may be commercially available products, examples of which include but are not limited to Illumina Reagent Kit v3(150 cycle)(MS-102-3001), Targeted RNA Index Kit A-96 Indices (384 Samples) (RT-402-1001).
  • Second-generation sequencing is a conventional second-generation sequencing in the field, such as targeted RNA-seq technology. Therefore, the next-generation sequencing reagents can also include Illumina customized reagents that can be used to construct a targeted RNA-seq library, such as Targeted RNA Custom Panel Kit (96 Samples) (RT-102-1001).
  • the quantitative PCR reagent is a reagent commonly used in the field, as long as it can meet the requirements for quantitative PCR of the obtained sequence.
  • the quantitative PCR reagents may be commercially available.
  • the quantitative PCR technology is a conventional quantitative PCR technology in the field, preferably a real-time fluorescent quantitative PCR technology, such as SYBR Green RT-PCR and TaqMan RT-PCR technology.
  • the PCR reagents preferably further include reagents for constructing a library for quantitative PCR.
  • the quantitative PCR reagents may also include real-time fluorescent quantitative PCR reagents, such as reagents for SYBR Green RT-PCR (for example a SYBR Green premix, such as SYBR Green PCR Master Mix) and reagents for TaqMan RT-PCR (e.g. TaqMan RT-PCR Master Mix).
  • the detection platform used for quantitative PCR detection can be ABI7500 real-time fluorescent quantitative PCR instrument or Roche 480II real-time fluorescent quantitative PCR machine or all other PCR machines that can perform real-time fluorescent quantitative detection.
  • the product is a second-generation sequencing kit based on targeted RNA-seq, which includes primers having the sequences shown in Table 5 or Table 6, and optionally further includes one or more selected from the following: total RNA extraction reagents, reverse transcription reagents and second-generation sequencing reagents.
  • the second-generation sequencing reagent is a reagent customized by Illumina that can be used to construct a targeted RNA-seq library.
  • the product is a SYBR Green RT-PCR kit, which includes primers having the sequences shown in Table 7 or Table 8, and optionally further includes one or more selected from the following: total RNA extraction reagents, reverse transcription reagents and reagents for SYBR Green RT-PCR.
  • the product is a TaqMan RT-PCR detection kit, which comprises primers and probes having the sequences shown in Table 7 or primers and probes having the sequences shown in Table 8, and optionally further includes one or more selected from the following: total RNA extraction reagents, reverse transcription reagents and reagents for TaqMan RT-PCR.
  • the diagnostic product of the present invention also preferably includes a device for collecting a test sample from a subject, for example a device for collecting tissue or blood from a subject, preferably any blood collection needle, syringe, etc. that can be used for blood collection.
  • the subject is a mammal, preferably a human, especially a patient suffering from lung adenocarcinoma.
  • the present invention also relates to a method for determining the molecular typing and/or survival risk of lung adenocarcinoma of a subject, the method comprising
  • the methods of the invention can be used for diagnostic or non-diagnostic purposes.
  • the subject used in the method of the present invention is a mammal, preferably a human, especially a patient suffering from lung adenocarcinoma.
  • the sample used in step (1) is not particularly limited, as long as the expression level of genes in the gene group can be obtained from it.
  • the subject's total RNA, total protein, etc. can be extracted from the sample, preferably total RNA.
  • the sample is preferably a sample of tissue, blood, plasma, body fluid or a combination thereof, preferably a tissue sample, especially a paraffin tissue sample.
  • the sample is a tumor tissue sample or a tissue sample containing tumor cells.
  • the sample is a tissue with a high content of tumor cells.
  • Step (2) can be performed using methods known in the art for determining gene expression levels.
  • a person skilled in the art can select the sample type and sample size in step (1) according to needs, and select conventional techniques in the art to implement the determination in step (2).
  • the expression level of the target gene (for example, the gene related to molecular typing and survival risk assessment of the present invention) is standardized according to the expression level of the reference gene.
  • the method of normalizing the expression level of genes is well known to those skilled in the art.
  • step (2) can be achieved by detecting the amount of the polypeptide encoded by the target gene (gene in the gene group of the present invention).
  • the detection can be achieved by the above reagents and techniques known in the art, where the techniques include but are not limited to enzyme-linked immunosorbent assay (ELISA), chemiluminescence immunoassay techniques (such as immunochemiluminescence analysis, chemiluminescence enzyme immunoassay, electrochemiluminescence immunoassay), flow cytometry, and immunohistochemistry (IHC).
  • step (2) can be achieved by detecting the amount of target nucleic acid.
  • the detection can be achieved by the above reagents and techniques known in the art, including but not limited to molecular hybridization technology, quantitative PCR technology, or nucleic acid sequencing technology.
  • Molecular hybridization technology includes but is not limited to ISH technology (such as DISH, DNA-FISH, RNA-FISH, CISH technology, etc.), DNA blotting (Southern blotting) or RNA blotting (Northern blotting) technology, gene chip technology (such as microarray chip or microfluidic chip technology), etc.; in situ hybridization techniques are preferred.
  • Quantitative PCR technology includes but is not limited to semi-quantitative PCR and RT-PCR technology, preferably RT-PCR technology, such as SYBR Green RT-PCR technology, TaqMan RT-PCR technology.
  • Nucleic acid sequencing technologies include, but are not limited to, Sanger sequencing, next-generation sequencing (NGS), third-generation sequencing, single-cell sequencing technologies, etc., preferably second-generation sequencing, and more preferably targeted RNA-seq technology. More preferably, the detection is achieved using the reagent of the present invention.
  • next-generation sequencing technology is used to determine the expression level of genes in the gene group of the present invention.
  • the genes of the gene group are shown in Table 1, Table 2, Table 3, or Table 4.
  • the gene group includes the 70 molecular typing and survival risk assessment related genes and 6 housekeeping genes as described above (see also Table 2).
  • the gene group includes the 21 molecular typing and survival risk assessment related genes and 3 housekeeping genes as described above (see also Table 4).
  • step (2) may include:
  • (2a-1) Extract the total RNA in the sample;
  • (2a-2) Convert the optionally purified total RNA into cDNA, and then prepare it into a library that can be used for next-generation sequencing;
  • (2a-3) Sequence the library obtained in step (2a-2), and optionally standardize the expression levels of the molecular typing and survival risk assessment related genes according to the expression levels of the housekeeping genes.
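The optional standardization in step (2a-3) can be illustrated with a minimal sketch. The text does not state a formula, so the approach below (log2 of read counts, centered on the mean log2 count of the housekeeping genes) is an assumption, and the function name `normalize_counts` and the example read counts are hypothetical.

```python
import math


def normalize_counts(counts, housekeeping):
    """Express each target gene as log2 counts relative to the housekeeping mean.

    counts: dict mapping gene symbol -> raw read count from targeted RNA-seq.
    housekeeping: list of gene symbols used as reference genes.
    NOTE: this normalization scheme is an illustrative assumption, not a
    procedure taken from the text.
    """
    # A pseudocount of 1 avoids log2(0) for genes with no reads.
    log_counts = {gene: math.log2(count + 1) for gene, count in counts.items()}
    reference = sum(log_counts[g] for g in housekeeping) / len(housekeeping)
    # Report only the target genes, centered on the reference level.
    return {
        gene: value - reference
        for gene, value in log_counts.items()
        if gene not in housekeeping
    }


# Hypothetical read counts for two housekeeping genes and two target genes.
example_counts = {"GAPDH": 1023, "GUSB": 1023, "MKI67": 255, "CD4": 4095}
normalized = normalize_counts(example_counts, ["GAPDH", "GUSB"])
```

Centering on the housekeeping genes makes values comparable between libraries of different sequencing depth, which is the usual reason for using reference genes.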
  • step (2a-1) can be performed by conventional methods in the art, preferably using a commercially available RNA extraction kit to extract total RNA from fresh frozen tissue or paraffin-embedded tissue of the subject.
  • RNA storm CD201 or Qiagen 73504 can be used for extraction.
  • step (2a-2) may include the following steps:
  • in step (2a-2), the primers shown in Table 5 or Table 6 are used to amplify the cDNA to prepare a library for sequencing.
  • Step (2a-3) can be completed by RNA sequencing.
  • the sequencing method can be a conventional RNA-seq sequencing method for determining gene expression levels in the art.
  • Illumina NextSeq/MiSeq/MiniSeq/iSeq series sequencers are used for second-generation sequencing.
  • the primers in the kit are used to amplify the genes in the gene group of the present invention.
  • the obtained gene sequence can be subjected to next-generation sequencing.
  • next-generation sequencing is a targeted RNA-seq technology, and paired-end sequencing or single-end sequencing is performed with an Illumina NextSeq/MiSeq/MiniSeq/iSeq sequencer.
  • Such a process can be completed automatically by the instrument itself.
  • a fluorescent quantitative PCR method can also be used to determine the expression level of genes in the gene group of the present invention.
  • the genes of the gene group are shown in Table 1, Table 2, Table 3, or Table 4.
  • the gene group includes the 24 molecular typing and survival risk assessment related genes and one housekeeping gene as described above (see also Table 3).
  • the gene group includes the 21 molecular typing and survival risk assessment related genes and 3 housekeeping genes as described above (see also Table 4).
  • step (2) may include:
  • (2b-3) Perform real-time fluorescent quantitative PCR (RT-PCR) detection on the obtained cDNA, and optionally standardize the expression levels of molecular typing and survival risk assessment related genes according to the expression levels of housekeeping genes.
  • step (2b-1) can be performed by conventional methods in the art, preferably a commercially available RNA extraction kit is used to extract the total RNA of the subject's fresh frozen tissue or paraffin-embedded tissue.
  • RNA storm CD201 or Qiagen 73504 can be used for extraction.
  • the reverse transcription of step (2b-2) can be performed using a commercially available reverse transcription kit.
  • the RT-PCR method in step (2b-3) is TaqMan RT-PCR.
  • primers and probes can be used to perform RT-PCR detection on the genes shown in Table 3 or Table 4 respectively, and the probes are TaqMan probes.
  • the sequences of the primers and probes are shown in Table 7 or Table 8.
  • the primers and probes shown in Table 7 are used for single-plex or multiplex RT-PCR detection.
  • the primers and probes shown in Table 8 are used for single-plex or multiplex RT-PCR detection.
  • the RT-PCR method in step (2b-3) is SYBR Green RT-PCR, and primers and a commercially available SYBR Green premix can be used to test the genes shown in Table 3 or Table 4 separately or simultaneously.
  • the sequence of the primer is shown in SEQ ID No. 153-202 (see also Table 7) or shown in SEQ ID No. 228-275 (see also Table 8).
  • the above RT-PCR detection can be performed using an ABI 7500 real-time fluorescent quantitative PCR instrument (Applied Biosystems) or a Roche 480II instrument. After the reaction is over, the Ct value of each gene is recorded, which represents the expression level of each gene.
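Standardizing Ct values against housekeeping genes is commonly done with the delta-Ct approach. The sketch below is offered under that assumption (the text only says the method is well known), and it further assumes an amplification efficiency close to 100% so that one cycle corresponds to a two-fold change. The function names and Ct values are hypothetical.

```python
def delta_ct(ct_values, housekeeping):
    """Return Ct(target) minus mean Ct(housekeeping) for each target gene.

    A lower Ct means higher expression, so a negative delta-Ct indicates a
    gene expressed above the reference level.
    """
    reference = sum(ct_values[g] for g in housekeeping) / len(housekeeping)
    return {
        gene: ct - reference
        for gene, ct in ct_values.items()
        if gene not in housekeeping
    }


def relative_expression(ct_values, housekeeping):
    """Convert delta-Ct to relative expression, assuming ~100% PCR efficiency."""
    return {
        gene: 2.0 ** (-dct)
        for gene, dct in delta_ct(ct_values, housekeeping).items()
    }


# Hypothetical Ct values recorded after the run.
example_ct = {"GAPDH": 20.0, "GUSB": 22.0, "MKI67": 24.0, "CD4": 20.0}
example_dct = delta_ct(example_ct, ["GAPDH", "GUSB"])
example_rel = relative_expression(example_ct, ["GAPDH", "GUSB"])
```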
  • step (3) can be completed by statistically analyzing the expression levels of genes in the gene group of the present invention in the sample obtained in step (2). Molecular typing and survival risk prediction of lung adenocarcinoma can optionally be carried out based on the single sample predictor (SSP) method pioneered by Hu et al. (see Hu Z, et al., BMC Genomics. 2006, 7: 96) and the optimized method of Parker et al. (see Parker JS, et al., Journal of Clinical Oncology: official journal of the American Society of Clinical Oncology. 2009, 27(8): 1160-7).
  • the gene expression data obtained in step (2) is analyzed to obtain the subtype classification of a single sample, and the survival risk can be calculated.
  • step (3) includes molecular typing of lung adenocarcinoma, which includes determining the molecular typing of the subject's lung adenocarcinoma based on the expression level of each gene in the subject's sample obtained in step (2).
  • the inventors used the EPIG gene expression profile analysis program (see Zhou T, et al., 2006. Environ Health Perspect 114(4), 553-559; Chou JW, et al., 2007. BMC Bioinformatics 8, 427) to analyze gene expression in lung adenocarcinoma cases with complete clinical information from the TCGA database and obtained the gene expression profile of the present invention. Based on this gene expression profile, hierarchical clustering was used to compare the similarity between the detected genes and group the genes, and to compare the similarity of expression profiles among lung adenocarcinoma samples and group the lung adenocarcinomas.
  • Lung adenocarcinomas are classified into LAD1, LAD2, LAD3, LAD4, LAD5 and mixed types; the gene expression profiles of these molecular subtypes are used as standard test data for molecular typing and survival risk assessment of samples.
  • the molecular subtypes of lung adenocarcinoma can include LAD1, LAD2, LAD3, LAD4, LAD5 and mixed types:
  • The main characteristics of the LAD1 subtype are low expression of proliferation genes, high expression of immune genes, low expression of intercellular substance-related genes, and the highest 5-year survival rate;
  • LAD2 subtypes are mainly characterized by high expression of proliferation genes, low expression of immune genes, moderate expression of intercellular substance-related genes, and low 5-year survival rate;
  • LAD3 subtypes are mainly characterized by low expression of proliferation genes, moderate expression of immune genes, high expression of intercellular substance-related genes, and a moderate 5-year survival rate;
  • LAD4 subtypes are mainly characterized by moderate expression of proliferation genes, low expression of immune genes, high expression of intercellular substance-related genes, and low 5-year survival rate;
  • The main characteristics of the LAD5 subtype are high expression of proliferation genes, medium expression of immune genes, low expression of intercellular substance-related genes, and a medium 5-year survival rate;
  • step (3) may include:
  • According to the expression levels of genes in the gene group of the present invention in the sample obtained in step (2), the Pearson correlation analysis method is used to calculate the correlation between the expression profile of the gene group of the present invention in the sample and the standard test data.
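The Pearson comparison against the standard test data can be sketched as follows. This is a simplified illustration in the spirit of a single sample predictor: each subtype is represented by one reference expression vector, and the sample is correlated with each of them. The reference vectors, the four-gene panel and the function names are hypothetical, and the rule for calling a sample "mixed" is not stated in this passage, so it is not implemented.

```python
import numpy as np


def correlate_with_subtypes(sample, references):
    """Pearson correlation between one sample and each subtype reference profile.

    sample: sequence of normalized expression values, one per gene.
    references: dict mapping subtype name -> reference profile over the same
        genes in the same order (hypothetical stand-in for the standard test data).
    """
    return {
        name: float(np.corrcoef(sample, profile)[0, 1])
        for name, profile in references.items()
    }


def closest_subtype(sample, references):
    """Return the best-correlated subtype together with all correlations."""
    correlations = correlate_with_subtypes(sample, references)
    best = max(correlations, key=correlations.get)
    return best, correlations


# Hypothetical reference profiles over a four-gene panel.
example_references = {
    "LAD1": [-1.0, 1.0, -1.0, 0.0],
    "LAD2": [1.0, -1.0, 0.0, 1.0],
}
subtype, correlations = closest_subtype([2.0, -2.0, 0.0, 2.0], example_references)
```

The per-subtype correlation coefficients are kept because the survival risk score described later uses them directly as inputs.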
  • step (3) further includes judging the survival risk of the subject according to the expression level of immune-related genes or proliferation-related genes. Specifically, it can include:
  • step (3a) includes the following steps:
  • (3a-1) According to the expression data of the proliferation-related genes in the gene group of the present invention in a statistically significant number of lung adenocarcinoma samples (training set), calculate the weighted average of the expression levels of the proliferation-related genes in the training set; combining the survival data, perform survival analysis using statistical software known in the art (such as X-tile software, SPSS or other analysis software that can be used to calculate the critical value, preferably X-tile software), and take the weighted average value that gives the maximum separation of the survival curves as the critical value;
  • (3a-2) According to the expression levels of proliferation-related genes obtained in step (2), calculate the weighted average of the expression levels of proliferation-related genes in the subject's sample, that is, the proliferation index, and, based on the critical value described in step (3a-1), judge whether the proliferation index is fast (the expression level of the proliferation-related genes obtained in step (2) > the critical value) or slow (the expression level of the proliferation-related genes obtained in step (2) ≤ the critical value);
  • (3a-3) Perform survival risk assessment according to the proliferation index obtained in step (3a-2): if the proliferation index of the subject is fast, the survival risk is high and the prognosis is poor; if the proliferation index of the subject is slow, the survival risk is low and the prognosis is good.
  • the proliferation index can be calculated by the following formula:
  • n is the number of proliferation-related genes used to calculate the proliferation index, which is an integer from 1 to 69.
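As a rough illustration of steps (3a-2) and (3a-3), the index can be computed as a weighted average over the n selected genes and compared with the critical value. The weights are not given in this passage, so equal weights are assumed by default; the function names, expression values and cutoff are hypothetical. The same helper applies to the immune index with the labels "strong" and "weak".

```python
import numpy as np


def expression_index(levels, weights=None):
    """Weighted average of the expression levels of the n selected genes.

    weights=None gives every gene equal weight (an assumption, since the
    weights are not stated in the text).
    """
    return float(np.average(levels, weights=weights))


def classify_index(index, cutoff, high_label, low_label):
    """Label an index as high when it exceeds the critical value, else low."""
    return high_label if index > cutoff else low_label


# Hypothetical normalized expression of three proliferation-related genes.
example_index = expression_index([1.0, 2.0, 3.0], weights=[1, 1, 2])
example_call = classify_index(example_index, cutoff=2.0,
                              high_label="fast", low_label="slow")
```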
  • proliferation-related genes include: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, HMMR, KIF20A, FOXM1, MKI67, KIF14, TK1, HJURP, TPX2, EXO1, KIF11, NEK2 , KIF23, CDCA3, CDK1, SPAG5, KIF4A, GTSE1, CDKN3, CDC25C, PRR11, CCNB2, MAD2L1, PKMYT1, CENPE, ASPM, CENPF, BUB1, NDC80, NUSAP1, CEP55, NCAPG, BIRC5, ZWINT, TTK, ESPL1, DEPDC , MELK, CDC20, CDC6, AURKA, NEIL3, CDT1, KIF2C, KIFC1, NCAPH, KIF18B, AUR
  • step (3b) includes the following steps:
  • (3b-1) According to the expression data of the immune-related genes in the gene group of the present invention in a statistically significant number of lung adenocarcinoma samples (training set), calculate the weighted average of the expression levels of the immune-related genes in the training set; combining the survival data, perform survival analysis using statistical software known in the art (such as X-tile software, SPSS or other analysis software that can be used to calculate the critical value, preferably X-tile software), and take the weighted average value that gives the maximum separation of the survival curves as the critical value;
  • (3b-2) Based on the immune-related gene expression levels obtained in step (2), calculate the weighted average of the immune-related gene expression levels in the subject's sample, that is, the subject's immune index, and, based on the critical value described in step (3b-1), judge whether the immune index is strong (the expression level of immune-related genes obtained in step (2) > the critical value) or weak (the expression level of immune-related genes obtained in step (2) ≤ the critical value);
  • (3b-3) Perform survival risk assessment based on the immune index obtained in step (3b-2): if the subject's immune index is strong, the subject's immune function is strong, the survival risk is low, and the prognosis is good; if the immune index is weak, the subject's immune function is weak, the survival risk is high, and the prognosis is poor.
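Steps (3a-1) and (3b-1) take as critical value the index value that best separates the survival curves of the training set. The text names X-tile or SPSS for this; the sketch below is a simplified stand-in, not a reproduction of either tool. It assumes a two-group log-rank chi-square as the separation measure and scans every observed index value as a candidate cut-point. The function names and the toy training data are hypothetical.

```python
import numpy as np


def logrank_chi2(time, event, group):
    """Two-group log-rank chi-square statistic.

    time: follow-up time per sample; event: 1 if progression/death was
    observed, 0 if censored; group: True for the high-index group.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=bool)
    observed_minus_expected = 0.0
    variance = 0.0
    for t in np.unique(time[event]):
        at_risk = time >= t
        n = at_risk.sum()                 # samples still at risk at time t
        n1 = (at_risk & group).sum()      # of which in the high-index group
        died = (time == t) & event
        d = died.sum()                    # events at time t
        d1 = (died & group).sum()         # of which in the high-index group
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def best_cutoff(index, time, event, min_group=2):
    """Scan candidate cut-points and return (cutoff, chi2) with the largest chi2."""
    index = np.asarray(index, dtype=float)
    best_cut, best_chi2 = None, -1.0
    for cut in np.unique(index)[:-1]:
        high = index > cut
        if high.sum() < min_group or (~high).sum() < min_group:
            continue  # skip splits that leave one group too small
        chi2 = logrank_chi2(time, event, high)
        if chi2 > best_chi2:
            best_cut, best_chi2 = cut, chi2
    return best_cut, best_chi2


# Toy training set: a higher index goes with shorter survival, all events observed.
cutoff, chi2 = best_cutoff([1, 2, 3, 4, 5, 6], [9, 8, 7, 3, 2, 1], [1] * 6)
```

Because the cut-point is chosen to maximize the statistic, the resulting chi-square is optimistically biased; a cutoff found this way would normally be checked on an independent validation set.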
  • the immune index can be calculated by the following formula:
  • n is the number of immune-related genes used to calculate the immune index, which is an integer from 1 to 73.
  • immune-related genes include: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52 , DOCK2, CD84, FGL2, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, LCP1, SPIB, CD53, CD3E, SLCO2B1, MS4A6A, CYBB, CD4, SH2D1A, TFEC, LYZ, ITGAM, TLR8, CSF1R, CXCL13, GPNMB , CCR5, HK3, CMKLR1, IL2RG, TYROBP, HCK, ITGB2, LAPTM5, SIGLEC1, AOAH, C
  • immune-related genes include: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, SPIB, CD53, CD4 and LYZ (see also Table 2).
  • step (3) further includes step (3c) calculating the survival risk of patients with lung adenocarcinoma, which includes the following steps:
  • (3c-1) Using the Cox model, with the occurrence of disease progression or death and the time of occurrence as the observation endpoint, determine the corresponding coefficients according to the relative risk of the impact on survival of the subject's lung adenocarcinoma molecular typing obtained in step (3-3), the proliferation index obtained in step (3a-2) and the immune index obtained in step (3b-2), and calculate the survival risk score (Risk of Death, RD) of the subject;
  • (3c-2) According to the survival risk score (also known as the survival risk index) calculated in step (3c-1), determine the survival risk of the subject: low risk (survival risk score of 0-35), medium risk (survival risk score of 36-70) or high risk (survival risk score of 71-100).
  • in step (3c-1), 70 lung adenocarcinoma molecular typing and survival risk-related genes (see also Table 2) are used to calculate the survival risk score of the subject:
  • RD = (-0.18*LAD1) + (0.09*LAD2) + (0.04*LAD3) + (0.17*LAD4) + (-0.17*LAD5) + (-0.05*immune index) + (0.12*proliferation index);
  • LAD1 represents the Pearson correlation coefficient between the tumor and LAD1 type tumor
  • LAD2 represents the Pearson correlation coefficient between the tumor and LAD2 type tumor
  • LAD3 represents the Pearson correlation coefficient between the tumor and LAD3 type tumor
  • LAD4 represents the Pearson correlation coefficient between the tumor and LAD4 type tumor
  • LAD5 represents the Pearson correlation coefficient between the tumor and LAD5 type tumor
  • the immune index is the immune index calculated from the 30 immune-related genes in Table 2
  • the proliferation index is the proliferation index calculated from the 23 proliferation-related genes in Table 2.
  • in step (3c-1), 24 lung adenocarcinoma molecular typing and survival risk-related genes (see also Table 3) are used to calculate the survival risk score of the subject:
  • RD = (-0.12*LAD1) + (0.29*LAD2) + (0.13*LAD3) + (0.18*LAD4) + (-0.09*LAD5) + (-0.55*immune index) + (0.07*proliferation index);
  • LAD1 represents the Pearson correlation coefficient between the tumor and LAD1 type tumor
  • LAD2 represents the Pearson correlation coefficient between the tumor and LAD2 type tumor
  • LAD3 represents the Pearson correlation coefficient between the tumor and LAD3 type tumor
  • LAD4 represents the Pearson correlation coefficient between the tumor and LAD4 type tumor
  • LAD5 represents the Pearson correlation coefficient between the tumor and LAD5 type tumor
  • the immune index is the immune index calculated from the 9 immune-related genes in Table 3
  • the proliferation index is the proliferation index calculated from the 9 proliferation-related genes in Table 3.
  • step (3c-1) The 21 lung adenocarcinoma molecular typing and survival risk-related genes (see also Table 4) are used to calculate the survival risk score of the subject:
  • RD = (-0.10*LAD1)+(0.36*LAD2)+(0.14*LAD3)+(0.21*LAD4)+(-0.10*LAD5)+(-0.57*immune index)+(0.07*proliferation index);
  • LAD1 represents the Pearson correlation coefficient between the tumor and LAD1-type tumors
  • LAD2 represents the Pearson correlation coefficient between the tumor and LAD2-type tumors
  • LAD3 represents the Pearson correlation coefficient between the tumor and LAD3-type tumors
  • LAD4 represents the Pearson correlation coefficient between the tumor and LAD4-type tumors
  • LAD5 represents the Pearson correlation coefficient between the tumor and LAD5-type tumors
  • the immune index is calculated from the 7 immune-related genes in Table 4
  • the proliferation index is calculated from the 8 proliferation-related genes in Table 4.
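The RD formulas for the 70-gene, 24-gene and 21-gene panels share one linear form and differ only in their coefficients. As a minimal sketch, the Python below stores the coefficients quoted in the text and evaluates RD for one tumor. The dictionary keys, the function name `raw_risk` and the input layout are illustrative assumptions. How RD is rescaled to the 0-100 survival risk score is not stated in this passage, so no rescaling is attempted.

```python
# Coefficients copied from the RD formulas in the text.
# Panel names and feature keys are illustrative, not from the patent.
COEFFICIENTS = {
    "70-gene": {"LAD1": -0.18, "LAD2": 0.09, "LAD3": 0.04, "LAD4": 0.17,
                "LAD5": -0.17, "immune": -0.05, "proliferation": 0.12},
    "24-gene": {"LAD1": -0.12, "LAD2": 0.29, "LAD3": 0.13, "LAD4": 0.18,
                "LAD5": -0.09, "immune": -0.55, "proliferation": 0.07},
    "21-gene": {"LAD1": -0.10, "LAD2": 0.36, "LAD3": 0.14, "LAD4": 0.21,
                "LAD5": -0.10, "immune": -0.57, "proliferation": 0.07},
}


def raw_risk(panel, features):
    """Evaluate RD as the weighted sum of the seven inputs for one tumor.

    `features` maps LAD1..LAD5 to Pearson correlation coefficients and
    "immune" / "proliferation" to the two indices.
    """
    coeffs = COEFFICIENTS[panel]
    missing = sorted(set(coeffs) - set(features))
    if missing:
        raise KeyError(f"missing inputs: {missing}")
    return sum(coeffs[name] * features[name] for name in coeffs)
```

Negative coefficients (LAD1, LAD5 and the immune index in all three panels) lower RD, and positive ones raise it. This matches the description of LAD1 as the subtype with the best 5-year survival.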
  • the present invention also provides the application of the gene group of the present invention in molecular typing of lung adenocarcinoma and/or assessment of the survival risk of patients with lung adenocarcinoma.
  • the present invention also provides the application of the gene group of the present invention, and of the reagent for detecting the expression level of genes in that gene group, in the preparation of products for molecular typing of lung adenocarcinoma and/or assessment of the survival risk of patients with lung adenocarcinoma.
  • the product is a detection/diagnostic kit.
  • the product is an in vitro diagnostic product.
  • the reagents are as described above.
  • the product is as described above.
  • lung adenocarcinoma can be divided into different molecular subtypes, and the molecular subtypes of lung adenocarcinoma can include LAD1, LAD2, LAD3, LAD4, LAD5 and mixed types.
  • the survival risk of patients with lung adenocarcinoma can be assessed, and the survival risk may include low-risk, medium-risk, and high-risk.
  • the present invention also relates to a set of immune-related genes, which includes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FGL2, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, LCP1, SPIB, CD53, CD3E, SLCO2B1, MS4A6A, CYBB, CD4, SH2D1A, TFEC, LYZ, ITGAM, TLR8, CSF1R, CXCL13, GPNMB, CCR5, HK3, CMKLR1, IL2RG, TYROBP, HCK, ITGB2, LAPTM5, SIGLEC1, AOAH, C3AR1, MSR1, IL2RA, CCL5, ADAMDEC1, LILRB
  • the present invention also relates to a set of proliferation-related genes, which includes: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, HMMR, KIF20A, FOXM1, MKI67, KIF14, TK1, HJURP, TPX2, EXO1, KIF11, NEK2, KIF23, CDCA3, CDK1, SPAG5, KIF4A, GTSE1, CDKN3, CDC25C, PRR11, CCNB2, MAD2L1, PKMYT1, CENPE, ASPM, CENPF, BUB1, NDC80, NUSAP1, CEP55, NCAPG, BIRC5, ZWINT, TTK, ESPL1, DEPDC1, MELK, CDC20, CDC6, AURKA, NEIL3, CDT1, KIF2C, KIFC1, NCAPH, KIF18B, AURKB, UBE2C, TYMS, TOP2A, PBK, CDC45, CD
  • the present invention also relates to detecting the expression level of the immune-related genes or proliferation-related genes described above and calculating the immune index or proliferation index. The immune index can be used to assess the immune status of patients with lung adenocarcinoma and to guide immunotherapy for lung adenocarcinoma; the proliferation index can be used to assess tumor growth and invasion in patients with lung adenocarcinoma and to assess their survival risk. The present invention therefore also relates to the application of the immune-related genes or the proliferation-related genes in the survival risk assessment of lung adenocarcinoma.
  • a gene group for molecular typing and/or survival risk assessment of lung adenocarcinoma wherein the gene group includes 180 genes related to molecular typing and survival risk assessment and 6 housekeeping genes, Among them, the 180 molecular typing and survival risk assessment related genes include: (1) Proliferation related genes: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, HMMR, KIF20A, FOXM1, MKI67, KIF14, TK1, HJURP, TPX2, EXO1, KIF11, NEK2, KIF23, CDCA3, CDK1, SPAG5, KIF4A, GTSE1, CDKN3, CDC25C, PRR11, CCNB2, MAD2L1, PKMYT1, CENPE, ASPM, CENPF, BUB1, NDC80, NUSAP1, CEP55, NCAPG, BIRC5, ZWINT, TTK, ESPL1, DEPDC1, MELK, CDC20, CDC6, AURKA
  • a gene group for molecular typing and/or survival risk assessment of lung adenocarcinoma, characterized in that the gene group includes 70 molecular typing and survival risk assessment related genes and 6 housekeeping genes, wherein the 70 molecular typing and survival risk assessment related genes include: (1) Proliferation-related genes: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, FOXM1, MKI67, KIF14, HJURP, TPX2, NEK2, CDK1, CDKN3, ASPM, CEP55, BIRC5, MELK, CDC20, TYMS, AURKA and TOP2A; (2) Immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FOLR2, NCKAP1
  • a gene group for molecular typing and/or survival risk assessment of lung adenocarcinoma, characterized in that the gene group includes 24 molecular typing and survival risk assessment related genes and 1 housekeeping gene, wherein the 24 molecular typing and survival risk assessment related genes include: (1) Proliferation-related genes: PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20, TYMS and TOP2A; (2) Immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R and CD4; (3) Intercellular substance-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2 and COL5A1; and the housekeeping gene is ACTB.
  • a diagnostic product for molecular typing and/or survival risk assessment of lung adenocarcinoma which comprises a relevant reagent for detecting the expression level of genes in any one of the gene groups of items 1-3.
  • the reagent is a reagent for detecting the amount of RNA, especially mRNA, transcribed from the gene.
  • the reagent is a reagent for detecting the amount of cDNA complementary to the mRNA.
  • the reagent is a reagent for detecting the amount of the polypeptide encoded by the gene; preferably, the reagent is an antibody, an antibody fragment or an affinity protein.
  • a set of primers used to assess molecular typing and/or survival risk of lung adenocarcinoma wherein the sequence of the primers is shown in SEQ ID NO. 1-SEQ ID NO. 152.
  • a set of primers and probes for evaluating molecular typing and/or survival risk of lung adenocarcinoma wherein the sequences of the primers and probes are shown in SEQ ID NO.153-SEQ ID NO.227.
  • lung adenocarcinoma includes LAD1, LAD2, LAD3, LAD4, LAD5 and mixed types, wherein the mixed type includes lung adenocarcinomas that do not belong to the LAD1, LAD2, LAD3, LAD4 or LAD5 types.
  • the present invention relates to a gene group for molecular typing and/or survival risk assessment of lung adenocarcinoma, a reagent for detecting the expression level of genes in the gene group, and methods and products for molecular typing and/or survival risk assessment of lung adenocarcinoma.
  • a system for the molecular classification of lung adenocarcinoma is established, which can classify lung adenocarcinomas into different subtypes and so provide more targeted, individualized treatment for patients with different subtypes.
  • the survival risk of lung adenocarcinoma patients can be predicted well and the proliferation and immune status of the tumor can be evaluated effectively, which is of important guiding significance for clinical treatment. Combining the subtype, proliferation index, immune index and risk score makes it possible to judge the prognosis of patients with lung adenocarcinoma.
  • Molecular typing and risk assessment of lung adenocarcinoma patients can identify the populations most likely to benefit from different treatment options and suggest potential treatment pathways.
  • patients with low survival risk may consider forgoing radiotherapy and chemotherapy, reducing adverse reactions and the economic burden of treatment; patients with high survival risk should receive timely adjuvant chemotherapy, radiotherapy or biological therapy in order to obtain the greatest clinical benefit.
  • molecular diagnosis based on expression profiles can help identify groups that can benefit from a treatment plan, improve treatment efficiency, and avoid ineffective treatments.
  • Compared with current molecular classification methods for lung adenocarcinoma, the present invention has the advantage of not only subtyping lung adenocarcinoma but also assessing the immune index, proliferation index and survival risk of the patient, and thus comprehensively evaluating the prognosis of lung adenocarcinoma patients and their likely benefit from treatment.
  • Another advantage of the present invention is that it provides multiple selectable genes or gene combinations as supplementary embodiments.
  • When the present invention is applied to cancer patients, if the patient's pathological condition or other reasons (such as abnormal expression of one or more genes) cause the detection of the expression level of one or more genes to be invalid or to fail, multiple alternatives can be used as a supplement, making detection results based on the present invention more stable and reliable.
  • Example 1 Assessment of subtypes of lung adenocarcinoma and screening of survival risk-related gene groups
  • the EPIG gene expression profile analysis program (see Zhou, Chou et al., 2006. Environ Health Perspect 114(4), 553-559; Chou, Zhou et al., 2007. BMC Bioinformatics 8, 427) was used to analyze gene expression levels in 504 cases of lung adenocarcinoma with complete clinical information in the TCGA database, to screen out proliferation-related, immune-related and intercellular substance-related genes closely associated with the survival risk of lung adenocarcinoma, and to identify within each gene group the genes that contribute most to classification and survival risk.
  • Results: a total of 180 genes related to lung adenocarcinoma subtype and survival risk and 6 housekeeping genes were screened out, giving a 186-gene test combination. See Table 1 for the gene list.
  • Lung adenocarcinoma can be classified into LAD1, LAD2, LAD3, LAD4, LAD5 or mixed types:
  • the LAD1 subtype is mainly characterized by low expression of proliferation genes, high expression of immune genes, low expression of intercellular substance-related genes, and the highest 5-year survival rate;
  • the LAD2 subtype is mainly characterized by high expression of proliferation genes, low expression of immune genes, moderate expression of intercellular substance-related genes, and a low 5-year survival rate;
  • the LAD3 subtype is mainly characterized by low expression of proliferation genes, moderate expression of immune genes, high expression of intercellular substance-related genes, and a moderate 5-year survival rate;
  • the LAD4 subtype is mainly characterized by moderate expression of proliferation genes, low expression of immune genes, high expression of intercellular substance-related genes, and a low 5-year survival rate;
  • the LAD5 subtype is mainly characterized by high expression of proliferation genes, moderate expression of immune genes, low expression of intercellular substance-related genes, and a moderate 5-year survival rate.
  • Example 2 Gene test combination for molecular typing and survival risk assessment of lung adenocarcinoma
  • From the 186 genes screened in Example 1, test combinations of 76 genes and of 24 genes are preferably used for molecular typing and survival risk assessment of lung adenocarcinoma.
  • lung adenocarcinoma molecular typing and survival risk-related gene groups proliferation-related genes: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, FOXM1, MKI67, KIF14, HJURP, TPX2, NEK2, CDK1, CDKN3, ASPM, CEP55, BIRC5, MELK, CDC20, TYMS, AURKA and TOP2A; immune related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, SPIB, CD53, CD4 and LYZ; intercellular substance related genes: PLK1, PRC1, CCNB1, DLGAP5, KP
  • the molecular typing method of lung adenocarcinoma described above (see steps (3-1) to (3-3) in the "Method and Application of the Invention" section) is adopted,
  • and each lung adenocarcinoma tumor is classified as LAD1, LAD2, LAD3, LAD4, LAD5 or mixed type.
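Steps (3-1) to (3-3) are not reproduced in this passage. The risk formulas only indicate that each tumor is compared with LAD1 to LAD5 type tumors by Pearson correlation. The sketch below illustrates that kind of comparison under stated assumptions: the reference profiles (`centroids`), the `min_r` cut-off, and the rule that a tumor with no sufficiently positive correlation is called mixed are all hypothetical and are not taken from the patent.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def assign_subtype(profile, centroids, min_r=0.0):
    """Correlate a tumor profile with each reference profile.

    Returns (label, correlations). The label is the best-correlated
    subtype, or "Mixed" when no correlation exceeds `min_r`
    (hypothetical rule, for illustration only).
    """
    correlations = {name: pearson(profile, ref) for name, ref in centroids.items()}
    best = max(correlations, key=correlations.get)
    label = best if correlations[best] > min_r else "Mixed"
    return label, correlations
```

The returned `correlations` dictionary has the same shape as the LAD1 to LAD5 inputs of the RD formula, so the same values could feed both subtype assignment and risk scoring.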
  • Figure 1 shows a heat map of the expression of the 70 lung adenocarcinoma molecular typing and survival risk-related genes in each subtype.
  • Kaplan-Meier survival curves can be drawn to obtain the 5-year survival rate, which indicates the survival risk of each subtype.
  • the survival curves of the above subtypes differ, which means that each subtype of lung adenocarcinoma carries a different survival risk.
  • the LAD1 subtype is mainly characterized by low expression of proliferation genes, high expression of immune genes, low expression of intercellular substance-related genes, and the highest 5-year survival rate.
  • the LAD2 subtype is mainly characterized by high expression of proliferation genes, low expression of immune genes, moderate expression of intercellular substance-related genes, and a low 5-year survival rate.
  • the LAD3 subtype is mainly characterized by low expression of proliferation genes, moderate expression of immune genes, high expression of intercellular substance-related genes, and a moderate 5-year survival rate.
  • the LAD4 subtype is mainly characterized by moderate expression of proliferation genes, low expression of immune genes, high expression of intercellular substance-related genes, and a low 5-year survival rate.
  • the LAD5 subtype is mainly characterized by high expression of proliferation genes, moderate expression of immune genes, low expression of intercellular substance-related genes, and a moderate 5-year survival rate.
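The 5-year survival rates quoted for each subtype are read off Kaplan-Meier curves. As a rough illustration of how such a rate is obtained, the sketch below implements the standard Kaplan-Meier product-limit estimator in plain Python. The function names and the input layout (follow-up times in years, with a flag that is true when death was observed and false when the case was censored) are assumptions made for this example, not details from the patent.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each death time.

    times  - follow-up time of each patient (for example, in years)
    events - True if death was observed at that time, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    value = 1.0
    for time, s in curve:
        if time > t:
            break
        value = s
    return value
```

With times in years, `survival_at(kaplan_meier(times, events), 5)` gives the estimated 5-year survival rate of a group. Computing it separately for each subtype gives the kind of comparison the text describes.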
  • the proliferation index calculation method described above (see steps (3a-1) to (3a-3) in the "Method and Application of the Invention" section) was adopted, and the proliferation index was calculated from the expression levels of the 23 cell proliferation-related genes PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, FOXM1, MKI67, KIF14, HJURP, TPX2, NEK2, CDK1, CDKN3, ASPM, CEP55, BIRC5, MELK, CDC20, TYMS, AURKA and TOP2A. On this basis, lung adenocarcinomas can be divided into a fast-proliferation group and a slow-proliferation group, and the difference in survival between the two groups can be observed.
  • the 5-year survival rate of the fast-proliferation group is lower and its prognosis is poor; the 5-year survival rate of the slow-proliferation group is higher and its prognosis is better (Figure 3).
  • the immune index calculation method described above (see steps (3b-1) to (3b-3) in the "Method and Application of the Invention" section) was adopted, and the immune index was calculated from the expression levels of the 30 immune-related genes P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, SPIB, CD53, CD4 and LYZ.
  • according to the immune index, each subtype can be further divided into a strong-immunity group and a weak-immunity group, and the difference in survival between the two groups can be observed.
  • the results show that the immune index can indicate the prognosis of lung adenocarcinoma.
  • the group with a strong immune index has a higher 5-year survival rate and a relatively good prognosis (Figure 4).
  • the Cox model is used to calculate the survival risk of tumors, with the occurrence and timing of disease progression or death as the observation endpoints.
  • the corresponding coefficients are determined from the relative risk that the tumor subtype, proliferation index and immune index each carry for survival, and the survival risk score is calculated.
  • the calculation method is as follows:
  • RD = (-0.18*LAD1)+(0.09*LAD2)+(0.04*LAD3)+(0.17*LAD4)+(-0.17*LAD5)+(-0.05*immune index)+(0.12*proliferation index);
  • LAD1 represents the Pearson correlation coefficient between the tumor and LAD1-type tumors
  • LAD2 represents the Pearson correlation coefficient between the tumor and LAD2-type tumors
  • LAD3 represents the Pearson correlation coefficient between the tumor and LAD3-type tumors
  • LAD4 represents the Pearson correlation coefficient between the tumor and LAD4-type tumors
  • LAD5 represents the Pearson correlation coefficient between the tumor and LAD5-type tumors
  • the immune index is calculated from the 30 immune-related genes mentioned above;
  • the proliferation index is calculated from the 23 cell proliferation-related genes described above.
  • according to the survival risk score, tumors can be divided into three groups: low risk (0-35), medium risk (36-70) and high risk (71-100).
  • the results show that the survival risk index can indicate the survival risk of patients with lung adenocarcinoma: the 5-year survival rate is higher in the low-risk group, intermediate in the medium-risk group, and lower in the high-risk group (Figure 5).
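The three risk bands can be expressed as a small lookup. The sketch below takes a survival risk score that is already on the 0-100 scale and returns its group. The bands are stated as integers (0-35, 36-70, 71-100), so a fractional score is rounded to the nearest integer first. That rounding is an assumption of this example, as is the function name.

```python
def risk_group(score):
    """Map a 0-100 survival risk score to "low", "medium" or "high"."""
    if not 0 <= score <= 100:
        raise ValueError("survival risk score must lie between 0 and 100")
    rounded = round(score)  # assumption: bands apply to the rounded score
    if rounded <= 35:
        return "low"
    if rounded <= 70:
        return "medium"
    return "high"
```

For example, `risk_group(36)` returns `"medium"` and `risk_group(71)` returns `"high"`.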
  • the molecular typing method and the proliferation index, immune index and survival risk score calculation methods for the 25-gene test combination are similar to those for the 76-gene test combination.
  • the 25-gene test combination includes: 24 lung adenocarcinoma molecular typing and survival risk-related genes (proliferation-related genes: PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20, TYMS and TOP2A; immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R and CD4; intercellular substance-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2 and COL5A1), which are used to determine the molecular type of lung adenocarcinoma and assess the survival risk of patients with lung adenocarcinoma; and a reference gene (ACTB) as an internal standard
  • lung adenocarcinoma tumors were classified as LAD1, LAD2, LAD3, LAD4, LAD5 or mixed type. The result is similar to that of the 76-gene test combination.
  • the proliferation index was calculated from the expression levels of the 9 cell proliferation-related genes PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20, TYMS and TOP2A.
  • according to the proliferation index, lung adenocarcinomas can be divided into a fast-proliferation group and a slow-proliferation group, and the difference in survival between the two groups can be observed. The result is similar to that of the 76-gene test combination.
  • according to the immune index, each subtype can be further divided into a strong-immunity group and a weak-immunity group, and the difference in survival between the two groups can be observed. The result is similar to that of the 76-gene test combination.
  • the Cox model is used to calculate the survival risk of tumors, with the occurrence and timing of disease progression or death as the observation endpoints.
  • the corresponding coefficients are determined from the relative risk that the tumor subtype, proliferation index and immune index each carry for survival, and the survival risk score is calculated.
  • the calculation method is as follows:
  • RD = (-0.12*LAD1)+(0.29*LAD2)+(0.13*LAD3)+(0.18*LAD4)+(-0.09*LAD5)+(-0.55*immune index)+(0.07*proliferation index);
  • LAD1 to LAD5 are the Pearson correlation coefficients defined above; the immune index is calculated from the 9 immune-related genes above;
  • the proliferation index is calculated from the 9 cell proliferation-related genes above.
  • according to the survival risk score, tumors can be divided into three groups: low risk (0-35), medium risk (36-70) and high risk (71-100). The result is similar to that of the 76-gene test combination.
  • the molecular typing method and the proliferation index, immune index and survival risk score calculation methods for the 24-gene test combination are similar to those for the 76-gene test combination.
  • the 24-gene test combination includes: 21 lung adenocarcinoma molecular typing and survival risk-related genes (proliferation-related genes: PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20 and TOP2A; immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7 and IL7R; intercellular substance-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2 and COL5A1), which are used to determine the molecular type of lung adenocarcinoma and assess the survival risk of patients with lung adenocarcinoma; and 3 internal reference genes (GAPDH, GUSB and TFRC) as internal standards, which are used to standardize
  • the expression levels of the 21 lung adenocarcinoma molecular typing and survival risk-related genes shown in Table 4 (standardized against the expression levels of GAPDH, GUSB and TFRC) were used to molecularly type 504 lung adenocarcinoma cases, classifying the tumors as LAD1, LAD2, LAD3, LAD4, LAD5 or mixed type. The result is similar to that of the 76-gene test combination.
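The text states that expression levels are standardized against GAPDH, GUSB and TFRC but does not give the arithmetic. One common approach, shown here purely as an assumed illustration, is to log-transform read counts and subtract the mean log count of the reference genes, which is equivalent to dividing by their geometric mean. The function name, the pseudocount of 1 and the use of log2 are choices made for this sketch, not details from the patent.

```python
from math import log2


def normalize_counts(counts, reference_genes=("GAPDH", "GUSB", "TFRC"),
                     pseudocount=1.0):
    """Return log2 expression of each target gene relative to the references.

    counts - dict mapping gene symbol to raw read count.
    The reference level is the mean log2 count of the reference genes.
    """
    ref_logs = [log2(counts[gene] + pseudocount) for gene in reference_genes]
    ref_level = sum(ref_logs) / len(ref_logs)
    return {
        gene: log2(count + pseudocount) - ref_level
        for gene, count in counts.items()
        if gene not in reference_genes
    }
```

A value of 0 means the target gene is expressed at the same level as the reference genes on average. Positive values mean higher expression and negative values mean lower expression.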
  • the proliferation index is calculated from the expression levels of the 8 cell proliferation-related genes PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20 and TOP2A.
  • according to the proliferation index, lung adenocarcinomas can be divided into a fast-proliferation group and a slow-proliferation group, and the difference in survival between the two groups can be observed. The result is similar to that of the 76-gene test combination.
  • the immune index is calculated from the expression levels of the 7 immune-related genes P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7 and IL7R. According to the immune index, each subtype can be further divided into a strong-immunity group and a weak-immunity group, and the difference in survival between the two groups can be observed. The result is similar to that of the 76-gene test combination.
  • the Cox model is used to calculate the survival risk of tumors, with the occurrence and timing of disease progression or death as the observation endpoints.
  • the corresponding coefficients are determined from the relative risk that the tumor subtype, proliferation index and immune index each carry for survival, and the survival risk score is calculated.
  • the calculation method is as follows:
  • RD = (-0.10*LAD1)+(0.36*LAD2)+(0.14*LAD3)+(0.21*LAD4)+(-0.10*LAD5)+(-0.57*immune index)+(0.07*proliferation index);
  • LAD1 to LAD5 are the Pearson correlation coefficients defined above; the immune index is calculated from the 7 immune-related genes mentioned above;
  • the proliferation index is calculated from the 8 cell proliferation-related genes described above.
  • according to the survival risk score, tumors can be divided into three groups: low risk (0-35), medium risk (36-70) and high risk (71-100). The result is similar to that of the 76-gene test combination.
  • Example 3 Next-generation sequencing detection kit for determining the molecular typing of lung adenocarcinoma and evaluating the survival risk of patients with lung adenocarcinoma
  • a next-generation sequencing detection kit, which contains primers for specifically amplifying the cDNA of the 76 genes or the 24 genes.
  • the primer sequences are shown in Table 5 and Table 6, respectively.
  • the methods for determining the molecular type of lung adenocarcinoma and assessing the survival risk of patients with lung adenocarcinoma using the next-generation sequencing detection kits are as follows.
  • Step 1 Take tumor tissue or paraffin-embedded tissue from the subject and, following the method in the detection kit, select a region with a high tumor cell content as the starting material.
  • Step 2 Extract total RNA from the tissue. An RNA extraction kit such as RNAstorm CD201 RNA or the Qiagen RNeasy FFPE Kit can be used.
  • Step 3 Prepare a sequencing library from the extracted RNA.
  • the extracted tissue RNA is made into a library suitable for next-generation sequencing using targeted RNA-seq technology.
  • the library preparation method includes the following steps:
  • the specific steps are as follows: (i) Hybridization: add 4.5 μl of TOP (see Table 5 or Table 6 for its composition), mix well, add 21 μl of OB1, heat to 70°C and ramp slowly down to 30°C; (ii) Extension and ligation: capture the product of (i) on a magnetic stand and discard the supernatant, wash twice with AM1 and UB1 from the kit and discard the supernatant, add 36 μl of ELM4, and incubate in a PCR machine or metal bath at 37°C for 45 minutes; (iii) Attach the sequencing tag (index) to the product of (ii) and perform PCR: capture the product of (ii) on a magnetic stand and discard the supernatant, add 18 μl of 40-fold diluted HP3, aspirate 16 μl using the magnetic stand, add 17.3 μl of TDP1,
  • Step 4 Sequence the resulting DNA library on a NextSeq, MiSeq, MiniSeq or iSeq instrument.
  • Step 5 Statistical analysis of results. Perform statistical analysis on the sequencing results obtained. The method described in Example 2 is then used to molecularly type the subject's lung adenocarcinoma, calculate the immune index, proliferation index and survival risk score, and predict the survival risk.
  • Example 4 Quantitative PCR detection kit for determining the molecular typing of lung adenocarcinoma and assessing the survival risk of patients with lung adenocarcinoma
  • a quantitative PCR detection kit, which contains primers for PCR amplification of the 25 genes and TaqMan probes for quantification.
  • the sequences of the primers and probes are shown in Table 7.
  • the kit can be used for single-plex or multiplex RT-PCR detection.
  • the method for molecular typing and survival risk assessment of lung adenocarcinoma by multiplex RT-PCR detection using the kit is as follows.
  • Step 1 Take tumor tissue or paraffin-embedded tissue from the subject and, following the method in the detection kit, select a region with a high tumor cell content as the starting material.
  • Step 2 Extract total RNA from the tissue. An RNA extraction kit such as RNAstorm CD201 RNA or the Qiagen RNeasy FFPE Kit can be used.
  • Step 3 One-step multiplex fluorescence quantitative RT-PCR detection.
  • the one-step real-time multiplex fluorescence quantitative RT-PCR detection method is TaqMan real-time multiplex fluorescence quantitative RT-PCR, and the 24 lung adenocarcinoma molecular typing and survival risk-related genes in Table 7 are divided among 12 reaction systems.
  • Each reaction system contains primers and probes for two molecular typing and survival risk assessment-related genes and one housekeeping gene. The three probes are each labeled with a different fluorophore.
  • Each reaction system is prepared as follows:
  • RNA sample: 2 μl (100-400 ng in total)
  • the amplification reaction comprises 45 cycles of denaturation at 95°C for 10 seconds followed by annealing, extension and fluorescence detection at 60°C for 45-60 seconds; any three of FAM/HEX/VIC/ROX/Cy5 can be selected as the fluorescence detection channels at 60°C. After the amplification reaction is complete, the Ct value of each gene is recorded, which represents the expression level of that gene.
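A Ct value falls as expression rises, so raw Ct values are usually referenced to the housekeeping gene measured in the same reaction before they are compared across samples. The sketch below uses the widely used delta-Ct convention as an assumption. The patent text only says that the Ct value represents expression and that a housekeeping gene serves as the internal standard. The function names are illustrative.

```python
def relative_log_expression(ct_values, reference="ACTB"):
    """Return (reference Ct - target Ct) for every target gene.

    The result is the negative delta-Ct: larger values mean higher
    expression relative to the housekeeping gene.
    """
    ref_ct = ct_values[reference]
    return {
        gene: ref_ct - ct
        for gene, ct in ct_values.items()
        if gene != reference
    }


def fold_relative_to_reference(ct_target, ct_reference):
    """2 ** -(delta Ct): target abundance relative to the reference,
    assuming roughly 100 percent amplification efficiency."""
    return 2.0 ** -(ct_target - ct_reference)
```

For one reaction system with two target genes and ACTB, `relative_log_expression({"ACTB": 20, "PLK1": 24, "CCR2": 18})` yields `{"PLK1": -4, "CCR2": 2}`. That is, PLK1 is less abundant than ACTB and CCR2 is more abundant.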
  • Step 4 Statistical analysis of results. Perform statistical analysis on the detection results obtained. The method described in Example 2 is then used to molecularly type the subject's lung adenocarcinoma, calculate the immune index, proliferation index and survival risk score, and predict the survival risk.
  • Example 5 Quantitative PCR detection kit for determining the molecular typing of lung adenocarcinoma and assessing the survival risk of patients with lung adenocarcinoma
  • a quantitative PCR detection kit which contains primers for PCR amplification of the 24 genes, and TaqMan probes, primers and probes for quantifying the amplified products.
  • the sequence of the needles is shown in Table 8.
  • the kit can be used for single-plex or multiplex RT-PCR detection.
  • the method for molecular typing and survival risk assessment of lung adenocarcinoma through single-plex RT-PCR detection using the kit is as follows.
  • Step 1 Take the tumor or paraffin-embedded tissue of the test object, and use the method in the detection kit to obtain the area of the test object that contains high tumor cells as the original material.
  • Step 2 Extract total RNA from the tissue. You can use RNAstorm CD201RNA or Qiagen RNease FFPEkit RNA extraction kit to extract.
  • Step 3 RT-PCR detection.
  • the RT-PCR detection method is Taqman RT-PCR, and the genes shown in Table 8 are respectively subjected to RT-PCR detection. Proceed as follows:
  • Step 4 Statistical analysis of results. Perform statistical analysis on the sequencing results obtained. Then, the method described in Example 2 was used to molecularly type the lung adenocarcinoma of the subject, and the immune index, proliferation index and survival risk score were calculated, and the survival risk was predicted.
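  • Sample | Subtype | Survival risk | Proliferation | Immunity
    (number not shown) | LAD1 | low | fast | strong
    9 | LAD1 | medium | slow | weak
    10 | LAD2 | high | fast | weak
    11 | LAD3 | low | fast | strong
    12 | LAD3 | medium | fast | weak
    13 | LAD4 | medium | fast | weak
    14 | LAD4 | medium | slow | weak
    15 | LAD5 | medium | fast | strong
    16 | LAD5 | medium | fast | weak
    17 | LAD5 | medium | fast | strong
    18 | Mixed | low | fast | strong
    19 | Mixed | medium | fast | strong
    20 | Mixed | low | fast | strong
    21 | Mixed | low | fast | strong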


Abstract

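The present invention discloses a gene group that can be used to assess the molecular typing and survival risk of lung adenocarcinoma, and discloses the use of reagents for detecting the expression levels of genes in the gene group in the preparation of a product, the product being used to determine the molecular type of lung adenocarcinoma and to assess the survival risk of patients with lung adenocarcinoma. The product includes a next-generation sequencing (NGS) detection kit, a fluorescence quantitative PCR detection kit, a gene chip and a protein microarray. The present invention also discloses a method for molecular typing and survival risk assessment of lung adenocarcinoma using the detection kit.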

Description

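Gene group for molecular typing and survival risk of lung adenocarcinoma, and diagnostic products and applications thereof
This application claims priority to Chinese Patent Application No. 201910797167.8, filed on August 27, 2019 and entitled "Gene group for molecular typing and survival risk of primary lung adenocarcinoma, and diagnostic products and applications thereof", the content of which is incorporated herein by reference in its entirety.
Technical Field
The present invention belongs to the field of biotechnology, and specifically relates to a gene group for determining the subtype of lung adenocarcinoma and assessing the survival risk of a subject, and to in vitro diagnostic products and applications thereof.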
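Background Art
At present, the indications most commonly used in lung cancer treatment are pathological type and clinical stage. However, even patients with the same histopathological type and the same clinical stage who receive the same treatment have different prognoses. Individuals also differ greatly in drug response, drug toxicity and side effects, and cancer outcome, suggesting that individual lung cancer patients differ in their sensitivity to treatment and in the occurrence of toxic side effects. The development, invasion and metastasis of lung cancer is a process that evolves through the combined action of multiple factors over multiple stages. Therefore, how to select stable biomarkers to predict patient prognosis and the occurrence of toxic side effects, to evaluate treatment efficacy and risk of death, and to individualize treatment accordingly has become a hot research topic in recent years.
At present, a new approach that is relatively widely promoted worldwide is to use tissue microarrays and immunohistochemistry to detect genes that may affect the prognosis of lung cancer, combine the results with patients' clinicopathological characteristics and prognostic data, and use statistical methods to screen for and construct an individualized prognosis prediction model for lung cancer, which is then validated. After surgery, the model can be used to predict the 5-year or longer-term survival of a lung cancer patient. Patients at low risk of recurrence may consider forgoing radiotherapy and chemotherapy, reducing adverse reactions and the economic burden of treatment; patients at high risk of recurrence are advised to receive timely adjuvant chemotherapy, radiotherapy or biological therapy in order to obtain the greatest clinical benefit. For patients with inoperable advanced disease, molecular diagnosis based on expression profiles can help identify the population that can benefit from a given treatment regimen, improving treatment efficiency and avoiding ineffective treatment. Studies have shown that prognostic models incorporating genomics provide better risk stratification and prognostic assessment of lung cancer patients than clinical parameters alone.
Lung cancer is mainly divided into two categories: small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC). The latter includes adenocarcinoma, squamous cell carcinoma, large cell carcinoma and other types, and accounts for more than 80% of all lung cancers. Lung adenocarcinoma is the main type among them; its molecular pathogenesis is complex, and targeted therapies for it are relatively abundant. Faruki, Mayhew et al. (Faruki H, et al., Journal of Thoracic Oncology: official publication of the International Association for the Study of Lung Cancer. 2017, 12(6):943-53) studied the expression profiles of lung adenocarcinoma and lung squamous cell carcinoma in databases and found that the two have clearly different expression profile characteristics. They divided lung adenocarcinoma into three subtypes, TRU, PP and PI, and found that the subtype of lung adenocarcinoma can serve as a marker of tumor immune cell expression and of the level of PD-L1 expression. Chinnaiyan et al. (Shukla S, et al., Journal of the National Cancer Institute. 2017, 109(1)) established a 4-gene combination that divides lung adenocarcinoma patients into high-risk and low-risk groups, the prognosis of the high-risk group being significantly worse than that of the low-risk group.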
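Summary of the Invention
In one aspect, the present invention provides a gene panel for determining the molecular subtype of lung adenocarcinoma and/or assessing the survival risk of lung adenocarcinoma patients, the panel comprising genes related to molecular subtyping and survival risk assessment. In one embodiment, the gene panel further comprises reference genes. The molecular subtypes of lung adenocarcinoma include LAD1, LAD2, LAD3, LAD4, LAD5 and the mixed type.
In one aspect, the present invention also provides reagents for detecting the expression levels of the genes in the gene panel of the invention. In a preferred embodiment, the reagent detects the amount of RNA, in particular mRNA, transcribed from the genes of the invention, or detects the amount of cDNA complementary to the mRNA. In a specific embodiment, the reagent is a primer, a probe or a combination thereof.
In another aspect, the present invention also provides a product for molecular subtyping and/or survival risk assessment of lung adenocarcinoma, comprising the reagents of the invention. The invention also provides the use of the gene panel or the reagents of the invention in the preparation of a product. The product is used to determine the molecular subtype of lung adenocarcinoma and/or to assess the survival risk of lung adenocarcinoma patients. In one embodiment, the product is a next-generation sequencing kit, a real-time fluorescent quantitative PCR detection kit, a gene chip, a protein microarray, an ELISA diagnostic kit or an immunohistochemistry (IHC) kit. In a preferred embodiment, the product is a next-generation sequencing kit or a real-time fluorescent quantitative PCR detection kit.
In one aspect, the present invention also provides a method for determining the molecular subtype of lung adenocarcinoma and/or the survival risk of a subject, the method comprising: (1) providing a sample from the subject; (2) measuring the expression levels of the genes of the gene panel of the invention in the sample; and (3) determining the molecular subtype of lung adenocarcinoma and/or the survival risk of the subject.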
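Brief Description of the Drawings
Figure 1 shows a heat map of the expression of the genes related to molecular subtyping and survival risk of lung adenocarcinoma (proliferation-related genes, immune-related genes and stroma-related genes) in the LAD1, LAD2, LAD3, LAD4, LAD5 and mixed (Mixed) subtypes.
Figure 2 shows Kaplan-Meier survival curves indicating that survival risk differs between the subtypes of lung adenocarcinoma. The LAD1 subtype has a relatively good 5-year survival rate, the LAD2 and LAD4 subtypes have relatively poor 5-year survival rates, and the LAD3 and LAD5 subtypes have an intermediate prognosis.
Figure 3 shows Kaplan-Meier survival curves indicating that the proliferation index can indicate the prognosis of lung adenocarcinoma. Lung adenocarcinoma cases can be divided by proliferation index into a fast-proliferation group and a slow-proliferation group, with the fast-proliferation group having the lower 5-year survival rate.
Figure 4 shows Kaplan-Meier survival curves indicating that the immune index can indicate the prognosis of lung adenocarcinoma. Lung adenocarcinoma cases can be divided by immune index into a strong-immune-index group and a weak-immune-index group, with the strong-immune-index group having the higher 5-year survival rate.
Figure 5 shows Kaplan-Meier survival curves indicating that the survival risk index of lung adenocarcinoma, calculated from the subtype, the proliferation index and the immune index, can indicate survival risk. The low-risk group (survival risk index 0-35) has a relatively high 5-year survival rate, the intermediate-risk group (survival risk index 36-70) an intermediate 5-year survival rate, and the high-risk group (survival risk index 71-100) a relatively low 5-year survival rate.
Figure 6 shows pie charts of the statistics from molecular subtyping and risk assessment of 21 lung adenocarcinoma samples.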
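Detailed Description
General definitions and terms
The present invention is described in further detail below. It should be understood that the terminology is used for descriptive purposes and does not limit the invention.
Unless otherwise stated, the technical and scientific terms used herein have the same meaning as commonly understood by a person skilled in the art to which the invention belongs. In case of conflict, the definitions provided in this application prevail. Experimental methods for which no specific conditions are given herein can generally be carried out under conventional conditions, for example those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, 4th ed, Cold Spring Harbor, N.Y., 2012, or under the conditions recommended by the manufacturer.
When an amount, concentration or other value or parameter is expressed as a range, a preferred range, or a preferred upper limit and a preferred lower limit, this is to be understood as specifically disclosing every range formed by pairing any upper range limit or preferred value with any lower range limit or preferred value, regardless of whether that range is separately disclosed. Unless otherwise stated, numerical ranges listed herein are intended to include the endpoints of the range and all integers and fractions (decimals) within the range.
The terms "about" and "approximately", when used with a numerical variable, generally mean that the value of the variable and all values of the variable lie within experimental error (for example within the 95% confidence interval of the mean), or within ±10% of the specified value, or within a wider range.
The term "optional" or "optionally present" means that the event or circumstance described subsequently may or may not occur, and that the description covers both the case in which it occurs and the case in which it does not.
The expression "comprising" and the synonymous expressions "including", "containing" and "having" are open-ended and do not exclude additional unrecited elements, steps or ingredients. The expression "consisting of" excludes any element, step or ingredient not specified. The expression "consisting essentially of" limits the scope to the specified elements, steps or ingredients, plus optionally present elements, steps or ingredients that do not materially affect the basic and novel characteristics of the claimed subject matter. It should be understood that the expression "comprising" encompasses the expressions "consisting essentially of" and "consisting of".
The expression "at least one" or "one or more" can mean 1, 2, 3, 4, 5, 6, 7, 8, 9 or more.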
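The detection of gene expression levels described herein can be achieved, for example, by detecting a target nucleic acid (e.g. an RNA transcript), or by detecting the amount of a target polypeptide (e.g. the encoded protein), for example by measuring protein expression levels with proteomic methods. The amount of a target polypeptide, for example the polypeptide, protein or protein fragment encoded by a target gene, can be normalized to the amount of total protein in the sample or to the amount of the polypeptide encoded by a reference gene. The amount of a target nucleic acid, for example the DNA of a target gene, its RNA transcript or the cDNA complementary to the RNA transcript, can be normalized to the amount of total DNA, total RNA or total cDNA in the sample, or to the amount of the DNA, RNA transcripts or complementary cDNA of a set of reference genes.
As used herein, the term "polypeptide" refers to a compound composed of amino acids joined by peptide bonds, and includes the full-length polypeptide or amino acid fragments thereof. "Polypeptide" and "protein" are used interchangeably herein.
The term "nucleotide" includes deoxyribonucleotides and ribonucleotides. The term "nucleic acid" refers to a polymer composed of two or more nucleotides and covers deoxyribonucleic acid (DNA), ribonucleic acid (RNA) and nucleic acid analogs.
The term "RNA transcript" refers to total RNA, that is, coding or non-coding RNA, including RNA obtained directly from a tissue or peripheral blood sample and RNA obtained indirectly from a tissue or blood sample after cell lysis. Total RNA comprises tRNA, mRNA and rRNA, where the mRNA includes mRNA transcribed from target genes as well as mRNA from other, non-target genes. The term "mRNA" can include precursor mRNA and mature mRNA, and can be either full-length mRNA or a fragment thereof. Herein, the RNA used for detection is preferably mRNA, more preferably mature mRNA. The term "cDNA" refers to DNA having a base sequence complementary to RNA. A person skilled in the art can use methods known in the art, for example chemical synthesis or molecular cloning, to obtain from the DNA of a gene its RNA transcript and/or the cDNA complementary to its RNA transcript.
Herein, a target nucleic acid (e.g. an RNA transcript) can be detected and quantified, for example, by hybridization, amplification or sequencing. For example, the RNA transcript is hybridized with a probe or primer to form a complex, and the amount of the target nucleic acid is obtained by measuring the amount of the complex. The term "hybridization" refers to the process by which, under suitable conditions, two nucleic acid fragments bind through stable and specific hydrogen bonds to form a double-helix complex.
The term "amplification primer" or "primer" refers to a nucleic acid fragment of 5 to 100 nucleotides, preferably of 15 to 30 nucleotides, that is capable of initiating an enzymatic reaction (for example, an enzymatic amplification reaction).
The term "(hybridization) probe" refers to a nucleic acid sequence (which can be DNA or RNA) of at least 5 nucleotides, for example 5 to 100 nucleotides, that can hybridize under specified conditions with a target nucleic acid (e.g. the RNA transcript of a target gene, an amplification product of the RNA transcript, or the cDNA complementary to the RNA transcript) to form a complex. A hybridization probe can also carry a label for detection. The term "TaqMan probe" refers to a probe based on TaqMan technology that carries a fluorophore at its 5' end, for example FAM, TET, HEX, NED, VIC or Cy5, and a fluorescent quencher (e.g. a TAMRA or BHQ group) or a non-fluorescent quencher (TaqMan MGB probe) at its 3' end, and that has a nucleotide sequence capable of hybridizing with the target nucleic acid. When used in real-time fluorescent quantitative PCR (RT-PCR), it reports the amount of nucleic acid with which it forms a complex.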
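The term "reference gene" or "internal reference gene" herein refers to a gene that can serve as a reference for correcting and normalizing the expression level of a target gene. Inclusion criteria that may be considered for a reference gene are: (1) it is stably expressed in tissue, and its expression level is unaffected or only slightly affected by pathological conditions or drug treatment; (2) its expression level is not too high, so that it does not account for too large a share of the acquired expression data (for example data obtained by next-generation sequencing) and thereby impair the accuracy of detection and interpretation of the data for other genes. Accordingly, reagents that can be used to detect the expression levels of the reference genes of the invention are also within the scope of protection of the invention. Reference genes that can be used in the invention include, but are not limited to, "housekeeping genes". Herein, "reference gene", "internal reference gene" and "housekeeping gene" are used interchangeably.
The term "housekeeping gene" refers to a class of genes whose products are necessary for maintaining the basic life activities of the cell, which are continuously expressed in most or nearly all tissues at every stage of an individual's growth, and whose expression levels are little affected by environmental factors.
As used herein, the term "lung adenocarcinoma" refers to a type of lung cancer that belongs to non-small cell lung cancer and originates in the bronchial mucosal epithelium, with a minority of cases originating in the mucous glands of the large bronchi. Its incidence is lower than that of squamous cell carcinoma and undifferentiated carcinoma, its age of onset is younger, and it is relatively more common in women. Most adenocarcinomas originate in the smaller bronchi and are peripheral lung cancers. Lung adenocarcinoma occurs more readily in young women, in people with a history of smoking and in Asian populations. Herein, lung adenocarcinoma includes, but is not limited to, primary lung adenocarcinoma and metastatic lung adenocarcinoma.
As used herein, the term "molecular subtyping of lung adenocarcinoma" refers to a method of classifying lung adenocarcinoma based on the gene expression profile of lung adenocarcinoma tumor tissue.
As used herein, the term "prognosis" refers to the prediction of the course and outcome of lung adenocarcinoma, including but not limited to the prediction of the survival risk of lung adenocarcinoma. Lung cancer with a lower survival risk has a better prognosis, and lung cancer with a higher survival risk has a poorer prognosis.
"Survival risk assessment" herein refers to assessing the likelihood that, within a specified period counted from randomization, a lung adenocarcinoma patient experiences disease progression or dies of lung adenocarcinoma or related causes. Herein, "disease progression" includes, but is not limited to, an increase in tumor cells, recurrence and metastasis. Herein, the terms "recurrence risk" and "survival risk" are used interchangeably. Herein, survival risk is assessed by calculating a survival risk score (also called the survival risk index).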
Gene panel of the invention
In a general aspect, the present invention provides a gene panel comprising genes related to molecular subtyping and survival risk assessment of lung adenocarcinoma.
The genes of the invention related to molecular subtyping and survival risk assessment of lung adenocarcinoma can include: (1) 69 proliferation-related genes, (2) 73 immune-related genes, and (3) 38 stroma-related genes.
(1) The 69 proliferation-related genes include: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, HMMR, KIF20A, FOXM1, MKI67, KIF14, TK1, HJURP, TPX2, EXO1, KIF11, NEK2, KIF23, CDCA3, CDK1, SPAG5, KIF4A, GTSE1, CDKN3, CDC25C, PRR11, CCNB2, MAD2L1, PKMYT1, CENPE, ASPM, CENPF, BUB1, NDC80, NUSAP1, CEP55, NCAPG, BIRC5, ZWINT, TTK, ESPL1, DEPDC1, MELK, CDC20, CDC6, AURKA, NEIL3, CDT1, KIF2C, KIFC1, NCAPH, KIF18B, AURKB, UBE2C, TYMS, TOP2A, PBK, CDC45, CDCA8, CENPA, MYBL2, SKA1, MCM10, TRIP13, TROAP, POLQ, GINS1 and RAD54L.
(2) The 73 immune-related genes include: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FGL2, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, LCP1, SPIB, CD53, CD3E, SLCO2B1, MS4A6A, CYBB, CD4, SH2D1A, TFEC, LYZ, ITGAM, TLR8, CSF1R, CXCL13, GPNMB, CCR5, HK3, CMKLR1, IL2RG, TYROBP, HCK, ITGB2, LAPTM5, SIGLEC1, AOAH, C3AR1, MSR1, IL2RA, CCL5, ADAMDEC1, LILRB4, CXCL11, FPR3, SELL, CXCL10, UBD, C1QB, PDCD1LG2, C1QA, SLAMF8, VSIG4, CD163, LAIR1, SLAMF7 and MS4A4A.
(3) The 38 stroma-related genes include: LOXL2, SPOCK1, COL1A1, POSTN, ADAM12, COL6A2, COL5A1, COL11A1, COL5A2, COL1A2, MXRA5, THBS2, INHBA, VCAN, ADAMTS12, GREM1, COL3A1, SULF1, ADAMTS2, PRRX1, COL15A1, SPARC, THY1, FAP, DIO2, FN1, COL6A3, FBN1, SYNDIG1, AEBP1, LRRC15, CILP, ISLR, GAS1, COL10A1, ASPN, MMP2 and EPYC.
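In a specific aspect, the present invention provides a gene panel comprising genes related to molecular subtyping and survival risk assessment of lung adenocarcinoma, namely one or more of the 69 proliferation-related genes (1) described above, one or more of the 73 immune-related genes (2), and one or more of the 38 stroma-related genes (3).
In one embodiment, the gene panel comprises 180 genes related to molecular subtyping and survival risk assessment of lung adenocarcinoma (see Table 1), consisting of (1) the 69 proliferation-related genes, (2) the 73 immune-related genes, and (3) the 38 stroma-related genes described above.
In another embodiment, the gene panel comprises 70 genes related to molecular subtyping and survival risk assessment of lung adenocarcinoma (see Table 2), consisting of (1) 23 proliferation-related genes: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, FOXM1, MKI67, KIF14, HJURP, TPX2, NEK2, CDK1, CDKN3, ASPM, CEP55, BIRC5, MELK, CDC20, TYMS, AURKA and TOP2A; (2) 30 immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, SPIB, CD53, CD4 and LYZ; and (3) 17 stroma-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2, COL5A1, COL11A1, COL5A2, COL1A2, MXRA5, THBS2, INHBA, VCAN, ADAMTS12, GREM1, COL3A1 and SULF1.
In yet another embodiment, the gene panel comprises 24 genes related to molecular subtyping and survival risk assessment of lung adenocarcinoma (see Table 3), consisting of (1) 9 proliferation-related genes: PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20, TYMS and TOP2A; (2) 9 immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R and CD4; and (3) 6 stroma-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2 and COL5A1.
In another embodiment, the gene panel comprises 21 genes related to molecular subtyping and survival risk assessment of lung adenocarcinoma (see Table 4), consisting of (1) 8 proliferation-related genes: PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20 and TOP2A; (2) 7 immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7 and IL7R; and (3) 6 stroma-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2 and COL5A1.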
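In a preferred embodiment, the gene panel can further comprise reference genes. Preferably, the reference genes are housekeeping genes. Housekeeping genes that can be used in the invention include, but are not limited to, one or more of the following: GAPDH, GUSB, MRPL19, PSMC4, SF3A1, TFRC, ACTB and RPLP0. In one embodiment, the gene panel of the invention can further comprise at least one (e.g. 1, 2, 3, 4, 5, 6, 7 or 8), preferably at least 3, most preferably 6, of the following reference genes: GAPDH, GUSB, MRPL19, PSMC4, SF3A1, TFRC, ACTB and RPLP0. In a specific embodiment, the reference genes include GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC. In another specific embodiment, the reference genes include GAPDH, GUSB and TFRC. In yet another specific embodiment, the reference genes include ACTB.
In a preferred embodiment, the gene panel of the invention comprises the 180 genes related to molecular subtyping and survival risk assessment described above, together with reference genes. In a specific embodiment, the reference genes include GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC, and the gene panel is as shown in Table 1.
In yet another preferred embodiment, the gene panel of the invention comprises the 70 genes related to molecular subtyping and survival risk assessment described above, together with reference genes. In a specific embodiment, the reference genes include GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC, and the gene panel is as shown in Table 2.
In another preferred embodiment, the gene panel of the invention comprises the 24 genes related to molecular subtyping and survival risk assessment described above, together with reference genes. In a specific embodiment, the reference genes include ACTB, and the gene panel is as shown in Table 3.
In yet another preferred embodiment, the gene panel of the invention comprises the 21 genes related to molecular subtyping and survival risk assessment described above, together with reference genes. In one embodiment, the reference genes include three of GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC. In a specific embodiment, the reference genes include GAPDH, GUSB and TFRC, and the gene panel is as shown in Table 4.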
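The nesting of the smaller panels can be checked mechanically. The following is a minimal sketch, not part of the original disclosure: the dictionary layout and the helper names `panel_size` and `is_subpanel` are illustrative assumptions, while the gene symbols are copied from the 24-gene and 21-gene embodiments above.

```python
# Gene symbols copied from the 24-gene and 21-gene embodiments above.
# The dictionary layout and helper names are illustrative assumptions.
PANEL_24 = {
    "proliferation": ["PLK1", "PRC1", "CCNB1", "MKI67", "TPX2",
                      "MELK", "CDC20", "TYMS", "TOP2A"],
    "immune": ["P2RY13", "CCR2", "PTPRC", "IRF8", "CLEC10A",
               "TLR7", "CCR4", "IL7R", "CD4"],
    "stroma": ["SPOCK1", "COL1A1", "POSTN", "ADAM12", "COL6A2", "COL5A1"],
}
PANEL_21 = {
    "proliferation": ["PLK1", "PRC1", "CCNB1", "MKI67", "TPX2",
                      "MELK", "CDC20", "TOP2A"],
    "immune": ["P2RY13", "CCR2", "PTPRC", "IRF8", "CLEC10A", "TLR7", "IL7R"],
    "stroma": ["SPOCK1", "COL1A1", "POSTN", "ADAM12", "COL6A2", "COL5A1"],
}


def panel_size(panel):
    """Total number of subtyping/risk genes across the three functional groups."""
    return sum(len(genes) for genes in panel.values())


def is_subpanel(small, large):
    """True if every functional group of `small` is contained in the same group of `large`."""
    return all(set(small[group]) <= set(large[group]) for group in small)


def dropped_genes(small, large):
    """Genes present in `large` but absent from `small`, per functional group."""
    return {g: sorted(set(large[g]) - set(small[g])) for g in large}
```

Under these assumptions, the 21-gene panel is the 24-gene panel without TYMS, CCR4 and CD4.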
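Table 1
No. Function Gene
1 Proliferation-related gene PLK1
2 Proliferation-related gene PRC1
3 Proliferation-related gene CCNB1
4 Proliferation-related gene DLGAP5
5 Proliferation-related gene KPNA2
6 Proliferation-related gene CCNA2
7 Proliferation-related gene RRM2
8 Proliferation-related gene HMMR
9 Proliferation-related gene KIF20A
10 Proliferation-related gene FOXM1
11 Proliferation-related gene MKI67
12 Proliferation-related gene KIF14
13 Proliferation-related gene TK1
14 Proliferation-related gene HJURP
15 Proliferation-related gene TPX2
16 Proliferation-related gene EXO1
17 Proliferation-related gene KIF11
18 Proliferation-related gene NEK2
19 Proliferation-related gene KIF23
20 Proliferation-related gene CDCA3
21 Proliferation-related gene CDK1
22 Proliferation-related gene SPAG5
23 Proliferation-related gene KIF4A
24 Proliferation-related gene GTSE1
25 Proliferation-related gene CDKN3
26 Proliferation-related gene CDC25C
27 Proliferation-related gene PRR11
28 Proliferation-related gene CCNB2
29 Proliferation-related gene MAD2L1
30 Proliferation-related gene PKMYT1
31 Proliferation-related gene CENPE
32 Proliferation-related gene ASPM
33 Proliferation-related gene CENPF
34 Proliferation-related gene BUB1
35 Proliferation-related gene NDC80
36 Proliferation-related gene NUSAP1
37 Proliferation-related gene CEP55
38 Proliferation-related gene NCAPG
39 Proliferation-related gene BIRC5
40 Proliferation-related gene ZWINT
41 Proliferation-related gene TTK
42 Proliferation-related gene ESPL1
43 Proliferation-related gene DEPDC1
44 Proliferation-related gene MELK
45 Proliferation-related gene CDC20
46 Proliferation-related gene CDC6
47 Proliferation-related gene AURKA
48 Proliferation-related gene NEIL3
49 Proliferation-related gene CDT1
50 Proliferation-related gene KIF2C
51 Proliferation-related gene KIFC1
52 Proliferation-related gene NCAPH
53 Proliferation-related gene KIF18B
54 Proliferation-related gene AURKB
55 Proliferation-related gene UBE2C
56 Proliferation-related gene TOP2A
57 Proliferation-related gene TYMS
58 Proliferation-related gene PBK
59 Proliferation-related gene CDC45
60 Proliferation-related gene CDCA8
61 Proliferation-related gene CENPA
62 Proliferation-related gene MYBL2
63 Proliferation-related gene SKA1
64 Proliferation-related gene MCM10
65 Proliferation-related gene TRIP13
66 Proliferation-related gene TROAP
67 Proliferation-related gene POLQ
68 Proliferation-related gene GINS1
69 Proliferation-related gene RAD54L
70 Immune-related gene P2RY13
71 Immune-related gene CCR2
72 Immune-related gene PTPRC
73 Immune-related gene IRF8
74 Immune-related gene CLEC10A
75 Immune-related gene TLR7
76 Immune-related gene CCR4
77 Immune-related gene IL7R
78 Immune-related gene SPN
79 Immune-related gene SASH3
80 Immune-related gene CSF2RB
81 Immune-related gene CD37
82 Immune-related gene IKZF1
83 Immune-related gene CD48
84 Immune-related gene IL10RA
85 Immune-related gene EVI2B
86 Immune-related gene IGSF6
87 Immune-related gene CD52
88 Immune-related gene DOCK2
89 Immune-related gene CD84
90 Immune-related gene FGL2
91 Immune-related gene FOLR2
92 Immune-related gene NCKAP1L
93 Immune-related gene TRAC
94 Immune-related gene MNDA
95 Immune-related gene MRC1
96 Immune-related gene PLEK
97 Immune-related gene LCP1
98 Immune-related gene SPIB
99 Immune-related gene CD53
100 Immune-related gene CD3E
101 Immune-related gene SLCO2B1
102 Immune-related gene MS4A6A
103 Immune-related gene CYBB
104 Immune-related gene CD4
105 Immune-related gene SH2D1A
106 Immune-related gene TFEC
107 Immune-related gene LYZ
108 Immune-related gene ITGAM
109 Immune-related gene TLR8
110 Immune-related gene CSF1R
111 Immune-related gene CXCL13
112 Immune-related gene GPNMB
113 Immune-related gene CCR5
114 Immune-related gene HK3
115 Immune-related gene CMKLR1
116 Immune-related gene IL2RG
117 Immune-related gene TYROBP
118 Immune-related gene HCK
119 Immune-related gene ITGB2
120 Immune-related gene LAPTM5
121 Immune-related gene SIGLEC1
122 Immune-related gene AOAH
123 Immune-related gene C3AR1
124 Immune-related gene MSR1
125 Immune-related gene IL2RA
126 Immune-related gene CCL5
127 Immune-related gene ADAMDEC1
128 Immune-related gene LILRB4
129 Immune-related gene CXCL11
130 Immune-related gene FPR3
131 Immune-related gene SELL
132 Immune-related gene CXCL10
133 Immune-related gene UBD
134 Immune-related gene C1QB
135 Immune-related gene PDCD1LG2
136 Immune-related gene C1QA
137 Immune-related gene SLAMF8
138 Immune-related gene VSIG4
139 Immune-related gene CD163
140 Immune-related gene LAIR1
141 Immune-related gene SLAMF7
142 Immune-related gene MS4A4A
143 Stroma-related gene LOXL2
144 Stroma-related gene SPOCK1
145 Stroma-related gene COL1A1
146 Stroma-related gene POSTN
147 Stroma-related gene ADAM12
148 Stroma-related gene COL6A2
149 Stroma-related gene COL5A1
150 Stroma-related gene COL11A1
151 Stroma-related gene COL5A2
152 Stroma-related gene COL1A2
153 Stroma-related gene MXRA5
154 Stroma-related gene THBS2
155 Stroma-related gene INHBA
156 Stroma-related gene VCAN
157 Stroma-related gene ADAMTS12
158 Stroma-related gene GREM1
159 Stroma-related gene COL3A1
160 Stroma-related gene SULF1
161 Stroma-related gene ADAMTS2
162 Stroma-related gene PRRX1
163 Stroma-related gene COL15A1
164 Stroma-related gene SPARC
165 Stroma-related gene THY1
166 Stroma-related gene FAP
167 Stroma-related gene DIO2
168 Stroma-related gene FN1
169 Stroma-related gene COL6A3
170 Stroma-related gene FBN1
171 Stroma-related gene SYNDIG1
172 Stroma-related gene AEBP1
173 Stroma-related gene LRRC15
174 Stroma-related gene CILP
175 Stroma-related gene ISLR
176 Stroma-related gene GAS1
177 Stroma-related gene COL10A1
178 Stroma-related gene ASPN
179 Stroma-related gene MMP2
180 Stroma-related gene EPYC
181 Housekeeping gene GAPDH
182 Housekeeping gene GUSB
183 Housekeeping gene MRPL19
184 Housekeeping gene PSMC4
185 Housekeeping gene SF3A1
186 Housekeeping gene TFRC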
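Table 2
No. Function Gene
1 Proliferation-related gene PLK1
2 Proliferation-related gene PRC1
3 Proliferation-related gene CCNB1
4 Proliferation-related gene DLGAP5
5 Proliferation-related gene KPNA2
6 Proliferation-related gene CCNA2
7 Proliferation-related gene RRM2
8 Proliferation-related gene FOXM1
9 Proliferation-related gene MKI67
10 Proliferation-related gene KIF14
11 Proliferation-related gene HJURP
12 Proliferation-related gene TPX2
13 Proliferation-related gene NEK2
14 Proliferation-related gene CDK1
15 Proliferation-related gene CDKN3
16 Proliferation-related gene ASPM
17 Proliferation-related gene CEP55
18 Proliferation-related gene BIRC5
19 Proliferation-related gene MELK
20 Proliferation-related gene CDC20
21 Proliferation-related gene TYMS
22 Proliferation-related gene AURKA
23 Proliferation-related gene TOP2A
24 Immune-related gene P2RY13
25 Immune-related gene CCR2
26 Immune-related gene PTPRC
27 Immune-related gene IRF8
28 Immune-related gene CLEC10A
29 Immune-related gene TLR7
30 Immune-related gene CCR4
31 Immune-related gene IL7R
32 Immune-related gene SPN
33 Immune-related gene SASH3
34 Immune-related gene CSF2RB
35 Immune-related gene CD37
36 Immune-related gene IKZF1
37 Immune-related gene CD48
38 Immune-related gene IL10RA
39 Immune-related gene EVI2B
40 Immune-related gene IGSF6
41 Immune-related gene CD52
42 Immune-related gene DOCK2
43 Immune-related gene CD84
44 Immune-related gene FOLR2
45 Immune-related gene NCKAP1L
46 Immune-related gene TRAC
47 Immune-related gene MNDA
48 Immune-related gene MRC1
49 Immune-related gene PLEK
50 Immune-related gene SPIB
51 Immune-related gene CD53
52 Immune-related gene CD4
53 Immune-related gene LYZ
54 Stroma-related gene SPOCK1
55 Stroma-related gene COL1A1
56 Stroma-related gene POSTN
57 Stroma-related gene ADAM12
58 Stroma-related gene COL6A2
59 Stroma-related gene COL5A1
60 Stroma-related gene COL11A1
61 Stroma-related gene COL5A2
62 Stroma-related gene COL1A2
63 Stroma-related gene MXRA5
64 Stroma-related gene THBS2
65 Stroma-related gene INHBA
66 Stroma-related gene VCAN
67 Stroma-related gene ADAMTS12
68 Stroma-related gene GREM1
69 Stroma-related gene COL3A1
70 Stroma-related gene SULF1
71 Housekeeping gene GAPDH
72 Housekeeping gene GUSB
73 Housekeeping gene MRPL19
74 Housekeeping gene PSMC4
75 Housekeeping gene SF3A1
76 Housekeeping gene TFRC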
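Table 3
No. Function Gene
1 Proliferation-related gene PLK1
2 Proliferation-related gene PRC1
3 Proliferation-related gene CCNB1
4 Proliferation-related gene MKI67
5 Proliferation-related gene TPX2
6 Proliferation-related gene MELK
7 Proliferation-related gene CDC20
8 Proliferation-related gene TYMS
9 Proliferation-related gene TOP2A
10 Immune-related gene P2RY13
11 Immune-related gene CCR2
12 Immune-related gene PTPRC
13 Immune-related gene IRF8
14 Immune-related gene CLEC10A
15 Immune-related gene TLR7
16 Immune-related gene CCR4
17 Immune-related gene IL7R
18 Immune-related gene CD4
19 Stroma-related gene SPOCK1
20 Stroma-related gene COL1A1
21 Stroma-related gene POSTN
22 Stroma-related gene ADAM12
23 Stroma-related gene COL6A2
24 Stroma-related gene COL5A1
25 Housekeeping gene ACTB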
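Table 4
No. Function Gene
1 Proliferation-related gene PLK1
2 Proliferation-related gene PRC1
3 Proliferation-related gene CCNB1
4 Proliferation-related gene MKI67
5 Proliferation-related gene TPX2
6 Proliferation-related gene MELK
7 Proliferation-related gene CDC20
8 Proliferation-related gene TOP2A
9 Immune-related gene P2RY13
10 Immune-related gene CCR2
11 Immune-related gene PTPRC
12 Immune-related gene IRF8
13 Immune-related gene CLEC10A
14 Immune-related gene TLR7
15 Immune-related gene IL7R
16 Stroma-related gene SPOCK1
17 Stroma-related gene COL1A1
18 Stroma-related gene POSTN
19 Stroma-related gene ADAM12
20 Stroma-related gene COL6A2
21 Stroma-related gene COL5A1
22 Housekeeping gene GAPDH
23 Housekeeping gene GUSB
24 Housekeeping gene TFRC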
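In a specific embodiment, the gene panel of the invention can be used to determine the molecular subtype (subtyping) of lung adenocarcinoma and/or to assess the survival risk of lung adenocarcinoma patients.
The molecular subtypes of lung adenocarcinoma can include LAD1, LAD2, LAD3, LAD4, LAD5 and the mixed type. Survival risk can include low risk, intermediate risk and high risk.
A person skilled in the art will understand that the gene panel of the invention is not limited to the combinations listed above. In view of the present disclosure, a person skilled in the art will be able to combine the genes related to molecular subtyping and survival risk assessment with the reference genes of the invention to obtain gene panels containing different combinations of genes, and such gene panels are also within the scope of protection of the invention.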
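Reagents and diagnostic products of the invention
In a further aspect, the present invention relates to reagents for detecting the expression levels of the genes in the gene panel of the invention and to their use in the preparation of detection/diagnostic products. The gene panel is as described above.
The reagent or the detection/diagnostic product can be used to determine the molecular subtype of lung adenocarcinoma and/or to assess the survival risk of lung adenocarcinoma patients. A person skilled in the art will understand that the options for a reagent or product each correspond to genes in the gene panel of the invention. As an example, when several options are listed, such as the primers of SEQ ID NO. 153-202 or the probes of SEQ ID NO. 203-227, this does not mean that the reagent or product of the invention must contain all of these primers or probes; it means that the reagent or product contains those primers or probes that correspond to the genes it covers.
In a preferred embodiment, the reagent is used to detect the amount of a target nucleic acid (e.g. the DNA, the RNA transcript, or the cDNA complementary to the RNA transcript, of a gene in the gene panel of the invention), preferably the amount of the RNA transcripts, in particular the mRNA, of the genes in the gene panel of the invention, or the amount of cDNA complementary to the mRNA. In one embodiment, the reagent detects the amount of the RNA transcript, in particular the mRNA, of the target genes (that is, the genes in the gene panel of the invention). In a further embodiment, the reagent detects the amount of cDNA complementary to the mRNA.
In a preferred embodiment, the reagent is a probe, a primer or a combination thereof that can hybridize with part of the sequence of the target nucleic acid (e.g. a gene of the gene panel of the invention, its RNA transcript, or the cDNA complementary to the RNA transcript) to form a complex. Preferably, the probes and primers are highly specific for the target nucleic acid. The probes and primers can be artificially synthesized.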
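In one embodiment, the reagent is a primer. In one embodiment, the primers have the sequences shown in SEQ ID NO. 1-152 (see also Table 5). In another embodiment, the primers have the sequences shown in SEQ ID NO. 1-6, 17, 18, 23, 24, 37-40, 45-58, 61, 62, 107-118, 141-144, 151 and 152 (see also Table 6). In a further embodiment, the primers have the sequences shown in SEQ ID NO. 153-202 (see also Table 7). In another embodiment, the primers have the sequences shown in SEQ ID NO. 228-275 (see also Table 8).
In a preferred embodiment, the primers are used for next-generation sequencing, preferably for targeted sequencing. In a specific embodiment, the primers are used for targeted sequencing and have the sequences shown in SEQ ID NO. 1-152 (Table 5). In a specific embodiment, the primers are used for targeted sequencing and have the sequences shown in SEQ ID NO. 1-6, 17, 18, 23, 24, 37-40, 45-58, 61, 62, 107-118, 141-144, 151 and 152 (Table 6).
In another preferred embodiment, the primers are used for quantitative PCR, preferably real-time fluorescent quantitative PCR (RT-PCR), for example SYBR Green RT-PCR based on SYBR Green dye and TaqMan RT-PCR based on TaqMan technology. TaqMan RT-PCR can be, for example, multiplex RT-PCR or singleplex RT-PCR. In one embodiment, the primers are used for SYBR Green RT-PCR and have the sequences shown in SEQ ID NO. 153-202 (see also Table 7) or in SEQ ID NO. 228-275 (see also Table 8). In another embodiment, the primers are used for TaqMan RT-PCR and have the sequences shown in SEQ ID NO. 153-202 (Table 7) or in SEQ ID NO. 228-275 (Table 8). In a specific embodiment, the primers are used for multiplex RT-PCR and have the sequences shown in SEQ ID NO. 153-202 (Table 7). In another specific embodiment, the primers are used for singleplex or multiplex RT-PCR and have the sequences shown in SEQ ID NO. 228-275 (Table 8).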
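The SEQ ID NO ranges quoted above can be expanded and counted as a quick consistency check. This is a minimal sketch with a hypothetical helper, `expand_seq_ids`; the assumption that each gene has one forward and one reverse primer is an inference from the counts, not a statement taken from the sequence tables.

```python
def expand_seq_ids(spec):
    """Expand a SEQ ID NO specification such as "1-6,17,18" into a sorted list of integers."""
    ids = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = (int(x) for x in part.split("-"))
            ids.update(range(first, last + 1))  # ranges are inclusive
        else:
            ids.add(int(part))
    return sorted(ids)


# Range strings copied from the primer embodiments above.
TABLE_5_PRIMERS = expand_seq_ids("1-152")
TABLE_6_PRIMERS = expand_seq_ids(
    "1-6,17,18,23,24,37-40,45-58,61,62,107-118,141-144,151,152"
)
TABLE_7_PRIMERS = expand_seq_ids("153-202")
TABLE_8_PRIMERS = expand_seq_ids("228-275")
```

With two primers per gene, the counts (152, 48, 50 and 48) would correspond to the 76-gene, 24-gene, 25-gene and 24-gene panels of Tables 2, 4, 3 and 4 respectively, and the Table 6 identifiers form a subset of the Table 5 identifiers.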
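In one embodiment, the primers are used to prepare a detection/diagnostic product, the product being a next-generation sequencing kit based on targeted sequencing or a real-time fluorescent quantitative PCR kit.
In another embodiment, the reagent is a probe, including but not limited to probes for detection by RT-PCR, in situ hybridization (ISH), Southern or Northern blotting, gene chip technology and the like.
In one embodiment, the probe can be used for in situ hybridization. Probes for in situ hybridization can be, for example, probes for dual-color silver in situ hybridization (DISH), DNA fluorescence in situ hybridization (DNA-FISH), RNA fluorescence in situ hybridization (RNA-FISH), chromogenic in situ hybridization (CISH) and the like. The probe can carry a label, which can be a fluorophore (e.g. Alexa Fluor dyes, FITC, Texas Red, Cy3, Cy5), biotin, digoxigenin and the like. In another embodiment, the probe can be used for gene chip detection and can also carry a label, which can be a fluorophore. In a specific embodiment, the probe can be used to prepare a detection/diagnostic product, the product being a gene chip.
In a preferred embodiment, the probe is used for RT-PCR. In one embodiment, the probe is used for TaqMan RT-PCR. In one embodiment, the probe is a TaqMan probe. In one embodiment, the probes have the sequences shown in SEQ ID NO. 203-227 (see also Table 7). In a specific embodiment, the probes are TaqMan probes having the sequences shown in SEQ ID NO. 203-227. In a further embodiment, the probes have the sequences shown in SEQ ID NO. 276-299 (see also Table 8). In a specific embodiment, the probes are TaqMan probes having the sequences shown in SEQ ID NO. 276-299.
In one embodiment, the probes can be used to prepare a detection/diagnostic product, the product being a real-time fluorescent quantitative PCR detection kit.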
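In a further embodiment, the reagent is a combination of primers and probes. Preferably, the probes are TaqMan probes. In one embodiment, the combination of primers and probes is used for RT-PCR, for example singleplex or multiplex RT-PCR. In one embodiment, the primers have the sequences shown in SEQ ID NO. 153-202. In a further embodiment, the primers have the sequences shown in SEQ ID NO. 228-275. In one embodiment, the probes have the sequences shown in SEQ ID NO. 203-227. In a further embodiment, the probes have the sequences shown in SEQ ID NO. 276-299. In a specific embodiment, the primers have the sequences shown in SEQ ID NO. 153-202 and the probes are TaqMan probes having the sequences shown in SEQ ID NO. 203-227 (see also Table 7). In another specific embodiment, the primers have the sequences shown in SEQ ID NO. 228-275 and the probes are TaqMan probes having the sequences shown in SEQ ID NO. 276-299 (see also Table 8).
In one embodiment, the probes and primers can be used to prepare a diagnostic product, the diagnostic product being a real-time fluorescent quantitative PCR detection kit, for example a multiplex or singleplex real-time fluorescent quantitative PCR detection kit.
In an alternative embodiment, the reagent is used to detect the amount of the polypeptides encoded by the target genes (the genes in the gene panel of the invention). Preferably, the reagent is an antibody, an antibody fragment or an affinity protein that can bind specifically to the polypeptide encoded by the target gene. More preferably, the reagent is an antibody or antibody fragment that can bind specifically to the polypeptide encoded by the target gene. The antibody, antibody fragment or affinity protein can also carry a label for detection, for example an enzyme (e.g. horseradish peroxidase), a radioisotope, a fluorescent label (e.g. Alexa Fluor dyes, FITC, Texas Red, Cy3, Cy5), a chemiluminescent substance (e.g. luminol), biotin or a quantum dot label (Qdot). Thus, in a preferred embodiment, the reagent is an antibody or antibody fragment that can bind specifically to the polypeptide encoded by the target gene and that optionally carries a label for detection selected from enzymes, radioisotopes, fluorescent labels, chemiluminescent substances, biotin and quantum dot labels. In one embodiment, the reagent is used to prepare a detection/diagnostic product, the product being a protein chip (e.g. a protein microarray), an ELISA diagnostic kit or an immunohistochemistry (IHC) kit.
Thus, in another aspect, the present invention provides a product that can be used to determine the molecular subtype of lung adenocarcinoma and/or to assess the survival risk of lung adenocarcinoma patients. The product comprises the reagents of the invention. The product can be a next-generation sequencing kit based on targeted sequencing, a real-time fluorescent quantitative PCR kit, a gene chip, a protein chip, an ELISA diagnostic kit, an immunohistochemistry (IHC) kit, or a combination thereof.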
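In one embodiment, the product is a diagnostic product based on next-generation sequencing (NGS). In a specific embodiment, the product comprises reagents for detecting the expression levels of the genes of the gene panel of the invention. In one embodiment, the gene panel comprises 186 genes, namely the 180 genes related to molecular subtyping and survival risk assessment described above and 6 housekeeping genes (see also Table 1). In one embodiment, the gene panel comprises 76 genes, namely the 70 genes related to molecular subtyping and survival risk assessment described above and 6 housekeeping genes (see also Table 2). In another embodiment, the gene panel of the invention comprises 25 genes, namely the 24 genes related to molecular subtyping and survival risk assessment described above and 1 housekeeping gene (see also Table 3). In another embodiment, the gene panel of the invention comprises 24 genes, namely the 21 genes related to molecular subtyping and survival risk assessment described above and 3 housekeeping genes, the 3 housekeeping genes being three of GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC. In a further embodiment, the gene panel of the invention comprises 24 genes, namely the 21 genes related to molecular subtyping and survival risk assessment described above and 3 housekeeping genes (see also Table 4). In a specific embodiment, the NGS-based diagnostic product comprises primers having the sequences shown in SEQ ID NO. 1-152 (see also Table 5). In a further specific embodiment, the NGS-based diagnostic product comprises primers having the sequences shown in SEQ ID NO. 1-6, 17, 18, 23, 24, 37-40, 45-58, 61, 62, 107-118, 141-144, 151 and 152 (see also Table 6).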
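In a further embodiment, the diagnostic product is based on fluorescent quantitative PCR, preferably real-time fluorescent quantitative PCR (RT-PCR), for example SYBR Green RT-PCR or TaqMan RT-PCR. TaqMan RT-PCR can be, for example, multiplex RT-PCR or singleplex RT-PCR. In one embodiment, the diagnostic product comprises reagents for detecting the expression levels of the genes of the gene panel of the invention. In one embodiment, the gene panel comprises 186 genes, namely the 180 genes related to molecular subtyping and survival risk assessment described above and 6 housekeeping genes (see also Table 1). In one embodiment, the gene panel comprises 76 genes, namely the 70 genes related to molecular subtyping and survival risk assessment described above and 6 housekeeping genes (see also Table 2). In another embodiment, the gene panel comprises 25 genes, namely the 24 genes related to molecular subtyping and survival risk assessment described above and 1 housekeeping gene (see also Table 3). In another embodiment, the gene panel comprises 24 genes, namely the 21 genes related to molecular subtyping and survival risk assessment described above and 3 housekeeping genes (see also Table 4). In a specific embodiment, the diagnostic product based on fluorescent quantitative PCR comprises primers having the sequences shown in SEQ ID NO. 153-202. In another specific embodiment, it comprises TaqMan probes having the sequences shown in SEQ ID NO. 203-227. In a preferred embodiment, it comprises primers having the sequences shown in SEQ ID NO. 153-202 and TaqMan probes having the sequences shown in SEQ ID NO. 203-227 (see also Table 7). In a specific embodiment, the diagnostic product based on fluorescent quantitative PCR comprises primers having the sequences shown in SEQ ID NO. 228-275. In another specific embodiment, it comprises TaqMan probes having the sequences shown in SEQ ID NO. 276-299. In a preferred embodiment, it comprises primers having the sequences shown in SEQ ID NO. 228-275 and TaqMan probes having the sequences shown in SEQ ID NO. 276-299 (see also Table 8).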
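In one embodiment, the product is an in vitro diagnostic product. In a specific embodiment, the product is a diagnostic kit.
In one embodiment, the product is used to determine the subtype of lung adenocarcinoma and/or to assess the survival risk of lung adenocarcinoma patients.
In a preferred embodiment, the product further comprises total RNA extraction reagents, reverse transcription reagents, next-generation sequencing reagents and/or quantitative PCR reagents.
The total RNA extraction reagents can be total RNA extraction reagents conventional in the art. Examples include, but are not limited to, RNA storm CD201, Qiagen 73504, Invitrogen and ABI AM1975.
The reverse transcription reagents can be reverse transcription reagents conventional in the art, and preferably comprise a dNTP solution and/or an RNA reverse transcriptase. Examples of reverse transcription reagents include, but are not limited to, NEB M0368L, Thermo K1622 and ABI 4366596.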
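The next-generation sequencing reagents can be reagents conventionally used in the art, provided that they meet the requirements for next-generation sequencing of the resulting sequences. The next-generation sequencing reagents can be commercially available products; examples include, but are not limited to, Illumina's [trademark image PCTCN2020111702-appb-000001] Reagent Kit v3 (150 cycle) (MS-102-3001) and [trademark image PCTCN2020111702-appb-000002] Targeted RNA Index Kit A-96 Indices (384 Samples) (RT-402-1001). The next-generation sequencing is next-generation sequencing conventional in the art, for example targeted RNA-seq. The next-generation sequencing reagents can therefore also include reagents custom-made by Illumina for constructing targeted RNA-seq libraries, for example the [trademark image PCTCN2020111702-appb-000003] Targeted RNA Custom Panel Kit (96 Samples) (RT-102-1001).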
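The quantitative PCR reagents are reagents conventionally used in the art, provided that they meet the requirements for quantitative PCR of the resulting sequences. The quantitative PCR reagents can be commercially available. The quantitative PCR technique is a quantitative PCR technique conventional in the art, preferably real-time fluorescent quantitative PCR, for example SYBR Green RT-PCR or TaqMan RT-PCR. The PCR reagents preferably also include reagents for constructing a library for quantitative PCR. Preferably, the quantitative PCR reagents can also include real-time fluorescent quantitative PCR reagents, for example reagents for SYBR Green RT-PCR (e.g. a SYBR Green premix such as SYBR Green PCR Master Mix) and reagents for TaqMan RT-PCR (e.g. TaqMan RT-PCR Master Mix). A person skilled in the art can select suitable quantitative PCR reagents according to the quantitative PCR technique used. The detection platform for quantitative PCR can be an ABI 7500 real-time fluorescent quantitative PCR instrument, a Roche [trademark image PCTCN2020111702-appb-000004] 480II real-time fluorescent quantitative PCR instrument, or any other PCR instrument capable of real-time fluorescent quantitative detection.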
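In a specific embodiment, the product is a next-generation sequencing kit based on targeted RNA-seq that comprises primers having the sequences shown in Table 5 or Table 6 and, optionally, one or more selected from total RNA extraction reagents, reverse transcription reagents and next-generation sequencing reagents. Preferably, the next-generation sequencing reagents are reagents custom-made by Illumina for constructing targeted RNA-seq libraries.
In a further specific embodiment, the product is a SYBR Green RT-PCR kit that comprises primers having the sequences shown in Table 7 or Table 8 and, optionally, one or more selected from total RNA extraction reagents, reverse transcription reagents and reagents for SYBR Green RT-PCR.
In another specific embodiment, the product is a TaqMan RT-PCR detection kit that comprises primers and probes having the sequences shown in Table 7, or primers and probes having the sequences shown in Table 8, and, optionally, one or more selected from total RNA extraction reagents, reverse transcription reagents and reagents for TaqMan RT-PCR.
The diagnostic product of the invention (preferably in the form of a kit) preferably also comprises a device for collecting a test sample from the subject, for example a device for collecting tissue or blood from the subject, preferably any blood collection needle, syringe or the like that can be used to draw blood. The subject is a mammal, preferably a human, in particular a patient with lung adenocarcinoma.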
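Methods and uses of the invention
In a further aspect, the present invention also relates to a method for determining the molecular subtype of lung adenocarcinoma and/or the survival risk of a subject, the method comprising
(1) providing a sample from the subject,
(2) measuring the expression levels of the genes of the gene panel of the invention in the sample,
(3) determining the molecular subtype of lung adenocarcinoma and/or the survival risk of the subject.
The method of the invention can be used for diagnostic or non-diagnostic purposes.
The subject of the method of the invention is a mammal, preferably a human, in particular a patient with lung adenocarcinoma.
The sample used in step (1) is not particularly limited, provided that the expression levels of the genes in the gene panel can be obtained from it; for example, total RNA, total protein or the like of the subject, preferably total RNA, can be extracted from the sample. The sample is preferably a sample of tissue, blood, plasma, body fluid or a combination thereof, preferably a tissue sample, in particular a paraffin-embedded tissue sample. In a preferred embodiment, the sample is a tumor tissue sample or a tissue sample containing tumor cells. In a preferred embodiment, the sample is tissue with a high tumor cell content.
Step (2) can be carried out using methods known in the art for measuring gene expression levels. A person skilled in the art can select the type and amount of sample in step (1) as required, and select conventional techniques in the art to perform the measurement of step (2). Preferably, the expression levels of the target genes (e.g. the genes of the invention related to molecular subtyping and survival risk assessment) are normalized to the expression levels of the reference genes. Methods for normalizing gene expression levels are well known to a person skilled in the art.
In one embodiment, step (2) can be achieved by detecting the amount of the polypeptides encoded by the target genes (the genes in the gene panel of the invention). The detection can be achieved with the reagents described above and techniques known in the art, including but not limited to enzyme-linked immunosorbent assay (ELISA), chemiluminescence immunoassay techniques (e.g. immunochemiluminescence assay, chemiluminescence enzyme immunoassay, electrochemiluminescence immunoassay), flow cytometry and immunohistochemistry (IHC).
In a preferred embodiment, step (2) can be achieved by detecting the amount of the target nucleic acid. The detection can be achieved with the reagents described above and techniques known in the art, including but not limited to molecular hybridization, quantitative PCR and nucleic acid sequencing. Molecular hybridization techniques include, but are not limited to, ISH techniques (e.g. DISH, DNA-FISH, RNA-FISH, CISH), Southern or Northern blotting, and gene chip techniques (e.g. microarray chips or microfluidic chips), preferably in situ hybridization. Quantitative PCR techniques include, but are not limited to, semi-quantitative PCR and RT-PCR, preferably RT-PCR, for example SYBR Green RT-PCR or TaqMan RT-PCR. Nucleic acid sequencing techniques include, but are not limited to, Sanger sequencing, next-generation sequencing (NGS), third-generation sequencing and single-cell sequencing, preferably next-generation sequencing, more preferably targeted RNA-seq. More preferably, the detection is carried out using the reagents of the invention.
In a preferred embodiment, in step (2), next-generation sequencing is used to measure the expression levels of the genes in the gene panel of the invention. In one embodiment, the genes of the gene panel are as shown in Table 1, Table 2, Table 3 or Table 4. In one embodiment, the gene panel comprises the 70 genes related to molecular subtyping and survival risk assessment described above and 6 housekeeping genes (see also Table 2). In a further embodiment, the gene panel comprises the 21 genes related to molecular subtyping and survival risk assessment described above and 3 housekeeping genes (see also Table 4).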
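In a specific embodiment, step (2) can comprise:
(2a-1) extracting total RNA from the sample;
(2a-2) converting the total RNA, optionally after purification, into cDNA, and then preparing it into a library suitable for next-generation sequencing;
(2a-3) sequencing the library obtained in step (2a-2), and optionally normalizing the expression levels of the genes related to molecular subtyping and survival risk assessment to the expression levels of the housekeeping genes.
The extraction in step (2a-1) can be carried out by conventional methods in the art, preferably by using a commercially available RNA extraction kit to extract total RNA from fresh frozen tissue or paraffin-embedded tissue of the subject. In a more preferred embodiment, RNA storm CD201 or Qiagen 73504 can be used for the extraction.
In a preferred embodiment, step (2a-2) can comprise the following steps:
(i) reverse transcribing the extracted total RNA to generate cDNA of the genes of interest;
(ii) preparing the resulting cDNA into a library suitable for sequencing.
In a preferred embodiment, in step (2a-2) the primers shown in Table 5 or Table 6 are used to amplify the cDNA in order to prepare a library suitable for sequencing.
Step (2a-3) can be completed by RNA sequencing. The sequencing method can be an RNA-seq method conventional in the art for determining gene expression levels. Next-generation sequencing is preferably performed on Illumina NextSeq/MiSeq/MiniSeq/iSeq series sequencers. The primers in the kit are used to amplify the genes in the gene panel of the invention, and the resulting gene sequences can be subjected to next-generation sequencing as appropriate for the library prepared in step (2a-2). Preferably, the next-generation sequencing is targeted RNA-seq, with paired-end or single-end sequencing performed on an Illumina NextSeq/MiSeq/MiniSeq/iSeq sequencer. This process can be completed automatically by the instrument itself.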
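In step (2), fluorescent quantitative PCR can also be used to measure the expression levels of the genes in the gene panel of the invention. In one embodiment, the genes of the gene panel are as shown in Table 1, Table 2, Table 3 or Table 4. In one embodiment, the gene panel comprises the 24 genes related to molecular subtyping and survival risk assessment described above and 1 housekeeping gene (see also Table 3). In a further embodiment, the gene panel comprises the 21 genes related to molecular subtyping and survival risk assessment described above and 3 housekeeping genes (see also Table 4).
In a specific embodiment, step (2) can comprise:
(2b-1) extracting total RNA from the sample;
(2b-2) reverse transcribing the total RNA of (2b-1) into cDNA;
(2b-3) subjecting the resulting cDNA to real-time fluorescent quantitative PCR (RT-PCR) detection, and optionally normalizing the expression levels of the genes related to molecular subtyping and survival risk assessment to the expression levels of the housekeeping genes.
The extraction in step (2b-1) can be carried out by conventional methods in the art, preferably by using a commercially available RNA extraction kit to extract total RNA from fresh frozen tissue or paraffin-embedded tissue of the subject. In a more preferred embodiment, RNA storm CD201 or Qiagen 73504 can be used for the extraction. The reverse transcription in step (2b-2) can be carried out with a commercially available reverse transcription kit. In a preferred embodiment, the RT-PCR method of step (2b-3) is TaqMan RT-PCR. Preferably, primers and probes can be used to perform RT-PCR detection separately on each of the genes shown in Table 3 or Table 4, the probes being TaqMan probes. Preferably, the sequences of the primers and probes are as shown in Table 7 or Table 8. In one embodiment, the primers and probes shown in Table 7 are used for singleplex or multiplex RT-PCR detection. In another embodiment, the primers and probes shown in Table 8 are used for singleplex or multiplex RT-PCR detection.
In an alternative embodiment, the RT-PCR method of step (2b-3) is SYBR Green RT-PCR, in which primers and a commercially available SYBR Green premix can be used to detect the genes shown in Table 3 or Table 4 separately or simultaneously. Preferably, the sequences of the primers are as shown in SEQ ID NO. 153-202 (see also Table 7) or in SEQ ID NO. 228-275 (see also Table 8).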
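The RT-PCR detection described above can be performed on an ABI 7500 real-time fluorescent quantitative PCR instrument (Applied Biosystems) or a Roche [trademark image PCTCN2020111702-appb-000005] 480II instrument. After the reaction is complete, the Ct value of each gene is recorded; it represents the expression level of that gene.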
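One common way to normalize Ct values to housekeeping genes is the ΔCt approach. The text does not specify which normalization the inventors used, so the following is only a sketch under that assumption; the function name `delta_ct` and the Ct values in the usage note are hypothetical.

```python
from statistics import mean


def delta_ct(ct_values, reference_genes):
    """Normalize Ct values to the mean Ct of the reference genes.

    Returns, for each non-reference gene, (mean reference Ct - gene Ct).
    A lower Ct means more template, so a larger returned value means
    higher relative expression. This ΔCt convention is an assumption,
    not a method stated in the source text.
    """
    reference_ct = mean(ct_values[g] for g in reference_genes)
    return {
        gene: reference_ct - ct
        for gene, ct in ct_values.items()
        if gene not in reference_genes
    }
```

For example, with hypothetical Ct values of 28.0 for PLK1 and 25.0 for CCR2, and 20.0, 24.0 and 22.0 for the reference genes GAPDH, GUSB and TFRC, the mean reference Ct is 22.0 and the normalized values are -6.0 and -3.0.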
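In one embodiment of the invention, step (3) can be completed by statistical analysis of the expression levels, obtained in step (2), of the genes of the gene panel of the invention in the sample. Molecular subtyping and survival risk prediction of lung adenocarcinoma can optionally be carried out according to the Single Sample Predictor (SSP) method pioneered by Hu et al. (see Hu Z, et al., BMC Genomics. 2006, 7:96.) and the method optimized by Parker et al. (see Parker JS, et al., Journal of Clinical Oncology: official journal of the American Society of Clinical Oncology. 2009, 27(8):1160-7.). The gene expression data obtained in step (2) are analyzed to obtain the subtype of the single sample, and the survival risk can be calculated.
In one embodiment, step (3) comprises molecular subtyping of the lung adenocarcinoma, which comprises determining the molecular subtype of the subject's lung adenocarcinoma from the expression level of each gene in the subject's sample obtained in step (2).
The inventors used the EPIG gene expression profile analysis program (see Zhou T, et al., 2006. Environ Health Perspect 114(4), 553-559; Chou JW, et al., 2007. BMC Bioinformatics 8, 427) to analyze the gene expression of 504 lung adenocarcinoma cases with complete clinical information in the TCGA database and obtained the expression profiles of the genes of the invention. Then, based on the expression profiles, hierarchical clustering was used to compare the similarity between the tested genes and group the genes, and to compare the similarity of expression profiles between lung adenocarcinoma samples and group the lung adenocarcinomas into the LAD1, LAD2, LAD3, LAD4, LAD5 and mixed types. The gene expression profiles of the molecular subtypes of lung adenocarcinoma serve as standard test data for molecular subtyping and survival risk assessment of samples.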
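The molecular subtypes of lung adenocarcinoma can include LAD1, LAD2, LAD3, LAD4, LAD5 and the mixed type:
The LAD1 subtype is mainly characterized by low expression of proliferation genes, high expression of immune genes and low expression of stroma-related genes, and has the highest 5-year survival rate;
The LAD2 subtype is mainly characterized by high expression of proliferation genes, low expression of immune genes and intermediate expression of stroma-related genes, and has a low 5-year survival rate;
The LAD3 subtype is mainly characterized by low expression of proliferation genes, intermediate expression of immune genes and high expression of stroma-related genes, and has an intermediate 5-year survival rate;
The LAD4 subtype is mainly characterized by intermediate expression of proliferation genes, low expression of immune genes and high expression of stroma-related genes, and has a low 5-year survival rate;
The LAD5 subtype is mainly characterized by high expression of proliferation genes, intermediate expression of immune genes and low expression of stroma-related genes, and has an intermediate 5-year survival rate;
The mixed type is lung adenocarcinoma that does not belong to the LAD1, LAD2, LAD3, LAD4 or LAD5 type.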
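In one embodiment, step (3) can comprise:
(3-1) based on expression data of the gene panel of the invention in a statistically significant number of lung adenocarcinoma samples (the training set), establishing the expression profiles of the gene panel of the invention in the LAD1, LAD2, LAD3, LAD4 and LAD5 types as standard test data;
(3-2) based on the expression levels of the genes of the gene panel of the invention in the sample obtained in step (2), using Pearson correlation analysis to calculate the correlation coefficients between the expression profile of the gene panel in the sample and the gene expression profiles of the LAD1, LAD2, LAD3, LAD4 and LAD5 types in the standard test data;
(3-3) when the correlation coefficient between the gene expression profile of the sample and that of subtype X (X being selected from LAD1, LAD2, LAD3, LAD4 and LAD5) is the highest and the confidence is greater than or equal to 0.8, the sample can be assigned to subtype X; when the confidence is lower than 0.8, the sample is assigned to the mixed type (Mixed).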
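Steps (3-2) and (3-3) can be sketched as a nearest-centroid assignment. The code below is an illustrative sketch, not the inventors' implementation: it assumes that the "confidence" compared with 0.8 is the Pearson correlation coefficient itself, which the text does not state explicitly, and the function names and centroid values in the usage note are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / (sd_x * sd_y)


def assign_subtype(sample_profile, centroids, threshold=0.8):
    """Assign the subtype whose centroid correlates best with the sample.

    `centroids` maps subtype names (e.g. "LAD1") to reference expression
    profiles with genes in the same order as `sample_profile`.
    Assumption: the best correlation is used as the confidence value;
    below `threshold` the sample is called "Mixed".
    """
    correlations = {name: pearson(sample_profile, profile)
                    for name, profile in centroids.items()}
    best = max(correlations, key=correlations.get)
    subtype = best if correlations[best] >= threshold else "Mixed"
    return subtype, correlations
```

As a toy usage example, with three-value centroids `{"LAD1": [0.0, 2.0, 0.0], "LAD2": [2.0, 0.0, 1.0]}`, the profile `[0.1, 1.9, 0.2]` correlates strongly with the LAD1 centroid and is assigned to LAD1, whereas a flat profile such as `[1.0, 1.0, 1.0]` falls below the threshold and is called Mixed.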
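In a further embodiment, step (3) also comprises determining the survival risk of the subject from the expression levels of the immune-related genes or the proliferation-related genes. Specifically, it can comprise:
(3a) calculating the proliferation index of the subject and determining whether proliferation is fast or slow, and/or
(3b) calculating the immune index of the subject and determining whether immune function is strong or weak.
In one embodiment, step (3a) comprises the following steps:
(3a-1) based on expression data of the proliferation-related genes of the gene panel of the invention in a statistically significant number of lung adenocarcinoma samples (the training set), calculating the weighted average of the expression levels of the proliferation-related genes in the training set and, in combination with survival data, performing survival analysis using statistical software known in the art (e.g. x-tile, SPSS or other analysis software capable of calculating cutoff values, preferably x-tile), and taking the weighted average that best separates the survival curves as the cutoff value;
(3a-2) based on the expression levels of the proliferation-related genes obtained in step (2), calculating the weighted average of the expression levels of the proliferation-related genes in the subject's sample, that is, the proliferation index, and, based on the cutoff value of step (3a-1), classifying the proliferation index as fast (weighted average of the proliferation-related gene expression levels obtained in step (2) > cutoff value) or slow (weighted average of the proliferation-related gene expression levels obtained in step (2) ≤ cutoff value);
(3a-3) assessing survival risk based on the proliferation index obtained in step (3a-2): if the subject's proliferation index is fast, the survival risk is high and the prognosis is relatively poor; if the subject's proliferation index is slow, the survival risk is low and the prognosis is relatively good.
The proliferation index can be calculated by the following formula: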
Figure PCTCN2020111702-appb-000006
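where n is the number of proliferation-related genes used to calculate the proliferation index and is an integer from 1 to 69. In one embodiment, n=69 and the proliferation-related genes include: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, HMMR, KIF20A, FOXM1, MKI67, KIF14, TK1, HJURP, TPX2, EXO1, KIF11, NEK2, KIF23, CDCA3, CDK1, SPAG5, KIF4A, GTSE1, CDKN3, CDC25C, PRR11, CCNB2, MAD2L1, PKMYT1, CENPE, ASPM, CENPF, BUB1, NDC80, NUSAP1, CEP55, NCAPG, BIRC5, ZWINT, TTK, ESPL1, DEPDC1, MELK, CDC20, CDC6, AURKA, NEIL3, CDT1, KIF2C, KIFC1, NCAPH, KIF18B, AURKB, UBE2C, TYMS, TOP2A, PBK, CDC45, CDCA8, CENPA, MYBL2, SKA1, MCM10, TRIP13, TROAP, POLQ, GINS1 and RAD54L (see also the relevant information in Table 1). In one embodiment, n=23 and the proliferation-related genes include: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, FOXM1, MKI67, KIF14, HJURP, TPX2, NEK2, CDK1, CDKN3, ASPM, CEP55, BIRC5, MELK, CDC20, TYMS, AURKA and TOP2A (see also Table 2). In another embodiment, n=9 and the proliferation-related genes include: PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20, TYMS and TOP2A (see also Table 3). In another embodiment, n=8 and the proliferation-related genes include: PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20 and TOP2A (see also Table 4).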
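The index calculation and the fast/slow call of steps (3a-2) and (3a-3) can be sketched as follows. The formula itself is given only as an image in the source, so the per-gene weights (equal by default here), the cutoff value and the function names are assumptions made for illustration.

```python
def weighted_index(expression, weights=None):
    """Weighted average of normalized expression levels over a set of genes.

    `expression` maps gene symbols to normalized expression values.
    `weights` maps gene symbols to weights; equal weights are assumed
    when it is omitted, because the source does not list the weights.
    """
    if weights is None:
        weights = {gene: 1.0 for gene in expression}
    total_weight = sum(weights[gene] for gene in expression)
    return sum(expression[g] * weights[g] for g in expression) / total_weight


def classify_index(index, cutoff, above="fast", at_or_below="slow"):
    """Label an index relative to a cutoff derived from a training set.

    Values strictly above the cutoff get `above`; values at or below it
    get `at_or_below`, mirroring the > / ≤ rule in step (3a-2).
    """
    return above if index > cutoff else at_or_below
```

The same two helpers could serve for the immune index by passing `above="strong"` and `at_or_below="weak"`; in that case a label above the cutoff indicates lower, not higher, survival risk.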
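In one embodiment, step (3b) comprises the following steps:
(3b-1) based on expression data of the immune-related genes of the gene panel of the invention in a statistically significant number of lung adenocarcinoma samples (the training set), calculating the weighted average of the expression levels of the immune-related genes in the training set and, in combination with survival data, performing survival analysis using statistical software known in the art (e.g. x-tile, SPSS or other analysis software capable of calculating cutoff values, preferably x-tile), and taking the weighted average that best separates the survival curves as the cutoff value;
(3b-2) based on the expression levels of the immune-related genes obtained in step (2), calculating the weighted average of the expression levels of the immune-related genes in the subject's sample, that is, the subject's immune index, and, based on the cutoff value of step (3b-1), classifying the immune index as strong (weighted average of the immune-related gene expression levels obtained in step (2) > cutoff value) or weak (weighted average of the immune-related gene expression levels obtained in step (2) ≤ cutoff value);
(3b-3) assessing survival risk based on the immune index obtained in step (3b-2): if the subject's immune index is strong, the subject's immune function is strong, the survival risk is low and the prognosis is relatively good; if the subject's immune index is weak, the subject's immune function is weak, the survival risk is high and the prognosis is relatively poor.
The immune index can be calculated by the following formula: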
Figure PCTCN2020111702-appb-000007
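where n is the number of immune-related genes used to calculate the immune index and is an integer from 1 to 73. In one embodiment, n=73 and the immune-related genes include: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FGL2, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, LCP1, SPIB, CD53, CD3E, SLCO2B1, MS4A6A, CYBB, CD4, SH2D1A, TFEC, LYZ, ITGAM, TLR8, CSF1R, CXCL13, GPNMB, CCR5, HK3, CMKLR1, IL2RG, TYROBP, HCK, ITGB2, LAPTM5, SIGLEC1, AOAH, C3AR1, MSR1, IL2RA, CCL5, ADAMDEC1, LILRB4, CXCL11, FPR3, SELL, CXCL10, UBD, C1QB, PDCD1LG2, C1QA, SLAMF8, VSIG4, CD163, LAIR1, SLAMF7 and MS4A4A (see also the relevant information in Table 1). In another embodiment, n=30 and the immune-related genes include: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, SPIB, CD53, CD4 and LYZ (see also Table 2). In a further embodiment, n=9 and the immune-related genes include: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R and CD4 (see also Table 3). In a further embodiment, n=7 and the immune-related genes include: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7 and IL7R (see also Table 4).
After obtaining the expression level data for the genes in the gene panel of the invention, a person skilled in the art can use techniques known in the art to obtain the weighted average of the expression levels of each group of genes and, in combination with survival data, to obtain the weighted average that best separates the survival curves as the cutoff value.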
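Preferably, step (3) also comprises step (3c), calculating the survival risk of the lung adenocarcinoma patient, which comprises the following steps:
(3c-1) using a Cox model, with the occurrence and timing of disease progression or death as the observation endpoint, determining coefficients from the relative hazard for survival associated with the subject's molecular subtype of lung adenocarcinoma obtained in step (3-3), the proliferation index obtained in step (3a-2) and the immune index obtained in step (3b-2), and calculating the subject's survival risk score (Risk of Death, RD);
(3c-2) determining the subject's survival risk from the survival risk score (also called the survival risk index) calculated in step (3c-1): low risk (survival risk score 0-35), intermediate risk (survival risk score 36-70) or high risk (survival risk score 71-100).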
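The banding in step (3c-2) can be sketched as below. The thresholds are taken from the text; rounding a non-integer score to the nearest integer before banding is an assumption, since the text only gives integer bands, and the function name is hypothetical.

```python
def risk_category(score):
    """Map a survival risk score on the 0-100 scale to a risk band.

    Bands from the text: 0-35 low, 36-70 intermediate, 71-100 high.
    Assumption: non-integer scores are rounded to the nearest integer
    first, because the source only defines integer bands.
    """
    rounded = round(score)
    if not 0 <= rounded <= 100:
        raise ValueError(f"score {score} is outside the 0-100 scale")
    if rounded <= 35:
        return "low"
    if rounded <= 70:
        return "intermediate"
    return "high"
```

How the raw Cox-model output is rescaled to the 0-100 range is not described in this section, so that step is deliberately left out of the sketch.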
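In a specific embodiment, the 70 genes related to molecular subtyping and survival risk of lung adenocarcinoma (see also Table 2) are used in step (3c-1) to calculate the subject's survival risk score:
RD = (-0.18*LAD1) + (0.09*LAD2) + (0.04*LAD3) + (0.17*LAD4) + (-0.17*LAD5) + (-0.05*immune index) + (0.12*proliferation index); where "LAD1", "LAD2", "LAD3", "LAD4" and "LAD5" denote the Pearson correlation coefficients between the tumor and tumors of the LAD1, LAD2, LAD3, LAD4 and LAD5 types, respectively; "immune index" is the immune index calculated from the 30 immune-related genes in Table 2; and "proliferation index" is the proliferation index calculated from the 23 proliferation-related genes in Table 2.
In another specific embodiment, the 24 genes related to molecular subtyping and survival risk of lung adenocarcinoma (see also Table 3) are used in step (3c-1) to calculate the subject's survival risk score:
RD = (-0.12*LAD1) + (0.29*LAD2) + (0.13*LAD3) + (0.18*LAD4) + (-0.09*LAD5) + (-0.55*immune index) + (0.07*proliferation index); where
"LAD1", "LAD2", "LAD3", "LAD4" and "LAD5" denote the Pearson correlation coefficients between the tumor and tumors of the LAD1, LAD2, LAD3, LAD4 and LAD5 types, respectively; "immune index" is the immune index calculated from the 9 immune-related genes in Table 3; and "proliferation index" is the proliferation index calculated from the 9 proliferation-related genes in Table 3.
In a further specific embodiment, the 21 genes related to molecular subtyping and survival risk of lung adenocarcinoma (see also Table 4) are used in step (3c-1) to calculate the subject's survival risk score:
RD = (-0.10*LAD1) + (0.36*LAD2) + (0.14*LAD3) + (0.21*LAD4) + (-0.10*LAD5) + (-0.57*immune index) + (0.07*proliferation index); where
"LAD1", "LAD2", "LAD3", "LAD4" and "LAD5" denote the Pearson correlation coefficients between the tumor and tumors of the LAD1, LAD2, LAD3, LAD4 and LAD5 types, respectively; "immune index" is the immune index calculated from the 7 immune-related genes in Table 4; and "proliferation index" is the proliferation index calculated from the 8 proliferation-related genes in Table 4.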
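The three RD formulas are linear combinations and can be evaluated with one helper. The coefficients below are copied from the formulas above; the dictionary keys, the panel labels and the function name `risk_of_death` are illustrative choices, and any rescaling of RD to the 0-100 survival risk index is not covered because the text does not describe it.

```python
# Coefficients copied from the three RD formulas above.
# Keys and panel labels are illustrative naming choices.
RD_COEFFICIENTS = {
    "70-gene": {"LAD1": -0.18, "LAD2": 0.09, "LAD3": 0.04, "LAD4": 0.17,
                "LAD5": -0.17, "immune": -0.05, "proliferation": 0.12},
    "24-gene": {"LAD1": -0.12, "LAD2": 0.29, "LAD3": 0.13, "LAD4": 0.18,
                "LAD5": -0.09, "immune": -0.55, "proliferation": 0.07},
    "21-gene": {"LAD1": -0.10, "LAD2": 0.36, "LAD3": 0.14, "LAD4": 0.21,
                "LAD5": -0.10, "immune": -0.57, "proliferation": 0.07},
}


def risk_of_death(panel, features):
    """Evaluate the linear RD formula for the chosen panel.

    `features` must provide the Pearson correlations with the five subtype
    profiles ("LAD1".."LAD5") plus the "immune" and "proliferation" indices.
    """
    coefficients = RD_COEFFICIENTS[panel]
    return sum(coefficients[name] * features[name] for name in coefficients)
```

In all three formulas the LAD1 and LAD5 correlations and the immune index carry negative coefficients (lower risk), while the LAD2, LAD3 and LAD4 correlations and the proliferation index carry positive ones.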
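Accordingly, the present invention also provides the use of the gene panel of the invention in molecular subtyping of lung adenocarcinoma and/or in assessing the survival risk of lung adenocarcinoma patients. The invention also provides the use of the gene panel of the invention, and of reagents for detecting the expression levels of the genes in the gene panel of the invention, in the preparation of a product for molecular subtyping of lung adenocarcinoma and/or for assessing the survival risk of lung adenocarcinoma patients. In a preferred embodiment, the product is a detection/diagnostic kit. In one embodiment, the product is an in vitro diagnostic product. The reagents are as described above. The product is as described above. According to the method or use of the invention, lung adenocarcinoma can be divided into different molecular subtypes, which can include LAD1, LAD2, LAD3, LAD4, LAD5 and the mixed type. According to the method or use of the invention, the survival risk of lung adenocarcinoma patients can be assessed, and the survival risk can include low risk, intermediate risk and high risk.
In another aspect, the present invention also relates to a set of immune-related genes, including: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FGL2, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, LCP1, SPIB, CD53, CD3E, SLCO2B1, MS4A6A, CYBB, CD4, SH2D1A, TFEC, LYZ, ITGAM, TLR8, CSF1R, CXCL13, GPNMB, CCR5, HK3, CMKLR1, IL2RG, TYROBP, HCK, ITGB2, LAPTM5, SIGLEC1, AOAH, C3AR1, MSR1, IL2RA, CCL5, ADAMDEC1, LILRB4, CXCL11, FPR3, SELL, CXCL10, UBD, C1QB, PDCD1LG2, C1QA, SLAMF8, VSIG4, CD163, LAIR1, SLAMF7 and MS4A4A (see also the relevant information in Table 1).
In a further aspect, the present invention also relates to a set of proliferation-related genes, including: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, HMMR, KIF20A, FOXM1, MKI67, KIF14, TK1, HJURP, TPX2, EXO1, KIF11, NEK2, KIF23, CDCA3, CDK1, SPAG5, KIF4A, GTSE1, CDKN3, CDC25C, PRR11, CCNB2, MAD2L1, PKMYT1, CENPE, ASPM, CENPF, BUB1, NDC80, NUSAP1, CEP55, NCAPG, BIRC5, ZWINT, TTK, ESPL1, DEPDC1, MELK, CDC20, CDC6, AURKA, NEIL3, CDT1, KIF2C, KIFC1, NCAPH, KIF18B, AURKB, UBE2C, TYMS, TOP2A, PBK, CDC45, CDCA8, CENPA, MYBL2, SKA1, MCM10, TRIP13, TROAP, POLQ, GINS1 and RAD54L (see also the relevant information in Table 1).
The present invention also relates to detecting the expression levels of the immune-related genes or proliferation-related genes described above and calculating the immune index or the proliferation index. The immune index can be used to assess the immune status of lung adenocarcinoma patients and to guide cellular immunotherapy of lung adenocarcinoma; the proliferation index can be used to assess tumor growth and invasion in lung adenocarcinoma patients and to assess their survival risk. The invention therefore also relates to the use of the immune-related genes or the proliferation-related genes in survival risk assessment of lung adenocarcinoma.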
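Exemplary embodiments of the invention are:
1. A gene panel for molecular subtyping of lung adenocarcinoma and/or assessing its survival risk, characterized in that the gene panel comprises 180 genes related to molecular subtyping and survival risk assessment and 6 housekeeping genes, wherein the 180 genes related to molecular subtyping and survival risk assessment include: (1) proliferation-related genes: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, HMMR, KIF20A, FOXM1, MKI67, KIF14, TK1, HJURP, TPX2, EXO1, KIF11, NEK2, KIF23, CDCA3, CDK1, SPAG5, KIF4A, GTSE1, CDKN3, CDC25C, PRR11, CCNB2, MAD2L1, PKMYT1, CENPE, ASPM, CENPF, BUB1, NDC80, NUSAP1, CEP55, NCAPG, BIRC5, ZWINT, TTK, ESPL1, DEPDC1, MELK, CDC20, CDC6, AURKA, NEIL3, CDT1, KIF2C, KIFC1, NCAPH, KIF18B, AURKB, UBE2C, TOP2A, TYMS, PBK, CDC45, CDCA8, CENPA, MYBL2, SKA1, MCM10, TRIP13, TROAP, POLQ, GINS1 and RAD54L; (2) immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FGL2, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, LCP1, SPIB, CD53, CD3E, SLCO2B1, MS4A6A, CYBB, CD4, SH2D1A, TFEC, LYZ, ITGAM, TLR8, CSF1R, CXCL13, GPNMB, CCR5, HK3, CMKLR1, IL2RG, TYROBP, HCK, ITGB2, LAPTM5, SIGLEC1, AOAH, C3AR1, MSR1, IL2RA, CCL5, ADAMDEC1, LILRB4, CXCL11, FPR3, SELL, CXCL10, UBD, C1QB, PDCD1LG2, C1QA, SLAMF8, VSIG4, CD163, LAIR1, SLAMF7 and MS4A4A; (3) stroma-related genes: LOXL2, SPOCK1, COL1A1, POSTN, ADAM12, COL6A2, COL5A1, COL11A1, COL5A2, COL1A2, MXRA5, THBS2, INHBA, VCAN, ADAMTS12, GREM1, COL3A1, SULF1, ADAMTS2, PRRX1, COL15A1, SPARC, THY1, FAP, DIO2, FN1, COL6A3, FBN1, SYNDIG1, AEBP1, LRRC15, CILP, ISLR, GAS1, COL10A1, ASPN, MMP2 and EPYC; and the housekeeping genes include GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC.
2. A gene panel for molecular subtyping of lung adenocarcinoma and/or assessing its survival risk, characterized in that the gene panel comprises 70 genes related to molecular subtyping and survival risk assessment and 6 housekeeping genes, wherein the 70 genes related to molecular subtyping and survival risk assessment include: (1) proliferation-related genes: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, FOXM1, MKI67, KIF14, HJURP, TPX2, NEK2, CDK1, CDKN3, ASPM, CEP55, BIRC5, MELK, CDC20, TYMS, AURKA and TOP2A; (2) immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, SPIB, CD53, CD4 and LYZ; (3) stroma-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2, COL5A1, COL11A1, COL5A2, COL1A2, MXRA5, THBS2, INHBA, VCAN, ADAMTS12, GREM1, COL3A1 and SULF1; and the housekeeping genes include GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC.
3. A gene panel for molecular subtyping of lung adenocarcinoma and/or assessing its survival risk, characterized in that the gene panel comprises 24 genes related to molecular subtyping and survival risk assessment and 1 housekeeping gene, wherein the 24 genes related to molecular subtyping and survival risk assessment include: (1) proliferation-related genes: PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20, TYMS and TOP2A; (2) immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R and CD4; (3) stroma-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2 and COL5A1; and the housekeeping gene includes ACTB.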
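4. Use of the gene panel of any one of items 1-3 in molecular subtyping of lung adenocarcinoma and/or in assessing survival risk.
5. Use of the gene panel of any one of items 1-3 in the preparation of a diagnostic product for molecular subtyping of lung adenocarcinoma and/or for assessing survival risk.
6. Use of reagents for detecting the expression levels of the genes in the gene panel of any one of items 1-3 in the preparation of a diagnostic product for molecular subtyping of lung adenocarcinoma and/or for assessing survival risk.
7. A diagnostic product for molecular subtyping of lung adenocarcinoma and/or for assessing survival risk, comprising reagents for detecting the expression levels of the genes in the gene panel of any one of items 1-3.
8. The use or diagnostic product of item 6 or item 7, characterized in that the diagnostic product is in the form of an in vitro diagnostic product, preferably in the form of a diagnostic kit.
9. The use or diagnostic product of any one of items 6-8, characterized in that the reagent detects the amount of RNA, in particular mRNA, transcribed from the genes.
10. The use or diagnostic product of any one of items 6-9, characterized in that the reagent detects the amount of cDNA complementary to the mRNA.
11. The use or diagnostic product of any one of items 6-10, characterized in that the diagnostic product further comprises total RNA extraction reagents, reverse transcription reagents, next-generation sequencing reagents and/or quantitative PCR reagents.
12. The use or diagnostic product of any one of items 6-11, characterized in that the reagent detects the amount of the polypeptides encoded by the genes; preferably, the reagent is an antibody, an antibody fragment or an affinity protein.
13. The use or diagnostic product of any one of items 6-11, characterized in that the reagent is a probe or a primer, preferably a primer.
14. The use or diagnostic product of item 13, characterized in that the primers are as shown in SEQ ID NO. 1 to SEQ ID NO. 152.
15. A set of primers for assessing the molecular subtype and/or survival risk of lung adenocarcinoma, wherein the sequences of the primers are as shown in SEQ ID NO. 1 to SEQ ID NO. 152.
16. The use or diagnostic product of item 13, characterized in that the primers and probes are as shown in SEQ ID NO. 153 to SEQ ID NO. 227.
17. A set of primers and probes for assessing the molecular subtype and/or survival risk of lung adenocarcinoma, wherein the sequences of the primers and probes are as shown in SEQ ID NO. 153 to SEQ ID NO. 227.
18. Use of the primer and probe set of any one of items 14-17 in the preparation of a product for molecular subtyping of lung adenocarcinoma and/or for assessing survival risk.
19. The gene panel, use, diagnostic product or primer set of any one of items 1-18, characterized in that the lung adenocarcinoma includes the LAD1, LAD2, LAD3, LAD4, LAD5 and mixed types, wherein the mixed type comprises lung adenocarcinoma that does not belong to the LAD1, LAD2, LAD3, LAD4 or LAD5 type.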
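Beneficial effects
The present invention relates to a gene panel for molecular subtyping and/or survival risk assessment of lung adenocarcinoma, to reagents for detecting the expression levels of the genes in the gene panel, and to methods and products for molecular subtyping and/or survival risk assessment of lung adenocarcinoma.
Based on the expression levels of the genes of the gene panel of the invention in lung adenocarcinoma samples, a molecular subtyping system for lung adenocarcinoma is established that divides lung adenocarcinoma into different subtypes and allows more targeted, individualized treatment for patients with lung adenocarcinoma of different subtypes. In addition, the methods and uses of the invention can predict the survival risk of lung adenocarcinoma patients well and effectively assess the proliferation and immune status of the tumor, which is of important guiding significance for clinical treatment. Combining the subtype, the proliferation index, the immune index and the risk score allows a judgment on the prognosis of lung adenocarcinoma patients. Molecular subtyping and risk assessment of lung adenocarcinoma patients can identify the populations that benefit most from different treatment regimens and suggest potential therapeutic approaches. Patients with low survival risk may forgo radiotherapy and chemotherapy, which reduces adverse reactions and the economic burden of treatment; patients with high survival risk should receive timely adjuvant chemotherapy, radiotherapy or biological therapy in order to obtain the greatest clinical benefit. For patients with advanced disease who cannot undergo surgery, molecular diagnosis based on expression profiles can help identify the population likely to benefit from a given treatment regimen, improving treatment efficiency and avoiding ineffective treatment.
Compared with current methods of molecular subtyping of lung adenocarcinoma, the advantage of the invention is that it not only subtypes lung adenocarcinoma but also assesses the patient's immune index, proliferation index and survival risk, giving a comprehensive evaluation of the prognosis of lung adenocarcinoma patients and of their likely benefit from treatment. A further advantage of the invention is that it provides several alternative genes or gene combinations as supplementary embodiments. When the invention is applied to cancer patients and the measurement of the expression level of one or more genes is invalid or fails because of the patient's pathological condition or for other reasons (e.g. abnormal expression of one or more genes), several alternative schemes can be used instead, making the test results based on the invention more stable and reliable.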
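Examples
The invention is further illustrated below by way of examples, which do not limit the invention to the scope of the examples described. Experimental methods for which no specific conditions are given in the following examples were carried out according to conventional methods and conditions, or as selected according to the product instructions. The reagents and instruments used in the examples herein are all commercially available.
Example 1: Screening of gene panels related to lung adenocarcinoma subtyping and survival risk assessment
Method: The EPIG gene expression profile analysis program (see Zhou, Chou et al., 2006, Environ Health Perspect 114(4), 553-559; Chou, Zhou et al., 2007, BMC Bioinformatics 8, 427) was used to analyze the gene expression of 504 lung adenocarcinoma cases with complete clinical information in the TCGA database. Proliferation-related, immune-related and stroma-related genes closely associated with lung adenocarcinoma survival risk were screened out, and within each group the genes contributing most to subtyping and survival risk were calculated and selected.
Results: In total, 180 genes related to lung adenocarcinoma subtyping and survival risk and 6 housekeeping genes were obtained, giving a 186-gene test panel. The gene list is shown in Table 1.
The 186 selected genes were validated for effectiveness and stability in Affymetrix gene chip expression profile data from 1346 lung adenocarcinoma cases. Lung adenocarcinoma can be divided into the LAD1, LAD2, LAD3, LAD4, LAD5 or mixed type:
The LAD1 subtype is mainly characterized by low expression of proliferation genes, high expression of immune genes and low expression of stroma-related genes, and has the highest 5-year survival rate;
The LAD2 subtype is mainly characterized by high expression of proliferation genes, low expression of immune genes and medium expression of stroma-related genes, and has a low 5-year survival rate;
The LAD3 subtype is mainly characterized by low expression of proliferation genes, medium expression of immune genes and high expression of stroma-related genes, and has a medium 5-year survival rate;
The LAD4 subtype is mainly characterized by medium expression of proliferation genes, low expression of immune genes and high expression of stroma-related genes, and has a low 5-year survival rate;
The LAD5 subtype is mainly characterized by high expression of proliferation genes, medium expression of immune genes and low expression of stroma-related genes, and has a medium 5-year survival rate;
The mixed type comprises lung adenocarcinomas that do not belong to the LAD1, LAD2, LAD3, LAD4 or LAD5 type.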
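The subtype signatures listed above can be held in a small lookup table. The following is a minimal sketch only: the function name `match_subtype` and the idea of matching on three discrete levels are illustrative assumptions, whereas the method described in this document assigns subtypes from expression profiles rather than from pre-binned levels.

```python
# Subtype signatures transcribed from Example 1.
# Tuple order: (proliferation, immune, stroma, 5-year survival).
SUBTYPE_PROFILES = {
    "LAD1": ("low", "high", "low", "highest"),
    "LAD2": ("high", "low", "medium", "low"),
    "LAD3": ("low", "medium", "high", "medium"),
    "LAD4": ("medium", "low", "high", "low"),
    "LAD5": ("high", "medium", "low", "medium"),
}


def match_subtype(proliferation, immune, stroma):
    """Hypothetical helper: return the subtype whose signature matches, else 'Mixed'."""
    for name, (p, i, s, _survival) in SUBTYPE_PROFILES.items():
        if (p, i, s) == (proliferation, immune, stroma):
            return name
    # Anything that fits none of LAD1-LAD5 falls into the mixed type.
    return "Mixed"


print(match_subtype("low", "high", "low"))
print(match_subtype("medium", "medium", "medium"))
```

Keeping the survival annotation in the same tuple makes it easy to report the expected prognosis alongside the assigned subtype.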
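Example 2: Gene test panels for molecular subtyping and survival risk assessment of lung adenocarcinoma
From the 186 genes screened in Example 1, 76-gene, 25-gene and 24-gene test panels were selected for molecular subtyping and survival risk assessment of lung adenocarcinoma.
76-gene test panel:
Experimental method: The 76-gene test panel (see Table 2) was used. Its 70 genes related to lung adenocarcinoma molecular subtyping and survival risk (proliferation-related genes: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, FOXM1, MKI67, KIF14, HJURP, TPX2, NEK2, CDK1, CDKN3, ASPM, CEP55, BIRC5, MELK, CDC20, TYMS, AURKA and TOP2A; immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, SPIB, CD53, CD4 and LYZ; stroma-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2, COL5A1, COL11A1, COL5A2, COL1A2, MXRA5, THBS2, INHBA, VCAN, ADAMTS12, GREM1, COL3A1 and SULF1) were used to determine the molecular subtype of lung adenocarcinoma and to assess the survival risk of lung adenocarcinoma patients, and 6 reference genes (GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC) served as internal standards to normalize the expression levels of the subtyping and survival risk genes. The 70 subtyping and survival risk genes in Table 2 were used when calculating the survival risk index.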
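Experimental results:
1. Molecular subtyping of lung adenocarcinoma
Using the standard test data obtained in Example 1 and the molecular subtyping method described above (see steps (3-1) to (3-3) in the section "Methods and uses of the invention"), the 504 lung adenocarcinoma cases were subtyped from the expression levels of the 70 subtyping and survival risk genes shown in Table 2 (normalized to the expression levels of GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC), and the tumors were divided into the LAD1, LAD2, LAD3, LAD4, LAD5 or mixed type. Figure 1 shows the expression heat map of the 70 subtyping and survival risk genes in each subtype.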
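Because the risk formula later in this example uses the Pearson correlation between a tumor and each LAD type, the assignment step can be pictured as a nearest-centroid comparison. The sketch below works under stated assumptions: the centroid vectors are toy values over six imaginary genes, and the cut-off `min_r` with its fall-back to the mixed type is a hypothetical rule, not one taken from steps (3-1) to (3-3).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / (sd_x * sd_y)


def classify(sample, centroids, min_r=0.3):
    """Assign the best-correlated subtype, or 'Mixed' below the assumed cut-off."""
    scores = {name: pearson(sample, c) for name, c in centroids.items()}
    best = max(scores, key=scores.get)
    label = best if scores[best] >= min_r else "Mixed"
    return label, scores


# Toy centroids (hypothetical): 2 proliferation, 2 immune, 2 stroma genes.
CENTROIDS = {
    "LAD1": [-1.0, -1.0, 1.0, 1.0, -1.0, -1.0],
    "LAD2": [1.0, 1.0, -1.0, -1.0, 0.0, 0.0],
}

label, scores = classify([-0.8, -1.1, 0.9, 1.2, -0.7, -1.0], CENTROIDS)
print(label, scores)
```

The per-subtype correlations returned in `scores` are the same kind of quantity that the terms "LAD1" to "LAD5" denote in the survival risk score.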
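By counting the number of survivors and the survival time in each subtype, and taking as the observed event disease progression, or death due to lung adenocarcinoma or related causes, within 5 years after the start of treatment, Kaplan-Meier survival curves were plotted to obtain the 5-year survival rate, which indicates the survival risk of each subtype. As shown in Figure 2, the survival risk differs between the subtypes.
The LAD1 subtype is mainly characterized by low expression of proliferation genes, high expression of immune genes and low expression of stroma-related genes, and has the highest 5-year survival rate.
The LAD2 subtype is mainly characterized by high expression of proliferation genes, low expression of immune genes and medium expression of stroma-related genes, and has a low 5-year survival rate.
The LAD3 subtype is mainly characterized by low expression of proliferation genes, medium expression of immune genes and high expression of stroma-related genes, and has a medium 5-year survival rate.
The LAD4 subtype is mainly characterized by medium expression of proliferation genes, low expression of immune genes and high expression of stroma-related genes, and has a low 5-year survival rate.
The LAD5 subtype is mainly characterized by high expression of proliferation genes, medium expression of immune genes and low expression of stroma-related genes, and has a medium 5-year survival rate.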
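The 5-year survival rates quoted for each subtype come from Kaplan-Meier curves. As a rough illustration of that estimator, the sketch below computes a product-limit curve in plain Python; the follow-up times and event flags are invented toy data, and the helper names are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each event time.

    times  -- follow-up time per case, in years
    events -- 1 if progression or death was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = sum(1 for tt, _ in data if tt == t)
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= n_at_t  # events and censored cases both leave the risk set
        i += n_at_t
    return curve


def survival_at(curve, horizon):
    """Survival probability at the given horizon (step function)."""
    s = 1.0
    for t, value in curve:
        if t <= horizon:
            s = value
    return s


# Toy cohort (hypothetical): six cases, three observed events.
curve = kaplan_meier([1, 2, 2, 3, 4, 6], [1, 1, 0, 1, 0, 0])
five_year = survival_at(curve, 5)
print(round(five_year, 3))
```

Running the same estimator separately on the cases of each subtype gives one curve per subtype, which is how the subtypes can be compared at the 5-year mark.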
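2. Proliferation index
Using the standard test data obtained in Example 1 and the proliferation index calculation method described above (see steps (3a-1) to (3a-3) in the section "Methods and uses of the invention"), a proliferation index was calculated from the expression levels of the 23 cell proliferation-related genes PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, FOXM1, MKI67, KIF14, HJURP, TPX2, NEK2, CDK1, CDKN3, ASPM, CEP55, BIRC5, MELK, CDC20, TYMS, AURKA and TOP2A. Lung adenocarcinomas were divided into a fast-proliferating group and a slow-proliferating group, and the survival difference between the two groups was observed. The results show that the proliferation index can indicate the prognosis of lung adenocarcinoma: the fast-proliferating group had a lower 5-year survival rate and a poorer prognosis, while the slow-proliferating group had a higher 5-year survival rate and a better prognosis (Figure 3).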
Figure PCTCN2020111702-appb-000008
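3. Immune index
Using the standard test data obtained in Example 1 and the immune index calculation method described above (see steps (3b-1) to (3b-3) in the section "Methods and uses of the invention"), an immune index was calculated from the expression levels of the 30 immune-related genes P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, SPIB, CD53, CD4 and LYZ. Based on the immune index, each subtype was further divided into a strong-immunity group and a weak-immunity group, and the survival difference between the two groups was observed. The results show that the immune index can indicate the prognosis of lung adenocarcinoma: the group with a strong immune index had a higher 5-year survival rate and a relatively good prognosis (Figure 4).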
Figure PCTCN2020111702-appb-000009
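4. Survival risk assessment
Tumor survival risk was calculated with a Cox model, taking the occurrence and timing of disease progression or death as the observation endpoint. Coefficients were determined from the relative hazard with which the tumor subtype, proliferation index and immune index affect survival, and the survival risk score was calculated as follows:
Survival risk score (Risk of Death, RD), range 0-100:
0-35, low risk; 36-70, medium risk; 71-100, high risk;
RD = (-0.18*LAD1) + (0.09*LAD2) + (0.04*LAD3) + (0.17*LAD4) + (-0.17*LAD5) + (-0.05*immune index) + (0.12*proliferation index);
where "LAD1" is the Pearson correlation coefficient between the tumor and LAD1-type tumors; "LAD2" is the Pearson correlation coefficient between the tumor and LAD2-type tumors; "LAD3" is the Pearson correlation coefficient between the tumor and LAD3-type tumors; "LAD4" is the Pearson correlation coefficient between the tumor and LAD4-type tumors; "LAD5" is the Pearson correlation coefficient between the tumor and LAD5-type tumors; "immune index" is the immune index calculated from the 30 immune-related genes described above;
and "proliferation index" is the proliferation index calculated from the 23 cell proliferation-related genes described above.
Based on the calculated survival risk score, tumor survival risk can be divided into three groups: low risk (0-35), medium risk (36-70) and high risk (71-100). The results show that the survival risk index can indicate the survival risk of lung adenocarcinoma patients: the 5-year survival rate was higher in the low-risk group, intermediate in the medium-risk group and lower in the high-risk group (Figure 5).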
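The linear combination and the three risk bands above can be sketched as follows. The coefficients are those given for the 76-gene panel; how the linear combination is mapped onto the 0-100 scale is not stated in this section, so `risk_group` simply takes a score that is assumed to be on that scale already. Function names and the feature dictionary keys are hypothetical.

```python
# Coefficients of the 76-gene panel, as given in the RD formula.
COEF_76 = {
    "LAD1": -0.18,
    "LAD2": 0.09,
    "LAD3": 0.04,
    "LAD4": 0.17,
    "LAD5": -0.17,
    "immune_index": -0.05,
    "proliferation_index": 0.12,
}


def linear_risk(features, coef=COEF_76):
    """Weighted sum of the subtype correlations and the two indices."""
    return sum(weight * features[name] for name, weight in coef.items())


def risk_group(rd):
    """Band a score assumed to lie on the 0-100 scale.

    Values between the integer bands (e.g. 35.5) go to the higher band;
    that tie-breaking choice is an assumption of this sketch.
    """
    if not 0 <= rd <= 100:
        raise ValueError("RD is expected to lie between 0 and 100")
    if rd <= 35:
        return "low"
    if rd <= 70:
        return "medium"
    return "high"


example = {
    "LAD1": 0.8, "LAD2": -0.5, "LAD3": 0.1, "LAD4": -0.3, "LAD5": 0.2,
    "immune_index": 1.0, "proliferation_index": -1.0,
}
print(linear_risk(example))
print(risk_group(20), risk_group(50), risk_group(90))
```

Passing a different coefficient dictionary through the `coef` parameter lets the same function serve panels with other weights.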
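25-gene test panel:
The molecular subtyping method and the calculation of the proliferation index, immune index and survival risk score for the 25-gene test panel are similar to those for the 76-gene test panel. The 25-gene test panel (see Table 3) comprises: 24 genes related to lung adenocarcinoma molecular subtyping and survival risk (proliferation-related genes: PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20, TYMS and TOP2A; immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R and CD4; stroma-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2 and COL5A1), which are used to determine the molecular subtype of lung adenocarcinoma and to assess the survival risk of lung adenocarcinoma patients; and 1 reference gene (ACTB) as an internal standard, which is used to normalize the expression levels of the subtyping and survival risk genes. The 24 subtyping and survival risk genes in Table 3 are used when calculating the survival risk index.
Experimental results:
1. Molecular subtyping of lung adenocarcinoma
The 504 lung adenocarcinoma cases were subtyped from the expression levels of the 24 subtyping and survival risk genes shown in Table 3 (normalized to the expression level of ACTB), and the tumors were divided into the LAD1, LAD2, LAD3, LAD4, LAD5 or mixed type. The results were similar to those of the 76-gene test panel.
2. Proliferation index
A proliferation index was calculated from the expression levels of the 9 cell proliferation-related genes PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20, TYMS and TOP2A. Lung adenocarcinomas were divided into a fast-proliferating group and a slow-proliferating group, and the survival difference between the two groups was observed. The results were similar to those of the 76-gene test panel.
Figure PCTCN2020111702-appb-000010
3. Immune index
An immune index was calculated from the expression levels of the 9 immune-related genes P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R and CD4. Based on the immune index, each subtype was further divided into a strong-immunity group and a weak-immunity group, and the survival difference between the two groups was observed. The results were similar to those of the 76-gene test panel.
Figure PCTCN2020111702-appb-000011
4. Survival risk assessment
Tumor survival risk was calculated with a Cox model, taking the occurrence and timing of disease progression or death as the observation endpoint. Coefficients were determined from the relative hazard with which the tumor subtype, proliferation index and immune index affect survival, and the survival risk score was calculated as follows:
RD = (-0.12*LAD1) + (0.29*LAD2) + (0.13*LAD3) + (0.18*LAD4) + (-0.09*LAD5) + (-0.55*immune index) + (0.07*proliferation index);
where "LAD1", "LAD2", "LAD3", "LAD4" and "LAD5" are as defined above; "immune index" is the immune index calculated from the 9 immune-related genes described above; and "proliferation index" is the proliferation index calculated from the 9 cell proliferation-related genes described above.
Based on the calculated survival risk score, tumor survival risk can be divided into three groups: low risk (0-35), medium risk (36-70) and high risk (71-100). The results were similar to those of the 76-gene test panel.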
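24-gene test panel:
The molecular subtyping method and the calculation of the proliferation index, immune index and survival risk score for the 24-gene test panel are similar to those for the 76-gene test panel. The 24-gene test panel (see Table 4) comprises: 21 genes related to lung adenocarcinoma molecular subtyping and survival risk (proliferation-related genes: PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20 and TOP2A; immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7 and IL7R; stroma-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2 and COL5A1), which are used to determine the molecular subtype of lung adenocarcinoma and to assess the survival risk of lung adenocarcinoma patients; and 3 reference genes (GAPDH, GUSB and TFRC) as internal standards, which are used to normalize the expression levels of the subtyping and survival risk genes. The 21 subtyping and survival risk genes in Table 4 are used when calculating the survival risk index.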
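The 25-gene and 24-gene panels differ only in a few genes and in their reference genes, which is easy to lose track of in running text. The sketch below stores both panels as dictionaries, using the gene lists given in this example, and checks their sizes and differences; the dictionary layout and helper names are illustrative assumptions.

```python
PANEL_25 = {
    "proliferation": ["PLK1", "PRC1", "CCNB1", "MKI67", "TPX2",
                      "MELK", "CDC20", "TYMS", "TOP2A"],
    "immune": ["P2RY13", "CCR2", "PTPRC", "IRF8", "CLEC10A",
               "TLR7", "CCR4", "IL7R", "CD4"],
    "stroma": ["SPOCK1", "COL1A1", "POSTN", "ADAM12", "COL6A2", "COL5A1"],
    "reference": ["ACTB"],
}

PANEL_24 = {
    "proliferation": ["PLK1", "PRC1", "CCNB1", "MKI67", "TPX2",
                      "MELK", "CDC20", "TOP2A"],
    "immune": ["P2RY13", "CCR2", "PTPRC", "IRF8", "CLEC10A", "TLR7", "IL7R"],
    "stroma": ["SPOCK1", "COL1A1", "POSTN", "ADAM12", "COL6A2", "COL5A1"],
    "reference": ["GAPDH", "GUSB", "TFRC"],
}


def panel_size(panel):
    """Total number of genes in a panel, reference genes included."""
    return sum(len(genes) for genes in panel.values())


def dropped_genes(larger, smaller):
    """Target genes present in `larger` but absent from `smaller`, per group."""
    return {
        group: sorted(set(larger[group]) - set(smaller[group]))
        for group in larger
        if group != "reference"
    }


print(panel_size(PANEL_25), panel_size(PANEL_24))
print(dropped_genes(PANEL_25, PANEL_24))
```

A check like this catches a miscounted or duplicated gene before any index or risk score is computed from the panel.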
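Experimental results:
1. Molecular subtyping of lung adenocarcinoma
The 504 lung adenocarcinoma cases were subtyped from the expression levels of the 21 subtyping and survival risk genes shown in Table 4 (normalized to the expression levels of GAPDH, GUSB and TFRC), and the tumors were divided into the LAD1, LAD2, LAD3, LAD4, LAD5 or mixed type. The results were similar to those of the 76-gene test panel.
2. Proliferation index
A proliferation index was calculated from the expression levels of the 8 cell proliferation-related genes PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20 and TOP2A. Lung adenocarcinomas were divided into a fast-proliferating group and a slow-proliferating group, and the survival difference between the two groups was observed. The results were similar to those of the 76-gene test panel.
Figure PCTCN2020111702-appb-000012
3. Immune index
An immune index was calculated from the expression levels of the 7 immune-related genes P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7 and IL7R. Based on the immune index, each subtype was further divided into a strong-immunity group and a weak-immunity group, and the survival difference between the two groups was observed. The results were similar to those of the 76-gene test panel.
Figure PCTCN2020111702-appb-000013
4. Survival risk assessment
Tumor survival risk was calculated with a Cox model, taking the occurrence and timing of disease progression or death as the observation endpoint. Coefficients were determined from the relative hazard with which the tumor subtype, proliferation index and immune index affect survival, and the survival risk score was calculated as follows:
RD = (-0.10*LAD1) + (0.36*LAD2) + (0.14*LAD3) + (0.21*LAD4) + (-0.10*LAD5) + (-0.57*immune index) + (0.07*proliferation index);
where "LAD1", "LAD2", "LAD3", "LAD4" and "LAD5" are as defined above; "immune index" is the immune index calculated from the 7 immune-related genes described above; and "proliferation index" is the proliferation index calculated from the 8 cell proliferation-related genes described above.
Based on the calculated survival risk score, tumor survival risk can be divided into three groups: low risk (0-35), medium risk (36-70) and high risk (71-100). The results were similar to those of the 76-gene test panel.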
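Example 3: Next-generation sequencing kit for determining the molecular subtype of lung adenocarcinoma and assessing the survival risk of lung adenocarcinoma patients
Based on the 76-gene and 24-gene test panels of Example 2, a next-generation sequencing kit was designed. It contains primers for specific amplification of the cDNA of the 76 genes or the 24 genes; the primer sequences are shown in Table 5 and Table 6, respectively. The method of using the kit to determine the molecular subtype of lung adenocarcinoma and assess the survival risk of lung adenocarcinoma patients is described below.
Step 1: Take tumor tissue or paraffin-embedded tissue from the subject and, following the method in the kit, select a region with a high tumor cell content as the starting material.
Step 2: Extract total RNA from the tissue, for example with the RNAstorm CD201 RNA extraction kit or the Qiagen RNeasy FFPE Kit.
Step 3: Prepare a sequencing library from the RNA. The tissue RNA is made into a library suitable for next-generation sequencing by targeted RNA-seq. Library preparation comprises the following steps:
(3-1): Reverse-transcribe the RNA extracted in step 2 into cDNA using [Figure PCTCN2020111702-appb-000014] reverse transcriptase (New England Biolabs, #M0368L).
(3-2): Process the cDNA into a sequencing-ready library using the Illumina [Figure PCTCN2020111702-appb-000015] Targeted RNA library preparation kit (#15034457), as follows: (i) hybridization: add 4.5 μl of TOP (for its composition see Table 5 or Table 6), mix, add 21 μl of OB1, heat to 70 °C and then cool slowly in a gradient to 30 °C; (ii) extension and ligation: capture the product of (i) on a magnetic stand and discard the supernatant, wash twice with AM1 and UB1 from the kit and discard the supernatant, add 36 μl of ELM4 and incubate at 37 °C for 45 minutes in a PCR instrument or metal bath; (iii) index ligation and PCR on the product of (ii): capture the product of (ii) on a magnetic stand and discard the supernatant, add 18 μl of 40-fold diluted HP3, capture on the magnetic stand and collect 16 μl, add 17.3 μl of TDP1, 0.3 μl of PMM2 and 6.4 μl of Index, mix and amplify by PCR for 32 cycles; (iv) purify the DNA with the Gnome DNA purification kit (QuestGenomics, Nanjing) to obtain the library.
Step 4: Sequence the DNA library by next-generation sequencing on an Illumina NextSeq/MiSeq/MiniSeq/iSeq sequencer, using paired-end or single-end sequencing. This process is carried out automatically by the instrument (Illumina).
Step 5: Statistical analysis of the results. Analyze the sequencing results statistically, then use the method described in Example 2 to assign the molecular subtype of the subject's lung adenocarcinoma, calculate the immune index, proliferation index and survival risk score, and predict the survival risk.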
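Before the method of Example 2 can be applied, the read counts from step 4 have to be expressed relative to the reference genes. The document does not spell out that formula in this section, so the following is only one plausible sketch: a log2 ratio against the mean log2 count of the reference genes, with a pseudocount to avoid taking the logarithm of zero. The function name, the pseudocount and the toy counts are all assumptions.

```python
from math import log2


def normalize_counts(counts, reference_genes, pseudocount=1.0):
    """Log2 expression of each target gene relative to the reference genes.

    counts          -- dict of gene symbol -> read count for one sample
    reference_genes -- housekeeping genes used as the internal standard
    """
    ref_logs = [log2(counts[g] + pseudocount) for g in reference_genes]
    ref_mean = sum(ref_logs) / len(ref_logs)  # log of the geometric mean
    return {
        gene: log2(count + pseudocount) - ref_mean
        for gene, count in counts.items()
        if gene not in reference_genes
    }


# Toy read counts (hypothetical) for one target and two reference genes.
sample_counts = {"PLK1": 255, "GAPDH": 1023, "GUSB": 255}
normalized = normalize_counts(sample_counts, ["GAPDH", "GUSB"])
print(normalized)
```

Averaging in log space is equivalent to dividing by the geometric mean of the reference counts, which is less sensitive to one highly expressed reference gene than an arithmetic mean.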
Table 5
Figure PCTCN2020111702-appb-000016
Figure PCTCN2020111702-appb-000017
Figure PCTCN2020111702-appb-000018
Figure PCTCN2020111702-appb-000019
Figure PCTCN2020111702-appb-000020
Table 6
Figure PCTCN2020111702-appb-000021
Figure PCTCN2020111702-appb-000022
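Example 4: Quantitative PCR kit for determining the molecular subtype of lung adenocarcinoma and assessing the survival risk of lung adenocarcinoma patients
Based on the 25-gene test panel of Example 2, a quantitative PCR kit was designed. It contains primers for PCR amplification of the 25 genes and TaqMan probes for quantification; the primer and probe sequences are shown in Table 7. The kit can be used for singleplex or multiplex RT-PCR. The method of using the kit for molecular subtyping and survival risk assessment of lung adenocarcinoma by multiplex RT-PCR is described below.
Step 1: Take tumor tissue or paraffin-embedded tissue from the subject and, following the method in the kit, select a region with a high tumor cell content as the starting material.
Step 2: Extract total RNA from the tissue, for example with the RNAstorm CD201 RNA extraction kit or the Qiagen RNeasy FFPE Kit.
Step 3: One-step multiplex fluorescent quantitative RT-PCR. The method is TaqMan real-time multiplex fluorescent quantitative RT-PCR. The 24 subtyping and survival risk genes in Table 7 are divided into 12 reactions. Each reaction contains the primers and probes for 2 subtyping and survival risk genes and 1 housekeeping gene, and the 3 probes are labeled with different fluorophores. Each reaction is prepared as follows:
2 μl of RNA sample (100-400 ng in total); 0.4 μl each of the forward and reverse specific primers (10 μM) of the 3 primer pairs described above; 0.2 μl each of the 3 TaqMan fluorescent probes (10 μM); 6 μl of reaction mix; 4 μl of enzyme mix; and 4 μl of DEPC-treated water. Reverse transcription is carried out at 50 °C for 15-20 minutes, followed by initial denaturation at 95 °C for 5 minutes. Amplification comprises 45 cycles of denaturation at 95 °C for 10 seconds and annealing, extension and fluorescence detection at 60 °C for 45-60 seconds; three of FAM/HEX/VIC/ROX/Cy5 can be chosen as the fluorescence detection channels at 60 °C. After amplification, the Ct value of each gene is recorded, which represents the expression level of that gene.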
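The split of 24 target genes into 12 triplex reactions, each sharing the housekeeping gene, can be laid out programmatically. The sketch below is illustrative: the order in which genes are paired and the choice of FAM, HEX and ROX as the three channels are assumptions, since the actual pairing is defined by Table 7.

```python
TARGETS = [
    "PLK1", "PRC1", "CCNB1", "MKI67", "TPX2", "MELK", "CDC20", "TYMS", "TOP2A",
    "P2RY13", "CCR2", "PTPRC", "IRF8", "CLEC10A", "TLR7", "CCR4", "IL7R", "CD4",
    "SPOCK1", "COL1A1", "POSTN", "ADAM12", "COL6A2", "COL5A1",
]
REFERENCE = "ACTB"
ALLOWED_CHANNELS = {"FAM", "HEX", "VIC", "ROX", "Cy5"}


def build_reactions(targets, reference, channels=("FAM", "HEX", "ROX")):
    """Pair targets two per reaction and add the reference gene to each.

    Returns a list of dicts mapping gene -> fluorescence channel.
    """
    if len(targets) % 2:
        raise ValueError("an even number of target genes is required")
    if len(set(channels)) != 3 or not set(channels) <= ALLOWED_CHANNELS:
        raise ValueError("choose three distinct channels from the allowed set")
    reactions = []
    for i in range(0, len(targets), 2):
        genes = (targets[i], targets[i + 1], reference)
        reactions.append(dict(zip(genes, channels)))
    return reactions


reactions = build_reactions(TARGETS, REFERENCE)
for number, reaction in enumerate(reactions, start=1):
    print(number, reaction)
```

Keeping the reference gene on the same channel in every reaction makes the Ct values directly comparable across the 12 wells.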
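Step 4: Statistical analysis of the results. Analyze the detection results statistically, then use the method described in Example 2 to assign the molecular subtype of the subject's lung adenocarcinoma, calculate the immune index, proliferation index and survival risk score, and predict the survival risk.
Table 7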
Figure PCTCN2020111702-appb-000023
Figure PCTCN2020111702-appb-000024
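Example 5: Quantitative PCR kit for determining the molecular subtype of lung adenocarcinoma and assessing the survival risk of lung adenocarcinoma patients
Based on the 24-gene test panel of Example 2, a quantitative PCR kit was designed. It contains primers for PCR amplification of the 24 genes and TaqMan probes for quantifying the amplification products; the primer and probe sequences are shown in Table 8. The kit can be used for singleplex or multiplex RT-PCR. The method of using the kit for molecular subtyping and survival risk assessment of lung adenocarcinoma by singleplex RT-PCR is described below.
Experimental method: Take lung adenocarcinoma tumor tissue, extract RNA from the tumor cells, and use TaqMan RT-PCR with the primers and probes shown in Table 8 to measure the expression level of each gene. The steps are as follows:
Step 1: Take tumor tissue or paraffin-embedded tissue from the subject and, following the method in the kit, select a region with a high tumor cell content as the starting material.
Step 2: Extract total RNA from the tissue, for example with the RNAstorm CD201 RNA extraction kit or the Qiagen RNeasy FFPE Kit.
Step 3: RT-PCR. The method is TaqMan RT-PCR, and each gene shown in Table 8 is measured in a separate RT-PCR. The steps are as follows:
(3-1): Extract total RNA from the subject;
(3-2): Reverse-transcribe the RNA from (3-1) as follows: take about 2 μg of sample RNA (for example, 11 μl of sample RNA at about 200 ng/μl) and, in parallel, 11 μl of reference RNA, and reverse-transcribe them (Thermo K1622 reverse transcription kit) to obtain sample cDNA and reference cDNA; add 80 μl of RNase-free water to the sample cDNA to dilute it 5-fold, and add 180 μl of RNase-free water to the reference cDNA to dilute it 10-fold;
(3-3): Perform TaqMan RT-PCR on the cDNA from (3-2), measuring each of the 21 subtyping and survival risk genes and the 3 reference genes (see Table 8) separately, as follows: (i) prepare the reaction for each well: 2 μl of the cDNA sample from (3-2) (100-400 ng in total); 1.4 μl in total of the forward and reverse specific primers and the TaqMan fluorescent probe (10 μM) shown in Table 8; 10 μl of reaction premix; and 6.6 μl of DEPC-treated water; (ii) inactivate the reverse transcriptase at 95 °C for 2 minutes; (iii) amplification and detection: 45 cycles of denaturation at 95 °C for 25 seconds and annealing, extension and fluorescence detection at 60 °C for 60 seconds, followed by a hold at 60 °C for 60 seconds. After amplification, the Ct value of each gene is recorded, which represents the expression level of that gene.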
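The Ct values recorded in step (3-3) are raw cycle numbers, so they are normally expressed relative to the reference genes before any index is computed. The sketch below shows a conventional delta-Ct calculation against the mean Ct of GAPDH, GUSB and TFRC, plus a check that the per-well volumes listed above add up to 20 μl. The sign convention (reference minus target, so that higher means more expressed), the function name and the toy Ct values are assumptions; the source does not state the exact normalization formula here.

```python
def delta_ct(ct_values, reference_genes):
    """Relative log2 expression: mean reference Ct minus target Ct.

    A lower Ct means more template, so a larger result means higher expression.
    """
    ref_mean = sum(ct_values[g] for g in reference_genes) / len(reference_genes)
    return {
        gene: ref_mean - ct
        for gene, ct in ct_values.items()
        if gene not in reference_genes
    }


# Per-well components from step (3-3), in microlitres.
WELL_COMPONENTS_UL = {
    "cDNA sample": 2.0,
    "primers and probe": 1.4,
    "reaction premix": 10.0,
    "DEPC-treated water": 6.6,
}
well_volume_ul = sum(WELL_COMPONENTS_UL.values())

# Toy Ct values (hypothetical) for one target and the three reference genes.
toy_ct = {"PLK1": 28.0, "GAPDH": 20.0, "GUSB": 26.0, "TFRC": 26.0}
relative = delta_ct(toy_ct, ["GAPDH", "GUSB", "TFRC"])
print(round(well_volume_ul, 1), relative)
```

Because each gene is measured in its own well in the singleplex format, the reference Ct values come from separate wells of the same cDNA dilution.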
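Step 4: Statistical analysis of the results. Analyze the detection results statistically, then use the method described in Example 2 to assign the molecular subtype of the subject's lung adenocarcinoma, calculate the immune index, proliferation index and survival risk score, and predict the survival risk.
Table 8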
Figure PCTCN2020111702-appb-000025
Figure PCTCN2020111702-appb-000026
Figure PCTCN2020111702-appb-000027
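Experimental results: The method of the invention was used for molecular subtyping and survival risk assessment of 21 lung adenocarcinoma samples; the results are shown in Table 9 and Figure 6. The 21 samples could be assigned to LAD1, LAD2, LAD3, LAD4, LAD5 or the mixed type (Mixed). The survival risk of a sample can be low, medium or high; the proliferation index can be fast or slow; and the immune index can be strong or weak. Combining the subtype, survival risk result, proliferation index and immune index of a sample with histopathology makes it possible to establish refined, individualized clinical indicators for lung adenocarcinoma and to provide more targeted, individualized treatment. It also makes it possible to identify the patient populations best suited to different treatment regimens and to suggest potential treatment approaches.
Table 9
Sample  Molecular subtype  Risk assessment  Proliferation index  Immune index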
1 LAD1
2 LAD1
3 LAD1
4 LAD1
5 LAD1
6 LAD1
7 LAD1
8 LAD1
9 LAD1
10 LAD2
11 LAD3
12 LAD3
13 LAD4
14 LAD4
15 LAD5
16 LAD5
17 LAD5
18 Mixed
19 Mixed
20 Mixed
21 Mixed
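The subtype column of Table 9 can be tallied to see how the 21 samples are distributed. This is a small convenience sketch that uses only the subtype assignments listed in the table; the variable names are arbitrary.

```python
from collections import Counter

# Subtype of samples 1-21, in table order.
SUBTYPES = (
    ["LAD1"] * 9 + ["LAD2"] + ["LAD3"] * 2
    + ["LAD4"] * 2 + ["LAD5"] * 3 + ["Mixed"] * 4
)

subtype_counts = Counter(SUBTYPES)
for subtype, count in subtype_counts.items():
    print(f"{subtype}: {count} of {len(SUBTYPES)}")
```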
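The examples and embodiments described herein are for illustrative purposes only, and changes may be made to the embodiments described above without departing from the broad inventive concept of the invention. It should therefore be understood that the invention is not limited to the specific embodiments disclosed, but is intended to cover modifications within the spirit and scope of the invention as defined by the appended claims.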

Claims (29)

  1. A gene panel for determining the molecular subtype of lung adenocarcinoma and/or assessing the survival risk of a lung adenocarcinoma patient, comprising genes related to molecular subtyping and survival risk assessment, wherein the genes related to molecular subtyping and survival risk assessment comprise:
    (1) one or more of the following proliferation-related genes: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, HMMR, KIF20A, FOXM1, MKI67, KIF14, TK1, HJURP, TPX2, EXO1, KIF11, NEK2, KIF23, CDCA3, CDK1, SPAG5, KIF4A, GTSE1, CDKN3, CDC25C, PRR11, CCNB2, MAD2L1, PKMYT1, CENPE, ASPM, CENPF, BUB1, NDC80, NUSAP1, CEP55, NCAPG, BIRC5, ZWINT, TTK, ESPL1, DEPDC1, MELK, CDC20, CDC6, AURKA, NEIL3, CDT1, KIF2C, KIFC1, NCAPH, KIF18B, AURKB, UBE2C, TOP2A, TYMS, PBK, CDC45, CDCA8, CENPA, MYBL2, SKA1, MCM10, TRIP13, TROAP, POLQ, GINS1 and RAD54L;
    (2) one or more of the following immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FGL2, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, LCP1, SPIB, CD53, CD3E, SLCO2B1, MS4A6A, CYBB, CD4, SH2D1A, TFEC, LYZ, ITGAM, TLR8, CSF1R, CXCL13, GPNMB, CCR5, HK3, CMKLR1, IL2RG, TYROBP, HCK, ITGB2, LAPTM5, SIGLEC1, AOAH, C3AR1, MSR1, IL2RA, CCL5, ADAMDEC1, LILRB4, CXCL11, FPR3, SELL, CXCL10, UBD, C1QB, PDCD1LG2, C1QA, SLAMF8, VSIG4, CD163, LAIR1, SLAMF7 and MS4A4A;
    (3) one or more of the following stroma-related genes: LOXL2, SPOCK1, COL1A1, POSTN, ADAM12, COL6A2, COL5A1, COL11A1, COL5A2, COL1A2, MXRA5, THBS2, INHBA, VCAN, ADAMTS12, GREM1, COL3A1, SULF1, ADAMTS2, PRRX1, COL15A1, SPARC, THY1, FAP, DIO2, FN1, COL6A3, FBN1, SYNDIG1, AEBP1, LRRC15, CILP, ISLR, GAS1, COL10A1, ASPN, MMP2 and EPYC.
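  2. The gene panel of claim 1, comprising 21 genes related to molecular subtyping and survival risk assessment, wherein the genes related to molecular subtyping and survival risk assessment comprise:
    (1) proliferation-related genes: PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20 and TOP2A;
    (2) immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7 and IL7R;
    (3) stroma-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2 and COL5A1.
  3. The gene panel of claim 1, comprising 24 genes related to molecular subtyping and survival risk assessment, wherein the genes related to molecular subtyping and survival risk assessment comprise:
    (1) proliferation-related genes: PLK1, PRC1, CCNB1, MKI67, TPX2, MELK, CDC20, TYMS and TOP2A;
    (2) immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R and CD4;
    (3) stroma-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2 and COL5A1.
  4. The gene panel of claim 1, comprising 70 genes related to molecular subtyping and survival risk assessment, wherein the 70 genes related to molecular subtyping and survival risk assessment comprise:
    (1) proliferation-related genes: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, FOXM1, MKI67, KIF14, HJURP, TPX2, NEK2, CDK1, CDKN3, ASPM, CEP55, BIRC5, MELK, CDC20, TYMS, AURKA and TOP2A;
    (2) immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, SPIB, CD53, CD4 and LYZ;
    (3) stroma-related genes: SPOCK1, COL1A1, POSTN, ADAM12, COL6A2, COL5A1, COL11A1, COL5A2, COL1A2, MXRA5, THBS2, INHBA, VCAN, ADAMTS12, GREM1, COL3A1 and SULF1.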
  5. The gene panel of claim 1, comprising 180 genes related to molecular subtyping and survival risk assessment, wherein the genes related to molecular subtyping and survival risk assessment comprise:
    (1) proliferation-related genes: PLK1, PRC1, CCNB1, DLGAP5, KPNA2, CCNA2, RRM2, HMMR, KIF20A, FOXM1, MKI67, KIF14, TK1, HJURP, TPX2, EXO1, KIF11, NEK2, KIF23, CDCA3, CDK1, SPAG5, KIF4A, GTSE1, CDKN3, CDC25C, PRR11, CCNB2, MAD2L1, PKMYT1, CENPE, ASPM, CENPF, BUB1, NDC80, NUSAP1, CEP55, NCAPG, BIRC5, ZWINT, TTK, ESPL1, DEPDC1, MELK, CDC20, CDC6, AURKA, NEIL3, CDT1, KIF2C, KIFC1, NCAPH, KIF18B, AURKB, UBE2C, TOP2A, TYMS, PBK, CDC45, CDCA8, CENPA, MYBL2, SKA1, MCM10, TRIP13, TROAP, POLQ, GINS1 and RAD54L;
    (2) immune-related genes: P2RY13, CCR2, PTPRC, IRF8, CLEC10A, TLR7, CCR4, IL7R, SPN, SASH3, CSF2RB, CD37, IKZF1, CD48, IL10RA, EVI2B, IGSF6, CD52, DOCK2, CD84, FGL2, FOLR2, NCKAP1L, TRAC, MNDA, MRC1, PLEK, LCP1, SPIB, CD53, CD3E, SLCO2B1, MS4A6A, CYBB, CD4, SH2D1A, TFEC, LYZ, ITGAM, TLR8, CSF1R, CXCL13, GPNMB, CCR5, HK3, CMKLR1, IL2RG, TYROBP, HCK, ITGB2, LAPTM5, SIGLEC1, AOAH, C3AR1, MSR1, IL2RA, CCL5, ADAMDEC1, LILRB4, CXCL11, FPR3, SELL, CXCL10, UBD, C1QB, PDCD1LG2, C1QA, SLAMF8, VSIG4, CD163, LAIR1, SLAMF7 and MS4A4A;
    (3) stroma-related genes: LOXL2, SPOCK1, COL1A1, POSTN, ADAM12, COL6A2, COL5A1, COL11A1, COL5A2, COL1A2, MXRA5, THBS2, INHBA, VCAN, ADAMTS12, GREM1, COL3A1, SULF1, ADAMTS2, PRRX1, COL15A1, SPARC, THY1, FAP, DIO2, FN1, COL6A3, FBN1, SYNDIG1, AEBP1, LRRC15, CILP, ISLR, GAS1, COL10A1, ASPN, MMP2 and EPYC.
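  6. The gene panel of any one of claims 1-5, further comprising reference genes;
    preferably, the reference genes comprise 1, more preferably 3, most preferably 6 of the following: GAPDH, GUSB, MRPL19, PSMC4, SF3A1, TFRC, ACTB and RPLP0.
  7. The gene panel of claim 2, further comprising reference genes; preferably, the reference genes comprise three of GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC; more preferably, the reference genes comprise GAPDH, GUSB and TFRC.
  8. The gene panel of claim 3, further comprising a reference gene; preferably, the reference gene comprises ACTB.
  9. The gene panel of claim 4 or 5, further comprising reference genes; preferably, the reference genes comprise GAPDH, GUSB, MRPL19, PSMC4, SF3A1 and TFRC.
  10. A reagent for detecting the expression levels of the genes in the gene panel of any one of claims 1-9.
  11. The reagent of claim 10, which is a reagent for detecting the amount of RNA, in particular mRNA, transcribed from the genes, or a reagent for detecting the amount of cDNA complementary to the mRNA.
  12. The reagent of claim 10 or 11, which is a primer, a probe or a combination thereof.
  13. The reagent of claim 12, which is a primer;
    preferably, the primers have the sequences shown in SEQ ID NO.1-152, or the sequences shown in SEQ ID NO.1-6, 17, 18, 23, 24, 37-40, 45-58, 61, 62, 107-118, 141-144, 151 and 152, or the sequences shown in SEQ ID NO.153-202, or the sequences shown in SEQ ID NO.228-275.
  14. The reagent of claim 12, which is a probe;
    preferably, the probe is a TaqMan probe;
    more preferably, the probes have the sequences shown in SEQ ID NO.203-227 or the sequences shown in SEQ ID NO.276-299;
    most preferably, the probes are TaqMan probes having the sequences shown in SEQ ID NO.203-227, or TaqMan probes having the sequences shown in SEQ ID NO.276-299.
  15. The reagent of claim 12, which is a combination of primers and probes,
    preferably, the primers have the sequences shown in SEQ ID NO.153-202 and the probes are TaqMan probes having the sequences shown in SEQ ID NO.203-227; or
    the primers have the sequences shown in SEQ ID NO.228-275 and the probes are TaqMan probes having the sequences shown in SEQ ID NO.276-299.
  16. The reagent of claim 10, which is a reagent for detecting the amount of the polypeptides encoded by the genes; preferably, the reagent is an antibody, an antibody fragment or an affinity protein.
  17. A product for molecular subtyping and/or survival risk assessment of lung adenocarcinoma, comprising the reagent of any one of claims 10-16.
  18. Use of the gene panel of any one of claims 1-9, the reagent of any one of claims 10-16 or the product of claim 17 in determining the molecular subtype of lung adenocarcinoma and/or assessing the survival risk of a lung adenocarcinoma patient.
  19. Use of the gene panel of any one of claims 1-9 or the reagent of any one of claims 10-16 in the preparation of a product, wherein the product is for determining the molecular subtype of lung adenocarcinoma and/or assessing the survival risk of a lung adenocarcinoma patient.
  20. The product of claim 17 or the use of claim 19, wherein the product is in the form of an in vitro diagnostic product, preferably a diagnostic kit.
  21. The product of claim 17 or the use of claim 19, wherein the product is a next-generation sequencing kit, a real-time fluorescent quantitative PCR kit, a gene chip, a protein microarray, an ELISA diagnostic kit or an immunohistochemistry (IHC) kit.
  22. The product or use of claim 21, wherein the product is a next-generation sequencing kit comprising primers having the sequences shown in SEQ ID NO.1-152 or primers having the sequences shown in SEQ ID NO.1-6, 17, 18, 23, 24, 37-40, 45-58, 61, 62, 107-118, 141-144, 151 and 152, and optionally comprising one or more selected from: a total RNA extraction reagent, a reverse transcription reagent and a next-generation sequencing reagent.
  23. The product or use of claim 21, wherein the product is a real-time fluorescent quantitative PCR kit comprising primers having the sequences shown in SEQ ID NO.153-202 or primers having the sequences shown in SEQ ID NO.228-275.
  24. The product or use of claim 23, wherein the real-time fluorescent quantitative PCR kit further comprises TaqMan probes, and optionally comprises one or more selected from: a total RNA extraction reagent, a reverse transcription reagent and reagents for TaqMan RT-PCR.
  25. The product or use of claim 24, wherein the real-time fluorescent quantitative PCR kit comprises primers having the sequences shown in SEQ ID NO.153-202 and TaqMan probes having the sequences shown in SEQ ID NO.203-227.
  26. The product or use of claim 24, wherein the real-time fluorescent quantitative PCR kit comprises primers having the sequences shown in SEQ ID NO.228-275 and TaqMan probes having the sequences shown in SEQ ID NO.276-299.
  27. The product or use of claim 23, wherein the real-time fluorescent quantitative PCR kit further comprises one or more selected from: a total RNA extraction reagent, a reverse transcription reagent and reagents for SYBR Green RT-PCR.
  28. A method for determining the molecular subtype and/or survival risk of lung adenocarcinoma in a subject, the method comprising
    (1) providing a sample from the subject,
    (2) measuring the expression levels of the genes in the gene panel of any one of claims 1-9 in the sample,
    (3) determining the molecular subtype and/or survival risk of the subject's lung adenocarcinoma.
  29. The gene panel of any one of claims 1-9, the product of any one of claims 17 and 20-27, the use of any one of claims 18-27 or the method of claim 28, characterized in that
    the molecular subtypes of lung adenocarcinoma include the LAD1, LAD2, LAD3, LAD4, LAD5 and mixed types.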
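PCT/CN2020/111702 2019-08-27 2020-08-27 Gene panels for molecular subtyping and survival risk assessment of lung adenocarcinoma, and diagnostic products and uses thereof WO2021037134A1 (zh)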

Priority Applications (2)

Application Number Priority Date Filing Date Title
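CN202080062083.0A CN114341367B (zh) 2019-08-27 2020-08-27 Gene panels for molecular subtyping and survival risk assessment of lung adenocarcinoma, and diagnostic products and uses thereof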
US17/753,254 US20220364183A1 (en) 2019-08-27 2020-08-27 Gene Panels for Molecular Subtype and Survival Risk Assessment of Lung Adenocarcinoma and Diagnostic Products and Applications Thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
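CN201910797167.8A CN112442535A (zh) 2019-08-27 2019-08-27 Gene panels for molecular subtyping and survival risk assessment of primary lung adenocarcinoma, and diagnostic products and uses thereof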
CN201910797167.8 2019-08-27

Publications (1)

Publication Number Publication Date
WO2021037134A1 true WO2021037134A1 (zh) 2021-03-04

Family

ID=74685577

Family Applications (1)

Application Number Title Priority Date Filing Date
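PCT/CN2020/111702 WO2021037134A1 (zh) 2019-08-27 2020-08-27 Gene panels for molecular subtyping and survival risk assessment of lung adenocarcinoma, and diagnostic products and uses thereof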

Country Status (3)

Country Link
US (1) US20220364183A1 (zh)
CN (2) CN112442535A (zh)
WO (1) WO2021037134A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022207671A1 (en) * 2021-03-29 2022-10-06 Fenomark Diagnostics Ab Proteogenomic analysis of non-small cell lung cancer
CN116312814A (zh) * 2021-12-02 2023-06-23 复旦大学 一种肺腺癌分子分型模型的构建方法、设备、装置以及试剂盒

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023002943A1 (ja) * 2021-07-20 2023-01-26 国立大学法人東北大学 がん患者の予後を予測するためのバイオマーカー、がん患者の予後を予測するための方法、がん患者における、がん治療薬の効果を予測するための方法、及び、がん患者の予後を予測するためのキット
CN113981079A (zh) * 2021-09-22 2022-01-28 杭州金域医学检验所有限公司 Csf2rb及编码蛋白在女性非吸烟肺癌保护中的应用
CN114032308B (zh) * 2021-11-19 2022-11-29 上海生物芯片有限公司 Fam83a、kpna2、krt6a和ldha联合作为肺腺癌生物标志物的用途
CN114958922A (zh) * 2022-06-08 2022-08-30 山东大学 一种敲减ccna2基因的方法与敲减ccna2基因在制备治疗肺腺癌药物中的应用
CN115851927A (zh) * 2022-08-09 2023-03-28 上海善准医疗科技有限公司 肺鳞癌分子分型及生存风险基因群及诊断产品和应用
CN116259360B (zh) * 2023-03-16 2024-02-09 中国人民解放军空军军医大学 肺腺癌中高增殖肿瘤亚群的鉴别及特征基因集与应用

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101111768A (zh) * 2004-11-30 2008-01-23 维里德克斯有限责任公司 肺癌预后
CN107849569A (zh) * 2015-11-05 2018-03-27 深圳华大生命科学研究院 肺腺癌生物标记物及其应用
WO2019018764A1 (en) * 2017-07-21 2019-01-24 Genecentric Therapeutics, Inc. METHODS OF DETERMINING RESPONSE TO PARP INHIBITORS
CN109790583A (zh) * 2016-05-17 2019-05-21 基因中心治疗公司 对肺腺癌亚型分型的方法

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106282347A (zh) * 2016-08-17 2017-01-04 中南大学 HoxC11作为生物标志物在制备肺腺癌的预诊断试剂中的应用
CN108034719B (zh) * 2017-09-29 2021-07-23 中南大学 Gins4基因或gins4蛋白作为生物标志物在制备肺腺癌的预诊断试剂中的应用
CN108363907B (zh) * 2018-05-09 2022-01-18 中国科学院昆明动物研究所 一种基于多基因表达特征谱的肺腺癌个性化预后评估方法

Also Published As

Publication number Publication date
CN114341367A (zh) 2022-04-12
US20220364183A1 (en) 2022-11-17
CN112442535A (zh) 2021-03-05
CN114341367B (zh) 2024-04-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20859099

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20859099

Country of ref document: EP

Kind code of ref document: A1