CN113186287B - Biomarker for non-small cell lung cancer typing and application thereof

Publication number
CN113186287B
Authority
CN
China
Prior art keywords
lung cancer
small cell
adenocarcinoma
cell lung
biomarker
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110505178.1A
Other languages
Chinese (zh)
Other versions
CN113186287A (en)
Inventor
刘康
刘鑫
郝诗莹
许雷
张华�
马丹丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Kanghua Juntai Biotechnology Co ltd
Original Assignee
Shenzhen Kanghua Juntai Biotechnology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-05-10
Filing date
2021-05-10
Publication date
2023-03-24
Application filed by Shenzhen Kanghua Juntai Biotechnology Co ltd
Priority to CN202110505178.1A
Publication of CN113186287A
Application granted
Publication of CN113186287B
Status: Active

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/20 Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/10 Signal processing, e.g. from mass spectrometry [MS] or from PCR
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/112 Disease subtyping, staging or classification
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers

Abstract

The invention relates to a biomarker for non-small cell lung cancer typing and application thereof, belonging to the technical field of medical detection. The biomarker comprises at least the 5 genes TP53, STK11, PTEN, NFE2L2 and KRAS. The biomarkers are used to distinguish squamous carcinoma from adenocarcinoma in non-small cell lung cancer: with the minimum set of 5 markers, the AUC of the typing diagnosis ROC curve is 0.700; with 10 markers, the AUC is 0.734; and with all the biomarkers, the AUC reaches 0.786, so the diagnostic capability is excellent.

Description

Biomarker for non-small cell lung cancer typing and application thereof
Technical Field
The invention relates to the technical field of medical detection, in particular to a biomarker for non-small cell lung cancer typing and application thereof.
Background
Lung cancer is a heterogeneous disease, and current treatment selection is based mainly on pathological typing and staging. Pathological typing is generally a histological determination of the subtype; important distinctions include small cell vs non-small cell and adenocarcinoma vs squamous cell carcinoma. Distinguishing the morphological subtypes of lung cancer is essential for guiding patient management, because different pathological subtypes call for different therapeutic strategies. For example, small cell undifferentiated lung cancer is highly malignant, prone to early metastasis, and sensitive to radiotherapy and chemotherapy, so it is treated mainly non-surgically, with systemic chemotherapy and local radiotherapy. Non-small cell lung cancer consists mainly of squamous carcinoma and adenocarcinoma; stage I and II non-small cell lung cancer is treated primarily by surgery and can be cured in combination with postoperative adjuvant radiotherapy and chemotherapy. Within non-small cell lung cancer, adenocarcinoma tumor cells grow rapidly and most adenocarcinomas metastasize early, mainly by the hematogenous route; adenocarcinoma is therefore more sensitive to chemotherapeutic drugs and responds poorly to radiotherapy, so surgery, chemotherapy, immunotherapy and targeted therapy are usually chosen. Squamous carcinoma grows relatively slowly, is mostly locally invasive in the early stage, spreads mainly through lymph node metastasis, and metastasizes distantly only later; it is therefore more sensitive to radiotherapy, and surgery, radiotherapy and immunotherapy are generally used.
However, conventional pathological assessment of lung cancer mostly relies on histological diagnosis alone. Subtype typing errors arise from the individual experience of the physician and various other causes, the error rate is high, and no other quality control measures are available.
Beyond conventional chemoradiotherapy, lung cancer treatment is entering the era of precision medicine, and conventional imaging and pathological diagnosis no longer meet its demands. In addition to traditional diagnostic methods, detection at the molecular level of specific therapeutic target markers, i.e., molecular markers such as tumor-specific gene mutations and characteristic proteins, RNAs and metabolites, is becoming increasingly important for the accurate diagnosis of lung cancer.
Detection of gene variation has become a necessary part of routine treatment. The most notable example is the epidermal growth factor receptor (EGFR): lung cancer patients in whom this protein is dysfunctional (usually because of gene mutation) respond markedly to drugs that specifically target the EGFR protein, and testing for variants of this target gene has been incorporated into the relevant lung cancer treatment guidelines.
Moreover, testing for variants in drug-target-related genes is a necessary item for lung cancer patients, especially those with advanced disease. Limited tissue samples and the ever-growing number of therapeutic target markers to be assessed greatly increase current diagnostic demands, while studies of the reproducibility of histological diagnosis have shown intra- and inter-pathologist variability in decisions. Erroneous pathological judgments, poorly differentiated tumors, contradictory immunohistochemical results and the like all challenge the accuracy of current precision medicine for lung cancer. Therefore, a reliable means of determining the pathological subtype of lung cancer is needed.
Disclosure of Invention
In view of the above, it is necessary to provide a biomarker for typing non-small cell lung cancer. The biomarker is obtained from the different profiles of variant genes in adenocarcinoma vs squamous cell carcinoma of non-small cell lung cancer, and it provides a method for distinguishing the two pathological subtypes at the molecular level.
A biomarker for non-small cell lung cancer typing comprising: NFE2L2, TP53, CDKN2A, PTEN, MUC16, PIK3CA, RYR2, ATP10A, SLCO1B1, RASA1, ZFHX4, KMT2D, KRAS, EGFR, ANK1, BRAF, PCLO, PTPRD, ASTN1, ADGRG4, NOTCH4, FAT3, PCDH15, ROBO2, KEAP1, TENM2, TSHZ3, SETBP1, CACNA1E, XIRP2, ASXL3, ZNF804A, NALCN, FBN2, SPTA1, MUC17, RBM10, SETD2, MXRA5, ST6GAL2, RP1L1, ASPM, FLG, HECW1, COL12A1, COL14A1, AFF2, SMARCA4, SDK1, EPHB6, UBA6, SF1, MGAM, PCDH11X, COL5A1, ALK, ATM, VCAN, ZNF536, EPHA6, PDZRN3, PTPRC, ITGA4, DRD5, MYH11, DACH1, CTNNB1, TNR, NPAP1, TLR4, F8, ABCB5, RYR1, OR4C15, MYOM2, DMD, FOLH1, FRMPD4, ADAMTS17, SHANK1, FAM171A1, CCKBR, TRPS1, HMCN1, NTRK3, ATRX, AHNAK, SALL1, PRUNE2, CSMD1, PDGFRA, ADAMTS12, GRM1, SYNE2, OR8H2, TEP1, CCDC178, STK11, NID1, DCSTAMP, STAG2, MET, BCL11B, ZNF226, NTRK2, NEDD4, BTK, TMTC3, RBM15, ITK 2, ITK, CMKLR1.
The present inventors have conducted extensive studies on the gene patterns of non-small cell lung cancer and have obtained: squamous carcinoma-related genes that show a higher mutation frequency in squamous carcinoma than in adenocarcinoma (NFE2L2, TP53, CDKN2A, PTEN, MUC16, PIK3CA, RYR2, ATP10A, SLCO1B1, RASA1, ZFHX4, KMT2D); adenocarcinoma-related genes that show a higher mutation frequency in adenocarcinoma than in squamous carcinoma (KRAS, EGFR, ANK1, BRAF, PCLO, PTPRD, ASTN1, ADGRG4, NOTCH4, FAT3, PCDH15, ROBO2, KEAP1, TENM2, TSHZ3, SETBP1, CACNA1E, XIRP2, ASXL3, ZNF804A, NALCN, FBN2, SPTA1, MUC17, RBM10, SETD2, MXRA5, ST6GAL2, RP1L1, ASPM, FLG, HECW1, COL12A1, COL14A1, AFF2, SMARCA4, SDK1, EPHB6, UBA6, SF1, MGAM, PCDH11X, COL5A1, ALK, ATM, VCAN, ZNF536, EPHA6, PDZRN3, PTPRC, ITGA4, DRD5, MYH11, DACH1, CTNNB1, TNR, NPAP1, TLR4, F8, ABCB5, RYR1, OR4C15, MYOM2, DMD, FOLH1, FRMPD4, ADAMTS17, SHANK1, FAM171A1, CCKBR, TRPS1, HMCN1, NTRK3, ATRX, AHNAK, SALL1, PRUNE2, CSMD1, PDGFRA, ADAMTS12, GRM1, SYNE2, OR8H2, TEP1, CCDC178); and further variant genes (STK11, NID1, DCSTAMP, STAG2, MET, BCL11B, ZNF226, NTRK2, NEDD4, BTK, TMTC3, RBM15, ITK 2, ITK, CMKLR1). Using these markers in a model, the two pathological subtypes can be distinguished and the cancer type determined.
In one embodiment, the biomarker comprises: TP53, STK11, PTEN, NFE2L2 and KRAS genes.
For molecular diagnosis, reducing the number of diagnostic markers as far as possible, while maintaining diagnostic sensitivity and specificity, effectively reduces operational complexity, helps ensure the repeatability and reliability of the detection results, and greatly reduces the economic burden on patients. This embodiment uses a combination of 5 biomarkers, which lowers the application cost, and the AUC of the ROC curve of the typing model reaches 0.700.
In one embodiment, the biomarker comprises: TP53, STK11, PTEN, NFE2L2, KRAS, EGFR, CDKN2A, BRAF, TSHZ3 and PIK3CA genes. With these 10 biomarkers, the AUC of the ROC curve of the resulting typing model reaches 0.734.
In one embodiment, the biomarker comprises: TP53, STK11, PTEN, NFE2L2, KRAS, EGFR, CDKN2A, BRAF, TSHZ3, PIK3CA, SETD2, MUC16, RYR2, PTPR, PCLO, RP1L1, ASTN1, SPTA1, ASXL3, and XIRP2. With these 20 biomarkers, the AUC of the ROC curve of the resulting typing model reaches 0.747.
In one embodiment, the biomarker comprises: NFE2L2, TP53, CDKN2A, PTEN, MUC16, PIK3CA, RYR2, ATP10A, SLCO1B1, RASA1, ZFHX4, KMT2D, KRAS, EGFR, ANK1, BRAF, PCLO, PTPRD, ASTN1, ADGRG4, NOTCH4, FAT3, PCDH15, ROBO2, KEAP1, TENM2, TSHZ3, SETBP1, CACNA1E, XIRP2, ASXL3, ZNF804A, NALCN, FBN2, SPTA1, MUC17, RBM10, SETD2, MXRA5, ST6GAL2, RP1L1, ASPM, FLG, HECW1, COL12A1, COL14A1, AFF2, SMARCA4, SDK1, EPHB6, UBA6, SF1, MGAM, PCDH11X, COL5A1, ALK, ATM, VCAN, ZNF536, EPHA6, PDZRN3, PTPRC, ITGA4, DRD5, MYH11, DACH1, CTNNB1, TNR, NPAP1, TLR4, F8, ABCB5, RYR1, OR4C15, MYOM2, DMD, FOLH1, FRMPD4, ADAMTS17, SHANK1, FAM171A1, CCKBR, TRPS1, HMCN1, NTRK3, ATRX, AHNAK, SALL1, PRUNE2, CSMD1, PDGFRA, ADAMTS12, GRM1, SYNE2, OR8H2, TEP1, CCDC178, STK11, NID1, DCSTAMP, STAG2, MET, BCL11B, ZNF226, NTRK2, NEDD4, TMTC3, RBM15, KLITK 2, and CMKLR1.
With all the biomarkers, the AUC of the ROC curve of the resulting typing model reaches 0.786.
The invention also discloses application of the biomarker in the diagnosis of adenocarcinoma and squamous cell carcinoma in non-small cell lung cancer patients.
It is understood that, in detecting the above-mentioned biomarkers (i.e., genetic variations), various methods that can be used in the art to detect genetic variations can be employed, for example:
1) Sequencing technologies, including second-generation and third-generation sequencing. Second-generation sequencing is based on massively parallel sequencing (MPS); in some embodiments it may be: (i) sequencing by synthesis (SBS) based on DNA polymerase, with representative companies Illumina (reversible terminator sequencing), Thermo Fisher/Life Technologies (Ion Torrent), GenapSys, Roche Diagnostics (454 pyrosequencing), etc.; or (ii) sequencing by ligation (SBL) based on DNA ligase, with representative companies BGI (Huada Gene)/Complete Genomics (combinatorial Probe-Anchor Ligation, cPAL), Thermo Fisher/Applied Biosystems (Sequencing by Oligonucleotide Ligation and Detection, SOLiD), etc. Third-generation sequencing is single-molecule sequencing; in some embodiments it may be: single-molecule real-time fluorescence sequencing (SMRT, Pacific Biosciences), nanopore sequencing [Oxford Nanopore Technologies (ONT), Genia Technologies and Stratos Genomics (Roche Diagnostics)], nanogate sequencing (Nanogate, Quantum Biosystems), sequencing based on DNA hydrolysis (sequencing by de-synthesis, pyrophosphorolysis, Base4), or any other sequencing method known in the art.
2) Microarray hybridization techniques, such as SNP microarrays;
3) PCR-based detection of mutation sites, for example the KASP typing method, the ligase detection reaction (LDR) typing method, the TaqMan probe method and the like.
In one embodiment, the biomarker is a biomarker for blood and/or tissue detection.
It will be appreciated that the above assay is equally applicable to other biological sample types; blood and tissue samples, however, are easy to obtain and widely applicable.
The invention also discloses application of the reagent for detecting the biomarkers in the biological sample in preparing a non-small cell lung cancer typing diagnostic reagent or diagnostic equipment.
It will be appreciated that the reagent is, for example, a kit, and the device is, for example, an integrated detection device, and may be adapted to the needs of a particular application.
The invention also discloses a detection kit for non-small cell lung cancer typing diagnosis, which comprises a reagent for detecting the biomarkers of the claims.
The invention also discloses a non-small cell lung cancer typing diagnosis system, which comprises:
an analysis device: used for obtaining the genetic variation status of the biomarkers in a biological sample of a subject to be evaluated, and inputting the genetic variation status into an evaluation model for typing evaluation;
an output device: used for outputting the above evaluation result.
In one embodiment, the evaluation model is established as follows: a plurality of adenocarcinoma and squamous carcinoma biological samples are obtained and sequenced to obtain the gene mutation status of the biomarkers, and a typing model is built with a random forest model; the model with importance = TRUE, ntree = 100 and mtry = 2 is taken as the non-small cell lung cancer typing diagnosis model.
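The model-building step can be illustrated with a minimal sketch. The patent names the R package randomForest; the code below is not that package but a small pure-Python stand-in that mirrors only two of the stated settings (ntree = 100 trees, mtry = 2 candidate genes per split) and does not reproduce importance = TRUE. The function names, the 0/1 encoding of mutation status and the toy samples are all hypothetical and are not data from the patent.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def build_tree(rows, labels, mtry, rng, depth=0, max_depth=8):
    """Grow one tree on 0/1 features, trying `mtry` random genes at each split."""
    if len(set(labels)) == 1 or depth == max_depth:
        return Counter(labels).most_common(1)[0][0]
    best = None
    for j in rng.sample(range(len(rows[0])), mtry):
        left = [i for i, r in enumerate(rows) if r[j] == 0]
        right = [i for i, r in enumerate(rows) if r[j] == 1]
        if not left or not right:
            continue  # this gene does not separate the samples in this node
        score = (len(left) * gini([labels[i] for i in left])
                 + len(right) * gini([labels[i] for i in right]))
        if best is None or score < best[0]:
            best = (score, j, left, right)
    if best is None:
        return Counter(labels).most_common(1)[0][0]
    _, j, left, right = best
    grow = lambda idx: build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                  mtry, rng, depth + 1, max_depth)
    return (j, grow(left), grow(right))

def predict_tree(tree, row):
    """Follow the splits until a leaf (a class label) is reached."""
    while isinstance(tree, tuple):
        j, left, right = tree
        tree = right if row[j] == 1 else left
    return tree

def fit_forest(rows, labels, ntree=100, mtry=2, seed=0):
    """Train `ntree` trees, each on a bootstrap resample of the samples."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(ntree):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], mtry, rng))
    return forest

def predict(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

# Hypothetical toy data: columns are mutation status (1 = mutated) of these genes.
GENES = ["TP53", "STK11", "PTEN", "NFE2L2", "KRAS"]
SQUAMOUS = [[1, 0, 1, 1, 0]] * 4 + [[1, 0, 0, 1, 0]] * 3 + [[1, 0, 1, 0, 0]] * 3
ADENO = [[0, 1, 0, 0, 1]] * 4 + [[1, 0, 0, 0, 1]] * 3 + [[0, 1, 0, 0, 0]] * 3
ROWS = SQUAMOUS + ADENO
LABELS = ["squamous"] * len(SQUAMOUS) + ["adenocarcinoma"] * len(ADENO)
FOREST = fit_forest(ROWS, LABELS, ntree=100, mtry=2)
```

In terms of the system described above, `fit_forest` plays the role of the evaluation model and `predict(FOREST, row)` is the typing result that an output device would report for one subject.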
Compared with the prior art, the invention has the following beneficial effects:
The biomarkers for non-small cell lung cancer typing can be used in combination to distinguish squamous carcinoma from adenocarcinoma in non-small cell lung cancer. With the minimum set of 5 markers, the AUC of the typing diagnosis ROC curve is 0.700; with 10 markers, the AUC is 0.734; and with all the biomarkers, the AUC reaches 0.786, so the biomarkers have excellent diagnostic capability.
For patients with non-small cell lung cancer, the gene test report for the biomarkers can be used to cross-check the pathological result, which helps ensure that the pathological diagnosis is correct and plays an important role in subsequent precision treatment.
Drawings
FIG. 1 is the ROC curve, with AUC, showing discrimination performance in the validation set for the model built with all markers.
FIG. 2 is the ROC curve, with AUC, showing discrimination performance in the validation set for the model built with the combination of 20 markers.
FIG. 3 is the ROC curve, with AUC, showing discrimination performance in the validation set for the model built with the combination of 10 markers.
FIG. 4 is the ROC curve, with AUC, showing discrimination performance in the validation set for the model built with the combination of 5 markers.
Detailed Description
To facilitate an understanding of the invention, the invention will now be described more fully with reference to the accompanying drawings. Preferred embodiments of the present invention are shown in the drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The reagents used in the following examples, unless otherwise specified, are all commercially available; the methods used in the following examples, unless otherwise specified, were carried out in the conventional manner.
Explanation of terms:
TCGA: The Cancer Genome Atlas, a project initiated by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) that includes data for more than 30 tumor types. It is a comprehensive, multidimensional atlas of the genomes of various cancers. The data cover not only genomic sequencing but also transcriptome and epigenomic (e.g., methylation) sequencing, together with integrative analysis and correlation with clinical and imaging data.
In the invention, adenocarcinoma patients are patients with adenocarcinoma of non-small cell lung cancer, as jointly confirmed by 2 or more pathology experts on the basis of pathological test results; squamous carcinoma patients are patients with squamous carcinoma of non-small cell lung cancer, as jointly confirmed by 2 or more pathology experts on the basis of pathological test results.
Example 1
Based on a public database, variant gene markers for pathological subtype typing of non-small cell lung cancer were preliminarily screened by the following steps:
1. Screening candidate mutation sites.
Whole genome sequencing data of tumor tissue from non-small cell lung cancer patients were obtained from the TCGA database (https://portal.gdc.cancer.gov/). Data for 561 patients with non-small cell lung cancer (286 adenocarcinoma cases and 275 squamous carcinoma cases) were downloaded. Mutation sites were called with four different tools (mutect, varscan, muse and somaticsniper), and a site detected in a sample by at least two of the four tools was taken as a candidate mutation site.
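The "at least two of four callers" rule can be sketched as follows. This is an illustrative sketch only: the representation of a site as a (chromosome, position, ref, alt) tuple, the function name `consensus_sites` and the example calls are assumptions, not details given in the patent.

```python
from collections import Counter

CALLERS = ("mutect", "varscan", "muse", "somaticsniper")

def consensus_sites(calls_by_caller, min_callers=2):
    """Return the sites reported by at least `min_callers` distinct callers."""
    counts = Counter()
    for caller in CALLERS:
        # set() so that a caller reporting a site twice is counted once
        counts.update(set(calls_by_caller.get(caller, ())))
    return {site for site, n in counts.items() if n >= min_callers}

# Hypothetical calls for one sample: (chromosome, position, ref, alt)
sample_calls = {
    "mutect": [("chr17", 7674220, "C", "T"), ("chr12", 25245350, "C", "A")],
    "varscan": [("chr17", 7674220, "C", "T")],
    "muse": [("chr12", 25245350, "C", "A"), ("chr19", 1220321, "G", "T")],
    "somaticsniper": [],
}
candidates = consensus_sites(sample_calls)
```

Here the first two sites are each supported by two callers and are kept, while the third is reported by one caller only and is dropped.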
2. Screening potential markers.
A differential analysis was performed between the adenocarcinoma cohort dataset and the squamous carcinoma cohort dataset using fisher.test, and variant genes with p ≤ 0.05 were selected as potential markers; see Table 1.
TABLE 1 Potential markers
[Table 1 is provided as images in the original publication: Figure BDA0003058085790000051 and Figure BDA0003058085790000061]
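The per-gene differential test of step 2 can be sketched as a two-sided Fisher exact test on a 2x2 table (mutated vs not mutated, adenocarcinoma vs squamous carcinoma). The patent uses R's fisher.test; the pure-Python function below is an assumed equivalent for the 2x2 case, and the mutation counts in the example are hypothetical. Only the cohort sizes (286 and 275) and the threshold p ≤ 0.05 come from the text.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more probable than the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))

def is_potential_marker(mut_adeno, n_adeno, mut_squamous, n_squamous, alpha=0.05):
    """Apply the p <= 0.05 screening rule to one gene."""
    p = fisher_exact_two_sided(mut_adeno, n_adeno - mut_adeno,
                               mut_squamous, n_squamous - mut_squamous)
    return p <= alpha

# Hypothetical gene: mutated in 90 of 286 adenocarcinomas and 20 of 275 squamous carcinomas
selected = is_potential_marker(90, 286, 20, 275)
```

A gene mutated at the same rate in both cohorts, for example 30 of 286 vs 29 of 275, would not pass this screen.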
Example 2
In this example, the potential markers obtained above were analyzed and verified in clinical samples, with the following steps:
1. Tissue sample acquisition:
374 FFPE slide samples pathologically identified by experts as non-small cell lung cancer (191 adenocarcinomas, 183 squamous carcinomas) were collected from river-south university.
2. Sample sequencing analysis:
The FFPE tissue samples were subjected to whole genome sequencing analysis by a third party (clear biotechnology).
3. Model establishment.
3.1 Preliminary model establishment.
Using all the potential markers obtained in Example 1, the independent validation set, i.e., the tissue samples of the 191 adenocarcinoma and 183 squamous carcinoma patients with non-small cell lung cancer, was tested and classified. Modeling was performed with a random forest model (R package randomForest), with the samples split 6:4 and the procedure repeated 20 times. After repeated searching of the modeling conditions, importance = TRUE, ntree = 100 and mtry = 2 were set, giving a model ROC curve AUC of up to 0.786, as shown in FIG. 1.
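The evaluation loop described above (a 6:4 split, repeated 20 times, scored by ROC AUC) can be sketched as follows. The text does not say which part of the 6:4 split is used for training, so treating the 60% part as the training set is an assumption, as are the function names; the AUC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
import random

def split_6_4(n_samples, rng):
    """Shuffle sample indices and split them 60% / 40% (assumed train / validation)."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    cut = round(0.6 * n_samples)
    return idx[:cut], idx[cut:]

def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# 20 repeated splits of the 374 samples (191 adenocarcinomas + 183 squamous carcinomas)
rng = random.Random(0)
splits = [split_6_4(374, rng) for _ in range(20)]
```

For each split, a model would be trained on the first index list and `roc_auc` applied to its scores on the second; for example, `roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])` evaluates to 0.75 because three of the four positive-negative pairs are ordered correctly.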
3.2 Marker screening.
The tissue samples of the 191 adenocarcinoma and 183 squamous carcinoma patients with non-small cell lung cancer were tested and classified by random forest modeling, with a 6:4 split repeated 20 times. Pairwise correlation analysis of all markers was performed by Pearson correlation, and an optimal combination of 20 markers was obtained by an exhaustive random combination method; the ROC curve AUC of the resulting model is as high as 0.747, as shown in FIG. 2.
3.3 Preferred markers.
The tissue samples of the 191 adenocarcinoma and 183 squamous carcinoma patients with non-small cell lung cancer were tested and classified by random forest modeling, with a 6:4 split repeated 20 times. Pairwise correlation analysis of all markers was performed by Pearson correlation, and an optimal combination of 10 markers was obtained by an exhaustive random combination method; the ROC curve AUC of the resulting model is as high as 0.734, as shown in FIG. 3.
3.4 Further preferred markers.
The tissue samples of the 191 adenocarcinoma and 183 squamous carcinoma patients with non-small cell lung cancer were tested and classified by random forest modeling, with a 6:4 split repeated 20 times. Pairwise correlation analysis of all markers was performed by Pearson correlation, and an optimal combination of 5 markers was obtained by an exhaustive random combination method; the ROC curve AUC of the resulting model reaches 0.700, as shown in FIG. 4.
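The pairwise Pearson correlation step used in the marker selection above can be sketched as follows. The patent does not state how the correlations feed into the choice of marker combinations, so this sketch only computes them; the 0/1 encoding of mutation status, the function names and the four toy samples are assumptions.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def pairwise_correlations(matrix):
    """matrix maps gene -> list of 0/1 mutation calls; returns r for every gene pair."""
    return {(g1, g2): pearson(matrix[g1], matrix[g2])
            for g1, g2 in combinations(sorted(matrix), 2)}

# Hypothetical mutation calls for four samples
toy = {
    "TP53": [1, 1, 0, 0],
    "PTEN": [1, 1, 0, 0],
    "KRAS": [0, 0, 1, 1],
}
corr = pairwise_correlations(toy)
```

In this toy data TP53 and PTEN are perfectly correlated (r = 1) and each is perfectly anti-correlated with KRAS (r = -1); highly correlated markers carry overlapping information, which is one plausible reason to examine these values before reducing the panel.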
Example 3
Thirteen non-small cell lung cancer samples, clinically judged as in Example 2, collected from river university and distinct from the sample set of Example 2, were analyzed with the 20-marker combination model established in Example 2. The model results were compared with the clinical experts' judgments, as shown in the following table.
TABLE 2 Clinical verification results
Case      Model typing result    Expert judgment result
No. 1     Adenocarcinoma         Adenocarcinoma
No. 2     Squamous carcinoma     Squamous carcinoma
No. 3     Adenocarcinoma         Squamous carcinoma
No. 4     Squamous carcinoma     Adenocarcinoma
No. 5     Adenocarcinoma         Adenocarcinoma
No. 6     Adenocarcinoma         Adenocarcinoma
No. 7     Squamous carcinoma     Squamous carcinoma
No. 8     Adenocarcinoma         Adenocarcinoma
No. 9     Squamous carcinoma     Adenocarcinoma
No. 10    Adenocarcinoma         Adenocarcinoma
No. 11    Adenocarcinoma         Adenocarcinoma
No. 12    Adenocarcinoma         Adenocarcinoma
No. 13    Squamous carcinoma     Squamous carcinoma
From these results, the model built with the biomarkers of the invention can classify squamous carcinoma and adenocarcinoma in non-small cell lung cancer, and its consistency with expert judgment is 76.9% (10 of 13 cases).
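The consistency figure can be checked directly from Table 2. In this short sketch the two lists transcribe the model and expert columns of the table in case order, with "A" for adenocarcinoma and "S" for squamous carcinoma; the variable names are illustrative.

```python
# Table 2, cases No. 1 to No. 13 ("A" = adenocarcinoma, "S" = squamous carcinoma)
MODEL = ["A", "S", "A", "S", "A", "A", "S", "A", "S", "A", "A", "A", "S"]
EXPERT = ["A", "S", "S", "A", "A", "A", "S", "A", "A", "A", "A", "A", "S"]

# Cases where the model and the experts agree
agree = sum(m == e for m, e in zip(MODEL, EXPERT))
# Case numbers (1-based) where they disagree
discordant = [i + 1 for i, (m, e) in enumerate(zip(MODEL, EXPERT)) if m != e]
consistency_pct = round(100 * agree / len(MODEL), 1)
```

The model and the experts agree on 10 of the 13 cases (76.9%) and disagree on cases No. 3, No. 4 and No. 9.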
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (2)

1. A biomarker for the typing of squamous carcinoma and adenocarcinoma in non-small cell lung cancer consisting of TP53, STK11, PTEN, NFE2L2, KRAS, EGFR, CDKN2A, BRAF, TSHZ3, PIK3CA, SETD2, MUC16, RYR2, PTPR, PCLO, RP1L1, ASTN1, SPTA1, ASXL3 and XIRP2 genes.
2. Use of a reagent for detecting the expression level of the biomarker according to claim 1 in a biological sample for preparing a diagnostic reagent or a diagnostic device for the typing of squamous cell carcinoma and adenocarcinoma in non-small cell lung cancer.
CN202110505178.1A 2021-05-10 2021-05-10 Biomarker for non-small cell lung cancer typing and application thereof Active CN113186287B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110505178.1A CN113186287B (en) 2021-05-10 2021-05-10 Biomarker for non-small cell lung cancer typing and application thereof

Publications (2)

Publication Number Publication Date
CN113186287A CN113186287A (en) 2021-07-30
CN113186287B true CN113186287B (en) 2023-03-24

Family

ID=76988688

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110505178.1A Active CN113186287B (en) 2021-05-10 2021-05-10 Biomarker for non-small cell lung cancer typing and application thereof

Country Status (1)

Country Link
CN (1) CN113186287B (en)

Citations (1)

Publication number Priority date Publication date Assignee Title
WO2015137406A1 (en) * 2014-03-12 2015-09-17 学校法人順天堂 Method for differentiating between lung squamous cell carcinoma and lung adenocarcinoma

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
EP2806274A1 (en) * 2013-05-24 2014-11-26 AIT Austrian Institute of Technology GmbH Lung cancer diagnostic method and means
WO2015184461A1 (en) * 2014-05-30 2015-12-03 Faruki Hawazin Methods for typing of lung cancer
WO2017075784A1 (en) * 2015-11-05 2017-05-11 深圳华大基因研究院 Biomarker for detection of lung adenocarcinoma and use thereof
KR101853118B1 (en) * 2016-09-02 2018-04-30 주식회사 바이오인프라생명과학 Complex biomarker group for detecting lung cancer in a subject, lung cancer diagnostic kit using the same, method for detecting lung cancer using information on complex biomarker and computing system executing the method
CN108548929A (en) * 2018-04-11 2018-09-18 谢丽 Detect application of the articles for use of biomarker expression level in indicating cancerous state
CN112375826B (en) * 2020-12-03 2021-08-27 远见生物科技(上海)有限公司 Circular RNA composition marker for identifying non-small cell lung cancer subtype and application thereof

Also Published As

Publication number Publication date
CN113186287A (en) 2021-07-30

Similar Documents

Publication Publication Date Title
CN113186287B (en) Biomarker for non-small cell lung cancer typing and application thereof
JP6817259B2 (en) Use of size and number abnormalities in plasma DNA for the detection of cancer
US20150038376A1 (en) Thyroid cancer biomarker
CN105087568B (en) One group of gene and its application for tumor cells parting
US10731224B2 (en) Enhancement of cancer screening using cell-free viral nucleic acids
CN115699205A (en) Generating cancer detection analysis sets from performance metrics
CN106282161A (en) Special capture and repeat replication low frequency DNA base variation method and application
Nair et al. Genomic profiling of bronchoalveolar lavage fluid in lung cancer
Kessler et al. Improving cancer detection and treatment with liquid biopsies and ptDNA
CN111028888A (en) Detection method of genome-wide copy number variation and application thereof
Hobbs et al. Biostatistics and bioinformatics in clinical trials
CN110964821A (en) Detection panel for predicting liver cancer metastasis mode and risk and application thereof
CN115472294B (en) Model for predicting transformation speed of small cell transformation lung adenocarcinoma patient and construction method thereof
US20240102100A1 (en) Ribosomal rnas 2'o-methylation as a novel source of biomarkers relevant for diagnosis, prognosis and therapy of cancers
CN114214409A (en) Biomarker for esophageal cancer typing and application thereof
CN115820860A (en) Method for screening non-small cell lung cancer marker based on methylation difference of enhancer, marker and application thereof
CN107699620A (en) Methylated genes composition and the purposes for preparing diagnosis indication Luminal Type B Bone of Breast Cancer transfering reagent boxes

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant