CN113186287A - Biomarker for non-small cell lung cancer typing and application thereof - Google Patents
Biomarker for non-small cell lung cancer typing and application thereof
- Publication number
- CN113186287A (application number CN202110505178.1A)
- Authority
- CN
- China
- Prior art keywords
- lung cancer
- small cell
- biomarker
- cell lung
- typing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/20—Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/10—Signal processing, e.g. from mass spectrometry [MS] or from PCR
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/112—Disease subtyping, staging or classification
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Chemical & Material Sciences (AREA)
- Medical Informatics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Genetics & Genomics (AREA)
- Biophysics (AREA)
- Biotechnology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Organic Chemistry (AREA)
- Analytical Chemistry (AREA)
- Pathology (AREA)
- Theoretical Computer Science (AREA)
- Public Health (AREA)
- Immunology (AREA)
- Epidemiology (AREA)
- Databases & Information Systems (AREA)
- Biomedical Technology (AREA)
- Zoology (AREA)
- Data Mining & Analysis (AREA)
- Wood Science & Technology (AREA)
- Signal Processing (AREA)
- Oncology (AREA)
- Artificial Intelligence (AREA)
- Bioethics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Hospice & Palliative Care (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Microbiology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Biochemistry (AREA)
Abstract
The invention relates to a biomarker for non-small cell lung cancer typing and application thereof, belonging to the technical field of medical detection. The biomarker comprises at least 5 genes such as TP53, STK11, PTEN, NFE2L2 and KRAS. Using these biomarkers, squamous carcinoma and adenocarcinoma in non-small cell lung cancer can be typed: with the minimum of 5 markers, the AUC of the typing ROC curve is 0.700; with 10 markers, the AUC is 0.734; and with all the biomarkers, the AUC reaches 0.786, showing excellent diagnostic ability.
Description
Technical Field
The invention relates to the technical field of medical detection, in particular to a biomarker for non-small cell lung cancer typing and application thereof.
Background
Lung cancer is a heterogeneous disease, and current treatment selection rests mainly on pathological typing and staging. Pathological typing generally determines the subtype histologically; important distinctions include small cell vs non-small cell and adenocarcinoma vs squamous cell carcinoma. Distinguishing the morphological subtypes of lung cancer is essential for guiding patient management, because different pathological subtypes call for different therapeutic strategies. For example, small cell (undifferentiated) lung cancer is highly malignant, prone to early metastasis, and sensitive to radiotherapy and chemotherapy, so it is treated mainly non-surgically, with systemic chemotherapy and local radiotherapy. Non-small cell lung cancer consists mainly of squamous carcinoma and adenocarcinoma; stage I and II non-small cell lung cancer is treated primarily by surgery and can be cured in combination with postoperative adjuvant radiotherapy and chemotherapy. Within non-small cell lung cancer, adenocarcinoma tumor cells grow quickly and most adenocarcinomas metastasize early, mainly hematogenously; they are more sensitive to chemotherapeutic drugs and respond poorly to radiotherapy, so surgery, chemotherapy, immunotherapy, targeted therapy and the like are often chosen. Squamous carcinoma grows relatively slowly, is mostly locally invasive in the early stage, spreads mainly through the lymph nodes, and metastasizes to distant sites later; it is more sensitive to radiotherapy, so surgery, radiotherapy, immunotherapy and the like are generally used.
However, conventional pathological assessment of lung cancer mostly relies only on histological diagnosis. Pathological subtypes can be mistyped because of a physician's personal experience and various other factors, so the error rate is high, and no other quality control means is available.
In addition to conventional chemoradiotherapy, lung cancer treatment is now entering the era of precision medicine, and conventional imaging and pathological diagnosis no longer meet its demands. Accurate diagnosis of lung cancer at the molecular level, beyond traditional diagnostic methods, is increasingly important: it can detect specific therapeutic target markers, i.e., molecular markers such as tumor-specific gene mutations and characteristic proteins, RNAs and metabolites.
Detection of gene variation has become a necessary step in routine treatment. The most notable example is detection of the epidermal growth factor receptor (EGFR): lung cancer patients in whom this protein is dysfunctional (usually because of gene mutation) respond markedly to drugs that specifically target the EGFR protein, and detection of variation in this target gene has entered the relevant lung cancer treatment guidelines.
Moreover, detection of variant genes related to drug targets is a required item for lung cancer patients, especially those with advanced disease. Limited tissue samples and the ever-increasing need to assess therapeutic target markers greatly raise current diagnostic demands, while studies of the reproducibility of histological diagnosis have shown intra- and inter-pathologist variability in decisions. Incorrect pathological judgments, poorly differentiated tumors, contradictory immunohistochemical results and the like challenge the accuracy of current precision medicine for lung cancer. Therefore, a reliable means of determining the pathological subtype of lung cancer is needed.
Disclosure of Invention
In view of the above, it is necessary to provide a biomarker for non-small cell lung cancer typing. The biomarker is derived from the different profiles of variant genes in adenocarcinoma versus squamous cell carcinoma of non-small cell lung cancer, and provides a way to discriminate the two pathological subtypes at the molecular level.
A biomarker for non-small cell lung cancer typing comprising: 15, at least one of genes including NFE2L2, TP53, CDKN2A, PTEN, MUC16, PIK3CA, RYR2, ATP10A, SLCO1B1, RASA1, ZFH 4, KMT2D, KRAS, EGFR, ANK1, BRAF, PCLO, PTPRD, ASTN1, ADGRG4, NOTCH4, FAT3, PCDH15, ROBO 15, KEAP 15, TENM 15, TSHZ 15, SETBP 15, CACNA1 15, XIRP 15, ASXL 15, ZNF804 15, NALCN, FBN 15, SPTA 15, MUC 15, RBM15, SENN 15, MX3672, ST6GAL 15, GALRP 1L 15, HEPM, FLG 15, SANTC 15, SANTCABG 15, SANTC 15, CAMTB 15, CAMTBF 15, CABG 15, CADG 15, CABG 15, CANTC.
The present inventors conducted extensive studies on the gene mutation patterns of non-small cell lung cancer and obtained genes with a higher mutation frequency in squamous cell carcinoma than in adenocarcinoma (NFE2L2, TP53, CDKN2A, PTEN, MUC16, PIK3CA, RYR2, ATP10A, SLCO1B1, RASA1, ZFHX4, KMT2D), genes with a higher mutation frequency in adenocarcinoma than in squamous cell carcinoma (KRAS, EGFR, ANK1, BRAF, PCLO, PTPRD, ASTN1, ADGRG4, NOTCH4, FAT3, PCDH15, ROBO, KEAP, TENM, TENHZ, SECW, CACNA1, TBXRP, ASXL, ZNF804, NALCN, FBN, SPTA, TSC, RBM, SETD, MXRA, ST6GAL, NN1L, DPPM, FLG, HECW, COL12A, COL14A, PRF, ARCA, SMARCA, SACK, SARB, GRM1, SYNE2, OR8H2, TEP1, CCDC178), and mutant genes found only in adenocarcinoma (STK11, NID1, DCSTAMP, STAG2, MET, BCL11B, ZNF226, NTRK2, NEDD4, BTK, TMTC3, RBM15, KLK2, ITK, CMKLR1), and established a discrimination model based on these genes.
In one embodiment, the biomarker comprises: TP53, STK11, PTEN, NFE2L2 and KRAS genes.
For molecular diagnosis, reducing the number of diagnostic markers as far as possible while preserving diagnostic sensitivity and specificity effectively lowers operational complexity, ensures the repeatability and reliability of the detection results, and greatly reduces the economic burden on patients. This embodiment provides a combined application of 5 biomarkers, which lowers the application cost; the AUC of the ROC curve of the typing model can reach 0.700.
In one embodiment, the biomarker comprises: TP53, STK11, PTEN, NFE2L2, KRAS, EGFR, CDKN2A, BRAF, TSHZ3 and PIK3CA genes. When typing is performed with these 10 biomarkers, the AUC of the ROC curve of the resulting model can reach 0.734.
In one embodiment, the biomarker comprises: TP53, STK11, PTEN, NFE2L2, KRAS, EGFR, CDKN2A, BRAF, TSHZ3, PIK3CA, SETD2, MUC16, RYR2, PTPRD, PCLO, RP1L1, ASTN1, SPTA1, ASXL3 and XIRP2 genes. When typing is performed with these 20 biomarkers, the AUC of the ROC curve of the resulting model can reach 0.747.
In one embodiment, the biomarker comprises: NFE2L, TP, CDKN2, PTEN, MUC, PIK3CA, RYR, ATP10, SLCO1B, RASA, ZFLX, KMT2, KRAS, EGFR, ANK, BRAF, PCLO, PTPRD, ASTN, ADGRG, NOTCH, FAT, PCDH, ROBO, KEAP, TENM, TSHZ, SETBP, CACNA1, XIRP, ASXL, ZNF804, NALCN, FBN, SPTA, MUC, RBM, SETD, MXRA, ST6GAL, RP1L, ASPM, FLG, HECW, COL12A, COL14A, AFF, SMARCA, SDK, EPHB, UBA, MGSF, MAM, PCDH11, NN 5A, ALK, VCN, ATM, EPAN, PDF 536, EPHA, PTPRRN, ZRN, DRGA, DRHB, SARB, SANTC, SANTCP, SANTC, SANTCP, PRNG, SANTC, SANTCK, SANTCP, SANTC, TACK, SANTC, TACK, SANTC, SANTCP, SANTC, SANTCK, SANTCP, SANTC, SANTER, SANTC, SANTCK, SANTC, SANTER, SANTC, SANTER, SANTCK, SANTER, SANTC, SANTER, SANTC, SANTER, SANTCK, SANTER, SANTC, SANTER, SANT.
When typing is performed with all of the biomarkers, the AUC of the ROC curve of the resulting model can reach 0.786.
The invention also discloses application of the biomarker in the diagnosis of adenocarcinoma and squamous cell carcinoma in non-small cell lung cancer patients.
It will be appreciated that in detecting the above-described biomarkers (i.e., genetic variations), various methods that can be used in the art to detect genetic variations can be employed, for example:
1) Sequencing technologies, including second-generation and third-generation sequencing. Second-generation sequencing is based on massively parallel sequencing (MPS) and, in some embodiments, may be: (a) sequencing by synthesis (SBS) based on DNA polymerase, with representative companies including Illumina (reversible terminator sequencing), Thermo Fisher/Life Technologies (Ion Torrent), GenapSys and Roche Diagnostics (454 pyrosequencing); or (b) sequencing by ligation (SBL) based on DNA ligase, with representative companies including BGI/Complete Genomics (combinatorial probe-anchor ligation, cPAL) and Thermo Fisher/Applied Biosystems (Sequencing by Oligonucleotide Ligation and Detection, SOLiD). Third-generation sequencing is single-molecule sequencing and, in some embodiments, may be: single-molecule real-time sequencing (SMRT, Pacific Biosciences), nanopore sequencing [Oxford Nanopore Technologies (ONT), Genia Technologies and Stratos Genomics (Roche Diagnostics)], nanogate sequencing (Quantum Biosystems), sequencing by de-synthesis based on DNA pyrophosphorolysis (Base4), or any other sequencing method known in the art;
2) Microarray hybridization techniques: such as SNP microarrays and the like;
3) PCR-based detection technology for mutation sites: for example, KASP typing method, Ligase Detection Reaction (LDR) typing method, Taqman probe method and the like.
In one embodiment, the biomarker is a biomarker for blood and/or tissue detection.
It will be appreciated that the above detection is equally applicable to other types of biological sample; blood and tissue samples, however, are easy to obtain and widely applicable.
The invention also discloses application of the reagent for detecting the biomarkers in the biological sample in preparing a non-small cell lung cancer typing diagnostic reagent or diagnostic equipment.
It will be appreciated that the reagent is, for example, a kit, and the device is, for example, an integrated detection device, and may be adapted to the needs of a particular application.
The invention also discloses a detection kit for non-small cell lung cancer typing diagnosis, which comprises a reagent for detecting the biomarkers of the claims.
The invention also discloses a non-small cell lung cancer typing diagnosis system, which comprises:
an analysis device: used to obtain the genetic variation status of the above biomarkers in a biological sample of a subject to be evaluated, and to input that status into an evaluation model for typing evaluation;
an output device: used to output the evaluation result.
In one embodiment, the evaluation model is established as follows: a plurality of adenocarcinoma and squamous carcinoma biological samples are obtained and sequenced to determine the mutation status of the biomarker genes, and a typing model is built with a random forest model using the parameters importance = TRUE, ntree = 100 and mtry = 2, giving the non-small cell lung cancer typing diagnosis model.
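The data flow of such a system can be illustrated with a minimal Python sketch. It assumes each sample is summarized as the set of panel genes found mutated; the 5-gene panel is taken from the text, while `encode`, `placeholder_model` and `typing_system` are hypothetical names, and the rule inside `placeholder_model` is only a toy stand-in for a trained model, not the random forest described above.

```python
# 5-marker panel named in the text; the order fixes the feature positions.
PANEL = ["TP53", "STK11", "PTEN", "NFE2L2", "KRAS"]

def encode(mutated_genes, panel=PANEL):
    """Turn the set of mutated genes in one sample into a 0/1 feature vector."""
    return [1 if gene in mutated_genes else 0 for gene in panel]

def placeholder_model(features):
    """Toy rule standing in for a trained model (assumption, NOT the
    random forest of the text): call adenocarcinoma when KRAS or STK11
    is mutated, squamous carcinoma otherwise."""
    status = dict(zip(PANEL, features))
    if status["KRAS"] or status["STK11"]:
        return "adenocarcinoma"
    return "squamous carcinoma"

def typing_system(mutated_genes, model=placeholder_model):
    """Analysis step (encode + evaluate); the return value is what an
    output device would report."""
    features = encode(mutated_genes)
    return model(features)

print(typing_system({"KRAS", "TP53"}))
```

In practice the `model` argument would be replaced by a classifier trained on sequenced adenocarcinoma and squamous carcinoma samples; the encoding step stays the same.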
Compared with the prior art, the invention has the following beneficial effects:
With the biomarker for non-small cell lung cancer typing of the invention, squamous carcinoma and adenocarcinoma in non-small cell lung cancer can be typed using a combination of the biomarkers. With the minimum of 5 markers, the AUC of the typing ROC curve is 0.700; with 10 markers, the AUC is 0.734; and with all the biomarkers, the AUC reaches 0.786, so the biomarker has excellent diagnostic ability.
For patients with non-small cell lung cancer, the gene test report for these biomarkers can be cross-checked against the pathological result, helping to ensure that the pathological diagnosis is correct, which is important for subsequent precision treatment.
Drawings
FIG. 1 is the ROC curve (with AUC) showing the discrimination performance on the validation set of the model built with all markers.
FIG. 2 is the ROC curve (with AUC) showing the discrimination performance on the validation set of the model built with the 20-marker combination.
FIG. 3 is the ROC curve (with AUC) showing the discrimination performance on the validation set of the model built with the 10-marker combination.
FIG. 4 is the ROC curve (with AUC) showing the discrimination performance on the validation set of the model built with the 5-marker combination.
Detailed Description
To facilitate an understanding of the invention, the invention will now be described more fully with reference to the accompanying drawings. Preferred embodiments of the present invention are shown in the drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The reagents used in the following examples, unless otherwise specified, are all commercially available; the methods used in the following examples, unless otherwise specified, were carried out in the conventional manner.
Explanation of terms:
TCGA: The Cancer Genome Atlas, a project initiated by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) that includes data for more than 30 tumor types. It is a comprehensive, multidimensional atlas of various cancer genomes, covering not only genome sequencing but also transcriptome and epigenome (e.g., methylation) sequencing, together with integrative analysis and correlation with clinical and imaging data.
In the invention, adenocarcinoma patients are non-small cell lung adenocarcinoma patients whose diagnosis, based on pathological test results, was jointly confirmed by 2 or more pathology experts; squamous cell carcinoma patients are non-small cell lung squamous cell carcinoma patients whose diagnosis, based on pathological test results, was jointly confirmed by 2 or more pathology experts.
Example 1
Based on a public database, variant gene markers for pathological subtype typing of non-small cell lung cancer were preliminarily screened as follows:
1. Screening candidate mutation sites.
Tumor tissue whole genome sequencing data for non-small cell lung cancer patients were obtained from the TCGA database (https://portal.gdc.cancer.gov/). Whole genome sequencing data were downloaded for 561 patients with non-small cell lung cancer (286 adenocarcinoma cases and 275 squamous cell carcinoma cases). Mutation sites were called with four different programs (MuTect, VarScan, MuSE and SomaticSniper), and sites detected in a sample by at least two of the four callers were taken as candidate mutation sites.
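The "at least two of four callers" rule can be sketched as follows. This is a minimal illustration, assuming each caller's output for one sample has already been reduced to a set of (chromosome, position, ref, alt) tuples; the coordinates below are made up for the example, and `consensus_sites` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical per-caller calls for one tumor sample (invented coordinates).
calls_by_caller = {
    "mutect": {("chr17", 7674220, "C", "T"), ("chr12", 25245350, "C", "A")},
    "varscan": {("chr17", 7674220, "C", "T"), ("chr10", 87933147, "C", "T")},
    "muse": {("chr17", 7674220, "C", "T"), ("chr12", 25245350, "C", "A")},
    "somaticsniper": {("chr19", 1220321, "G", "T")},
}

def consensus_sites(calls, min_callers=2):
    """Return the sites reported by at least `min_callers` callers."""
    # Each caller contributes a set, so a site is counted once per caller.
    support = Counter(site for sites in calls.values() for site in sites)
    return {site for site, n in support.items() if n >= min_callers}

candidates = consensus_sites(calls_by_caller)
for site in sorted(candidates):
    print(site)
```

Raising `min_callers` trades sensitivity for specificity; the text uses a threshold of two.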
2. Screening potential markers.
A differential analysis was performed between the adenocarcinoma cohort dataset and the squamous carcinoma cohort dataset using Fisher's exact test (fisher.test), and variant genes with p less than or equal to 0.05 were selected as potential markers; see the table below.
TABLE 1 potential markers
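The per-gene screening step can be sketched in Python as a two-sided Fisher's exact test on a 2x2 table of mutated vs non-mutated samples in each cohort. The cohort sizes (286 and 275) come from the text; the mutation counts and the function name `fisher_exact_two_sided` are assumptions for illustration, and the source itself reports using fisher.test rather than this code.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x mutated samples in row 1
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-7))

# Hypothetical gene mutated in 90 of 286 adenocarcinomas and 10 of 275 squamous carcinomas
p_biased = fisher_exact_two_sided(90, 286 - 90, 10, 275 - 10)
# Hypothetical gene mutated at nearly the same rate in both cohorts
p_flat = fisher_exact_two_sided(30, 286 - 30, 29, 275 - 29)

for name, p in (("biased gene", p_biased), ("flat gene", p_flat)):
    print(name, "selected" if p <= 0.05 else "rejected", f"(p = {p:.3g})")
```

Because one test is run per gene, a real screen would normally also consider multiple-testing correction; the source states only the p ≤ 0.05 threshold.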
Example 2
In this example, the potential markers obtained above were analyzed and verified in clinical samples, as follows:
1. Obtaining tissue samples:
374 FFPE slide samples pathologically identified by the relevant experts as non-small cell lung cancer (191 adenocarcinomas, 183 squamous carcinomas) were collected from river-south university.
2. Sample sequencing analysis:
FFPE tissue samples were subjected to whole genome sequencing analysis by a third party (clear biotechnology).
3. Establishing a model.
3.1 Preliminary model establishment.
Using all the potential markers obtained in Example 1, the independent validation set (tissue samples from 191 adenocarcinoma and 183 squamous carcinoma patients with non-small cell lung cancer) was tested, and modeling analysis was performed with a random forest model (R package randomForest). The samples were split in a 6:4 ratio, the procedure was repeated 20 times, and the model-building conditions were searched repeatedly; with importance = TRUE, ntree = 100 and mtry = 2, the ROC curve AUC of the model reached 0.786, as shown in FIG. 1.
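The evaluation scheme (a 6:4 split repeated 20 times, scored by ROC AUC) can be sketched in pure Python. This is only an illustration under stated assumptions: the cohort is synthetic, the per-gene mutation probabilities are invented, and `fit_frequency_model` is a simple placeholder classifier rather than the random forest used in the text. Averaging the AUC over the repeats is also an assumption, since the source does not say how the 20 runs are summarized.

```python
import random

def roc_auc(labels, scores):
    """AUC = probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_frequency_model(train, genes):
    """Placeholder classifier (NOT a random forest): weight each gene by its
    mutation-frequency difference between the classes in the training split."""
    def freq(gene, label):
        group = [x for x, y in train if y == label]
        return sum(gene in x for x in group) / len(group)
    weights = {g: freq(g, 1) - freq(g, 0) for g in genes}
    return lambda mutated: sum(weights[g] for g in mutated)

def repeated_holdout_auc(samples, genes, n_repeats=20, train_fraction=0.6, seed=0):
    """Mean test-set AUC over repeated random 6:4 splits."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_repeats):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train, test = shuffled[:cut], shuffled[cut:]
        score = fit_frequency_model(train, genes)
        aucs.append(roc_auc([y for _, y in test], [score(x) for x, _ in test]))
    return sum(aucs) / len(aucs)

# Synthetic cohort: 191 adenocarcinoma (label 1) and 183 squamous (label 0).
# The mutation probabilities below are invented for illustration only.
GENES = ["KRAS", "NFE2L2", "TP53"]
PROB = {1: {"KRAS": 0.30, "NFE2L2": 0.02, "TP53": 0.50},
        0: {"KRAS": 0.03, "NFE2L2": 0.15, "TP53": 0.80}}
data_rng = random.Random(42)
samples = [({g for g in GENES if data_rng.random() < PROB[label][g]}, label)
           for label, n in ((1, 191), (0, 183)) for _ in range(n)]

mean_auc = repeated_holdout_auc(samples, GENES)
print(f"mean AUC over 20 splits: {mean_auc:.3f}")
```

Scoring only on the held-out 40% in each repeat keeps the AUC estimate from being inflated by samples the model has already seen.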
3.2 Screening the markers.
Tissue samples from the 191 adenocarcinoma and 183 squamous carcinoma patients with non-small cell lung cancer were tested, and modeling analysis was performed with a random forest model, with a 6:4 split repeated 20 times. Pairwise correlation analysis of all markers was performed using Pearson correlation, and the optimal combination of 20 markers was obtained by an exhaustive random combination method; the ROC curve AUC of the resulting model reached 0.747, as shown in FIG. 2.
3.3 Preferred markers.
Using the same samples and the same random forest modeling (6:4 split, repeated 20 times), together with pairwise Pearson correlation analysis of all markers, the optimal combination of 10 markers was obtained by the exhaustive random combination method; the ROC curve AUC of the resulting model reached 0.734, as shown in FIG. 3.
3.4 Further preferred markers.
Using the same samples and the same random forest modeling (6:4 split, repeated 20 times), together with pairwise Pearson correlation analysis of all markers, the optimal combination of 5 markers was obtained by the exhaustive random combination method; the ROC curve AUC of the resulting model reached 0.700, as shown in FIG. 4.
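One way to read the panel-reduction steps above is sketched below: every k-marker subset is enumerated, subsets containing a highly correlated pair are skipped, and the best-scoring subset is kept. The text does not state how the Pearson correlations enter the search, so the filter and its threshold are assumptions; the toy data, the marker `MARKER_X`, and the `separation` score (standing in for the model AUC) are likewise hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def best_panel(columns, score, k, max_abs_r=0.8):
    """Score every k-marker subset, skipping subsets with a highly correlated pair."""
    best = None
    for panel in combinations(sorted(columns), k):
        if any(abs(pearson(columns[a], columns[b])) > max_abs_r
               for a, b in combinations(panel, 2)):
            continue
        s = score(panel)
        if best is None or s > best[1]:
            best = (panel, s)
    return best

# Toy mutation matrix: 4 adenocarcinoma (1) and 4 squamous (0) samples.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "KRAS":     [1, 1, 0, 1, 0, 0, 0, 0],
    "MARKER_X": [1, 1, 0, 1, 0, 0, 0, 0],  # hypothetical duplicate of KRAS
    "TP53":     [1, 0, 1, 0, 1, 1, 1, 0],
    "PTEN":     [0, 0, 0, 0, 1, 0, 1, 0],
}

def separation(panel):
    """Toy score: summed between-class difference in mutation frequency."""
    total = 0.0
    for g in panel:
        pos = [v for v, y in zip(columns[g], labels) if y == 1]
        neg = [v for v, y in zip(columns[g], labels) if y == 0]
        total += abs(sum(pos) / len(pos) - sum(neg) / len(neg))
    return total

best = best_panel(columns, separation, k=2)
print(best)
```

Exhaustive enumeration grows combinatorially with the number of candidate markers, which may be why the source describes the search as partly random; that detail is not specified and is not modeled here.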
Example 3
13 clinically diagnosed non-small cell lung cancer samples, collected from river university and distinct from the sample set of Example 2, were analyzed using the 20-marker combination model established in Example 2, and the results were compared with the clinical experts' judgments, as shown in the following table.
TABLE 2 clinical verification results
Case | Model typing result | Expert judgment |
---|---|---|
No. 1 | Adenocarcinoma | Adenocarcinoma |
No. 2 | Squamous carcinoma | Squamous carcinoma |
No. 3 | Adenocarcinoma | Squamous carcinoma |
No. 4 | Squamous carcinoma | Adenocarcinoma |
No. 5 | Adenocarcinoma | Adenocarcinoma |
No. 6 | Adenocarcinoma | Adenocarcinoma |
No. 7 | Squamous carcinoma | Squamous carcinoma |
No. 8 | | Adenocarcinoma |
No. 9 | Squamous carcinoma | Adenocarcinoma |
No. 10 | Adenocarcinoma | Adenocarcinoma |
No. 11 | Adenocarcinoma | Adenocarcinoma |
No. 12 | Adenocarcinoma | Adenocarcinoma |
No. 13 | Squamous carcinoma | Squamous carcinoma |
From these results, the model built with the biomarkers of the invention can type squamous cell carcinoma and adenocarcinoma in non-small cell lung cancer, with a consistency with expert judgment reaching 76.9%.
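The agreement figure can be checked against Table 2 with a short Python sketch. The rows are transcribed from the table; the model result for case 8 is not legible in the table, so it is left as `None` here rather than guessed, and the two bounds show what the overall agreement would be in either case.

```python
A, S = "Adenocarcinoma", "Squamous carcinoma"

# (case, model typing result, expert judgment) transcribed from Table 2
rows = [
    (1, A, A), (2, S, S), (3, A, S), (4, S, A), (5, A, A),
    (6, A, A), (7, S, S),
    (8, None, A),  # model result missing in the table
    (9, S, A), (10, A, A), (11, A, A), (12, A, A), (13, S, S),
]

known = [(m, e) for _, m, e in rows if m is not None]
agree = sum(m == e for m, e in known)
print(f"{agree} of {len(known)} recorded cases agree")

# Overall agreement over all 13 cases, depending on the missing entry
lower = agree / len(rows)        # if case 8 disagrees with the expert
upper = (agree + 1) / len(rows)  # if case 8 agrees with the expert
print(f"overall agreement between {lower:.1%} and {upper:.1%}")
```

The upper bound equals 10 of 13 cases, which matches the 76.9% figure stated in the text.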
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (10)
1. A biomarker for typing non-small cell lung cancer, comprising: 15, at least one of genes including NFE2L2, TP53, CDKN2A, PTEN, MUC16, PIK3CA, RYR2, ATP10A, SLCO1B1, RASA1, ZFH 4, KMT2D, KRAS, EGFR, ANK1, BRAF, PCLO, PTPRD, ASTN1, ADGRG4, NOTCH4, FAT3, PCDH15, ROBO 15, KEAP 15, TENM 15, TSHZ 15, SETBP 15, CACNA1 15, XIRP 15, ASXL 15, ZNF804 15, NALCN, FBN 15, SPTA 15, MUC 15, RBM15, SENN 15, MX3672, ST6GAL 15, GALRP 1L 15, HEPM, FLG 15, SANTC 15, SANTCABG 15, SANTC 15, CAMTB 15, CAMTBF 15, CABG 15, CADG 15, CABG 15, CANTC.
2. The biomarker for non-small cell lung cancer typing according to claim 1, comprising: TP53, STK11, PTEN, NFE2L2 and KRAS genes.
3. The biomarker for non-small cell lung cancer typing according to claim 1, comprising: TP53, STK11, PTEN, NFE2L2, KRAS, EGFR, CDKN2A, BRAF, TSHZ3 and PIK3CA genes.
4. The biomarker for non-small cell lung cancer typing according to claim 1, comprising: TP53, STK11, PTEN, NFE2L2, KRAS, EGFR, CDKN2A, BRAF, TSHZ3, PIK3CA, SETD2, MUC16, RYR2, PTPRD, PCLO, RP1L1, ASTN1, SPTA1, ASXL3 and XIRP2 genes.
5. Use of a biomarker according to any of claims 1 to 4 in the differential diagnosis of adenocarcinoma and squamous cell carcinoma in a patient with non-small cell lung cancer.
6. Use according to claim 5, wherein the biomarker is used as a biomarker for blood and/or tissue detection.
7. Use of a reagent for detecting a biomarker according to any of claims 1 to 4 in a biological sample for the preparation of a non-small cell lung cancer typing diagnostic reagent or diagnostic device.
8. A test kit for the differential diagnosis of non-small cell lung cancer comprising reagents for detecting the biomarkers of any one of claims 1-4.
9. A system for the differential diagnosis of non-small cell lung cancer, comprising:
an analysis device: used to obtain the genetic variation status of the biomarker of any one of claims 1 to 4 in a biological sample of a subject to be evaluated, and to input that status into an evaluation model for typing evaluation;
an output device: used to output the evaluation result.
10. The system for non-small cell lung cancer differential diagnosis according to claim 9, wherein the evaluation model is established as follows: a plurality of adenocarcinoma and squamous carcinoma biological samples are obtained and sequenced to determine the mutation status of the biomarker genes, and a typing model is built with a random forest model using the parameters importance = TRUE, ntree = 100 and mtry = 2, giving the non-small cell lung cancer typing diagnosis model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110505178.1A CN113186287B (en) | 2021-05-10 | 2021-05-10 | Biomarker for non-small cell lung cancer typing and application thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110505178.1A CN113186287B (en) | 2021-05-10 | 2021-05-10 | Biomarker for non-small cell lung cancer typing and application thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113186287A (en) | 2021-07-30 |
CN113186287B (en) | 2023-03-24 |
Family
ID=76988688
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110505178.1A Active CN113186287B (en) | 2021-05-10 | 2021-05-10 | Biomarker for non-small cell lung cancer typing and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113186287B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160109453A1 (en) * | 2013-05-24 | 2016-04-21 | Ait Austrian Institute Of Technology Gmbh | Lung Cancer Diagnostic Method and Means |
WO2015137406A1 (en) * | 2014-03-12 | 2015-09-17 | 学校法人順天堂 | Method for differentiating between lung squamous cell carcinoma and lung adenocarcinoma |
US20170114416A1 (en) * | 2014-05-30 | 2017-04-27 | The University Of North Carolina At Chapel Hill | Methods for typing of lung cancer |
WO2017075784A1 (en) * | 2015-11-05 | 2017-05-11 | 深圳华大基因研究院 | Biomarker for detection of lung adenocarcinoma and use thereof |
CN107796942A (en) * | 2016-09-02 | 2018-03-13 | 生命基础公司 | For the compound bio mark group of pulmonary cancer diagnosis, pulmonary cancer diagnosis kit, method and computing system using its information |
CN108548929A (en) * | 2018-04-11 | 2018-09-18 | 谢丽 | Detect application of the articles for use of biomarker expression level in indicating cancerous state |
CN112375826A (en) * | 2020-12-03 | 2021-02-19 | 远见生物科技(上海)有限公司 | Circular RNA composition marker for identifying non-small cell lung cancer subtype and application thereof |
Non-Patent Citations (2)
Title |
---|
BINGJI CAO et al.: "Use of four genes in exosomes as biomarkers for the identification of lung adenocarcinoma and lung squamous cell carcinoma", ONCOL LETT * |
YANYAN WU et al.: "Identification of subtype specific biomarkers of clear cell renal cell carcinoma using random forest and greedy algorithm", BIOSYSTEMS * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11578372B2 (en) | 2012-11-05 | 2023-02-14 | Foundation Medicine, Inc. | NTRK1 fusion molecules and uses thereof |
US11771698B2 (en) | 2013-01-18 | 2023-10-03 | Foundation Medicine, Inc. | Methods of treating cholangiocarcinoma |
WO2022147163A1 (en) * | 2020-12-30 | 2022-07-07 | Foundation Medicine, Inc. | Alk fusion genes and uses thereof |
CN114295706A (en) * | 2021-09-28 | 2022-04-08 | 岛津企业管理(中国)有限公司 | Statistic-based pathological typing method for non-targeted non-small cell lung cancer |
CN114134228A (en) * | 2021-10-09 | 2022-03-04 | 复旦大学附属中山医院 | Kit, system and storage medium for evaluating PI3K/Akt/mTOR pathway related gene mutation and application thereof |
CN114134228B (en) * | 2021-10-09 | 2024-05-03 | 复旦大学附属中山医院 | Kit, system and storage medium for evaluating PI3K/Akt/mTOR pathway related gene mutation and application thereof |
CN113881777A (en) * | 2021-11-12 | 2022-01-04 | 首都医科大学附属北京胸科医院 | Kit applied to environmental pollution carcinogenic risk assessment |
CN113881777B (en) * | 2021-11-12 | 2023-12-15 | 首都医科大学附属北京胸科医院 | Kit applied to environmental pollution and cancerogenic risk assessment |
CN114214409A (en) * | 2021-12-23 | 2022-03-22 | 深圳康华君泰生物科技有限公司 | Biomarker for esophageal cancer typing and application thereof |
CN114214409B (en) * | 2021-12-23 | 2024-03-12 | 深圳康华君泰生物科技有限公司 | Biomarker for esophageal carcinoma typing and application thereof |
CN115595370A (en) * | 2022-11-11 | 2023-01-13 | 常州国药医学检验实验室有限公司(Cn) | Gene transcript marker combination for non-small cell lung cancer typing diagnosis and typing diagnosis device |
CN117535402A (en) * | 2023-12-28 | 2024-02-09 | 湖南家辉生物技术有限公司 | Application of FRMPD4 gene mutant as detection target, detection reagent with FRMPD4 gene mutant and detection kit |
CN117535402B (en) * | 2023-12-28 | 2024-05-31 | 湖南家辉生物技术有限公司 | Application of FRMPD gene mutant as detection target, detection reagent with FRMPD gene mutant and detection kit |
Also Published As
Publication number | Publication date |
---|---|
CN113186287B (en) | 2023-03-24 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN113186287B (en) | Biomarker for non-small cell lung cancer typing and application thereof | |
US20210025011A1 (en) | Methylation markers and targeted methylation probe panel | |
US20150038376A1 (en) | Thyroid cancer biomarker | |
TWI814753B (en) | Models for targeted sequencing | |
CN111640508B (en) | Method and application of pan-tumor targeted drug sensitivity state assessment model constructed based on high-throughput sequencing data and clinical phenotypes | |
CN111863137B (en) | Complex disease state evaluation method based on high-throughput sequencing data and clinical phenotype construction and application | |
CN111863126B (en) | Method for constructing colorectal tumor state evaluation model and application | |
CN109680049A (en) | A kind of method and its application based on the dissociative DNA in blood high-flux sequence analysis affiliated individual physiological state of cfDNA | |
EP3658684B1 (en) | Enhancement of cancer screening using cell-free viral nucleic acids | |
CN108021788B (en) | Method and device for extracting biomarkers based on deep sequencing data of cell free DNA | |
CN111816315B (en) | Pancreatic duct cancer state assessment model construction method and application | |
Jung et al. | Advances in the assessment of minimal residual disease in mantle cell lymphoma | |
CN106282161A (en) | Special capture and repeat replication low frequency DNA base variation method and application | |
WO2021150990A1 (en) | Small rna disease classifiers | |
Bielo et al. | Variant allele frequency: a decision-making tool in precision oncology? | |
CN115472294B (en) | Model for predicting transformation speed of small cell transformation lung adenocarcinoma patient and construction method thereof | |
RU2795410C2 (en) | Biomarker panel and methods for detecting microsatellite instability in various types of cancer | |
RU2795410C9 (en) | Biomarker panel and methods for detecting microsatellite instability in various types of cancer | |
US20230416844A1 (en) | Methods To Analyze Methylomes In Tumor And Plasma Cell-Free DNA | |
Zhu et al. | Genome-wide Discovery of MicroRNA Biomarkers for Cancer Precision Medicine | |
Mellert et al. | Liquid biopsy and droplet digital PCR offer improvements for lung cancer testing. | |
CN114214409A (en) | Biomarker for esophageal cancer typing and application thereof | |
CN117831618A (en) | Method for analyzing motif based on three-generation sequencing data and biological sample grouping system | |
Al Mana et al. | Performance of a rapid digital PCR test for the detection of non-small cell lung cancer (NSCLC) variants tested at the University of Minnesota-ARDL |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||