CN118006776A - Early lung adenocarcinoma marker, screening method thereof, diagnosis model and diagnosis device - Google Patents


Publication number
CN118006776A
Authority
CN
China
Prior art keywords
lung adenocarcinoma
early lung
marker
early
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410114554.8A
Other languages
Chinese (zh)
Inventor
张靖
陈军歌
樊瑜波
孙婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN202410114554.8A
Publication of CN118006776A

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Wood Science & Technology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Public Health (AREA)
  • Molecular Biology (AREA)
  • Zoology (AREA)
  • Epidemiology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Hospice & Palliative Care (AREA)
  • Primary Health Care (AREA)
  • Oncology (AREA)
  • Microbiology (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention provides an early lung adenocarcinoma marker, a screening method thereof, a diagnosis model and a diagnosis device, and belongs to the technical fields of bioinformatics and biological detection. The early lung adenocarcinoma marker comprises at least one of FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2). The marker provided by the invention has high specificity for lung adenocarcinoma diagnosis. Diagnosis based on the marker costs less than LDCT and avoids the radiation exposure risk associated with LDCT, and, being minimally invasive, the marker can improve the effectiveness and reliability of early lung adenocarcinoma diagnosis.

Description

Early lung adenocarcinoma marker, screening method thereof, diagnosis model and diagnosis device
Technical Field
The invention relates to the technical fields of bioinformatics and biological detection, in particular to an early lung adenocarcinoma marker, a screening method thereof, a diagnosis model and a diagnosis device.
Background
Lung cancer is one of the most common cancers and one of the leading causes of cancer-related death worldwide. Lung adenocarcinoma (LUAD) is its most common histological type. Great progress has been made in treating lung adenocarcinoma, including surgery, radiation therapy, chemotherapy, targeted therapy and immunotherapy, yet the 5-year overall survival (OS) rate of lung adenocarcinoma remains about 18%. Because early lung adenocarcinoma has no obvious clinical manifestations, early screening is an effective way to improve survival.
With the widening use of high-resolution computed tomography (CT) and low-dose computed tomography (LDCT) screening, the detection rate of early lung cancer is increasing. However, the high false positive rate, radiation exposure and high cost of LDCT limit its application. Although several blood-based detection methods have been developed to aid the early detection of lung cancer, noninvasive and reliable early lung cancer detection methods and biomarkers are still lacking. It is therefore necessary to establish an effective, reliable and minimally invasive method for early LUAD diagnosis.
Disclosure of Invention
The problem solved by the invention is to provide an early lung adenocarcinoma marker, a screening method thereof, a diagnosis model and a diagnosis device, so that the effectiveness and reliability of early lung adenocarcinoma detection are improved by means of the early lung adenocarcinoma marker.
In a first aspect, the invention provides an early lung adenocarcinoma marker comprising at least one of FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2).
Preferably, the early lung adenocarcinoma markers include FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2).
Preferably, the early lung adenocarcinoma marker is PE (18:0/18:1).
The early lung adenocarcinoma marker provided by the invention has high specificity for lung adenocarcinoma diagnosis. Diagnosis based on the marker costs less than LDCT and avoids the radiation exposure risk associated with LDCT, and, being minimally invasive, the marker can improve the effectiveness and reliability of early lung adenocarcinoma diagnosis.
In a second aspect, the present invention provides an early lung adenocarcinoma diagnostic model based on the early lung adenocarcinoma markers as described above, comprising:
$$\mathrm{LSRscore} = \frac{\mathrm{FA}(20{:}0) + \mathrm{PE}(18{:}0/18{:}1)}{\mathrm{LPC}(18{:}1) + \mathrm{PC}(18{:}2/18{:}2)}$$
wherein LSRscore represents the model score, a higher model score indicating a higher probability of lung adenocarcinoma, and FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2) respectively represent the concentration of the corresponding marker in the sample to be tested.
An early lung adenocarcinoma diagnosis model is further derived from the early lung adenocarcinoma markers obtained by screening. By making full use of the detection results of several early lung adenocarcinoma markers, the model markedly improves the accuracy of early lung adenocarcinoma diagnosis and thereby its effectiveness and reliability.
In a third aspect, the present invention provides a screening method for an early lung adenocarcinoma marker, used to screen the early lung adenocarcinoma marker described above, comprising the following steps:
Step S1, collecting plasma samples from early lung adenocarcinoma patients and from healthy subjects, respectively;
Step S2, performing non-targeted lipidomic measurement on the plasma samples of the early lung adenocarcinoma patients and of the healthy subjects to obtain a first measurement result and a second measurement result, respectively;
Step S3, performing multivariate statistical analysis on the first measurement result and the second measurement result, and screening out potential markers that are significantly differentially expressed between them;
Step S4, further screening the potential markers with a big data analysis model to obtain the early lung adenocarcinoma marker.
Preferably, step S4 comprises:
further analyzing the potential markers with an XGBoost analysis model, a random forest analysis model and a Lasso regression analysis model to obtain a first analysis result, a second analysis result and a third analysis result, respectively;
and obtaining the early lung adenocarcinoma marker from the first analysis result, the second analysis result and the third analysis result.
Preferably, in step S4, the potential markers are further screened with the big data analysis model to obtain a marker screening result, and the marker screening result is verified by a plasma-targeted lipid quantitative verification method and a tissue-targeted lipid quantitative verification method to obtain the early lung adenocarcinoma marker.
Preferably, in step S2, the non-targeted lipidomic measurement of the plasma samples is performed by UHPLC-QE-MS.
In the screening method provided by the invention, plasma samples of early lung adenocarcinoma patients and of healthy subjects are collected, a first measurement result and a second measurement result are obtained by non-targeted lipidomic measurement, and the early lung adenocarcinoma marker is then obtained through multivariate statistical analysis and a big data analysis model. The method screens accurately and quickly, markedly improves the reliability of the marker screening result, and lays a foundation for developing diagnosis strategies based on early lung adenocarcinoma markers.
In a fourth aspect, the present invention provides an early lung adenocarcinoma diagnostic device, comprising:
a detection unit for detecting the concentration of an early lung adenocarcinoma marker in a sample to be tested, wherein the early lung adenocarcinoma marker comprises at least one of FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2);
a processing unit for determining, from the concentration of the early lung adenocarcinoma marker, the probability that the subject providing the sample to be tested has lung adenocarcinoma;
and a result output unit for outputting the determination result of the processing unit.
Preferably, the processing unit is configured to obtain a model score according to the early lung adenocarcinoma diagnosis model and to determine, from the model score, the probability that the subject providing the sample to be tested has lung adenocarcinoma.
The beneficial effects of the early lung adenocarcinoma diagnostic device over the prior art are the same as those of the early lung adenocarcinoma marker and are not repeated here.
Drawings
FIG. 1 is the OPLS-DA score plot in positive ion mode in Example 1 of the invention;
FIG. 2 is the OPLS-DA score plot in negative ion mode in Example 1 of the invention;
FIG. 3 shows the permutation test result of the OPLS-DA model in positive ion mode in Example 1 of the invention;
FIG. 4 shows the permutation test result of the OPLS-DA model in negative ion mode in Example 1 of the invention;
FIG. 5 is a volcano plot of the differential lipid metabolites between the lung adenocarcinoma and normal control populations in Example 1 of the invention;
FIG. 6 is a statistical chart of the differential lipid profiles of lung adenocarcinoma patients and the healthy control population in Example 1 of the invention;
FIG. 7 shows ROC curves of the early lung adenocarcinoma markers and the early lung adenocarcinoma diagnostic model for lung adenocarcinoma patients and the healthy control population in Example 1 of the invention;
FIG. 8 shows ROC curves of the early lung adenocarcinoma markers and the early lung adenocarcinoma diagnostic model, based on targeted lipidomics, for lung adenocarcinoma patients and the healthy control population in Example 2 of the invention;
FIG. 9 shows the ROC curve of PE (18:0/18:1) levels, based on targeted lipidomics, for lung adenocarcinoma patients and the healthy control population in Example 3 of the invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of embodiments of the invention will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings.
It should be noted that, where no conflict arises, features in the embodiments of the present invention may be combined with each other. The terms "comprising," "including," "containing," and "having" are intended to be non-limiting, as other steps and other ingredients not affecting the result may be added. The above terms encompass the terms "consisting of" and "consisting essentially of". Materials, equipment and reagents are commercially available unless otherwise specified.
Moreover, while the invention has been described with reference to specific embodiments, it should be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims. It should be understood that the different dependent claims and the features described herein may be combined in ways not otherwise described in the original claims. It is also to be understood that features described in connection with separate embodiments may be used in other described embodiments.
In a first aspect, embodiments of the present invention provide an early lung adenocarcinoma marker comprising at least one of FA (20:0), PE (18:0/18:1), LPC (18:1), and PC (18:2/18:2).
FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2) each denote a lipid metabolite: FA is a fatty acid, PE is phosphatidylethanolamine, LPC is lysophosphatidylcholine, and PC is phosphatidylcholine. In this notation, 18:0 denotes a fatty acyl chain with 18 carbon atoms and 0 double bonds, and the other terms are read in the same way. FA (20:0) and PE (18:0/18:1) are up-regulated in plasma samples from patients with early lung adenocarcinoma, whereas LPC (18:1) and PC (18:2/18:2) are down-regulated in those samples.
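The shorthand notation explained above can be illustrated with a minimal Python sketch that splits a marker name into its lipid class and its acyl chains (carbon count and number of double bonds). The names parse_lipid, AcylChain and LIPID_CLASSES are hypothetical and introduced only for this illustration; they are not part of the invention.

```python
import re
from typing import List, NamedTuple, Tuple


class AcylChain(NamedTuple):
    carbons: int        # number of carbon atoms in the fatty acyl chain
    double_bonds: int   # number of double bonds in the chain


# Lipid class abbreviations as defined in the description.
LIPID_CLASSES = {
    "FA": "fatty acid",
    "PE": "phosphatidylethanolamine",
    "LPC": "lysophosphatidylcholine",
    "PC": "phosphatidylcholine",
}

_NAME = re.compile(r"^\s*([A-Za-z]+)\s*\(([^)]*)\)\s*$")


def parse_lipid(name: str) -> Tuple[str, List[AcylChain]]:
    """Parse e.g. 'PE (18:0/18:1)' into ('PE', [AcylChain(18, 0), AcylChain(18, 1)])."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized lipid name: {name!r}")
    lipid_class, chain_text = match.groups()
    if lipid_class not in LIPID_CLASSES:
        raise ValueError(f"unknown lipid class: {lipid_class!r}")
    chains = []
    for part in chain_text.split("/"):
        pieces = part.strip().split(":")
        # Each chain must read 'carbons:double_bonds' with integer fields.
        if len(pieces) != 2 or not all(p.isdigit() for p in pieces):
            raise ValueError(f"malformed chain {part!r} in {name!r}")
        chains.append(AcylChain(int(pieces[0]), int(pieces[1])))
    return lipid_class, chains
```

For example, parse_lipid("LPC (18:1)") yields the class LPC with a single chain of 18 carbons and one double bond, while a malformed chain such as "18.2" is rejected.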
The early lung adenocarcinoma marker, obtained through screening, has high specificity for lung adenocarcinoma diagnosis. Diagnosis based on the marker costs less than LDCT and avoids the radiation exposure risk associated with LDCT, and, being minimally invasive, the marker improves the effectiveness and reliability of early lung adenocarcinoma diagnosis.
In one embodiment, the early lung adenocarcinoma markers include FA (20:0), PE (18:0/18:1), LPC (18:1), and PC (18:2/18:2).
That is, the early lung adenocarcinoma markers include FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2) together; evaluating several lipid markers jointly can further improve the reliability of applying the markers to early lung adenocarcinoma diagnosis.
In one embodiment, the early lung adenocarcinoma marker is PE (18:0/18:1).
That is, the early lung adenocarcinoma marker is PE (18:0/18:1) alone. PE (18:0/18:1) by itself has strong predictive capability for diagnosing early lung adenocarcinoma, so using it as the sole marker can further reduce diagnosis cost and improve efficiency while preserving the reliability of the diagnosis result.
In a second aspect, embodiments of the present invention provide an early lung adenocarcinoma diagnostic model based on the early lung adenocarcinoma markers as described above, comprising:
$$\mathrm{LSRscore} = \frac{\mathrm{FA}(20{:}0) + \mathrm{PE}(18{:}0/18{:}1)}{\mathrm{LPC}(18{:}1) + \mathrm{PC}(18{:}2/18{:}2)}$$
wherein LSRscore represents the model score, a higher model score indicating a higher probability of lung adenocarcinoma, and FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2) respectively represent the concentration of the corresponding marker in the sample to be tested.
That is, the early lung adenocarcinoma diagnosis model provided by the embodiment of the invention divides the sum of the two markers that are up-regulated in early lung adenocarcinoma patient samples, FA (20:0) and PE (18:0/18:1), by the sum of the two markers that are down-regulated in those samples, LPC (18:1) and PC (18:2/18:2). In this way the concentrations of lipid markers that differ significantly in early lung adenocarcinoma patient samples are fully utilized, which improves prediction reliability.
An early lung adenocarcinoma diagnosis model is further derived from the early lung adenocarcinoma markers obtained by screening. By making full use of the detection results of several early lung adenocarcinoma markers, the model markedly improves the accuracy of early lung adenocarcinoma diagnosis and thereby its effectiveness and reliability.
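The ratio described above (sum of the two up-regulated markers divided by the sum of the two down-regulated markers) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name lsr_score and the dictionary keys are hypothetical, all four concentrations are assumed to be expressed in the same unit, and the numbers in the usage note are invented for illustration rather than taken from the examples.

```python
from typing import Mapping

# Markers up-regulated and down-regulated in early lung adenocarcinoma plasma,
# as stated in the description. The key spellings are an assumption of this sketch.
UP_REGULATED = ("FA(20:0)", "PE(18:0/18:1)")
DOWN_REGULATED = ("LPC(18:1)", "PC(18:2/18:2)")


def lsr_score(concentrations: Mapping[str, float]) -> float:
    """Return (FA(20:0) + PE(18:0/18:1)) / (LPC(18:1) + PC(18:2/18:2))."""
    numerator = sum(concentrations[name] for name in UP_REGULATED)
    denominator = sum(concentrations[name] for name in DOWN_REGULATED)
    if denominator <= 0:
        # Guard against division by zero or non-physical concentrations.
        raise ValueError("sum of down-regulated marker concentrations must be positive")
    return numerator / denominator
```

With illustrative concentrations of 2.0 and 4.0 for the up-regulated markers and 1.0 and 2.0 for the down-regulated markers, the score is 6.0 / 3.0 = 2.0. Because the score is a ratio, it is unchanged when all four concentrations are scaled by the same factor.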
In a third aspect, embodiments of the present invention provide a screening method for an early lung adenocarcinoma marker, used to screen the early lung adenocarcinoma marker described above, comprising the following steps:
Step S1, collecting plasma samples from early lung adenocarcinoma patients and from healthy subjects, respectively;
Step S2, performing non-targeted lipidomic measurement on the plasma samples of the early lung adenocarcinoma patients and of the healthy subjects to obtain a first measurement result and a second measurement result, respectively;
Step S3, performing multivariate statistical analysis on the first measurement result and the second measurement result, and screening out potential markers that are significantly differentially expressed between them;
Step S4, further screening the potential markers with a big data analysis model to obtain the early lung adenocarcinoma marker.
Metabolites can reliably reflect the state of a biological system. Technically, metabolomics can collect, detect and analyze many types of small-molecule metabolites. As a subset of metabolomics, lipidomics is an effective method for studying cellular lipid metabolism and identifying lipid biomarkers of disease. Depending on the detection strategy, lipidomics may be targeted or non-targeted. Non-targeted lipidomics using data-dependent acquisition (DDA) covers metabolites broadly and is widely applied to finding candidate biomarkers. Targeted lipidomics based on the multiple reaction monitoring (MRM) mode has good sensitivity, accuracy and reliability, and is widely used to verify candidate biomarkers in independent sample populations. Combining non-targeted and targeted lipidomic assays greatly enhances the ability to reliably screen for disease-related lipids. Plasma lipidomics has shown predictive capability in the diagnosis of various diseases, such as cancer, type II diabetes, cardiovascular disease and systemic lupus erythematosus.
A tumor lesion is a complex ecosystem consisting of malignant cells, various immune cells and stromal cells. With breakthroughs in cell isolation and sequencing technology, single-cell RNA sequencing (scRNA-seq) enables unbiased genome-wide analysis of many cells at the single-cell level and decoding of the tumor ecosystem. Previous studies analyzed scRNA-seq data from early lung adenocarcinoma patients and healthy control samples and found significant enrichment of abnormal lipid metabolism in early lung adenocarcinoma tumor tissue, suggesting that studying lipid metabolism characteristics may provide better insight into diagnosis and treatment strategies for lung adenocarcinoma. Plasma lipidomic research to identify novel noninvasive lipid biomarkers of early lung adenocarcinoma with high sensitivity and specificity is therefore of great significance for diagnosing early lung adenocarcinoma.
In the screening method provided by the embodiment of the invention, plasma samples of early lung adenocarcinoma patients and of healthy subjects are collected, a first measurement result and a second measurement result are obtained by non-targeted lipidomic measurement, and the early lung adenocarcinoma marker is then obtained through multivariate statistical analysis and a big data analysis model. The method screens accurately and quickly, markedly improves the reliability of the marker screening result, and lays a foundation for developing diagnosis strategies based on early lung adenocarcinoma markers.
In step S1, peripheral blood is collected from early lung adenocarcinoma patients and healthy subjects to obtain plasma samples, so that the indexes of the different samples can be compared.
In step S2, the non-targeted lipidomic measurement of the plasma samples is performed by UHPLC-QE-MS.
The UHPLC-QE-MS non-targeted measurement method is highly accurate and fast, and can perform high-throughput qualitative and quantitative analysis of the lipids in a sample. It thereby markedly shortens the screening time for early lung adenocarcinoma markers, enhances the reliability of the result, and provides a more scientific method for lipid biomarker research.
In step S3, the multivariate statistical analysis of the first measurement result and the second measurement result is performed by OPLS-DA.
Metabolomics data are high-dimensional (many metabolite species are detected) with small sample sizes (few samples are measured), so the OPLS-DA statistical method is used. Orthogonal partial least squares discriminant analysis (OPLS-DA) is a regression modeling method relating multiple dependent variables to multiple independent variables. OPLS-DA filters out the orthogonal variables in the metabolite data that are unrelated to the class variable and analyzes the non-orthogonal variables separately, yielding more reliable information on how strongly differences in metabolite composition correlate with the experimental groups.
Step S4 comprises:
further analyzing the potential markers with an XGBoost analysis model, a random forest analysis model and a Lasso regression analysis model to obtain a first analysis result, a second analysis result and a third analysis result, respectively;
and obtaining the early lung adenocarcinoma marker from the first analysis result, the second analysis result and the third analysis result.
Specifically, the potential markers obtained by screening are used as input features. The samples are randomly divided into a training set (70% of all samples) and a test set (30% of all samples), and this random sampling is repeated 2000 times. In each iteration, the XGBoost and random forest analysis models are each used to classify early lung adenocarcinoma samples against healthy samples and thereby identify lipid molecular markers; the Lasso regression analysis model uses all samples to infer lipid molecular markers from the potential markers. The non-targeted lipid data of the potential markers are analyzed separately in the positive and negative ion modes with the XGBoost, random forest and Lasso regression analysis models, and the lipid molecular markers inferred consistently by all three methods are determined to be the early lung adenocarcinoma markers.
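The resampling and consensus logic described above can be sketched in plain Python. This sketch deliberately leaves out the model fitting itself (XGBoost, random forest and Lasso are external libraries and their configuration is not given in the text); it only shows the repeated 70%/30% split and the step that keeps markers selected by all three methods. The function names, the seed handling and the marker names in the usage note are hypothetical.

```python
import random
from typing import Iterable, Iterator, List, Optional, Tuple


def random_split(n_samples: int, train_fraction: float = 0.7,
                 rng: Optional[random.Random] = None) -> Tuple[List[int], List[int]]:
    """Randomly split sample indices into a training set and a test set."""
    rng = rng or random.Random()
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def repeated_splits(n_samples: int, n_iterations: int = 2000,
                    train_fraction: float = 0.7,
                    seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield one 70/30 split per iteration; a classifier would be fitted on each."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    for _ in range(n_iterations):
        yield random_split(n_samples, train_fraction, rng)


def consensus_markers(*selections: Iterable[str]) -> List[str]:
    """Keep only the markers that every method selected."""
    sets = [set(selection) for selection in selections]
    if not sets:
        return []
    return sorted(set.intersection(*sets))
```

For instance, if three methods returned the hypothetical lists ["m1", "m2", "m3"], ["m2", "m3", "m4"] and ["m3", "m2"], consensus_markers would keep only "m2" and "m3". With the 90 samples of Example 1, each split places 63 samples in the training set and 27 in the test set.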
In the screening method for early lung adenocarcinoma markers provided by the embodiment of the invention, only the lipid markers for which an internal standard is available are ultimately retained, giving an early lung adenocarcinoma marker comprising at least one of FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2).
In some embodiments, in the step S4, the potential marker is further screened by using the big data analysis model, so as to obtain a marker screening result, and the marker screening result is effectively verified by using a plasma-targeted lipid quantitative verification method and a tissue-targeted lipid quantitative verification method, so as to obtain the early lung adenocarcinoma marker.
That is, in addition to the screening process, a plasma-targeted lipid quantitative verification method and a tissue-targeted lipid quantitative verification method are used to further ensure the effectiveness of screening for early lung adenocarcinoma markers.
In a fourth aspect, embodiments of the present invention provide an early lung adenocarcinoma diagnostic device, comprising:
a detection unit for detecting the concentration of an early lung adenocarcinoma marker in a sample to be tested, wherein the early lung adenocarcinoma marker comprises at least one of FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2);
a processing unit for determining, from the concentration of the early lung adenocarcinoma marker, the probability that the subject providing the sample to be tested has lung adenocarcinoma;
and a result output unit for outputting the determination result of the processing unit.
The beneficial effects of the early lung adenocarcinoma diagnostic device over the prior art are the same as those of the early lung adenocarcinoma marker and are not repeated here.
In some embodiments, the processing unit is configured to obtain a model score according to the early lung adenocarcinoma diagnosis model and to determine, from the model score, the probability that the subject providing the sample to be tested has lung adenocarcinoma.
That is, in these embodiments the device comprises: a detection unit for detecting the concentration of the early lung adenocarcinoma markers in a sample to be tested, wherein the early lung adenocarcinoma markers include all of FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2);
a processing unit for obtaining a model score according to the early lung adenocarcinoma diagnosis model LSRscore and determining, from the model score, the probability that the subject providing the sample to be tested has lung adenocarcinoma, a higher model score indicating a higher probability of lung adenocarcinoma;
and a result output unit for outputting the determination result of the processing unit.
The detection unit detects the four lung adenocarcinoma markers FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2) in the sample to be tested, and the processing unit then applies the early lung adenocarcinoma diagnosis model to produce a more reliable result.
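The three-unit structure described above can be sketched in Python as a small pipeline: detection, processing, result output. This is only an architectural illustration under stated assumptions. The class and function names, the measure callback standing in for the assay, and the cutoff parameter are all hypothetical; the text gives no decision threshold, so the caller must supply one, and the sketch reports a flag rather than a calibrated probability.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping, Tuple

MARKERS = ("FA(20:0)", "PE(18:0/18:1)", "LPC(18:1)", "PC(18:2/18:2)")


@dataclass
class DetectionUnit:
    # Hypothetical assay interface: maps a sample ID to marker concentrations.
    measure: Callable[[str], Mapping[str, float]]

    def detect(self, sample_id: str) -> Dict[str, float]:
        readings = self.measure(sample_id)
        missing = [m for m in MARKERS if m not in readings]
        if missing:
            raise ValueError(f"missing marker readings: {missing}")
        return {m: float(readings[m]) for m in MARKERS}


@dataclass
class ProcessingUnit:
    # Hypothetical cutoff; it would have to be calibrated on validation data.
    cutoff: float

    def judge(self, conc: Mapping[str, float]) -> Tuple[float, bool]:
        up = conc["FA(20:0)"] + conc["PE(18:0/18:1)"]
        down = conc["LPC(18:1)"] + conc["PC(18:2/18:2)"]
        if down <= 0:
            raise ValueError("sum of down-regulated markers must be positive")
        score = up / down  # model score: up-regulated sum over down-regulated sum
        return score, score > self.cutoff


class ResultOutputUnit:
    def report(self, sample_id: str, score: float, flagged: bool) -> str:
        label = "elevated" if flagged else "not elevated"
        return f"{sample_id}: LSRscore={score:.3f} ({label})"


def run_device(sample_id: str, detector: DetectionUnit,
               processor: ProcessingUnit, output: ResultOutputUnit) -> str:
    """Chain the three units: detect, score and judge, then report."""
    conc = detector.detect(sample_id)
    score, flagged = processor.judge(conc)
    return output.report(sample_id, score, flagged)
```

Keeping the assay behind a callback lets the same processing and output units be exercised with recorded or simulated readings before being connected to an instrument.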
The invention will be further illustrated with reference to specific examples. It is to be understood that these examples are illustrative of the present invention and are not intended to limit the scope of the present invention. Experimental methods for which specific conditions are not stated in the following examples generally follow the conditions recommended by the manufacturer.
Example 1: Screening of early lung adenocarcinoma markers
1.1, Collecting plasma samples:
Plasma samples from 60 patients with early lung adenocarcinoma at Peking University People's Hospital were retrospectively collected according to the following criteria: pathologically confirmed LUAD; no history of other cancers; age over 18 years; no chronic hematological disease (such as hemolytic disease); and no prior anticancer treatment such as chemotherapy, radiotherapy, targeted therapy or immunotherapy. In addition, plasma samples from 30 healthy subjects were collected;
1.2, UHPLC-QE-MS non-targeted lipidomic measurement of plasma samples:
Plasma metabolite extraction: take 100 μL of sample and add 480 μL of extraction solution (methyl tert-butyl ether:methanol = 5:1, containing internal standard); vortex for 30 seconds, then sonicate in an ice-water bath for 10 minutes. Let the sample stand at -40 ℃ for 1 hour. Centrifuge at 3000 rpm for 15 min at 4 ℃, transfer 350 μL of supernatant to an EP tube and dry in a vacuum concentrator at 37 ℃. Add 100 μL of solution (DCM:MeOH = 1:1) for reconstitution, vortex for 30 s and sonicate for 10 min in an ice-water bath. Then centrifuge at 13000 rpm for 15 minutes at 4 ℃ and transfer 80 μL of the supernatant to a sample vial for LC/MS analysis. Pool 10 μL of supernatant from every sample to form a QC sample, which is analyzed on the instrument;
And (3) detecting: the target compound was chromatographed using an Agilent 1290 (Agilent Technologies) ultra performance liquid chromatograph, on a Phenomen Kinetex C (2.1 x 100mm,1.7 μm) liquid chromatography column. The phase A of liquid chromatography is 40% water and 60% acetonitrile solution, which contains 10mmol/L ammonium formate; phase B was 10% acetonitrile, 90% isopropyl alcohol solution, and 50mL of 10mmol/L ammonium formate aqueous solution was added per 1000 mL. Gradient elution is adopted: 0 to 1.0min,40 percent of B;1.0 to 12.0min,40 to 100 percent of B;12.0 to 13.5min,100 percent of B;13.5 to 13.7min,100 to 40 percent of B;13.7 to 18.0min,40 percent of B. Mobile phase flow rate: 0.3mL/min, column temperature: 55 ℃, sample tray temperature: 4 ℃, sample injection volume: 2 mu L of positive ions; negative ions 4. Mu.L. Thermo Q Exactive Orbitrap the mass spectrometer was capable of primary and secondary mass spectrometry data acquisition under control of control software (Xcalibur, version 4.0.27, thermo). The detailed parameters are as follows: SHEATH GAS flow rate 30Arb,Aux gas flow rate:10Arb,Capillary temperature:320 ℃ (positive) or 300℃(negative),Full ms resolution:70000,MS/MS resolution:17500,Collision energy:15/30/45in NCE mode,Spray Voltage:5kV(positive) or-4.5 kV (negative);
And (3) data processing: the original data file was converted to a mzXML format file using the ProteoWizard "msconvert" program. Then, XCMS was used for the work of retention time correction, peak identification, peak extraction, peak integration, peak alignment, etc., minfrac was set to 0.5, and cutoff was set to 0.6. Lipid identification was achieved by spectral matching using a lipid blast database;
1.3, Data preprocessing:
Lipids that were assigned a substance name by secondary mass spectrum (MS/MS) matching were retained; lipids with an MS/MS matching score of not less than 0.6 were selected separately from the positive ion mode and the negative ion mode, and the lipid levels were standardized against the QC sample;
1.4, Screening for LUAD lipid markers:
OPLS-DA modeling analysis was performed on the two groups of samples (LUAD and HC), and model validity was evaluated by R2Y (the model's ability to explain the classification variable Y) and Q2 (the model's predictive ability). The parameters of the OPLS-DA models are shown in Table 1 and the analysis results in Figs. 1-2. Fig. 1 shows the analysis plot for the positive ion mode, in which grey dots represent HC (healthy samples) and black dots represent LUAD (early lung adenocarcinoma patient samples); Fig. 2 shows the analysis plot for the negative ion mode, with HC and LUAD having the same meaning. In Fig. 1, the abscissa reflects the difference between the two groups and the ordinate reflects the variation within each group. The results show that LUAD and healthy samples are clearly separated. Variables with a variable importance in the projection (VIP) greater than 1.0 in the OPLS-DA model are considered to differ significantly between classes. Finally, a permutation test was performed: the order of the classification variable Y was randomly permuted multiple times to obtain random Q2 values, further testing the validity of the model; the results are shown in Figs. 3-4. After 200 permutations, the Q2 intercepts were -0.57 and -0.491 and the R2 intercepts were 0.69 and 0.534 in the positive and negative ion modes respectively, indicating that the OPLS-DA model is reliable and not overfitted. HC denotes healthy control, i.e., a healthy sample;
Table 1 OPLS-DA analysis model parameters
Mode                 R2Y     Q2
Positive ion mode    0.96    0.88
Negative ion mode    0.91    0.82
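The permutation check described in section 1.4 can be sketched as follows. This is a minimal illustration and not the OPLS-DA software used in the example: `score_fn` is a hypothetical callable standing in for whatever computes Q2 (or R2Y) for a given label vector, and the intercept is assumed to come from an ordinary least-squares line through the points (absolute correlation between permuted and original labels, score), following the common permutation-plot convention.

```python
import random


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def permutation_intercept(labels, score_fn, n_perm=200, seed=0):
    """Shuffle class labels n_perm times and fit score = slope * |r| + intercept.

    labels   : list of 0/1 class labels (both classes must be present)
    score_fn : hypothetical callable returning Q2 or R2Y for a label vector
    """
    rng = random.Random(seed)
    xs = [1.0]                    # the original labels correlate perfectly with themselves
    ys = [score_fn(labels)]
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        xs.append(abs(pearson(labels, shuffled)))
        ys.append(score_fn(shuffled))
    # Ordinary least-squares straight line through the collected points.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx
```

An intercept well below the score of the unpermuted model (for example a negative Q2 intercept, as reported above) is what is usually read as the absence of over-fitting.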
Screening of potential markers: lipids with VIP greater than 1 were retained for subsequent analysis. The non-targeted lipid data were subjected to the Mann-Whitney U test with FDR correction for multiple comparisons, and the fold change (FoldChange) in lipid levels between lung adenocarcinoma samples and healthy controls was calculated to determine the differences in lipid levels between the two groups. Differential lipids were defined as those with VIP > 1, FDR-adjusted p < 0.05, and FoldChange > 1.5 or < 0.67; they include fatty acids (FA), glycerolipids (GL), glycerophospholipids (GP), sphingolipids (SP) and sterols (ST), and their volcano plot is shown in Fig. 5. The specific statistical distribution is shown in Fig. 6: 50 up-regulated and 48 down-regulated lipids in positive ion mode, and 39 up-regulated and 12 down-regulated lipids in negative ion mode;
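The thresholds in this screening step can be illustrated with a short sketch. It assumes that raw p-values (for example from a Mann-Whitney U test) and VIP values have already been computed elsewhere, that the FDR correction is Benjamini-Hochberg (the text only says "FDR"), and that the fold change is the ratio of group means; the function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def screen_lipids(names, vip, pvalues, mean_luad, mean_hc):
    """Split lipids into up- and down-regulated lists using the stated cut-offs."""
    padj = benjamini_hochberg(pvalues)
    up, down = [], []
    for name, v, q, luad, hc in zip(names, vip, padj, mean_luad, mean_hc):
        fold_change = luad / hc          # assumed: ratio of group means
        if v > 1.0 and q < 0.05:
            if fold_change > 1.5:
                up.append(name)
            elif fold_change < 0.67:
                down.append(name)
    return up, down
```

Lipids that pass the VIP and FDR cut-offs but whose fold change lies between 0.67 and 1.5 fall into neither list, which matches the definition of differential lipids given above.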
Screening of early lung adenocarcinoma markers: the potential markers obtained by screening were used as input features. The samples were randomly split into a training set (70% of all samples) and a test set (30% of all samples), and this random sampling was repeated 2000 times. In each iteration, a random forest analysis model and an XGBoost analysis model were each used to distinguish early lung adenocarcinoma samples from healthy samples and thereby infer lipid molecular markers. Lasso regression on all samples was also used to infer lipid molecular markers from the potential markers. The non-targeted lipid data of the differential lipids in the positive and negative ion modes were analyzed separately, and the lipid molecular markers consistently inferred by all three methods (the random forest, XGBoost and Lasso regression analysis models) were taken as the final early lung adenocarcinoma markers;
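The resampling and consensus logic of this step can be sketched without committing to a particular machine-learning library. In the sketch below, `select_fn` is a hypothetical callable that trains a model (random forest, XGBoost, or similar) on the given training indices and returns the features it considers important. The `min_freq` rule for aggregating the 2000 iterations is an assumption, since the text does not state how the per-iteration results were combined.

```python
import random
from collections import Counter


def random_split(n_samples, train_frac, rng):
    """Return (train_indices, test_indices) for one random 70/30-style split."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    cut = int(round(train_frac * n_samples))
    return indices[:cut], indices[cut:]


def stable_features(select_fn, n_samples, n_iter=2000, train_frac=0.7,
                    min_freq=0.5, seed=0):
    """Features returned by select_fn in at least min_freq of the random splits.

    select_fn and min_freq are illustrative assumptions, not part of the source.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_iter):
        train_idx, _test_idx = random_split(n_samples, train_frac, rng)
        counts.update(set(select_fn(train_idx)))
    return {feature for feature, c in counts.items() if c / n_iter >= min_freq}


def consensus_markers(*feature_sets):
    """Markers inferred consistently by every method (set intersection)."""
    return set.intersection(*map(set, feature_sets))
```

With 90 samples (60 LUAD and 30 healthy), each split holds 63 training and 27 test samples; the final markers are then the intersection of the random forest, XGBoost and Lasso selections.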
According to the screening method of the early lung adenocarcinoma markers provided by the embodiment of the invention, only lipid markers for which an internal standard was available were finally retained, giving four early lung adenocarcinoma markers, FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2), as shown in Fig. 7;
Construction of an early lung adenocarcinoma diagnosis model: ROC analysis was performed on all samples in the non-targeted lipid data to confirm the reliability of each of the four characteristic lipids in predicting early lung adenocarcinoma; the results are shown in Fig. 7. Considering that blood lipid ratios show good predictive ability for various diseases, a scoring model, LSRscore, based on the early lung adenocarcinoma markers was constructed: each sample is scored by dividing the sum of the concentrations of the two up-regulated markers, FA (20:0) and PE (18:0/18:1), by the sum of the concentrations of the two down-regulated markers, LPC (18:1) and PC (18:2/18:2). As shown in Fig. 7, the LSRscore model reached an AUC of 0.972 for predicting early lung adenocarcinoma, indicating good performance.
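The LSRscore ratio and its ROC evaluation can be sketched as below. The score follows the definition in the text; the AUC is computed with the rank-based (pairwise comparison) formulation, which is an illustrative choice rather than the software used in the example, and the function and parameter names are hypothetical.

```python
def lsr_score(fa_20_0, pe_18_0_18_1, lpc_18_1, pc_18_2_18_2):
    """Sum of the two up-regulated markers divided by the sum of the two down-regulated ones."""
    return (fa_20_0 + pe_18_0_18_1) / (lpc_18_1 + pc_18_2_18_2)


def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a positive sample outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

Because a higher LSRscore is meant to indicate a higher probability of lung adenocarcinoma, LUAD samples are passed as `scores_pos` and healthy samples as `scores_neg`.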
Example 2 Verification of early lung adenocarcinoma markers by plasma targeted lipid detection
2.1, Collection of plasma samples:
The validation cohort included plasma samples from 30 LUAD patients and 30 healthy subjects collected at Peking University People's Hospital.
Age of the lung adenocarcinoma patients: 62.23 ± 9.48 [mean ± SD]; the group included 15 females and 15 males, of whom 26 had stage I LUAD, 3 stage II and 1 stage III. Age of the healthy subjects: 51.57 ± 4.52; the group included 10 females and 20 males.
2.2, Full quantitative detection of the targeted lipidome:
Plasma sample metabolite extraction: 10 μL of plasma sample was mixed with 190 μL of water, and 480 μL of extraction solution (MTBE : MeOH = 5:1) was added. After vortexing for 60 seconds, the sample was sonicated in an ice-water bath for 10 min. The sample was then centrifuged at 3000 rpm for 15 minutes at 4 ℃ and 250 μL of supernatant was taken. A further 250 μL of MTBE was added, the sample was vortexed, sonicated and centrifuged, and 250 μL of supernatant was taken; this step was repeated once more. The three supernatants were combined and dried under vacuum at 37 ℃, and the dried metabolites were reconstituted by adding 100 μL of solution (DCM : MeOH : H2O = 60:30:4.5), vortexing for 30 s and sonicating in an ice-water bath for 10 min. The sample was then centrifuged at 12000 rpm for 15 minutes at 4 ℃, and 30 μL of the supernatant was transferred to a sample vial for LC/MS analysis. 10 μL of supernatant from each sample was pooled to form a QC sample, which was also analyzed on the instrument;
Detection: the target compounds were separated on a liquid chromatography column using a SCIEX ExionLC ultra-high performance liquid chromatograph. Mobile phase A was 40% water and 60% acetonitrile containing 10 mmol/L ammonium acetate; mobile phase B was 10% acetonitrile and 90% isopropanol containing 10 mmol/L ammonium acetate. Mobile phase flow rate: 0.3 mL/min; column temperature: 40 ℃; sample tray temperature: 6 ℃; injection volume: 2 μL. The liquid chromatography mobile phase conditions are shown in Table 2:
Table 2 Liquid chromatography mobile phase conditions
Time (min)    Flow rate (μL/min)    A%    B%
0             300                   80    20
1             300                   80    20
4             300                   40    60
15            300                   2     98
16            300                   2     98
16.01         300                   80    20
18            300                   80    20
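The gradient program in Table 2 can be read as a piecewise function of time. The sketch below returns the percentage of mobile phase B at a given time, assuming linear ramps between consecutive table rows (the table itself only lists the breakpoints); the names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, %B) breakpoints copied from Table 2
GRADIENT = [
    (0.0, 20.0),
    (1.0, 20.0),
    (4.0, 60.0),
    (15.0, 98.0),
    (16.0, 98.0),
    (16.01, 20.0),
    (18.0, 20.0),
]


def percent_b(t_min, program=GRADIENT):
    """Percentage of mobile phase B at time t_min, assuming linear ramps between rows."""
    if not program[0][0] <= t_min <= program[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_a(t_min, program=GRADIENT):
    """Percentage of mobile phase A; A% and B% sum to 100 in Table 2."""
    return 100.0 - percent_b(t_min, program)
```

For example, halfway through the 1 to 4 min ramp (t = 2.5 min) the sketch gives 40% B, and the column is back at the starting 20% B from 16.01 min onward for re-equilibration.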
During data acquisition, mass spectrometry was performed in multiple reaction monitoring (MRM) mode. The ion source parameters were as follows: IonSpray Voltage: +5500/-4500 V; Curtain Gas: 40 psi; Temperature: 350 ℃; Ion Source Gas 1: 50 psi; Ion Source Gas 2: 50 psi; DP: ±80 V;
Data processing: Skyline 20.1 software was used to quantify the target compounds. The absolute content of each lipid was calculated relative to an internal standard (IS) of the same lipid class, based on the relationship between the peak area and the actual concentration of the IS; where a lipid had several applicable IS, the results obtained with each IS were averaged to give the absolute content;
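The internal-standard quantification described above can be sketched as a single-point calibration. The sketch assumes that the analyte and its same-class internal standard respond equally (response factor of 1), which the text does not state; the function name and input layout are hypothetical.

```python
def quantify_with_is(analyte_area, internal_standards):
    """Estimate absolute lipid content from one or more same-class internal standards.

    analyte_area       : peak area of the target lipid
    internal_standards : list of (is_peak_area, is_concentration) tuples
    Assumes equal response of analyte and IS (response factor of 1).
    """
    estimates = [
        analyte_area / is_area * is_conc
        for is_area, is_conc in internal_standards
    ]
    # Average the per-IS estimates, as described for lipids with several IS.
    return sum(estimates) / len(estimates)
```

The returned value carries the same concentration unit as the internal standard concentrations supplied.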
2.3, Validation of early lung adenocarcinoma markers with plasma targeted data:
The levels of the 4 lipid molecular markers obtained from plasma non-targeted lipidomics were visualized, the Mann-Whitney U test was used to compare the biomarker levels in peripheral blood between the two groups of samples, and the LSRscore was calculated for each sample in the validation cohort. As shown in Fig. 8, the AUC for the independent plasma validation set was 0.92, and the AUCs of the lipid molecular markers PE (18:0/18:1) and PC (18:2/18:2) were both not less than 0.8.
Example 3 Verification of early lung adenocarcinoma markers by tissue targeted lipid detection
3.1, Sample collection:
In the tissue validation cohort, 25 LUAD patients were enrolled at Peking University People's Hospital, and in situ tumor tissue and adjacent normal lung tissue were obtained during surgery. Pathological diagnosis of the tumor specimens was performed by two pathologists. Molecular pathology testing was performed at the molecular pathology detection center of Peking University People's Hospital. Age of the lung adenocarcinoma patients: 62.72 ± 10.21 [mean ± SD]; the cohort included 11 females and 14 males, of whom 21 had stage I LUAD, 3 stage II and 1 stage III.
3.2, Full quantitative detection of the targeted lipidome:
Tissue sample metabolite extraction: 10 mg of tissue sample was weighed on dry ice into an EP tube, 400 μL of water was added, and the sample was vortexed for 60 s, ground at 45 Hz for 4 minutes and sonicated in an ice-water bath for 5 minutes; this procedure was repeated 3 times. 10 μL of the homogenate was mixed with 190 μL of water, and 480 μL of extraction solution containing the internal standard was added. After vortexing for 60 seconds, the sample was sonicated in an ice-water bath for 10 min. The sample was then centrifuged at 3000 rpm for 15 minutes at 4 ℃ and 250 μL of supernatant was taken. 250 μL of MTBE was added to the remaining sample, which was vortexed, sonicated and centrifuged, and 250 μL of supernatant was taken; this step was repeated once more. The three supernatants were combined and dried under vacuum at 37 ℃. The dried metabolites were reconstituted in 200 μL of solution (DCM : MeOH : H2O = 60:30:4.5), vortexed for 30 s and sonicated in an ice-water bath for 10 min. The sample was then centrifuged at 12000 rpm for 15 minutes at 4 ℃, and 40 μL of the supernatant was transferred to a sample vial for LC/MS analysis. 10 μL of supernatant from each sample was pooled to form a QC sample, which was also analyzed on the instrument;
Detection: the target compounds were separated on a liquid chromatography column using a SCIEX ExionLC ultra-high performance liquid chromatograph. Mobile phase A was 40% water and 60% acetonitrile containing 10 mmol/L ammonium acetate; mobile phase B was 10% acetonitrile and 90% isopropanol containing 10 mmol/L ammonium acetate. Mobile phase flow rate: 0.3 mL/min; column temperature: 40 ℃; sample tray temperature: 6 ℃; injection volume: 2 μL. The liquid chromatography mobile phase conditions are shown in Table 2. During data acquisition, mass spectrometry was performed in multiple reaction monitoring (MRM) mode. The ion source parameters were as follows: IonSpray Voltage: +5500/-4500 V; Curtain Gas: 40 psi; Temperature: 350 ℃; Ion Source Gas 1: 50 psi; Ion Source Gas 2: 50 psi; DP: ±80 V;
Data processing: Skyline 20.1 software was used to quantify the target compounds. The absolute content of each lipid was calculated relative to an internal standard (IS) of the same lipid class, based on the relationship between the peak area and the actual concentration of the IS; where a lipid had several applicable IS, the results obtained with each IS were averaged to give the absolute content;
3.3, Validation of early lung adenocarcinoma markers with tissue targeted data:
An LC-MS based targeted lipidomic assay measured the concentrations of PE (18:0/18:1) and PC (18:2/18:2) in tumor tissue and adjacent lung tissue from 25 LUAD patients. Both PE (18:0/18:1) and PC (18:2/18:2) were detected in these surgically excised tissues, as shown in Fig. 9. PE (18:0/18:1) was significantly up-regulated in LUAD tumor tissue compared with adjacent (paracancerous) lung tissue. ROC analysis showed that PE (18:0/18:1) performed well, with an AUC of 0.845. These results confirm that PE (18:0/18:1) changes consistently in LUAD tissue. Multiple independent lines of evidence from plasma and in situ tissue suggest that PE (18:0/18:1) alone has strong predictive power for diagnosing early LUAD (AUC = 0.964 in plasma non-targeted lipidomics; AUC = 0.80 in plasma targeted lipidomics; AUC = 0.845 in targeted lipidomics of LUAD in situ tissue).
In conclusion, the early lung adenocarcinoma marker and the early lung adenocarcinoma diagnosis model LSRscore provided by the invention can distinguish clinical early lung adenocarcinoma from a healthy sample with ultrahigh sensitivity and specificity.
Although the invention is disclosed above, the scope of the invention is not limited thereto. Various changes and modifications may be made by one skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications will fall within the scope of the invention.

Claims (10)

1. An early lung adenocarcinoma marker comprising at least one of FA (20:0), PE (18:0/18:1), LPC (18:1), and PC (18:2/18:2).
2. The marker of early lung adenocarcinoma according to claim 1, comprising FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2).
3. The marker of early lung adenocarcinoma according to claim 1, characterized in that it is PE (18:0/18:1).
4. An early lung adenocarcinoma diagnostic model based on the early lung adenocarcinoma marker of any one of claims 1-3, characterized in that the early lung adenocarcinoma diagnostic model comprises:
LSRscore = [FA (20:0) + PE (18:0/18:1)] / [LPC (18:1) + PC (18:2/18:2)]
wherein LSRscore represents a model score, a higher model score indicating a higher probability of lung adenocarcinoma, and FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2) respectively represent the concentration of the corresponding marker in the sample to be tested.
5. A method of screening for an early lung adenocarcinoma marker according to any one of claims 1-3, comprising the steps of:
Step S1, respectively collecting plasma samples of early lung adenocarcinoma patients and healthy samples;
Step S2, performing non-targeted lipidomic measurement on the plasma samples of the early lung adenocarcinoma patients and on the healthy samples to obtain a first measurement result and a second measurement result, respectively;
Step S3, performing multivariate statistical analysis on the first measurement result and the second measurement result, and screening to obtain potential markers that are significantly differentially expressed between the first measurement result and the second measurement result;
Step S4, further screening the potential markers with a big data analysis model to obtain the early lung adenocarcinoma marker.
6. The method of screening for early lung adenocarcinoma markers according to claim 5, wherein the step S4 comprises:
Further analyzing the potential marker by adopting an XGBoost analysis model, a random forest analysis model and a Lasso regression analysis model to obtain a first analysis result, a second analysis result and a third analysis result, respectively;
And obtaining the early lung adenocarcinoma marker according to the first analysis result, the second analysis result and the third analysis result.
7. The method according to claim 5, wherein in step S4, the potential markers are further screened by using the big data analysis model to obtain marker screening results, and the marker screening results are effectively verified by using a plasma-targeted lipid quantitative verification method and a tissue-targeted lipid quantitative verification method to obtain the early lung adenocarcinoma markers.
8. The method according to claim 5, wherein in step S2, non-target lipidomic measurement is performed on the plasma sample using UHPLC-QE-MS.
9. An early lung adenocarcinoma diagnostic device comprising:
a detection unit for detecting the concentration of an early lung adenocarcinoma marker in a sample to be tested, wherein the early lung adenocarcinoma marker comprises at least one of FA (20:0), PE (18:0/18:1), LPC (18:1) and PC (18:2/18:2);
a processing unit for determining the probability that the sample to be tested has lung adenocarcinoma according to the concentration of the early lung adenocarcinoma marker;
and a result output unit for outputting the determination result of the processing unit.
10. The apparatus according to claim 9, wherein the processing unit is configured to obtain a model score according to an early lung adenocarcinoma diagnosis model, and determine the probability of the sample to be tested to suffer from lung adenocarcinoma according to the model score.
CN202410114554.8A 2024-01-26 2024-01-26 Early lung adenocarcinoma marker, screening method thereof, diagnosis model and diagnosis device Pending CN118006776A (en)


Publications (1)

Publication Number Publication Date
CN118006776A true CN118006776A (en) 2024-05-10

Family

ID=90951697



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination