CN115820860A - Method for screening non-small cell lung cancer marker based on methylation difference of enhancer, marker and application thereof - Google Patents

Info

Publication number
CN115820860A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211471119.8A
Other languages
Chinese (zh)
Inventor
李国亮
朱志贤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong Agricultural University
Original Assignee
Huazhong Agricultural University

Abstract

The invention discloses a method for screening non-small cell lung cancer markers based on enhancer methylation differences, together with the markers and their application. Aberrant DNA methylation in enhancer regions of the human genome leads to aberrant regulation of gene expression in non-small cell lung cancer. Building on this principle, the screening method incorporates chromatin interaction analysis with paired-end tag sequencing (ChIA-PET) to identify DNA methylation markers in genomic enhancer regions that cause abnormal regulation of gene expression in non-small cell lung cancer. The screening method is accurate, rapid and high-throughput. The DNA methylation markers are cg00787780, cg16434331, cg21862081 and cg24327132. The markers show good diagnostic performance for non-small cell lung cancer, with high accuracy, sensitivity and specificity, and can help to improve the detection rate of non-small cell lung cancer.

Description

Method for screening non-small cell lung cancer markers based on methylation difference of enhancer, markers and application thereof
Technical Field
The invention relates to the field of biomedicine, in particular to a method for screening a non-small cell lung cancer marker based on enhancer methylation difference, and a marker and application thereof.
Background
Lung cancer is one of the most common cancers worldwide and the leading cause of cancer death. Non-small cell lung cancer (NSCLC) is the predominant histological type, accounting for approximately 85% of newly diagnosed lung cancers. Because most patients already have advanced or metastatic disease when they are diagnosed, the prognosis of lung cancer is often poor. The 5-year survival rate of patients diagnosed with metastatic lung cancer is less than 10%. By contrast, patients diagnosed at an early stage, when surgery is still possible, have a much better prognosis, with a 5-year survival rate of up to 80%. Low-dose helical computed tomography (CT) screening has been shown to improve the early diagnosis rate of lung cancer, thereby helping to reduce mortality.
Although low-dose CT screening is very sensitive, its low specificity produces many false positives, which may lead to further follow-up or invasive procedures. More accurate novel biomarkers are therefore needed for clinical lung cancer diagnosis.
Aberrant DNA methylation is a common event in various cancers, which suggests that DNA methylation can serve as a biomarker for cancer diagnosis. Compared with other biomarkers such as proteins and gene mutations, changes in DNA methylation are very stable and occur at an early stage of cancer.
The mainstream lung cancer DNA methylation markers are concentrated in the promoters of tumor suppressor genes or oncogenes, or in aberrantly methylated regions inside genes. Recent studies have shown that aberrant DNA methylation in the distal regulatory regions of genes, particularly in enhancer regions, can also contribute to the development and progression of cancer.
At present, no screening method for non-small cell lung cancer markers based on enhancer methylation difference has been reported.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a method for screening a non-small cell lung cancer marker based on enhancer methylation difference, and the marker and application thereof.
In order to achieve the purpose, the invention designs a method for screening a non-small cell lung cancer marker based on enhancer methylation difference, which comprises the following steps:
S1, obtaining whole-genome DNA methylation data of non-small cell lung cancer samples and of normal lung tissue samples by whole-genome bisulfite sequencing (WGBS);
S2, carrying out differential DNA methylation analysis on the whole-genome DNA methylation data of the non-small cell lung cancer samples and of the normal lung tissue samples with BatMeth2 software to obtain genome position information of the differential DNA methylation regions (DMRs);
S3, obtaining genome position information of the enhancer regions of lung tissue, or of a non-small cell lung cancer related cell line (such as A549), from the EnhancerAtlas 2.0 database and, by combining it with the genome position information of the DMRs obtained in step S2, retaining the DMRs that intersect the genome position of an enhancer region to obtain the enhancer differential DNA methylation regions (eDMRs);
S4, acquiring RNAPOL2 ChIA-PET data of lung tissue, or of a non-small cell lung cancer related cell line (such as A549), by chromatin interaction analysis with paired-end tag sequencing (ChIA-PET); obtaining the whole-genome RNAPOL2 three-dimensional chromatin interaction information of the lung tissue or cell line with ChIA-PET Tool V3 software; and, in combination with the eDMRs obtained in step S3, obtaining whole-genome eDMR-gene promoter interaction pairs in which one end is located in an eDMR and the other end in a gene promoter;
S5, screening the eDMR-gene promoter interaction pair information obtained in step S4 to obtain the eDMRs that have chromatin interactions with gene promoters;
S6, acquiring DNA methylation data of non-small cell lung cancer samples and of normal lung tissue samples with the Illumina Infinium HumanMethylation450 BeadChip (450K methylation chip detection technology);
S7, obtaining differential DNA methylation sites (DMCs) between the non-small cell lung cancer samples and the normal lung tissue samples by a statistical method, based on the DNA methylation data collected in S6; and, based on the eDMRs obtained by screening in step S5, selecting the DMCs whose genome positions are located within an eDMR as candidate markers;
S8, using the Lasso method, taking the methylation degree of the candidate markers obtained in S7 as independent variables and the sample type (non-small cell lung cancer or normal lung tissue) in the DNA methylation data collected in S6 as the dependent variable, establishing a linear regression model on the data collected in S6, and retaining the differential DNA methylation sites whose coefficients in the model are not equal to 0 as the final markers;
S9, acquiring a DNA methylation data set of non-small cell lung cancer samples and normal lung tissue samples (a data set other than that of S6) with the 450K methylation chip detection technology, or downloading such a data set (other than that of S6) from a public database (such as the National Genomics Data Center, NGDC); constructing a logistic regression model with the acquired data set,
and evaluating with the logistic regression model the effectiveness of the markers obtained in step S8 for diagnosing non-small cell lung cancer.
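Steps S3 to S5 amount to interval filtering. The following is a minimal sketch of that logic, not the inventors' implementation: the `Region` type, the function names and all coordinates are hypothetical, and regions are assumed to be half-open intervals on one reference genome.

```python
from typing import List, NamedTuple, Tuple


class Region(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # exclusive


Loop = Tuple[Region, Region]  # the two anchors of one chromatin interaction


def overlaps(a: Region, b: Region) -> bool:
    """True when two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def enhancer_dmrs(dmrs: List[Region], enhancers: List[Region]) -> List[Region]:
    """S3: keep the DMRs that intersect at least one enhancer region (eDMRs)."""
    return [d for d in dmrs if any(overlaps(d, e) for e in enhancers)]


def promoter_linked_edmrs(edmrs: List[Region], loops: List[Loop],
                          promoters: List[Region]) -> List[Region]:
    """S4/S5: keep eDMRs with a loop whose other anchor lies on a promoter."""
    def on_promoter(anchor: Region) -> bool:
        return any(overlaps(anchor, p) for p in promoters)

    linked = []
    for d in edmrs:
        for left, right in loops:
            if (overlaps(d, left) and on_promoter(right)) or \
               (overlaps(d, right) and on_promoter(left)):
                linked.append(d)
                break  # one supporting interaction is enough
    return linked


# Hypothetical toy coordinates, for illustration only.
dmrs = [Region("chr1", 100, 200), Region("chr1", 5000, 5100), Region("chr2", 100, 200)]
enhancers = [Region("chr1", 150, 400), Region("chr2", 150, 300)]
promoters = [Region("chr1", 9000, 9500)]
loops = [(Region("chr1", 120, 180), Region("chr1", 9100, 9200))]

edmrs = enhancer_dmrs(dmrs, enhancers)
linked = promoter_linked_edmrs(edmrs, loops, promoters)
print(edmrs)
print(linked)
```

The nested loops keep the sketch short; for genome-scale inputs a sorted or indexed interval structure would be the more practical choice.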
Further, in step S2, the screening criteria for genomic position information of the differential DNA Methylation Region (DMR) are as follows:
the methylation difference of a differential DNA methylation region (DMR) is at least 0.25 and at most 1, or at most -0.25; and the multiple-testing FDR is greater than 0 and less than 0.05.
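The screening criterion can be written as a small predicate. This sketch assumes that the FDR cut-off is "greater than 0 and less than 0.05", as stated in Example 1, and that a methylation difference is a value between -1 and 1; the record layout and the DMR identifiers are hypothetical.

```python
from typing import Dict, List


def passes_dmr_filter(meth_diff: float, fdr: float,
                      min_abs_diff: float = 0.25, max_fdr: float = 0.05) -> bool:
    """Keep a region when |methylation difference| >= 0.25 and 0 < FDR < 0.05."""
    return abs(meth_diff) >= min_abs_diff and 0.0 < fdr < max_fdr


# Hypothetical candidate regions (identifier, methylation difference, FDR).
candidates: List[Dict] = [
    {"id": "dmr_a", "diff": 0.30, "fdr": 0.010},   # hypermethylated, significant
    {"id": "dmr_b", "diff": -0.40, "fdr": 0.001},  # hypomethylated, significant
    {"id": "dmr_c", "diff": 0.10, "fdr": 0.001},   # difference too small
    {"id": "dmr_d", "diff": 0.50, "fdr": 0.200},   # not significant
]

kept = [c["id"] for c in candidates if passes_dmr_filter(c["diff"], c["fdr"])]
print(kept)
```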
Still further, in the step S8, the sample type is set as:
normal lung tissue samples are coded as 0, and non-small cell lung cancer samples are coded as 1.
Still further, in the step S8, the final markers are four, namely cg00787780, cg16434331, cg21862081 and cg24327132.
Still further, in step S9, the equation of the logistic regression model is:
$Y = 26.671 - 10.644X_1 - 4.376X_2 - 15.556X_3 - 11.790X_4$
wherein $X_1$ is the methylation degree of cg00787780, $X_2$ is the methylation degree of cg16434331, $X_3$ is the methylation degree of cg21862081, and $X_4$ is the methylation degree of cg24327132;
Y is the probability that the sample has non-small cell lung cancer, with $0 \le Y \le 1$.
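One way to read the model is sketched below. The intercept and coefficients are those of the equation above; treating the linear expression as the log-odds and mapping it to a probability with the logistic function is an assumption (the text does not spell out the link function), as is the use of methylation values between 0 and 1. The function names and the two example samples are hypothetical.

```python
import math
from typing import Mapping

INTERCEPT = 26.671
COEFFICIENTS = {
    "cg00787780": -10.644,
    "cg16434331": -4.376,
    "cg21862081": -15.556,
    "cg24327132": -11.790,
}


def linear_score(methylation: Mapping[str, float]) -> float:
    """Evaluate the linear expression of the model for one sample."""
    return INTERCEPT + sum(coef * methylation[cpg] for cpg, coef in COEFFICIENTS.items())


def nsclc_probability(methylation: Mapping[str, float]) -> float:
    """Assumed link: pass the linear score through the logistic function."""
    return 1.0 / (1.0 + math.exp(-linear_score(methylation)))


# Hypothetical samples: uniformly low versus uniformly high methylation.
low = {cpg: 0.2 for cpg in COEFFICIENTS}
high = {cpg: 0.9 for cpg in COEFFICIENTS}
print(linear_score(low), nsclc_probability(low))
print(linear_score(high), nsclc_probability(high))
```

Because all four coefficients are negative, lower methylation at these sites raises the predicted probability under this reading.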
Still further, in step S9, the logistic regression model classification performance criteria are as follows:
when the classification accuracy is more than 0.5 and less than or equal to 1 and the AUC is more than 0.5 and less than or equal to 1, the marker is effective (the closer the classification accuracy and the AUC are to 1, the better the classification performance of the marker is);
when the classification accuracy is less than or equal to 0.5 or the AUC is less than or equal to 0.5, the marker is invalid.
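The validity rule above can be expressed directly in code. This is an illustrative sketch; the function names and the five example labels are hypothetical, and the AUC value passed in is a placeholder rather than a result from the document.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted class equals the true class."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def marker_is_effective(acc: float, auc: float) -> bool:
    """Effective only when both accuracy and AUC lie in the interval (0.5, 1]."""
    return 0.5 < acc <= 1.0 and 0.5 < auc <= 1.0


y_true = [1, 1, 1, 0, 0]  # 1 = non-small cell lung cancer, 0 = normal lung tissue
y_pred = [1, 1, 0, 0, 0]
acc = accuracy(y_true, y_pred)  # 4 of 5 predictions are correct
print(acc, marker_is_effective(acc, auc=0.9))
```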
The invention also provides the DNA methylation markers obtained by screening by the method, wherein the markers are cg00787780, cg16434331, cg21862081 and cg24327132.
The invention also provides application of the kit for detecting the DNA methylation marker in the sample in preparation of products for diagnosing and predicting non-small cell lung cancer.
Key terms used in the invention are explained below:
1. WGBS refers to whole-genome bisulfite sequencing, which can accurately detect the methylation level of every single cytosine (C base) across the whole genome and is the gold standard for DNA methylation research.
2. The Illumina Infinium HumanMethylation450 BeadChip (450K methylation chip) can detect about 450,000 DNA methylation sites across the human genome and is a commonly used, economical DNA methylation detection technology.
3. DNA methylation refers to methylation of the carbon-5 position of cytosine in CpG dinucleotides. It is a relatively stable modification that DNA methyltransferases pass on to newly synthesized daughter DNA during DNA replication, and it is an important epigenetic mechanism. Methylation of a gene's promoter region can lead to transcriptional silencing of tumor suppressor genes and is therefore closely related to tumor development. Aberrant methylation includes hypermethylation of tumor suppressor genes and DNA repair genes, hypomethylation of repetitive DNA, and loss of imprinting of certain genes, all of which are associated with the development of a variety of tumors.
4. ChIA-PET refers to Chromatin Interaction Analysis with Paired-End Tag sequencing, a technology for studying long-range chromatin interactions genome-wide that integrates chromatin immunoprecipitation (ChIP), chromatin proximity ligation, paired-end tags and high-throughput sequencing.
5. A promoter is a DNA sequence that RNA polymerase recognizes and binds in order to initiate transcription. It contains conserved sequences required for specific RNA polymerase binding and transcription initiation, and most promoters are located upstream of the transcription start site of a gene. The promoter itself is not transcribed.
6. Enhancers are short regions of DNA that can be bound by proteins; when bound, they enhance the transcription of a gene. Enhancers may be located upstream or downstream of a gene and need not be close to the target gene in the linear genome, because the folded structure of chromatin allows sequences that are far apart to come into spatial contact.
7. RNAPOL2 refers to RNA polymerase II (also known as RNAP II or Pol II), a multiprotein complex. It is one of the three RNA polymerases found in the nucleus of eukaryotic cells. It catalyzes the transcription of DNA to synthesize the precursors of mRNA and of most snRNAs and microRNAs. RNA polymerase II is a 550 kDa complex of 12 subunits and the most studied type of RNA polymerase. It requires multiple transcription factors to bind gene promoters and initiate transcription.
8. Specificity is the proportion of samples from subjects without the disease that are detected as negative.
9. Sensitivity is the proportion of samples from subjects with the disease that are detected as positive.
10. AUC (Area Under Curve) is the area under the ROC curve bounded by the coordinate axes; its value lies between 0 and 1. The closer the AUC is to 1, the more reliable the detection method.
11. ROC refers to the receiver operating characteristic curve, a composite index that reflects sensitivity and specificity as the classification threshold varies continuously.
12. FDR (false discovery rate) is used here for the multiple-testing-corrected significance level. The correction described is the Bonferroni correction: if N independent hypotheses are tested simultaneously on the same data set, the significance level for each hypothesis should be 1/N of the level used when only one hypothesis is tested.
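The definitions of sensitivity, specificity and AUC given above can be made concrete with a short sketch. The AUC here is computed through its standard equivalent reading, the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (ties count one half). All counts and scores are hypothetical.

```python
from typing import Sequence


def sensitivity(tp: int, fn: int) -> float:
    """Share of diseased samples that test positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of disease-free samples that test negative."""
    return tn / (tn + fp)


def auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical confusion counts and classifier scores.
print(sensitivity(tp=90, fn=10), specificity(tn=45, fp=5))
pos_scores = [0.9, 0.8, 0.4]
neg_scores = [0.7, 0.3, 0.1]
print(auc(pos_scores, neg_scores))  # 8 of the 9 positive/negative pairs are ordered correctly
```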
The invention has the beneficial effects that:
1. The invention provides a screening method for non-small cell lung cancer markers that selects DNA methylation markers from aberrantly methylated enhancer regions that have three-dimensional chromatin interactions with gene promoter regions. The method is accurate, rapid and high-throughput, and it expands the range from which DNA methylation markers for diagnosing non-small cell lung cancer can be screened. Because aberrant DNA methylation in enhancer regions contributes to cancer through a different mechanism from aberrant methylation in promoter regions or inside genes, the method is expected to complement conventional DNA methylation marker screening methods and thereby further improve the detection rate of non-small cell lung cancer.
2. The DNA methylation markers have very high sensitivity and specificity. Compared with other biomarkers such as proteins and gene mutations, changes in DNA methylation are very stable and appear at an early stage of cancer, which benefits the early screening of non-small cell lung cancer. In addition, the method for screening non-small cell lung cancer markers based on aberrant enhancer DNA methylation can yield more cancer diagnosis markers with high reliability, helps to promote the clinical application of DNA methylation markers, and ultimately supports accurate, efficient, economical and noninvasive early cancer screening that improves people's quality of life.
Drawings
FIG. 1 is a flow chart of a method for screening DNA methylation markers of non-small cell lung cancer based on enhancer methylation difference.
FIG. 2 is a heatmap of the degree of methylation of the eDMRs in 17 samples. Each row represents one eDMR and each column represents one sample. NC denotes normal lung tissue samples, TC denotes non-small cell lung cancer samples, and the heatmap color represents the z-score of the degree of methylation.
FIG. 3 is a graph showing the distribution of methylation levels of the markers in normal lung tissue and in non-small cell lung cancer samples at various stages. Significance was tested with the Wilcoxon rank-sum test.
FIG. 4 is a ROC plot of a training set of normal lung tissue and non-small cell lung cancer samples. AUC represents the area under the ROC curve. At a classification threshold of 0.743, the specificity was 0.972 and the sensitivity was 0.988.
Detailed Description
The present invention is described in further detail below with reference to specific examples so as to be understood by those skilled in the art.
Example 1
The screening method of the non-small cell lung cancer marker based on the methylation difference of the enhancer comprises the following steps:
S1, obtaining the whole-genome DNA methylation data of 8 non-small cell lung cancer samples and of 9 normal lung tissue samples by WGBS sequencing; the specific information is shown in Table 1 below.
TABLE 1 statistics of WGBS methylation information
[Table 1 is provided as an image in the original publication and is not reproduced here.]
S2, carrying out DNA methylation difference analysis on the whole genome DNA methylation data of the non-small cell lung cancer sample and the whole genome DNA methylation data of the normal lung tissue sample by using BatMeth2 software to obtain genome position information of a difference DNA Methylation Region (DMR); wherein the screening criteria for genomic location information of the differential DNA Methylation Region (DMR) are as follows:
the methylation difference of a differential DNA methylation region (DMR) is at least 0.25 and at most 1, or at most -0.25; and the multiple-testing FDR is greater than 0 and less than 0.05.
S3, acquiring the genome position information of the enhancer regions of the non-small cell lung cancer related cell line A549 from the EnhancerAtlas 2.0 database and, by combining it with the genome position information of the differential DNA methylation regions (DMRs) acquired in step S2, retaining the DMRs that intersect the genome position of an enhancer region, so as to obtain the enhancer differential DNA methylation regions (eDMRs). A total of 2595 eDMRs were obtained; their methylation information in each sample is shown in FIG. 2, where the colors represent normalized values of the degree of methylation of the eDMRs.
S4, acquiring RNAPOL2 ChIA-PET data of the non-small cell lung cancer related cell line A549 by chromatin interaction analysis with paired-end tag sequencing (ChIA-PET), and obtaining the whole-genome RNAPOL2 three-dimensional chromatin interaction information of A549 with ChIA-PET Tool V3 software.
S5, based on the 2595 eDMRs, acquiring eDMR-gene promoter interaction pairs in which one end is located in a gene promoter and the other end in an eDMR, thereby obtaining the eDMRs that have three-dimensional chromatin interactions with gene promoters.
S6, acquiring 450K methylation chip data sets (number of non-small cell lung cancer samples: 807, number of normal samples: 71) of normal lung tissues and non-small cell lung cancer samples from the TCGA database.
S7, obtaining differential DNA methylation site (DMC) information from the 450K methylation chip data set by a statistical method and, in combination with the eDMRs obtained in S5 that have three-dimensional chromatin interactions with gene promoters, retaining the DMCs whose genome positions lie within these eDMRs as candidate DNA methylation markers. This gave 14 candidate markers, listed in Table 2; the 14 CpG sites in Table 2 are the candidate DNA methylation markers and are located in genomic enhancer regions.
TABLE 2 enhancer DNA methylation candidate markers
[Table 2 is provided as an image in the original publication and is not reproduced here.]
* Genomic positions refer to the human reference genome hg38.
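The selection in step S7 reduces to keeping the differential sites whose coordinate falls inside a promoter-linked eDMR. The sketch below illustrates that filter only; the probe identifiers and coordinates are placeholders (not the entries of Table 2), and half-open intervals on a single reference genome are assumed.

```python
from typing import Dict, List, Set, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start inclusive, end exclusive)


def candidate_markers(cpg_positions: Dict[str, Tuple[str, int]],
                      dmc_ids: Set[str],
                      edmrs: List[Interval]) -> List[str]:
    """Return the DMCs whose genomic position lies within any eDMR."""
    kept = []
    for cpg_id, (chrom, pos) in cpg_positions.items():
        if cpg_id not in dmc_ids:
            continue  # not differentially methylated on the chip
        if any(c == chrom and start <= pos < end for c, start, end in edmrs):
            kept.append(cpg_id)
    return sorted(kept)


# Hypothetical probes and regions, for illustration only.
cpg_positions = {
    "cg_demo_1": ("chr1", 150),  # DMC inside an eDMR
    "cg_demo_2": ("chr1", 900),  # DMC outside every eDMR
    "cg_demo_3": ("chr2", 150),  # same coordinate, different chromosome
    "cg_demo_4": ("chr1", 180),  # inside an eDMR but not a DMC
}
dmc_ids = {"cg_demo_1", "cg_demo_2", "cg_demo_3"}
edmrs = [("chr1", 100, 200), ("chr3", 100, 200)]

candidates = candidate_markers(cpg_positions, dmc_ids, edmrs)
print(candidates)
```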
S8, establishing a Lasso regression model on the 878 samples (807 non-small cell lung cancer samples and 71 normal lung tissue samples) with the 14 CpG sites in Table 2 for feature selection, and deleting the CpG sites whose coefficients in the model equal 0. This finally gave 4 CpG sites, the final non-small cell lung cancer enhancer DNA methylation marker sites, as shown in Table 3.
TABLE 3 selected 4 non-small cell Lung cancer enhancer DNA methylation marker sites
[Table 3 is provided as an image in the original publication and is not reproduced here.]
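Lasso feature selection as used in step S8 can be illustrated with a small pure-Python coordinate descent. This is a sketch under stated assumptions, not the solver or settings used for the results above: the penalty value `lam`, the toy methylation values and the site names are all hypothetical, and samples are coded 1 for tumour and 0 for normal.

```python
from typing import List, Sequence


def soft_threshold(value: float, lam: float) -> float:
    """Shrink value toward zero by lam; values within [-lam, lam] become exactly 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_fit(X: Sequence[Sequence[float]], y: Sequence[float],
              lam: float, n_iter: int = 100) -> List[float]:
    """Minimise (1/2n)*||y - b0 - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    col_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_mean[j] for j in range(p)] for row in X]  # centring absorbs b0
    residual = [v - y_mean for v in y]                            # residual when w = 0
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant column carries no information
            rho = sum(Xc[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= Xc[i][j] * delta
                w[j] = new_w
    return w


# Hypothetical toy data: 4 tumour samples (coded 1) and 4 normal samples (coded 0).
site_names = ["cpg_a", "cpg_b", "cpg_c"]  # placeholder IDs, not real probes
X = [
    [0.2, 0.3, 0.6], [0.2, 0.7, 0.5], [0.2, 0.3, 0.6], [0.2, 0.7, 0.5],  # tumour
    [0.8, 0.3, 0.5], [0.8, 0.7, 0.4], [0.8, 0.3, 0.5], [0.8, 0.7, 0.4],  # normal
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights = lasso_fit(X, y, lam=0.01)
selected = [name for name, w in zip(site_names, weights) if w != 0.0]
print(selected)
```

In this toy set only the first site separates the two classes, so the L1 penalty drives the other two coefficients to exactly zero, which is the property the selection step relies on.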
To validate the DNA methylation markers, we examined their degree of DNA methylation in normal lung tissue samples and in non-small cell lung cancer samples at each pathological stage. All 4 CpG sites showed aberrant DNA methylation, and the difference was already significant at Stage I. The results are shown in FIG. 3; significance was assessed with the Wilcoxon rank-sum test, with a total sample number of n = 878.
And S9, evaluating the efficacy of the DNA methylation marker of the non-small cell lung cancer enhancer. Using the methylation information of these 4 CpG sites, a logistic regression model was built for 878 samples in TCGA, and the equation of the model was:
$Y = 26.671 - 10.644X_1 - 4.376X_2 - 15.556X_3 - 11.790X_4$
wherein $X_1$ is the methylation degree of cg00787780, $X_2$ is the methylation degree of cg16434331, $X_3$ is the methylation degree of cg21862081, and $X_4$ is the methylation degree of cg24327132;
Y is the probability that the sample has non-small cell lung cancer, with $0 \le Y \le 1$.
With P = 0.743 as the screening threshold for Y, a sample is judged to have non-small cell lung cancer if Y ≥ P, and to be a normal sample otherwise. In the training set (the 878 TCGA samples) the sensitivity was 0.988 and the specificity was 0.972. The ROC curve of the markers in the training set is shown in FIG. 4; the AUC is 0.996, so the DNA methylation markers of the 4 enhancer regions have good classification performance. In addition, we evaluated the classification model built from these 4 methylation markers on two further test data sets, and compared its classification performance with that of 3 previously reported promoter-region DNA methylation markers. The results show that the enhancer-region DNA methylation markers have more stable and better classification performance. Specific results are shown in Tables 4 and 5.
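The decision rule with the threshold P = 0.743 and the tallying of true and false calls can be sketched as follows. Only the threshold comes from the text; the labels and probabilities are hypothetical and the function names are illustrative.

```python
from typing import Dict, Sequence

THRESHOLD = 0.743  # screening threshold P for the model output Y


def classify(probability: float, threshold: float = THRESHOLD) -> int:
    """Return 1 (non-small cell lung cancer) when Y >= P, otherwise 0 (normal)."""
    return 1 if probability >= threshold else 0


def confusion_counts(labels: Sequence[int], probabilities: Sequence[float],
                     threshold: float = THRESHOLD) -> Dict[str, int]:
    """Tally true/false positives and negatives at the given threshold."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for label, prob in zip(labels, probabilities):
        predicted = classify(prob, threshold)
        if predicted == 1:
            counts["TP" if label == 1 else "FP"] += 1
        else:
            counts["TN" if label == 0 else "FN"] += 1
    return counts


# Hypothetical model outputs for three tumour (1) and three normal (0) samples.
labels = [1, 1, 1, 0, 0, 0]
probabilities = [0.99, 0.743, 0.60, 0.10, 0.80, 0.742]
counts = confusion_counts(labels, probabilities)
print(counts)
```

A value exactly equal to the threshold is classified as cancer, matching the "Y ≥ P" rule.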
TABLE 4 validation of enhancer region DNA methylation marker Performance on two independent datasets
[Table 4 is provided as an image in the original publication and is not reproduced here.]
Note: Data Set: data set; Tumor: number of cancer samples; Normal: number of normal lung tissue samples; TP: true positive; TN: true negative; FP: false positive; FN: false negative.
TABLE 5 comparison of Classification Performance with previously reported Lung cancer DNA methylation markers
[Table 5 is provided as images in the original publication and is not reproduced here.]
Note: AUC is the area under the ROC curve; accuracy is classification Accuracy. The corresponding documents of the lung cancer DNA methylation markers reported previously are:
1. Li M, Zhang C, Zhou L, Li S, Cao YJ, Wang L, Xiang R, Shi Y & Piao Y (2020) Identification and validation of novel DNA methylation markers for early diagnosis of lung adenocarcinoma. Mol Oncol 14, 2744-2758.
2. Diaz-Lagares A, Mendez-Gonzalez J, Hervas D, Saigi M, Pajares MJ, Garcia D, Crujeiras AB, Pio R, Montuenga LM & Zulueta J (2016) A Novel Epigenetic Signature for Early Diagnosis in Lung Cancer. Clin Cancer Res 22, 3361-3371.
3. Dong S, Li W, Wang L, Hu J, Song Y, Zhang B, Ren X, Ji S, Li J & Xu P (2019) Histone-related genes are hypermethylated in lung cancer and hypermethylated HIST1H4F could serve as a pan-cancer biomarker. Cancer Res 79, 6101-6112.
Example 2 Kit for detecting the above DNA methylation markers in a sample
The kit detects the methylation degrees of the four markers in a biopsy sample or blood sample to be tested; the methylation degrees are substituted into the model equation, and whether the tested patient has non-small cell lung cancer is judged according to the threshold. In addition, the markers can be used to construct a new and more accurate model equation in a larger clinical cohort of non-small cell lung cancer, and that model equation can be applied to the early diagnosis and screening of non-small cell lung cancer.
Other parts not described in detail are prior art. Although the present invention has been described in detail with reference to the above embodiments, these are only some of the embodiments of the present invention, not all of them; other embodiments that can be obtained from them without inventive step also fall within the scope of the present invention.

Claims (8)

1. A method for screening a non-small cell lung cancer marker based on enhancer methylation difference, characterized in that the method comprises the following steps:
S1, obtaining whole-genome DNA methylation data of non-small cell lung cancer samples and of normal lung tissue samples by WGBS sequencing;
S2, carrying out differential DNA methylation analysis on the whole-genome DNA methylation data of the non-small cell lung cancer samples and of the normal lung tissue samples with BatMeth2 software to obtain genome position information of the differential DNA methylation regions (DMRs);
S3, obtaining genome position information of the enhancer regions of lung tissue, or of a non-small cell lung cancer related cell line, from the EnhancerAtlas 2.0 database and, by combining it with the genome position information of the DMRs obtained in step S2, retaining the DMRs that intersect the genome position of an enhancer region to obtain the enhancer differential DNA methylation regions (eDMRs);
S4, acquiring RNAPOL2 ChIA-PET data of lung tissue, or of a non-small cell lung cancer related cell line, by paired-end tag sequencing; obtaining the whole-genome RNAPOL2 three-dimensional chromatin interaction information of the lung tissue or cell line with ChIA-PET Tool V3 software; and, in combination with the eDMRs obtained in step S3, obtaining whole-genome eDMR-gene promoter interaction pairs in which one end is located in an eDMR and the other end in a gene promoter;
S5, screening the eDMR-gene promoter interaction pair information obtained in step S4 to obtain the eDMRs that have chromatin interactions with gene promoters;
S6, acquiring DNA methylation data of non-small cell lung cancer samples and of normal lung tissue samples with the Illumina Infinium HumanMethylation450 BeadChip;
S7, obtaining differential DNA methylation sites between the non-small cell lung cancer samples and the normal lung tissue samples by a statistical method, based on the DNA methylation data collected in S6; and, based on the eDMRs obtained by screening in step S5, selecting the differential DNA methylation sites whose genome positions are located within an eDMR as candidate markers;
S8, using the Lasso method, taking the methylation degree of the candidate markers obtained in S7 as independent variables and the sample type in the DNA methylation data collected in S6 as the dependent variable, establishing a linear regression model on the data collected in S6, and retaining the differential DNA methylation sites whose coefficients in the model are not equal to 0 as the final markers;
S9, acquiring a DNA methylation data set of non-small cell lung cancer samples and normal lung tissue samples with the 450K methylation chip detection technology, or downloading such a data set from a public database; constructing a logistic regression model with the acquired data set, and evaluating with the logistic regression model the effectiveness of the markers obtained in step S8 for diagnosing non-small cell lung cancer.
2. The method for screening non-small cell lung cancer markers based on the methylation difference of an enhancer as claimed in claim 1, wherein: in step S2, the screening criteria for genomic position information of the differential DNA Methylation Region (DMR) are as follows:
the methylation difference of a differential DNA methylation region (DMR) is at least 0.25 and at most 1, or at most -0.25; and the multiple-testing FDR is greater than 0 and less than 0.05.
3. The method for screening non-small cell lung cancer markers based on enhancer methylation difference according to claim 1, wherein: in step S8, the sample type is set as:
normal lung tissue samples are coded as 0, and non-small cell lung cancer samples are coded as 1.
4. The method for screening non-small cell lung cancer markers based on enhancer methylation difference according to claim 1, wherein: in the step S8, the final markers are four, namely cg00787780, cg16434331, cg21862081 and cg24327132.
5. The method for screening non-small cell lung cancer markers based on enhancer methylation difference according to claim 1, wherein: in step S9, the equation of the logistic regression model is:
$Y = 26.671 - 10.644X_1 - 4.376X_2 - 15.556X_3 - 11.790X_4$
wherein $X_1$ is the methylation degree of cg00787780, $X_2$ is the methylation degree of cg16434331, $X_3$ is the methylation degree of cg21862081, and $X_4$ is the methylation degree of cg24327132; Y is the probability that the sample has non-small cell lung cancer, with $0 \le Y \le 1$.
6. The method for screening non-small cell lung cancer markers based on enhancer methylation difference according to claim 1 or 4, wherein: in step S9, the performance criteria are as follows:
when the classification accuracy is more than 0.5 and less than or equal to 1 and the AUC is more than 0.5 and less than or equal to 1, the marker is effective;
when the classification accuracy is less than or equal to 0.5 or the AUC is less than or equal to 0.5, the marker is invalid.
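The performance criteria of claim 6 can be sketched in plain Python as follows. The helper names, the 0.5 decision threshold used to turn probabilities into class labels, and the example labels and probabilities are assumptions made for illustration; the claim specifies only the accuracy and AUC cutoffs. The AUC is computed here as the fraction of tumor/normal pairs in which the tumor sample receives the higher probability, with ties counted as one half.

```python
def accuracy(labels, probs, threshold=0.5):
    """Fraction of samples classified correctly at an assumed threshold."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


def auc(labels, probs):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg
    )
    return wins / (len(pos) * len(neg))


def marker_is_effective(acc, auc_value):
    """Claim-6 rule: both metrics must exceed 0.5 and be at most 1."""
    return 0.5 < acc <= 1.0 and 0.5 < auc_value <= 1.0


# Hypothetical labels (0 = normal lung tissue, 1 = NSCLC) and model outputs.
labels = [0, 0, 1, 1, 1]
probs = [0.1, 0.6, 0.4, 0.8, 0.9]
acc = accuracy(labels, probs)
auc_value = auc(labels, probs)
print(acc, auc_value, marker_is_effective(acc, auc_value))
```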
7. The DNA methylation markers screened by the method of claim 1, wherein the markers are cg00787780, cg16434331, cg21862081 and cg24327132.
8. Use of a kit for detecting a DNA methylation marker according to claim 7 in a sample for the preparation of a product for diagnosing and prognosing non-small cell lung cancer.
CN202211471119.8A 2022-11-23 2022-11-23 Method for screening non-small cell lung cancer marker based on methylation difference of enhancer, marker and application thereof Pending CN115820860A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211471119.8A CN115820860A (en) 2022-11-23 2022-11-23 Method for screening non-small cell lung cancer marker based on methylation difference of enhancer, marker and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211471119.8A CN115820860A (en) 2022-11-23 2022-11-23 Method for screening non-small cell lung cancer marker based on methylation difference of enhancer, marker and application thereof

Publications (1)

Publication Number Publication Date
CN115820860A (en) 2023-03-21

Family

ID=85530460

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211471119.8A Pending CN115820860A (en) 2022-11-23 2022-11-23 Method for screening non-small cell lung cancer marker based on methylation difference of enhancer, marker and application thereof

Country Status (1)

Country Link
CN (1) CN115820860A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116758989A (en) * 2023-06-09 2023-09-15 哈尔滨星云生物信息技术开发有限公司 Breast cancer marker screening method and related device
CN116758989B (en) * 2023-06-09 2024-04-30 哈尔滨星云生物信息技术开发有限公司 Breast cancer marker screening method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination