CN116875701A - Leukocyte methylation marker for molecular diagnosis of benign and malignant thyroid nodules and application thereof - Google Patents

Info

Publication number
CN116875701A
Authority
CN
China
Prior art keywords
methylation
benign
umhl3
chr1
pdr
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311085574.9A
Other languages
Chinese (zh)
Inventor
刘凌晓
颜志平
刘轶颖
王飞航
苏志熙
巩成相
何其晔
陈颐
徐汪洋
赵丹阳
孙慧怡
张子寒
杨敏捷
周馨
杨颂生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Wuyuan Health Technology Co ltd
Zhongshan Hospital Fudan University
Original Assignee
Shanghai Wuyuan Health Technology Co ltd
Zhongshan Hospital Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Wuyuan Health Technology Co ltd and Zhongshan Hospital Fudan University
Priority to CN202311085574.9A
Publication of CN116875701A
Legal status: Pending

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/112 Disease subtyping, staging or classification
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/154 Methylation markers

Abstract

The invention discloses leukocyte methylation markers for the molecular diagnosis of benign and malignant thyroid nodules and their application. Sixty methylation markers are screened from 108 training set samples, and a random forest regression model built on them predicts whether a thyroid nodule is benign or malignant. The method has high sensitivity and specificity for both non-tiny nodules (longest diameter greater than 1 cm) and tiny nodules (longest diameter not greater than 1 cm). Compared with existing techniques for the molecular diagnosis of benign and malignant thyroid nodules, the methylation markers and technical scheme of the invention address the problems that ultrasound cannot diagnose benign and malignant thyroid nodules with complete certainty, that puncture sampling carries risk, and that tissue-based molecular diagnosis has low accuracy. They thereby support early diagnosis and early treatment of thyroid cancer and help improve the cure rate.

Description

Leukocyte methylation marker for molecular diagnosis of benign and malignant thyroid nodules and application thereof
Technical Field
The invention relates to a leukocyte methylation marker for molecular diagnosis of benign and malignant thyroid nodules and application thereof, and belongs to the technical field of biomedical detection.
Background
Thyroid cancer is the most common malignancy of the head and neck, and its incidence has increased rapidly in recent years. Papillary thyroid carcinoma (PTC) is the most common type. The conventional clinical diagnostic method is ultrasound imaging. ACR TI-RADS is an ultrasound classification standard for thyroid nodules proposed by the American College of Radiology in 2017. Patients are scored quantitatively against this standard and then graded as TR1 (score 0, highly indicative of benign, malignancy risk < 2%), TR2 (score 2, benign, malignancy risk < 2%), TR3 (score 3, mildly suspicious for malignancy, malignancy risk < 5%), TR4 (score 4-6, moderately suspicious for malignancy, malignancy risk 5%-20%) or TR5 (score ≥ 7, highly indicative of malignancy, malignancy risk > 20%). Grades of TR3 or below are diagnosed as benign lesions and grades of TR4 or above as malignant lesions; that is, patients graded TR4 or above may still have benign lesions. In addition, ultrasound diagnosis is difficult in several situations: multiple small primary malignant lesions; single or multiple small malignant lesions combined with other benign lesions; and small focal malignant areas within larger benign lesions. When ultrasound strongly suggests a malignant thyroid nodule, or cannot give a definitive diagnosis, fine-needle aspiration (FNA) cytology is required to confirm the diagnosis. Because malignant and benign nodules can share similar cytological features, up to 30% of thyroid nodules are difficult to diagnose accurately by cytology. Existing molecular diagnostic methods improve the accuracy of this distinction, but their accuracy still needs to be improved.
Several cancer studies have found that cancer risk is significantly associated with the global DNA methylation of leukocytes, in cancers including intestinal cancer, bladder cancer, gastric cancer, breast cancer, and head and neck tumors. Other studies have found that the methylation of specific genes in leukocytes is associated with cancer risk, mainly in breast and intestinal cancers. Overall, relatively few studies have so far used leukocyte methylation markers to evaluate disease risk, but such markers allow noninvasive, repeated sampling, support dynamic assessment of disease risk, and have good application prospects.
Disclosure of Invention
Purpose of the invention: to address the shortcomings of the existing ACR TI-RADS system for the classification and diagnosis of thyroid nodules, the invention provides leukocyte methylation markers for the molecular diagnosis of benign and malignant thyroid nodules and their application, in order to improve the specificity and accuracy of that diagnosis and to reduce unnecessary puncture biopsies.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
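In a first aspect of the invention, there is provided a leukocyte methylation marker for diagnosing benign and malignant thyroid nodules, said methylation marker being selected from one or more of the following gene fragments:
chr1:2979941:2980204, chr1:9903073:9903206, chr1:16489135:16489154, chr1:42385756:42385944, chr1:84326655:84326705, chr1:202776149:202776284, chr1:212687460:212687484, chr1:221051967:221051975, chr1:247590087:247590125, chr2:90449871:90449877, chr2:162930212:162930284, chr2:232791101:232791148, chr3:96495744:96495785, chr4:3873289:3873320, chr4:7474354:7474374, chr4:17783433:17783483, chr5:15500310:15500479, chr5:42924409:42924459, chr5:72598984:72598989, chr6:27228227:27228268, chr7:4922765:4922802, chr7:27212883:27213062, chr7:27215299:27215672, chr7:30721952:30721977, chr8:11445789:11445888, chr8:24813967:24813984, chr8:38325198:38325247, chr8:125740528:125740563, chr9:94712520:94712581, chr9:114245252:114245260, chr9:136357330:136357347, chr9:140089655:140089861, chr10:71122474:71122535, chr10:77871851:77871936, chr10:90846868:90847011, chr10:114594519:114594642, chr10:130182493:130182560, chr10:134600258:134600302, chr10:135178449:135178609, chr11:133852324:133852377, chr11:134341177:134341187, chr12:125000613:125000883, chr13:26761355:26761378, chr14:21269942:21270203, chr14:70346859:70346870, chr15:27017041:27017079, chr15:44038750:44038801, chr15:76633785:76634041, chr16:25413501:25413639, chr17:85092:85115, chr18:77251761:77251776, chr19:8174155:8174267, chr20:6748842:6748856, chr20:33146070:33146819, chr20:62046303:62046323, chr21:45160770:45160856, chr22:39784422:39784565, chr22:40042775:40042781, chr22:49081733:49081764, chr22:50473587:50473604. Preferably, the methylation markers and their corresponding methylation level algorithms are as follows: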
chr1:2979941:2980204_UMHL3, chr1:9903073:9903206_MHL, chr1:16489135:16489154_MHL, chr1:42385756:42385944_PDR, chr1:84326655:84326705_AMF, chr1:202776149:202776284_MHL3, chr1:212687460:212687484_UMHL3, chr1:221051967:221051975_MHL, chr1:247590087:247590125_UMHL3, chr2:90449871:90449877_PDR, chr2:162930212:162930284_MHL, chr2:232791101:232791148_AMF, chr3:96495744:96495785_UMHL, chr4:3873289:3873320_PDR, chr4:7474354:7474374_PDR, chr4:17783433:17783483_MHL3, chr5:15500310:15500479_MHL, chr5:42924409:42924459_AMF, chr5:72598984:72598989_UMHL3, chr6:27228227:27228268_PDR, chr7:4922765:4922802_PDR, chr7:27212883:27213062_MHL, chr7:27215299:27215672_UMHL, chr7:30721952:30721977_PDR, chr8:11445789:11445888_UMHL3, chr8:24813967:24813984_PDR, chr8:38325198:38325247_AMF, chr8:125740528:125740563_MHL, chr9:94712520:94712581_UMHL3, chr9:114245252:114245260_AMF, chr9:136357330:136357347_PDR, chr9:140089655:140089861_AMF, chr10:71122474:71122535_MHL3, chr10:77871851:77871936_PDR, chr10:90846868:90847011_UMHL, chr10:114594519:114594642_AMF, chr10:130182493:130182560_UMHL3, chr10:134600258:134600302_MHL, chr10:135178449:135178609_UMHL3, chr11:133852324:133852377_UMHL3, chr11:134341177:134341187_PDR, chr12:125000613:125000883_PDR, chr13:26761355:26761378_UMHL3, chr14:21269942:21270203_PDR, chr14:70346859:70346870_PDR, chr15:27017041:27017079_UMHL3, chr15:44038750:44038801_MHL3, chr15:76633785:76634041_MHL3, chr16:25413501:25413639_UMHL3, chr17:85092:85115_UMHL3, chr18:77251761:77251776_MHL, chr19:8174155:8174267_UMHL3, chr20:6748842:6748856_MHL3, chr20:33146070:33146819_UMHL, chr20:62046303:62046323_AMF, chr21:45160770:45160856_UMHL3, chr22:39784422:39784565_UMHL, chr22:40042775:40042781_MHL, chr22:49081733:49081764_UMHL3, chr22:50473587:50473604_PDR.
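Each identifier in the list above combines a genomic region (chromosome:start:end) with the name of the methylation level algorithm after the underscore. The following is a minimal sketch of how such identifiers could be parsed; the names `Marker`, `VALID_METRICS` and `parse_marker` are hypothetical and are not part of the patent.

```python
from typing import NamedTuple


class Marker(NamedTuple):
    """One methylation marker: a genomic region plus its metric name."""
    chrom: str
    start: int
    end: int
    metric: str


# Metric suffixes that appear in the marker list of the text.
VALID_METRICS = {"AMF", "MHL", "MHL3", "UMHL", "UMHL3", "PDR"}


def parse_marker(text: str) -> Marker:
    """Parse an identifier such as 'chr1:2979941:2980204_UMHL3'."""
    region, metric = text.strip().rsplit("_", 1)
    chrom, start, end = region.split(":")
    if metric not in VALID_METRICS:
        raise ValueError(f"unknown methylation metric: {metric!r}")
    return Marker(chrom, int(start), int(end), metric)


# Three identifiers copied from the list above.
raw = "chr1:2979941:2980204_UMHL3, chr17:85092:85115_UMHL3, chr3:96495744:96495785_UMHL"
markers = [parse_marker(item) for item in raw.split(",")]
```

Keeping the metric name next to the coordinates makes it straightforward to dispatch each region to its own calculation later.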
In a second aspect, the invention provides the use of the leukocyte methylation marker described above, or a detection reagent for it, in the preparation of an in vitro diagnostic product for distinguishing benign from malignant thyroid nodules.
In a third aspect of the present invention, there is provided a kit for diagnosing a benign and malignant thyroid nodule, comprising reagents for detecting the methylation level of a methylation marker as described in the above technical scheme.
Preferably, the methylation level is calculated with at least one of the algorithms AMF, MHL, MHL3, UMHL, UMHL3 and PDR; more preferably, each methylation marker uses its corresponding methylation level algorithm, the markers and their corresponding algorithms being as set out in the first aspect of the invention.
Preferably, the reagent for detecting the methylation level is a reagent used in one or more of the following methods: bisulfite-conversion-based PCR (e.g., methylation-specific PCR), DNA sequencing (e.g., bisulfite sequencing, whole-genome methylation sequencing, reduced representation methylation sequencing), methylation-sensitive restriction enzyme analysis, fluorescence quantification, methylation-sensitive high-resolution melting curve analysis, chip-based methylation profiling, and mass spectrometry (e.g., time-of-flight mass spectrometry).
Preferably, the reagent is selected from one or more of the following: bisulfite and derivatives thereof, PCR buffer, polymerase, dNTPs, primers, probes, methylation-sensitive or methylation-insensitive restriction enzymes, enzyme digestion buffer, fluorescent dyes, fluorescence quenchers, fluorescent reporters, exonuclease, alkaline phosphatase, internal standards and controls.
In a fourth aspect of the invention, there is provided an apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the following steps when executing the program:
Step 1: obtaining the methylation level of a methylation marker according to the first aspect of the invention in a sample;
Step 2: processing the methylation level of each methylation marker using at least one algorithm selected from AMF, MHL, MHL3, UMHL, UMHL3 and PDR;
Step 3: obtaining a malignancy prediction probability for the sample from a constructed model, using the methylation levels processed in step 2;
Step 4: classifying the thyroid nodule of the sample as benign or malignant according to the malignancy prediction probability of the sample and a malignancy prediction threshold.
Preferably, each methylation marker in step 2 uses its corresponding methylation level algorithm, the markers and their corresponding algorithms being as set out in the first aspect of the invention.
Preferably, the model constructed in step 3 is a random forest regression model built with RandomForestRegressor from the scikit-learn library (version 0.24.2), with the following parameters: the number of weak learners is set to 80 and the maximum depth is set to 5.
Preferably, the identification method in step 4 is as follows: if the malignancy prediction probability obtained for the sample from the model is greater than the malignancy prediction threshold, the thyroid nodule of the sample is identified as malignant; otherwise it is identified as benign. The malignancy prediction threshold is 0.51.
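The decision rule of step 4 can be sketched in a few lines. This is only an illustration of the stated rule (strictly greater than the threshold means malignant); the function name `classify_nodule` is hypothetical.

```python
MALIGNANCY_THRESHOLD = 0.51  # threshold value stated in the text


def classify_nodule(malignancy_probability: float,
                    threshold: float = MALIGNANCY_THRESHOLD) -> str:
    """Return 'malignant' if the predicted probability exceeds the threshold,
    otherwise 'benign' (a probability equal to the threshold counts as benign)."""
    return "malignant" if malignancy_probability > threshold else "benign"


calls = [classify_nodule(p) for p in (0.12, 0.51, 0.78)]
```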
In a fifth aspect of the invention, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, performs the following steps:
Step 1: obtaining the methylation level of a methylation marker according to the first aspect of the invention in a sample;
Step 2: processing the methylation level of each methylation marker using at least one algorithm selected from AMF, MHL, MHL3, UMHL, UMHL3 and PDR;
Step 3: obtaining a malignancy prediction probability for the sample from a constructed model, using the methylation levels processed in step 2;
Step 4: classifying the thyroid nodule of the sample as benign or malignant according to the malignancy prediction probability of the sample and a malignancy prediction threshold.
Preferably, the model construction of step 3 and the identification method of step 4 are as set out in the fourth aspect of the invention.
In a sixth aspect of the invention, there is provided a system or device for diagnosing benign and malignant thyroid nodules, the system or device comprising:
a collection device for obtaining the methylation level of the methylation marker according to the first aspect of the invention in a sample;
a data processing device that processes the methylation level of each methylation marker using at least one algorithm selected from AMF, MHL, MHL3, UMHL, UMHL3 and PDR, and obtains a malignancy prediction probability for the sample from a constructed model using the processed methylation level of each marker, the model being a random forest regression model;
and a judging device for judging whether the thyroid nodule of the sample is benign or malignant according to the malignancy prediction probability of the sample and the malignancy prediction threshold.
Preferably, the collection device comprises a sample processing device and a sequencing device.
Preferably, the collection device further comprises means for inputting the methylation level.
Preferably, the model is a random forest regression model built with RandomForestRegressor from the scikit-learn library (version 0.24.2), with the following parameters: the number of weak learners is set to 80 and the maximum depth is set to 5.
Preferably, each methylation marker uses its corresponding methylation level algorithm, the markers and their corresponding algorithms being as set out in the first aspect of the invention.
Preferably, in the judging device, if the malignancy prediction probability obtained for the sample from the model is greater than the malignancy prediction threshold, the thyroid nodule of the sample is identified as malignant; otherwise it is identified as benign. The malignancy prediction threshold is 0.51.
Compared with the prior art, the invention has the following beneficial effects:
The invention provides a method for diagnosing benign and malignant thyroid nodules with a combination of peripheral blood methylation markers. The method has high sensitivity and specificity for both non-tiny nodules (longest diameter greater than 1 cm) and tiny nodules (longest diameter not greater than 1 cm). Compared with existing techniques for the molecular diagnosis of benign and malignant thyroid nodules, the methylation markers and technical scheme of the invention address the problems that ultrasound cannot diagnose benign and malignant thyroid nodules with complete certainty, that puncture sampling carries risk, and that tissue-based molecular diagnosis has low accuracy. They thereby support early diagnosis and early treatment of thyroid cancer and help improve the cure rate.
Drawings
FIG. 1. ROC curves of the model constructed from the methylation marker combination for diagnosing malignant nodules in the training set and validation set samples;
FIG. 2. ROC curves of the model constructed from the methylation marker combination for diagnosing malignant nodules in the training set and the validation set samples of non-tiny nodules;
FIG. 3. ROC curves of the model constructed from the methylation marker combination for diagnosing malignant nodules in the training set and the validation set samples of tiny nodules.
Detailed Description
In order to make the invention more comprehensible, preferred embodiments accompanied with figures are described in detail below.
The terms "benign" and "malignant" as used herein refer to the nature of thyroid nodules. In general, benign nodules show slow growth, uniform texture, good mobility, a smooth surface, cystic change, no lymph node enlargement and no calcification. Malignancy manifests as uncontrolled growth, spread and tissue infiltration of malignant cells. Ultrasound signs that suggest a malignant thyroid nodule include a taller-than-wide shape, absence of a halo, microcalcification, irregular margins, hypoechogenicity, solid composition and rich blood flow within the nodule. In the embodiments of the invention, the malignant thyroid nodule is papillary thyroid carcinoma (PTC).
Molecular diagnosis in the invention includes early diagnosis and late diagnosis of thyroid malignant tumors, as well as screening, risk assessment, prognosis and disease identification of thyroid malignant tumors.
Early diagnosis refers to the possibility of detecting cancer before metastasis, preferably before morphological changes in tissue or cells can be observed.
In the embodiments of the invention, the sample type is peripheral blood leukocytes. Peripheral blood is first collected from the subject, the peripheral blood leukocytes are separated, and genomic DNA is extracted from the leukocytes. Methylation sequencing of the extracted DNA may be whole-genome bisulfite sequencing (WGBS) or reduced representation bisulfite sequencing (RRBS); RRBS is used in the specific embodiments of the invention. RRBS enriches fragments rich in CCGG sites from genomic DNA by restriction enzyme digestion. Bisulfite treatment converts unmethylated cytosine (C) so that it is read as thymine (T) after sequencing, whereas methylated cytosine remains C. After the bisulfite-treated fragments are sequenced, the methylation rate at a single C site is calculated as the ratio of the number of reads in which the site remains C (not converted to T) to the total number of reads covering the site. Compared with WGBS, RRBS is a cost-effective methylation sequencing scheme that greatly reduces the amount of sequencing required, and it is widely applicable to large-scale clinical sample studies.
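The per-site methylation rate described above is a simple ratio of read counts. A minimal sketch follows, assuming the counts of reads showing C (methylated) and T (unmethylated) at the site are already available; the function name and the example counts are illustrative only.

```python
def methylation_rate(reads_c: int, reads_t: int) -> float:
    """Methylation rate at one cytosine site after bisulfite sequencing.

    reads_c: reads in which the site is still read as C (methylated).
    reads_t: reads in which the site is read as T (unmethylated, converted).
    """
    covered = reads_c + reads_t
    if covered == 0:
        raise ValueError("site is not covered by any read")
    return reads_c / covered


# Illustrative counts: 8 reads show C and 2 reads show T at the site.
rate = methylation_rate(8, 2)
```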
In an embodiment of the invention, the method of detecting benign and malignant thyroid nodules using the methylation markers comprises the following steps:
(1) The methylation levels of the CpG sites in the methylation markers of the sample are measured with reduced representation bisulfite sequencing (RRBS). The methylation metric specific to each marker (also called the methylation level calculation method; one of AMF, MHL, MHL3, UMHL, UMHL3 and PDR) is then calculated and z-score normalized, and the result is used as the DNA methylation level of that marker. Each methylation marker and its corresponding methylation level calculation method are shown in Table 1 in the Examples.
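The calculation formula of AMF is as follows:
$$\mathrm{AMF} = \frac{\sum_{i=1}^{m} N_{C,i}}{\sum_{i=1}^{m} \left(N_{C,i} + N_{T,i}\right)}$$
where $m$ is the total number of CpG sites in the marker, $i$ is one of the CpG sites, $N_{C,i}$ is the number of sequencing reads in which CpG site $i$ is methylated, and $N_{T,i}$ is the number of sequencing reads in which CpG site $i$ is unmethylated.
The calculation formula of MHL and MHL3 is:
$$\mathrm{MHL} = \frac{\sum_{i=1}^{l} w_i \, P(MH_i)}{\sum_{i=1}^{l} w_i}$$
where $l$ is the length of the marker, $P(MH_i)$ is the proportion of sequencing reads with completely consecutive methylated CpGs at position $i$, and $w_i$ is the weight for position $i$: $w_i = i$ for MHL and $w_i = i^3$ for MHL3.
The calculation formula of UMHL and UMHL3 is:
$$\mathrm{UMHL} = \frac{\sum_{i=1}^{l} w_i \, P(MH_i)}{\sum_{i=1}^{l} w_i}$$
where $l$ is the length of the marker, $P(MH_i)$ is here the proportion of sequencing reads with completely consecutive unmethylated CpGs at position $i$, and $w_i$ is the weight for position $i$: $w_i = i$ for UMHL and $w_i = i^3$ for UMHL3.
The calculation formula of PDR is as follows:
$$\mathrm{PDR} = \frac{N_{\mathrm{discordant}}}{N_{\mathrm{total}}}$$
that is, the number of sequencing reads with inconsistent methylation status within the marker region ($N_{\mathrm{discordant}}$) as a proportion of the total number of sequencing reads in the marker region ($N_{\mathrm{total}}$).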
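The metrics above can be illustrated on a toy set of reads. The sketch below rests on several assumptions that are not stated in the text: each read is encoded as a string over the CpG sites it covers ('1' methylated, '0' unmethylated); a read counts as inconsistent for PDR when it contains both states; and $P(MH_i)$ is taken as the fraction of all length-i windows across the reads that are fully methylated (or fully unmethylated for UMHL), with $l$ set to the longest read. All function names are hypothetical.

```python
def amf(reads):
    """Average methylation frequency: methylated calls / all calls."""
    methylated = sum(r.count("1") for r in reads)
    covered = sum(len(r) for r in reads)
    return methylated / covered


def pdr(reads):
    """Proportion of reads that mix methylated and unmethylated CpGs
    (assumed definition of an inconsistent read)."""
    discordant = sum(1 for r in reads if "0" in r and "1" in r)
    return discordant / len(reads)


def _haplotype_load(reads, state, power):
    """Weighted mean over window lengths i of the fraction of length-i
    windows made up entirely of `state`, with weight i ** power."""
    longest = max(len(r) for r in reads)
    weighted_sum = 0.0
    weight_total = 0.0
    for i in range(1, longest + 1):
        windows = [r[s:s + i] for r in reads for s in range(len(r) - i + 1)]
        fraction = sum(w == state * i for w in windows) / len(windows)
        weight = i ** power
        weighted_sum += weight * fraction
        weight_total += weight
    return weighted_sum / weight_total


def mhl(reads):
    return _haplotype_load(reads, "1", 1)


def mhl3(reads):
    return _haplotype_load(reads, "1", 3)


def umhl(reads):
    return _haplotype_load(reads, "0", 1)


def umhl3(reads):
    return _haplotype_load(reads, "0", 3)


# Toy reads over four CpG sites (made-up data, not from the patent).
reads = ["1111", "1111", "0000", "1100"]
metrics = {f.__name__: f(reads) for f in (amf, mhl, mhl3, umhl, umhl3, pdr)}
```

The cubic weights of MHL3 and UMHL3 emphasise long runs of identical methylation state more strongly than the linear weights of MHL and UMHL.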
(2) Prediction is performed with a random forest regression model (RandomForestRegressor) built on the DNA methylation levels of the methylation marker combination of the invention. The model is implemented with RandomForestRegressor from the scikit-learn library (version 0.24.2), with the number of weak learners set to 80, the maximum depth set to 5, and all other parameters left at their default settings. The malignancy prediction probability threshold is 0.51. For each sample, the constructed random forest regression model calculates a malignancy prediction probability from the DNA methylation levels of its marker combination. If the malignancy prediction probability is greater than the threshold, the sample is judged malignant; otherwise it is judged benign.
The test methods used in the following examples are conventional methods unless otherwise specified; the materials, reagents and the like used, unless otherwise specified, are those commercially available.
Examples
1. Sample collection
A total of 205 peripheral blood leukocyte samples were collected from patients with papillary thyroid cancer and patients with benign thyroid nodules at Zhongshan Hospital, Fudan University: 91 from patients with papillary thyroid cancer and 114 from patients with benign thyroid nodules. Of these, 108 samples (49 papillary thyroid cancer, 59 benign thyroid nodules) served as the training set and the remaining 97 samples (42 papillary thyroid cancer, 55 benign thyroid nodules) as the validation set. The validation set comprised 46 non-tiny nodules (longest diameter greater than 1 cm; 6 papillary thyroid cancer, 40 benign thyroid nodules) and 51 tiny nodules (longest diameter not greater than 1 cm; 36 papillary thyroid cancer, 15 benign thyroid nodules). For each sample, the methylation levels of CpG sites with coverage above 10x were obtained on the RRBS platform; the methylation metric specific to each marker (one of AMF, MHL, MHL3, UMHL, UMHL3 and PDR) was then calculated and z-score normalized to give the DNA methylation level of that marker. A random forest regression model was built on the training set from the DNA methylation levels of the marker combination; the model was then used to predict the validation set, the receiver operating characteristic (ROC) curve was plotted, and the area under the curve (AUC) was calculated.
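The z-score normalization mentioned above can be sketched as follows. The text does not say which samples supply the mean and standard deviation or which standard deviation is used; this sketch assumes they are estimated on the training set, applied unchanged to other samples, and that the population standard deviation is used. The function names and values are illustrative only.

```python
from statistics import mean, pstdev


def fit_zscore(train_values):
    """Estimate mean and (population) standard deviation of one marker's
    metric on the training samples. Assumption: training-set statistics."""
    return mean(train_values), pstdev(train_values)


def apply_zscore(value, mu, sigma):
    """Normalize one value; a constant marker (sigma == 0) maps to 0.0."""
    if sigma == 0:
        return 0.0
    return (value - mu) / sigma


# Made-up metric values for one marker in three training samples.
train = [0.20, 0.40, 0.60]
mu, sigma = fit_zscore(train)
z_train = [apply_zscore(v, mu, sigma) for v in train]
z_new = apply_zscore(0.50, mu, sigma)  # a new sample reuses mu and sigma
```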
2. Screening of methylation markers
For the 108 training set samples (49 patients with papillary thyroid carcinoma and 59 patients with benign thyroid nodules), the methylation levels of CpG sites with coverage above 10x were obtained on the RRBS platform. Methylation metrics were then calculated for the methylation haplotype blocks (MHBs): AMF, MHL, MHL3, UMHL, UMHL3 and PDR, i.e., six methylation metrics per MHB. The methylation metrics of patients with papillary thyroid cancer and patients with benign thyroid nodules were compared with a statistical test (rank-sum test), and a P value was calculated separately for each of the six metrics of each MHB. For each MHB, the methylation metric with the smallest P value was selected. Seventy markers were then retained at P < 0.001. Finally, 60 markers were selected by recursive feature elimination with cross-validation based on a random forest model, using 10-fold cross-validation.
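The first two screening steps (pick the metric with the smallest P value for each MHB, then keep MHBs with P < 0.001) can be sketched as below. The region names and P values are invented for illustration and do not come from the patent; the rank-sum test itself and the later recursive feature elimination are not shown.

```python
P_CUTOFF = 0.001  # cutoff stated in the text


def select_candidates(p_values):
    """For each MHB keep the metric with the smallest P value, and retain
    the MHB only if that P value is below the cutoff.

    p_values: {mhb_name: {metric_name: p_value}}
    returns:  {mhb_name: best_metric_name}
    """
    selected = {}
    for mhb, by_metric in p_values.items():
        best_metric = min(by_metric, key=by_metric.get)
        if by_metric[best_metric] < P_CUTOFF:
            selected[mhb] = best_metric
    return selected


# Hypothetical rank-sum P values for two regions (illustration only).
example = {
    "region_A": {"AMF": 0.02, "MHL": 0.0004, "MHL3": 0.003,
                 "UMHL": 0.2, "UMHL3": 0.0009, "PDR": 0.5},
    "region_B": {"AMF": 0.04, "MHL": 0.3, "MHL3": 0.6,
                 "UMHL": 0.01, "UMHL3": 0.002, "PDR": 0.7},
}
candidates = select_candidates(example)
```

In this toy input only region_A survives, paired with MHL, which mirrors how each retained marker in the text carries exactly one metric suffix.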
The 60 methylation markers screened (the genomic coordinates listed are based on the human reference genome hg19) and their corresponding methylation level metrics are as follows:
chr1:2979941:2980204_UMHL3, chr1:9903073:9903206_MHL, chr1:16489135:16489154_MHL, chr1:42385756:42385944_PDR, chr1:84326655:84326705_AMF, chr1:202776149:202776284_MHL3, chr1:212687460:212687484_UMHL3, chr1:221051967:221051975_MHL, chr1:247590087:247590125_UMHL3, chr2:90449871:90449877_PDR, chr2:162930212:162930284_MHL, chr2:232791101:232791148_AMF, chr3:96495744:96495785_UMHL, chr4:3873289:3873320_PDR, chr4:7474354:7474374_PDR, chr4:17783433:17783483_MHL3, chr5:15500310:15500479_MHL, chr5:42924409:42924459_AMF, chr5:72598984:72598989_UMHL3, chr6:27228227:27228268_PDR, chr7:4922765:4922802_PDR, chr7:27212883:27213062_MHL, chr7:27215299:27215672_UMHL, chr7:30721952:30721977_PDR, chr8:11445789:11445888_UMHL3, chr8:24813967:24813984_PDR, chr8:38325198:38325247_AMF, chr8:125740528:125740563_MHL, chr9:94712520:94712581_UMHL3, chr9:114245252:114245260_AMF, chr9:136357330:136357347_PDR, chr9:140089655:140089861_AMF, chr10:71122474:71122535_MHL3, chr10:77871851:77871936_PDR, chr10:90846868:90847011_UMHL, chr10:114594519:114594642_AMF, chr10:130182493:130182560_UMHL3, chr10:134600258:134600302_MHL, chr10:135178449:135178609_UMHL3, chr11:133852324:133852377_UMHL3, chr11:134341177:134341187_PDR, chr12:125000613:125000883_PDR, chr13:26761355:26761378_UMHL3, chr14:21269942:21270203_PDR, chr14:70346859:70346870_PDR, chr15:27017041:27017079_UMHL3, chr15:44038750:44038801_MHL3, chr15:76633785:76634041_MHL3, chr16:25413501:25413639_UMHL3, chr17:85092:85115_UMHL3, chr18:77251761:77251776_MHL, chr19:8174155:8174267_UMHL3, chr20:6748842:6748856_MHL3, chr20:33146070:33146819_UMHL, chr20:62046303:62046323_AMF, chr21:45160770:45160856_UMHL3, chr22:39784422:39784565_UMHL, chr22:40042775:40042781_MHL, chr22:49081733:49081764_UMHL3, chr22:50473587:50473604_PDR.
The gene names (gene symbols) to which the above methylation markers are annotated in the human reference genome hg19 are shown in Table 1 below:
Table 1. Genes annotated to the methylation markers
3. Construction of random forest prediction model
A random forest regression model was built on the training set from the DNA methylation levels of the methylation marker combination screened above (the combination of 60 methylation markers; the same below). The model was built with RandomForestRegressor from the scikit-learn library (version 0.24.2), with the code RandomForestRegressor(n_estimators=80, max_depth=5, random_state=1); that is, the number of weak learners in the model is set to 80 and the maximum depth is set to 5. The model hyperparameters were obtained by 10-fold cross-validation on the training set samples. A malignancy prediction probability threshold of 0.51 was obtained from the model's predictions on the training set samples. For each sample, the constructed random forest regression model calculates a malignancy prediction probability from the DNA methylation levels of its marker combination. If the malignancy prediction probability is greater than the threshold, the sample is judged malignant; otherwise it is judged benign.
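The model construction described above can be sketched with scikit-learn as follows. The feature matrix here is synthetic random data, not the patent's methylation data, and coding malignant samples as 1 and benign samples as 0 is an assumption; only the constructor arguments and the 0.51 threshold come from the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the training set: 108 samples x 60 normalized markers.
n_markers = 60
y_train = np.array([1] * 49 + [0] * 59)   # assumed coding: 1 = malignant, 0 = benign
X_train = rng.normal(size=(len(y_train), n_markers))
X_train[y_train == 1, :5] += 1.0          # inject an artificial signal

# Constructor arguments as quoted in the text.
model = RandomForestRegressor(n_estimators=80, max_depth=5, random_state=1)
model.fit(X_train, y_train)

# Regression on 0/1 labels yields values in [0, 1], read here as the
# malignancy prediction probability.
X_new = rng.normal(size=(3, n_markers))
probabilities = model.predict(X_new)
calls = ["malignant" if p > 0.51 else "benign" for p in probabilities]
```

Because each tree predicts the mean label of a leaf, the averaged output stays between 0 and 1, which is what allows a regression model to be thresholded like a probability.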
4. Random forest predictive model diagnosis of benign and malignant nodules in training set and all validation set samples
The random forest prediction model constructed above was evaluated on all validation set samples. The area under the ROC curve (AUC) in the validation set was 0.86, with a 95% CI of 0.82-0.90 (Fig. 1). The malignancy prediction threshold of 0.51 is the value at which both specificity and sensitivity in the training set are 100%: a sample with a malignancy prediction probability greater than 0.51 is judged malignant, and otherwise benign. At this threshold, the sensitivity for diagnosing malignant thyroid nodules is 83.3%, the specificity is 90.9%, the PPV (positive predictive value) is 87.5%, and the NPV (negative predictive value) is 87.7%. The prediction results of the methylation marker combination for the test set samples are shown in Table 2:
TABLE 2 Methylation marker combination prediction results for test set samples
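The sensitivity, specificity, PPV and NPV reported in this section follow from the confusion matrix obtained by applying the 0.51 threshold. A minimal Python sketch of that calculation is given below. The labels and probabilities are hypothetical illustration values, not data from the study, and the function name diagnostic_metrics is an assumption.

```python
def diagnostic_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, PPV and NPV from 0/1 labels (1 = malignant)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # malignant samples called malignant
        "specificity": ratio(tn, tn + fp),  # benign samples called benign
        "ppv": ratio(tp, tp + fp),          # malignant calls that are correct
        "npv": ratio(tn, tn + fn),          # benign calls that are correct
    }


# Hypothetical example values (not from the patent).
truth = [1, 1, 1, 0, 0, 0, 0]
probabilities = [0.9, 0.6, 0.4, 0.2, 0.55, 0.1, 0.3]
predicted = [1 if p > 0.51 else 0 for p in probabilities]
metrics = diagnostic_metrics(truth, predicted)
```

In this toy example two of three malignant samples and three of four benign samples are called correctly, so sensitivity and PPV are both 2/3 and specificity and NPV are both 3/4.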
5. Random forest predictive model diagnosis of benign and malignant nodules in training set and non-micro-nodule validation set samples
The random forest prediction model constructed above was evaluated on the validation set samples with non-micro nodules. The AUC in this validation set was 0.87, with a 95% CI of 0.82-0.93 (Fig. 2). At the malignancy prediction threshold of 0.51, the sensitivity for diagnosing malignant thyroid nodules among non-micro nodules is 66.7%, the specificity is 90%, the PPV is 50%, and the NPV is 94.7%. The prediction results of the methylation marker combination for the non-micro-nodule validation set samples are shown in Table 3:
TABLE 3 Methylation marker combination prediction results for non-micro-nodule validation set samples
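The area under the ROC curve reported for each validation set can be computed directly from the predicted probabilities, without fixing a threshold. The following Python sketch uses the rank-based (Mann-Whitney) formulation: the AUC equals the fraction of malignant/benign sample pairs in which the malignant sample receives the higher score, with ties counted as one half. The data are hypothetical and the function name roc_auc is an assumption. The source does not state how the 95% CI was derived, so no interval is computed here.

```python
def roc_auc(y_true, scores):
    """Rank-based AUC: probability that a malignant sample outscores a benign one."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one malignant and one benign sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # malignant sample ranked above benign sample
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical example values (not from the patent).
truth = [1, 1, 1, 0, 0, 0, 0]
probabilities = [0.9, 0.6, 0.4, 0.2, 0.55, 0.1, 0.3]
auc = roc_auc(truth, probabilities)
```

Here 11 of the 12 malignant/benign pairs are ordered correctly, so the AUC of the toy data is 11/12.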
6. Random forest predictive model diagnosis of benign and malignant nodules in training set and micro-nodule validation set samples
The random forest prediction model constructed above was evaluated on the validation set samples with micro nodules. With the methylation marker combination, the AUC in this validation set was 0.86, with a 95% CI of 0.81-0.94 (Fig. 3). At the malignancy prediction threshold of 0.51, the sensitivity for diagnosing malignant thyroid nodules among micro nodules is 86.1%, the specificity is 93.3%, the PPV is 96.9%, and the NPV is 73.7%. The prediction results of the methylation marker combination for the micro-nodule validation set samples are shown in Table 4:
TABLE 4 Methylation marker combination prediction results for micro-nodule validation set samples

Claims (10)

1. A leukocyte methylation marker for diagnosing benign and malignant thyroid nodules, wherein the methylation marker is selected from one or more of the following gene fragments:
chr1:2979941:2980204,chr1:9903073:9903206,chr1:16489135:16489154,chr1:42385756:42385944,chr1:84326655:84326705,chr1:202776149:202776284,chr1:212687460:212687484,chr1:221051967:221051975,chr1:247590087:247590125,chr2:90449871:90449877,chr2:162930212:162930284,chr2:232791101:232791148,chr3:96495744:96495785,chr4:3873289:3873320,chr4:7474354:7474374,chr4:17783433:17783483,chr5:15500310:15500479,chr5:42924409:42924459,chr5:72598984:72598989,chr6:27228227:27228268,chr7:4922765:4922802,chr7:27212883:27213062,chr7:27215299:27215672,chr7:30721952:30721977,chr8:11445789:11445888,chr8:24813967:24813984,chr8:38325198:38325247,chr8:125740528:125740563,chr9:94712520:94712581,chr9:114245252:114245260,chr9:136357330:136357347,chr9:140089655:140089861,chr10:71122474:71122535,chr10:77871851:77871936,chr10:90846868:90847011,chr10:114594519:114594642,chr10:130182493:130182560,chr10:134600258:134600302,chr10:135178449:135178609,chr11:133852324:133852377,chr11:134341177:134341187,chr12:125000613:125000883,chr13:26761355:26761378,chr14:21269942:21270203,chr14:70346859:70346870,chr15:27017041:27017079,chr15:44038750:44038801,chr15:76633785:76634041,chr16:25413501:25413639,chr17:85092:85115,chr18:77251761:77251776,chr19:8174155:8174267,chr20:6748842:6748856,chr20:33146070:33146819,chr20:62046303:62046323,chr21:45160770:45160856,chr22:39784422:39784565,chr22:40042775:40042781,chr22:49081733:49081764,chr22:50473587:50473604.
2. The leukocyte methylation marker for diagnosing benign and malignant thyroid nodules according to claim 1, wherein the methylation markers and their corresponding methylation level algorithms are as follows:
chr1:2979941:2980204_UMHL3,chr1:9903073:9903206_MHL,
chr1:16489135:16489154_MHL,chr1:42385756:42385944_PDR,
chr1:84326655:84326705_AMF,chr1:202776149:202776284_MHL3,
chr1:212687460:212687484_UMHL3,chr1:221051967:221051975_MHL,
chr1:247590087:247590125_UMHL3,chr2:90449871:90449877_PDR,
chr2:162930212:162930284_MHL,chr2:232791101:232791148_AMF,
chr3:96495744:96495785_UMHL,chr4:3873289:3873320_PDR,
chr4:7474354:7474374_PDR,chr4:17783433:17783483_MHL3,
chr5:15500310:15500479_MHL,chr5:42924409:42924459_AMF,
chr5:72598984:72598989_UMHL3,chr6:27228227:27228268_PDR,
chr7:4922765:4922802_PDR,chr7:27212883:27213062_MHL,
chr7:27215299:27215672_UMHL,chr7:30721952:30721977_PDR,
chr8:11445789:11445888_UMHL3,chr8:24813967:24813984_PDR,
chr8:38325198:38325247_AMF,chr8:125740528:125740563_MHL,
chr9:94712520:94712581_UMHL3,chr9:114245252:114245260_AMF,
chr9:136357330:136357347_PDR,chr9:140089655:140089861_AMF,
chr10:71122474:71122535_MHL3,chr10:77871851:77871936_PDR,
chr10:90846868:90847011_UMHL,chr10:114594519:114594642_AMF,
chr10:130182493:130182560_UMHL3,chr10:134600258:134600302_MHL,
chr10:135178449:135178609_UMHL3,chr11:133852324:133852377_UMHL3,
chr11:134341177:134341187_PDR,chr12:125000613:125000883_PDR,
chr13:26761355:26761378_UMHL3,chr14:21269942:21270203_PDR,
chr14:70346859:70346870_PDR,chr15:27017041:27017079_UMHL3,
chr15:44038750:44038801_MHL3,chr15:76633785:76634041_MHL3,
chr16:25413501:25413639_UMHL3,chr17:85092:85115_UMHL3,
chr18:77251761:77251776_MHL,chr19:8174155:8174267_UMHL3,
chr20:6748842:6748856_MHL3,chr20:33146070:33146819_UMHL,
chr20:62046303:62046323_AMF,chr21:45160770:45160856_UMHL3,
chr22:39784422:39784565_UMHL,chr22:40042775:40042781_MHL,
chr22:49081733:49081764_UMHL3,chr22:50473587:50473604_PDR.
3. Use of the leukocyte methylation marker according to claim 1, or a detection reagent thereof, in the preparation of an in vitro diagnostic product for the detection of benign and malignant thyroid nodules.
4. A kit for diagnosing benign and malignant thyroid nodules, comprising reagents for detecting the methylation level of the methylation marker of claim 1.
5. A system or device for diagnosing benign and malignant thyroid nodules, the system or device comprising:
an acquisition device for obtaining the methylation level of the methylation marker of claim 1 in a sample;
a data processing device that processes the methylation level of each of the methylation markers using at least one algorithm selected from AMF, MHL, MHL3, UMHL3, and PDR, and obtains a malignancy prediction probability for the sample from a model constructed using the processed methylation levels of the methylation markers, the model being a random forest regression model; and
a judging device for determining whether the thyroid nodule of the sample is benign or malignant according to the malignancy prediction probability of the sample and a malignancy prediction threshold.
6. The system or device for diagnosing benign and malignant thyroid nodules according to claim 5, wherein the acquisition device comprises a sample processing device and a sequencing device.
7. The system or device for diagnosing benign and malignant thyroid nodules according to claim 6, wherein the acquisition device further comprises an input device for entering the methylation level.
8. The system or device for diagnosing benign and malignant thyroid nodules according to claim 5, wherein the model is a random forest regression model constructed using RandomForestRegressor in the scikit-learn library, with the parameters set as follows: the number of weak learners is 80 and the maximum depth is 5.
9. The system or device for diagnosing benign and malignant thyroid nodules according to claim 5, wherein each methylation marker uses its corresponding methylation level algorithm, the methylation markers and corresponding methylation level algorithms being those of claim 2.
10. The system or device for diagnosing benign and malignant thyroid nodules according to claim 5, wherein, in the judging device, if the malignancy prediction probability of the sample given by the model is greater than the malignancy prediction threshold, the sample is identified as a malignant thyroid nodule; otherwise, the sample is identified as a benign thyroid nodule; and the malignancy prediction threshold is 0.51.
CN202311085574.9A 2023-08-25 2023-08-25 Leukocyte methylation marker for molecular diagnosis of benign and malignant thyroid nodules and application thereof Pending CN116875701A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311085574.9A CN116875701A (en) 2023-08-25 2023-08-25 Leukocyte methylation marker for molecular diagnosis of benign and malignant thyroid nodules and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311085574.9A CN116875701A (en) 2023-08-25 2023-08-25 Leukocyte methylation marker for molecular diagnosis of benign and malignant thyroid nodules and application thereof

Publications (1)

Publication Number Publication Date
CN116875701A true CN116875701A (en) 2023-10-13

Family

ID=88262992

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311085574.9A Pending CN116875701A (en) 2023-08-25 2023-08-25 Leukocyte methylation marker for molecular diagnosis of benign and malignant thyroid nodules and application thereof

Country Status (1)

Country Link
CN (1) CN116875701A (en)

Similar Documents

Publication Publication Date Title
CN111910004B (en) Application of cfDNA in noninvasive diagnosis of early breast cancer
AU2018305609B2 (en) Enhancement of cancer screening using cell-free viral nucleic acids
CN111863250B (en) Combined diagnosis model and system for early breast cancer
CN109830264B (en) Method for classifying tumor patients based on methylation sites
KR20190085667A (en) Circulating Tumor DNA Detection Method Using Sample comprising Cell free DNA and Uses thereof
CN113355415B (en) Detection reagent and kit for diagnosis or auxiliary diagnosis of esophageal cancer
WO2018019294A1 (en) Methods for gynecologic neoplasm diagnosis
CN115896281A (en) Methylated biomarker, kit and application
KR102271315B1 (en) Method for prognosis of breast cancer using ribosomal protein from artificial intelligence
CA3167633A1 (en) Systems and methods for calling variants using methylation sequencing data
CN1549864A (en) Evaluating system for predicting cancer return
CN115976209A (en) Training method of lung cancer prediction model, prediction device and application
CN108998530B (en) Lung cancer up-regulated long-chain non-coding RNA marker and application thereof
CN112771178A (en) Prediction and characterization of DLBCL-derived cell subtypes
CN116875701A (en) Leukocyte methylation marker for molecular diagnosis of benign and malignant thyroid nodules and application thereof
WO2022124718A1 (en) Method for prognosis of breast cancer by using mitochondria ribosomal gene set derived by artificial intelligence
CN112852969B (en) Epigenetically modified lncRNA as tumor diagnosis or tumor progression prediction marker
CN113981078B (en) Biomarker for predicting curative effect of EGFR (epidermal growth factor receptor) -resistant targeted therapy of patients with advanced esophageal cancer and curative effect prediction kit
EP4130293A1 (en) Method of mutation detection in a liquid biopsy
CN113151469B (en) Tumor classification marker combination and application thereof
CN115772566B (en) Methylation biomarker for auxiliary detection of lung cancer somatic ERBB2 gene mutation and application thereof
CN116403719A (en) Construction method of breast nodule malignancy differential diagnosis model
CN115287360A (en) Methylation marker for detecting benign and malignant thyroid nodules and application
CN116536422A (en) Thyroid cancer early-stage auxiliary diagnosis marker
Ren et al. Early Detection of Non-Small Cell Lung Cancer with Novel 5-Hydroxymethylcytosine DNA Markers: Discovery, Tissue Validation, and Pilot Testing in Plasma

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination