WO2022032429A1 - Methylation markers for liver cancer detection and diagnosis - Google Patents

Methylation markers for liver cancer detection and diagnosis

Info

Publication number
WO2022032429A1
Authority
WO
WIPO (PCT)
Prior art keywords
cancer
chromosome
positions
distinguishing
differentially methylated
Prior art date
Application number
PCT/CN2020/108131
Other languages
English (en)
French (fr)
Inventor
汪宇盈
蒋睿婧芳
彭佳茜
孙健泷
李志隆
郑建超
朱师达
Original Assignee
华大数极生物科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华大数极生物科技(深圳)有限公司
Priority to CN202080010767.6A (patent CN113454219B)
Priority to PCT/CN2020/108131 (patent WO2022032429A1)
Publication of WO2022032429A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00: ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/112: Disease subtyping, staging or classification
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/154: Methylation markers

Definitions

  • the present invention relates to the field of biomedicine, in particular to a methylation marker for liver cancer detection and diagnosis.
  • Liver cancer is one of the cancers with high incidence and mortality in the world.
  • The incidence of liver cancer in China is particularly high: more than 50% of the world's liver cancer cases occur in China.
  • The main screening methods for liver cancer are serum alpha-fetoprotein (AFP) testing and ultrasound imaging. These methods have low sensitivity or insufficient specificity for early-stage liver cancer, and imaging is further limited by factors such as the examining doctor's experience and the performance of the detection equipment. As a result, a large proportion of liver cancers are already at an advanced stage when discovered; the treatment outcome and prognosis of advanced liver cancer are poor, and the five-year survival rate of patients is low. It is therefore of great significance to establish an accurate, simple and economical early screening method for liver cancer.
  • AFP serum alpha-fetoprotein
  • cfDNA cell-free DNA
  • ctDNA circulating tumor DNA
  • DNA methylation is an important mechanism for regulating gene expression and silencing, and it has a significant impact on the occurrence and development of tumors. Aberrant methylation of cancer-related genes often occurs in the early stages of cancer, so DNA methylation signals are considered potential markers for early tumor screening.
  • The flowchart of the existing liver cancer screening program (from the "Guidelines for the Diagnosis and Treatment of Primary Liver Cancer (2019 Edition)" announced by the National Health Commission) is shown in Figure 1.
  • The main screening methods for liver cancer are serum alpha-fetoprotein (AFP) examination and ultrasonography. These methods have low sensitivity or insufficient specificity for early-stage liver cancer, and imaging detection is further limited by factors such as the experience of the examining doctor and the performance of the detection equipment.
  • A large proportion of liver cancers are discovered at an advanced stage. The treatment outcome and prognosis of advanced liver cancer are poor, and the five-year survival rate of patients is low.
  • Liver cancer has high incidence and mortality. It is asymptomatic at an early stage, so patients are often diagnosed in the middle or late stages, which greatly reduces the five-year survival rate. Current screening methods for liver cancer are few and limited: most rely on ultrasound imaging, which is not sensitive enough to small liver cancer lesions, and serum alpha-fetoprotein (AFP), currently the only widely used blood marker for liver cancer screening, has low sensitivity or insufficient specificity for early liver cancer and cannot meet the requirements of large-scale liver cancer screening. To address these problems, the present invention provides a methylation marker for liver cancer detection and diagnosis.
  • the present invention claims a set of differentially methylated regions (DMRs).
  • the group of differentially methylated regions claimed in the present invention contains all or part of the following 51 differentially methylated regions (specifically shown in Table 1):
  • (A7) is located at positions 119535537-119535986 of chromosome 1;
  • (A10) is located at positions 197882364-197882519 of chromosome 1;
  • (A15) is located at positions 200326591-200327369 of chromosome 2;
  • (A16) is located at positions 200333453-200333973 of chromosome 2;
  • (A18) is located at positions 170137150-170137931 of chromosome 3;
  • (A24) is located at positions 108488594-108488844 of chromosome 6;
  • (A27) is located at positions 27207996-27208054 of chromosome 7;
  • (A30) is located at positions 17271051-17271340 of chromosome 8;
  • (A33) is located at positions 99985934-99986482 of chromosome 8;
  • (A38) is located at positions 58021614-58021842 of chromosome 12;
  • (A40) is located at positions 102247495-102248194 of chromosome 14;
  • (A44) is located at positions 75368790-75370662 of chromosome 17;
  • (A50) is located at positions 50721097-50722014 of chromosome 20;
  • (A51) is located at positions 38220548-38221506 of chromosome 22.
  • the physical positions of the 51 differentially methylated regions were determined based on the alignment of the human whole genome sequence (version number hg19).
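The regions listed above can be held in a small lookup structure. The following is a minimal sketch, not part of the patent: it contains only the 14 regions that appear in this text (the remaining entries of Table 1 are not reproduced here), the names `DMR`, `LISTED_DMRS` and `find_dmr` are hypothetical, and it assumes 1-based inclusive hg19 coordinates, which the source does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DMR:
    label: str   # region label as used in the text, e.g. "A7"
    chrom: str   # chromosome name (hg19)
    start: int   # first position of the region
    end: int     # last position of the region

    @property
    def length(self) -> int:
        # Assumption: coordinates are 1-based and inclusive.
        return self.end - self.start + 1


# Only the regions that are listed in this text; coordinates copied as given.
LISTED_DMRS = [
    DMR("A7", "chr1", 119535537, 119535986),
    DMR("A10", "chr1", 197882364, 197882519),
    DMR("A15", "chr2", 200326591, 200327369),
    DMR("A16", "chr2", 200333453, 200333973),
    DMR("A18", "chr3", 170137150, 170137931),
    DMR("A24", "chr6", 108488594, 108488844),
    DMR("A27", "chr7", 27207996, 27208054),
    DMR("A30", "chr8", 17271051, 17271340),
    DMR("A33", "chr8", 99985934, 99986482),
    DMR("A38", "chr12", 58021614, 58021842),
    DMR("A40", "chr14", 102247495, 102248194),
    DMR("A44", "chr17", 75368790, 75370662),
    DMR("A50", "chr20", 50721097, 50722014),
    DMR("A51", "chr22", 38220548, 38221506),
]


def find_dmr(chrom: str, pos: int) -> Optional[DMR]:
    """Return the listed DMR that covers the given position, or None."""
    for dmr in LISTED_DMRS:
        if dmr.chrom == chrom and dmr.start <= pos <= dmr.end:
            return dmr
    return None
```

For example, `find_dmr("chr7", 27208000)` returns the entry labelled A27, while a position outside every listed region returns `None`.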
  • The group of differentially methylated regions includes, but is not limited to, subsets selected from the aforementioned (A1)-(A51), as well as variants with small-scale replacements, small-scale additions, and the like.
  • the group of differentially methylated regions can be composed of all or part of the 51 differentially methylated regions shown in (A1)-(A51) above.
  • the differentially methylated region group consists of 51 differentially methylated regions shown in (A1)-(A51) above.
  • the present invention claims the application of the differentially methylated region group described above as a methylation marker in any of the following:
  • the present invention claims the use of a substance for detecting the methylation level of the set of differentially methylated regions described above in any of the following:
  • the present invention claims the use of a combination of substances and media in any of the following:
  • the substance is a substance for detecting the differentially methylated region group described above;
  • a method for constructing and using a cancer risk prediction model is stored on the medium
  • the method for constructing and using the cancer risk prediction model includes the following steps:
  • C2 Constructing a cancer risk prediction model using machine learning, and then using the cancer risk prediction model to diagnose or assist in diagnosing cancer, pre-warning cancer before clinical symptoms, and/or distinguishing or assisting in distinguishing cancer from benign lesions.
  • n1 and n2 may be positive integers above 96 and 54, respectively, the same below.
  • the substance for detecting the differentially methylated region group may include a bisulfite reagent.
  • the present invention claims the use of a medium storing a method for constructing and using a cancer risk prediction model in any of the following:
  • the method for constructing and using the cancer risk prediction model includes the following steps:
  • C2 Constructing a cancer risk prediction model using machine learning, and then using the cancer risk prediction model to diagnose or assist in diagnosing cancer, pre-warning cancer before clinical symptoms, and/or distinguishing or assisting in distinguishing cancer from benign lesions.
  • the present invention claims a kit.
  • the kit claimed in the present invention can be any of the following:
  • Kit I contains:
  • control nucleic acid comprising sequences from the set of differentially methylated regions described above and having a methylation status associated with non-cancer patients.
  • Kit II contains:
  • control nucleic acid comprising sequences from the set of differentially methylated regions described above and having a methylation status associated with cancer patients.
  • Kit III containing substances for detecting the aforementioned differentially methylated region groups and a medium storing methods for constructing and using a cancer risk prediction model;
  • the method for constructing and using the cancer risk prediction model includes the following steps:
  • C2 Constructing a cancer risk prediction model using machine learning, and then using the cancer risk prediction model to diagnose or assist in diagnosing cancer, pre-warning cancer before clinical symptoms, and/or distinguishing or assisting in distinguishing cancer from benign lesions.
  • the substance for detecting the differentially methylated region group may include a bisulfite reagent.
  • the present invention claims a system.
  • the system claimed by the present invention includes:
  • the reagent for detecting the methylation level of the differentially methylated region group may include a bisulfite reagent.
  • (D2) a device, the device includes a unit X and a unit Y;
  • the unit X is used to establish a cancer risk prediction model, including a data acquisition module and a data analysis and processing module;
  • The data acquisition module is used to collect the methylation level data, obtained with (D1), from n1 cancer patient samples and n2 non-cancer patient samples for the aforementioned differentially methylated region group;
  • The data analysis and processing module can use the methylation level data collected by the data acquisition module from n1 cancer patient samples and n2 non-cancer patient samples for the differentially methylated region group described above as a training set, and build a cancer risk prediction model based on a machine learning method;
  • The unit Y can use the cancer risk prediction model and the methylation level data from the test subject sample for the differentially methylated region group described above to diagnose or assist in diagnosing cancer, to give early warning of cancer before clinical symptoms, and/or to distinguish or assist in distinguishing cancer from benign lesions.
  • the present invention claims the use of the aforementioned kit or the aforementioned system in any of the following:
  • the present invention claims a method of diagnosing or assisting in the diagnosis of cancer.
  • The method for diagnosing or assisting in diagnosing cancer as claimed in the present invention may include the following step: analyzing the methylation status of the group of differentially methylated regions described above in a sample from the test subject, so as to diagnose or assist in diagnosing cancer.
  • A test subject whose methylation status for the set of differentially methylated regions differs from that of samples from non-cancer patients is considered to have, or is suspected of having, cancer.
  • A test subject whose methylation status for the group of differentially methylated regions is the same as that of samples from cancer patients is considered to have, or is suspected of having, cancer.
  • the method for diagnosing or assisting in diagnosing cancer may include the following steps:
  • C2 Constructing a cancer risk prediction model by using a machine learning method, and then using the cancer risk prediction model to diagnose or assist in diagnosing cancer.
  • The present invention claims a method for early warning of cancer before clinical symptoms.
  • The method for early warning of cancer before clinical symptoms as claimed in the present invention may include the following step: analyzing the methylation status of the group of differentially methylated regions described above in a sample from the test subject, so as to give early warning of cancer before clinical symptoms.
  • A test subject whose methylation status for the set of differentially methylated regions differs from that of samples from non-cancer patients is regarded as being at high risk of cancer; otherwise, the subject is regarded as being at low risk of cancer.
  • A test subject whose methylation status for the group of differentially methylated regions is the same as that of samples from cancer patients is regarded as being at high risk of cancer; otherwise, the subject is regarded as being at low risk of cancer.
  • The method for early warning of cancer before clinical symptoms may include the following steps:
  • C2 Constructing a cancer risk prediction model using a machine learning method, and then using the cancer risk prediction model to give early warning of cancer before clinical symptoms.
  • the present invention claims a method of distinguishing or assisting in distinguishing cancer from benign lesions.
  • The method for distinguishing or assisting in distinguishing cancer from benign lesions as claimed in the present invention may include the following step: analyzing the methylation status of the group of differentially methylated regions described in claim 1 or 2 in a sample from the test subject, so as to distinguish or assist in distinguishing cancer from benign lesions.
  • A methylation status for the set of differentially methylated regions that differs from that of samples from non-cancer patients is regarded as indicating cancer; otherwise, it is regarded as indicating a benign lesion.
  • A methylation status for the group of differentially methylated regions that is the same as that of samples from cancer patients is regarded as indicating cancer; otherwise, it is regarded as indicating a benign lesion.
  • the method for distinguishing or assisting in distinguishing cancer and benign lesions may include the following steps:
  • C2 using a machine learning method to construct a cancer risk prediction model, and then using the cancer risk prediction model to distinguish or assist in distinguishing cancer from benign lesions.
  • the machine learning method may be a random forest method.
  • Specifically, the model is built with Python scikit-learn using the following parameters: maximum depth of 5, 50 trees, and feature importance > 0.15.
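The stated parameters map naturally onto scikit-learn's `RandomForestClassifier`. The sketch below is illustrative only and is not the patent's code: the data are synthetic random numbers rather than patient data, the number of features (20) and `random_state` are arbitrary, and the 0.15 cutoff is kept as an adjustable constant because the source does not state the scale of its importance values (scikit-learn's `feature_importances_` are normalized to sum to 1).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for a samples x DMRs matrix of methylation rates in [0, 1].
# These numbers are NOT from the source; they only make the sketch runnable.
n_cancer, n_control, n_dmrs = 140, 84, 20
X = rng.uniform(0.0, 0.3, size=(n_cancer + n_control, n_dmrs))
y = np.array([1] * n_cancer + [0] * n_control)  # 1 = cancer, 0 = non-cancer
X[y == 1, :3] += 0.4  # make the first three hypothetical DMRs informative

# Parameters named in the text: 50 trees, maximum depth 5.
model = RandomForestClassifier(n_estimators=50, max_depth=5, random_state=0)
model.fit(X, y)

# Keep features whose importance exceeds the cutoff (scale is an assumption).
IMPORTANCE_THRESHOLD = 0.15
selected = np.flatnonzero(model.feature_importances_ > IMPORTANCE_THRESHOLD)

# Predicted probability of the cancer class for the first five samples.
risk = model.predict_proba(X[:5])[:, 1]
```

Here `selected` holds the column indices of the retained features and `risk` holds per-sample scores between 0 and 1 that could be thresholded to make a call.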
  • the sample is a sample from which DNA can be extracted.
  • the sample includes, but is not limited to, plasma, tissue, saliva, urine, feces, and the like.
  • The method of analyzing the methylation status and the method of obtaining the methylation level data may include, but are not limited to, bisulfite conversion, PCR, methylation-specific PCR (MS-PCR), pyrosequencing, Sanger sequencing, high-throughput sequencing, third-generation sequencing or single-molecule sequencing, etc.
  • Targeted methylation high-throughput sequencing is performed on plasma cfDNA to obtain the methylation level data of the differentially methylated region group described above. Specifically, this includes the following steps: extracting cfDNA from plasma samples, constructing a methylation library, performing library hybridization capture, and high-throughput sequencing.
  • the analytical methods for these DMR methylation levels are related to computer software and/or computer hardware, including but not limited to determining the methylation status of DMRs, comparing the methylation status of DMRs, generating methylation standard curves, and determining Ct values, calculation of methylation rates of DMRs, determination of specificity and/or sensitivity of assays or markers, calculation of ROC curves and associated AUCs, sequence analysis, etc.
  • the non-cancer patient can be a healthy control or a patient with benign lesions.
  • The healthy control is a physical examination subject who reports no abnormality.
  • the cancer includes, but is not limited to, liver cancer, colorectal cancer, lung cancer, gastric cancer, or pancreatic cancer, and the like.
  • the cancer is liver cancer.
  • the benign lesions are benign liver lesions or liver cirrhosis.
  • the benign liver lesions are specifically liver hemangioma, liver abscess, liver cyst or liver focal nodular hyperplasia.
  • The liver cancer may be primary hepatocellular carcinoma or intrahepatic cholangiocarcinoma; more specifically, primary hepatocellular carcinoma or intrahepatic cholangiocarcinoma that has not received any form of anti-tumor therapy.
  • The liver cancer may be a liver cancer of BCLC stage 0, A, B, and/or C.
  • The aforementioned methylation level data is the methylation rate. That is, the methylation level of the differentially methylated region group described above is the ratio of methylated cytosines to all cytosines at the CpG sites in the corresponding DMR.
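The definition above amounts to a pooled ratio over the CpG sites of a region. A minimal sketch follows, assuming that each CpG site is summarized by a pair (methylated cytosine count, total cytosine count); the function name and the input layout are hypothetical and are not taken from the source.

```python
from typing import Iterable, Optional, Tuple


def dmr_methylation_rate(cpg_sites: Iterable[Tuple[int, int]]) -> Optional[float]:
    """Pooled methylation rate of one DMR.

    cpg_sites: (methylated_count, total_count) for each CpG site in the DMR,
    where total_count is the number of cytosine calls covering that site.
    Returns None when the region has no coverage.
    """
    sites = list(cpg_sites)  # allow any iterable, including generators
    methylated = sum(m for m, _ in sites)
    total = sum(t for _, t in sites)
    if total == 0:
        return None
    return methylated / total


# Two CpG sites, each covered 10 times, with 8 and 2 methylated calls.
example_rate = dmr_methylation_rate([(8, 10), (2, 10)])
```

Pooling the counts before dividing weights each site by its coverage; averaging per-site rates instead would be a different design choice that the text does not describe.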
  • Figure 1 is a flowchart of the liver cancer screening program (from the "Standards for the Diagnosis and Treatment of Primary Liver Cancer (2019 Edition)" announced by the National Health Commission).
  • Figure 2 shows 2658 differentially methylated regions (DMRs) found based on 44 pairs of liver cancer tumor tissues and adjacent tissues, of which 357 are hypermethylated (Hyper) DMRs and 2301 are hypomethylated (Hypo) DMRs.
  • DMRs differentially methylated regions
  • Figure 3 is a heat map of sample set I based on the methylation rates of 51 model DMRs.
  • Figure 4 shows the performance of the 51 DMR-based liver cancer methylation model in sample set I.
  • Figure 5 is a heatmap of the methylation rates in sample set II based on the 51 model DMRs.
  • Figure 6 shows the performance of the 51 DMR-based liver cancer methylation model in sample set II.
  • Figure 7 is a comparison of the performance of the liver cancer methylation model based on 51 DMRs and the traditional AFP detection for judging liver cancer.
  • The present invention performed high-throughput methylation sequencing on 44 pairs of liver cancer tissues and adjacent tissues and, through data analysis and calculation, found 2658 differentially methylated regions (DMRs) that may be related to the occurrence and development of liver cancer.
  • The present invention performed targeted high-throughput methylation sequencing on a total of 705 plasma cfDNA samples from 385 liver cancer patients, 259 healthy people, 36 patients with benign liver disease, and 25 liver cirrhosis patients, and collected data on the methylation levels of these DMRs in the cfDNA of liver cancer patients and healthy individuals.
  • The present invention constructs a cancer risk prediction model through screening under certain conditions and machine learning, and selects 51 DMRs as markers for liver cancer screening.
  • the following examples facilitate a better understanding of the present invention, but do not limit the present invention.
  • the experimental methods in the following examples are conventional methods unless otherwise specified.
  • the test materials used in the following examples were purchased from conventional biochemical reagent stores unless otherwise specified.
  • the quantitative tests in the following examples are all set to repeat the experiments three times, and the results are averaged.
  • This example describes the discovery of differentially methylated regions (DMRs) involved in hepatocarcinogenesis and the selection, from among them, of markers for liver cancer screening.
  • WGBS whole-genome bisulfite sequencing
  • The inclusion criteria for liver cancer samples were: pathologically diagnosed primary hepatocellular carcinoma or intrahepatic cholangiocarcinoma, no history of previous malignant tumors, and no antitumor therapy of any kind before surgery. Among them, stage 0 and stage A liver cancer samples should account for more than 60% of the total liver cancer samples. Healthy people were physical examination subjects who reported no abnormality.
  • DNA extraction was performed on 44 pairs of liver cancer tissue and adjacent tissue samples using the DNeasy Blood & Tissue Kit (Qiagen, #69506).
  • Take 200 ng of the extracted DNA and add 1 ng of unmethylated lambda DNA (PROMEGA, #D1521) for subsequent C-U conversion quality control.
  • DNA fragmentation and magnetic bead double size selection are both routine experimental operations. The specific parameters can be adjusted according to the model of the fragmentation instrument and the manufacturer's instructions.
  • cfDNA extraction was performed on liver cancer and healthy human plasma samples using the MagPure Circulating DNA Maxi Kit (MAGEN, #12917PJ-100). Take 10 ng of cfDNA and add 0.05 ng of unmethylated lambda DNA that has been fragmented and size-selected to about 160 bp, for subsequent C-U conversion quality control.
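The text states only that unmethylated lambda DNA is spiked in for C-U conversion quality control and does not give a formula. Because every cytosine in the spike-in is unmethylated, any lambda cytosine still read as C after bisulfite treatment reflects incomplete conversion, so a commonly used estimate is the fraction of lambda cytosines that were converted. The sketch below follows that idea; the function names and the 0.99 acceptance threshold are hypothetical and are not taken from the source.

```python
def conversion_rate(converted_c: int, unconverted_c: int) -> float:
    """Fraction of spike-in lambda cytosines that were converted (read as T).

    converted_c: lambda cytosine positions read as T after bisulfite treatment.
    unconverted_c: lambda cytosine positions still read as C.
    """
    total = converted_c + unconverted_c
    if total == 0:
        raise ValueError("no lambda DNA cytosine calls available")
    return converted_c / total


def passes_conversion_qc(rate: float, min_rate: float = 0.99) -> bool:
    # min_rate is a hypothetical acceptance threshold, not a value from the source.
    return rate >= min_rate


# 995 converted and 5 unconverted lambda cytosine calls.
example_conversion = conversion_rate(995, 5)
```

A library whose estimated conversion falls below the chosen threshold would typically be flagged, since unconverted cytosines inflate apparent methylation rates.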
  • Use the MGIEasy Whole Genome Methylation Library Preparation Kit (MGI, #1000005251) for library construction, and complete the "end repair & dA tail addition", "adapter ligation", "ligation product purification", "bisulfite treatment and purification", "PCR amplification", "PCR product purification" and "PCR product quality inspection" steps according to the instructions; the subsequent steps of the kit are not required.
  • Hybridization, capture, elution and PostPCR were performed using the SeqCap EZ Hybridization and Wash Kit (ROCHE, 5634253001) and the SeqCap Epi CpGiant Enrichment Kit (ROCHE, 7138911001). Because a sequencing instrument of the MGI platform is used, the Block used in the hybridization process and the PostPCR primers used in the PostPCR step must be the corresponding Block and PostPCR primers of the MGI platform (both from the MGIEasy Exome Capture Auxiliary Kit, MGI, #1000007743).
  • PE100 sequencing was performed using MGISEQ-2000 (MGI).
  • The MGISEQ-2000 high-throughput sequencing reagent kit (FCL PE100) (MGI, #1000012552) can be purchased for sequencing operations, or the sequencing can be entrusted to institutions that provide sequencing services.
  • In this example, targeted high-throughput sequencing was performed on the cfDNA of 140 liver cancer patients and 84 healthy people from sample set I. Random forest-based 10-fold cross-validation was used to model hypermethylated DMRs, and 51 DMRs that could be used as markers for liver cancer screening were selected according to feature importance (feature importance > 0.15) (Table 1). Model performance was evaluated on the validation set of each fold, resulting in an average validation-set sensitivity of 0.929, specificity of 0.894, and AUC of 0.96.
  • The specific operation corresponding to this result is as follows: divide the samples into 10 parts, and in each fold of the 10-fold cross-validation use 90% of the samples as the training set (for modeling) and 10% as the validation set (for verifying the model); the validation samples are different for each fold.
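The partitioning described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the patent's code: the shuffle seed is arbitrary, the function name is hypothetical, and the split is not stratified by class, since the source does not say whether stratification was used.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_splits(n_samples: int, k: int = 10, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for each of k folds.

    Samples are shuffled once and dealt into k parts; each part serves as the
    validation set exactly once, and the other k-1 parts form the training set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    parts = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = parts[i]
        train = [idx for j, part in enumerate(parts) if j != i for idx in part]
        yield train, validation


# Sample set I in the text has 140 + 84 = 224 samples.
splits = list(k_fold_splits(224, k=10))
```

With 224 samples, each validation part holds 22 or 23 samples (about 10%), and per-fold sensitivity, specificity and AUC can then be averaged across the ten folds.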
  • The DMR methylation rate (that is, the ratio of methylated cytosines to all cytosines at the CpG sites in the DMR) was calculated from the sequencing depth of each CpG site obtained by targeted high-throughput sequencing of cfDNA and the number of methylated cytosines.
  • The model is built with Python scikit-learn; the parameters are: maximum depth of 5, 50 trees, feature importance > 0.15.
  • Figure 2 presents a heatmap of the 2658 DMRs found based on 44 pairs of liver cancer tissue and adjacent tissue. Liver cancer tissue is on the left, and adjacent tissue is on the right. Hypermethylated DMRs are located in the upper part of the heatmap and hypomethylated DMRs in the lower part. Each cell represents the methylation rate of the corresponding DMR in the corresponding sample, and its range is 0-1. The closer the methylation rate is to 0, the darker the color. It can be seen from the figure that the discovered DMRs clearly distinguish liver cancer tissue from adjacent tissue.
  • Figure 3 presents a heatmap of intraregional methylation rates of 51 DMRs in sample set I of 140 HCC patients and 84 healthy individuals.
  • the horizontal axis is the sample, with healthy people on the far left, followed by liver cancer patients.
  • the vertical axis is the DMR, with the hypermethylated DMR on the top and the hypomethylated DMR on the bottom. It can be seen from the figure that the 51 DMRs screened can clearly distinguish the samples from liver cancer patients and healthy people.
  • Figure 4 shows the performance of the 51 DMR-based liver cancer methylation model for sample set I in this example.
  • the horizontal axis represents the false positive rate (1-specificity) and the vertical axis represents the sensitivity. It can be seen from the figure that the methylation model has a good judgment ability for the sample of this example.
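A curve with these axes can be computed directly from model scores. The following sketch is a generic illustration with made-up scores, not the patent's evaluation code: it sweeps a decision threshold over the scores, records (false positive rate, sensitivity) pairs, and integrates them with the trapezoidal rule to obtain the AUC.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (false_positive_rate, sensitivity) points, starting at (0, 0).

    labels: 1 for cancer, 0 for non-cancer; a sample is called positive when
    its score is greater than or equal to the threshold. Both classes must
    be present.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2 for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Made-up scores for six samples (three cancer, three non-cancer).
demo_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
demo_labels = [1, 1, 0, 1, 0, 0]
demo_auc = auc(roc_points(demo_scores, demo_labels))
```

In this toy case eight of the nine cancer/non-cancer pairs are ranked correctly, so the area is 8/9.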
  • The main purpose of this example is to verify the performance of these 51 DMRs for liver cancer screening in sample set II, which comprises plasma cell-free DNA samples from 245 patients with liver cancer, 36 patients with benign liver lesions, 175 healthy people and 25 patients with liver cirrhosis.
  • The inclusion criteria for liver cancer samples were: pathologically diagnosed primary hepatocellular carcinoma or intrahepatic cholangiocarcinoma, no history of previous malignant tumors, and no antitumor therapy of any kind before surgery.
  • Among them, stage 0 and stage A liver cancer samples should account for more than 60% of the total liver cancer samples.
  • Benign hepatic lesions include hepatic hemangioma, hepatic abscess, hepatic cyst, and focal nodular hyperplasia of the liver. Healthy people were physical examination samples with no complaints of abnormality.
  • Sample Set II does not contain any samples from Sample Set I in Example I.
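Because the two sample sets are disjoint, their group sizes should add up to the overall cohort described earlier (705 plasma cfDNA samples: 385 liver cancer, 259 healthy, 36 benign liver disease, 25 cirrhosis). The short check below uses only counts stated in the text; the dictionary keys are hypothetical labels.

```python
from collections import Counter

# Group sizes as stated in the text for each sample set.
SAMPLE_SET_I = Counter({"liver_cancer": 140, "healthy": 84})
SAMPLE_SET_II = Counter({
    "liver_cancer": 245,
    "benign_liver_lesion": 36,
    "healthy": 175,
    "cirrhosis": 25,
})

# Disjoint sets: per-group counts simply add.
combined = SAMPLE_SET_I + SAMPLE_SET_II
total_samples = sum(combined.values())
```

The combined counts are 385 liver cancer, 259 healthy, 36 benign liver lesion and 25 cirrhosis samples, 705 in total, which matches the overall cohort.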
  • Table 2 shows the median methylation rates of 51 DMRs in this model for liver cancer, healthy people, liver cirrhosis, and benign liver lesions in sample set II. It can be seen from the table that the selected DMR has obvious discrimination between liver cancer and non-HCC samples.
  • Table 3 shows the performance of the model on sample set II. Among them, the sensitivity in liver cancer samples was 81.3%, the specificity in healthy people was 96.9%, and the specificities in liver cirrhosis and benign liver lesions were 90.5% and 73.5%, respectively. It can be seen from the table that this model has a good ability to distinguish liver cancer, healthy people and other liver diseases.
  • Table 4 shows the performance of the model under different stages of liver cancer (BCLC stage). Sensitivity was 51.8% for stage 0, 82.5% for stage A, 91.7% for stage B, and 100% for stage C. In conventional AFP detection, the sensitivity of stage 0, stage A, stage B and stage C were 29.6%, 27.7%, 55.0% and 57.1%, respectively. It can be seen from the table that the performance of this model is significantly improved compared with the existing AFP.
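The stage-wise comparison can be made explicit with a small calculation. The sketch below uses only the sensitivities quoted in the paragraph above; the variable and function names are hypothetical.

```python
# Sensitivities (%) by BCLC stage, as quoted in the text.
MODEL_SENSITIVITY = {"0": 51.8, "A": 82.5, "B": 91.7, "C": 100.0}
AFP_SENSITIVITY = {"0": 29.6, "A": 27.7, "B": 55.0, "C": 57.1}


def sensitivity_gain(model: dict, afp: dict) -> dict:
    """Percentage-point difference (model minus AFP) for each stage."""
    return {stage: round(model[stage] - afp[stage], 1) for stage in model}


gains = sensitivity_gain(MODEL_SENSITIVITY, AFP_SENSITIVITY)
for stage, gain in gains.items():
    print(f"BCLC stage {stage}: +{gain} percentage points over AFP")
```

The differences are 22.2, 54.8, 36.7 and 42.9 percentage points for stages 0, A, B and C respectively. Note that, according to the text, the staged and AFP-tested sample counts differ (127 and 109 samples), so the two columns are not computed on exactly the same samples.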
  • Figure 5 presents a heatmap of the methylation rates, in the independent validation sample set, of the methylation markers considered in this analysis. It can be seen from the figure that there is a clear distinction between liver cancer and non-HCC samples.
  • Figure 6 shows the AUC of this liver cancer methylation rate model in sample set II.
  • Figure 7 shows the comparison of the performance of the methylation model and traditional AFP detection for judging liver cancer. It can be seen from the figure that the performance of the methylation model in each stage is significantly improved compared with the traditional AFP detection.
  • a total of 127 liver cancer samples have clinical stage information, and 109 of these 127 samples have AFP information.
  • the present invention provides 51 DMRs that can be used for liver cancer screening (Table 1). By detecting the methylation levels of these DMRs and analyzing the obtained data, the possibility of the subjects suffering from liver cancer can be predicted, so as to achieve the purpose of screening liver cancer in the general population or high-risk groups of liver cancer.
  • the invention provides an accurate, simple and economical early screening method for liver cancer, which can improve the detection rate of liver cancer, especially early liver cancer, in the high-risk population of liver cancer and the general physical examination population, thereby improving the survival rate of liver cancer patients, and saving A large number of medical expenses and reduce medical burden.


Abstract

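Methylation markers for liver cancer detection and diagnosis. Provided are 51 differentially methylated regions for diagnosing or assisting in diagnosing cancer. Provided is an accurate, simple, and economical means of early liver cancer screening, which can improve the detection rate of liver cancer, especially early liver cancer, among people at high risk of liver cancer and among the general physical-examination population, thereby improving the survival rate of liver cancer patients, saving substantial medical expenses, and reducing the medical burden.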

Description

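Methylation markers for liver cancer detection and diagnosis
Technical Field
The present invention relates to the field of biomedicine, and in particular to methylation markers for liver cancer detection and diagnosis.
Background Art
Liver cancer is among the cancers with the highest incidence and mortality worldwide. The situation is especially severe in China, where more than 50% of the world's liver cancer cases occur. At present, liver cancer screening relies mainly on serum alpha-fetoprotein (AFP) testing and ultrasound imaging, but these methods have low sensitivity or insufficient specificity for early liver cancer, and imaging is further limited by factors such as the examining physician's experience and the performance of the instrument. A large proportion of liver cancers are already at an advanced stage when discovered; treatment outcomes and prognosis for advanced liver cancer are poor, and the five-year survival rate is low. Establishing an accurate, simple, and economical method for early liver cancer screening is therefore of great significance.
When cells in the human body rupture or die, they release their DNA into the circulatory system; this is cell-free DNA (cfDNA). Likewise, when tumor cells rupture or die, they release circulating tumor DNA (ctDNA), which carries the genetic information of the tumor cells. By detecting the ctDNA mixed into the cfDNA and analyzing the mutations, epigenetic information, and other information it carries, the likelihood that a subject has cancer can be inferred.
Scientists have developed many high-throughput sequencing technologies with high molecule utilization, which have made it possible to detect trace mutation signals in plasma cell-free DNA and have driven the development of precise early cancer screening. Scientists first looked among gene mutations for biomarkers suitable for early cancer screening, but studies have shown that early screening based on mutation signals alone has limited effectiveness. Scientists therefore began to explore early cancer screening at the epigenetic level. DNA methylation is an important mechanism for regulating gene expression. It can regulate gene expression and silencing and has a major influence on tumor initiation and progression. Aberrant methylation of cancer-related genes often appears early in carcinogenesis, so DNA methylation signals are regarded as promising markers for early cancer screening.
Looking across companies in China and abroad that focus on early cancer screening technology: Grail currently relies mainly on targeted cfDNA sequencing, WGS, and WGBS, performing whole-genome sequencing on large numbers of cancer samples and non-cancer control samples to mine tumor-specific mutation and methylation markers. This strategy allows a fairly comprehensive study of the tumor genomic landscape, but the enormous cost of deep whole-genome sequencing is beyond what an ordinary research institution can bear. Guardant Health focuses on liquid biopsy and uses highly sensitive detection technology for early cancer screening research, but liquid biopsy still has many limitations for early screening, for example: mutation signals from early tumors are extremely weak, some gene mutations occur in several different cancer types, and clonal hematopoiesis interferes heavily with ctDNA detection. Using mutations alone as molecular markers is therefore quite limited. Genetron (泛生子) combines mutation and protein markers for detection, and that research also showed that multi-omics detection can effectively improve performance. Singlera Genomics (鹍远基因) and AnchorDx (基准医疗) focus on methylation detection. Changes in DNA methylation usually occur at multiple sites simultaneously, so they offer higher sensitivity than single-site gene mutations, and the tissue specificity of DNA methylation signals makes pan-cancer early screening possible. Methylation is therefore a highly suitable molecular marker for early cancer screening.
There are many DNA-methylation-based early cancer screening studies in China and abroad. For example, Dennis Lo (卢煜明), a prominent Hong Kong expert in the clinical application of molecular biology who is known as the "founder of non-invasive prenatal DNA testing", published a study in 2019 that used low-depth WGBS to detect methylation and copy number aberrations (CNA) in urine cfDNA; for bladder cancer detection this method achieved a sensitivity of 93.5% (specificity 95.8%). As another example, Anderson, B.W. et al. published a study in 2018 on the clinical validation of methylation markers for gastric cancer. That study first identified gastric-cancer-related candidate DNA methylation markers from the DNA methylome, then tested a large number of samples by methylation-specific PCR (MSP), and finally obtained a panel of 3 markers (ELMO1, ZNF569, C13orf18); the method achieved a sensitivity of 86% (specificity 95%, CI 71-95%). A growing number of reports demonstrate the great potential of DNA methylation markers in early cancer screening. Building on this body of research, developing a "lightweight" methylation-based detection method will accelerate the translation of methylation-based early cancer screening into clinical and industrial use.
A flowchart of the existing liver cancer screening scheme (from the "Guidelines for the Diagnosis and Treatment of Primary Liver Cancer (2019 edition)" issued by the National Health Commission) is shown in Figure 1. At present, liver cancer screening relies mainly on serum alpha-fetoprotein (AFP) testing and ultrasound imaging. These methods have low sensitivity or insufficient specificity for early liver cancer, and imaging is further limited by factors such as the examining physician's experience and the performance of the instrument. A large proportion of liver cancers are already at an advanced stage when discovered; treatment outcomes and prognosis for advanced liver cancer are poor, and the five-year survival rate is low.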
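Disclosure of the Invention
The present invention provides methylation markers for liver cancer detection and diagnosis in order to address the following problem effectively: "the incidence and mortality of liver cancer are high; early liver cancer is asymptomatic, and patients are usually at an intermediate or advanced stage when diagnosed, which greatly reduces the five-year survival rate; current screening methods for liver cancer are few and limited, rely mostly on ultrasound imaging, and are not sensitive enough to small liver cancer lesions; and serum alpha-fetoprotein (AFP), currently the only blood marker in general use for liver cancer screening, has low sensitivity or insufficient specificity for early liver cancer and cannot meet the requirements of large-scale liver cancer screening".
In a first aspect, the present invention claims a set of differentially methylated regions (DMRs).
The set of differentially methylated regions claimed by the present invention contains all or some of the following 51 differentially methylated regions (see Table 1 for details):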
(A1) positions 22140769-22140997 of chromosome 1;
(A2) positions 47909518-47911295 of chromosome 1;
(A3) positions 119522233-119522972 of chromosome 1;
(A4) positions 119525991-119526101 of chromosome 1;
(A5) positions 119526727-119527757 of chromosome 1;
(A6) positions 119531595-119533069 of chromosome 1;
(A7) positions 119535537-119535986 of chromosome 1;
(A8) positions 119542942-119543424 of chromosome 1;
(A9) positions 119549096-119550717 of chromosome 1;
(A10) positions 197882364-197882519 of chromosome 1;
(A11) positions 26624440-26625280 of chromosome 2;
(A12) positions 63282623-63283168 of chromosome 2;
(A13) positions 63283795-63284165 of chromosome 2;
(A14) positions 162279905-162280539 of chromosome 2;
(A15) positions 200326591-200327369 of chromosome 2;
(A16) positions 200333453-200333973 of chromosome 2;
(A17) positions 125075832-125076480 of chromosome 3;
(A18) positions 170137150-170137931 of chromosome 3;
(A19) positions 995761-996936 of chromosome 4;
(A20) positions 41875340-41875925 of chromosome 4;
(A21) positions 139047739-139048298 of chromosome 5;
(A22) positions 1624936-1625224 of chromosome 6;
(A23) positions 26271346-26271748 of chromosome 6;
(A24) positions 108488594-108488844 of chromosome 6;
(A25) positions 108492267-108492437 of chromosome 6;
(A26) positions 150285813-150286646 of chromosome 6;
(A27) positions 27207996-27208054 of chromosome 7;
(A28) positions 96636496-96636870 of chromosome 7;
(A29) positions 129418361-129418612 of chromosome 7;
(A30) positions 17271051-17271340 of chromosome 8;
(A31) positions 67873733-67874151 of chromosome 8;
(A32) positions 99961175-99961661 of chromosome 8;
(A33) positions 99985934-99986482 of chromosome 8;
(A34) positions 100616319-100616730 of chromosome 9;
(A35) positions 93646929-93647266 of chromosome 10;
(A36) positions 134597818-134599519 of chromosome 10;
(A37) positions 69517700-69518306 of chromosome 11;
(A38) positions 58021614-58021842 of chromosome 12;
(A39) positions 81102127-81102896 of chromosome 12;
(A40) positions 102247495-102248194 of chromosome 14;
(A41) positions 76630449-76631040 of chromosome 15;
(A42) positions 29297770-29298669 of chromosome 17;
(A43) positions 43047552-43047830 of chromosome 17;
(A44) positions 75368790-75370662 of chromosome 17;
(A45) positions 76739367-76740382 of chromosome 18;
(A46) positions 12305592-12306084 of chromosome 19;
(A47) positions 13210026-13210503 of chromosome 19;
(A48) positions 15342716-15343266 of chromosome 19;
(A49) positions 15344024-15344364 of chromosome 19;
(A50) positions 50721097-50722014 of chromosome 20;
(A51) positions 38220548-38221506 of chromosome 22.
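The physical positions of the 51 differentially methylated regions were determined by alignment against the human whole-genome sequence (version hg19).
The set of differentially methylated regions includes, but is not limited to, subsets selected from (A1)-(A51) above, small-scale substitutions, small-scale additions, and the like.
Further, the set of differentially methylated regions may consist of all or some of the 51 differentially methylated regions shown in (A1)-(A51) above.
In a specific embodiment of the present invention, the set of differentially methylated regions consists of the 51 differentially methylated regions shown in (A1)-(A51) above.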
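In a second aspect, the present invention claims the use of the set of differentially methylated regions described above as methylation markers in any of the following:
(B1) preparing a product for diagnosing or assisting in diagnosing cancer;
(B2) diagnosing or assisting in diagnosing cancer;
(B3) preparing a product for early warning of cancer before clinical symptoms appear;
(B4) early warning of cancer before clinical symptoms appear;
(B5) preparing a product for distinguishing or assisting in distinguishing cancer from benign lesions;
(B6) distinguishing or assisting in distinguishing cancer from benign lesions.
In a third aspect, the present invention claims the use of a substance for detecting the methylation level of the set of differentially methylated regions described above in any of the following:
(B1) preparing a product for diagnosing or assisting in diagnosing cancer;
(B2) diagnosing or assisting in diagnosing cancer;
(B3) preparing a product for early warning of cancer before clinical symptoms appear;
(B4) early warning of cancer before clinical symptoms appear;
(B5) preparing a product for distinguishing or assisting in distinguishing cancer from benign lesions;
(B6) distinguishing or assisting in distinguishing cancer from benign lesions.
In a fourth aspect, the present invention claims the use of a combination of a substance and a medium in any of the following:
(B1) preparing a product for diagnosing or assisting in diagnosing cancer;
(B2) diagnosing or assisting in diagnosing cancer;
(B3) preparing a product for early warning of cancer before clinical symptoms appear;
(B4) early warning of cancer before clinical symptoms appear;
(B5) preparing a product for distinguishing or assisting in distinguishing cancer from benign lesions;
(B6) distinguishing or assisting in distinguishing cancer from benign lesions;
the substance is a substance for detecting the set of differentially methylated regions described above;
the medium stores a method for constructing and using a cancer risk prediction model;
the method for constructing and using the cancer risk prediction model comprises the following steps:
(C1) constructing a training set comprising methylation level data for the set of differentially methylated regions described above from n1 cancer patient samples and n2 non-cancer patient samples;
(C2) constructing a cancer risk prediction model by a machine learning method, and then using the cancer risk prediction model to diagnose or assist in diagnosing cancer, to give early warning of cancer before clinical symptoms appear, and/or to distinguish or assist in distinguishing cancer from benign lesions.
In (C1), n1 and n2 may be positive integers of at least 96 and at least 54, respectively; the same applies below.
The substance for detecting the set of differentially methylated regions may include a bisulfite reagent.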
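In a fifth aspect, the present invention claims the use of a medium storing a method for constructing and using a cancer risk prediction model in any of the following:
(B1) preparing a product for diagnosing or assisting in diagnosing cancer;
(B2) diagnosing or assisting in diagnosing cancer;
(B3) preparing a product for early warning of cancer before clinical symptoms appear;
(B4) early warning of cancer before clinical symptoms appear;
(B5) preparing a product for distinguishing or assisting in distinguishing cancer from benign lesions;
(B6) distinguishing or assisting in distinguishing cancer from benign lesions;
the method for constructing and using the cancer risk prediction model comprises the following steps:
(C1) constructing a training set comprising methylation level data for the set of differentially methylated regions described above from n1 cancer patient samples and n2 non-cancer patient samples;
(C2) constructing a cancer risk prediction model by a machine learning method, and then using the cancer risk prediction model to diagnose or assist in diagnosing cancer, to give early warning of cancer before clinical symptoms appear, and/or to distinguish or assist in distinguishing cancer from benign lesions.
In a sixth aspect, the present invention claims a kit.
The kit claimed by the present invention may be any one of the following:
Kit I, comprising:
(a1) a bisulfite reagent; and
(a2) a control nucleic acid that comprises a sequence from the set of differentially methylated regions described above and has a methylation state associated with non-cancer patients.
Kit II, comprising:
(b1) a bisulfite reagent; and
(b2) a control nucleic acid that comprises a sequence from the set of differentially methylated regions described above and has a methylation state associated with cancer patients.
Kit III, containing a substance for detecting the set of differentially methylated regions described above and a medium storing a method for constructing and using a cancer risk prediction model;
the method for constructing and using the cancer risk prediction model comprises the following steps:
(C1) constructing a training set comprising methylation level data for the set of differentially methylated regions described above from n1 cancer patient samples and n2 non-cancer patient samples;
(C2) constructing a cancer risk prediction model by a machine learning method, and then using the cancer risk prediction model to diagnose or assist in diagnosing cancer, to give early warning of cancer before clinical symptoms appear, and/or to distinguish or assist in distinguishing cancer from benign lesions.
The substance for detecting the set of differentially methylated regions may include a bisulfite reagent.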
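In a seventh aspect, the present invention claims a system.
The system claimed by the present invention comprises:
(D1) reagents and/or instruments for detecting the methylation level of the set of differentially methylated regions described above;
the reagents for detecting the methylation level of the set of differentially methylated regions may include a bisulfite reagent;
(D2) a device comprising a unit X and a unit Y;
the unit X is used to build a cancer risk prediction model and comprises a data acquisition module and a data analysis and processing module;
the data acquisition module is used to collect the methylation level data, measured by (D1), for the set of differentially methylated regions described above from n1 cancer patient samples and n2 non-cancer patient samples;
the data analysis and processing module can take the methylation level data collected by the data acquisition module, for the set of differentially methylated regions described above from n1 cancer patient samples and n2 non-cancer patient samples, as a training set and construct a cancer risk prediction model based on the principles of a machine learning method;
the unit Y can, based on the cancer risk prediction model and on methylation level data for the set of differentially methylated regions described above from a sample of a subject to be tested, diagnose or assist in diagnosing cancer, give early warning of cancer before clinical symptoms appear, and/or distinguish or assist in distinguishing cancer from benign lesions.
In an eighth aspect, the present invention claims the use of the kit described above or the system described above in any of the following:
(B1) preparing a product for diagnosing or assisting in diagnosing cancer;
(B2) diagnosing or assisting in diagnosing cancer;
(B3) preparing a product for early warning of cancer before clinical symptoms appear;
(B4) early warning of cancer before clinical symptoms appear;
(B5) preparing a product for distinguishing or assisting in distinguishing cancer from benign lesions;
(B6) distinguishing or assisting in distinguishing cancer from benign lesions;
(B7) preparing a product for distinguishing or assisting in distinguishing liver cancer from liver cirrhosis;
(B8) distinguishing or assisting in distinguishing liver cancer from liver cirrhosis.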
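In a ninth aspect, the present invention claims a method for diagnosing or assisting in diagnosing cancer.
The method for diagnosing or assisting in diagnosing cancer claimed by the present invention may comprise the following step: analyzing the methylation state of the set of differentially methylated regions described above in a sample from a subject to be tested, thereby diagnosing or assisting in diagnosing cancer.
If the methylation state differs from that of the set of differentially methylated regions in samples from non-cancer patients, the subject is considered to have, or to be suspected of having, cancer.
Further, if the methylation state is the same as that of the set of differentially methylated regions in samples from cancer patients, the subject is considered to have, or to be suspected of having, cancer.
In a specific embodiment of the present invention, the method for diagnosing or assisting in diagnosing cancer may comprise the following steps:
(C1) constructing a training set comprising methylation level data for the set of differentially methylated regions described above from n1 cancer patient samples and n2 non-cancer patient samples;
(C2) constructing a cancer risk prediction model by a machine learning method, and then using the cancer risk prediction model to diagnose or assist in diagnosing cancer.
In a tenth aspect, the present invention claims a method for early warning of cancer before clinical symptoms appear.
The method for early warning of cancer before clinical symptoms appear claimed by the present invention may comprise the following step: analyzing the methylation state of the set of differentially methylated regions described above in a sample from a subject to be tested, thereby giving early warning of cancer before clinical symptoms appear.
If the methylation state differs from that of the set of differentially methylated regions in samples from non-cancer patients, the subject is regarded as being at high risk of cancer; otherwise the subject is regarded as being at low risk of cancer.
Further, if the methylation state is the same as that of the set of differentially methylated regions in samples from cancer patients, the subject is regarded as being at high risk of cancer; otherwise the subject is regarded as being at low risk of cancer.
In a specific embodiment of the present invention, the method for early warning of cancer before clinical symptoms appear may comprise the following steps:
(C1) constructing a training set comprising methylation level data for the set of differentially methylated regions described above from n1 cancer patient samples and n2 non-cancer patient samples;
(C2) constructing a cancer risk prediction model by a machine learning method, and then using the cancer risk prediction model to give early warning of cancer before clinical symptoms appear.
In an eleventh aspect, the present invention claims a method for distinguishing or assisting in distinguishing cancer from benign lesions.
The method for distinguishing or assisting in distinguishing cancer from benign lesions claimed by the present invention may comprise the following step: analyzing the methylation state of the set of differentially methylated regions of claim 1 or 2 in a sample from a subject to be tested, thereby distinguishing or assisting in distinguishing cancer from benign lesions.
If the methylation state differs from that of the set of differentially methylated regions in samples from non-cancer patients, the sample is regarded as cancer; otherwise it is regarded as a benign lesion.
Further, if the methylation state is the same as that of the set of differentially methylated regions in samples from cancer patients, the sample is regarded as cancer; otherwise it is regarded as a benign lesion.
In a specific embodiment of the present invention, the method for distinguishing or assisting in distinguishing cancer from benign lesions may comprise the following steps:
(C1) constructing a training set comprising methylation level data for the set of differentially methylated regions described above from n1 cancer patient samples and n2 non-cancer patient samples;
(C2) constructing a cancer risk prediction model by a machine learning method, and then using the cancer risk prediction model to distinguish or assist in distinguishing cancer from benign lesions.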
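In each of the foregoing aspects, the machine learning method may be the random forest method.
In a specific embodiment of the present invention, the model is built with python scikit-learn, with the following parameters: maximum depth 5, 50 trees, feature importance > 0.15.
In each of the foregoing aspects, the sample is a sample from which DNA can be extracted.
Further, the sample includes, but is not limited to, plasma, tissue, saliva, urine, feces, and the like.
In each of the foregoing aspects, the method for analyzing the methylation state and the method for obtaining the methylation level data may each include, but are not limited to, bisulfite conversion, PCR, methylation-specific PCR (MS-PCR), pyrosequencing, Sanger sequencing, high-throughput sequencing, or third-generation (single-molecule) sequencing.
In a specific embodiment of the present invention, targeted methylation high-throughput sequencing is performed on plasma cell-free DNA (plasma cfDNA) to obtain the methylation level data for the set of differentially methylated regions described above. This specifically comprises the following steps: extracting cfDNA from plasma samples, constructing a methylation library, performing hybridization capture of the library, and high-throughput sequencing.
The methods for analyzing the methylation levels of these DMRs involve computer software and/or computer hardware and include, but are not limited to, determining the methylation state of a DMR, comparing the methylation states of DMRs, generating methylation standard curves, determining Ct values, calculating the methylation rate of a DMR, determining the specificity and/or sensitivity of an assay or marker, calculating ROC curves and the associated AUC, sequence analysis, and the like.
In each of the foregoing aspects, the non-cancer patients may be healthy controls or patients with benign lesions. The healthy controls are physical-examination samples with no reported abnormalities.
In each of the foregoing aspects, the cancer includes, but is not limited to, liver cancer, colorectal cancer, lung cancer, gastric cancer, pancreatic cancer, and the like.
In a specific embodiment of the present invention, the cancer is liver cancer. Correspondingly, the benign lesion is a benign liver lesion or liver cirrhosis. The benign liver lesion is specifically hepatic hemangioma, liver abscess, liver cyst, or focal nodular hyperplasia of the liver.
In each of the foregoing aspects, the liver cancer may be primary hepatocellular carcinoma or intrahepatic cholangiocarcinoma, and further, for example, primary hepatocellular carcinoma or intrahepatic cholangiocarcinoma that has not received any form of anti-tumor treatment.
In each of the foregoing aspects, the liver cancer may be liver cancer of BCLC stage 0, A, B, and/or C.
In a specific embodiment of the present invention, the methylation level data described above is the methylation rate. That is, the methylation level of the set of differentially methylated regions described above is, for each DMR, the ratio of methylated cytosines to all cytosines at the CpG sites within that DMR.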
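The methylation rate defined above can be sketched as follows. This is a minimal illustration, assuming that the rate is pooled over all CpG sites in a DMR (total methylated cytosine calls divided by total depth), which is one reading of the definition; the function name and the example counts are hypothetical and not taken from the source.

```python
from typing import Iterable, Tuple


def dmr_methylation_rate(cpg_counts: Iterable[Tuple[int, int]]) -> float:
    """Return the pooled methylation rate of one DMR.

    cpg_counts holds one (methylated_calls, depth) pair per CpG site
    inside the DMR. Pooling across sites is an assumption of this sketch.
    """
    methylated = 0
    total = 0
    for meth, depth in cpg_counts:
        if meth < 0 or depth < meth:
            raise ValueError("need 0 <= methylated_calls <= depth")
        methylated += meth
        total += depth
    if total == 0:
        raise ValueError("no coverage in this DMR")
    return methylated / total


# Hypothetical counts for three CpG sites in one DMR: 20 methylated of 40 calls.
example_sites = [(8, 10), (3, 12), (9, 18)]
rate = dmr_methylation_rate(example_sites)
```

A per-site average (the mean of the individual site rates) would weight low-coverage sites more heavily; the source does not say which variant was used.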
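Brief Description of the Drawings
Figure 1 is a flowchart of the liver cancer screening scheme (from the "Guidelines for the Diagnosis and Treatment of Primary Liver Cancer (2019 edition)" issued by the National Health Commission).
Figure 2 shows the 2658 differentially methylated regions (DMRs) discovered from 44 pairs of liver cancer tumor tissue and adjacent non-tumor tissue, of which 357 are hypermethylated (Hyper) DMRs and 2301 are hypomethylated (Hypo) DMRs.
Figure 3 is a heatmap of the methylation rates of the 51 model DMRs in sample set I.
Figure 4 shows the performance of the liver cancer methylation model based on the 51 DMRs in sample set I.
Figure 5 is a heatmap of the methylation rates of the 51 model DMRs in sample set II.
Figure 6 shows the performance of the liver cancer methylation model based on the 51 DMRs in sample set II.
Figure 7 compares the performance of the liver cancer methylation model based on the 51 DMRs with that of traditional AFP testing for identifying liver cancer.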
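Best Mode for Carrying Out the Invention
First, by performing methylation high-throughput sequencing on 44 pairs of liver cancer tissue and adjacent non-tumor tissue and analyzing the resulting data, the present invention identified 2658 differentially methylated regions (DMRs) that may be associated with the initiation and progression of liver cancer.
Next, the present invention performed targeted methylation high-throughput sequencing on a total of 705 plasma cell-free DNA (plasma cfDNA) samples from 385 liver cancer patients, 259 healthy individuals, 36 patients with benign liver lesions, and 25 patients with liver cirrhosis, and collected methylation level data for these DMRs in the cfDNA of liver cancer patients and healthy individuals.
Finally, combining the collected data, the present invention selected 51 DMRs as markers for liver cancer screening through filtering under defined conditions and construction of a cancer risk prediction model by machine learning.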
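As a quick consistency check on the cohort described above, the per-group counts of sample set I (Example 1) and sample set II (Example 2) can be added up and compared with the stated totals of 385, 259, 36, 25, and 705. The counts are taken from the text; the dictionary layout and key names are illustrative choices of this sketch.

```python
# Counts as stated in Examples 1 and 2; key names are illustrative.
SAMPLE_SETS = {
    "I": {"liver_cancer": 140, "healthy": 84},
    "II": {"liver_cancer": 245, "healthy": 175, "benign_lesion": 36, "cirrhosis": 25},
}


def totals_by_group(sample_sets):
    """Sum the per-group counts across all sample sets."""
    totals = {}
    for groups in sample_sets.values():
        for group, count in groups.items():
            totals[group] = totals.get(group, 0) + count
    return totals


group_totals = totals_by_group(SAMPLE_SETS)
grand_total = sum(group_totals.values())
```

The sums agree with the totals given in the text, so the two sample sets together account for every plasma sample.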
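The following examples are provided for a better understanding of the present invention but do not limit it. Unless otherwise specified, the experimental methods in the following examples are conventional methods. Unless otherwise specified, the test materials used in the following examples were purchased from conventional biochemical reagent suppliers. All quantitative tests in the following examples were performed in triplicate, and the results were averaged.
Example 1. Discovery of 51 DMRs usable for liver cancer screening
This example describes the discovery of differentially methylated regions (DMRs) associated with the initiation and progression of liver cancer and, among them, of those usable as markers for liver cancer screening.
First, whole genome bisulfite sequencing (WGBS) was performed on DNA extracted from 44 pairs of frozen liver cancer tissue and frozen adjacent non-tumor tissue, and DMRs associated with the initiation and progression of liver cancer were identified by analyzing the data.
Second, targeted methylation high-throughput sequencing was performed on plasma cell-free DNA (plasma cfDNA) from the liver cancer patients and healthy individuals of sample set I, comprising cfDNA from 140 liver cancer patients and 84 healthy individuals. A liver cancer risk model was then built by machine learning, and the DMRs usable as markers for liver cancer screening were selected. The inclusion criteria for liver cancer samples were: pathologically confirmed primary hepatocellular carcinoma or intrahepatic cholangiocarcinoma, no history of malignant tumors, and no anti-tumor treatment of any kind before surgery. Stage 0 and stage A samples had to account for more than 60% of all liver cancer samples. The healthy individuals were physical-examination samples with no reported abnormalities.
Study subjects and samples: the study was approved by the BGI ethics committee and the ethics committee of Zhongshan Hospital, Fudan University. The frozen liver cancer tissue, frozen adjacent non-tumor tissue, plasma from liver cancer patients, and plasma from healthy individuals all came from Zhongshan Hospital, Fudan University.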
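I. DNA preparation
1. Extraction from tissue samples
DNA was extracted from the 44 pairs of liver cancer and adjacent non-tumor tissue samples using the DNeasy Blood & Tissue Kit (Qiagen, #69506).
2. DNA fragmentation
200 ng of extracted DNA was taken and 1 ng of Unmethylated lambda DNA (PROMEGA, #D1521) was added for subsequent quality control of C-to-U conversion. The DNA was sheared with an ultrasonic shearing instrument and then size-selected with AMPure XP (AGENCOURT, #A63882) so that fragment sizes were concentrated at around 100-300 bp. DNA fragmentation and double-sided bead size selection are routine laboratory procedures; the specific parameters can be adjusted according to the instrument model, following the manufacturer's instructions.
3. Extraction from plasma samples
cfDNA was extracted from the plasma samples of liver cancer patients and healthy individuals using the MagPure Circulating DNA Maxi Kit (MAGEN, #12917PJ-100). 10 ng of cfDNA was taken, and 0.05 ng of Unmethylated lambda DNA, sheared and size-selected to about 160 bp, was added for subsequent quality control of C-to-U conversion.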
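The unmethylated lambda DNA spike-in serves as a conversion control: because none of its cytosines are methylated, any cytosine that still reads as C after bisulfite treatment reflects incomplete C-to-U conversion. The sketch below shows this standard calculation under stated assumptions; the function name, the read counts, and the 0.99 acceptance threshold are hypothetical and are not given in the source.

```python
def conversion_rate(unconverted_c: int, total_c: int) -> float:
    """Estimate bisulfite conversion efficiency from the lambda spike-in.

    unconverted_c: cytosine positions on lambda reads still called as C.
    total_c: all cytosine positions covered on lambda reads.
    """
    if total_c <= 0:
        raise ValueError("total_c must be positive")
    if not 0 <= unconverted_c <= total_c:
        raise ValueError("need 0 <= unconverted_c <= total_c")
    return 1.0 - unconverted_c / total_c


# Hypothetical counts: 45 unconverted calls out of 10,000 covered cytosines.
lambda_rate = conversion_rate(unconverted_c=45, total_c=10_000)

# Hypothetical acceptance threshold; the source does not state one.
passes_qc = lambda_rate >= 0.99
```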
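II. Library construction
Libraries were constructed with the MGIEasy whole-genome methylation library preparation kit (MGI, #1000005251). Following the instructions, the steps "end repair & dA tailing", "adapter ligation", "purification of ligation products", "bisulfite treatment and purification", "PCR amplification", "purification of PCR products", and "quality control of PCR products" were completed; the subsequent steps of the kit are not required.
III. Library hybridization capture
Hybridization, capture, washing, and post-capture PCR (PostPCR) were performed with the SeqCap EZ Hybridization and Wash Kit (ROCHE, 5634253001) and the SeqCap Epi CpGiant Enrichment Kit (ROCHE, 7138911001). Because sequencing instruments of the MGI platform were used, the blockers used during hybridization and the PostPCR primers used in the PostPCR step must be those matching the MGI platform (both from the MGIEasy exome capture accessory kit, MGI, #1000007743).
IV. Sequencing
PE100 sequencing was performed on the MGISEQ-2000 (MGI). Sequencing can be carried out with the corresponding commercial kit (MGISEQ-2000RS high-throughput sequencing reagent set (FCL PE100), 1000012552) or outsourced to a sequencing service provider.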
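VI. Results
Whole-genome methylome sequencing was performed on DNA extracted from the liver cancer tissue and adjacent non-tumor tissue of 44 liver cancer patients. The sequencing data were analyzed and the methylation rate of each CpG site was calculated. By comparing the methylation rate of each CpG site between liver cancer tissue and adjacent non-tumor tissue, a hierarchical Bayesian method identified 2658 differentially methylated regions (DMRs) that may be associated with the initiation and progression of liver cancer, of which 357 are hypermethylated DMRs.
Targeted high-throughput sequencing was performed on the cfDNA of the 140 liver cancer patients and 84 healthy individuals of sample set I in this example. The hypermethylated DMRs were modeled using random-forest-based 10-fold cross-validation and filtered by feature importance (feature importance > 0.15), yielding 51 DMRs usable as markers for liver cancer screening (Table 1). Model performance was evaluated on the validation set of each fold, giving an average validation sensitivity of 0.929, a specificity of 0.894, and an AUC of 0.96. The procedure behind this result was as follows: the samples were divided into 10 parts, and in each fold of the 10-fold cross-validation 90% of the samples were used as the training set (for model building) and 10% as the validation set (for validating the model), with different validation samples in every fold. The model input was the DMR methylation rate (that is, the ratio of methylated cytosines to all cytosines at the CpG sites within the DMR), calculated from the per-CpG depth and methylated cytosine counts obtained by targeted high-throughput sequencing of cfDNA. The model was built with python scikit-learn, with the following parameters: maximum depth 5, 50 trees, feature importance > 0.15.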
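The 10-fold scheme described above can be sketched in plain Python. This is a minimal illustration of the splitting step only, assuming a simple shuffled (non-stratified) split; the function name and the seed are hypothetical, and the source does not say whether folds were stratified by class. In each round the held-out fold would serve as the validation set for a model trained on the remaining folds.

```python
import random
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (training, validation) index splits over n_samples samples.

    Every sample lands in exactly one validation fold, so the validation
    samples differ in every fold, as the text describes.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("need 2 <= k <= n_samples")
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]  # fold sizes differ by at most one
    splits = []
    for i in range(k):
        validation = sorted(folds[i])
        training = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        splits.append((training, validation))
    return splits


# Sample set I: 140 liver cancer patients plus 84 healthy individuals.
splits = k_fold_indices(140 + 84, k=10)
```

With 224 samples each validation fold holds 22 or 23 samples, roughly the 10% described in the text.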
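Figure 2 shows a heatmap of the 2658 DMRs discovered from the 44 pairs of liver cancer tissue and adjacent non-tumor tissue. Liver cancer tissue is on the left and adjacent non-tumor tissue on the right. Hypermethylated DMRs are at the top of the heatmap and hypomethylated DMRs at the bottom. Each cell represents the DMR methylation rate of the corresponding sample at that locus, ranging from 0 to 1. The closer the methylation rate is to 0, the darker the color. The figure shows that the discovered DMRs clearly separate liver cancer tissue from adjacent non-tumor tissue.
Figure 3 shows a heatmap of the within-region methylation rates of the 51 DMRs in the 140 liver cancer patients and 84 healthy individuals of sample set I. The horizontal axis shows samples, with healthy individuals on the far left followed by liver cancer patients. The vertical axis shows DMRs, with hypermethylated DMRs at the top and hypomethylated DMRs at the bottom. The figure shows that the 51 selected DMRs clearly separate liver cancer patients from healthy individuals.
Figure 4 shows the performance of the liver cancer methylation model based on the 51 DMRs on sample set I in this example. The horizontal axis represents the false positive rate (1 - specificity) and the vertical axis represents sensitivity. The figure shows that the methylation model classifies the samples of this example well.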
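The area under a ROC curve such as the one in Figure 4 equals the probability that a randomly chosen cancer sample receives a higher model score than a randomly chosen non-cancer sample, with ties counted as one half. The sketch below computes the AUC directly from that definition; the function name and the scores are hypothetical and are not data from the source.

```python
from typing import Sequence


def roc_auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores (predicted probability of cancer).
cancer_scores = [0.9, 0.8, 0.6, 0.4]
healthy_scores = [0.7, 0.3, 0.2, 0.1]
auc = roc_auc(cancer_scores, healthy_scores)  # 14 of 16 pairs ranked correctly
```

The pairwise loop is quadratic in the number of samples, which is fine at the cohort sizes described here; a rank-based formulation gives the same value faster.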
Table 1. 51 DMRs usable as markers for liver cancer screening
DMR No. Chromosome Start End DMR length (bp) Gene
1 chr1 22140769 22140997 229 LDLRAD2
2 chr1 47909518 47911295 1778  
3 chr1 119522233 119522972 740 TBX15
4 chr1 119525991 119526101 111 TBX15
5 chr1 119526727 119527757 1031 TBX15
6 chr1 119531595 119533069 1475 TBX15
7 chr1 119535537 119535986 450 TBX15
8 chr1 119542942 119543424 483  
9 chr1 119549096 119550717 1622  
10 chr1 197882364 197882519 156 LHX9
11 chr2 26624440 26625280 841 DRC1
12 chr2 63282623 63283168 546 OTX1
13 chr2 63283795 63284165 371 OTX1
14 chr2 162279905 162280539 635 TBR1
15 chr2 200326591 200327369 779 SATB2
16 chr2 200333453 200333973 521 SATB2
17 chr3 125075832 125076480 649 ZNF148
18 chr3 170137150 170137931 782 CLDN11
19 chr4 995761 996936 1176 IDUA
20 chr4 41875340 41875925 586  
21 chr5 139047739 139048298 560 CXXC5
22 chr6 1624936 1625224 289 GMDS
23 chr6 26271346 26271748 403 HIST1H2BI
24 chr6 108488594 108488844 251 NR2E1
25 chr6 108492267 108492437 171 NR2E1
26 chr6 150285813 150286646 834 ULBP1
27 chr7 27207996 27208054 59 HOXA10-AS
28 chr7 96636496 96636870 375 DLX6
29 chr7 129418361 129418612 252 MIR183
30 chr8 17271051 17271340 290 MTMR7
31 chr8 67873733 67874151 419 TCF24
32 chr8 99961175 99961661 487 OSR2
33 chr8 99985934 99986482 549  
34 chr9 100616319 100616730 412 FOXE1
35 chr10 93646929 93647266 338  
36 chr10 134597818 134599519 1702 NKX6
37 chr11 69517700 69518306 607 FGF19
38 chr12 58021614 58021842 229 B4GALNT1
39 chr12 81102127 81102896 770 MYF6
40 chr14 102247495 102248194 700 PPP2R5C
41 chr15 76630449 76631040 592 ISL2
42 chr17 29297770 29298669 900 DPRXP4
43 chr17 43047552 43047830 279 C1QL1
44 chr17 75368790 75370662 1873 SEPT9
45 chr18 76739367 76740382 1016 SALL3
46 chr19 12305592 12306084 493  
47 chr19 13210026 13210503 478 LYL1
48 chr19 15342716 15343266 551 EPHX3
49 chr19 15344024 15344364 341 EPHX3
50 chr20 50721097 50722014 918 ZFP64
51 chr22 38220548 38221506 959 GALR3
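Note: the physical positions in the table were determined by alignment against the human whole-genome sequence (version hg19). An empty "Gene" cell means that no annotated gene is available for that region.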
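The DMR lengths in Table 1 equal end - start + 1, which indicates that the start and end positions are 1-based and inclusive. The sketch below encodes three rows of the table and shows the length calculation together with a position lookup; the class and function names are illustrative and not part of the source.

```python
from typing import NamedTuple, Optional, Sequence


class DMR(NamedTuple):
    number: int
    chrom: str
    start: int  # 1-based, inclusive (hg19)
    end: int    # 1-based, inclusive (hg19)

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Three rows copied from Table 1 (DMR 1, 27, and 44).
DMRS = [
    DMR(1, "chr1", 22140769, 22140997),
    DMR(27, "chr7", 27207996, 27208054),
    DMR(44, "chr17", 75368790, 75370662),
]


def find_dmr(chrom: str, pos: int, dmrs: Sequence[DMR] = DMRS) -> Optional[DMR]:
    """Return the DMR containing a 1-based position, or None."""
    for dmr in dmrs:
        if dmr.chrom == chrom and dmr.start <= pos <= dmr.end:
            return dmr
    return None


lengths = [dmr.length for dmr in DMRS]
hit = find_dmr("chr7", 27208000)
miss = find_dmr("chr7", 27208055)
```

Tools that use 0-based half-open intervals, such as the BED format, would store the same regions with start - 1 as the start coordinate.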
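Example 2. Validation of the 51 DMRs for liver cancer screening
The main purpose of this example is to validate the liver cancer screening performance of the 51 DMRs in sample set II, which comprises plasma cell-free DNA from 245 liver cancer patients, 36 patients with benign liver lesions, 175 healthy individuals, and 25 patients with liver cirrhosis. The inclusion criteria for liver cancer samples were: pathologically confirmed primary hepatocellular carcinoma or intrahepatic cholangiocarcinoma, no history of malignant tumors, and no anti-tumor treatment of any kind before surgery. Stage 0 and stage A samples had to account for more than 60% of all liver cancer samples. Benign liver lesions include hepatic hemangioma, liver abscess, liver cyst, and focal nodular hyperplasia of the liver. The healthy individuals were physical-examination samples with no reported abnormalities. Sample set II contains no samples from sample set I of Example 1.
Study subjects and samples: same as in Example 1.
I. Methods
The liver cancer screening performance of the 51 DMRs was validated in sample set II by directly applying the corresponding model built in Example 1.
II. Results
Table 2 shows the median methylation rates of the model's 51 DMRs in the liver cancer, healthy, liver cirrhosis, and benign liver lesion samples of sample set II. The table shows that the selected DMRs clearly separate liver cancer samples from non-liver-cancer samples.
Table 3 shows the performance of the model on sample set II. Sensitivity in liver cancer samples reached 81.3%, specificity in healthy individuals was 96.9%, and specificity in liver cirrhosis and benign liver lesions reached 90.5% and 73.5%, respectively. The table shows that the model discriminates well between liver cancer, healthy individuals, and other liver lesions.
Table 4 shows the performance of the model by liver cancer stage (BCLC staging). Sensitivity was 51.8% for stage 0, 82.5% for stage A, 91.7% for stage B, and 100% for stage C. With conventional AFP testing, the sensitivities for stages 0, A, B, and C were 29.6%, 27.7%, 55.0%, and 57.1%, respectively. The table shows that the model performs markedly better than the existing AFP test.
Figure 5 shows a heatmap of the methylation rates of the methylation markers considered in this analysis in the independent validation sample set. The figure shows a clear distinction between liver cancer and non-liver-cancer samples.
Figure 6 shows the AUC of the liver cancer methylation-rate model on sample set II.
Figure 7 compares the performance of the methylation model with that of traditional AFP testing for identifying liver cancer. The figure shows that the methylation model performs markedly better than traditional AFP testing at every stage.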
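Table 2. Median methylation rates of the 51 DMRs by sample type in sample set II
Figure PCTCN2020108131-appb-000001
Figure PCTCN2020108131-appb-000002
Figure PCTCN2020108131-appb-000003
Table 3. Performance of the liver cancer prediction model on sample set II
Figure PCTCN2020108131-appb-000004
Note: an empty cell in the table means the metric does not apply, that is, sensitivity is defined only for liver cancer samples and specificity only for non-liver-cancer samples.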
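The note above reflects how the two metrics are defined: sensitivity is the fraction of cancer samples the model calls positive, and specificity is the fraction of non-cancer samples it calls negative, so each sample group contributes to only one of them. The sketch below shows the calculation with hypothetical counts; the underlying counts of Table 3 are in an image and are not reproduced here.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of cancer samples called positive."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no cancer samples")
    return true_positives / total


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of non-cancer samples called negative."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no non-cancer samples")
    return true_negatives / total


# Hypothetical counts, for illustration only.
cancer_sensitivity = sensitivity(true_positives=80, false_negatives=20)
healthy_specificity = specificity(true_negatives=95, false_positives=5)
```

Specificity can be reported separately for each non-cancer group (healthy, cirrhosis, benign lesions) by applying the same function to that group's counts, which is how Table 3 is organized.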
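Table 4. Performance of the liver cancer prediction model by stage
Figure PCTCN2020108131-appb-000005
Note: a total of 127 liver cancer samples have clinical stage information, and 109 of these 127 samples have AFP information.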
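The stage-wise comparison with AFP can be made explicit by computing the gain in sensitivity, in percentage points, for each BCLC stage. The sensitivities below are the values stated in the text for Table 4; the variable names are illustrative.

```python
# Sensitivities (%) by BCLC stage, as stated in the text for Table 4.
METHYLATION_MODEL = {"0": 51.8, "A": 82.5, "B": 91.7, "C": 100.0}
AFP = {"0": 29.6, "A": 27.7, "B": 55.0, "C": 57.1}

# Gain of the methylation model over AFP, in percentage points.
gain = {
    stage: round(METHYLATION_MODEL[stage] - AFP[stage], 1)
    for stage in METHYLATION_MODEL
}

largest_gain_stage = max(gain, key=gain.get)
```

On these figures the largest gain is at stage A (54.8 percentage points) and the smallest at stage 0 (22.2 percentage points). As the note above indicates, AFP information is available for only 109 of the 127 staged samples, so the two sensitivities may not be computed on identical sample sets.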
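Industrial Applicability
The present invention provides 51 DMRs that can be used for liver cancer screening (Table 1). By measuring the methylation levels of these DMRs and analyzing the resulting data, the likelihood that a subject has liver cancer can be predicted, which makes it possible to screen for liver cancer in the general population or in groups at high risk of liver cancer. The present invention provides an accurate, simple, and economical means of early liver cancer screening, which can improve the detection rate of liver cancer, especially early liver cancer, among people at high risk of liver cancer and among the general physical-examination population, thereby improving the survival rate of liver cancer patients, saving substantial medical expenses, and reducing the medical burden.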

Claims (44)

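  1. A set of differentially methylated regions, containing all or some of the following 51 differentially methylated regions:
    (A1) positions 22140769-22140997 of chromosome 1;
    (A2) positions 47909518-47911295 of chromosome 1;
    (A3) positions 119522233-119522972 of chromosome 1;
    (A4) positions 119525991-119526101 of chromosome 1;
    (A5) positions 119526727-119527757 of chromosome 1;
    (A6) positions 119531595-119533069 of chromosome 1;
    (A7) positions 119535537-119535986 of chromosome 1;
    (A8) positions 119542942-119543424 of chromosome 1;
    (A9) positions 119549096-119550717 of chromosome 1;
    (A10) positions 197882364-197882519 of chromosome 1;
    (A11) positions 26624440-26625280 of chromosome 2;
    (A12) positions 63282623-63283168 of chromosome 2;
    (A13) positions 63283795-63284165 of chromosome 2;
    (A14) positions 162279905-162280539 of chromosome 2;
    (A15) positions 200326591-200327369 of chromosome 2;
    (A16) positions 200333453-200333973 of chromosome 2;
    (A17) positions 125075832-125076480 of chromosome 3;
    (A18) positions 170137150-170137931 of chromosome 3;
    (A19) positions 995761-996936 of chromosome 4;
    (A20) positions 41875340-41875925 of chromosome 4;
    (A21) positions 139047739-139048298 of chromosome 5;
    (A22) positions 1624936-1625224 of chromosome 6;
    (A23) positions 26271346-26271748 of chromosome 6;
    (A24) positions 108488594-108488844 of chromosome 6;
    (A25) positions 108492267-108492437 of chromosome 6;
    (A26) positions 150285813-150286646 of chromosome 6;
    (A27) positions 27207996-27208054 of chromosome 7;
    (A28) positions 96636496-96636870 of chromosome 7;
    (A29) positions 129418361-129418612 of chromosome 7;
    (A30) positions 17271051-17271340 of chromosome 8;
    (A31) positions 67873733-67874151 of chromosome 8;
    (A32) positions 99961175-99961661 of chromosome 8;
    (A33) positions 99985934-99986482 of chromosome 8;
    (A34) positions 100616319-100616730 of chromosome 9;
    (A35) positions 93646929-93647266 of chromosome 10;
    (A36) positions 134597818-134599519 of chromosome 10;
    (A37) positions 69517700-69518306 of chromosome 11;
    (A38) positions 58021614-58021842 of chromosome 12;
    (A39) positions 81102127-81102896 of chromosome 12;
    (A40) positions 102247495-102248194 of chromosome 14;
    (A41) positions 76630449-76631040 of chromosome 15;
    (A42) positions 29297770-29298669 of chromosome 17;
    (A43) positions 43047552-43047830 of chromosome 17;
    (A44) positions 75368790-75370662 of chromosome 17;
    (A45) positions 76739367-76740382 of chromosome 18;
    (A46) positions 12305592-12306084 of chromosome 19;
    (A47) positions 13210026-13210503 of chromosome 19;
    (A48) positions 15342716-15343266 of chromosome 19;
    (A49) positions 15344024-15344364 of chromosome 19;
    (A50) positions 50721097-50722014 of chromosome 20;
    (A51) positions 38220548-38221506 of chromosome 22;
    the physical positions of the 51 differentially methylated regions were determined by alignment against the human whole-genome sequence hg19.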
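  2. The set of differentially methylated regions according to claim 1, characterized in that: the set of differentially methylated regions consists of all or some of the 51 differentially methylated regions shown in (A1)-(A51) of claim 1.
  3. Use of the set of differentially methylated regions of claim 1 or 2 as methylation markers in any of the following:
    (B1) preparing a product for diagnosing or assisting in diagnosing cancer;
    (B2) diagnosing or assisting in diagnosing cancer;
    (B3) preparing a product for early warning of cancer before clinical symptoms appear;
    (B4) early warning of cancer before clinical symptoms appear;
    (B5) preparing a product for distinguishing or assisting in distinguishing cancer from benign lesions;
    (B6) distinguishing or assisting in distinguishing cancer from benign lesions.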
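  4. Use of a substance for detecting the methylation level of the set of differentially methylated regions of claim 1 or 2 in any of the following:
    (B1) preparing a product for diagnosing or assisting in diagnosing cancer;
    (B2) diagnosing or assisting in diagnosing cancer;
    (B3) preparing a product for early warning of cancer before clinical symptoms appear;
    (B4) early warning of cancer before clinical symptoms appear;
    (B5) preparing a product for distinguishing or assisting in distinguishing cancer from benign lesions;
    (B6) distinguishing or assisting in distinguishing cancer from benign lesions.
  5. Use of a combination of a substance and a medium in any of the following:
    (B1) preparing a product for diagnosing or assisting in diagnosing cancer;
    (B2) diagnosing or assisting in diagnosing cancer;
    (B3) preparing a product for early warning of cancer before clinical symptoms appear;
    (B4) early warning of cancer before clinical symptoms appear;
    (B5) preparing a product for distinguishing or assisting in distinguishing cancer from benign lesions;
    (B6) distinguishing or assisting in distinguishing cancer from benign lesions;
    the substance is a substance for detecting the set of differentially methylated regions of claim 1 or 2;
    the medium stores a method for constructing and using a cancer risk prediction model;
    the method for constructing and using the cancer risk prediction model comprises the following steps:
    (C1) constructing a training set comprising methylation level data for the set of differentially methylated regions of claim 1 from n1 cancer patient samples and n2 non-cancer patient samples;
    (C2) constructing a cancer risk prediction model by a machine learning method, and then using the cancer risk prediction model to diagnose or assist in diagnosing cancer, to give early warning of cancer before clinical symptoms appear, and/or to distinguish or assist in distinguishing cancer from benign lesions.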
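  6. Use of a medium storing a method for constructing and using a cancer risk prediction model in any of the following:
    (B1) preparing a product for diagnosing or assisting in diagnosing cancer;
    (B2) diagnosing or assisting in diagnosing cancer;
    (B3) preparing a product for early warning of cancer before clinical symptoms appear;
    (B4) early warning of cancer before clinical symptoms appear;
    (B5) preparing a product for distinguishing or assisting in distinguishing cancer from benign lesions;
    (B6) distinguishing or assisting in distinguishing cancer from benign lesions;
    the method for constructing and using the cancer risk prediction model comprises the following steps:
    (C1) constructing a training set comprising methylation level data for the set of differentially methylated regions of claim 1 from n1 cancer patient samples and n2 non-cancer patient samples;
    (C2) constructing a cancer risk prediction model by a machine learning method, and then using the cancer risk prediction model to diagnose or assist in diagnosing cancer, to give early warning of cancer before clinical symptoms appear, and/or to distinguish or assist in distinguishing cancer from benign lesions.
  7. The use according to claim 5 or 6, characterized in that: the machine learning method is the random forest method.
  8. The use according to any one of claims 5-7, characterized in that: the sample is a sample from which DNA can be extracted.
  9. The use according to claim 8, characterized in that: the sample is plasma, tissue, saliva, urine, and/or feces.
  10. The use according to any one of claims 5-9, characterized in that: the method for obtaining the methylation level data is bisulfite conversion, PCR, methylation-specific PCR, pyrosequencing, Sanger sequencing, high-throughput sequencing, or third-generation sequencing or single-molecule sequencing.
  11. The use according to any one of claims 5-10, characterized in that: the non-cancer patients are healthy controls or patients with benign lesions.
  12. The use according to any one of claims 3-11, characterized in that: the cancer is liver cancer, colorectal cancer, lung cancer, gastric cancer, or pancreatic cancer.
  13. The use according to claim 12, characterized in that: the cancer is liver cancer; the benign lesion is a benign liver lesion or liver cirrhosis.
  14. The use according to claim 12 or 13, characterized in that: the liver cancer is primary hepatocellular carcinoma or intrahepatic cholangiocarcinoma.
  15. The use according to any one of claims 12-14, characterized in that: the liver cancer is liver cancer of BCLC stage 0, A, B, and/or C.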
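  16. A kit, comprising:
    (a1) a bisulfite reagent; and
    (a2) a control nucleic acid that comprises a sequence from the set of differentially methylated regions of claim 1 or 2 and has a methylation state associated with non-cancer patients.
  17. A kit, comprising:
    (b1) a bisulfite reagent; and
    (b2) a control nucleic acid that comprises a sequence from the set of differentially methylated regions of claim 1 or 2 and has a methylation state associated with cancer patients.
  18. A kit, containing a substance for detecting the set of differentially methylated regions of claim 1 or 2 and a medium storing a method for constructing and using a cancer risk prediction model;
    the method for constructing and using the cancer risk prediction model comprises the following steps:
    (C1) constructing a training set comprising methylation level data for the set of differentially methylated regions of claim 1 from n1 cancer patient samples and n2 non-cancer patient samples;
    (C2) constructing a cancer risk prediction model by a machine learning method, and then using the cancer risk prediction model to diagnose or assist in diagnosing cancer, to give early warning of cancer before clinical symptoms appear, and/or to distinguish or assist in distinguishing cancer from benign lesions.
  19. A system, comprising:
    (D1) reagents and/or instruments for detecting the methylation level of the set of differentially methylated regions of claim 1 or 2;
    (D2) a device comprising a unit X and a unit Y;
    the unit X is used to build a cancer risk prediction model and comprises a data acquisition module and a data analysis and processing module;
    the data acquisition module is used to collect the methylation level data, measured by (D1), for the set of differentially methylated regions of claim 1 from n1 cancer patient samples and n2 non-cancer patient samples;
    the data analysis and processing module can take the methylation level data collected by the data acquisition module, for the set of differentially methylated regions of claim 1 from n1 cancer patient samples and n2 non-cancer patient samples, as a training set and construct a cancer risk prediction model based on the principles of a machine learning method;
    the unit Y can, based on the cancer risk prediction model and on methylation level data for the set of differentially methylated regions of claim 1 from a sample of a subject to be tested, diagnose or assist in diagnosing cancer, give early warning of cancer before clinical symptoms appear, and/or distinguish or assist in distinguishing cancer from benign lesions.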
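  20. The kit according to claim 18 or the system according to claim 19, characterized in that: the machine learning method is the random forest method.
  21. The kit or system according to any one of claims 18-20, characterized in that: the sample is a sample from which DNA can be extracted.
  22. The kit or system according to claim 21, characterized in that: the sample is plasma, tissue, saliva, urine, and/or feces.
  23. The kit or system according to any one of claims 18-22, characterized in that: the method for obtaining the methylation level data is bisulfite conversion, PCR, methylation-specific PCR, pyrosequencing, Sanger sequencing, high-throughput sequencing, or third-generation sequencing or single-molecule sequencing.
  24. The kit or system according to any one of claims 16-23, characterized in that: the non-cancer patients are healthy controls or patients with benign lesions.
  25. The kit or system according to any one of claims 16-23, characterized in that: the cancer is liver cancer, colorectal cancer, lung cancer, gastric cancer, or pancreatic cancer.
  26. The kit or system according to claim 25, characterized in that: the cancer is liver cancer; the benign lesion is a benign liver lesion or liver cirrhosis.
  27. The kit or system according to claim 25 or 26, characterized in that: the liver cancer is primary hepatocellular carcinoma or intrahepatic cholangiocarcinoma.
  28. The kit or system according to any one of claims 25-27, characterized in that: the liver cancer is liver cancer of BCLC stage 0, A, B, and/or C.
  29. Use of the kit or system of any one of claims 16-28 in any of the following:
    (B1) preparing a product for diagnosing or assisting in diagnosing cancer;
    (B2) diagnosing or assisting in diagnosing cancer;
    (B3) preparing a product for early warning of cancer before clinical symptoms appear;
    (B4) early warning of cancer before clinical symptoms appear;
    (B5) preparing a product for distinguishing or assisting in distinguishing cancer from benign lesions;
    (B6) distinguishing or assisting in distinguishing cancer from benign lesions.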
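  30. A method for diagnosing or assisting in diagnosing cancer, comprising the following step: analyzing the methylation state of the set of differentially methylated regions of claim 1 or 2 in a sample from a subject to be tested, thereby diagnosing or assisting in diagnosing cancer.
  31. The method according to claim 30, characterized in that: the method comprises the following steps:
    (C1) constructing a training set comprising methylation level data for the set of differentially methylated regions of claim 1 or 2 from n1 cancer patient samples and n2 non-cancer patient samples;
    (C2) constructing a cancer risk prediction model by a machine learning method, and then using the cancer risk prediction model to diagnose or assist in diagnosing cancer.
  32. A method for early warning of cancer before clinical symptoms appear, comprising the following step: analyzing the methylation state of the set of differentially methylated regions of claim 1 or 2 in a sample from a subject to be tested, thereby giving early warning of cancer before clinical symptoms appear.
  33. The method according to claim 32, characterized in that: the method comprises the following steps:
    (C1) constructing a training set comprising methylation level data for the set of differentially methylated regions of claim 1 or 2 from n1 cancer patient samples and n2 non-cancer patient samples;
    (C2) constructing a cancer risk prediction model by a machine learning method, and then using the cancer risk prediction model to give early warning of cancer before clinical symptoms appear.
  34. A method for distinguishing or assisting in distinguishing cancer from benign lesions, comprising the following step: analyzing the methylation state of the set of differentially methylated regions of claim 1 or 2 in a sample from a subject to be tested, thereby distinguishing or assisting in distinguishing cancer from benign lesions.
  35. The method according to claim 34, characterized in that: the method comprises the following steps:
    (C1) constructing a training set comprising methylation level data for the set of differentially methylated regions of claim 1 or 2 from n1 cancer patient samples and n2 non-cancer patient samples;
    (C2) constructing a cancer risk prediction model by a machine learning method, and then using the cancer risk prediction model to distinguish or assist in distinguishing cancer from benign lesions.
  36. The method according to claim 31, 33 or 35, characterized in that: the machine learning method is the random forest method.
  37. The method according to any one of claims 30-35, characterized in that: the sample is a sample from which DNA can be extracted.
  38. The method according to claim 37, characterized in that: the sample is plasma, tissue, saliva, urine, or feces.
  39. The method according to any one of claims 30-38, characterized in that: the method for analyzing the methylation state and the method for obtaining the methylation level data are bisulfite conversion, PCR, methylation-specific PCR, pyrosequencing, Sanger sequencing, high-throughput sequencing, or third-generation sequencing or single-molecule sequencing.
  40. The method according to claim 31, 33 or 35, characterized in that: the non-cancer patients are healthy controls or patients with benign lesions.
  41. The method according to any one of claims 30-39, characterized in that: the cancer is liver cancer, colorectal cancer, lung cancer, gastric cancer, or pancreatic cancer.
  42. The method according to claim 42, characterized in that: the cancer is liver cancer; the benign lesion is a benign liver lesion or liver cirrhosis.
  43. The method according to claim 42 or 43, characterized in that: the liver cancer is primary hepatocellular carcinoma or intrahepatic cholangiocarcinoma.
  44. The method according to any one of claims 42-44, characterized in that: the liver cancer is liver cancer of BCLC stage 0, A, B, and/or C.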
PCT/CN2020/108131 2020-08-10 2020-08-10 Methylation markers for liver cancer detection and diagnosis WO2022032429A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080010767.6A CN113454219B (zh) 2020-08-10 2020-08-10 Methylation markers for liver cancer detection and diagnosis
PCT/CN2020/108131 WO2022032429A1 (zh) 2020-08-10 2020-08-10 Methylation markers for liver cancer detection and diagnosis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/108131 WO2022032429A1 (zh) 2020-08-10 2020-08-10 Methylation markers for liver cancer detection and diagnosis

Publications (1)

Publication Number Publication Date
WO2022032429A1 true WO2022032429A1 (zh) 2022-02-17

Family

ID=77808739

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/108131 WO2022032429A1 (zh) 2020-08-10 2020-08-10 Methylation markers for liver cancer detection and diagnosis

Country Status (2)

Country Link
CN (1) CN113454219B (zh)
WO (1) WO2022032429A1 (zh)


Also Published As

Publication number Publication date
CN113454219B (zh) 2024-03-08
CN113454219A (zh) 2021-09-28


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20948928

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20948928

Country of ref document: EP

Kind code of ref document: A1