WO2021169875A1 - Cancer gene methylation detection system and in vitro cancer detection method performed in the system - Google Patents

Cancer gene methylation detection system and in vitro cancer detection method performed in the system

Info

Publication number
WO2021169875A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
cancer
probe
probes
tissue
Prior art date
Application number
PCT/CN2021/077065
Other languages
English (en)
French (fr)
Inventor
韩晓亮
李永君
吴宁宁
郭媛媛
王建铭
Original Assignee
博尔诚(北京)科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 博尔诚(北京)科技有限公司
Priority to CN202180016643.3A (patent CN115176034A)
Publication of WO2021169875A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00: Libraries per se, e.g. arrays, mixtures
    • C40B40/04: Libraries containing only organic compounds
    • C40B40/06: Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • This application relates to a cancer gene methylation detection system, in particular to a system based on high-throughput sequencing (NGS) for detecting changes in the cell-free DNA methylation level of tumors of three luminal organs (esophageal cancer, gastric cancer, and colorectal cancer), and to the in vitro cancer detection method implemented in this system.
  • NGS: high-throughput (next-generation) sequencing
  • This technology can simultaneously sequence dozens to millions of DNA molecules, marking the arrival of the post-genome era.
  • different goals such as de novo sequencing and resequencing can be achieved, and the sequence of the genome, transcriptome, and methylome can also be analyzed through different pre-processing.
  • PCR: polymerase chain reaction
  • FISH: fluorescence in situ hybridization
  • gene chip technology has low price, high sensitivity, simple and fast operation, and high clinical popularity.
  • the gene chip throughput is higher than the former two, and it can detect a large number of genes at the same time.
  • the limitation is that it can only detect known genes or mutations, with low accuracy and high false positives.
  • the NGS technology has the characteristics of high throughput (detecting a large number of known and unknown genes and mutations at the same time), accurate results (higher accuracy than gene chips), fast detection speed, and low per-gene detection cost. It has gradually been applied to clinical disease detection, monitoring and other fields. As sequencing costs fall further, NGS will gradually replace other high-throughput technologies such as gene chips.
  • target sequence capture and sequencing has become a more mainstream choice.
  • This technology designs capture probes for the genomic regions of interest according to the detection requirements, enriches the target DNA fragments by complementary hybridization, and then performs NGS detection.
  • This strategy can be flexibly customized according to the purpose of research or detection, selecting only a small number of gene regions, increasing the depth of sequencing, and effectively discovering the mutation status of the target region, with high sensitivity and accuracy.
  • Liquid biopsy is a method of in vitro diagnosis. It uses non-invasive blood testing to monitor circulating tumor cells (CTC) or circulating tumor DNA (ctDNA) released into the blood by tumors or metastases. This technology effectively reduces invasiveness. It can sample all parts of the tumor and all metastases, overcoming tumor heterogeneity (whereas the current standard tissue biopsy reflects the characteristics of only one part of the tumor), and enables real-time monitoring with higher sensitivity. It may even be possible to predict the location of the lesion from genomic information, which can effectively prolong the survival time of patients. Based on these advantages, liquid biopsy can be used for early diagnosis of tumors, auxiliary staging, prognosis and recurrence monitoring, medication guidance and other purposes. Currently, the most commonly used form is liquid biopsy of cell-free DNA.
  • cfDNA: cell-free DNA
  • cfDNA is partially degraded endogenous DNA that exists in circulating blood in a free, extracellular form. Studies have shown that during the development of tumor tissues, after tumor cells undergo apoptosis, DNA is released into plasma and, after degradation, forms circulating tumor DNA (ctDNA).
  • the molecular genetic characteristics of ctDNA (such as gene mutation, microsatellite instability, and tumor suppressor gene promoter methylation) are consistent with tumor tissue DNA.
  • the collection of peripheral blood is simpler than other clinical detection methods and easy to roll out in community-level healthcare settings, and because it is non-invasive, it is more readily accepted by asymptomatic people. Therefore, the detection of changes in the level of ctDNA methylation in plasma can become one of the important methods for early screening and diagnosis of multiple cancers.
  • Using target sequence capture technology combined with NGS to monitor cfDNA variation and methylation level changes can realize early tumor screening, susceptibility gene monitoring, companion diagnosis, personalized medication, prognostic monitoring and other applications.
  • many companies at home and abroad have launched cancer detection panels of different scales for different application scenarios. Some panels have obtained FDA or CFDA approval numbers.
  • FoundationOne CDx launched by Foundation Medicine covers 324 genes
  • IMPACT launched by Memorial Sloan Kettering Cancer Center (MSK) covers 468 cancer-related genes
  • Burning Rock Medicine launched the "human EGFR/ALK/BRAF/KRAS gene mutation joint detection kit", and Nuohe Zhiyuan launched the "Human EGFR, KRAS, BRAF, PIK3CA, ALK, ROS1 gene mutation detection kit", etc.
  • Kunyuan Gene also launched the product "Chang Lesi" in 2018 to detect methylation levels in colorectal cancer.
  • this application provides a system for detecting cancer methylation and an in vitro cancer detection method implemented in the system, which can be used for esophageal cancer and gastric cancer.
  • the system and the method implemented in the system can: 1) be used in a non-invasive way for early screening of asymptomatic people and prognostic testing of cancer patients, reducing the harm caused by invasive testing; 2) increase the depth of sequencing and make the breadth of gene detection superior to existing technologies and products, with high throughput, faster detection speed, and a low detection cost when spread evenly across each gene.
  • this application involves the following:
  • a system for detecting cancer methylation, comprising:
  • a sample collection module, which is used to collect samples from subjects
  • a DNA extraction module, which is used to extract and purify DNA in the sample
  • a library building module, which is used to construct a DNA library for sequencing from the purified DNA sample
  • a transformation module, which is used to convert the constructed DNA library with bisulfite
  • a pre-PCR amplification module, which is used for pre-PCR amplification of the bisulfite-converted DNA library
  • a hybrid capture module, which is used to hybridize and capture the sample amplified by pre-PCR using the probe composition
  • a PCR amplification module, which is used to amplify the products captured by hybridization by PCR
  • a sequencing module, which is used to perform high-throughput second-generation sequencing on the hybridization-captured products after PCR amplification
  • a data analysis module, which is used to analyze the sequencing data and determine the methylation level of the sample
  • an interpretation module, which is used to determine the patient's condition based on the methylation level of the sample.
  • the sample collected from the subject is a plasma sample.
  • the probe composition used in the hybrid capture module includes:
  • M probes targeting tissue-specific regions.
  • the probe composition used in the hybrid capture module includes:
  • a hypomethylation probe that hybridizes to the cancer-specific region, pan-cancer-specific region, and tissue-specific region that are converted by bisulfite without CG methylation, and
  • a hypermethylation probe that hybridizes to the cancer-specific region, pan-cancer specific region, and tissue-specific region where the bisulfite-converted CG is fully methylated.
  • n is any integer selected from 1-192;
  • cancer-specific region is selected from any of Seq ID No.: 1-62.
  • m is any integer selected from 1-44;
  • tissue-specific region is selected from any of Seq ID No.: 65-83.
  • the hypomethylation probes include any of Seq ID No.: 84-180 (probes targeting cancer-specific regions), any of Seq ID No.: 181-182 (probes targeting pan-cancer-specific regions), and any of Seq ID No.: 183-204 (probes targeting tissue-specific regions).
  • the hypermethylation probes include any of Seq ID No.: 205-301 (probes targeting cancer-specific regions), any of Seq ID No.: 302-303 (probes targeting pan-cancer-specific regions), and any of Seq ID No.: 304-325 (probes targeting tissue-specific regions).
  • a pan-cancer interpretation module, which is used to compare against the pan-cancer-specific region database and perform interpretation to confirm whether the subject has cancer
  • a cancer interpretation module, which is used to compare against the cancer-specific region database and perform interpretation to further confirm that the cancer the subject has is one of several suspected cancers;
  • a tissue-specific interpretation module, which compares against the tissue-specific region database and performs interpretation to confirm the location of the subject's cancer.
  • the pan-cancer interpretation module includes the following interpretation: determining whether the methylation level of the pan-cancer-specific region Seq ID No.: 63 is greater than or equal to 55%, and determining whether the methylation level of the pan-cancer-specific region Seq ID No.: 64 is greater than or equal to 60%; if the methylation level of Seq ID No.: 63 is greater than or equal to 55% and the methylation level of Seq ID No.: 64 is greater than or equal to 60%, it is judged that the patient has cancer.
  • the cancer interpretation module includes the following interpretation: if, among the n probes targeting cancer-specific regions, the methylation level of the regions targeted by n1 probes is greater than or equal to their respective thresholds, and n1/n ≥ 20%, preferably n1/n ≥ 30%, it is judged that the patient has any one of the tissue-specific cancers.
  • the tissue-specific interpretation module includes the following interpretation: if, among the m probes targeting tissue-specific regions, the methylation level of the regions targeted by m1 probes is greater than or equal to their respective thresholds, the tissues targeted by those m1 probes are further analyzed and the number of such probes is counted for each tissue; the tissue in which the patient has cancer is judged to be the tissue with the largest number of probes whose methylation level is greater than or equal to the threshold.
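The pan-cancer rule and the n1/n rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary layout, and the per-probe thresholds are hypothetical (the text gives numeric cut-offs only for Seq ID No. 63 and 64), and the n1/n comparison is taken to be "greater than or equal to".

```python
from typing import Mapping

# Cut-offs stated in the text for the two pan-cancer regions (as fractions).
PAN_CANCER_THRESHOLDS = {"Seq ID No. 63": 0.55, "Seq ID No. 64": 0.60}


def pan_cancer_positive(levels: Mapping[str, float]) -> bool:
    """Return True only if BOTH pan-cancer regions meet their cut-offs."""
    return all(
        levels[region] >= cutoff
        for region, cutoff in PAN_CANCER_THRESHOLDS.items()
    )


def cancer_specific_positive(
    levels: Mapping[str, float],
    thresholds: Mapping[str, float],
    min_fraction: float = 0.20,  # 0.30 is the "preferred" value in the text
) -> bool:
    """Apply the n1/n rule over the n cancer-specific probes.

    `thresholds` maps each probe to its own cut-off; these per-probe values
    are hypothetical inputs because the text does not list them.
    """
    n = len(thresholds)
    if n == 0:
        return False
    # n1 = number of probes whose region reaches its own threshold.
    n1 = sum(
        1 for probe, cutoff in thresholds.items()
        if levels.get(probe, 0.0) >= cutoff
    )
    return n1 / n >= min_fraction
```

For instance, with five hypothetical probes that each have a 0.5 threshold, one positive probe satisfies the 20% rule but not the preferred 30% rule.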
  • An in vitro method for detecting cancer in a subject comprising the following steps:
  • probe composition to hybridize and capture samples amplified by pre-PCR
  • the probe composition comprising:
  • M probes targeting tissue-specific regions.
  • a hypomethylation probe that hybridizes to the cancer-specific region, pan-cancer-specific region, and tissue-specific region that are converted by bisulfite without CG methylation, and
  • a hypermethylation probe that hybridizes to the cancer-specific region, pan-cancer specific region, and tissue-specific region where the bisulfite-converted CG is fully methylated.
  • each probe in the probe composition has a length of 40-60 bp.
  • the length of each probe in the probe composition is 45-56 bp, preferably 50-56 bp, and more preferably 50 bp.
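To make the hypomethylation/hypermethylation distinction concrete, the following sketch performs an in silico bisulfite conversion of a target sequence for the two extreme cases named above: no CG methylation (every C reads as T) and fully methylated CG (C is kept only where it is followed by G). The function names and the example sequence are hypothetical, only one strand is modeled, and the sketch does not derive the complementary capture probe itself.

```python
def bisulfite_convert(seq: str, cpg_methylated: bool) -> str:
    """Simulate bisulfite conversion of one DNA strand.

    Unmethylated cytosines are read as T after conversion; methylated
    cytosines stay C. Here methylation is all-or-nothing at CpG sites,
    and non-CpG cytosines are always treated as unmethylated.
    """
    seq = seq.upper()
    converted = []
    for i, base in enumerate(seq):
        if base == "C":
            in_cpg = i + 1 < len(seq) and seq[i + 1] == "G"
            converted.append("C" if (cpg_methylated and in_cpg) else "T")
        else:
            converted.append(base)
    return "".join(converted)


def probe_length_ok(probe: str, lo: int = 40, hi: int = 60) -> bool:
    """Check the 40-60 bp length range stated for each probe."""
    return lo <= len(probe) <= hi


# Hypothetical 8-base target, far shorter than a real probe target.
target = "ACGTCCGA"
print(bisulfite_convert(target, cpg_methylated=False))  # ATGTTTGA
print(bisulfite_convert(target, cpg_methylated=True))   # ACGTTCGA
```

The two outputs differ only at CpG positions, which is why separate probes are needed for the unmethylated and fully methylated versions of the same region.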
  • n is any integer selected from 1-192;
  • cancer-specific region is selected from any of Seq ID No.: 1-62.
  • m is any integer selected from 1-44;
  • tissue-specific region is selected from any of Seq ID No.: 65-83.
  • the hypomethylation probes include any of Seq ID No.: 84-180 (probes targeting cancer-specific regions), any of Seq ID No.: 181-182 (probes targeting pan-cancer-specific regions), and any of Seq ID No.: 183-204 (probes targeting tissue-specific regions).
  • the hypermethylation probes include any of Seq ID No.: 205-301 (probes targeting cancer-specific regions), any of Seq ID No.: 302-303 (probes targeting pan-cancer-specific regions), and any of Seq ID No.: 304-325 (probes targeting tissue-specific regions).
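The Seq ID No. ranges listed above lend themselves to a small lookup table. The sketch below encodes them and classifies a probe ID by methylation state and region class; the names `PROBE_RANGES` and `classify_probe` are hypothetical, and the table simply restates the ranges given in the text.

```python
from typing import Optional, Tuple

# (methylation state, region class) -> inclusive Seq ID No. range from the text.
PROBE_RANGES = {
    ("hypo", "cancer"): range(84, 181),       # 84-180
    ("hypo", "pan_cancer"): range(181, 183),  # 181-182
    ("hypo", "tissue"): range(183, 205),      # 183-204
    ("hyper", "cancer"): range(205, 302),     # 205-301
    ("hyper", "pan_cancer"): range(302, 304), # 302-303
    ("hyper", "tissue"): range(304, 326),     # 304-325
}


def classify_probe(seq_id: int) -> Optional[Tuple[str, str]]:
    """Return (state, region class) for a probe Seq ID No., or None."""
    for key, ids in PROBE_RANGES.items():
        if seq_id in ids:
            return key
    return None


# Sanity check: each region class has as many hypo- as hypermethylation probes.
for region in ("cancer", "pan_cancer", "tissue"):
    assert len(PROBE_RANGES[("hypo", region)]) == len(PROBE_RANGES[("hyper", region)])
```

Counting the ranges gives 97, 2 and 22 probes per methylation state for the cancer-specific, pan-cancer-specific and tissue-specific classes respectively. IDs below 84 (the target regions themselves) are deliberately not classified as probes.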
  • step (1) includes the following interpretation: judging whether the methylation level of the pan-cancer-specific region Seq ID No.: 63 is greater than or equal to 55%, and judging whether the methylation level of the pan-cancer-specific region Seq ID No.: 64 is greater than or equal to 60%; if the methylation level of Seq ID No.: 63 is greater than or equal to 55% and the methylation level of Seq ID No.: 64 is greater than or equal to 60%, it is judged that the patient has cancer.
  • step (2) includes the following interpretation: if, among the n probes targeting cancer-specific regions, the methylation level of the regions targeted by n1 probes is greater than or equal to their respective thresholds, and n1/n ≥ 20%, preferably n1/n ≥ 30%, it is determined that the patient has any one of the tissue-specific cancers, and then the possibility of each cancer type is determined based on pattern recognition.
  • step (3) includes the following interpretation: if, among the m probes targeting tissue-specific regions, the methylation level of the regions targeted by m1 probes is greater than or equal to their respective thresholds, the tissues targeted by those m1 probes are further analyzed and the number of such probes is counted for each tissue; the tissue in which the patient has cancer is interpreted as the tissue with the largest number of probes whose methylation level is greater than or equal to the threshold.
  • Liquid biopsy is a non-invasive tumor detection method that can be applied to asymptomatic people and to patients from whom tissue samples cannot be obtained.
  • the average sequencing depth exceeds 5000X.
  • the screening of all high-incidence cancers can be completed at one time, which improves the detection efficiency.
  • the average price of each marker is lower than the existing single-marker detection in the market.
  • one Panel can complete the screening of major cancers, which saves the cost of probe synthesis, simplifies the experimental process, and facilitates the operation of experimenters.
  • Figure 1 is the implementation process of this application.
  • This application provides a cancer gene methylation detection system and a cancer in vitro detection method implemented in the system.
  • the probe is a single-stranded or double-stranded DNA with a length of tens to hundreds or even thousands of base pairs; taking advantage of molecular denaturation, renaturation and the high accuracy of complementary base pairing, it can bind by hydrogen bonding (hybridize) to complementary unlabeled single-stranded DNA or RNA in the sample to be tested, forming a double-stranded complex (hybrid).
  • detection systems such as autoradiography or enzyme-linked reaction can be used to detect the results of the hybridization reaction.
  • the region that complementarily binds or hybridizes with the probe is the specific target region. Multiple probes are combined into a probe composition.
  • a cancer-specific region refers to a region whose methylation level differs significantly from that of normal control tissues in a small number of cancer types.
  • a pan-cancer-specific region refers to a region whose methylation level differs significantly from that of normal control tissues in most cancer types.
  • a tissue-specific region refers to a region whose methylation level in a specific tissue differs significantly from that in other tissues.
  • DNA methylation refers to the methylation process that occurs at the 5th carbon atom of cytosine in CpG dinucleotides. As a stable modification state, it can be passed on to newly synthesized DNA during DNA replication through the action of DNA methyltransferase, which makes it an important epigenetic mechanism. Methylation of a gene promoter region can silence the transcription of tumor suppressor genes, so it is closely related to the occurrence of tumors. Abnormal methylation includes hypermethylation of tumor suppressor genes and DNA repair genes, hypomethylation of repetitive sequence DNA, and loss of imprinting of certain genes, which are related to the occurrence of a variety of tumors.
  • Panel refers to the probe composition used in this article.
  • This application relates to a system for detecting cancer methylation, which includes: a sample collection module, which is used to collect a sample of a subject; a DNA extraction module, which is used to extract and purify DNA in the sample; a library building module, which is used to construct a DNA library for sequencing from a purified DNA sample; a transformation module, which is used to convert the constructed DNA library with bisulfite; a pre-PCR amplification module, which is used for pre-PCR amplification of the bisulfite-converted DNA library; a hybrid capture module, which is used to hybridize and capture samples amplified by pre-PCR using the probe composition; a post-PCR amplification module, which is used to amplify by PCR the product captured by hybridization; a sequencing module, which is used to perform high-throughput second-generation sequencing on the hybridization-captured product after PCR amplification; a data analysis module, which is used to analyze the sequencing data to determine the sample's methylation level; an interpretation module, which is used to determine the patient's condition based
  • the sample collection module refers to a module integrated in the system that automatically collects the sample to be tested, that is, the blood or plasma of the subject, and holds the collected sample;
  • the DNA extraction module refers to a module that receives the sample to be tested and extracts DNA by known conventional methods, for example releasing the DNA by heat lysis and then filtering to remove impurities, before passing it to the next module for reaction and detection;
  • the library building module refers to a module that performs end repair and A-tailing on the target fragments extracted by the DNA extraction module, ligates them to adapters to form ligation products, and then amplifies, separates and purifies the ligation products to form a DNA sequencing library;
  • the transformation module refers to a module that transforms the DNA library constructed in the library building module with bisulfite using a known transformation method
  • the pre-PCR amplification module refers to a module that uses a known method to amplify the bisulfite-converted DNA library in the transformation module to an amount that can be hybridized and captured with the probe composition described herein;
  • the hybridization capture module refers to a module that uses the traditional liquid phase hybridization capture system and the probe composition described herein to perform hybridization capture of samples amplified by pre-PCR;
  • the PCR amplification module refers to a module that uses a known amplification method to amplify the product captured by hybridization
  • Sequencing module refers to a module that uses a conventional high-throughput second-generation sequencing platform, such as the Illumina platform, to sequence the products after PCR amplification and hybridization capture;
  • the data analysis module refers to a module that analyzes the sequencing data based on a multi-cancer methylation analysis database obtained by integrating public data and existing sequencing data to determine the methylation level of the sample;
  • the interpretation module refers to a module that, based on the methylation level data of the sample obtained from the data analysis module, uses a computer to perform pattern recognition against the database, constructs an overall cancer risk model, and analyzes the cancer risk and tissue of origin of the subject, thereby interpreting the patient's condition.
  • This application also relates to a system for detecting cancer methylation, which includes: a sample collection module, which is used to collect a sample of a subject; a DNA extraction module, which is used to extract and purify DNA in the sample; a library building module, which is used to construct a DNA library for sequencing from a purified DNA sample; a transformation module, which is used to convert the constructed DNA library with bisulfite; a pre-PCR amplification module, which is used for pre-PCR amplification of the bisulfite-converted DNA library; a hybridization capture module, which is used to hybridize and capture samples amplified by pre-PCR using a probe composition; a post-PCR amplification module, which is used to amplify by PCR the product captured by hybridization; a sequencing module, which is used to perform high-throughput second-generation sequencing on the hybridization-captured product after PCR amplification; a data analysis module, which is used to analyze the sequencing data and determine the sample's methylation level; a pan-cancer interpretation module,
  • For example, when m1 is 6 and it is further determined that 5 of the probes target the stomach and 1 probe targets the pancreas, it is judged that the tissue in which the patient has cancer is the stomach. When m1 is 6 and it is further determined that 3 of the probes target the stomach and 3 probes target the pancreas, it is judged that the tissues in which the patient has cancer are the stomach and the pancreas. When m1 is 6 and it is further determined that 2 probes target esophageal cancer, 2 probes target gastric cancer, and 2 probes target colorectal cancer, it is judged that the tissues in which the patient has cancer are the esophagus, stomach, and colorectum.
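The worked examples above amount to a majority count with ties reported jointly. A minimal Python sketch, assuming the input is simply the list of tissues targeted by the m1 probes that reached their thresholds (the function name and tissue labels are hypothetical):

```python
from collections import Counter
from typing import Iterable, List


def call_tissues(positive_probe_tissues: Iterable[str]) -> List[str]:
    """Return the tissue(s) with the most above-threshold probes.

    Ties are reported together, matching the 3-vs-3 and 2-2-2 examples
    in the text. An empty input yields an empty list.
    """
    counts = Counter(positive_probe_tissues)
    if not counts:
        return []
    top = max(counts.values())
    return sorted(tissue for tissue, c in counts.items() if c == top)


# The three m1 = 6 cases described in the text:
print(call_tissues(["stomach"] * 5 + ["pancreas"]))      # ['stomach']
print(call_tissues(["stomach"] * 3 + ["pancreas"] * 3))  # ['pancreas', 'stomach']
print(call_tissues(["esophagus"] * 2 + ["stomach"] * 2 + ["colorectum"] * 2))
# ['colorectum', 'esophagus', 'stomach']
```

Sorting the result is only for reproducible output; the text does not rank tied tissues.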
  • Seq ID No. 84, Seq ID No. 85 and Seq ID No. 86 are all hypomethylated probes that target the target region shown in Seq ID No. 1
  • Seq ID No. 205 , Seq ID No. 206 and Seq ID No. 207 are all hypermethylated probes that target the target region shown in Seq ID No. 1.
  • Seq ID No. 87 and Seq ID No. 88 are hypomethylated probes that target the target region shown in Seq ID No. 2
  • Seq ID No. 208 and Seq ID No. 209 are both hypermethylated probes that target the target region shown in Seq ID No. 2.
  • Seq ID No. 89 and Seq ID No. 90 are hypomethylated probes that target the target region shown in Seq ID No. 3, and Seq ID No. 210 and Seq ID No. 211 are both hypermethylated probes that target the target region shown in Seq ID No. 3.
  • Seq ID No. 91 is a hypomethylated probe targeting the target region shown in Seq ID No. 4
  • Seq ID No. 212 is a hypermethylated probe targeting the target region shown in Seq ID No. 4.
  • Seq ID No. 92 and Seq ID No. 93 are hypomethylated probes that target the target region shown in Seq ID No. 5, and Seq ID No. 213 and Seq ID No. 214 are both hypermethylated probes that target the target region shown in Seq ID No. 5.
  • Seq ID No. 94 is a hypomethylated probe that targets the target region shown in Seq ID No. 6, and Seq ID No. 215 is a hypermethylated probe that targets the target region shown in Seq ID No. 6.
  • Seq ID No. 95 is a hypomethylated probe that targets the target region shown in Seq ID No. 7, and Seq ID No. 216 is a hypermethylated probe that targets the target region shown in Seq ID No. 7.
  • Seq ID No. 96 and Seq ID No. 97 are hypomethylated probes that target the target region shown in Seq ID No. 8, and Seq ID No. 217 and Seq ID No. 218 are both targeted to Seq ID No.
  • Seq ID No. 98, Seq ID No. 99 and Seq ID No. 100 are all hypomethylated probes that target the target region shown in Seq ID No. 9, and Seq ID No. 219, Seq ID No. 220 and Seq ID No. 221 are all hypermethylated probes that target the target region shown in Seq ID No. 9.
  • Seq ID No. 101 is a hypomethylated probe targeting the target region shown in Seq ID No. 10
  • Seq ID No. 222 is a hypermethylated probe targeting the target region shown in Seq ID No. 10.
  • Seq ID No. 102 is a hypomethylated probe targeting the target region shown in Seq ID No. 11, and Seq ID No. 223 is a hypermethylated probe targeting the target region shown in Seq ID No. 11.
  • Seq ID No. 103 is a hypomethylated probe targeting the target region shown in Seq ID No. 12, and Seq ID No. 224 is a hypermethylated probe targeting the target region shown in Seq ID No. 12.
  • Seq ID No. 104 is a hypomethylated probe targeting the target region shown in Seq ID No. 13, and Seq ID No. 225 is a hypermethylated probe targeting the target region shown in Seq ID No. 13.
  • Seq ID No. 105 and Seq ID No. 106 are both hypomethylated probes that target the target region shown in Seq ID No. 14, and Seq ID No. 226 and Seq ID No. 227 are both targeted to Seq ID No.
  • Seq ID No. 109 and Seq ID No. 110 are both hypomethylated probes that target the target region shown in Seq ID No. 16, and Seq ID No. 230 and Seq ID No. 231 are both targeted to Seq ID No.
  • Seq ID No. 111 and Seq ID No. 112 are hypomethylated probes that target the target region shown in Seq ID No. 17, and Seq ID No. 232 and Seq ID No.
  • Seq ID No. 113 is a hypomethylated probe targeting the target region shown in Seq ID No. 18, and Seq ID No. 234 is a hypermethylated probe targeting the target region shown in Seq ID No. 18.
  • Seq ID No. 114 and Seq ID No. 115 are both hypomethylated probes that target the target region shown in Seq ID No. 19, and Seq ID No. 235 and Seq ID No. 236 are both targeted to Seq ID No.
  • Seq ID No. 116 is a hypomethylated probe targeting the target region shown in Seq ID No. 20, and Seq ID No. 237 is a hypermethylated probe targeting the target region shown in Seq ID No. 20.
  • Seq ID No. 117 is a hypomethylated probe targeting the target region shown in Seq ID No. 21, and Seq ID No. 238 is a hypermethylated probe targeting the target region shown in Seq ID No. 21.
  • Seq ID No. 118 is a hypomethylated probe targeting the target region shown in Seq ID No. 22, and Seq ID No. 239 is a hypermethylated probe targeting the target region shown in Seq ID No. 22.
  • Seq ID No. 119, Seq ID No. 120, and Seq ID No. 121 are all hypomethylated probes that target the target region shown in Seq ID No. 23, and Seq ID No. 240, Seq ID No. 241 and Seq ID No. 242 are all hypermethylated probes that target the target region shown in Seq ID No. 23.
  • Seq ID No. 122 and Seq ID No. 123 are both hypomethylated probes that target the target region shown in Seq ID No. 24, and Seq ID No. 243 and Seq ID No. 244 are both targeted to Seq ID No.
  • Seq ID No. 124 and Seq ID No. 125 are both hypomethylated probes that target the target region shown in Seq ID No. 25, and Seq ID No. 245 and Seq ID No. 246 are both targeted to Seq ID No.
  • Seq ID No. 126 and Seq ID No. 127 are both hypomethylated probes that target the target region shown in Seq ID No. 26, and Seq ID No. 247 and Seq ID No. 248 are both targeted to Seq ID No.
  • Seq ID No. 128 is a hypomethylated probe targeting the target region shown in Seq ID No. 27, and Seq ID No. 249 is a hypermethylated probe targeting the target region shown in Seq ID No. 27.
  • Seq ID No. 129 and Seq ID No. 130 are hypomethylated probes that target the target region shown in Seq ID No. 28, and Seq ID No. 250 and Seq ID No. 251 are both hypermethylated probes that target the target region shown in Seq ID No. 28.
  • Seq ID No. 131 and Seq ID No. 132 are hypomethylated probes that target the target region shown in Seq ID No. 29, and Seq ID No. 252 and Seq ID No. 253 are both hypermethylated probes that target the target region shown in Seq ID No. 29.
  • Seq ID No. 133 is a hypomethylated probe that targets the target region shown in Seq ID No. 30, and Seq ID No. 254 is a hypermethylated probe that targets the target region shown in Seq ID No. 30.
  • Seq ID No. 134 and Seq ID No. 135 are hypomethylated probes that target the target region shown in Seq ID No. 31, and Seq ID No. 255 and Seq ID No. 256 are both hypermethylated probes that target the target region shown in Seq ID No. 31.
  • Seq ID No. 136 is a hypomethylated probe targeting the target region shown in Seq ID No. 32, and Seq ID No. 257 is a hypermethylated probe targeting the target region shown in Seq ID No. 32.
  • Seq ID No. 137 and Seq ID No. 138 are hypomethylated probes that target the target region shown in Seq ID No. 33, and Seq ID No. 258 and Seq ID No. 259 are both hypermethylated probes that target the target region shown in Seq ID No. 33.
  • Seq ID No. 139 is a hypomethylated probe targeting the target region shown in Seq ID No. 34, and Seq ID No. 260 is a hypermethylated probe targeting the target region shown in Seq ID No. 34.
  • Seq ID No. 140 and Seq ID No. 141 are both hypomethylated probes that target the target region shown in Seq ID No. 35, and Seq ID No. 261 and Seq ID No. 262 are both hypermethylated probes that target the target region shown in Seq ID No. 35.
  • Seq ID No. 142 is a hypomethylated probe targeting the target region shown in Seq ID No. 36
  • Seq ID No. 263 is a hypermethylated probe targeting the target region shown in Seq ID No. 36.
  • Seq ID No. 143 and Seq ID No. 144 are hypomethylated probes that target the target region shown in Seq ID No. 37, and Seq ID No. 264 and Seq ID No. 265 are both hypermethylated probes that target the target region shown in Seq ID No. 37.
  • Seq ID No. 145 and Seq ID No. 146 are hypomethylated probes that target the target region shown in Seq ID No. 38, and Seq ID No. 266 and Seq ID No. 267 are both hypermethylated probes that target the target region shown in Seq ID No. 38.
  • Seq ID No. 147 is a hypomethylated probe targeting the target region shown in Seq ID No. 39
  • Seq ID No. 268 is a hypermethylated probe targeting the target region shown in Seq ID No. 39.
  • Seq ID No. 148 and Seq ID No. 149 are both hypomethylated probes that target the target region shown in Seq ID No. 40, and Seq ID No. 269 and Seq ID No. 270 are both hypermethylated probes that target the target region shown in Seq ID No. 40.
  • Seq ID No. 150 is a hypomethylated probe targeting the target region shown in Seq ID No. 41
  • Seq ID No. 271 is a hypermethylated probe targeting the target region shown in Seq ID No. 41.
  • Seq ID No. 151 is a hypomethylated probe targeting the target region shown in Seq ID No. 42
  • Seq ID No. 272 is a hypermethylated probe targeting the target region shown in Seq ID No. 42.
  • Seq ID No. 152 is a hypomethylated probe targeting the target region shown in Seq ID No. 43
  • Seq ID No. 273 is a hypermethylated probe targeting the target region shown in Seq ID No. 43.
  • Seq ID No. 153 is a hypomethylated probe targeting the target region shown in Seq ID No. 44
  • Seq ID No. 274 is a hypermethylated probe targeting the target region shown in Seq ID No. 44.
  • Seq ID No. 154 and Seq ID No. 155 are hypomethylated probes that target the target region shown in Seq ID No. 45, and Seq ID No. 275 and Seq ID No. 276 are both hypermethylated probes that target the target region shown in Seq ID No. 45.
  • Seq ID No.156 is a hypomethylated probe targeting the target region shown in Seq ID No.46
  • Seq ID No.277 is a hypermethylated probe targeting the target region shown in Seq ID No.46.
  • Seq ID No. 157 is a hypomethylated probe targeting the target region shown in Seq ID No. 47
  • Seq ID No. 278 is a hypermethylated probe targeting the target region shown in Seq ID No. 47.
  • Seq ID No. 158 is a hypomethylated probe targeting the target region shown in Seq ID No. 48
  • Seq ID No. 279 is a hypermethylated probe targeting the target region shown in Seq ID No. 48.
  • Seq ID No. 159 is a hypomethylated probe targeting the target region shown in Seq ID No. 49
  • Seq ID No. 280 is a hypermethylated probe targeting the target region shown in Seq ID No. 49.
  • Seq ID No. 160 is a hypomethylated probe targeting the target region shown in Seq ID No. 50
  • Seq ID No. 281 is a hypermethylated probe targeting the target region shown in Seq ID No. 50.
  • Seq ID No. 161 and Seq ID No. 162 are hypomethylated probes that target the target region shown in Seq ID No. 51, and Seq ID No. 282 and Seq ID No. 283 are both hypermethylated probes that target the target region shown in Seq ID No. 51.
  • Seq ID No. 163 is a hypomethylated probe targeting the target region shown in Seq ID No. 52, and Seq ID No. 284 is a hypermethylated probe targeting the target region shown in Seq ID No. 52.
  • Seq ID No. 164 and Seq ID No. 165 are hypomethylated probes that target the target region shown in Seq ID No. 53, and Seq ID No. 285 and Seq ID No. 286 are both hypermethylated probes that target the target region shown in Seq ID No. 53.
  • Seq ID No.166 is a hypomethylated probe targeting the target region shown in Seq ID No.54
  • Seq ID No.287 is a hypermethylated probe targeting the target region shown in Seq ID No.54.
  • Seq ID No. 167 and Seq ID No. 168 are hypomethylated probes that target the target region shown in Seq ID No. 55, and Seq ID No. 288 and Seq ID No. 289 are both hypermethylated probes that target the target region shown in Seq ID No. 55.
  • Seq ID No. 169 and Seq ID No. 170 are both hypomethylated probes that target the target region shown in Seq ID No. 56, and Seq ID No. 290 and Seq ID No. 291 are both hypermethylated probes that target the target region shown in Seq ID No. 56.
  • Seq ID No. 171 and Seq ID No. 172 are hypomethylated probes that target the target region shown in Seq ID No. 57, and Seq ID No. 292 and Seq ID No. 293 are both hypermethylated probes that target the target region shown in Seq ID No. 57.
  • Seq ID No. 173 and Seq ID No. 174 are both hypomethylated probes that target the target region shown in Seq ID No. 58, and Seq ID No. 294 and Seq ID No. 295 are both hypermethylated probes that target the target region shown in Seq ID No. 58.
  • Seq ID No. 175 and Seq ID No. 176 are hypomethylated probes that target the target region shown in Seq ID No. 59, and Seq ID No. 296 and Seq ID No. 297 are both hypermethylated probes that target the target region shown in Seq ID No. 59.
  • Seq ID No. 177 is a hypomethylated probe targeting the target region shown in Seq ID No. 60, and Seq ID No. 298 is a hypermethylated probe targeting the target region shown in Seq ID No. 60.
  • Seq ID No. 178 and Seq ID No. 179 are hypomethylated probes that target the target region shown in Seq ID No. 61, and Seq ID No. 299 and Seq ID No. 300 are both hypermethylated probes that target the target region shown in Seq ID No. 61.
  • Seq ID No. 180 is a hypomethylated probe targeting the target region shown in Seq ID No. 62
  • Seq ID No. 301 is a hypermethylated probe targeting the target region shown in Seq ID No. 62.
  • Seq ID No. 181 is a hypomethylated probe targeting the target region shown in Seq ID No. 63, and Seq ID No. 302 is a hypermethylated probe targeting the target region shown in Seq ID No. 63.
  • Seq ID No. 182 is a hypomethylated probe targeting the target region shown in Seq ID No. 64
  • Seq ID No. 303 is a hypermethylated probe targeting the target region shown in Seq ID No. 64.
  • Seq ID No. 183 is a hypomethylated probe targeting the target region shown in Seq ID No. 65
  • Seq ID No. 304 is a hypermethylated probe targeting the target region shown in Seq ID No. 65.
  • Seq ID No. 184 is a hypomethylated probe targeting the target region shown in Seq ID No. 66
  • Seq ID No. 305 is a hypermethylated probe targeting the target region shown in Seq ID No. 66.
  • Seq ID No. 185 is a hypomethylated probe targeting the target region shown in Seq ID No. 67
  • Seq ID No. 306 is a hypermethylated probe targeting the target region shown in Seq ID No. 67
  • Seq ID No. 186 is a hypomethylated probe targeting the target region shown in Seq ID No. 68
  • Seq ID No. 307 is a hypermethylated probe targeting the target region shown in Seq ID No. 68.
  • Seq ID No. 187 is a hypomethylated probe targeting the target region shown in Seq ID No. 69
  • Seq ID No. 308 is a hypermethylated probe targeting the target region shown in Seq ID No. 69.
  • Seq ID No.188 is a hypomethylated probe targeting the target region shown in Seq ID No.70
  • Seq ID No.309 is a hypermethylated probe targeting the target region shown in Seq ID No.70
  • Seq ID No. 189 is a hypomethylated probe targeting the target region shown in Seq ID No. 71
  • Seq ID No. 310 is a hypermethylated probe targeting the target region shown in Seq ID No. 71
  • Seq ID No. 190 is a hypomethylated probe targeting the target region shown in Seq ID No. 72
  • Seq ID No. 311 is a hypermethylated probe targeting the target region shown in Seq ID No. 72
  • Seq ID No. 191 is a hypomethylated probe targeting the target region shown in Seq ID No. 73
  • Seq ID No. 312 is a hypermethylated probe targeting the target region shown in Seq ID No. 73.
  • Seq ID No. 192 is a hypomethylated probe targeting the target region shown in Seq ID No. 74
  • Seq ID No. 313 is a hypermethylated probe targeting the target region shown in Seq ID No. 74.
  • Seq ID No. 193 is a hypomethylated probe that targets the target region shown in Seq ID No. 75
  • Seq ID No. 314 is a hypermethylated probe that targets the target region shown in Seq ID No. 75.
  • Seq ID No. 194 is a hypomethylated probe targeting the target region shown in Seq ID No. 76, and Seq ID No. 315 is a hypermethylated probe targeting the target region shown in Seq ID No. 76.
  • Seq ID No. 195 is a hypomethylated probe targeting the target region shown in Seq ID No. 77
  • Seq ID No. 316 is a hypermethylated probe targeting the target region shown in Seq ID No. 77.
  • Seq ID No. 196 is a hypomethylated probe targeting the target region shown in Seq ID No. 78
  • Seq ID No. 317 is a hypermethylated probe targeting the target region shown in Seq ID No. 78.
  • Seq ID No.197 is a hypomethylated probe targeting the target region shown in Seq ID No.79
  • Seq ID No.318 is a hypermethylated probe targeting the target region shown in Seq ID No.79.
  • Seq ID No. 198 and Seq ID No. 199 are hypomethylated probes that target the target region shown in Seq ID No. 80, and Seq ID No. 319 and Seq ID No. 320 are both hypermethylated probes that target the target region shown in Seq ID No. 80.
  • Seq ID No. 200, Seq ID No. 201, and Seq ID No. 202 are all hypomethylated probes that target the target region shown in Seq ID No. 81, and Seq ID No. 321, Seq ID No. 322, and Seq ID No. 323 are all hypermethylated probes that target the target region shown in Seq ID No. 81.
  • Seq ID No. 203 is a hypomethylated probe targeting the target region shown in Seq ID No. 82
  • Seq ID No. 324 is a hypermethylated probe targeting the target region shown in Seq ID No. 82.
  • Seq ID No. 204 is a hypomethylated probe targeting the target region shown in Seq ID No. 83
  • Seq ID No. 325 is a hypermethylated probe targeting the target region shown in Seq ID No. 83.
  • Table 1 shows the target sequence targeted by the probe.
  • The methylation-level threshold of each of the two pan-cancer markers is set so that more than 50% of cancer samples meet or exceed it, while the corresponding normal controls fall below it.
  • the ratio is adjusted to 20% to improve the detection of early cancer.
  • By counting the methylation levels of all tissue markers and analyzing the tissues indicated by the tissue markers that are greater than or equal to the threshold, the possible tissue sources are finally given.
  • The threshold of the methylation level of the pan-cancer specific region Seq ID No.: 63 is 55%, so when the detection result is greater than or equal to 55%, the methylation level of that pan-cancer specific region is considered to be greater than or equal to the threshold;
  • The threshold of the methylation level of the pan-cancer specific region Seq ID No.: 64 is 60%, so when the detection result is greater than or equal to 60%, the methylation level of that pan-cancer specific region is considered to be greater than or equal to the threshold.
  • If, among the n probes targeting the cancer-specific regions, the methylation levels of the regions targeted by n1 probes are greater than or equal to their respective thresholds and n1/n ≥ 20%, preferably n1/n ≥ 30%, the patient is interpreted as having any one of the tissue-specific cancers.
  • Table 1 shows the respective thresholds of cancer-specific regions. For example, the region shown in Seq ID No.: 48 has a threshold of 0.35. Therefore, when detecting with a probe targeting this region, if the detection result is greater than or equal to 0.35, the methylation level of the region detected by the probe is greater than or equal to the threshold.
  • the methylation thresholds of other cancer-specific regions can be found in Table 1.
  • If, among the m probes targeting the tissue-specific regions, the methylation levels of the regions targeted by m1 probes are greater than or equal to their respective thresholds, the tissues targeted by those m1 probes are analyzed further and the number of probes at or above the threshold is counted for each tissue; the tissue in which the patient has cancer is determined to be the tissue with the largest number of probes whose methylation level is greater than or equal to the threshold.
  • Table 1 shows the respective thresholds of tissue-specific regions. For example, the region shown in Seq ID No.: 67 has a threshold of 0.29. Therefore, when detecting with a probe targeting this region, if the detection result is greater than or equal to 0.29, the methylation level of the region detected by the probe is greater than or equal to the threshold.
  • the methylation thresholds of other tissue-specific regions can be found in Table 1.
  • a sample is collected from the subject.
  • the subject may be a subject suspected of having cancer, or a subject already suffering from cancer.
  • the cancer may be esophageal cancer, gastric cancer, or colorectal cancer.
  • the sample can be blood or plasma.
  • DNA is extracted and purified from a sample, and the DNA can be gDNA or cfDNA.
  • PCR is used to amplify the bisulfite-converted DNA library; PCR amplification increases the amount of bisulfite-converted DNA until it reaches the amount that can be hybridized with the probe composition.
  • a probe composition is used to hybridize and capture a sample.
  • the probe composition includes two probes targeting pan-cancer specific regions, n probes targeting cancer-specific regions, and m probes targeting tissue-specific regions.
  • n may be any integer selected from 1-192.
  • n can be, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23 , 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48 , 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73 , 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98 , 99, 100, 110, ..., 192.
  • the cancer-specific region can be selected from any of Seq ID No.: 1-62.
  • m may be any integer selected from 1-44.
  • m can be, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23 , 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44.
  • the tissue-specific region can be selected from any of Seq ID No.: 65-83.
  • the probe composition includes: hypomethylated probes, which hybridize to the bisulfite-converted cancer-specific regions, pan-cancer specific regions, and tissue-specific regions that contain no CG methylation; and hypermethylated probes, which hybridize to the bisulfite-converted cancer-specific regions, pan-cancer specific regions, and tissue-specific regions in which all CG sites are methylated.
  • the hypomethylated probes include any of the probes Seq ID No.: 84-180 targeting cancer-specific regions, any of the probes Seq ID No.: 181-182 targeting pan-cancer specific regions, and any of the probes Seq ID No.: 183-204 targeting tissue-specific regions.
  • the hypermethylated probes include any of the probes Seq ID No.: 205-301 targeting cancer-specific regions, any of the probes Seq ID No.: 302-303 targeting pan-cancer specific regions, and any of the probes Seq ID No.: 304-325 targeting tissue-specific regions.
  • the length of each probe in the probe composition is 40-60 bp.
  • for example, it can be 41-60 bp, 42-60 bp, 43-60 bp, 44-60 bp, 45-60 bp, 45-59 bp, 45-58 bp, 45-57 bp, 45-56 bp, 46-56 bp, 47-56 bp, 48-56 bp, 49-56 bp, or 50-56 bp.
  • the length of each probe in the probe composition is preferably 50-56 bp, more preferably 50 bp.
  • PCR is used to amplify the product captured by hybridization, so that the amount of captured product reaches the starting amount required for loading on the sequencer. If the hybridization-captured product is not amplified by PCR, the amount of product cannot meet the requirements for on-machine sequencing.
  • the platform for high-throughput second-generation sequencing is the Illumina platform.
  • the database is a multi-cancer methylation analysis database obtained by integrating public data and existing sequencing data.
  • the database for pattern recognition contains three kinds of marker information, namely: pan-cancer markers, tissue-specific markers and cancer characteristic markers.
  • the detection method uses hybrid capture to enrich cfDNA and uses NGS technology to detect methylation sites that are highly related to cancer. It covers the three most common malignant tumors of luminal organs in China (esophageal cancer, gastric cancer and colorectal cancer). Finally, based on the detection of gene methylation changes in plasma cfDNA, it provides information for early screening and early diagnosis of multiple cancers.
  • Agilent 2100 performs fragment detection, and Qubit is directly used in subsequent experiments.
  • DNA protection buffer added to the liquid turns blue. Mix gently by pipetting, and then divide into two tubes on the PCR machine.
  • the number of PCR cycles is adjusted according to the amount of DNA input.
  • the reference data is as follows:
  • the length of the library is about 270bp-320bp.
  • Component and volume: pre-amplified product, the volume corresponding to 750 ng; Hyb human blocker, 5 μl; Junction blocker, 6 μl; Enhancer, 5 μl.
  • Thaw the Hyb buffer solution at room temperature; a precipitate will form after thawing, so mix it and preheat it in a 65°C water bath. Once it has completely dissolved (no precipitate or turbidity), transfer 20 μl of Hyb buffer solution to a new 200 μl PCR tube, close the tube cap, mark it as A, and continue to incubate it in the 65°C water bath for later use.
  • The methylation bioinformatics analysis process is roughly as follows: use quality-control software such as Trimmomatic to check the sequencing quality and remove low-quality reads; then use alignment software such as Bismark to align the clean, quality-controlled data to the reference genome; and use an R package such as methylKit to extract the corresponding methylation sites. Finally, calculate the methylation ratio of each target region on the Panel.
  • the methylation levels of the pan-cancer specific markers, the TBX15 and CRYGD genes, are greater than or equal to 55% and 60%, respectively, so the sample is preliminarily determined to be a sample suffering from cancer;
  • the methylation levels of the cancer-specific markers OTX1, SFRP2, CDO1, TRIM15, ALX4, and CCNA1 are greater than or equal to the respective thresholds shown in Table 1 (as shown in Table 19 above), so the sample is further judged to be a sample suffering from any of the following 11 types of cancer: esophageal cancer, stomach cancer, colorectal cancer, lung cancer, liver cancer, pancreatic cancer, prostate cancer, breast cancer, ovarian cancer, cervical cancer, and endometrial cancer.
  • interpret tissue-specific markers Based on the target regions that are greater than or equal to their respective thresholds in Table 19, it can be seen that there are 6 target regions specific to gastric tissue Seq ID No.
  • Seq ID No. 69, Seq ID No. 75, Seq ID No. 76, Seq ID No. 77 and Seq ID No. 78 are greater than or equal to their respective methylation-level thresholds, and no other tissue-specific markers have methylation greater than or equal to their respective thresholds. Among the tissue-specific markers at or above their respective methylation thresholds, gastric tissue-specific markers are therefore the most numerous, and the sample is finally judged to be a sample with gastric cancer.
  • the patient's blood was drawn again 48 hours after the operation; the peripheral blood was collected according to the method of Example 1 and tested with the Panel of this application.
  • the library was constructed and sequenced on the Illumina platform; the sequencing data were subjected to the above-mentioned bioinformatics analysis process and all of the sequencing data were analyzed by pattern recognition. The results show that the methylation levels of the genes in the above table had returned to normal.
  • the methylation levels of the pan-cancer specific markers, the TBX15 and CRYGD genes, are greater than or equal to 55% and 60%, respectively, so the sample is preliminarily determined to be a sample suffering from cancer;
  • the methylation levels of the cancer-specific markers TRH, CDO1, ELMO1, GFRA1, CCNA1, and SALL1 are greater than or equal to the respective thresholds shown in Table 1 (as shown in Table 20 above), so the sample is further judged to be a sample suffering from any of the following 11 types of cancer: esophageal cancer, stomach cancer, colorectal cancer, lung cancer, liver cancer, pancreatic cancer, prostate cancer, breast cancer, ovarian cancer, cervical cancer, and endometrial cancer.
  • interpret tissue-specific markers Based on the target regions in Table 20 that are greater than or equal to their respective thresholds, it can be seen that there are 6 target regions specific to colorectal tissues Seq ID No.
  • Seq ID No. 64, Seq ID No. 71, Seq ID No. 72, Seq ID No. 73, and Seq ID No. 74 are greater than or equal to their respective methylation-level thresholds, and no other tissue-specific markers have methylation greater than or equal to their respective thresholds. Among the tissue-specific markers at or above their respective methylation thresholds, colorectal tissue-specific markers are therefore the most numerous, and the sample is finally judged to be a sample with colorectal cancer.
  • a sample of esophageal cancer was tested with the Panel of this application, with peripheral blood collected according to the method of Example 1; the library was constructed and sequenced on the Illumina platform; the sequencing data were subjected to the above-mentioned bioinformatics analysis process to obtain the methylation levels.
  • the results are shown in Table 21 (Table 21 shows the detected target regions greater than or equal to the methylation threshold).
  • Based on the target regions greater than or equal to their respective thresholds in Table 21, it can be seen that the 5 target regions specific to esophageal tissue, Seq ID No. 65, Seq ID No. 66, Seq ID No. 67, Seq ID No. 68, and Seq ID No. 69, are greater than or equal to their respective methylation-level thresholds, and no other tissue-specific markers have methylation greater than or equal to their respective thresholds. Therefore, among the tissue-specific markers greater than or equal to the respective methylation threshold, the esophageal tissue-specific markers are the most numerous, and the sample is finally judged to be a sample suffering from esophageal cancer.
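The three-stage interpretation described in the excerpts above (pan-cancer thresholds, the n1/n ratio over cancer-specific regions, and the per-tissue count) can be illustrated with a minimal Python sketch. Only the pan-cancer thresholds (55% for Seq ID No. 63, 60% for Seq ID No. 64) and the 20% ratio come from the text; the function name `interpret`, the dictionary layout, the choice to count regions rather than individual probes, and the decision to skip the tissue vote when an earlier stage is negative are all assumptions made for illustration, not the applicant's implementation.

```python
from collections import Counter

# Pan-cancer thresholds stated in the text: Seq ID No. 63 -> 55%, Seq ID No. 64 -> 60%.
PAN_CANCER_THRESHOLDS = {63: 0.55, 64: 0.60}


def interpret(levels, cancer_thresholds, tissue_thresholds, tissue_of, min_ratio=0.20):
    """Hypothetical three-stage call on measured methylation levels.

    levels            -- {region Seq ID: methylation level in [0, 1]}
    cancer_thresholds -- {cancer-specific region Seq ID: threshold}
    tissue_thresholds -- {tissue-specific region Seq ID: threshold}
    tissue_of         -- {tissue-specific region Seq ID: tissue name}
    min_ratio         -- minimum n1/n (0.20 in the text, preferably 0.30)
    """
    # Stage 1: both pan-cancer regions must reach their thresholds.
    pan_positive = all(
        levels.get(region, 0.0) >= thr for region, thr in PAN_CANCER_THRESHOLDS.items()
    )

    # Stage 2: n1/n, the fraction of cancer-specific regions at or above threshold.
    n = len(cancer_thresholds)
    n1 = sum(levels.get(region, 0.0) >= thr for region, thr in cancer_thresholds.items())
    ratio = n1 / n if n else 0.0
    cancer_positive = pan_positive and ratio >= min_ratio

    # Stage 3: count above-threshold tissue-specific regions per tissue and
    # report the tissue with the most hits (ties resolved by first seen: an assumption).
    tissue = None
    if cancer_positive:
        votes = Counter(
            tissue_of[region]
            for region, thr in tissue_thresholds.items()
            if levels.get(region, 0.0) >= thr
        )
        if votes:
            tissue = votes.most_common(1)[0][0]

    return {
        "pan_cancer_positive": pan_positive,
        "cancer_ratio": ratio,
        "cancer_specific_positive": cancer_positive,
        "tissue": tissue,
    }


# Illustrative values only; the thresholds 0.35 (Seq ID No. 48) and 0.29
# (Seq ID No. 67) are the two examples quoted from Table 1 in the text.
example = interpret(
    levels={63: 0.60, 64: 0.65, 48: 0.40, 67: 0.30},
    cancer_thresholds={48: 0.35},
    tissue_thresholds={67: 0.29},
    tissue_of={67: "esophagus"},
)
```

In a real panel the threshold tables would come from Table 1 of the application, and the source does not say how ties between tissues are handled, so that part in particular should be read as a placeholder.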

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biochemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Analytical Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Immunology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Pathology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Oncology (AREA)
  • Hospice & Palliative Care (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

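The present invention provides a system for detecting cancer methylation, comprising: a sample collection module for collecting a sample from a subject; a DNA extraction module for extracting and purifying DNA from the sample; a library construction module for constructing a DNA library for sequencing from the purified DNA sample; a conversion module for converting the constructed DNA library with bisulfite; a pre-PCR amplification module for pre-PCR amplifying the bisulfite-converted DNA library; a hybrid capture module for performing hybrid capture on the pre-PCR-amplified sample using a probe composition; a post-PCR amplification module for amplifying the hybrid-captured product by PCR; a sequencing module for performing high-throughput second-generation sequencing on the PCR-amplified, hybrid-captured product; a data analysis module for analyzing the sequencing data to determine the methylation level of the sample; and an interpretation module for interpreting the patient's disease status based on the methylation level of the sample.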

Description

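A cancer gene methylation detection system and an in vitro cancer detection method performed in the system
Technical Field
The present application relates to a cancer gene methylation detection system, and in particular to a system that uses high-throughput sequencing (NGS) to detect changes in the methylation level of cell-free DNA from three tumors of luminal organs (esophageal cancer, gastric cancer, and colorectal cancer), and to an in vitro cancer detection method performed in that system.
Background Art
High-throughput sequencing (NGS) is a revolutionary innovation in modern genomics research. It can analyze the sequences of tens to several million DNA molecules simultaneously, which marks the arrival of the post-genomic era. By controlling the sequencing depth, different goals such as de novo sequencing and resequencing can be achieved, and with different upstream processing the sequences of the genome, transcriptome, and methylome can be analyzed.
Current clinical genetic testing relies mainly on polymerase chain reaction (PCR), fluorescence in situ hybridization (FISH), and gene chip technology. PCR instruments are inexpensive, highly sensitive, and simple and fast to operate, and PCR is widely used clinically, but technical limitations mean that only a few genes can be tested at once. FISH is highly sensitive but difficult to perform. Gene chips have higher throughput than the other two and can test a large number of genes at once, but they can only detect known genes or variants, with low accuracy and a high false-positive rate. NGS, by contrast, offers high throughput (simultaneous detection of a large number of known and unknown genes and variants), accurate results (higher accuracy than gene chips), relatively fast testing, and a low cost per gene, and it is gradually being applied in clinical disease detection and monitoring. As sequencing costs fall further, NGS will inevitably replace gene chips and other high-throughput technologies step by step.
Because conventional whole-genome sequencing is currently expensive overall, increasing the sequencing depth in order to detect rare variants makes the final price hard for consumers to bear. Targeted sequence capture sequencing has therefore become a mainstream choice. In this technique, capture probes are designed against the genomic regions of interest according to the testing need, target DNA fragments are enriched by complementary hybridization, and NGS is then performed. This strategy can be flexibly customized to the purpose of the research or test: by selecting only a small number of gene regions and increasing the sequencing depth, variation in the target regions can be found effectively, with high sensitivity and accuracy.
During the onset and progression of cancer, genetic information undergoes a series of changes, including DNA mutations, insertions/deletions, chromosomal structural variation, copy number variation, and changes in epigenetic information. During the evolution of cancer, DNA sequence variation occurs randomly, and only when a variant arises in a key growth-control gene can it lead to a malignant tumor. Most abnormal gene expression, however, stems from epigenetic changes, usually changes in the DNA methylation level. Studies have shown that changes in gene methylation level precede gene mutation, so tracking changes in gene methylation can predict the onset of cancer earlier. In recent years, with the development of genomics, the epigenomes of more than 30 cancers have been studied. The results show that although DNA methylation is not dominant in every cancer, changes in gene methylation patterns undoubtedly alter the developmental tendency of cells and the tumor phenotype, and thus have an important influence on the onset and progression of most cancers.
In early 2018, The Cancer Genome Atlas (TCGA) published 27 summary analysis papers, presenting the most comprehensive pan-cancer genomic analysis to date of data on more than 30 cancers collected over more than a decade. After integrating chromosomal variation, DNA methylation, RNA, protein, and other data, the analysis found that 33 anatomically defined cancers can be divided into 28 subtypes according to their molecular features, and that a single molecular subtype may include 25 cancers in the traditional sense. This result indicates that cancers originating in different organs share common molecular features. At the same time, cancers originating in the same organ may have different genomic profiles. It follows that, in the near future, the development of cancer screening and diagnostic markers will necessarily incorporate more pan-cancer concepts, studying cancer not only at the anatomical level but also at the molecular level, and developing pan-cancer markers that can cover a given molecular subtype.
Liquid biopsy is a form of in vitro diagnosis. Using a non-invasive blood test, it can monitor circulating tumor cells (CTC) or circulating tumor DNA (ctDNA) released into the blood by a tumor or its metastases. The technique can effectively reduce the harm caused by invasive procedures; it can sample all parts of a tumor and all metastases, overcoming tumor heterogeneity (whereas the standard tissue biopsy currently used reflects only the features of one part of the tumor); and it enables real-time monitoring with higher sensitivity. It may even be possible to predict the site of the lesion from genomic information, which can effectively extend patient survival. Given these advantages, liquid biopsy can be used for early tumor diagnosis, assisted staging, prognosis and recurrence monitoring, medication guidance, and similar purposes. At present, the analyte most commonly used in liquid biopsy is cell-free DNA.
Cell-free DNA (cfDNA) is partially degraded endogenous DNA that exists free outside cells in the circulating blood. Studies have shown that, as tumor tissue arises and develops, its tumor cells release DNA into the plasma after apoptosis, and after degradation this forms cell-free tumor DNA (ctDNA). The molecular genetic features of ctDNA (such as gene mutations, microsatellite instability, and promoter methylation of tumor suppressor genes) are consistent with those of tumor tissue DNA. In early screening and detection of multiple cancers, collecting peripheral blood is simpler than other clinical testing methods and easier to roll out to primary care, and because it is non-invasive it is more readily accepted by asymptomatic people. Detecting changes in the ctDNA methylation level in plasma can therefore become one of the important means of early screening and diagnosis of multiple cancers.
Combining targeted sequence capture with NGS to monitor cfDNA variation and changes in methylation level enables applications such as early tumor screening, susceptibility gene monitoring, companion diagnostics, personalized medication, and prognosis monitoring. Many companies in China and abroad have launched cancer detection panels of different sizes for different application scenarios, and some panels have received FDA or CFDA approval numbers. Examples include FoundationOne CDx from Foundation Medicine, covering 324 genes; IMPACT from Memorial Sloan Kettering Cancer Center (MSK), covering 468 cancer-related genes; the "human EGFR/ALK/BRAF/KRAS gene mutation combined detection kit" from 燃石医学 (Burning Rock); and the "human EGFR, KRAS, BRAF, PIK3CA, ALK, ROS1 gene mutation detection kit" from 诺禾致源 (Novogene). 鹍远基因 (Singlera Genomics) also launched "常乐思", a product for detecting colorectal cancer methylation levels, in 2018.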
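Summary of the Invention
In summary, in view of the current lack of multi-cancer detection products and the technical limitations, the present application provides a system for detecting cancer methylation and an in vitro cancer detection method performed in the system, which can be used for early screening of three tumors of luminal organs: esophageal cancer, gastric cancer, and colorectal cancer. The system and the method performed in it can: 1) be used non-invasively for early screening of asymptomatic people and for prognosis testing of cancer patients, reducing the harm caused by invasive testing; 2) increase the sequencing depth, so that the breadth of genes tested exceeds that of existing technologies and products, with high throughput, relatively fast testing, and a low cost per gene; 3) sample all parts of a tumor and all metastases, overcoming tumor heterogeneity; and 4) offer higher sensitivity and accuracy and enable real-time monitoring, and may even predict the site of the lesion from genomic information, effectively extending patient survival.
Specifically, the present application relates to the following:
1. A system for detecting cancer methylation, comprising:
a sample collection module for collecting a sample from a subject;
a DNA extraction module for extracting and purifying DNA from the sample;
a library construction module for constructing a DNA library for sequencing from the purified DNA sample;
a conversion module for converting the constructed DNA library with bisulfite;
a pre-PCR amplification module for pre-PCR amplifying the bisulfite-converted DNA library;
a hybrid capture module for performing hybrid capture on the pre-PCR-amplified sample using a probe composition;
a PCR amplification module for amplifying the hybrid-captured product by PCR;
a sequencing module for performing high-throughput second-generation sequencing on the PCR-amplified, hybrid-captured product;
a data analysis module for analyzing the sequencing data to determine the methylation level of the sample; and
an interpretation module for interpreting the patient's disease status based on the methylation level of the sample.
2. The system according to item 1, wherein the subject is suspected of having cancer.
3. The system according to item 1 or 2, wherein the sample collected from the subject is a plasma sample.
4. The system according to any one of items 1-3, wherein the probe composition used in the hybrid capture module comprises:
2 probes targeting pan-cancer specific regions,
n probes targeting cancer-specific regions, and
m probes targeting tissue-specific regions.
5. The system according to any one of items 1-4, wherein the probe composition used in the hybrid capture module comprises:
hypomethylated probes, which hybridize to the bisulfite-converted cancer-specific regions, pan-cancer specific regions, and tissue-specific regions that contain no CG methylation, and
hypermethylated probes, which hybridize to the bisulfite-converted cancer-specific regions, pan-cancer specific regions, and tissue-specific regions in which all CG sites are methylated.
6. The system according to any one of items 1-5, wherein each probe in the probe composition used in the hybrid capture module is 40-60 bp long.
7. The system according to any one of items 1-6, wherein each probe in the probe composition used in the hybrid capture module is 45-56 bp long, preferably 50-56 bp, more preferably 50 bp.
8. The system according to any one of items 1-7, wherein n probes in the probe composition used in the hybrid capture module target cancer-specific regions,
wherein n is any integer selected from 1-192;
wherein the cancer-specific regions are selected from any of Seq ID No.: 1-62.
9. The system according to any one of items 1-8, wherein m probes in the probe composition used in the hybrid capture module target the tissue-specific regions,
wherein m is any integer selected from 1-44;
wherein the tissue-specific regions are selected from any of Seq ID No.: 65-83.
10. The system according to item 5, wherein, in the hybrid capture module, the hypomethylated probes include any of the probes Seq ID No.: 84-180 targeting cancer-specific regions, any of the probes Seq ID No.: 181-182 targeting pan-cancer specific regions, and any of the probes Seq ID No.: 183-204 targeting tissue-specific regions.
11. The system according to item 5, wherein, in the hybrid capture module, the hypermethylated probes include any of the probes Seq ID No.: 205-301 targeting cancer-specific regions, any of the probes Seq ID No.: 302-303 targeting pan-cancer specific regions, and any of the probes Seq ID No.: 304-325 targeting tissue-specific regions.
12. The system according to any one of items 1-11, wherein the interpretation module comprises:
(1) a pan-cancer interpretation module for comparing against a pan-cancer specific region database and interpreting the result to confirm whether the subject has cancer;
(2) a cancer interpretation module for comparing against a cancer-specific region database and interpreting the result to further confirm that the subject's cancer is one of several suspected cancers; and
(3) a tissue-specific interpretation module for comparing against a tissue-specific region database and interpreting the result to confirm the site of the subject's cancer.
13. The system according to item 12, wherein the pan-cancer interpretation module performs the following interpretation: determining whether the methylation level of the pan-cancer specific region Seq ID No.: 63 is greater than or equal to 55%, and determining whether the methylation level of the pan-cancer specific region Seq ID No.: 64 is greater than or equal to 60%; if the methylation level of Seq ID No.: 63 is greater than or equal to 55% and the methylation level of Seq ID No.: 64 is greater than or equal to 60%, the patient is interpreted as having cancer.
14. The system according to item 12, wherein the cancer interpretation module performs the following interpretation: if, among the n probes targeting the cancer-specific regions, the methylation levels of the regions targeted by n1 probes are greater than or equal to their respective thresholds, and n1/n ≥ 20%, preferably n1/n ≥ 30%, the patient is interpreted as having any one of the tissue-specific cancers.
15. The system according to item 12, wherein the tissue-specific interpretation module performs the following interpretation: if, among the m probes targeting the tissue-specific regions, the methylation levels of the regions targeted by m1 probes are greater than or equal to their respective thresholds, the tissues targeted by those m1 probes are analyzed further and the number of probes at or above the threshold is counted for each tissue; the tissue in which the patient has cancer is interpreted to be the tissue with the largest number of probes whose methylation level is greater than or equal to the threshold.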
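16. A method for in vitro detection of cancer in a subject, comprising the following steps:
collecting a sample from the subject;
extracting and purifying DNA from the sample;
constructing a DNA library for sequencing from the purified DNA sample;
converting the constructed DNA library with bisulfite;
pre-PCR amplifying the bisulfite-converted DNA library;
performing hybrid capture on the pre-PCR-amplified sample using a probe composition;
amplifying the hybrid-captured product by PCR;
performing high-throughput second-generation sequencing on the PCR-amplified, hybrid-captured product;
analyzing the sequencing data to determine the methylation level of the sample; and
interpreting the patient's disease status based on the methylation level of the sample.
17. The method according to item 16, wherein the subject is suspected of having cancer.
18. The method according to item 16 or 17, wherein the sample collected from the subject is a plasma sample.
19. The method according to any one of items 16-18, wherein the conversion is treatment with bisulfite.
20. The method according to any one of items 16-18, wherein the probe composition comprises:
2 probes targeting pan-cancer specific regions,
n probes targeting cancer-specific regions, and
m probes targeting tissue-specific regions.
21. The method according to any one of items 16-20, wherein the probe composition comprises:
hypomethylated probes, which hybridize to the bisulfite-converted cancer-specific regions, pan-cancer specific regions, and tissue-specific regions that contain no CG methylation, and
hypermethylated probes, which hybridize to the bisulfite-converted cancer-specific regions, pan-cancer specific regions, and tissue-specific regions in which all CG sites are methylated.
22. The method according to any one of items 16-21, wherein each probe in the probe composition is 40-60 bp long.
23. The method according to any one of items 16-22, wherein each probe in the probe composition is 45-56 bp long, preferably 50-56 bp, more preferably 50 bp.
24. The method according to any one of items 16-23, wherein n probes in the probe composition target cancer-specific regions,
wherein n is any integer selected from 1-192;
wherein the cancer-specific regions are selected from any of Seq ID No.: 1-62.
25. The method according to any one of items 16-24, wherein m probes in the probe composition target the tissue-specific regions,
wherein m is any integer selected from 1-44;
wherein the tissue-specific regions are selected from any of Seq ID No.: 65-83.
26. The method according to item 21, wherein the hypomethylated probes include any of the probes Seq ID No.: 84-180 targeting cancer-specific regions, any of the probes Seq ID No.: 181-182 targeting pan-cancer specific regions, and any of the probes Seq ID No.: 183-204 targeting tissue-specific regions.
27. The method according to item 21, wherein the hypermethylated probes include any of the probes Seq ID No.: 205-301 targeting cancer-specific regions, any of the probes Seq ID No.: 302-303 targeting pan-cancer specific regions, and any of the probes Seq ID No.: 304-325 targeting tissue-specific regions.
28. The method according to any one of items 16-27, wherein the interpretation comprises the following steps:
(1) comparing against a pan-cancer specific region database and interpreting the result to confirm whether the subject has cancer;
(2) comparing against a cancer-specific region database and interpreting the result to confirm that the subject's cancer is one of several suspected cancers;
(3) comparing against a tissue-specific region database and interpreting the result to confirm the site of the subject's cancer.
29. The method according to item 28, wherein step (1) comprises the following interpretation: determining whether the methylation level of the pan-cancer specific region Seq ID No.: 63 is greater than or equal to 55%, and determining whether the methylation level of the pan-cancer specific region Seq ID No.: 64 is greater than or equal to 60%; if the methylation level of Seq ID No.: 63 is greater than or equal to 55% and the methylation level of Seq ID No.: 64 is greater than or equal to 60%, the patient is interpreted as having cancer.
30. The method according to item 28, wherein step (2) comprises the following interpretation: if, among the n probes targeting the cancer-specific regions, the methylation levels of the regions targeted by n1 probes are greater than or equal to their respective thresholds, and n1/n ≥ 20%, preferably n1/n ≥ 30%, the patient is interpreted as having any one of the tissue-specific cancers, and the likelihood of each cancer type is then interpreted by pattern recognition.
31. The method according to item 28, wherein step (3) comprises the following interpretation: if, among the m probes targeting the tissue-specific regions, the methylation levels of the regions targeted by m1 probes are greater than or equal to their respective thresholds, the tissues targeted by those m1 probes are analyzed further and the number of probes at or above the threshold is counted for each tissue; the tissue in which the patient has cancer is interpreted to be the tissue with the largest number of probes whose methylation level is greater than or equal to the threshold.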
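The distinction drawn in items 5 and 21 between hypomethylated and hypermethylated probes follows from how bisulfite conversion rewrites a template: unmethylated cytosines are read as thymine after conversion and amplification, while methylated cytosines remain cytosine. The following Python sketch shows the two converted versions of one strand that such a probe pair would be designed against. The function name `bisulfite_convert` and the toy sequence are hypothetical and are not taken from the sequence listing; the sketch also assumes, as the items do, that the fully methylated state protects every C in a CG dinucleotide and nothing else.

```python
def bisulfite_convert(seq: str, cpg_methylated: bool) -> str:
    """Simulate bisulfite conversion of one DNA strand written 5'->3'.

    Unmethylated C is read as T after conversion and PCR. When
    cpg_methylated is True, every C followed by G (a CG site) is treated
    as methylated and therefore stays C; all other C still become T.
    """
    seq = seq.upper()
    converted = []
    for i, base in enumerate(seq):
        in_cg_site = base == "C" and i + 1 < len(seq) and seq[i + 1] == "G"
        if base == "C" and not (cpg_methylated and in_cg_site):
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Toy template (not a real target region).
template = "ACGTCCGA"
hypo_template = bisulfite_convert(template, cpg_methylated=False)   # no CG methylation
hyper_template = bisulfite_convert(template, cpg_methylated=True)   # all CG methylated
```

For the toy template the unmethylated version loses every C, while the fully methylated version keeps C only at the two CG sites, which is why two different probe sequences are needed to capture the same region in both states. Real probe design would also have to account for strand choice and probe length (40-60 bp in the items above), which this sketch ignores.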
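Given the limitations of existing cancer detection technologies, there is a need to develop a system for detecting cancer methylation, and an in vitro cancer detection method performed in the system, having the following advantages:
1. Liquid biopsy is a non-invasive tumor test and is applicable to both asymptomatic people and patient groups from whom tissue samples cannot be obtained.
2. It can simultaneously detect changes in the methylation level of three tumors of luminal organs that are common in China, covering more than 80% of the population that develops cancer.
3. To increase detection accuracy, the average sequencing depth for each cancer exceeds 5000X.
4. For the person tested, screening for all high-incidence cancers can be completed in a single test, which improves testing efficiency, and the average price per marker is lower than that of existing single-marker tests on the market.
5. For the company, a single Panel suffices to screen for the major cancers, which saves probe synthesis costs, simplifies the experimental workflow, and makes operation easier for laboratory staff.
6. In principle, it can also be used for cancer monitoring in the prognosis of cancer patients.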
附图说明
图1为本申请的实施操作流程。
具体实施方式
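This application provides a system for detecting cancer gene methylation and an in vitro cancer detection method performed in that system. Based on high-throughput sequencing (NGS), it detects changes in cell-free DNA methylation levels for 3 luminal-organ tumors: esophageal cancer, gastric cancer, and colorectal cancer. It can detect methylation-level changes of these 3 common cancers simultaneously and non-invasively, with high sensitivity and accuracy, deep sequencing depth, and low cost. It is suitable for asymptomatic populations and patient groups from whom tissue samples cannot be obtained, and for cancer monitoring during the prognosis of cancer patients.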
定义
除非在本文的其他地方具体限定,否则本文使用的所有其他技术和科学术语具有本申请所属领域的普通技术人员通常理解的含义。
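A probe is a single-stranded or double-stranded DNA ranging from tens to hundreds or even thousands of base pairs in length. Relying on denaturation and renaturation of the molecule and on the high precision of complementary base pairing, it can bind (hybridize) through hydrogen bonds to complementary unlabeled single-stranded DNA or RNA in the sample to be tested, forming a double-stranded complex (hybrid). After unbound probes are washed away, the hybridization result can be detected with a detection system such as autoradiography or an enzyme-linked reaction. In this application, the region to which a probe complementarily binds or hybridizes is a specific target region. Multiple probes are combined into a probe composition.
A cancer-specific region is a region whose methylation level, in a small subset of cancer types, differs significantly from that of normal control tissue.
A pan-cancer-specific region is a region whose methylation level, in most cancer types, differs significantly from that of normal control tissue.
A tissue-specific region is a region whose methylation level in a particular tissue differs significantly from that in other tissues.
DNA methylation refers to methylation of the carbon atom at position 5 of cytosine in CpG dinucleotides. As a relatively stable modification state, it can be passed on to newly synthesized daughter DNA during DNA replication through the action of DNA methyltransferases, and it is an important epigenetic mechanism. Methylation of gene promoter regions can silence the transcription of tumor suppressor genes, so DNA methylation is closely related to tumorigenesis. Aberrant methylation includes hypermethylation of tumor suppressor genes and DNA repair genes, hypomethylation of repetitive DNA sequences, and loss of imprinting of certain genes, and it is associated with the development of many tumors.
In this document, Panel refers to the probe composition used herein.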
以下详细描述本申请的技术方案。
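This document relates to a system for detecting cancer methylation, comprising: a sample collection module for collecting a sample from a subject; a DNA extraction module for extracting and purifying DNA from the sample; a library construction module for constructing a DNA library for sequencing from the purified DNA sample; a conversion module for converting the constructed DNA library with bisulfite; a pre-PCR amplification module for pre-PCR amplifying the bisulfite-converted DNA library; a hybridization capture module for performing hybridization capture on the pre-PCR-amplified sample using a probe composition; a post-PCR amplification module for amplifying the hybridization-captured product by PCR; a sequencing module for performing high-throughput next-generation sequencing on the PCR-amplified, hybridization-captured product; a data analysis module for analyzing the sequencing data to determine the methylation level of the sample; and an interpretation module for interpreting the disease status of the patient based on the methylation level of the sample.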
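In the system described above, the sample collection module is a module integrated into the system that automatically collects and holds the sample to be tested, namely the subject's blood or plasma;
the DNA extraction module is the module that the sample enters and in which DNA is extracted by known conventional methods, for example by releasing DNA through heat lysis and then removing impurities by filtration, before passing to the next module for reaction and detection;
the library construction module is the module that performs end repair and A-tailing on the target fragments extracted in the DNA extraction module, ligates them to adapters to form ligation products, and then amplifies, separates, and purifies the ligation products to form a DNA sequencing library;
the conversion module is the module that converts the DNA library built in the library construction module with bisulfite, using a known conversion method;
the pre-PCR amplification module is the module that uses known methods to amplify the bisulfite-converted DNA library from the conversion module to an amount sufficient for hybridization capture with the probe composition described herein;
the hybridization capture module is the module that uses a conventional liquid-phase hybridization capture system and the probe composition described herein to perform hybridization capture on the pre-PCR-amplified sample;
the post-PCR amplification module is the module that amplifies the hybridization-captured product using known amplification methods;
the sequencing module is the module that sequences the PCR-amplified, hybridization-captured product on a conventional high-throughput next-generation sequencing platform, for example an Illumina platform;
the data analysis module is the module that analyzes the sequencing data against a multi-cancer methylation analysis database, obtained by integrating public data and existing sequencing data, to determine the methylation level of the sample;
the interpretation module is the module that, based on the methylation-level data obtained from the data analysis module, uses a computer to perform pattern recognition against the database, builds a model for interpreting overall cancer risk, and analyzes the cancer risk and tissue of origin of the test subject, thereby interpreting the disease status of the patient.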
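This document also relates to a system for detecting cancer methylation, comprising: a sample collection module for collecting a sample from a subject; a DNA extraction module for extracting and purifying DNA from the sample; a library construction module for constructing a DNA library for sequencing from the purified DNA sample; a conversion module for converting the constructed DNA library with bisulfite; a pre-PCR amplification module for pre-PCR amplifying the bisulfite-converted DNA library; a hybridization capture module for performing hybridization capture on the pre-PCR-amplified sample using a probe composition; a post-PCR amplification module for amplifying the hybridization-captured product by PCR; a sequencing module for performing high-throughput next-generation sequencing on the PCR-amplified, hybridization-captured product; a data analysis module for analyzing the sequencing data to determine the methylation level of the sample; a pan-cancer interpretation module for comparing against the pan-cancer-specific region database and interpreting; a cancer interpretation module for comparing against the cancer-specific region database and interpreting; and a tissue-specific interpretation module for comparing against the tissue-specific region database and interpreting.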
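This document also relates to a system for detecting cancer methylation, comprising: a sample collection module for collecting a sample from a subject; a DNA extraction module for extracting and purifying DNA from the sample; a library construction module for constructing a DNA library for sequencing from the purified DNA sample; a conversion module for converting the constructed DNA library with bisulfite; a pre-PCR amplification module for pre-PCR amplifying the bisulfite-converted DNA library; a hybridization capture module for performing hybridization capture on the pre-PCR-amplified sample using a probe composition; a post-PCR amplification module for amplifying the hybridization-captured product by PCR; a sequencing module for performing high-throughput next-generation sequencing on the PCR-amplified, hybridization-captured product; a data analysis module for analyzing the sequencing data to determine the methylation level of the sample; a pan-cancer interpretation module which, by comparing against the pan-cancer-specific region database, determines whether the methylation level of pan-cancer-specific region Seq ID No.:63 is greater than or equal to 55% and whether the methylation level of pan-cancer-specific region Seq ID No.:64 is greater than or equal to 60%, and, if the methylation level of Seq ID No.:63 is greater than or equal to 55% and that of Seq ID No.:64 is greater than or equal to 60%, interprets the patient as having cancer; a cancer interpretation module which performs the following interpretation: if, among the n probes targeting the cancer-specific regions, n1 probes target regions whose methylation level is at or above the respective threshold, and n1/n ≥ 20%, preferably n1/n ≥ 30%, the patient is interpreted as having any one of the tissue-specific cancers; and a tissue-specific interpretation module which performs the following interpretation: if, among the m probes targeting the tissue-specific regions, m1 probes target regions whose methylation level is at or above the respective threshold, the tissues targeted by those m1 probes are analyzed further and the number of at-or-above-threshold probes is counted for each tissue, and the tissue in which the patient has cancer is interpreted to be the tissue with the largest number of such probes.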
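Specifically, for example, when m1 is 6 and further analysis shows that 5 of the probes target the stomach and 1 targets the pancreas, the tissue in which the patient has cancer is judged to be the stomach. When m1 is 6 and 3 probes target the stomach and 3 target the pancreas, the tissues are judged to be the stomach and the pancreas. When m1 is 6 and 2 probes target esophageal cancer, 2 target gastric cancer, and 2 target colorectal cancer, the tissues are judged to be the esophagus, stomach, and colorectum.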
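The counting rule in these examples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `call_tissue_of_origin` and the tissue labels are hypothetical, and the input is assumed to be the list of tissues targeted by the m1 probes that already met their thresholds.

```python
from collections import Counter


def call_tissue_of_origin(probe_tissues):
    """Return every tissue tied for the most above-threshold probes.

    probe_tissues: one tissue label per tissue-specific probe whose target
    region met or exceeded its methylation threshold (the m1 probes).
    """
    if not probe_tissues:
        return []
    counts = Counter(probe_tissues)
    top = max(counts.values())
    # Report all tissues sharing the highest count, so ties are kept together.
    return sorted(tissue for tissue, c in counts.items() if c == top)


# 5 stomach probes and 1 pancreas probe: stomach wins.
print(call_tissue_of_origin(["stomach"] * 5 + ["pancreas"]))
# 3 and 3: both tissues are reported.
print(call_tissue_of_origin(["stomach"] * 3 + ["pancreas"] * 3))
```

Returning a list rather than a single label mirrors the tie cases described in the text, where two or three tissues are reported together.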
参见下表1,表1中列出了全部靶区域各自的甲基化阈值。
Figure PCTCN2021077065-appb-000001 to Figure PCTCN2021077065-appb-000012 (Table 1, reproduced as images in the original publication)
As shown in Table 1, each target region is targeted by one or more hypomethylated probes and by the same number of hypermethylated probes. The pairing is listed below by Seq ID No. (ranges are inclusive):
Target region   Hypomethylated probe(s)   Hypermethylated probe(s)
1    84-86     205-207
2    87-88     208-209
3    89-90     210-211
4    91        212
5    92-93     213-214
6    94        215
7    95        216
8    96-97     217-218
9    98-100    219-221
10   101       222
11   102       223
12   103       224
13   104       225
14   105-106   226-227
15   107-108   228-229
16   109-110   230-231
17   111-112   232-233
18   113       234
19   114-115   235-236
20   116       237
21   117       238
22   118       239
23   119-121   240-242
24   122-123   243-244
25   124-125   245-246
26   126-127   247-248
27   128       249
28   129-130   250-251
29   131-132   252-253
30   133       254
31   134-135   255-256
32   136       257
33   137-138   258-259
34   139       260
35   140-141   261-262
36   142       263
37   143-144   264-265
38   145-146   266-267
39   147       268
40   148-149   269-270
41   150       271
42   151       272
43   152       273
44   153       274
45   154-155   275-276
46   156       277
47   157       278
48   158       279
49   159       280
50   160       281
51   161-162   282-283
52   163       284
53   164-165   285-286
54   166       287
55   167-168   288-289
56   169-170   290-291
57   171-172   292-293
58   173-174   294-295
59   175-176   296-297
60   177       298
61   178-179   299-300
62   180       301
63   181       302
64   182       303
65   183       304
66   184       305
67   185       306
68   186       307
69   187       308
70   188       309
71   189       310
72   190       311
73   191       312
74   192       313
75   193       314
76   194       315
77   195       316
78   196       317
79   197       318
80   198-199   319-320
81   200-202   321-323
82   203       324
83   204       325
Table 1 gives the target sequences targeted by the probes.
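The pairing listed above follows a regular numbering pattern: the Seq ID number of each hypermethylated probe equals that of its hypomethylated counterpart plus 121 (for example 84 and 205, or 204 and 325). A minimal sketch of that observation follows; the constant and function names are hypothetical, and the offset is inferred from the listing rather than stated as a rule in the source.

```python
HYPO_FIRST, HYPO_LAST = 84, 204  # Seq ID range of the hypomethylated probes
OFFSET = 121                     # 205 - 84, inferred from the probe listing


def paired_hyper_probe(hypo_seq_id):
    """Return the Seq ID No. of the hypermethylated probe paired with a hypomethylated one."""
    if not HYPO_FIRST <= hypo_seq_id <= HYPO_LAST:
        raise ValueError(f"{hypo_seq_id} is not a hypomethylated probe Seq ID")
    return hypo_seq_id + OFFSET


print(paired_hyper_probe(84))   # first cancer-specific probe pair
print(paired_hyper_probe(181))  # pan-cancer probe for target region Seq ID No.63
```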
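The methylation-level thresholds of the two pan-cancer markers were set so that each marker reaches or exceeds its threshold in more than 50% of cancer samples while remaining below it in the corresponding normal controls.
The methylation levels of the cancer-specific regions (markers) are tallied and the markers at or above their thresholds are identified. When the number of markers at or above threshold (n1) is ≥ 20% of the total number of markers (n), the sample is called cancer. In fact, a change in the methylation level of any single cancer-specific marker means that the sample is abnormal to some degree. In most cancer samples, the markers showing a methylation-level difference account for more than 30% of all markers. Because most of the cancer samples came from patients at pathological stage II or later, the ratio was lowered to 20% to increase the sensitivity of this Panel and improve the detection of early-stage cancer.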
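The n1/n rule can be expressed as a short Python sketch. The function name, the dictionary layout, and all marker names and values except the 0.35 threshold for Seq ID No.:48 are hypothetical; the real per-region thresholds are those of Table 1.

```python
def cancer_specific_call(levels, thresholds, min_fraction=0.20):
    """Apply the n1/n rule to cancer-specific markers.

    levels:     {marker: measured methylation level}
    thresholds: {marker: threshold from Table 1}
    Returns (n1, n, is_positive).
    """
    n = len(thresholds)
    if n == 0:
        raise ValueError("no cancer-specific markers supplied")
    # A marker missing from `levels` is treated as not reaching its threshold.
    n1 = sum(1 for marker, thr in thresholds.items()
             if levels.get(marker, 0.0) >= thr)
    return n1, n, n1 / n >= min_fraction


# Hypothetical 5-marker panel; only the 0.35 threshold for region 48 is from the text.
thresholds = {"region_48": 0.35, "a": 0.40, "b": 0.30, "c": 0.50, "d": 0.45}
levels = {"region_48": 0.37, "a": 0.41, "b": 0.10, "c": 0.20, "d": 0.05}
print(cancer_specific_call(levels, thresholds))        # default 20% cut-off
print(cancer_specific_call(levels, thresholds, 0.30))  # preferred 30% cut-off
```

Passing `min_fraction` explicitly makes it easy to compare the 20% cut-off with the stricter 30% one on the same sample.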
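The methylation levels of all tissue markers are tallied, the tissues indicated by the tissue markers at or above threshold are analyzed, and the likely tissue of origin is reported.
In the system described above, the methylation-level threshold of pan-cancer-specific region Seq ID No.:63 is 55%, so a measured value of 55% or more means that the methylation level of that region is at or above threshold; the methylation-level threshold of pan-cancer-specific region Seq ID No.:64 is 60%, so a measured value of 60% or more means that the methylation level of that region is at or above threshold.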
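As a minimal sketch, the pan-cancer check reduces to two comparisons that must both pass. The dictionary layout and function name are assumptions for illustration; the two thresholds are the ones stated in the text, written as fractions.

```python
# Thresholds stated in the text: Seq ID No.:63 at 55%, Seq ID No.:64 at 60%.
PAN_CANCER_THRESHOLDS = {63: 0.55, 64: 0.60}


def pan_cancer_positive(levels):
    """levels: {target-region Seq ID No.: methylation level as a fraction}."""
    # Both pan-cancer regions must be at or above their own threshold.
    return all(levels[seq_id] >= thr
               for seq_id, thr in PAN_CANCER_THRESHOLDS.items())


print(pan_cancer_positive({63: 0.55, 64: 0.60}))  # both exactly at threshold
print(pan_cancer_positive({63: 0.70, 64: 0.59}))  # region 64 falls short
```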
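Herein, if among the n probes targeting the cancer-specific regions, n1 probes target regions whose methylation level is at or above the respective threshold, and n1/n ≥ 20%, preferably ≥ 30%, the patient is interpreted as having any one of the tissue-specific cancers. Table 1 shows the threshold of each cancer-specific region. For example, the region shown as Seq ID No.:48 has a threshold of 0.35, so when it is assayed with a probe targeting it, a result of 0.35 or more means that the methylation level of that region is at or above threshold. The methylation thresholds of the other cancer-specific regions are given in Table 1.
Herein, if among the m probes targeting the tissue-specific regions, m1 probes target regions whose methylation level is at or above the respective threshold, the tissues targeted by those m1 probes are analyzed further and the number of at-or-above-threshold probes is counted for each tissue; the tissue in which the patient has cancer is interpreted to be the tissue with the largest number of such probes. Table 1 shows the threshold of each tissue-specific region. For example, the region shown as Seq ID No.:67 has a threshold of 0.29, so when it is assayed with a probe targeting it, a result of 0.29 or more means that the methylation level of that region is at or above threshold. The methylation thresholds of the other tissue-specific regions are given in Table 1.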
在上述本文涉及的系统中,从受试者中采集样品。所述受试者可为,疑似患有癌症的受试者,或已经患有癌症的受试者。所述癌症可为,食管癌、胃癌,或结直肠癌。所述样品可为血液、血浆。在上述本文涉及的系统中,从样品中提取纯化DNA,所述DNA可为gDNA,或cfDNA。
在上述本文涉及的系统中,在预PCR扩增模块中,利用PCR扩增所述经重亚硫酸盐转化的DNA文库,使用PCR扩增可以使经重亚硫酸盐转化的DNA的量增加,达到可进行与探针组合物杂交的量。
在上述本文涉及的系统中,在所述杂交捕获模块中,利用探针组合物对样品进行杂交捕获,所述探针组合物包括,靶向泛癌特异性区域的2个探针,靶向癌症特异性区域的n个探针,和靶向组织特异性区域的m个探针。n可为选自1-192中的任意的整数。n可以是,例如1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,110,……,192。所述癌症特异性区域可以选自Seq ID No.:1-62中的任意。m可为选自1-44中的任意的整数。m可以是,例如1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44。所述组织特异性区域可选自Seq ID No.:65-83中的任意。
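In the system described above, in the hybridization capture module, the probe composition comprises: hypomethylated probes, which hybridize to the bisulfite-converted cancer-specific regions, pan-cancer-specific regions, and tissue-specific regions containing no CG methylation; and hypermethylated probes, which hybridize to the bisulfite-converted cancer-specific regions, pan-cancer-specific regions, and tissue-specific regions in which all CGs are methylated. The hypomethylated probes comprise any of probes Seq ID No.:84-180 targeting cancer-specific regions, any of probes Seq ID No.:181-182 targeting pan-cancer-specific regions, and any of probes Seq ID No.:183-204 targeting tissue-specific regions. The hypermethylated probes comprise any of probes Seq ID No.:205-301 targeting cancer-specific regions, any of probes Seq ID No.:302-303 targeting pan-cancer-specific regions, and any of probes Seq ID No.:304-325 targeting tissue-specific regions.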
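To illustrate why two probe types are needed, the sketch below converts a target sequence in silico for the two extreme cases the probes are designed against. Bisulfite converts unmethylated cytosine to uracil, read as T after PCR, while methylated cytosine is protected. The function name and the toy sequence are hypothetical, only one strand is modelled, and an actual capture probe would be complementary to the converted strand.

```python
def bisulfite_convert(seq, methylated):
    """Return the bisulfite-converted form of one DNA strand.

    methylated=False: no CpG is methylated, so every C reads as T.
    methylated=True:  every CpG is methylated, so C in a CG context stays C
                      and every other C reads as T.
    """
    seq = seq.upper()
    out = []
    for i, base in enumerate(seq):
        in_cpg = base == "C" and i + 1 < len(seq) and seq[i + 1] == "G"
        if base == "C" and not (methylated and in_cpg):
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


toy = "ACGTCCGA"  # hypothetical sequence, not taken from the patent
print(bisulfite_convert(toy, methylated=False))  # target of a hypomethylated probe
print(bisulfite_convert(toy, methylated=True))   # target of a hypermethylated probe
```

The two outputs differ only at CpG positions, which is what lets a hypomethylated and a hypermethylated probe distinguish the two states of the same region.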
在上述本文涉及的系统中,在所述杂交捕获模块中,所述探针组合物中的每一个探针的长度为40~60bp。例如,可以为,41~60bp,42~60bp,43~60bp,44~60bp,45~60bp,45~59bp,45~58bp,45~57bp,45~56bp,46~56bp,47~56bp,48~56bp,49~56bp,50~56bp。所述探针组合物中的每一个探针的长度优选为50~56bp,进一步优选50bp。
在上述本文涉及的系统中,在后PCR扩增模块中,利用PCR扩增经杂交捕获后的产物,使用PCR扩增可以使杂交捕获后的产物的量,达到可以上机测序的起始量。如果不使用PCR扩增杂交捕获后的产物,则产物的量无法达到上机测序的要求。
在上述本文涉及的系统中,在所述测序模块中,进行高通量二代测序的平台为Illumina平台。
在上述本文涉及的系统中,在所述判读模块中,所述数据库可以是通过整合公共数据以及已有的测序数据,提供了一种多癌甲基化分析数据库。根据数据库进行模式识别,数据库中包含三种标志物信息,分别为:泛癌标志物、组织特异性标志物和癌症特征性标志物。
在上述本文涉及的系统中执行的方法中,所述检测方法采用杂交捕获的方式富集cfDNA,并用NGS技术检测与癌症高度相关的甲基化位点。覆盖了我国发病率最高的三种管腔性器官恶性肿瘤(食管癌、胃癌和结直肠癌)。最终,根据检测血浆cfDNA中基因甲基化变化水平,为多癌早筛和早诊提供信息。
实施例
实施例1:
如图1所示,本申请的实施流程具体如下:
1.1.cfDNA提取纯化
1.1.1.血浆样本制备:
4℃、2000g离心血液样本10min,将血浆转移到一个新的离心管中。4℃、16000g离心血浆样本10min,根据使用的收集管类型,执行下一步,本实验中使用的收集管类型为其他。
表2
Figure PCTCN2021077065-appb-000013
1.1.2.裂解和结合
1.1.2.1.按照下表准备结合溶液/珠子混合物,然后彻底混匀。
表3
Figure PCTCN2021077065-appb-000014
加入适量体积的血浆样品。
1.1.2.2.彻底混匀血浆样品和结合溶液/珠子混合物。
1.1.2.3.在旋转混匀仪上充分的结合10min,使cfDNA结合到磁珠上。
1.1.2.4.将结合管放在磁力架上5min,直到溶液变得澄清,磁珠完全吸附在磁力架上。
1.1.2.5.用移液管小心的弃去上清,继续保持管子在磁力架上几分钟,用移液管移去残留上清。
1.1.3.洗涤
1.1.3.1.将珠子重悬在1ml洗涤溶液中。
1.1.3.2.将重悬液转移到新的无吸附1.5ml离心管中。保留结合管。
1.1.3.3.将含有珠子重悬液的离心管置于磁力架上,20s。
1.1.3.4.将分离得到的上清,吸出洗涤结合管,将清洗后的残留珠子再次收集到重悬液中,弃掉裂解/结合管。
1.1.3.5.管子置于磁力架上2min,直到溶液变得澄清,珠子聚集在磁力架,用1ml移液器移除上清。
1.1.3.6.管子留在磁力架上,用200μL移液器尽可能移除残留的液体。
1.1.3.7.将管子从磁力架取下来,加入1ml洗涤溶液,涡旋30s。
1.1.3.8.置于磁力架2min,直到溶液澄清,珠子聚集在磁力架上,用1ml移液管移除上清。
1.1.3.9.管子留在磁力架上,用200μL移液器彻底移除残留液体。
1.1.3.10.将管子从磁力架取下,加入1ml 80%乙醇,涡旋30s。
1.1.3.11.置于磁力架上2min,溶液变得澄清,用1ml移液器移去上清。
1.1.3.12.管子留在磁力架上,用200μL移液器移去残留液体。
1.1.3.13.用80%乙醇重复上述1.1.3.10.-1.1.3.12.步骤一次,尽可能除去上清。
1.1.3.14.管子留在磁力架上,空气中干燥珠子3~5分钟。
1.1.4.洗脱cfDNA
1.1.4.1.按照下表加入洗脱液。
表4
Figure PCTCN2021077065-appb-000015
1.1.4.2.涡旋5min,置于磁力架上2min,溶液变得澄清,吸取上清液中的cfDNA。
1.1.4.3.纯化的cfDNA立即使用,或者将上清转移至新的离心管中,-20℃保存。
1.2.gDNA打断与纯化:
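1.2.1. Based on the Qubit concentration, take 2 μg of gDNA, bring the volume to 125 μl with water, and load it into a Covaris 130 μl shearing tube. Program settings: 50 W, 20%, 200 cycles, 250 s.
1.2.2. After shearing, take 1 μl of sample for fragment analysis on the Agilent 2100; a normally sheared sample shows a main peak at about 150-200 bp.
For cfDNA samples, perform fragment analysis on the Agilent 2100, quantify with Qubit, and use them directly in the subsequent experiments.
1.3. End repair and 3' A-tailing: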
1.3.1.取20ng打断后的gDNA或cfDNA至PCR管中,用无核酸酶水补至50μl,加入以下试剂,涡旋混匀:
Table 5
Component                           Volume
gDNA/cfDNA                          50 μl
End Repair & A-Tailing Buffer       7 μl
End Repair & A-Tailing Enzyme Mix   3 μl
Total volume                        60 μl
1.3.2.设置以下程序在PCR仪上进行反应:热盖温度85℃。
表6
温度 时间
20℃ 30min
65℃ 30min
4℃
1.4.接头连接及纯化:
1.4.1.参照下表将接头提前稀释成合适的浓度:
表7
每50ul ER和AT反应的片段化的DNA 接头浓度
1μg 10uM
500ng 10uM
250ng 10uM
100ng 10uM
50ng 10uM
25ng 10uM
10ng 3uM
5ng 5uM
2.5ng 2.5uM
1ng 625nM
1.4.2.按下表配制以下试剂,轻轻吸打混匀,短暂离心:
表8
组分 体积
末端修复、加“A”反应产物 60μl
接头 5μl
无核酸酶水 5μl
连接缓冲液 30μl
DNA连接酶 10μl
总体积 110μl
1.4.3.设置以下程序在PCR仪上进行反应:无热盖。
表9
温度 时间
20℃ 30min
4℃
1.4.4.按照以下体系,加入纯化磁珠进行实验(Agencourt AMPure XP磁珠提前拿至室温震荡混合均匀备用):
表10
组分 体积
接头连接产物 110μl
Agencourt AMPure XP珠子 110μl
总体积 220μl
1.4.4.1.轻轻吸打混匀6次。
1.4.4.2.室温静置孵育5-15min,将PCR管置于磁力架上3min使溶液澄清。
1.4.4.3.移除上清,PCR管继续放置在磁力架上,向PCR管内加入200μl80%乙醇溶液,静置30s。
1.4.4.4.移除上清,再向PCR管内加入200μl 80%乙醇溶液,静置30s后彻底移除上清(建议使用10μl移液器移除底部残留乙醇溶液)。
1.4.4.5.室温静置3-5min,使残留乙醇彻底挥发。
1.4.4.6.加入22μl的无核酸酶水,把PCR管从磁力架取下,轻轻吸打重悬磁珠,避免产生气泡,室温静置2min。
1.4.4.7.将PCR管置于磁力架上2min使溶液澄清。
1.4.4.8.用移液器吸取20μl上清液,转移到新的PCR管。
1.5重亚硫酸盐处理及纯化:
1.5.1.预先拿出所需要的试剂,并溶解。根据下表加入各试剂:
表11
组分 高浓度样品(1ng-2μg)体积 低浓度样品(1-500ng)体积
接头连接纯化产物 20μl 40μl
重亚硫酸盐溶液 85μl 85μl
DNA保护缓冲液 35μl 15μl
总体积 140μl 140μl
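1.5.2. After the DNA protection buffer is added, the liquid turns blue. Mix gently by pipetting, then split the mixture into two tubes and place them on the PCR instrument.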
1.5.3.设置以下程序,并运行:热盖105℃。
表12
温度 时间
95℃ 5min
60℃ 10min
95℃ 5min
60℃ 10min
4℃
1.5.4.简短离心将两管相同样本合并至同一个干净的1.5ml离心管中。
1.5.5.每个样本中加入310μl缓冲液BL(样本量少于100ng加入1μl的载体RNA(1μg/μl)),涡旋混匀,简短离心。
1.5.6.加入250μl无水乙醇到每个样本中,涡旋混匀15s,简短离心,将混合液加入到准备好的对应的离心柱中。
1.5.7.静置1min,离心1min,将收集管中的液体重新转移到离心柱中,离心1min,弃去离心管的液体。
1.5.8.加入500μl缓冲液BW(注意是否加入无水乙醇),离心1min,弃去废液。
1.5.9.加入500μl缓冲液BD(注意是否加入无水乙醇),盖好管盖,室温放置15min。离心1min,弃去离心下的液体。
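1.5.10. Add 500 μl of Buffer BW (check that absolute ethanol has been added), centrifuge for 1 min, and discard the flow-through; repeat once more, for 2 washes in total.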
1.5.11.加入250μl无水乙醇,离心1min,将离心柱放置到新的2ml收集管中,弃掉全部剩余液体。
1.5.12.将离心柱放置到干净的1.5ml离心管中,加入20μl无核酸酶水到离心柱膜中心,轻轻盖上管盖,室温放置1min,离心1min。
1.5.13.将收集管中的液体重新转移至离心柱中,室温放置1min,离心1min。
1.6.杂交前预扩增及纯化:
1.6.1.按下列表格配制反应体系,吹打混匀,短暂离心:
表13
Figure PCTCN2021077065-appb-000016
1.6.2.设置以下程序并启动PCR程序:热盖105℃
表14
Figure PCTCN2021077065-appb-000017
1.6.3.PCR循环数根据投入DNA的量不同进行调整,参考数据如下所示:
表15
Figure PCTCN2021077065-appb-000018
1.6.4.向反应结束后的PCR管中加入50μl Agencourt AMPure XP磁珠,用移液器吹打混匀,避免产生气泡(Agencourt AMPure XP提前室温混匀并平衡)。
1.6.5.室温孵育5-15min,把PCR管置于磁力架上3min使溶液澄清。
1.6.6.移除上清,PCR管继续放置在磁力架上,向PCR管内加入200μl80%乙醇溶液,静置30s。
1.6.7.移除上清,再向PCR管内加入200μl 80%乙醇溶液,静置30s后彻底移除上清(建议使用10μl移液器移除底部残留乙醇溶液)。
1.6.8.室温静置5min,使残留乙醇彻底挥发。
1.6.9.加入30μl的无核酸酶水,将离心管从磁力架取下,使用移液器,轻轻吸打重悬磁珠。
1.6.10.室温静置2min,将200μl PCR管置于磁力架上2min使溶液澄清。
1.6.11.用移液器将上清液转移到新的200μl PCR管中(置于冰盒上),在反应管上标记好样本号,准备下一步反应。
1.6.12.取1μl样品使用Qubit进行文库浓度测定,记录文库浓度。
1.6.13.取1μl样品使用安捷伦2100进行文库片段长度测定,文库长度约在270bp-320bp间。
1.7.样品与探针杂交:
1.7.1.按照以下体系将样品文库与各种Hyb阻断物混匀,标记为B:
表16
组分 体积
预扩增产物 750ng对应体积
Hyb人阻断物 5μl
接头阻断物 6μl
增强剂 5μl
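The "volume corresponding to 750 ng" in Table 16 follows from the library concentration measured by Qubit. A small worked sketch, with a hypothetical function name and a hypothetical concentration of 30 ng/μl:

```python
def volume_for_mass_ul(target_ng, conc_ng_per_ul):
    """Volume (in μl) of library that contains target_ng of DNA."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ng / conc_ng_per_ul


# Hypothetical Qubit reading of 30 ng/μl for the pre-amplified library.
print(volume_for_mass_ul(750, 30.0))  # μl of pre-amplification product to add
```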
1.7.2.将准备好的样品和Hyb阻断物混合物放入真空浓缩离心机,打开PCR管盖,启动离心机,打开真空泵开关,开始浓缩。
1.7.3.将抽干的样品重新溶在约9μl无核酸酶水中,总体积10μl,轻轻吸打混匀,短暂离心后置于冰上待用,标记为B。
1.7.4.将Hyb缓冲液置于室温融化,融解之后会有沉淀出现,混匀后置于65℃水浴锅内预热,完全溶解后(无沉淀及浑浊物)取20μl Hyb缓冲液置于新的200μl PCR管内,盖好管盖,标记为A,继续置于65℃水浴锅内孵育待用。
1.7.5.通过艾吉泰康生物科技(北京)有限公司合成以下低甲基化探针:
a)靶向癌症特异性区域的探针Seq ID No.:84-180中的任意,b)靶向泛癌特异性区域的探针Seq ID No.:181-182中的任意,和c)靶向组织特异性区域的探针Seq ID No.:183-204中的任意,
并且合成以下高甲基化探针:
d)靶向癌症特异性区域的探针Seq ID No.:205-301中的任意,e)靶向泛癌特异性区域的探针Seq ID No.:302-303中的任意,和f)靶向组织特异性区域的探针Seq ID No.:304-325中的任意。
并且以a:b:c:d:e:f=1:1:1:1:1:1的比例制成探针组合物。
1.7.6.取5μl RNA酶阻断物与2μl探针组合物置于200μl PCR管内,轻轻吸打混匀,短暂离心后置于冰上待用,标记为C。
1.7.7.设置PCR仪参数,热盖100℃,95℃,5min;65℃,保持。
1.7.8.将PCR管B置于PCR仪上,运行以上程序。
1.7.9.PCR仪温度降至65℃时,将PCR管A置于PCR仪上孵育,盖上PCR仪热盖。
1.7.10. 5min后,将C置于PCR上孵育,盖上PCR仪热盖。
1.7.11.将PCR管C放置入PCR仪2min后,把移液器调至13μl,从PCR管A中吸取13μl Hyb缓冲液移至PCR管C中,吸取全部PCR管B中样品移至PCR管C中,轻轻吸打10次,充分混匀,避免产生大量气泡,密封管盖,盖上PCR仪热盖,65℃孵育过夜(16-24h)。
1.8.捕获目标区域DNA文库:
1.8.1.捕获磁珠的准备
1.8.1.1.将磁珠(Dynabeads MyOne Streptavidin T1磁珠)从4℃取出,涡旋震荡重悬。
1.8.1.2.取50μl磁珠置于新的PCR管内,置于磁力架上1min使溶液澄清,移除上清。
1.8.1.3.从磁力架上取下PCR管,加入200μL结合缓冲液轻轻吸打数次混匀,重悬磁珠。
1.8.1.4.置磁力架上1min,移除上清。
1.8.1.5. Repeat steps 1.8.1.3-1.8.1.4 twice more, washing the beads 3 times in total.
1.8.1.6.从磁力架上取下PCR管,加入200μL结合缓冲液轻轻吸打6次重悬磁珠待用。
1.8.2.捕获目标DNA文库
1.8.2.1.保持杂交产物PCR管C在PCR仪上,将准备好的200μL捕获磁珠加入到杂交后的产物PCR管C中,用移液器吸打6次混匀,置于旋转混匀仪上室温结合30min(转速最好不要超过10转/min)。
1.8.2.2.将PCR管置于磁力架上2min使溶液澄清,移除上清液。
1.8.2.3.向PCR管C内加入200μL的洗涤缓冲液1,轻轻吸打6次混匀,置于旋转混匀仪上清洗15min(转速最好不要超过10转/min),然后短暂离心,将PCR管放于磁力架上2min使溶液澄清,移除上清。
1.8.2.4.加入200μl的65℃预热后的洗涤缓冲液2,轻轻吸打6次混匀,置于混匀仪上65℃孵育10min,转速800转/min进行清洗。
1.8.2.5.短暂离心,将PCR管放于磁力架上2min,移除上清。使用洗涤缓冲液2再重复2次清洗,共计3次。最后一次彻底移除洗涤缓冲液2。
1.8.2.6.PCR管继续置于磁力架上,向PCR管内加入200μl 80%乙醇,静置30s后彻底移除乙醇溶液,室温晾干2min。
1.8.2.7.向PCR管加入30μL无核酸酶水,从磁力架上取下PCR管,轻轻吸打6次重悬磁珠待用。
1.9.捕获后扩增及纯化
1.9.1.根据下表配制反应体系进行捕获文库的富集,轻轻吹打混匀后,短暂离心:
表17
Figure PCTCN2021077065-appb-000019
1.9.2.设置以下程序,将样品置于PCR仪中,运行程序:热盖105℃。
表18
Figure PCTCN2021077065-appb-000020
1.9.3.PCR结束后向样品加入55μl Agencourt AMPure XP磁珠,用移液器轻轻吸打混匀。
1.9.4.室温孵育5min,把PCR管置于磁力架上3min使溶液澄清。
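1.9.5. Remove the supernatant; keeping the PCR tube on the magnetic rack, add 200 μl of 80% ethanol and let stand for 30 s.
1.9.6. Remove the supernatant, add another 200 μl of 80% ethanol to the PCR tube, let stand for 30 s, then remove the supernatant completely.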
1.9.7.室温放置5min,使得残留乙醇彻底挥发。
1.9.8.加入25μl无核酸酶水,将PCR管从磁力架拿下,轻轻吹打混匀重悬磁珠,室温放置2min。
1.9.9.将PCR管置于磁力架上2min使溶液澄清。
1.9.10.用移液器吸23μl上清液转移到1.5ml离心管,标记样品信息。
1.9.11.取1μl文库使用Qubit进行定量,记录文库浓度。
1.9.12.取1μl样品使用Agilent2100进行文库片段长度测定。
1.9.13.使用Illumina高通量测序平台进行测序。
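1.10. Methylation bioinformatics analysis workflow, in outline: use quality-control software such as Trimmomatic to check sequencing quality and remove low-quality reads; align the clean, quality-controlled data to the reference genome with alignment software such as Bismark; and extract the corresponding methylation sites with R packages such as methylKit. Finally, calculate the methylation ratio of each target region on the Panel.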
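The last step, a methylation ratio per target region, can be sketched as follows. This assumes per-CpG counts of methylated and unmethylated reads are already available (for example from a methylation caller's coverage output) and pools the reads across the region; the source does not say whether the ratio is pooled or averaged per site, so the pooling, the function name, and the toy counts are all assumptions.

```python
def region_methylation_ratio(cpg_counts, start, end):
    """Pooled methylation ratio of the CpG sites inside [start, end].

    cpg_counts: iterable of (position, methylated_reads, unmethylated_reads).
    Returns None when the region has no covered CpG site.
    """
    meth = unmeth = 0
    for pos, m, u in cpg_counts:
        if start <= pos <= end:
            meth += m
            unmeth += u
    total = meth + unmeth
    return None if total == 0 else meth / total


# Toy counts, not real data: two CpGs fall inside the region, one outside.
counts = [(100, 30, 70), (120, 50, 50), (300, 10, 0)]
print(region_methylation_ratio(counts, 100, 150))
```

The resulting ratio is the value that would be compared with the region's threshold in Table 1.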
实施例2
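A pathologically confirmed gastric cancer sample was tested with the Panel of this application: peripheral blood was collected following the method of Example 1, a library was constructed and sequenced on the Illumina platform, and the sequencing data were run through the bioinformatics workflow above to obtain methylation levels. The results are shown in Table 19 below (Table 19 lists the target regions detected at or above their methylation thresholds).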
表19
基因 CHR 起始 终止 甲基化比率 靶区域序列号
TBX15 1 119527108 119527157 0.55 Seq ID No.63
CRYGD 2 208989200 208989249 0.60 Seq ID No.64
CPE 4 166300051 166300291 0.42 Seq ID No.68
CPE 4 166300242 166300291 0.42 Seq ID No.69
PLXDC2 10 20104497 20104546 0.46 Seq ID No.75
PLXDC2 10 20104758 20104807 0.46 Seq ID No.76
PLXDC2 10 20104948 20104997 0.53 Seq ID No.77
PLXDC2 10 20105593 20105642 0.49 Seq ID No.78
OTX1 2 63281139 63281188 0.57 Seq ID No.11
SFRP2 4 154710475 154710536 0.40 Seq ID No.19
SFRP2 4 154710598 154710647 0.37 Seq ID No.20
SFRP2 4 154710702 154710751 0.36 Seq ID No.21
SFRP2 4 154710796 154710845 0.43 Seq ID No.22
CDO1 5 115152372 115152432 0.48 Seq ID No.24
CDO1 5 115152485 115152543 0.48 Seq ID No.25
TRIM15 6 30131701 30131768 0.62 Seq ID No.31
ALX4 11 44330903 44330952 0.49 Seq ID No.47
ALX4 11 44330958 44331007 0.37 Seq ID No.48
CCNA1 13 37004553 37004618 0.42 Seq ID No.53
CCNA1 13 37004620 37004669 0.47 Seq ID No.54
CCNA1 13 37005441 37005502 0.44 Seq ID No.55
CCNA1 13 37005566 37005631 0.39 Seq ID No.56
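The test sample was classified by pattern recognition. First, the methylation levels of the pan-cancer-specific markers TBX15 and CRYGD were found to be greater than or equal to 55% and 60%, respectively, so the sample was preliminarily judged to come from a subject with cancer. Second, the methylation levels of the cancer-specific markers OTX1, SFRP2, CDO1, TRIM15, ALX4, and CCNA1 were all at or above their respective thresholds in Table 1 (as shown in Table 19 above), so the sample was further judged to come from a subject with any one of the following 11 cancers: esophageal, gastric, colorectal, lung, liver, pancreatic, prostate, breast, ovarian, cervical, and endometrial cancer. Finally, for the tissue-specific markers, the target regions at or above threshold in Table 19 show that 6 gastric-tissue-specific target regions (Seq ID No.68, Seq ID No.69, Seq ID No.75, Seq ID No.76, Seq ID No.77, and Seq ID No.78) are at or above their methylation thresholds, and no other tissue-specific marker is at or above its threshold. Gastric-tissue-specific markers are therefore the most numerous among the tissue-specific markers at or above threshold, and the sample was finally judged to be a gastric cancer sample.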
该患者术后48小时,再次抽血,采用本申请的Panel检测,按实施例1的方法采集外周血;建库,并通过Illumina平台测序;测序数据经上述生物信息的分析流程,通过模式识别的方法,分析所有测序数据,结果显示上表中的基因甲基化水平回归正常水平。
实施例3
一例结直肠癌样本,采用本申请的Panel检测,按实施例1的方法采集外周血;建库,并通过Illumina平台测序;测序数据经上述生物信息的分析流程,得到甲基化水平,结果如下表20所示(表20显示了检测到的大于等于甲基化阈值的靶区域)。
表20
基因 CHR 起始 终止 甲基化比率 靶区域序列号
TBX15 1 119527108 119527157 0.55 Seq ID No.63
CRYGD 2 208989200 208989249 0.60 Seq ID No.64
C6orf155 6 72130359 72130408 0.56 Seq ID No.71
C6orf155 6 72130553 72130602 0.71 Seq ID No.72
C6orf155 6 72130641 72130690 0.65 Seq ID No.73
C6orf155 6 72130755 72130804 0.69 Seq ID No.74
SHISA2 13 26625273 26625397 0.61 Seq ID No.81
TRH 3 129693370 129693434 0.66 Seq ID No.16
TRH 3 129693586 129693662 0.53 Seq ID No.17
CDO1 5 115152372 115152432 0.48 Seq ID No.24
CDO1 5 115152485 115152543 0.48 Seq ID No.25
ELMO1 7 37488516 37488578 0.4 Seq ID No.35
GFRA1 10 118032831 118032906 0.33 Seq ID No.40
GFRA1 10 118032948 118032997 0.52 Seq ID No.41
CCNA1 13 37004553 37004618 0.42 Seq ID No.53
CCNA1 13 37004620 37004669 0.45 Seq ID No.54
CCNA1 13 37005441 37005502 0.39 Seq ID No.55
CCNA1 13 37005566 37005631 0.33 Seq ID No.56
SALL1 16 51184379 51184441 0.63 Seq ID No.58
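The test sample was classified by pattern recognition. First, the methylation levels of the pan-cancer-specific markers TBX15 and CRYGD were found to be greater than or equal to 55% and 60%, respectively, so the sample was preliminarily judged to come from a subject with cancer. Second, the methylation levels of the cancer-specific markers TRH, CDO1, ELMO1, GFRA1, CCNA1, and SALL1 were all at or above their respective thresholds in Table 1 (as shown in Table 20 above), so the sample was further judged to come from a subject with any one of the following 11 cancers: esophageal, gastric, colorectal, lung, liver, pancreatic, prostate, breast, ovarian, cervical, and endometrial cancer. Finally, for the tissue-specific markers, the target regions at or above threshold in Table 20 show that the colorectal-tissue-specific target regions Seq ID No.71, Seq ID No.72, Seq ID No.73, Seq ID No.74, and Seq ID No.81 are at or above their methylation thresholds (Seq ID No.63 and Seq ID No.64 in the same table are the pan-cancer regions), and no other tissue-specific marker is at or above its threshold. Colorectal-tissue-specific markers are therefore the most numerous among the tissue-specific markers at or above threshold, and the sample was finally judged to be a colorectal cancer sample.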
患者术后48小时,再次抽血,采用本申请的Panel检测,按实施例1的方法采集外周血,建库,测序;测序数据经上述生物信息的分析流程,通过模式识别的方法,分析所有测序数据,结果显示上表中的基因甲基化水平回归正常水平。
实施例4
一例食管癌样本,采用本申请的Panel检测,按实施例1的方法采集外周血;建库,并通过Illumina平台测序;测序数据经上述生物信息的分析流程,得到甲基化水平,结果如下表21所示(表21显示了检测到的大于等于甲基化阈值的靶区域)。
表21
Figure PCTCN2021077065-appb-000021
Figure PCTCN2021077065-appb-000022
对检测样本进行模式识别分类鉴定,首先判读出泛癌特异性标志物TBX15和CRYGD基因的甲基化水平大于等于55%和60%,则初步判断该样本为患有癌症的样本;其次,判读出癌症特异性标志物CPE、TFAP2E、TRH、C11orf21、EDNRB的甲基化水平均大于等于表1中所示的各自阈值(如上表21所示),则进一步判断该样本为患有以下3种癌症(食管癌、胃癌、结直肠癌)中的任意癌症的样本;最后,判读组织特异性标志物,基于表21中大于等于各自阈值的靶区域,可以看出属于食管组织特异性的5个靶区域Seq ID No.65、Seq ID No.66、Seq ID No.67、Seq ID No.68和Seq ID No.69大于等于各自的甲基化水平的阈值,没有其他的组织特异性标志物的甲基化大于等于其各自的阈值,因此在大于等于各自甲基化阈值的组织特异性标志物中,食管组织特异性标志物最多,则最终判断该样本为患有食管癌的样本。
患者术后48小时,再次抽血,采用本申请的Panel检测,按实施例1的方法采集外周血,建库,测序;测序数据经上述生物信息的分析流程,通过模式识别的方法,分析所有测序数据,结果显示上表中的基因甲基化水平回归正常水平。
对于本领域的技术人员来说,可以根据以上的技术方案和构思,作出各种相应的改变和变形,而所有的这些改变和变形都应该包括在本申请权利要求的保护范围之内。

Claims (31)

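  1. A system for detecting cancer methylation, comprising:
    a sample collection module for collecting a sample from a subject;
    a DNA extraction module for extracting and purifying DNA from the sample;
    a library construction module for constructing a DNA library for sequencing from the purified DNA sample;
    a conversion module for converting the constructed DNA library with bisulfite;
    a pre-PCR amplification module for pre-PCR amplifying the bisulfite-converted DNA library;
    a hybridization capture module for performing hybridization capture on the pre-PCR-amplified sample using a probe composition;
    a PCR amplification module for amplifying the hybridization-captured product by PCR;
    a sequencing module for performing high-throughput next-generation sequencing on the PCR-amplified, hybridization-captured product;
    a data analysis module for analyzing the sequencing data to determine the methylation level of the sample;
    an interpretation module for interpreting the disease status of the patient based on the methylation level of the sample.
  2. The system according to claim 1, wherein the subject is suspected of having cancer.
  3. The system according to claim 1 or 2, wherein the sample collected from the subject is a plasma sample.
  4. The system according to any one of claims 1-3, wherein the probe composition used in the hybridization capture module comprises:
    2 probes targeting pan-cancer-specific regions,
    n probes targeting cancer-specific regions, and
    m probes targeting tissue-specific regions.
  5. The system according to any one of claims 1-4, wherein the probe composition used in the hybridization capture module comprises:
    hypomethylated probes, which hybridize to the bisulfite-converted cancer-specific regions, pan-cancer-specific regions, and tissue-specific regions containing no CG methylation, and
    hypermethylated probes, which hybridize to the bisulfite-converted cancer-specific regions, pan-cancer-specific regions, and tissue-specific regions in which all CGs are methylated.
  6. The system according to any one of claims 1-5, wherein each probe in the probe composition used in the hybridization capture module has a length of 40-60 bp.
  7. The system according to any one of claims 1-6, wherein each probe in the probe composition used in the hybridization capture module has a length of 45-56 bp, preferably 50-56 bp, more preferably 50 bp.
  8. The system according to any one of claims 1-7, wherein the n probes in the probe composition used in the hybridization capture module target cancer-specific regions,
    wherein n is any integer selected from 1-192;
    wherein the cancer-specific regions are selected from any of Seq ID No.:1-62.
  9. The system according to any one of claims 1-8, wherein the m probes in the probe composition used in the hybridization capture module target the tissue-specific regions,
    wherein m is any integer selected from 1-44;
    wherein the tissue-specific regions are selected from any of Seq ID No.:65-83.
  10. The system according to claim 5, wherein, in the hybridization capture module, the hypomethylated probes comprise any of probes Seq ID No.:84-180 targeting cancer-specific regions, any of probes Seq ID No.:181-182 targeting pan-cancer-specific regions, and any of probes Seq ID No.:183-204 targeting tissue-specific regions.
  11. The system according to claim 5, wherein, in the hybridization capture module, the hypermethylated probes comprise any of probes Seq ID No.:205-301 targeting cancer-specific regions, any of probes Seq ID No.:302-303 targeting pan-cancer-specific regions, and any of probes Seq ID No.:304-325 targeting tissue-specific regions.
  12. The system according to any one of claims 1-11, wherein the interpretation module comprises:
    (1) a pan-cancer interpretation module for comparing against the pan-cancer-specific region database and interpreting to confirm whether the subject has cancer;
    (2) a cancer interpretation module for comparing against the cancer-specific region database and interpreting to further confirm that the subject's cancer is one of several suspected cancers; and
    (3) a tissue-specific interpretation module for comparing against the tissue-specific region database and interpreting to confirm the site of the subject's cancer.
  13. The system according to claim 12, wherein the pan-cancer interpretation module performs the following interpretation: determining whether the methylation level of pan-cancer-specific region Seq ID No.:63 is greater than or equal to 55%, and whether the methylation level of pan-cancer-specific region Seq ID No.:64 is greater than or equal to 60%; if the methylation level of Seq ID No.:63 is greater than or equal to 55% and that of Seq ID No.:64 is greater than or equal to 60%, the patient is interpreted as having cancer.
  14. The system according to claim 12, wherein the cancer interpretation module performs the following interpretation: if, among the n probes targeting the cancer-specific regions, n1 probes target regions whose methylation level is at or above the respective threshold, and n1/n ≥ 20%, preferably n1/n ≥ 30%, the patient is interpreted as having any one of the tissue-specific cancers.
  15. The system according to claim 12, wherein the tissue-specific interpretation module performs the following interpretation: if, among the m probes targeting the tissue-specific regions, m1 probes target regions whose methylation level is at or above the respective threshold, the tissues targeted by those m1 probes are analyzed further and the number of at-or-above-threshold probes is counted for each tissue; the tissue in which the patient has cancer is interpreted to be the tissue with the largest number of such probes.
  16. An in vitro method for detecting cancer in a subject, comprising the following steps:
    collecting a sample from the subject;
    extracting and purifying DNA from the sample;
    constructing a DNA library for sequencing from the purified DNA sample;
    converting the constructed DNA library with bisulfite;
    pre-PCR amplifying the bisulfite-converted DNA library;
    performing hybridization capture on the pre-PCR-amplified sample using a probe composition;
    amplifying the hybridization-captured product by PCR;
    performing high-throughput next-generation sequencing on the PCR-amplified, hybridization-captured product;
    analyzing the sequencing data to determine the methylation level of the sample;
    interpreting the disease status of the patient based on the methylation level of the sample.
  17. The method according to claim 16, wherein the subject is suspected of having cancer.
  18. The method according to claim 16 or 17, wherein the sample collected from the subject is a plasma sample.
  19. The method according to any one of claims 16-18, wherein the conversion is performed by bisulfite treatment.
  20. The method according to any one of claims 16-18, wherein the probe composition comprises:
    2 probes targeting pan-cancer-specific regions,
    n probes targeting cancer-specific regions, and
    m probes targeting tissue-specific regions.
  21. The method according to any one of claims 16-20, wherein the probe composition comprises:
    hypomethylated probes, which hybridize to the bisulfite-converted cancer-specific regions, pan-cancer-specific regions, and tissue-specific regions containing no CG methylation, and
    hypermethylated probes, which hybridize to the bisulfite-converted cancer-specific regions, pan-cancer-specific regions, and tissue-specific regions in which all CGs are methylated.
  22. The method according to any one of claims 16-21, wherein each probe in the probe composition has a length of 40-60 bp.
  23. The method according to any one of claims 16-22, wherein each probe in the probe composition has a length of 45-56 bp, preferably 50-56 bp, more preferably 50 bp.
  24. The method according to any one of claims 16-23, wherein the n probes in the probe composition target cancer-specific regions,
    wherein n is any integer selected from 1-192;
    wherein the cancer-specific regions are selected from any of Seq ID No.:1-62.
  25. The method according to any one of claims 16-24, wherein the m probes in the probe composition target the tissue-specific regions,
    wherein m is any integer selected from 1-44;
    wherein the tissue-specific regions are selected from any of Seq ID No.:65-83.
  26. The method according to claim 21, wherein the hypomethylated probes comprise any of probes Seq ID No.:84-180 targeting cancer-specific regions, any of probes Seq ID No.:181-182 targeting pan-cancer-specific regions, and any of probes Seq ID No.:183-204 targeting tissue-specific regions.
  27. The method according to claim 21, wherein the hypermethylated probes comprise any of probes Seq ID No.:205-301 targeting cancer-specific regions, any of probes Seq ID No.:302-303 targeting pan-cancer-specific regions, and any of probes Seq ID No.:304-325 targeting tissue-specific regions.
  28. The method according to any one of claims 16-27, wherein the interpretation comprises the following steps:
    (1) comparing against the pan-cancer-specific region database and interpreting to confirm whether the subject has cancer;
    (2) comparing against the cancer-specific region database and interpreting to confirm that the subject's cancer is one of several suspected cancers;
    (3) comparing against the tissue-specific region database and interpreting to confirm the site of the subject's cancer.
  29. The method according to claim 28, wherein step (1) comprises the following interpretation: determining whether the methylation level of pan-cancer-specific region Seq ID No.:63 is greater than or equal to 55%, and whether the methylation level of pan-cancer-specific region Seq ID No.:64 is greater than or equal to 60%; if the methylation level of Seq ID No.:63 is greater than or equal to 55% and that of Seq ID No.:64 is greater than or equal to 60%, the patient is interpreted as having cancer.
  30. The method according to claim 28, wherein step (2) comprises the following interpretation: if, among the n probes targeting the cancer-specific regions, n1 probes target regions whose methylation level is at or above the respective threshold, and n1/n ≥ 20%, preferably n1/n ≥ 30%, the patient is interpreted as having any one of the tissue-specific cancers, and the likelihood of each cancer type is then assessed by pattern recognition.
  31. The method according to claim 28, wherein step (3) comprises the following interpretation: if, among the m probes targeting the tissue-specific regions, m1 probes target regions whose methylation level is at or above the respective threshold, the tissues targeted by those m1 probes are analyzed further and the number of at-or-above-threshold probes is counted for each tissue; the tissue in which the patient has cancer is interpreted to be the tissue with the largest number of such probes.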
PCT/CN2021/077065 2020-02-25 2021-02-20 一种癌症基因甲基化检测系统和在该系统在中执行的癌症体外检测方法 WO2021169875A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202180016643.3A CN115176034A (zh) 2020-02-25 2021-02-20 一种癌症基因甲基化检测系统和在该系统在中执行的癌症体外检测方法

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010116119.0A CN112662760A (zh) 2020-02-25 2020-02-25 一种癌症基因甲基化检测系统和在该系统在中执行的癌症体外检测方法
CN202010116119.0 2020-02-25

Publications (1)

Publication Number Publication Date
WO2021169875A1 true WO2021169875A1 (zh) 2021-09-02

Family

ID=75402723

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/077065 WO2021169875A1 (zh) 2020-02-25 2021-02-20 一种癌症基因甲基化检测系统和在该系统在中执行的癌症体外检测方法

Country Status (2)

Country Link
CN (2) CN112662760A (zh)
WO (1) WO2021169875A1 (zh)

