CN117551762B - DNA methylation site combination as colorectal tumor marker and application thereof

Info

Publication number
CN117551762B
CN117551762B CN202311361229.3A CN202311361229A CN117551762B CN 117551762 B CN117551762 B CN 117551762B CN 202311361229 A CN202311361229 A CN 202311361229A CN 117551762 B CN117551762 B CN 117551762B
Authority
CN
China
Prior art keywords
methylation
colorectal
dna methylation
sites
combination
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311361229.3A
Other languages
Chinese (zh)
Other versions
CN117551762A (en)
Inventor
张道允
巩子英
仇如梦
叶宗辉
孙永华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiaxing Yunying Medical Inspection Co ltd
Original Assignee
Jiaxing Yunying Medical Inspection Co ltd
Application filed by Jiaxing Yunying Medical Inspection Co ltd
Priority to CN202311361229.3A
Publication of CN117551762A
Application granted
Publication of CN117551762B

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/154 Methylation markers

Abstract

The embodiments of the present specification provide a combination of DNA methylation sites as colorectal tumor markers, a detection reagent for the combination of DNA methylation sites, and the use of the combination of DNA methylation sites or the detection reagent thereof in the preparation of a kit for detecting colorectal tumors or predicting the risk of colorectal tumor disease. The DNA methylation site combination disclosed in the specification has good sensitivity and specificity and shows a significant difference in methylation level between known colorectal tumor patients and known non-colorectal tumor patients. It can be used as a marker for detecting colorectal tumors and predicting colorectal tumor disease risk, and can also be used for designing diagnostic reagents or kits. The present specification also provides devices and kits for detecting colorectal tumors or predicting the risk of colorectal tumor development.

Description

DNA methylation site combination as colorectal tumor marker and application thereof
Technical Field
The specification relates to the biotechnology field, in particular to a DNA methylation site combination as a colorectal tumor marker and application thereof.
Background
Colorectal tumors are tumors that occur in the colon (large intestine) and rectum, including colorectal carcinoma, colorectal adenoma, and the like. Colorectal adenomas are benign tumors originating from the glandular epithelium of the colorectal mucosa; they are closely related to the occurrence of colorectal cancer and are considered precancerous lesions.
With changes in lifestyle, the incidence and mortality of colorectal tumors, especially colorectal cancers, have increased markedly. Early diagnosis and treatment of colorectal tumors (e.g., polypectomy) can prevent the development of malignant tumors and reduce mortality. Thus, detecting colorectal tumors and/or predicting the risk of developing colorectal tumors is an effective means of improving patient survival.
Currently, fecal occult blood tests and colonoscopy are commonly used for diagnosis of colorectal tumors. Fecal occult blood tests are affected by food, drugs, and other factors, which can lead to false positive results, and their sensitivity is between 30% and 80%. Colonoscopy is an invasive examination in which tumor specimens are removed under a colonoscope for pathological examination, and it has various contraindications such as severe heart disease, cardiopulmonary insufficiency, and acute diarrhea. In addition, the sensitivity of colonoscopy is only between 60% and 70%. Thus, there is a need for biomarkers of higher sensitivity and specificity, for example, to enable more widely applicable methods of detecting colorectal tumors and of predicting the risk of colorectal tumor disease.
Disclosure of Invention
One or more embodiments of the present specification provide a DNA methylation site combination as a biomarker for detecting colorectal tumors or predicting the risk of developing colorectal tumors, characterized in that the DNA methylation site combination comprises one or more of the following group: site NDRG4_40 on the NDRG4 gene, with chromosome coordinates chr16:58497406; site THBD_102 on the THBD gene, with chromosome coordinates chr20:23031082; site WIF1_68 on the WIF1 gene, with chromosome coordinates chr12:65515031; site SDC2-2_56 on the SDC2-2 gene, with chromosome coordinates chr8:97505785; site DNAAF9_41 on the DNAAF9 gene, with chromosome coordinates chr20:3388892; site LIFR_42 on the LIFR gene, with chromosome coordinates chr5:38557321; and site ZNF304_71 on the ZNF304 gene, with chromosome coordinates chr19:57862624; wherein the chromosomal coordinates of the sites are based on the human reference genome GRCh37.
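For illustration, the site panel listed above can be held in a small lookup table. The following is a minimal Python sketch; the class name `MethylationSite`, the tuple `PANEL`, and the helper `locus` are hypothetical names chosen here and are not part of the specification, while the site names and GRCh37 coordinates are copied from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethylationSite:
    """One CpG site of the panel (illustrative container, not from the patent)."""
    name: str   # site identifier as used in the specification
    chrom: str  # chromosome
    pos: int    # coordinate on the human reference genome GRCh37


# Site names and coordinates as listed in the specification.
PANEL = (
    MethylationSite("NDRG4_40", "chr16", 58497406),
    MethylationSite("THBD_102", "chr20", 23031082),
    MethylationSite("WIF1_68", "chr12", 65515031),
    MethylationSite("SDC2-2_56", "chr8", 97505785),
    MethylationSite("DNAAF9_41", "chr20", 3388892),
    MethylationSite("LIFR_42", "chr5", 38557321),
    MethylationSite("ZNF304_71", "chr19", 57862624),
)

# Lookup by site name, e.g. SITE_BY_NAME["WIF1_68"].
SITE_BY_NAME = {site.name: site for site in PANEL}


def locus(site: MethylationSite) -> str:
    """Format a site as 'chrom:pos', the notation used in the text."""
    return f"{site.chrom}:{site.pos}"
```

A call such as `locus(SITE_BY_NAME["WIF1_68"])` returns the coordinate string in the same notation as the text.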
In some embodiments, the DNA methylation site combination includes NDRG4_40, THBD_102, WIF1_68, SDC2-2_56, DNAAF9_41, LIFR_42, and ZNF304_71.
In some embodiments, the DNA methylation site combination consists of NDRG4_40, THBD_102, WIF1_68, SDC2-2_56, DNAAF9_41, LIFR_42, and ZNF304_71.
In some embodiments, the detection reagent comprises a primer set for amplifying the combination of DNA methylation sites, wherein: the primer pair for amplifying NDRG4_40 is shown as SEQ ID NO. 1 and SEQ ID NO. 2; the primer pair for amplifying THBD_102 is shown as SEQ ID NO. 3 and SEQ ID NO. 4; the primer pair for amplifying WIF1_68 is shown as SEQ ID NO. 5 and SEQ ID NO. 6; the primer pair for amplifying SDC2-2_56 is shown as SEQ ID NO. 7 and SEQ ID NO. 8; the primer pair for amplifying DNAAF9_41 is shown as SEQ ID NO. 9 and SEQ ID NO. 10; the primer pair for amplifying LIFR_42 is shown as SEQ ID NO. 11 and SEQ ID NO. 12; and the primer pair for amplifying ZNF304_71 is shown as SEQ ID NO. 13 and SEQ ID NO. 14.
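The primer assignment above follows a regular pattern: the k-th site in the listed order uses SEQ ID NO. 2k-1 and SEQ ID NO. 2k. A minimal Python sketch of that mapping is given below. The names `SITE_ORDER` and `primer_seq_ids` are hypothetical, and the primer sequences themselves are not reproduced because they do not appear in this part of the text.

```python
# Sites in the order in which the specification assigns primer pairs.
SITE_ORDER = [
    "NDRG4_40",
    "THBD_102",
    "WIF1_68",
    "SDC2-2_56",
    "DNAAF9_41",
    "LIFR_42",
    "ZNF304_71",
]


def primer_seq_ids(site_name: str) -> tuple:
    """Return the two SEQ ID numbers of the primer pair for a site.

    Raises ValueError if the site is not part of the panel.
    """
    k = SITE_ORDER.index(site_name)  # 0-based position in the panel
    return (2 * k + 1, 2 * k + 2)


# Full mapping, e.g. PRIMER_SEQ_IDS["LIFR_42"] gives the pair for LIFR_42.
PRIMER_SEQ_IDS = {name: primer_seq_ids(name) for name in SITE_ORDER}
```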
In some embodiments, the method of detecting a colorectal tumor or predicting the risk of a colorectal tumor comprises: obtaining the methylation level of said combination of DNA methylation sites in a biological sample of a subject; and, based on the methylation levels of the combination of DNA methylation sites, using a screening model to assess whether the subject is likely to have a colorectal tumor or the subject's risk of developing a colorectal tumor.
In some embodiments, the screening model is a model based on methylation thresholds of the DNA methylation site combinations.
In some embodiments, the evaluating comprises: for each DNA methylation site in the DNA methylation site combination, comparing the methylation rate of the DNA methylation site to the methylation threshold corresponding to that site, and determining the number of positive sites of the DNA methylation site combination; and obtaining an evaluation result based on the number of positive sites, wherein a number of positive sites ≥1 indicates that the subject may have a colorectal tumor or that the subject is at risk of developing a colorectal tumor.
In some embodiments, the methylation threshold of the DNA methylation site is determined by the following method: obtaining a training sample set comprising methylation rates of the DNA methylation sites for known colorectal tumor patients and known non-colorectal tumor patients; and analyzing the training sample set using ROC curves to determine cut-off values for distinguishing the colorectal tumor patients from the non-colorectal tumor patients, the cut-off values being used as methylation thresholds for the DNA methylation sites; wherein the cut-off value is selected from methylation rates with a specificity of 95%-100%.
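The cut-off selection described above can be sketched as follows. This is a minimal, dependency-free Python illustration under stated assumptions, not the procedure actually used in the specification: a sample is called positive when its methylation rate is strictly greater than the cut-off, and among cut-offs with specificity of at least 95% the one with the highest sensitivity is chosen (ties go to the lowest cut-off). The text only requires that the cut-off come from the 95%-100% specificity range, so the tie-breaking rule, the function names, and the example rates are all hypothetical.

```python
def roc_points(tumor_rates, control_rates):
    """Return (cutoff, sensitivity, specificity) for every candidate cut-off.

    Candidates are the observed methylation rates. A sample is called
    positive when its rate is strictly greater than the cut-off (assumption).
    """
    if not tumor_rates or not control_rates:
        raise ValueError("both groups must contain at least one sample")
    points = []
    for cutoff in sorted(set(tumor_rates) | set(control_rates)):
        sensitivity = sum(r > cutoff for r in tumor_rates) / len(tumor_rates)
        specificity = sum(r <= cutoff for r in control_rates) / len(control_rates)
        points.append((cutoff, sensitivity, specificity))
    return points


def pick_cutoff(tumor_rates, control_rates, min_specificity=0.95):
    """Pick the cut-off with the highest sensitivity at >= min_specificity."""
    eligible = [p for p in roc_points(tumor_rates, control_rates)
                if p[2] >= min_specificity]
    # Highest sensitivity first; among ties, prefer the lowest cut-off.
    return max(eligible, key=lambda p: (p[1], -p[0]))


# Made-up methylation rates for one site, for illustration only.
tumor = [0.02, 0.20, 0.35, 0.50]
control = [0.01, 0.02, 0.03, 0.10]
cutoff, sensitivity, specificity = pick_cutoff(tumor, control)
```

With these made-up rates, the highest control value is the first cut-off that reaches the required specificity, so it is selected; real training data would normally contain far more samples per group.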
In some embodiments, the methylation threshold of NDRG4_40 is 0.1857; the methylation threshold of THBD_102 is 0.1094; the methylation threshold of WIF1_68 is 0.2983; the methylation threshold of SDC2-2_56 is 0.0566; the methylation threshold of DNAAF9_41 is 0.0172; the methylation threshold of LIFR_42 is 0.0407; and the methylation threshold of ZNF304_71 is 0.0959.
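Taken together, the thresholds above and the positive-site rule give a simple threshold-based screening model. The Python sketch below illustrates it under two assumptions that the text does not settle: a site counts as positive when its methylation rate is strictly greater than its threshold, and sites without a measurement are skipped. The function names and the example sample are hypothetical; the threshold values are those listed above.

```python
# Per-site methylation thresholds as listed in the specification.
THRESHOLDS = {
    "NDRG4_40": 0.1857,
    "THBD_102": 0.1094,
    "WIF1_68": 0.2983,
    "SDC2-2_56": 0.0566,
    "DNAAF9_41": 0.0172,
    "LIFR_42": 0.0407,
    "ZNF304_71": 0.0959,
}


def positive_sites(rates, thresholds=THRESHOLDS):
    """List the sites whose methylation rate exceeds the site threshold.

    Assumption: 'exceeds' means strictly greater; unmeasured sites are skipped.
    """
    return [site for site, cutoff in thresholds.items()
            if site in rates and rates[site] > cutoff]


def evaluate(rates, thresholds=THRESHOLDS, min_positive=1):
    """Flag a sample when the number of positive sites is >= min_positive."""
    hits = positive_sites(rates, thresholds)
    return {"positive_sites": hits, "flagged": len(hits) >= min_positive}


# Made-up methylation rates for one sample, for illustration only.
sample = {
    "NDRG4_40": 0.05,
    "THBD_102": 0.02,
    "WIF1_68": 0.10,
    "SDC2-2_56": 0.12,
    "DNAAF9_41": 0.001,
    "LIFR_42": 0.01,
    "ZNF304_71": 0.03,
}
result = evaluate(sample)
```

In this example only SDC2-2_56 exceeds its threshold, which is enough to flag the sample because a single positive site suffices.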
In some embodiments, the colorectal tumor includes colorectal cancer and colorectal adenoma.
In some embodiments, the biological sample is from a colorectal lavage fluid of the subject.
One or more embodiments of the present disclosure also provide a kit for detecting a colorectal tumor or predicting the risk of a colorectal tumor, the kit comprising the detection reagent as described previously.
One or more embodiments of the present specification also provide an apparatus for detecting colorectal tumors or predicting colorectal tumor risk, the apparatus comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor performs the following method when the processor executes the program: obtaining a methylation level of a combination of DNA methylation sites in a biological sample of a subject; based on the methylation levels of the combination of DNA methylation sites, a screening model is used to assess whether the subject is likely to have a colorectal tumor or the subject's risk of developing a colorectal tumor.
Drawings
The present specification will be further elucidated by way of example embodiments, which will be described in detail by means of the accompanying drawings. The embodiments are not limiting, in which like numerals represent like structures, wherein:
FIG. 1 is a diagram of an application scenario of a system for detecting colorectal tumors or predicting colorectal tumor risk according to some embodiments of the present disclosure;
FIG. 2 is a schematic diagram of an architecture of a computing device shown in accordance with some embodiments of the present description;
FIG. 3 is a block diagram of a system for detecting colorectal tumors or predicting colorectal tumor risk according to some embodiments of the present disclosure;
FIG. 4 is a flow chart of a method of detecting colorectal neoplasms or predicting risk of colorectal neoplasms according to some embodiments of the present disclosure;
FIG. 5 is a schematic diagram of a flow chart for determining methylation thresholds for DNA methylation sites according to some embodiments of the present disclosure;
FIG. 6 is a graph of ROC of methylation site NDRG4_40 in a training sample set according to some embodiments of the present description;
FIG. 7 is a graph of ROC of methylation site SDC2_101 in a training sample set, according to some embodiments of the present description;
FIG. 8 is a graph of ROC of methylation site SEP9-1_69 in a training sample set, according to some embodiments of the present description;
FIG. 9 is a graph of ROC of methylation site SEP9-2_61 in a training sample set, according to some embodiments of the present disclosure;
FIG. 10 is a graph of ROC of methylation site SFRP1_40 in a training sample set according to some embodiments of the present disclosure;
FIG. 11 is a graph of ROC of methylation site TFPI2_117 in a training sample set, according to some embodiments of the present disclosure;
FIG. 12 is a graph of ROC of methylation site THBD_102 in a training sample set, according to some embodiments of the present description;
FIG. 13 is a graph of ROC of methylation site BCAT1_38 in a training sample set, according to some embodiments of the present disclosure;
FIG. 14 is a graph of ROC of methylation site WIF1_68 in a training sample set, according to some embodiments of the present disclosure;
FIG. 15 is a graph of ROC of methylation site WIF1_88 in a training sample set, according to some embodiments of the present disclosure;
FIG. 16 is a graph of ROC of methylation site WNT5A_47 in a training sample set, according to some embodiments of the present disclosure;
FIG. 17 is a ROC graph of methylation site TFPI2-2_91 in a training sample set, according to some embodiments of the present disclosure;
FIG. 18 is a ROC graph of methylation site SDC2-2_56 in a training sample set, according to some embodiments of the present disclosure;
FIG. 19 is a ROC graph of methylation site SDC2-2_73 in a training sample set, according to some embodiments of the present disclosure;
FIG. 20 is a graph of ROC of methylation site DNAAF9_41 in a training sample set, according to some embodiments of the present description;
FIG. 21 is a graph of ROC of methylation site LIFR_42 in a training sample set, according to some embodiments of the present disclosure;
FIG. 22 is a graph of ROC of methylation site ZNF304_71 in a training sample set according to some embodiments of the present disclosure.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present specification, the drawings that are required to be used in the description of the embodiments will be briefly described below. It is apparent that the drawings in the following description are only some examples or embodiments of the present specification, and it is possible for those of ordinary skill in the art to apply the present specification to other similar situations according to the drawings without inventive effort. Unless otherwise apparent from the context of the language or otherwise specified, like reference numerals in the figures refer to like structures or operations.
It will be appreciated that "system," "apparatus," "unit" and/or "module" as used herein is one method for distinguishing between different components, elements, parts, portions or assemblies at different levels. However, if other words can achieve the same purpose, the words can be replaced by other expressions.
As used in this specification and the claims, the terms "a," "an," and/or "the" do not refer specifically to the singular and may include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; they do not constitute an exclusive list, as other steps or elements may be included in a method or apparatus.
A flowchart is used in this specification to describe the operations performed by the system according to embodiments of the present specification. It should be appreciated that the preceding or following operations are not necessarily performed in the exact order shown. Rather, the steps may be processed in reverse order or simultaneously. Also, other operations may be added to or removed from these processes.
DNA methylation is closely related to the occurrence and development of cancer. Methylation of the relevant gene loci is a critical event that occurs at a very early stage of tumor formation, and tumor methylation patterns are specific to cancer type, tissue, and time and space. DNA methylation occurs primarily in CpG islands in gene promoter regions and reduces the expression of genes, particularly tumor suppressor genes, by blocking transcription factor binding, which contributes to tumorigenesis. The inventors have found that studying the progression of DNA methylation has important implications for identifying and predicting colorectal tumor development. The specification proposes that a DNA methylation site combination can be used as a colorectal tumor marker to detect colorectal tumors and predict the risk of colorectal tumor disease. Samples for detecting the DNA methylation site combination can be derived from a wide range of sources, including body fluids, cells, tissues, and organs of a subject, in particular colorectal lavage fluid of the subject, enabling accurate, rapid, and noninvasive colorectal tumor detection and disease risk prediction.
The present specification provides a method of detecting colorectal neoplasms or predicting the risk of colorectal neoplasms, and systems and devices thereof, that assess a subject's likelihood of developing colorectal neoplasms or risk of developing colorectal cancer based on the relevant methylation levels of the aforementioned combination of DNA methylation sites.
The present specification also provides a reagent for detecting a combination of DNA methylation sites, including a reagent for amplifying the aforementioned combination of DNA methylation sites, which can be widely used in various aspects including detection of colorectal tumors, prediction of risk of colorectal tumor diseases, and the like.
The present specification also provides a kit for detecting colorectal neoplasms or predicting the risk of colorectal neoplasms.
The present specification also provides related uses of the DNA methylation site combination as a biomarker, and related uses of the detection reagents for the DNA methylation site combination. Such uses include, but are not limited to, use in the preparation of kits for detecting colorectal tumors, use in the preparation of kits for predicting the risk of colorectal tumor disease, and the like, which allow for both screening and prediction with improved sensitivity and specificity.
According to one aspect of the present description, a system for detecting colorectal neoplasms or predicting the risk of colorectal neoplasms is provided. Fig. 1 is a diagram of an application scenario of a system for detecting colorectal tumors or predicting colorectal tumor risk according to some embodiments of the present disclosure. In some embodiments, the scenario 100 includes a processing device 110, a storage device 120, and a network 130.
The processing device 110 is used for processing data and/or information. In some embodiments, processing device 110 may obtain data and/or information from storage device 120 or other components of scenario 100 (e.g., user terminal 140, detection device 160) and execute program instructions based on such information and/or data to perform one or more of the functions described herein. For example, processing device 110 may obtain a training sample set from storage device 120 and construct a screening model based on the training sample set. For another example, the processing device 110 may obtain methylation level related information for a combination of DNA methylation sites of the subject biological sample 150 measured by the detection device 160 and invoke a screening model stored at the storage device 120 to process the methylation level related information to assess whether the subject is likely to have a colorectal tumor or is at risk of developing a colorectal tumor. In some embodiments, the processing device 110 may be a server or a central processor.
The storage device 120 is used to store data and/or information. In some embodiments, the storage device 120 may store data and/or information obtained from the processing device 110 or other components of the scenario 100 (e.g., the user terminal 140, the detection device 160). For example, the storage device 120 may store the screening model for invocation by the processing device 110. For another example, the storage device 120 may obtain and store methylation level related information for a combination of DNA methylation sites of the subject biological sample 150 from the detection device 160. As another example, the storage device 120 may receive and store information uploaded by the user terminal 140, such as identity information of the subject.
The network 130 is used to provide a channel for information exchange. In some embodiments, information may be exchanged between processing device 110 and other components of scenario 100 (e.g., storage device 120, user terminal 140, detection device 160) via network 130. For example, processing device 110 may receive data in storage device 120 over network 130. For another example, information regarding the methylation level of the combination of DNA methylation sites of the subject biological sample 150 measured by the detection device 160 can be transmitted to the processing device 110 over the network 130. In some embodiments, the network 130 may be any one or more of a wired network or a wireless network. For example, network 130 may include a cable network, a fiber optic network, and the like. In some embodiments, the network 130 may have a point-to-point, shared, or centralized topology, among others, or a combination of topologies. In some embodiments, network 130 may include one or more network access points. For example, one or more components of the scenario 100 may be connected to the network 130 to exchange data and/or information through access points, such as base stations and/or one or more network switching points.
In some embodiments, the scenario 100 further comprises a user terminal 140. The user terminal 140 is used to implement services provided by the scenario 100 to a user. For example, a user may send methylation level related information for a combination of DNA methylation sites of a biological sample of a subject to the processing device 110 via the user terminal 140. For another example, the user may receive the evaluation result of the subject transmitted by the processing device 110 through the user terminal 140. For another example, the user may send the clinical test results of the subject to the processing device 110 through the user terminal 140 to cause the processing device 110 to update the training sample set based on the clinical test results of the subject and to iterate through the screening model. In some embodiments, the user terminal 140 may comprise one or any combination of a smart phone 140-1, a tablet computer 140-2, a laptop computer 140-3, etc., or other input and/or output enabled devices.
In some embodiments, the scenario 100 further includes a detection device 160. The detection device 160 is used to detect the methylation level of a combination of DNA methylation sites of the biological sample 150. As an example, the detection device may comprise means to implement one or more of the following methods: WGBS, RRBS, oxBS-seq, MethylCap-seq, MBD-seq, MeDIP-seq, HPLC, MSRF, MASP, methylation chip methods (e.g., 805 k chip), pyrosequencing, dPCR, and MS-PCR.
According to yet another aspect of the present description, a computing device is provided. FIG. 2 is a schematic diagram of an architecture of a computing device, shown in accordance with some embodiments of the present description. As shown in fig. 2, computing device 200 includes a processor 210, a memory 220, an input/output interface 230, and a communication port 240. In some embodiments, computing device 200 may implement processing device 110 and/or storage device 120. For example, the processing device 110 may be implemented on the computing device 200, and the computing device 200 is configured to perform the functions of the processing device 110 described herein. In some embodiments, the means for detecting colorectal tumor or predicting colorectal tumor risk may be implemented in the computing device 200.
The processor 210 is configured to execute computing instructions (program code) and perform the functions of the processing device 110 described herein. Computing instructions may include programs, objects, components, data structures, procedures, modules, and functions (functions refer to particular functions described in this disclosure). For example, the processor 210 may process instructions entered by the user to detect colorectal tumors or predict the likelihood of colorectal tumor disease. In some embodiments, computing device 200 may include one or more processors 210; processor 210 may include a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), any circuit and processor capable of performing one or more functions, and the like, or any combination.
Memory 220 is used to store data/information obtained from any of the components of scenario 100. In some embodiments, memory 220 may include random access memory (RAM), read-only memory (ROM), and the like, or any combination thereof.
The input-output interface 230 is used to input or output signals, data, or information. In some embodiments, the input-output interface 230 may be used to enable a user (e.g., subject, operator, etc.) to interact with the processor 210. In some embodiments, the user may input relevant information for the subject (e.g., methylation level related information for a combination of DNA methylation sites, and basic identity information such as name and age) via the input-output interface 230. In some embodiments, the input-output interface 230 may include an input device and an output device, such as a keyboard, mouse, display device, microphone, or speaker.
The communication port 240 is for connecting to the network 130 for data communication. The connection may be a wired connection, a wireless connection, or a combination of both, such as a connection through cable, fiber optic cable, mobile network, WIFI, WLAN, or bluetooth, among others. In some embodiments, the communication port 240 may be a standardized port, such as RS232, RS485, and the like. In some embodiments, communication port 240 may be a specially designed port.
Fig. 3 is a block diagram of a system for detecting colorectal tumors or predicting colorectal tumor risk according to some embodiments of the present disclosure. As shown in fig. 3, a system 300 for detecting colorectal tumors or predicting colorectal tumor risk includes an acquisition module 310, an analysis module 320, and a determination module 330.
The acquisition module 310 is used to acquire the methylation level of a combination of DNA methylation sites in a biological sample of a subject, which may include, for example, one or more of NDRG4_40, THBD_102, WIF1_68, SDC2-2_56, DNAAF9_41, LIFR_42, and ZNF304_71.
In some embodiments, the acquisition module 310 may include a detection unit and an information processing unit. The detection unit is used for carrying out DNA methylation detection on a biological sample of a subject. The detection unit may, for example, comprise means for implementing one or more of the following methods: WGBS, RRBS, oxBS-seq, MethylCap-seq, MBD-seq, MeDIP-seq, HPLC, MSRF, MASP, methylation chip methods (e.g., 805 k chip), pyrosequencing, dPCR, and MS-PCR. The information processing unit is used for processing the detection data of the detection unit to obtain methylation level related information of the DNA methylation site combination of the biological sample of the subject.
The analysis module 320 is for assessing whether the subject is likely to have a colorectal tumor or the risk of the subject developing a colorectal tumor using a screening model based on the methylation level of the combination of DNA methylation sites of the biological sample of the subject. In some embodiments, analysis module 320 may be used to evaluate using a model based on methylation thresholds for combinations of DNA methylation sites. In some embodiments, the analysis module 320 may be used to evaluate using a model constructed based on a machine learning algorithm or a deep learning algorithm.
The determination module 330 is used to obtain a training sample set comprising methylation rates of DNA methylation sites for known colorectal tumor patients and non-colorectal tumor patients; and analyzing the training sample set using the ROC curve to determine a cutoff value for distinguishing colorectal tumor patients from non-colorectal tumor patients, the cutoff value being used as a methylation threshold for the DNA methylation site.
For more details regarding the implementation of the functions of the modules of the system 300, reference may be made to fig. 4, 5 and their associated descriptions.
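As one possible software arrangement of the three modules described above, the Python sketch below wires an acquisition module, an analysis module, and a determination module together through plain callables. All class names, method names, and hooks (`read_sample`, `screening_model`, `fit_threshold`) are hypothetical, and the stand-in rules used in the example (taking the highest control value as the threshold, returning a fixed measurement) are placeholders for illustration rather than the ROC-based procedure or a real detector interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Rates = Dict[str, float]  # methylation rate per site name


@dataclass
class AcquisitionModule:
    """Counterpart of module 310: supplies methylation levels for a sample."""
    read_sample: Callable[[str], Rates]  # hypothetical detector/reader hook

    def acquire(self, sample_id: str) -> Rates:
        return self.read_sample(sample_id)


@dataclass
class AnalysisModule:
    """Counterpart of module 320: applies a screening model to the levels."""
    screening_model: Callable[[Rates], bool]

    def assess(self, rates: Rates) -> bool:
        return self.screening_model(rates)


@dataclass
class DeterminationModule:
    """Counterpart of module 330: derives per-site thresholds from training data."""
    fit_threshold: Callable[[List[float], List[float]], float]

    def thresholds(self, tumor: Dict[str, List[float]],
                   control: Dict[str, List[float]]) -> Rates:
        return {site: self.fit_threshold(tumor[site], control[site])
                for site in tumor}


@dataclass
class ScreeningSystem:
    """Counterpart of system 300: acquisition followed by analysis."""
    acquisition: AcquisitionModule
    analysis: AnalysisModule

    def run(self, sample_id: str) -> bool:
        return self.analysis.assess(self.acquisition.acquire(sample_id))


# Example wiring with placeholder rules and made-up numbers.
determination = DeterminationModule(
    fit_threshold=lambda tumor, control: max(control))  # placeholder, not ROC
cutoffs = determination.thresholds({"SITE_A": [0.4, 0.6]},
                                   {"SITE_A": [0.05, 0.1]})
analysis = AnalysisModule(
    screening_model=lambda rates: any(rates[s] > t for s, t in cutoffs.items()))
acquisition = AcquisitionModule(read_sample=lambda sample_id: {"SITE_A": 0.3})
flagged = ScreeningSystem(acquisition, analysis).run("sample-001")
```

Passing the behavior in as callables keeps each module replaceable, which matches the statement that the modules may be combined or split in different ways.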
It should be appreciated that the system 300 for detecting colorectal tumors or predicting colorectal tumor risk and its modules shown in fig. 3 may be implemented in a variety of ways. For example, in some embodiments, the system 300 and its modules may be implemented in hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portion may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the methods and systems described above may be implemented using computer executable instructions and/or embodied in processor control code, such as provided on a carrier medium such as a magnetic disk, CD or DVD-ROM, a programmable memory such as read only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The system of the present specification and its modules may be implemented not only with hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field programmable gate arrays and programmable logic devices, but also with software executed by various types of processors, or with a combination of the above hardware circuits and software (e.g., firmware).
It should be noted that the above description of the system 300 and its modules is for convenience of description only and is not intended to limit the present disclosure to the scope of the illustrated embodiments. It will be appreciated by those skilled in the art that, given the principles of the system, various modules may be combined arbitrarily or a subsystem may be constructed in connection with other modules without departing from such principles. In some embodiments, the acquisition module, analysis module, and determination module disclosed in fig. 3 may be different modules in a system, or may be one module to implement the functions of two or more modules described above. For example, each module may share one memory module, or each module may have a respective memory module. Such variations are within the scope of the present description.
According to yet another aspect of the present description, a method of detecting a colorectal tumor or predicting the risk of a colorectal tumor is provided. Fig. 4 is a flow chart of a method of detecting colorectal tumors or predicting colorectal tumor risk according to some embodiments of the present disclosure. As shown in fig. 4, flow 400 includes step 401 and step 403. In some embodiments, at least a portion of the steps in flow 400 (e.g., step 401 and/or step 403) may be performed by a computing device (e.g., computing device 200 shown in fig. 2, processing device 110 shown in fig. 1). For example, at least a portion of the steps in flow 400 may be implemented as instructions (e.g., an application) stored in the storage device 120 or the memory 220. The processing device 110 of fig. 1, the processor 210 and/or the modules of fig. 2 may execute the instructions, and when executing the instructions, the processing device 110, the processor 210, and/or the modules may be configured to perform the flow 400. The operations of the flow shown below are for illustrative purposes only. In some embodiments, the flow 400 may be accomplished with one or more additional operations not described and/or without one or more of the operations described. In addition, the order of the operations of the flow illustrated in fig. 4 and described below is not intended to be limiting.
Step 401, obtaining a methylation level of a combination of DNA methylation sites in a biological sample of a subject. In some embodiments, step 401 may be performed by a computing device (e.g., processing device 110 of fig. 1, acquisition module 310 of fig. 3).
In some embodiments, the methylation level of the combination of DNA methylation sites in a biological sample of a subject having a colorectal tumor can be distinguished from the methylation level of the combination of DNA methylation sites in a biological sample of a non-colorectal tumor subject (or normal subject).
The term "subject" refers to a subject receiving observations, assays, or experiments. In some embodiments, the subject may be a mammal. For example, mammals may include humans, mice, rats, and the like. In some embodiments, the subject may be a human.
The term "biological sample" refers to a composition of organs, tissues, cells and/or body fluids isolated from a subject. In some embodiments, the composition comprises one or more analytes of interest. For example, the target analytes may be nucleic acids, metabolites, and the like. In some embodiments, the biological sample is obtained from a body fluid of the subject. For example, body fluids may include lavage fluid, whole blood, plasma, serum, interstitial fluid, saliva, urine, stool, and the like. In some embodiments, the biological sample may be from a colorectal lavage of the subject.
The DNA methylation site combination includes one or more DNA methylation sites. The term "DNA methylation site" refers to a cytosine in a CpG dinucleotide of genomic DNA whose 5-position carbon can be covalently bound to a methyl group to form 5-methylcytosine (5mC).
The DNA methylation site combination is suitable for distinguishing a colorectal tumor population from a normal population, and can be used to detect or predict different types of colorectal tumors. In some embodiments, the colorectal tumor comprises colorectal cancer. For example, by pathological type, colorectal cancer may include adenocarcinoma, adenosquamous carcinoma, and undifferentiated carcinoma; by anatomical site, colorectal cancer may include rectal cancer, left-sided colorectal cancer (e.g., left hemicolon cancer, descending colon cancer, or sigmoid colon cancer), and right-sided colorectal cancer (e.g., cecal cancer, ascending colon cancer, or right hemicolon cancer). In some embodiments, the colorectal tumor comprises a colorectal adenoma. For example, a colorectal adenoma may include a colon adenoma and a rectal adenoma. In some embodiments, colorectal tumors also include carcinoids, appendiceal tumors, colonic stromal tumors, and the like.
In some preferred embodiments, colorectal tumors suitable for detection or prediction using the DNA methylation site combination include colorectal cancer and colorectal adenoma. In some embodiments, the DNA methylation site combination is suitable for detection or prediction of colorectal cancer, whose stages include stage I, stage II, stage III, and stage IV. In some embodiments, the DNA methylation site combination is suitable for detection or prediction of colorectal adenoma, whose grades include primary, secondary (e.g., low-grade and/or high-grade), and tertiary.
In some embodiments, the DNA methylation sites of the DNA methylation site combination can be located on colorectal tumor-associated genes (e.g., known or potential colorectal tumor genes). Non-limiting examples of colorectal tumor-associated genes include NDRG4 (chromosomal coordinate is chr16: 58497369-58497501), SDC2 (chromosomal coordinate is chr8: 97506318-97506450), SDC2-2 (chromosomal coordinate is chr8: 97505730-97505866), SFRP1 (chromosomal coordinate is chr8: 41166970-41167048), TFPI2 (chromosomal coordinate is chr7: 93519985-93520149), TFPI2-2 (chromosomal coordinate is chr7: 93519341-93519462), THBD (chromosomal coordinate is chr20: 23030981-23031129), BCAT1 (chromosomal coordinate is chr12: 25101973-25102116), SEP9-1 (chromosomal coordinate is chr17: 75369564-75369660), SEP9-2 (chromosomal coordinate is chr17: 75369559-75369649), WIF1 (chromosomal coordinate is chr12: 65514965-65515076), WNT5A (chromosomal coordinate is chr3: 55521225-55521340), DNAAF9 (chromosomal coordinate is chr20: 3388932-3388804), LIFR (chromosomal coordinate is chr5: 38557362-38557228), and ZNF304 (chromosomal coordinate is chr19: 57862581-57862691).
The chromosomal coordinate information used herein is derived from the human reference genome GRCh37.
In some embodiments, the combination of DNA methylation sites used as biomarkers for detecting colorectal tumors or predicting the risk of developing colorectal tumors may include one or more DNA methylation sites located on NDRG4, SDC2-2, SFRP1, TFPI2-2, THBD, BCAT1, SEP9-2, WIF1, WNT5A, DNAAF9, LIFR, and ZNF304.
In some embodiments, the combination of DNA methylation sites used as biomarkers for detecting colorectal tumors or predicting the risk of developing colorectal tumors may include NDRG4_40 (located on NDRG4, chromosomal coordinate chr16: 58497406), SDC2_101 (on SDC2, chr8: 97506415), SEP9-1_69 (on SEP9-1, chr17: 75369592), SEP9-2_61 (on SEP9-2, chr17: 75369591), SFRP1_40 (on SFRP1, chr8: 41167007), TFPI2_117 (on TFPI2, chr7: 93520101), THBD_102 (on THBD, chr20: 23031082), BCAT1_38 (on BCAT1, chr12: 25102010), WIF1_68 (on WIF1, chr12: 65515031), WIF1_88 (on WIF1, chr12: 65515051), WNT5A_47 (on WNT5A, chr3: 55521260), TFPI2-2_91 (on TFPI2-2, chr7: 93519373), SDC2-2_56 (on SDC2-2), DNAAF9_41 (on DNAAF9), LIFR_42 (on LIFR), and/or ZNF304_71 (on ZNF304).
In some embodiments, the combination of DNA methylation sites used as biomarkers for detecting colorectal tumors or predicting the risk of developing colorectal tumors may comprise at least 1, 2, 3, 4, 5, or 6 sites in the following group: NDRG4_40, THBD_102, WIF1_68, SDC2-2_56, DNAAF9_41, LIFR_42, and ZNF304_71.
In some preferred embodiments, the combination of DNA methylation sites can include NDRG4_40, THBD_102, WIF1_68, SDC2-2_56, DNAAF9_41, LIFR_42, and ZNF304_71. Alternatively, the combination of DNA methylation sites may also include DNA methylation sites on one or more other colorectal tumor-associated genes.
In some embodiments, the DNA methylation site combination may consist of at least 4, 5, or 6 sites in the following group: NDRG4_40, THBD_102, WIF1_68, SDC2-2_56, DNAAF9_41, LIFR_42, and ZNF304_71. For example, the DNA methylation site combination can consist of SDC2-2_56, ZNF304_71, NDRG4_40, and LIFR_42. For another example, the DNA methylation site combination can consist of SDC2-2_56, ZNF304_71, NDRG4_40, THBD_102, and LIFR_42. For another example, the DNA methylation site combination can consist of NDRG4_40, SDC2-2_56, DNAAF9_41, ZNF304_71, THBD_102, and LIFR_42. In some preferred embodiments, the combination of DNA methylation sites can consist of NDRG4_40, THBD_102, WIF1_68, SDC2-2_56, DNAAF9_41, LIFR_42, and ZNF304_71.
There is a significant correlation between the methylation level of the DNA methylation site combinations provided by some of the examples of the present specification and colorectal tumors. The methylation status of the combination of DNA methylation sites can be quantified and used to measure the methylation level of the combination of DNA methylation sites. The sample containing the DNA methylation site combination can be widely collected from organs, tissues, cells, body fluids and the like of a subject, particularly colorectal lavage fluid of the subject, and the sample collection and detection have high comfort. The DNA methylation site combination can be used as a colorectal tumor marker to detect colorectal tumor, predict colorectal tumor disease risk and the like, and can improve sensitivity and specificity of screening/diagnosis, prediction and evaluation.
In some embodiments, the methylation level of the combination of DNA methylation sites can be obtained by detecting a biological sample from the subject using a detection reagent of the combination of DNA methylation sites. Detection reagents for combinations of DNA methylation sites are used to effect detection of the methylation level of combinations of DNA methylation sites. More on detection reagents for DNA methylation site combinations can be found elsewhere in this specification.
The computing device may implement the execution of step 401 in a variety of ways. In some embodiments, processing device 110 may retrieve methylation level related information for the combination of DNA methylation sites of the subject's biological sample stored in storage device 120. For example, the methylation level related information for the combination of DNA methylation sites of the subject's biological sample is uploaded by the user terminal 140 to the storage device 120 via the network 130, and the processing device 110 may retrieve it for further analysis and evaluation. In some embodiments, processing device 110 may receive methylation level related information for the combination of DNA methylation sites of the subject's biological sample obtained through detection by the detection device 160. For example, the processing device 110 sends detection instructions to the detection device 160 (e.g., a PCR instrument and/or an NGS sequencer); based on the detection instructions, the detection device 160 detects the biological sample of the subject to obtain the methylation level related information of the DNA methylation site combination, and sends that information to the processing device 110. In some embodiments, the processing device 110 may obtain methylation level related information for the combination of DNA methylation sites of the subject's biological sample based on user input.
Step 403, assessing whether the subject is likely to have a colorectal tumor or the risk of the subject developing a colorectal tumor using a screening model based on the methylation levels of the combination of DNA methylation sites in the biological sample of the subject. In some embodiments, step 403 may be performed by a computing device (e.g., processing device 110 of fig. 1, analysis module 320 of fig. 3).
In some embodiments, the screening model may be a model based on methylation thresholds for combinations of DNA methylation sites (or threshold model). The threshold model can classify the biological sample of the subject through threshold judgment, so as to evaluate the likelihood of having the disease or the risk of developing it. In some embodiments, the evaluation using the threshold model may include a positive site determination step and a comprehensive evaluation step.
In some embodiments, step 403 further comprises: for each DNA methylation site in the DNA methylation site combination, comparing the methylation rate of the DNA methylation site to the methylation threshold of the corresponding DNA methylation site, and determining the number of positive sites of the DNA methylation site combination.
In some embodiments, the methylation level of a combination of DNA methylation sites can be quantitatively described by the methylation rate, and the manner in which the methylation rate is determined can be set based on the particular methylation detection method.
In some embodiments, the methylation level of the combination of DNA methylation sites of the subject's biological sample is detected by methylation conversion, specific amplification, and sequencing. For example, methylation conversion can include using a methylation conversion reagent to convert unmethylated cytosines at a DNA methylation site into uracil, which is read as thymine after amplification and sequencing, while methylated cytosines are not converted. For each DNA methylation site of the combination of DNA methylation sites, its methylation rate can be determined by the following formula (1):
Methylation rate = NumC / (NumC + NumT)    (1)
Here, NumC represents the number of sequencing reads covering a particular DNA methylation site in which the base at that site is read as cytosine; NumT represents the number of sequencing reads covering that site in which the base at that site is read as thymine.
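As an illustration of formula (1), the following Python sketch counts cytosine and thymine calls at one site and computes the methylation rate. The function names and the example base calls are hypothetical and are not taken from this specification; bases other than C or T are assumed to be ignored.

```python
def count_bases(bases_at_site):
    """Count C and T calls among the per-read base calls at one site.

    Calls other than C or T (e.g., N) are ignored; this is an assumption
    made for the sketch, not a rule stated in the specification.
    """
    num_c = sum(1 for b in bases_at_site if b.upper() == "C")
    num_t = sum(1 for b in bases_at_site if b.upper() == "T")
    return num_c, num_t


def methylation_rate(num_c, num_t):
    """Formula (1): NumC / (NumC + NumT)."""
    total = num_c + num_t
    if total == 0:
        raise ValueError("no C or T reads cover this site")
    return num_c / total


# Hypothetical base calls from ten reads covering one CpG site.
calls = ["C", "T", "T", "C", "T", "T", "T", "T", "N", "T"]
num_c, num_t = count_bases(calls)
rate = methylation_rate(num_c, num_t)
print(f"NumC={num_c}, NumT={num_t}, methylation rate={rate:.4f}")
```

With these made-up calls, two reads show C and seven show T, so the rate is 2/9.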
Methylation threshold refers to a limit used to evaluate the methylation level of a DNA methylation site. In some embodiments, if a single DNA methylation site in the combination of DNA methylation sites has a methylation rate greater than or equal to the methylation threshold of that DNA methylation site, the DNA methylation site can be determined to be a positive site. Otherwise, the DNA methylation site is determined to be a negative site.
More details on determining the methylation threshold can be found in fig. 5 and its associated description.
In some embodiments, step 403 further comprises: obtaining the evaluation result based on the number of positive sites of the DNA methylation site combination of the subject's biological sample. If the number of positive sites is greater than or equal to 1, the subject may be judged as likely to have a colorectal tumor or to be at risk of developing a colorectal tumor. Otherwise, the likelihood of the subject having a colorectal tumor, or the risk of the subject developing a colorectal tumor, may be excluded. For example, for a combination of DNA methylation sites comprising n (> 2) methylation sites, if any one or more of the methylation sites has a methylation rate in a sample of the subject that is greater than or equal to the corresponding methylation threshold, the subject is indicated as likely to have, or to be at risk of developing, a colorectal tumor.
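The positive site determination and comprehensive evaluation steps can be sketched in Python as follows. This is a minimal sketch, not the implementation of the specification: the function names and the two sample rate dictionaries are hypothetical, while the threshold values are the example per-site thresholds given in the description of fig. 5 in this specification.

```python
# Example per-site methylation thresholds quoted in this specification.
THRESHOLDS = {
    "NDRG4_40": 0.1857,
    "THBD_102": 0.1094,
    "WIF1_68": 0.2983,
    "SDC2-2_56": 0.0566,
    "DNAAF9_41": 0.0172,
    "LIFR_42": 0.0407,
    "ZNF304_71": 0.0959,
}


def positive_sites(rates, thresholds):
    """Return the sites whose methylation rate is >= the site threshold.

    Raises KeyError if a site in `thresholds` is missing from `rates`.
    """
    return [site for site, cutoff in thresholds.items() if rates[site] >= cutoff]


def is_positive_sample(rates, thresholds, min_positive=1):
    """Comprehensive evaluation: positive if at least `min_positive` sites are positive."""
    return len(positive_sites(rates, thresholds)) >= min_positive


# Hypothetical samples: every site at 0.01, with one site raised in sample_a.
sample_a = {site: 0.01 for site in THRESHOLDS}
sample_a["LIFR_42"] = 0.05
sample_b = {site: 0.01 for site in THRESHOLDS}

print(positive_sites(sample_a, THRESHOLDS), is_positive_sample(sample_a, THRESHOLDS))
print(positive_sites(sample_b, THRESHOLDS), is_positive_sample(sample_b, THRESHOLDS))
```

Requiring only one positive site favors sensitivity; the `min_positive` parameter is an assumption added so the rule can be tightened if needed.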
In some embodiments, the screening model may be a machine learning model or a deep learning model. For example, the machine learning model may include a linear regression model (Linear Regression), a logistic regression model (Logistic Regression), a support vector machine (Support Vector Machines), K-nearest neighbors (K-Nearest Neighbors), naive Bayes (Naive Bayes), and the like; the deep learning model may include an artificial neural network (Artificial Neural Networks), a convolutional neural network (Convolutional Neural Networks), a recurrent neural network (Recurrent Neural Networks), a long short-term memory network (Long Short-Term Memory), deep reinforcement learning (Deep Reinforcement Learning), and the like.
In some embodiments, the input to the screening model may be the methylation rate of the combination of DNA methylation sites of the biological sample of the subject, and the output of the screening model may be the probability of the subject having a colorectal tumor or the probability of the subject developing a colorectal tumor.
In some embodiments, the screening model may be trained based on the first training sample and the first label. The first training sample may be a methylation rate of a combination of DNA methylation sites of one or more known colorectal tumor patient samples and a methylation rate of a combination of DNA methylation sites of a non-colorectal tumor patient sample, and the first label may be whether the sample subject corresponding to the first training sample has a colorectal tumor.
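As one possible illustration of such a trained screening model, the following Python sketch fits a logistic regression (one of the model types listed above) by plain gradient descent. It is a sketch under stated assumptions: the two-site feature vectors, labels, learning rate, and epoch count are all synthetic or hypothetical, and the specification does not state that this particular training procedure is used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(weights, bias, rates):
    """Probability of colorectal tumor given per-site methylation rates."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, rates)))


def train_logistic(samples, labels, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log loss."""
    n, n_features = len(samples), len(samples[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(samples, labels):
            err = predict_probability(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Synthetic training set: methylation rates at two hypothetical sites.
# Label 1 = known colorectal tumor patient, 0 = non-colorectal tumor patient.
train_x = [[0.45, 0.30], [0.38, 0.22], [0.52, 0.41],
           [0.03, 0.02], [0.05, 0.01], [0.02, 0.04]]
train_y = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(train_x, train_y)
p_tumor_like = predict_probability(weights, bias, [0.40, 0.25])
p_normal_like = predict_probability(weights, bias, [0.04, 0.03])
print(f"tumor-like: {p_tumor_like:.3f}, normal-like: {p_normal_like:.3f}")
```

The model input is the vector of methylation rates and the output is a probability, matching the input and output described above.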
The term "known colorectal tumor patient" refers to a subject or individual having clinical symptoms of colorectal tumor and having been clinically diagnosed and validated. The term "non-colorectal tumor patient" refers to a subject or individual that does not suffer from colorectal tumors and is free of disturbances in daily life.
The computing device may implement the execution of step 403 in a variety of ways. In some embodiments, processing device 110 may invoke the screening model stored in storage device 120 and process methylation level related information for the combination of DNA methylation sites of the subject biological sample using the screening model to obtain the evaluation result. In other embodiments, processing device 110 may update the screening model stored in storage device 120 based on user instructions and obtain the evaluation result using the updated screening model. Wherein the processing device 110 may collect methylation level related information of the combination of associated DNA methylation sites of the colorectal cancer population and the normal population from the public or non-public database over the network 130 for updating the training sample set and performing optimization of the screening model. The processing device 110 may also update the training sample set based on user input or based on data/information uploaded by the user terminal 140 and perform optimization of the screening model.
In some embodiments, the flow 400 further comprises: based on the evaluation results, administering a drug to a patient suffering from a colorectal tumor. In some embodiments, drugs suitable for administration to colorectal tumor patients include, but are not limited to, oxaliplatin (Oxaliplatin), irinotecan (Irinotecan), 5-fluorouracil (5-FU), calcium folinate, tegafur, anti-EGFR antibody drugs (e.g., cetuximab and panitumumab), VEGF inhibitors (e.g., bevacizumab), PD-1/PD-L1 inhibitors, and the like.
It should be noted that the above description of the process 400 is for purposes of illustration and description only, and is not intended to limit the scope of applicability of the present disclosure. Various modifications and changes to flow 400 will be apparent to those skilled in the art in light of the present description. However, such modifications and variations are still within the scope of the present description.
FIG. 5 is a schematic diagram of a flow chart for determining methylation thresholds for DNA methylation sites according to some embodiments of the present disclosure. As shown in fig. 5, flow 500 includes step 501 and step 503. In some embodiments, the process 500 may be performed by a computing device (e.g., the processing device 110 of fig. 1, the determination module 330 of fig. 3).
Step 501, a training sample set is obtained, the training sample set comprising methylation rates of DNA methylation sites of known colorectal tumor patients and non-colorectal tumor patients.
In some embodiments, known colorectal tumor patients may include colorectal cancer patients and colorectal adenoma patients. Known colorectal cancer patients may be either untreated individuals after diagnosis or treated individuals after diagnosis. In some embodiments, the colorectal cancer patient may be selected from stage I, stage II, stage III, and stage IV colorectal cancer patients. In some embodiments, the colorectal adenoma patient may be selected from the group consisting of a primary colorectal adenoma patient, a secondary colorectal adenoma patient, and a tertiary colorectal adenoma patient.
At step 503, the training sample set is analyzed using the ROC curve to determine a cutoff value for distinguishing colorectal tumor patients from non-colorectal tumor patients, with the cutoff value being the methylation threshold of the DNA methylation site.
The cutoff value is a quantifiable value for dividing or distinguishing between colorectal tumor populations and non-colorectal tumor populations. In some embodiments, the cutoff value can be used as a methylation threshold to detect or predict colorectal tumors by measuring the methylation level of a particular combination of DNA methylation sites in a subject.
The term "ROC curve" refers to a curve plotted with sensitivity on the ordinate and 1 - specificity on the abscissa. ROC curves can be used to select the best cutoff value, as well as to evaluate model performance. In some embodiments, ROC curves may be made for individual DNA methylation sites using methylation rate data of a training sample set, and appropriate methylation thresholds determined using a cutoff selection method suited to the application requirements.
In some embodiments, the way the cutoff value is selected may affect how well the cutoff value divides or distinguishes colorectal tumor populations from non-colorectal tumor populations. In some embodiments, the cutoff value may be the methylation rate value corresponding to a specificity set value; for example, the specificity set value is 95%-100%. In some embodiments, the cutoff value may be the methylation rate value corresponding to a sensitivity set value; for example, the sensitivity set value is 95%-100%. In other embodiments, the cutoff value may be the methylation rate value corresponding to the maximum of the Youden index, where the Youden index is sensitivity + specificity - 1.
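The Youden-index selection can be sketched in Python as follows. The methylation rates below are made-up values for a single site, the function names are hypothetical, and a site is treated as positive when its rate is greater than or equal to the candidate cutoff, consistent with the positive site rule in this specification.

```python
def roc_points(tumor_rates, control_rates):
    """Return (cutoff, sensitivity, specificity) for every observed rate used as a cutoff."""
    points = []
    for cutoff in sorted(set(tumor_rates) | set(control_rates)):
        sensitivity = sum(r >= cutoff for r in tumor_rates) / len(tumor_rates)
        specificity = sum(r < cutoff for r in control_rates) / len(control_rates)
        points.append((cutoff, sensitivity, specificity))
    return points


def youden_cutoff(tumor_rates, control_rates):
    """Pick the cutoff that maximizes sensitivity + specificity - 1."""
    best = max(roc_points(tumor_rates, control_rates),
               key=lambda p: p[1] + p[2] - 1)
    return best[0]


# Made-up methylation rates at one site.
tumor = [0.02, 0.15, 0.22, 0.30, 0.41]
control = [0.01, 0.03, 0.04, 0.16]

for cutoff, sens, spec in roc_points(tumor, control):
    print(f"cutoff={cutoff:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
best_cutoff = youden_cutoff(tumor, control)
print("Youden cutoff:", best_cutoff)
```

For these values the index peaks at a cutoff of 0.22, where sensitivity is 0.6 and specificity is 1.0.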
In some preferred embodiments, the methylation rate value corresponding to a specificity set value is used as the cutoff value in order to reduce overdiagnosis caused by the screening model while balancing the specificity and sensitivity of the screening model. For example, the specificity set value is 95%, 96%, 97%, 98%, 99%, or 100%.
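One way to read "the methylation rate value corresponding to a specificity set value" is to take the smallest observed rate whose specificity reaches the target, which keeps sensitivity as high as possible at that specificity. The Python sketch below follows that reading; it is an assumption rather than the procedure stated in this specification, and the function name and data are hypothetical.

```python
def cutoff_at_specificity(tumor_rates, control_rates, target=1.0):
    """Smallest observed rate whose specificity is >= target, or None if none qualifies.

    A sample is called positive when its rate is >= the cutoff, so
    specificity is the fraction of controls with a rate below the cutoff.
    Specificity never decreases as the cutoff rises, so the first hit
    is the qualifying cutoff with the highest sensitivity.
    """
    for cutoff in sorted(set(tumor_rates) | set(control_rates)):
        specificity = sum(r < cutoff for r in control_rates) / len(control_rates)
        if specificity >= target:
            return cutoff
    return None


# Made-up methylation rates at one site.
tumor = [0.02, 0.15, 0.22, 0.30, 0.41]
control = [0.01, 0.03, 0.04, 0.16]

cutoff_100 = cutoff_at_specificity(tumor, control, target=1.0)
cutoff_75 = cutoff_at_specificity(tumor, control, target=0.75)
print("cutoff at 100% specificity:", cutoff_100)
print("cutoff at 75% specificity:", cutoff_75)
```

With a small control group, a 100% specificity cutoff is determined entirely by the highest control value, so it can shift noticeably when new control samples are added.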
In some preferred embodiments, the cutoff value may be the methylation rate at which the specificity is 100%. In some embodiments, the methylation threshold for site NDRG4_40 may be 0.1857; the methylation threshold for site THBD_102 may be 0.1094; the methylation threshold for site WIF1_68 may be 0.2983; the methylation threshold for site SDC2-2_56 may be 0.0566; the methylation threshold for site DNAAF9_41 may be 0.0172; the methylation threshold for site LIFR_42 may be 0.0407; the methylation threshold for site ZNF304_71 may be 0.0959.
In some embodiments, the screening models provided herein can have a sensitivity of greater than 86%, 88%, 90%, 92%, 94%, 96%, or 98% in detecting colorectal tumors or predicting the risk of colorectal tumor disease. In some embodiments, the screening models provided herein can have a specificity of greater than 90%, 92%, 94%, 96%, 98%, or 99% in detecting colorectal tumors or predicting the risk of colorectal tumor progression.
The computing device may implement execution of the flow 500 in a variety of ways. In some embodiments, processing device 110 may retrieve the training sample set stored in storage device 120 and determine the methylation threshold of the DNA methylation site based on a preset cutoff selection method. In some embodiments, processing device 110 may retrieve the training sample set stored in storage device 120 and re-determine the methylation threshold of the DNA methylation site based on a user instruction that modifies the cutoff selection strategy. In some embodiments, processing device 110 may update the screening model stored in storage device 120 based on user instructions and obtain the evaluation result using the updated screening model. The processing device 110 may collect, in real time or periodically, methylation level related information of the relevant DNA methylation site combination of colorectal cancer populations and normal populations from public or non-public databases over the network 130 to update the training sample set in the storage device 120, and use the updated training sample set to optimize the methylation threshold of the DNA methylation sites.
It should be noted that the above description of the process 500 is for purposes of illustration and description only, and is not intended to limit the scope of applicability of the present disclosure. Various modifications and changes to flow 500 will be apparent to those skilled in the art in light of the present description. However, such modifications and variations are still within the scope of the present description.
According to yet another aspect of the present description, there is provided an apparatus for detecting colorectal tumors or predicting the risk of developing colorectal tumors, the apparatus comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements: obtaining a methylation level of a combination of DNA methylation sites in a biological sample of a subject; and, based on the methylation level of the combination of DNA methylation sites, using a screening model to assess whether the subject is likely to have a colorectal tumor or the risk of the subject developing a colorectal tumor.
For more details on methods of detecting colorectal tumors or predicting colorectal tumor risk, see fig. 4, fig. 5, and the description related thereto.
According to yet another aspect of the present disclosure, a detection reagent is provided for detecting a combination of DNA methylation sites. The DNA methylation site combination can be used as a biomarker for detecting colorectal tumors and includes one or more of NDRG4_40, THBD_102, WIF1_68, SDC2-2_56, DNAAF9_41, LIFR_42, and ZNF304_71.
In some embodiments, the detection reagent comprises a primer set for amplifying a combination of DNA methylation sites, the primer set for obtaining a specific amplified fragment comprising the combination of DNA methylation sites, and amplifying the detection information.
In some embodiments, the primer set for amplifying the DNA methylation site combination comprises a primer pair for amplifying one or more of NDRG4_40, THBD_102, WIF1_68, SDC2-2_56, DNAAF9_41, LIFR_42, and ZNF304_71.
In some preferred embodiments, the primer set for amplifying the combination of DNA methylation sites comprises primer pairs for amplifying all of the sites NDRG4_40, THBD_102, WIF1_68, SDC2-2_56, DNAAF9_41, LIFR_42, and ZNF304_71.
In some embodiments, the primer pair for amplifying NDRG4_40 is as shown in SEQ ID NO. 1 and SEQ ID NO. 2. In other embodiments, the sequences of the primer pair used to amplify NDRG4_40 have at least 95%, 96%, 97%, 98% or 99% similarity to the sequences shown in SEQ ID NO. 1 and SEQ ID NO. 2, respectively.
In some embodiments, the primer pair used to amplify THBD_102 is shown as SEQ ID NO. 3 and SEQ ID NO. 4. In other embodiments, the sequences of the primer pair used to amplify THBD_102 have at least 95%, 96%, 97%, 98% or 99% similarity to the sequences shown in SEQ ID NO. 3 and SEQ ID NO. 4, respectively.
In some embodiments, the primer pair used to amplify WIF1_68 is as shown in SEQ ID NO.5 and SEQ ID NO. 6. In other embodiments, the sequences of the primer pair used to amplify WIF1_68 have at least 95%, 96%, 97%, 98% or 99% similarity to the sequences set forth in SEQ ID NO.5 and SEQ ID NO. 6, respectively.
In some embodiments, the primer pair for amplifying SDC2-2_56 is set forth in SEQ ID NO. 7 and SEQ ID NO. 8. In other embodiments, the sequences of the primer pair used to amplify SDC2-2_56 have at least 95%, 96%, 97%, 98% or 99% similarity to the sequences set forth in SEQ ID NO. 7 and SEQ ID NO. 8, respectively.
In some embodiments, the primer pair for amplifying DNAAF9_41 is shown as SEQ ID NO. 9 and SEQ ID NO. 10. In other embodiments, the sequences of the primer pair used to amplify DNAAF9_41 have at least 95%, 96%, 97%, 98% or 99% similarity to the sequences shown in SEQ ID NO. 9 and SEQ ID NO. 10, respectively.
In some embodiments, the primer pair for amplifying LIFR_42 is shown as SEQ ID NO. 11 and SEQ ID NO. 12. In other embodiments, the sequences of the primer pair used to amplify LIFR_42 have at least 95%, 96%, 97%, 98% or 99% similarity to the sequences set forth in SEQ ID NO. 11 and SEQ ID NO. 12, respectively.
In some embodiments, the primer pair for amplifying ZNF304_71 is as set forth in SEQ ID NO. 13 and SEQ ID NO. 14. In other embodiments, the sequences of the primer pair used to amplify ZNF304_71 have at least 95%, 96%, 97%, 98% or 99% similarity to the sequences set forth in SEQ ID NO. 13 and SEQ ID NO. 14, respectively.
In some embodiments, the detection reagent further includes other reagents for detecting the methylation level of the combination of DNA methylation sites. The other reagents may include reagents used in one or more of the following methods: whole genome bisulfite sequencing (WGBS), reduced representation bisulfite sequencing (RRBS), oxidative bisulfite sequencing (oxBS-seq), methylated DNA capture sequencing (MethylCap-seq), methyl binding protein sequencing (MBD-seq), methylated DNA immunoprecipitation sequencing (MeDIP-seq), high-performance liquid chromatography (HPLC), methylation-sensitive restriction fingerprinting (MSRF), methylation-sensitive amplification polymorphism (MSAP), methylation chip methods (e.g., 850K chips), pyrosequencing, digital PCR (dPCR), and methylation-specific PCR (MS-PCR). In some preferred embodiments, the other reagents may be reagents used in WGBS or RRBS.
According to a further aspect of the present description there is provided a kit for detecting colorectal tumours or predicting the risk of developing colorectal tumours, the kit comprising detection reagents for the combination of DNA methylation sites as shown in some embodiments of the present description.
According to a further aspect of the present description there is provided the use of a detection reagent for detecting the methylation level of a combination of DNA methylation sites as shown in some of the examples of the present description in the manufacture of a kit for detecting colorectal tumours or predicting the risk of developing colorectal tumours.
The experimental methods in the following examples are conventional methods unless otherwise specified. The test materials used in the following examples were purchased from conventional biochemical reagent suppliers unless otherwise specified. All quantitative tests in the following examples were performed in triplicate and the results were averaged.
Examples
Method and procedure
Collection of lavage fluid sample sets for DNA methylation detection assays
Colorectal lavage samples from 94 subjects were collected as a training sample set, comprising a colorectal tumor group (81 colorectal tumor patients in total) and a normal control group (13 healthy subjects in total). The colorectal tumor group further comprised a colorectal cancer group (56 colorectal cancer patients in total, including 15 in stage I, 20 in stage II, 15 in stage III, and 6 in stage IV) and a colon adenoma group (25 colon adenoma patients in total). After collection, each sample was stored in a 50 mL lavage fluid DNA storage tube containing 7.5 mL of additive and centrifuged at 4000 rpm for 10 min; the supernatant was discarded and the pellet was washed with 1× PBS.
DNA extraction from lavage fluid sample sets
To extract DNA from the lavage fluid sample set, 180 μL of Buffer GTL was added to the lavage fluid pellet and the pellet was resuspended; 20 μL of proteinase K was then added and mixed by vortexing. The mixture was incubated at 56 °C for 1 h until the sample was completely dissolved, and then incubated at 90 °C for a further 1 h. The tube was centrifuged briefly to collect liquid from the tube wall at the bottom of the tube. 200 μL of Buffer GL was added and mixed thoroughly by vortexing. 200 μL of absolute ethanol was added and mixed thoroughly by vortexing. The tube was again centrifuged briefly to collect liquid from the tube wall at the bottom of the tube.
The solution in the tube was transferred to a centrifuge tube fitted with a silica-matrix membrane. 500 μL of Buffer GW1 (with absolute ethanol added) was added to the membrane, the tube was centrifuged at 12000 rpm for 1 min, the waste liquid in the collection tube was discarded, and the membrane was placed back into the collection tube. 500 μL of Buffer GW2 (with absolute ethanol added) was added to the membrane, the tube was centrifuged at 12000 rpm for 1 min, the waste liquid in the collection tube was discarded, and the membrane was placed back into the collection tube. The tube was centrifuged at 12000 rpm for 2 min, the waste liquid in the collection tube was discarded, and the membrane was left at room temperature for several minutes to dry thoroughly.
The membrane was transferred to a new centrifuge tube, 50-200 μL of Buffer GE was added, and the tube was left at room temperature for 2-5 min and then centrifuged at 12000 rpm for 1 min. The DNA solution was collected and stored at -20 °C until use. The DNA concentration (which should be no less than 1 ng/μL) was determined using a Nano-300 micro-spectrophotometer and Qubit.
Bisulfite conversion of DNA from the lavage fluid sample set
Bisulfite conversion of the lavage fluid sample set: 50 μL of lavage fluid pellet DNA sample, 150 μL of Bisulfite Mix, and 25 μL of MBuffer B (protection solution) were added to a PCR tube. After brief centrifugation, the PCR tube was placed in a PCR instrument, incubated at 85 °C for 50 min, cooled to room temperature, and centrifuged briefly. The lavage fluid pellet DNA sample was taken from the DNA solution described above, and the 50 μL sample contained 20-1000 ng of DNA. Bisulfite Mix was prepared by adding 1.2 mL of MBuffer A (conversion solution) to a tube of sodium bisulfite dry powder and shaking until the powder was completely dissolved.
Purification of DNA after bisulfite treatment: the entire contents of the PCR tube were transferred to a 1.5 mL centrifuge tube. 285 μL of MBuffer C (binding solution), 115 μL of isopropanol, and 10 μL of magnetic bead suspension (thoroughly mixed before use) were added to the centrifuge tube, which was then shaken for 10 min. After brief centrifugation, the tube was placed on a magnetic rack for 2 min and the supernatant was discarded. With the tube kept on the magnetic rack, 1000 μL of MBuffer D (wash solution) was added, the tube was incubated for 30 s, and the supernatant was discarded. 1000 μL of MBuffer E (incubation solution) was added, and the tube was incubated at room temperature for 15 min, centrifuged briefly, and placed on the magnetic rack for 2 min, after which the supernatant was discarded. With the tube kept on the magnetic rack, 1000 μL of MBuffer D (wash solution) was added, the tube was incubated for 30 s, and the supernatant was discarded; this wash step was repeated once. The residual wash solution was aspirated completely, and the tube was placed on a clean bench to dry for 5 min.
Recovery of purified DNA for the lavage fluid sample set: 50 μL of MBuffer F (elution solution) was added to the centrifuge tube (warming at 56 °C helps improve elution efficiency), mixed thoroughly by vortexing, and left to stand for 5 min. The tube was centrifuged briefly and placed on the magnetic rack for 2 min. The supernatant was transferred to a clean new centrifuge tube, and the collected DNA solution was used as the DNA conversion sample and stored at -20 °C until use.
Multiplex PCR-NGS detection
The first round of PCR was performed on the 94 DNA conversion samples using the colorectal tumor methylation-specific primers shown in Table 1.
TABLE 1. Colorectal tumor methylation-specific primers
The reaction system of the first round of PCR comprised: 10× ACE buffer, 3 μL; dNTP Mix (10 mM), 1 μL; primer mix, 5 μL; TMAC (600 mM), 2.5 μL; 50% glycerol, 6 μL; 5× Enhancer, 2 μL; sterilized water, 5 μL; Ace Taq enzyme, 0.5 μL; DNA conversion sample (i.e., bisulfite-treated DNA), 5 μL.
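As a quick check on the recipe above, the following minimal sketch sums the listed per-reaction volumes and scales the non-template components into a master mix. The function names, the dictionary keys, and the idea of adding a pipetting overage are illustrative assumptions and are not part of the described method.

```python
# Per-reaction volumes (in microlitres) of the first-round PCR, as listed above.
FIRST_ROUND_UL = {
    "10x ACE buffer": 3.0,
    "dNTP Mix (10 mM)": 1.0,
    "primer mix": 5.0,
    "TMAC (600 mM)": 2.5,
    "50% glycerol": 6.0,
    "5x Enhancer": 2.0,
    "sterilized water": 5.0,
    "Ace Taq enzyme": 0.5,
    "DNA conversion sample": 5.0,
}


def total_volume(recipe):
    """Return the total volume of one reaction in microlitres."""
    return sum(recipe.values())


def master_mix(recipe, n_reactions, template_key, overage=0.10):
    """Scale every component except the template for n reactions.

    The overage fraction is a hypothetical allowance for pipetting loss.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in recipe.items()
        if name != template_key
    }


if __name__ == "__main__":
    print(total_volume(FIRST_ROUND_UL))
    print(master_mix(FIRST_ROUND_UL, 94, "DNA conversion sample"))
```

The listed components add up to 30 μL per reaction; the template is left out of the master mix because it differs for each sample.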
The reaction conditions for the first round of PCR were: 1) 1 cycle: 95 °C for 5 min; 2) 35 cycles: 95 °C for 30 s, 50 °C for 1 min, 72 °C for 30 s; 3) 1 cycle: 72 °C for 5 min.
The reaction system of the second round of PCR comprised: 10× ACE buffer, 3 μL; dNTP Mix (10 mM), 1 μL; primer AP5 (5 μM), 2 μL; index primer (5 μM), 2 μL; 50% glycerol, 6 μL; sterilized water, 10.5 μL; AceTaq enzyme, 0.5 μL; first-round PCR product, 5 μL. Primer AP5 has the sequence AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 31); the index primer has the sequence CAAGCAGAAGACGGCATACGAGATNNNNNNNNGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (SEQ ID NO: 32), where N is A, T, C or G and "NNNNNNNN" represents an index used to distinguish different samples.
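To illustrate how the 8-base index slots into the index primer, the sketch below substitutes a sample-specific index for the "NNNNNNNN" placeholder of SEQ ID NO: 32. The function name and the example index are hypothetical; the specification does not list the indexes actually used.

```python
# Index primer template from SEQ ID NO: 32; the run of eight N's is the index slot.
INDEX_PRIMER_TEMPLATE = (
    "CAAGCAGAAGACGGCATACGAGAT"
    "NNNNNNNN"
    "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"
)


def build_index_primer(index):
    """Return the index primer carrying the given 8-base sample index."""
    index = index.upper()
    if len(index) != 8 or set(index) - set("ACGT"):
        raise ValueError("index must be exactly 8 bases drawn from A, C, G, T")
    return INDEX_PRIMER_TEMPLATE.replace("N" * 8, index)


if __name__ == "__main__":
    # "ACGTACGT" is a made-up index used only for demonstration.
    print(build_index_primer("ACGTACGT"))
```

Each sample in a sequencing run would receive a different index so that its reads can be separated after sequencing.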
The reaction conditions for the second round of PCR were: 1) 1 cycle: 95 °C for 10 min; 2) 20 cycles: 95 °C for 30 s, 55 °C for 30 s, 72 °C for 30 s; 3) 1 cycle: 72 °C for 5 min.
The amplified products were purified with a nucleic acid purification reagent to obtain a sequencing library, which was then sequenced on a MiniSeq sequencer (Illumina) using the MiniSeq™ Mid Output Reagent Cartridge (Illumina, REF: 20001311, LOT No. 20660526). The sequencing depth at each methylation site was no less than 500×.
Data processing
Based on the NGS sequencing results, the methylation rates of the 174 DNA methylation sites in the 94 colorectal lavage samples were calculated using formula (1) above.
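Formula (1) is defined earlier in this specification and is not repeated here. Purely as an illustration, the sketch below assumes the commonly used definition of a per-site methylation rate, namely methylated reads divided by the total reads covering the site, and applies the 500× depth requirement stated above. The function name and the read counts are hypothetical.

```python
def methylation_rate(methylated_reads, unmethylated_reads, min_depth=500):
    """Return methylated / (methylated + unmethylated) for one site.

    Assumed definition, not necessarily formula (1) of the specification.
    Returns None when the site is covered by fewer than min_depth reads.
    """
    depth = methylated_reads + unmethylated_reads
    if depth < min_depth:
        return None
    return methylated_reads / depth


if __name__ == "__main__":
    print(methylation_rate(150, 850))  # site with 1000x coverage
    print(methylation_rate(10, 100))   # site below the depth requirement
```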
Example 1: analysis of the data relating to the methylation rates of colorectal tumor groups and normal control groups revealed that the methylation level of the differential methylation sites was significantly altered in colorectal tumor patients
The methylation rates of the 174 DNA methylation sites in the 94 colorectal lavage samples were examined. An F test was used to determine whether the methylation rates at each site had homogeneous variance between the colorectal tumor group and the normal control group. For sites whose methylation rates had homogeneous variance between the two groups, a two-tailed Student's t test for independent samples was used to determine whether the mean methylation rate differed significantly between the colorectal tumor group and the normal control group. For sites whose methylation rates had heterogeneous variance between the two groups, a two-tailed t' test for independent samples (the approximate t test for unequal variances) was used for the same purpose. Taking P < 0.001 and a between-group fold difference greater than 2 as the significance criteria, 17 of the 174 methylation sites were selected as differential methylation sites: NDRG4_40, SDC2_101, SEP9-1_69, SEP9-2_61, SFRP1_40, TFPI2_117, THBD_102, BCAT1_38, WIF1_68, WIF1_88, WNT5A_47, TFPI2-2_91, SDC2-2_56, SDC2-2_73, DNAAF9_41, LIFR_42 and ZNF304_71.
Example 2: differential methylation sites and application analysis of combinations thereof in colorectal tumor prediction
The ability of each individual differential methylation site to predict colorectal tumors was analyzed using ROC curves. Specifically, ROC curves were plotted from the methylation rates of each differential methylation site in the colorectal tumor group (e.g., labeled 1) and in the normal control group (e.g., labeled 0). The AUC values were as follows: NDRG4_40, 0.771 (FIG. 6); SDC2_101, 0.724 (FIG. 7); SEP9-1_69, 0.837 (FIG. 8); SEP9-2_61, 0.782 (FIG. 9); SFRP1_40, 0.791 (FIG. 10); TFPI2_117, 0.776 (FIG. 11); THBD_102, 0.751 (FIG. 12); BCAT1_38, 0.639 (FIG. 13); WIF1_68, 0.729 (FIG. 14); WIF1_88, 0.757 (FIG. 15); WNT5A_47, 0.758 (FIG. 16); TFPI2-2_91, 0.827 (FIG. 17); SDC2-2_56, 0.880 (FIG. 18); SDC2-2_73, 0.878 (FIG. 19); DNAAF9_41, 0.908 (FIG. 20); LIFR_42, 0.915 (FIG. 21); and ZNF304_71, 0.886 (FIG. 22).
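For readers who want to see what an AUC value measures, the sketch below computes it with the rank-based (Mann-Whitney) formulation: the probability that a randomly chosen tumor sample has a higher methylation rate than a randomly chosen control sample, with ties counted as one half. The specification does not state which software produced the reported AUC values, and the rates in the example are invented.

```python
def roc_auc(tumor_rates, control_rates):
    """Area under the ROC curve for one site, via pairwise comparison.

    Equivalent to the Mann-Whitney U statistic divided by the number of
    tumor/control pairs; ties contribute 0.5.
    """
    wins = 0.0
    for t in tumor_rates:
        for c in control_rates:
            if t > c:
                wins += 1.0
            elif t == c:
                wins += 0.5
    return wins / (len(tumor_rates) * len(control_rates))


if __name__ == "__main__":
    # Invented methylation rates for illustration only.
    print(roc_auc([0.2, 0.4, 0.6], [0.1, 0.3]))
```

An AUC of 1.0 means that every tumor sample has a higher methylation rate than every control sample, and 0.5 means the site does not separate the two groups at all.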
For each single differential methylation site, an appropriate methylation threshold was selected, and normal subjects and colorectal tumor patients in the training sample set were distinguished on the basis of that threshold. The methylation threshold of each differential methylation site was set according to the methylation rate at a specificity of 100%. The selected thresholds, and the sensitivity of colorectal tumor prediction on the training sample set obtained with each threshold, are as follows: NDRG4_40, threshold 0.1875, sensitivity 30.4%; SDC2_101, threshold 0.7297, sensitivity 28.6%; SEP9-1_69, threshold 0.3713, sensitivity 51.8%; SEP9-2_61, threshold 0.4613, sensitivity 26.8%; SFRP1_40, threshold 0.5920, sensitivity 19.6%; TFPI2_117, threshold 0.7110, sensitivity 12.5%; THBD_102, threshold 0.1094, sensitivity 46.4%; BCAT1_38, threshold 0.3462, sensitivity 21.4%; WIF1_68, threshold 0.2983, sensitivity 39.3%; WIF1_88, threshold 0.2726, sensitivity 42.9%; WNT5A_47, threshold 0.0678, sensitivity 21.4%; TFPI2-2_91, threshold 0.2144, sensitivity 55.4%; SDC2-2_56, threshold 0.0566, sensitivity 69.6%; SDC2-2_73, threshold 0.0582, sensitivity 66.1%; DNAAF9_41, threshold 0.0172, sensitivity 50.0%; LIFR_42, threshold 0.0407, sensitivity 62.5%; ZNF304_71, threshold 0.0959, sensitivity 53.6%.
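One way to obtain a threshold at 100% specificity can be sketched as follows. Assuming a sample is called positive when its methylation rate is not lower than the threshold, the sketch picks the smallest tumor-group rate that lies strictly above every control-group rate, so that no control is called positive. This selection rule, the function names, and the example rates are assumptions for illustration; the specification states only that the threshold corresponds to a specificity of 100%.

```python
def threshold_at_full_specificity(tumor_rates, control_rates):
    """Smallest tumor rate strictly above all control rates, or None.

    With the rule "positive if rate >= threshold", every control falls
    below this threshold, so specificity on the training data is 100%.
    """
    max_control = max(control_rates)
    candidates = [rate for rate in tumor_rates if rate > max_control]
    return min(candidates) if candidates else None


def sensitivity(tumor_rates, threshold):
    """Fraction of tumor samples whose rate is not lower than the threshold."""
    positives = sum(1 for rate in tumor_rates if rate >= threshold)
    return positives / len(tumor_rates)


if __name__ == "__main__":
    tumor = [0.02, 0.05, 0.30, 0.45]   # invented rates
    control = [0.01, 0.04]             # invented rates
    cutoff = threshold_at_full_specificity(tumor, control)
    print(cutoff, sensitivity(tumor, cutoff))
```

Fixing specificity at 100% trades away single-site sensitivity, which is why the per-site sensitivities listed above are modest and a combination of sites is considered next.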
Based on the methylation thresholds of the 17 selected differential methylation sites, screening models combining 2-17 differential methylation sites were built to further improve sensitivity. Specifically, for each differential methylation site used in a model, the site is judged positive if its methylation rate in the sample is not lower than the corresponding threshold, and negative if its methylation rate in the sample is lower than the corresponding threshold. Across all differential methylation sites used in the model, if one or more sites are judged positive, the subject from whom the sample was taken is predicted to possibly have, or be at risk of developing, a colorectal tumor; otherwise, that possibility or risk may be excluded.
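The decision rule described above can be sketched as a small function. The three thresholds in the example are the values given above for SDC2-2_56, DNAAF9_41 and LIFR_42; the function name and the sample methylation rates are hypothetical.

```python
def predict_positive(sample_rates, thresholds):
    """Apply the any-site-positive rule to one sample.

    A site is positive when its methylation rate is not lower than its
    threshold; the sample is positive when at least one site is positive.
    Returns (is_positive, list_of_positive_sites).
    """
    positive_sites = [
        site
        for site, cutoff in thresholds.items()
        if sample_rates[site] >= cutoff
    ]
    return len(positive_sites) >= 1, positive_sites


if __name__ == "__main__":
    thresholds = {"SDC2-2_56": 0.0566, "DNAAF9_41": 0.0172, "LIFR_42": 0.0407}
    # Invented methylation rates for one sample.
    sample = {"SDC2-2_56": 0.0100, "DNAAF9_41": 0.0200, "LIFR_42": 0.0000}
    print(predict_positive(sample, thresholds))
```

Because every single-site threshold was chosen at 100% specificity on the training set, combining sites with a logical OR keeps training-set specificity at 100% while sensitivity can only stay the same or rise.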
Screening models were established by combining 2-17 sites, and colorectal cancer prediction was performed with each screening model on the colorectal cancer group samples and the normal control group samples in the training sample set. Table 2 shows the sensitivity range of the screening models established for each of the 2-17 site combinations in predicting colorectal cancer, together with the site combinations giving the highest and the lowest sensitivity.
TABLE 2. Sensitivity ranges of the screening models corresponding to 2-17 site combinations in predicting colorectal cancer
As can be seen from Table 2, for the screening models predicting colorectal cancer, the minimum of the sensitivity range tends to increase as the number of combined sites increases, and the maximum of the sensitivity range reaches its peak once the number of combined sites reaches 5. In particular, screening model 1, built on the 5-site combination of NDRG4_40, WIF1_68, SDC2-2_56, DNAAF9_41 and ZNF304_71, achieves a sensitivity of 94.64% with the smallest number of sites.
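The combination analysis summarized in Table 2 amounts to an exhaustive search over site subsets. The following minimal sketch, under the assumption that each site has already been called positive or negative in every patient sample, enumerates all k-site combinations and reports the lowest and highest sensitivity together with the combinations that produce them. The function name, the input layout, and the toy data are illustrative and are not taken from the specification.

```python
from itertools import combinations


def sensitivity_range(calls, k):
    """Lowest and highest sensitivity over all k-site combinations.

    calls maps each site name to a list of booleans, one per patient
    sample (True when the site is positive in that sample). A sample is
    detected by a combination if any site in the combination is positive.
    Returns ((min_sensitivity, combo), (max_sensitivity, combo)).
    """
    n_samples = len(next(iter(calls.values())))
    results = []
    for combo in combinations(sorted(calls), k):
        detected = sum(
            1 for i in range(n_samples) if any(calls[site][i] for site in combo)
        )
        results.append((detected / n_samples, combo))
    return min(results), max(results)


if __name__ == "__main__":
    # Toy data: four hypothetical sites called in four patient samples.
    toy_calls = {
        "A": [True, False, False, False],
        "B": [False, True, False, False],
        "C": [True, True, False, False],
        "D": [False, False, True, False],
    }
    print(sensitivity_range(toy_calls, 2))
```

With 17 candidate sites, the subsets of size 2 to 17 number 2^17 - 18 = 131054 in total, so a complete enumeration of this kind is computationally inexpensive.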
Screening models were likewise established by combining 2-17 sites, and colon adenoma prediction was performed with each screening model on the colon adenoma group samples and the normal control group samples in the training sample set. Table 3 shows the sensitivity range of the screening models established for each of the 2-17 site combinations in predicting colon adenomas, together with the site combinations giving the highest and the lowest sensitivity.
TABLE 3. Sensitivity ranges of the screening models corresponding to 2-17 site combinations in predicting colon adenomas
As can be seen from Table 3, for the screening models predicting colon adenoma, the minimum of the sensitivity range tends to increase as the number of combined sites increases, and the maximum of the sensitivity range reaches its peak once the number of combined sites reaches 5. In particular, screening model 2, built on the 5-site combination of NDRG4_40, THBD_102, SDC2-2_56, DNAAF9_41 and LIFR_42, achieves a sensitivity of 91.3% with the smallest number of sites.
The sites used in screening model 1 were then combined with the sites used in screening model 2 to build a new screening model with broader predictive coverage. Specifically, NDRG4_40, THBD_102, WIF1_68, SDC2-2_56, DNAAF9_41, LIFR_42 and ZNF304_71 were selected as target methylation sites, and screening model 3, based on this 7-site combination, was established.
In the prediction of colorectal tumors in the training sample set, the overall sensitivity of screening model 3 was 93.8% and the overall specificity was 100%.
Colorectal cancer prediction was performed using screening model 3 on the colorectal cancer group samples and the normal control group samples in the training sample set. The results showed that screening model 3 identified 53 positive samples among the 56 colorectal cancer samples, with a sensitivity of 94.6% and a specificity of 100%. By stage, 13 positive samples were identified among the 15 stage I colorectal cancer samples (sensitivity 86.67%); 19 positive samples were identified among the 20 stage II colorectal cancer samples (sensitivity 95%); 15 positive samples were identified among the 15 stage III colorectal cancer samples (sensitivity 100%); and 6 positive samples were identified among the 6 stage IV colorectal cancer samples (sensitivity 100%).
Based on the colon adenoma group samples and the normal control group samples in the training sample set, a screening model 3 was used for colon adenoma prediction. The results show that screening model 3 identified 23 positive samples from 25 colon adenoma samples with a sensitivity of 92% and a specificity of 100%.
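The reported figures for screening model 3 can be cross-checked with simple arithmetic. The sketch below recomputes the stage-wise, colorectal cancer, colon adenoma, and overall colorectal tumor sensitivities from the positive and total sample counts stated above; the variable and function names are illustrative.

```python
# (positive samples, total samples) for screening model 3, as reported above.
STAGE_COUNTS = {"I": (13, 15), "II": (19, 20), "III": (15, 15), "IV": (6, 6)}
ADENOMA_COUNTS = (23, 25)


def sensitivity_pct(positives, total, digits=1):
    """Sensitivity as a percentage rounded to the given number of digits."""
    return round(100.0 * positives / total, digits)


cancer_positives = sum(pos for pos, _ in STAGE_COUNTS.values())
cancer_total = sum(total for _, total in STAGE_COUNTS.values())
tumor_positives = cancer_positives + ADENOMA_COUNTS[0]
tumor_total = cancer_total + ADENOMA_COUNTS[1]

if __name__ == "__main__":
    print("stage I:", sensitivity_pct(*STAGE_COUNTS["I"], digits=2))
    print("colorectal cancer:", sensitivity_pct(cancer_positives, cancer_total))
    print("colon adenoma:", sensitivity_pct(*ADENOMA_COUNTS))
    print("colorectal tumor:", sensitivity_pct(tumor_positives, tumor_total))
```

The stage counts sum to 53 of 56 colorectal cancer samples (94.6%), and adding the 23 of 25 adenoma samples gives 76 of 81 colorectal tumor samples (93.8%), consistent with the overall sensitivity stated for screening model 3.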
In summary, the DNA methylation site combination and the corresponding screening model of the embodiments of the present disclosure have good sensitivity and specificity in predicting colorectal tumors, and can be used to realize accurate, rapid, noninvasive clinical screening or prediction of colorectal tumors.
While the basic concepts have been described above, it will be apparent to those skilled in the art that the foregoing detailed disclosure is presented by way of example only and is not intended to be limiting. Although not explicitly stated herein, those skilled in the art may make various modifications, improvements, and adaptations to this specification. Such modifications, improvements, and adaptations are suggested by this specification and therefore remain within the spirit and scope of the exemplary embodiments of this specification.
Meanwhile, the specification uses specific words to describe the embodiments of the specification. Reference to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic is associated with at least one embodiment of the present description. Thus, it should be emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various positions in this specification are not necessarily referring to the same embodiment. Furthermore, certain features, structures, or characteristics of one or more embodiments of the present description may be combined as suitable.
Some embodiments use numbers to describe quantities of components and attributes. It should be understood that such numbers, as used in describing the embodiments, are in some examples modified by the qualifiers "about," "approximately," or "substantially." Unless otherwise indicated, "about," "approximately," or "substantially" indicates that the stated number allows for a 20% variation. Accordingly, in some embodiments, the numerical parameters set forth in the specification and claims are approximations that may vary depending on the desired properties of the individual embodiments. In some embodiments, the numerical parameters should take into account the specified number of significant digits and apply a general digit-retention method. Although the numerical ranges and parameters used to define the breadth of the ranges in some embodiments are approximations, in particular embodiments such numerical values are set as precisely as practicable.
Each patent, patent application, patent application publication, and other material referred to in this specification, such as articles, books, specifications, publications, and documents, is incorporated herein by reference in its entirety. Excluded are application history documents that are inconsistent with or conflict with the content of this specification, as well as documents (currently or later attached to this specification) that limit the broadest scope of the claims of this specification. It is noted that, if the description, definition, and/or use of a term in material attached to this specification is inconsistent with or conflicts with the content of this specification, the description, definition, and/or use of the term in this specification controls.
Finally, it should be understood that the embodiments described in this specification are merely illustrative of the principles of the embodiments of this specification. Other variations are possible within the scope of this description. Thus, by way of example, and not limitation, alternative configurations of embodiments of the present specification may be considered as consistent with the teachings of the present specification. Accordingly, the embodiments of the present specification are not limited to only the embodiments explicitly described and depicted in the present specification.

Claims (11)

  1. Use of a detection reagent for a combination of DNA methylation sites for the preparation of a kit for detecting colorectal tumours or predicting the risk of colorectal tumour diseases, characterized in that said combination of DNA methylation sites consists of:
    A locus NDRG4_40 with chromosome coordinates of chr16:58497406 on the NDRG4 gene;
    A locus THBD_102 with chromosome coordinates of chr20:23031082 on the THBD gene;
    A locus WIF1_68 with chromosome coordinates of chr12:65515031 on the WIF1 gene;
    A locus SDC2-2_56 with chromosome coordinates of chr8:97505785 on the SDC2-2 gene;
    A locus DNAAF9_41 with chromosome coordinates of chr20:3388892 on the DNAAF9 gene;
    A locus LIFR_42 with chromosome coordinates of chr5:38557321 on the LIFR gene;
    A locus ZNF304_71 with chromosome coordinates of chr19:57862624 on the ZNF304 gene;
    Wherein the chromosomal coordinates of the corresponding sites are derived from the human reference genome GRCh37.
  2. The use of claim 1, wherein the detection reagent comprises a primer set for amplifying the combination of DNA methylation sites, wherein:
    the primer pair for amplifying NDRG4_40 is shown as SEQ ID NO.1 and SEQ ID NO. 2;
    the primer pair for amplifying THBD_102 is shown as SEQ ID NO. 3 and SEQ ID NO. 4;
    The primer pair for amplifying WIF1_68 is shown as SEQ ID NO. 5 and SEQ ID NO. 6;
    Primer pairs for amplifying SDC2-2_56 are shown as SEQ ID NO. 7 and SEQ ID NO. 8;
    The primer pair for amplifying DNAAF9_41 is shown as SEQ ID NO. 9 and SEQ ID NO. 10;
    The primer pair for amplifying LIFR_42 is shown as SEQ ID NO. 11 and SEQ ID NO. 12;
    the primer pair for amplifying ZNF304_71 is shown as SEQ ID NO. 13 and SEQ ID NO. 14.
  3. The use of claim 1, wherein the detection reagent is used to detect the methylation level of the combination of DNA methylation sites, and wherein the method of detecting colorectal tumor or predicting colorectal tumor risk comprises:
    obtaining the methylation level of said combination of DNA methylation sites in a biological sample of a subject;
    Based on the methylation levels of the combination of DNA methylation sites, a screening model is used to assess whether the subject is likely to have a colorectal tumor or the subject's risk of developing a colorectal tumor.
  4. The use of claim 3, wherein the screening model is a model based on methylation thresholds of the combination of DNA methylation sites.
  5. The use of claim 3, wherein the evaluating comprises:
    For each DNA methylation site in the DNA methylation site combination, comparing the methylation rate of the DNA methylation site to a methylation threshold value corresponding to the DNA methylation site, determining the number of positive sites of the DNA methylation site combination;
    an evaluation result is obtained based on the number of positive sites, wherein the number of positive sites being ≥1 indicates that the subject may have a colorectal tumor or that the subject is at risk of developing a colorectal tumor.
  6. The use of claim 4, wherein the methylation threshold value of the DNA methylation site is determined by the following method:
    Obtaining a training sample set comprising known methylation rates of the DNA methylation sites for colorectal tumor patients and non-colorectal tumor patients;
    Analyzing the training sample set using ROC curves to determine cut-off values for distinguishing the colorectal tumor patient from the non-colorectal tumor patient, the cut-off values being used as methylation thresholds for the DNA methylation sites;
    Wherein the cut-off value is selected from methylation rates with specificity of 95% -100%.
  7. The use of claim 4, wherein the methylation threshold of NDRG4_40 is 0.1857; the methylation threshold of THBD_102 is 0.1094; the methylation threshold of WIF1_68 is 0.2983; the methylation threshold of SDC2-2_56 is 0.0566; the methylation threshold of DNAAF9_41 is 0.0172; the methylation threshold of LIFR_42 is 0.0407; the methylation threshold of ZNF304_71 is 0.0959.
  8. The use according to claim 3, wherein the colorectal neoplasm comprises colorectal cancer and colorectal adenoma.
  9. The use of claim 3, wherein the biological sample is from a colorectal lavage of the subject.
  10. A kit for detecting colorectal neoplasms or predicting the risk of colorectal neoplasms, comprising the detection reagent of claim 2.
  11. An apparatus for detecting colorectal neoplasms or predicting colorectal neoplasm risk, the apparatus comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method of:
    Obtaining the methylation level of the combination of DNA methylation sites of claim 1 in a biological sample of a subject;
    Based on the methylation levels of the combination of DNA methylation sites, a screening model is used to assess whether the subject is likely to have a colorectal tumor or the subject's risk of developing a colorectal tumor.
CN202311361229.3A 2023-10-19 2023-10-19 DNA methylation site combination as colorectal tumor marker and application thereof Active CN117551762B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311361229.3A CN117551762B (en) 2023-10-19 2023-10-19 DNA methylation site combination as colorectal tumor marker and application thereof

Publications (2)

Publication Number Publication Date
CN117551762A CN117551762A (en) 2024-02-13
CN117551762B true CN117551762B (en) 2024-05-10

Family

ID=89817514

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311361229.3A Active CN117551762B (en) 2023-10-19 2023-10-19 DNA methylation site combination as colorectal tumor marker and application thereof

Country Status (1)

Country Link
CN (1) CN117551762B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106232833A (en) * 2014-01-30 2016-12-14 加利福尼亚大学董事会 The haplotyping that methylates (MONOD) for non-invasive diagnostic
WO2018087129A1 (en) * 2016-11-08 2018-05-17 Region Nordjylland, Aalborg University Hospital Colorectal cancer methylation markers
WO2022022386A1 (en) * 2020-07-29 2022-02-03 上海吉凯医学检验所有限公司 Dna methylation marker for early colorectal cancer and adenomas, method for detecting same, and application thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011126768A2 (en) * 2010-03-29 2011-10-13 Mayo Foundation For Medical Education And Research Methods and materials for detecting colorectal cancer and adenoma
US20170356051A1 (en) * 2014-10-17 2017-12-14 Tohoku University Method for estimating sensitivity to drug therapy for colorectal cancer

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MA,L.等.A Novel Stool Methylation Test for the Non-Invasive Screening of Gastric and Colorectal Cancer.frontiers in Oncology.2022,第12卷第860701篇. *

Also Published As

Publication number Publication date
CN117551762A (en) 2024-02-13

Similar Documents

Publication Publication Date Title
CN110964826B (en) Colorectal cancer suppressing gene methylation high-throughput detection kit and application thereof
JP2024069295A (en) Cell-free DNA for assessing and/or treating cancer - Patents.com
TWI647312B (en) Method for screening gene markers of intestinal cancer, gene markers screened by the method, and uses thereof
CN112992354B (en) Method and system for evaluating colorectal cancer metastasis and recurrence risk and dynamically monitoring based on methyl marker combination
LaPointe et al. Discovery and validation of molecular biomarkers for colorectal adenomas and cancer with application to blood testing
WO2015073949A1 (en) Method of subtyping high-grade bladder cancer and uses thereof
US20230183807A1 (en) Methylation status of gasdermin e gene as cancer biomarker
WO2023226938A1 (en) Methylation biomarker, kit and use
Hu et al. Integrated 5-hydroxymethylcytosine and fragmentation signatures as enhanced biomarkers in lung cancer
Wang et al. Clinicopathological and molecular characterization of biphasic hyalinizing psammomatous renal cell carcinoma: further support for the newly proposed entity
Mou et al. Gene expression analysis reveals a 5-gene signature for progression-free survival in prostate cancer
CN117551762B (en) DNA methylation site combination as colorectal tumor marker and application thereof
JP7008696B2 (en) Cancer DNA methylation signatures in host peripheral blood mononuclear cells and T cells
JP4886976B2 (en) Prognosis of colorectal cancer
Kwon et al. Advances in methylation analysis of liquid biopsy in early cancer detection of colorectal and lung cancer
CN110408706A (en) Biomarker for assessing recurrent nasopharyngeal carcinoma and application thereof
Shao et al. Cell‐free DNA 5‐hydroxymethylcytosine as a marker for common cancer detection
CN116083588B (en) DNA methylation site combination as prostate cancer marker and application thereof
CN116987788B (en) Method and kit for detecting early lung cancer by using flushing liquid
CN104962612B (en) BRCA1 gene g.41256139delT frameshift mutation and its application in preparing a kit for auxiliary diagnosis of breast cancer
JP7471601B2 (en) Molecular signatures and their use for identifying low-grade prostate cancer
Tomiyama et al. Urinary markers for bladder cancer diagnosis: A review of current status and future challenges
CN113811621A (en) Method for determining RCC subtype
CN113005198B (en) Kit for detecting 15 gene mutation sites related to sensitivity of radiotherapy and chemotherapy of rectal cancer and application thereof
Miyake et al. Heterogeneity of colorectal cancers and extraction of discriminator gene signatures for personalized prediction of prognosis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant