CN110195107A - The rDNA methylation markers of cancer detection and its application in peripheral blood - Google Patents
- Publication number
- CN110195107A (application number CN201910445136.6A)
- Authority
- CN
- China
- Prior art keywords
- cancer
- site cpg
- marker
- cpg
- sequencing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/154—Methylation markers
Abstract
The present invention provides ribosomal DNA (rDNA) methylation markers for detecting cancer in peripheral blood, and applications thereof. The markers include at least one CpG site selected from the following, with positions given relative to the human ribosomal DNA repeat unit reference sequence U13369.1: the CpG sites or modified CpG sites at positions 38974, 37148, 37013, 37028, 37076, 32936, 21740, 23407, 34657, 28277, and others. Systems and kits that use any combination of these markers for cancer diagnosis are also provided. The methylation state of these markers differs markedly between tumor tissue and non-tumor tissue, with hypomethylation in tumor tissue. In the test sets, combinations of these markers distinguished whether patients had liver cancer, lung cancer, or colorectal cancer with areas under the ROC curve of 96%, 94%, and 92%, respectively.
Description
Technical field
The invention belongs to the field of biological detection. It relates to a marker for cancer detection or postoperative evaluation and its application, and in particular to DNA methylation markers for diagnosing cancer from peripheral blood and their application.
Background
Testing peripheral blood for disease is a minimally invasive or even noninvasive approach. Peripheral blood contains cell-free DNA, which is released into the blood from apoptotic cells; analyzing cell-free DNA can therefore reveal certain problems arising in the body.
DNA methylation is an important part of epigenetics and plays a vital role in gene regulation. Existing research shows that the occurrence of cancer is closely related to the methylation level of genomic DNA, which makes it feasible to detect cancer by identifying changes in DNA methylation. DNA methylation is the process in which, under the catalysis of DNA methyltransferases and with S-adenosylmethionine as the methyl donor, a methyl group is transferred to a specific base. In mammals, DNA methylation occurs mainly on the C of CpG, producing 5-methylcytosine.
More than 98% of the CpG sites in the genome are distributed in repetitive sequences with transposition potential. In normal cells these CpGs are highly methylated and transcriptionally silent, whereas in tumor cells they undergo widespread demethylation, which leads to transcription of the repetitive sequences and activation of transposons and increases genomic instability. The remaining CpGs, about 2% of the total, are densely distributed in the CpG islands of gene promoter regions. Screening for aberrantly methylated sites that are specific to cancer tissue aids cancer detection.
Cancer seriously threatens human life and health. Because existing markers have poor specificity, many cancer patients are already at an intermediate or advanced stage when diagnosed and have lost the chance of radical resection. Finding highly sensitive peripheral blood methylation markers for cancer is therefore of great significance for early detection and early treatment.
Summary of the invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art. To this end, one object of the invention is to provide a marker that can be used for the early diagnosis and postoperative evaluation of cancer.
In the course of their research, the inventors found that rDNA plays an extremely important role in life processes. Transcripts of rDNA account for 80% of cellular RNA output and are essential for protein translation. Studies have shown that the transcriptional regulation of rDNA becomes abnormal in cancer, so studying aberrant rDNA methylation in cancer helps identify markers for cancer detection. However, rDNA methylation is not covered in conventional analyses. In addition, an autosomal locus has only 2 copies in the human genome, whereas rDNA has about 400 copies, so even at shallow sequencing depth there is enough data to analyze the methylation status of individual CpG sites on rDNA. Searching rDNA for cancer detection markers therefore has both a theoretical basis and a practical basis.
Specifically, the present invention provides the following technical solutions:
According to the first aspect of the invention, the invention provides a marker for cancer. The marker includes at least one CpG site selected from the following, with positions given relative to the human ribosomal DNA repeat unit reference sequence U13369.1: the CpG sites or modified CpG sites at positions 38974, 37148, 37013, 37028, 37076, 32936, 21740, 23407, 34657, 28277, 38982, 21695, 19811, 32927, 32906, 32920, 36988, 32964, 30940, 19819, 39004, 31075, 38940, 19843, 18745, 33830, 31663, 21709, 23623, 30639, 32931, 18727, 37206, 38980, 21309, 30630, 18737, 38596, 30647, 37072, 37162, 30936, 33838, 36193, 36218, 34719, 40812, 37088, 15596, 22970, 18200, 19952, 14972, 21194, and 23926, and at positions 24167, 36996, 37361, 37178, 39913, 30789, 37020, 36832, 34428, or 34805. The methylation state of these markers differs markedly between tumor tissue and non-tumor tissue, with hypomethylation in tumor tissue, so the markers can be used for the early diagnosis of cancer and the prediction of postoperative recurrence. Taking liver cancer, lung cancer, and colorectal cancer as examples, combinations of these markers distinguished whether patients had liver cancer, lung cancer, or colorectal cancer in the test sets with areas under the ROC curve of 96%, 94%, and 92%, respectively, which indicates high accuracy.
According to embodiments of the invention, the marker for cancer described above may further include the following technical features:
In some embodiments of the invention, with positions given relative to the human ribosomal DNA repeat unit reference sequence U13369.1, the marker includes at least one of the CpG sites or modified CpG sites at positions 38974, 37148, 37013, 37028, 37076, 32936, 21740, 23407, 34657, and 28277; and at least one of the CpG sites or modified CpG sites at positions 38982, 21695, 19811, 32927, 32906, 32920, 36988, 32964, 30940, 19819, 39004, 31075, 38940, 19843, 18745, 33830, 31663, 21709, 23623, 30639, 32931, 18727, 37206, 38980, 21309, 30630, 18737, 38596, 30647, 37072, 37162, 30936, 33838, 36193, 36218, 34719, 40812, 37088, 15596, 22970, 18200, 19952, 14972, 21194, and 23926.
In some embodiments of the invention, with positions given relative to the human ribosomal DNA repeat unit reference sequence U13369.1, the marker includes at least one of the CpG sites or modified CpG sites at positions 38974, 37148, 37013, 37028, 37076, and 32936; and at least one of the CpG sites or modified CpG sites at positions 21740, 23407, 34657, and 28277.
In some embodiments of the invention, with positions given relative to the human ribosomal DNA repeat unit reference sequence U13369.1, the marker includes at least two of the CpG sites or modified CpG sites at positions 38974, 37148, 37013, 37028, 37076, and 32936.
In some embodiments of the invention, with positions given relative to the human ribosomal DNA repeat unit reference sequence U13369.1, the marker includes at least one of the CpG sites or modified CpG sites at positions 38974, 37148, 37013, 37028, 37076, and 32936; and at least two of the CpG sites or modified CpG sites at positions 21740, 23407, 34657, and 28277.
In some embodiments of the invention, the modified CpG site carries a 5-methylation modification or a 5-hydroxymethylation modification.
According to the second aspect of the invention, the invention provides a primer sequence that takes the nucleotide sequence containing the marker of the first aspect of the invention as its target sequence and is used for specific amplification of the target sequence.
According to the third aspect of the invention, the invention provides a probe that is free in solution or immobilized on a chip and that can specifically capture the nucleotide sequence containing the marker of the first aspect of the invention.
According to the fourth aspect of the invention, the invention provides a kit for diagnosing cancer, the kit containing reagents for detecting the marker of the first aspect of the invention.
In some embodiments of the invention, the kit further comprises the primer sequence of any embodiment of the second aspect of the invention or the probe of the third aspect of the invention.
According to the fifth aspect of the invention, the invention provides the use of a marker, a primer sequence, or a probe in the preparation of a cancer diagnostic kit, wherein the marker is the marker of the first aspect of the invention, the primer sequence is the primer sequence of the second aspect of the invention, and the probe is the probe of the third aspect of the invention.
In some embodiments of the invention, the cancer includes liver cancer, lung cancer, colorectal cancer, breast cancer, nasopharyngeal carcinoma, and/or head and neck cancer.
According to the sixth aspect of the invention, the invention provides a method for determining the methylation of target sites in a test sample, the target sites being the CpG sites in the marker of any embodiment of the first aspect of the invention. The method comprises: (1) performing conversion treatment on the cell-free DNA in the peripheral blood of the test sample so that unmethylated cytosines are converted to thymine, obtaining a converted sample; (2) constructing a sequencing library from the converted sample and sequencing it to obtain sequencing data; (3) aligning the sequencing data to a reference sequence and determining the methylation result of the target sites in the sequencing data from the alignment result.
According to embodiments of the invention, the method described above for determining the methylation of target sites in a test sample may further include the following technical features:
In some embodiments of the invention, the reference sequence is the human ribosomal DNA repeat unit reference sequence U13369.1.
In some embodiments of the invention, the sequencing is performed by a second-generation or third-generation sequencing method. Existing second-generation or third-generation sequencing methods can be used to measure the methylation results of the CpG sites in the test sample.
In some embodiments of the invention, the sequencing is performed with at least one selected from HiSeq 2000, SOLiD, 454, and single-molecule sequencing devices.
According to the seventh aspect of the invention, the invention provides a system for diagnosing cancer or predicting the risk of cancer recurrence, comprising: a conversion treatment device for performing conversion treatment on the cell-free DNA in the peripheral blood of a test sample so that unmethylated cytosines are converted to thymine, obtaining a converted sample; a sequencing device, connected to the conversion treatment device, which constructs a sequencing library from the converted sample and sequences it to obtain sequencing data; an alignment device, connected to the sequencing device, which aligns the sequencing data to a reference sequence and determines the methylation results of the CpG sites in the marker from the alignment result; and a result determination device, connected to the alignment device, which analyzes the methylation results of the CpG sites in the marker with a statistical model to determine whether the test sample has cancer, or to predict whether the test sample is susceptible to cancer or whether cancer recurs after surgery; wherein the marker is the marker of any embodiment of the first aspect of the invention.
According to embodiments of the invention, the diagnostic system described above may further include the following technical features:
In some embodiments of the invention, the reference sequence is the human ribosomal DNA repeat unit reference sequence U13369.1.
In some embodiments of the invention, the statistical model is a multivariate statistical model. A multivariate statistical model can analyze the relationship between the methylation status of multiple CpG sites and cancer, so that the methylation results of the CpG sites can be used to determine disease status and achieve rapid, early diagnosis of cancer.
In some embodiments of the invention, the statistical model is built from the methylation results of the CpG sites in a plurality of cancer patients and a plurality of non-cancer individuals, the CpG sites being the markers of any embodiment of the first aspect of the invention.
In some embodiments of the invention, the multivariate statistical model is at least one of a logistic regression model, a random forest model, and an SVM model, preferably a logistic regression model. A regression model is a mathematical model that quantitatively describes a statistical relationship, that is, a model for studying how one variable depends on another. Regression analysis can be used to study the relationship between cancer and the methylation results of individual CpG sites or of multiple CpG sites, so that the disease status of a test sample can be determined from the methylation results of its CpG sites. As a generalized linear regression model, the logistic regression model can accurately capture the relationship between disease and the variables.
In some embodiments of the invention, the alignment is performed with the software bs-seeker2, using local alignment as the alignment mode. bs-seeker2 was chosen because it supports a 'local alignment' mode; this mode increases the proportion of reads that map back to the reference sequence and improves the robustness of the analysis results.
The beneficial effects of the invention are as follows. Using the CpG sites provided by the invention, individually or in combination, as markers, cancer can be diagnosed from a patient's peripheral blood sample by detecting the methylation state of part of the ribosomal DNA sequence, so cancer can be diagnosed promptly in a noninvasive or minimally invasive manner. Cancer detection with the markers provided by the invention has very high specificity and sensitivity, and because these markers have many copies in the genome, high-precision detection can be achieved with fewer markers.
Brief description of the drawings
Fig. 1 is a ROC curve plot for the detection of liver cancer in peripheral blood using the rDNA sites provided according to an embodiment of the invention.
Fig. 2 is a ROC curve plot for the detection of lung cancer, breast cancer, and nasopharyngeal carcinoma in peripheral blood using the rDNA sites provided according to an embodiment of the invention.
Fig. 3 is a ROC curve plot for the detection of lung cancer from peripheral blood RRBS data using the rDNA sites provided according to an embodiment of the invention.
Fig. 4 is a ROC curve plot for the detection of lung cancer from peripheral blood RRBS data using the rDNA sites provided according to an embodiment of the invention.
Fig. 5 shows the postoperative assessment of liver cancer from peripheral blood using the rDNA sites provided according to an embodiment of the invention.
Fig. 6 is a schematic structural diagram of the system for diagnosing cancer or predicting cancer risk provided according to an embodiment of the invention.
Detailed description of the embodiments
Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference labels denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the invention and should not be construed as limiting it.
To help those skilled in the art understand the text, certain terms used herein are explained below. These explanations are intended only to aid understanding of the invention and should not be regarded as limiting its scope of protection.
Herein, a CpG site denotes a dinucleotide in which the base guanine (G) immediately follows cytosine (C); CpG is an abbreviation of cytosine (C)-phosphate (p)-guanine (G).
Herein, a "marker" is something that can be used to indicate that a subject has cancer. A marker can be a nucleic acid sequence, a macromolecule, a small molecule, and so on; for example, it can be a nucleic acid sequence of a certain length, the nucleotide at one specific site, or the nucleotides at two specific sites, as long as it can be used to indicate that the subject has liver cancer. According to embodiments of the invention, the markers provided by the invention are CpG sites that can be used to detect or diagnose whether a subject has cancer, whether the subject is susceptible to cancer, or whether cancer recurs after surgery.
The invention provides markers that can be used to detect cancer, and their applications. These markers were screened from the human ribosomal DNA reference sequence. The invention reveals regions of human ribosomal DNA with aberrant methylation and identifies 10 CpG sites that detect or diagnose cancer from peripheral blood DNA with relatively high accuracy, together with 45 CpG sites whose accuracy in predicting or diagnosing cancer is slightly lower. The methylation state of these regions differs markedly between tumor tissue and non-tumor tissue and can be used to reveal the disease status of a tested sample.
According to one aspect of the invention, the invention provides a marker for cancer. With positions given relative to the human ribosomal DNA repeat unit reference sequence U13369.1, the marker is selected from at least one of the CpG sites or modified CpG sites at positions 38974, 37148, 37013, 37028, 37076, 32936, 21740, 23407, 34657, 28277, 38982, 21695, 19811, 32927, 32906, 32920, 36988, 32964, 30940, 19819, 39004, 31075, 38940, 19843, 18745, 33830, 31663, 21709, 23623, 30639, 32931, 18727, 37206, 38980, 21309, 30630, 18737, 38596, 30647, 37072, 37162, 30936, 33838, 36193, 36218, 34719, 40812, 37088, 15596, 22970, 18200, 19952, 14972, 21194, and 23926. The CpG sites used as markers can be any one, any two, any three, any four, any five, any six, any seven, any eight, any nine, or even all ten of these sites. The more CpG sites are used as markers, the more reliable the cancer diagnosis obtained with them. The methylation state of these markers can indicate many kinds of cancer, including but not limited to liver cancer, lung cancer, colorectal cancer, breast cancer, and nasopharyngeal carcinoma.
In addition to the CpG sites above, the markers for cancer can also include at least one of the following sites: with positions given relative to the human ribosomal DNA repeat unit reference sequence U13369.1, the CpG sites or modified CpG sites at positions 24167, 36996, 37361, 37178, 39913, 30789, 37020, 36832, 34428, or 34805. Earlier research found that the CpG sites or modified CpG sites at these positions can be used for the early diagnosis and screening of liver cancer; the corresponding content is recorded in Chinese patent application No. 201910157582.7. Further validation in the present application shows that these sites can also be used for the early diagnosis of other cancers and the prediction of their postoperative recurrence, and are not limited to liver cancer. These markers, previously found to be usable for liver cancer, can be applied alone or in combination to cancers other than liver cancer; they can also be combined with the CpG sites mentioned above for cancer diagnosis, detection, and postoperative evaluation.
In at least some embodiments, the CpG sites at positions 38974, 37148, 37013, 37028, 37076, and 32936 predict more reliably and can be used alone or in combinations of two or three. In at least some embodiments, the CpG sites at positions 21740, 23407, 34657, and 28277 and the other 45 CpG sites have a lower cancer diagnosis rate than the other sites and can be used in combination to obtain a more reliable diagnostic result.
Herein, "relative to the human ribosomal DNA repeat unit reference sequence U13369.1" means that these CpG sites are described by their positions in the human ribosomal DNA repeat unit reference sequence U13369.1. The CpG sites contained in the human ribosomal DNA repeat unit reference sequence U13369.1 recorded in GenBank can serve as markers of liver cancer; analyzing these CpG sites makes it possible to predict whether a sample is susceptible to cancer or to diagnose whether it has cancer. The positions of these CpG sites may change as database records are updated or because different databases describe them differently, but such changes do not affect the function of these sites in diagnosing liver cancer. These changes also fall within the scope of protection of the invention.
In at least some embodiments of the invention, the modification of the CpG site includes a 5-methylation modification or a 5-hydroxymethylation modification. Based on these markers, cancer can be diagnosed by processing human peripheral blood DNA. Detection reagents or kits for detecting cancer can also be prepared based on these markers.
According to another aspect of the invention, the invention provides a method for diagnosing cancer, comprising: (1) performing conversion treatment on the cell-free DNA in the peripheral blood of a test sample so that unmethylated cytosines are converted to thymine, obtaining a converted sample; (2) constructing a sequencing library from the converted sample and sequencing it to obtain sequencing data; (3) aligning the sequencing data to the human ribosomal DNA reference sequence and determining the methylation results of the CpG sites in the marker from the alignment result; (4) analyzing the methylation results of the CpG sites in the sequencing data with a statistical model to determine whether the test sample has cancer. Note that this method can be used not only to judge whether the test sample has cancer but also to predict its future risk of cancer, enabling earlier treatment or prevention.
Herein, the conversion treatment of the cell-free DNA in the peripheral blood of the test sample can be carried out with bisulfite. A commercially available bisulfite kit can be purchased directly, or the reagent can be prepared in-house. As those skilled in the art know, whether the reagent is referred to as bisulfite, hydrogen sulfite, or acid sulfite, it can achieve the purpose of the conversion treatment, namely converting the unmethylated cytosines in the cell-free DNA of peripheral blood (to uracil, which is read as thymine in sequencing) to obtain a converted sample. Whichever of these reagents is used for the conversion treatment, it falls within the scope of protection of the invention, as does any means that achieves the purpose of the conversion treatment described above.
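The effect of the conversion treatment on sequencing reads can be illustrated with a small in-silico sketch: unmethylated cytosines are read as T, while methylated cytosines remain C. The function name `bisulfite_convert`, the toy sequence, and the set of methylated positions are hypothetical, and the sketch models a single strand only; it is not the laboratory procedure described in the text.

```python
def bisulfite_convert(sequence: str, methylated_positions: set[int]) -> str:
    """Simulate the read-out of one strand after conversion treatment.

    Unmethylated C is reported as T; C at a methylated (1-based) position stays C.
    """
    converted = []
    for pos, base in enumerate(sequence.upper(), start=1):
        if base == "C" and pos not in methylated_positions:
            converted.append("T")  # unmethylated cytosine is read as thymine
        else:
            converted.append(base)  # methylated C and all other bases are unchanged
    return "".join(converted)


# Toy example: only the C at position 2 is methylated.
print(bisulfite_convert("ACGTCGGCATCG", {2}))  # ACGTTGGTATTG
```

Comparing converted reads with the unconverted reference is what lets an aligner call each cytosine as methylated or unmethylated.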
Library construction and sequencing of the cell-free DNA from the peripheral blood of the test sample, performed to obtain the methylation result of each CpG site, can use technical means common in the field. In at least some embodiments, the methylation result of each CpG site is obtained by whole-genome methylation sequencing or by RRBS. For example, plasma is obtained from the patient's blood sample by centrifugation at 1,600 × g for 10 minutes followed by 16,000 × g for 10 minutes; DNA is extracted with the DSP Blood Mini Kit (Qiagen), with each patient's DNA sample extracted from 4 mL of plasma; methylated adapters are ligated using Illumina's Paired-End Sequencing Sample Preparation Kit; the sequencing library is then purified with AMPure XP magnetic beads (Beckman Coulter), after which two rounds of bisulfite conversion are performed with the EpiTect Plus DNA Bisulfite Kit (Qiagen); the product is amplified by 10 cycles of PCR and finally subjected to single-end sequencing on a HiSeq 2000 (Illumina).
The invention also provides a system for diagnosing cancer or predicting cancer risk which, as shown in Fig. 6, includes a conversion treatment device, a sequencing device, an alignment device, and a result determination device. The conversion treatment device performs conversion treatment on the cell-free DNA in the peripheral blood of the test sample so that unmethylated cytosines are converted to thymine, obtaining a converted sample. The sequencing device is connected to the conversion treatment device; it constructs a sequencing library from the converted sample and obtains sequencing data on a sequencing platform. The alignment device is connected to the sequencing device; it aligns the sequencing data to the reference sequence and determines the methylation results of the CpG sites at the marker positions from the alignment result. The result determination device is connected to the alignment device; it analyzes the methylation results of the CpG sites in the sequencing data with a statistical model to determine whether the test sample has cancer or to predict whether the test sample is susceptible to cancer. In some embodiments, the statistical model is a multivariate statistical model. Usable multivariate statistical models include but are not limited to logistic regression models, random forest models, and SVM models.
The solutions of the invention are explained below with reference to embodiments. Those skilled in the art will understand that the following embodiments only illustrate the invention and should not be taken as limiting its scope. Where specific techniques or conditions are not indicated in the embodiments, they follow the techniques or conditions described in the literature of the field or the product instructions. Reagents or instruments for which no manufacturer is indicated are conventional, commercially available products.
Embodiment 1: Screening for differentially methylated CpG sites on rDNA using whole-genome methylation sequencing data
We used the peripheral blood bisulfite sequencing data from the article "Noninvasive detection of cancer-associated genome-wide hypomethylation and copy number aberrations by plasma DNA bisulfite sequencing", published in PNAS in 2013. The data are deposited in the European Genome-Phenome Archive under accession number EGAS00001000566. From this dataset we used the peripheral blood DNA methylation data of healthy individuals (32), HBV-infected non-cancer patients (8), and early-stage liver cancer patients (stage I and stage II, 26), together with DNA methylation data from 15 pairs of liver cancer tissue and buffy coat samples.
The reference sequence U13369.1 of the human rDNA repeat unit was downloaded from GenBank; it is 42,999 bases long and contains 3,288 CpG sites. The sequencing data were mapped back to the reference sequence U13369.1 with bs-seeker2. Duplicate reads were not removed, because sequencing coverage on rDNA is relatively high. For each CpG site, the number of methylated C reads and the number of unmethylated C reads were counted.
Next, CpG sites covered by too few mapped reads were filtered out, leaving 2,871 effective CpG sites.
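The counting and filtering steps above can be sketched as follows: the methylation level of a site is the methylated count divided by the total count, and sites with too few reads are dropped. The function name `methylation_levels`, the toy counts, and the threshold `min_coverage=10` are assumptions for illustration; the text does not state the cutoff that was actually used.

```python
def methylation_levels(counts, min_coverage=10):
    """Map {position: (methylated_C, unmethylated_C)} to {position: methylation level}.

    Sites whose total read count is below min_coverage (hypothetical cutoff) are dropped.
    """
    levels = {}
    for pos, (meth, unmeth) in counts.items():
        depth = meth + unmeth
        if depth >= min_coverage:
            levels[pos] = meth / depth
    return levels


# Toy counts for three sites; the second one has only 4 reads and is filtered out.
toy_counts = {38974: (30, 10), 37148: (3, 1), 21740: (45, 5)}
print(methylation_levels(toy_counts))  # {38974: 0.75, 21740: 0.9}
```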
The subjects were then randomly split into two parts, one used as the training set and the other as the test set: 2/3 of the healthy individuals, 2/3 of the HBV-infected non-cancer individuals, and 2/3 of the liver cancer patients were selected as the training set, and the remaining subjects formed the test set. Markers were selected on the training set and evaluated on the test set. The random split was repeated 100 times, and the subsequent analysis steps were carried out for each split.
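The split described above can be sketched as a stratified random split repeated with different seeds. This is an illustration under assumptions: the sample identifiers are invented, and rounding the 2/3 fraction with `round` is a choice of this sketch, since the text does not say how fractional group sizes were handled.

```python
import random


def stratified_split(labels, train_fraction=2 / 3, seed=0):
    """Split sample ids into train and test, taking train_fraction of each group."""
    rng = random.Random(seed)
    groups = {}
    for sample_id, group in labels.items():
        groups.setdefault(group, []).append(sample_id)
    train, test = [], []
    for group in sorted(groups):
        members = sorted(groups[group])
        rng.shuffle(members)
        n_train = round(len(members) * train_fraction)  # rounding rule is assumed
        train.extend(members[:n_train])
        test.extend(members[n_train:])
    return train, test


# Group sizes follow the text (32 healthy, 8 HBV, 26 liver cancer); ids are made up.
labels = {f"healthy_{i}": "healthy" for i in range(32)}
labels.update({f"hbv_{i}": "hbv" for i in range(8)})
labels.update({f"hcc_{i}": "liver_cancer" for i in range(26)})

# One split per seed, 100 splits in total.
splits = [stratified_split(labels, seed=s) for s in range(100)]
```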
Using the training set data, CpG sites that effectively distinguish non-cancer subjects from cancer patients were screened from the 2,871 valid CpG sites. The basic operation is to classify non-cancer and cancer subjects by the methylation level of each individual CpG site, draw the ROC (receiver operating characteristic) curve for each CpG site, and calculate its AUC (area under the curve). The sites were sorted by AUC in descending order and the top 200 CpG sites were retained; in general, the AUC of the top 200 CpG sites is greater than 80%.
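The per-site ranking can be sketched as below. The AUC is computed through its rank interpretation (the probability that a cancer sample scores higher than a non-cancer sample, with ties counted as one half), which equals the area under the ROC curve. Scoring each site by `max(AUC, 1 - AUC)`, so that hypomethylated sites also rank highly, is an assumption of this sketch rather than something the text states; the methylation values are invented.

```python
def auc(scores, labels):
    """Area under the ROC curve; labels are 1 for cancer and 0 for non-cancer."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_sites(levels_by_site, labels, top_n=200):
    """Rank CpG sites by single-site AUC and keep the top_n."""
    ranked = []
    for site, levels in levels_by_site.items():
        a = auc(levels, labels)
        ranked.append((max(a, 1 - a), site))  # direction-agnostic score (assumed)
    ranked.sort(reverse=True)
    return [(site, score) for score, site in ranked[:top_n]]


# Invented methylation levels for six samples at three sites.
sample_labels = [1, 1, 1, 0, 0, 0]
levels_by_site = {
    37148: [0.90, 0.80, 0.85, 0.30, 0.20, 0.40],
    21740: [0.50, 0.60, 0.40, 0.50, 0.60, 0.40],
    38974: [0.10, 0.20, 0.60, 0.70, 0.80, 0.90],
}
top_sites = rank_sites(levels_by_site, sample_labels, top_n=2)
```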
Feature selection was then performed on the top 200 CpG sites using the training set data, that is, the better-performing CpG sites were selected, and a regularized logistic regression model was trained. The CpG sites retained by feature selection are the target markers. The resulting model was evaluated on the test set to assess the performance of the selected sites and of the model.
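As a rough illustration of training a regularized logistic regression on methylation levels, the sketch below uses plain gradient descent with an L2 penalty. The text does not specify the penalty type, its strength, or the optimizer, so `l2`, `lr`, and `epochs` are assumed values, the function names are hypothetical, and the six training samples are invented.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, l2=0.1, lr=0.5, epochs=2000):
    """Fit weights and intercept by gradient descent on the L2-penalized log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # The penalty shrinks the weights; the intercept is left unpenalized.
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability that a sample with features x is a cancer sample."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Invented methylation levels at two CpG sites; 1 = cancer, 0 = non-cancer.
X = [[0.90, 0.80], [0.85, 0.70], [0.80, 0.90],
     [0.20, 0.30], [0.30, 0.10], [0.10, 0.20]]
y = [1, 1, 1, 0, 0, 0]
w, b = train_logistic(X, y)
p_high = predict_proba(w, b, [0.9, 0.9])
p_low = predict_proba(w, b, [0.1, 0.1])
```

An L1 penalty would additionally drive some weights to exactly zero and could double as the feature-selection step; whether the original work did this is not stated.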
The 100 random splits into training and test sets yield 100 regression models and 100 corresponding combinations of CpG sites; the purpose is to reduce the effect of the randomness of any single split. The number of times each CpG site was selected across the 100 experiments was then counted. Fig. 1 shows the distribution of the ROC curves over the 100 test-set evaluations. In Fig. 1, the abscissa is 1 - specificity, that is, the false positive rate, and the ordinate is sensitivity, that is, the true positive rate. The area under the ROC curve is called the AUC; the larger the AUC, the higher the accuracy. The mean and variance of the AUC over the 100 tests are marked on Fig. 1.
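Aggregating the repeated experiments can be sketched as follows. The three `runs` records are invented toy values, not results from the study, and the use of the population variance (`pvariance`) is an assumption, since the text does not say which variance estimator was used.

```python
from collections import Counter
from statistics import mean, pvariance


def summarize_runs(runs):
    """Count how often each CpG site was selected and summarize the test AUCs."""
    selection_counts = Counter()
    for run in runs:
        selection_counts.update(set(run["selected_sites"]))  # once per run
    aucs = [run["test_auc"] for run in runs]
    return selection_counts.most_common(), mean(aucs), pvariance(aucs)


# Toy records standing in for the 100 train/test repetitions.
runs = [
    {"selected_sites": [38974, 37148], "test_auc": 0.90},
    {"selected_sites": [38974, 21740], "test_auc": 0.80},
    {"selected_sites": [38974, 37148, 21740], "test_auc": 1.00},
]
ranking, auc_mean, auc_var = summarize_runs(runs)
```

Sites at the top of `ranking` are those chosen in the most repetitions, which is the criterion used to order the site list that follows.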
From the above calculation, that is, the number of times each CpG site was selected across the 100 experiments, the following sites were obtained. The CpG sites at positions 38974, 37148, 37013, 37028, 37076, 32936, 24167, 36996, 37361, 37178, 39913, 30789, 37020, 36832, 34428 and 34805 were selected most often, followed by the sites at positions 21740, 23407, 34657, 28277, 38982, 21695, 19811, 32927, 32906, 32920, 36988, 32964, 30940, 19819, 39004, 31075, 38940, 19843, 18745, 33830, 31663, 21709, 23623, 30639, 32931, 18727, 37206, 38980, 21309, 30630, 18737, 38596, 30647, 37072, 37162, 30936, 33838, 36193, 36218, 34719, 40812, 37088, 15596, 22970, 18200, 19952, 14972, 21194 and 23926.
Embodiment 2: Evaluating the model and sites obtained in Embodiment 1 on lung cancer, breast cancer, and nasopharyngeal carcinoma
The data come from the reference dataset described in Embodiment 1 and comprise 5 breast cancer, 4 lung cancer, and 9 nasopharyngeal carcinoma patients. Because the number of cancer patients is limited, these cancers are not distinguished by type. The model and sites obtained by training on liver cancer in Embodiment 1 were used to discriminate between these three cancers and the held-out healthy individuals. The purpose of this embodiment is to verify that the sites and model screened in Embodiment 1 can be used to detect other cancer types. Fig. 2 shows the ROC curves of the prediction. As can be seen from Fig. 2, the average AUC reaches 0.82, a good predictive performance. This indicates that the sites screened in Embodiment 1 can be used to predict cancers such as breast cancer, lung cancer, and nasopharyngeal carcinoma.
Embodiment 3: Predicting lung cancer and colorectal cancer using peripheral blood RRBS data
We used the peripheral blood reduced representation bisulfite sequencing (RRBS) data published in the 2017 article entitled "Identification of methylation haplotype blocks aids in deconvolution of heterogeneous tissue samples and tumor tissue-of-origin mapping from plasma DNA". The data are deposited in the Gene Expression Omnibus (GEO) under accession number GSE79279. Here we used peripheral blood DNA methylation data from healthy individuals (75), lung cancer patients (29), and colorectal cancer patients (30).
The reference sequence U13369.1 of the human ribosomal DNA repeat unit was downloaded from GenBank. It is 42,999 bases long and contains 3,288 CpG sites. The sequencing reads were aligned to the reference sequence U13369.1 using the bs-seeker2 software. Sequencing duplicates were not removed, because the sequencing coverage on rDNA is relatively high. For each CpG site, the number of methylated Cs and the number of unmethylated Cs were counted. The data used here are RRBS data; RRBS detects fewer sites than WGBS, so this technique cannot cover all CpG sites.
Next, CpG sites covered by too few aligned reads were filtered out. The subsequent training procedure is the same as in Embodiment 1. Fig. 3 and Fig. 4 show the ROC curves on the corresponding test sets; good predictive performance was obtained.
The results of the above screening are as follows. The sites at positions 21740, 23407, 34657, 28277 and 37020 show the best predictive performance, and the sites at positions 36193, 36218, 34719, 40812, 37088, 15596, 22970, 18200, 19952, 14972, 21194 and 23926 also show good predictive performance. For the CpG sites at positions 38974, 37148, 37013, 37028, 37076, 32936, 38982, 21695, 19811, 32927, 32906, 32920, 36988, 32964, 30940, 19819, 39004, 31075, 38940, 19843, 18745, 33830, 31663, 21709, 23623, 30639, 32931, 18727, 37206, 38980, 21309, 30630, 18737, 38596, 30647, 37072, 37162, 30936, 33838, 24167, 36996, 37361, 37178, 39913, 30789, 36832, 34428 and 34805, performance is reduced or no signal is obtained, for reasons such as the limitations of the RRBS technique.
Embodiment 4: Post-operative assessment of liver cancer using peripheral blood data
Here we used the whole-genome peripheral blood methylation data of two patients from Embodiment 1, taken before and after surgery. The logistic regression model and the screened sites from Embodiment 1 were used to predict the post-operative probability of cancer for the two patients, as shown in Fig. 5. Patient TBR34 still had a very high predicted probability of cancer after surgery, and in fact died 8 months after surgery. For patient TBR36, the predicted probability of cancer fell to a very low level after surgery, and the patient was still alive 20 months after surgery. The clinical outcomes are therefore consistent with the marker-based predictions.
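Scoring a pre-operative and a post-operative sample with a fitted logistic regression model amounts to applying the logistic function to a weighted sum of the methylation levels at the marker sites. The sketch below shows only that arithmetic: the weights, intercept, and methylation values are hypothetical and are not the fitted model or the patient data of this embodiment.

```python
import math


def cancer_probability(weights, intercept, methylation):
    """Logistic model output for one sample.

    weights: dict mapping CpG position -> coefficient.
    methylation: dict mapping CpG position -> methylation level in [0, 1].
    """
    z = intercept + sum(w * methylation[site] for site, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for two marker positions.
weights = {38974: 4.0, 37148: 3.0}
intercept = -3.5

# Hypothetical methylation levels for one patient before and after surgery.
pre_op = {38974: 0.8, 37148: 0.7}
post_op = {38974: 0.3, 37148: 0.2}

p_pre = cancer_probability(weights, intercept, pre_op)
p_post = cancer_probability(weights, intercept, post_op)
```

A post-operative probability that stays high, as reported for TBR34, would correspond to `p_post` remaining close to `p_pre` instead of dropping.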
Herein, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. A feature defined as "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, for example two or three, unless specifically defined otherwise.
In the present invention, unless otherwise expressly specified and limited, terms such as "connected" and "connection" are to be understood broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical, or the elements may communicate with each other; it may be direct or indirect through an intermediary; and it may be an internal connection between two elements or an interaction between two elements, unless otherwise expressly limited. Those of ordinary skill in the art can understand the specific meanings of these terms in the present invention according to the specific circumstances.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict with each other, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is to be understood that the above embodiments are exemplary and are not to be construed as limiting the invention. Those skilled in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the invention.
Claims (10)
1. A marker for cancer, characterized in that it comprises at least one selected from the following CpG sites:
With reference to the human ribosomal DNA repeat unit reference sequence U13369.1, at least one of the CpG sites or modified CpG sites at positions 38974, 37148, 37013, 37028, 37076, 32936, 21740, 23407, 34657, 28277, 38982, 21695, 19811, 32927, 32906, 32920, 36988, 32964, 30940, 19819, 39004, 31075, 38940, 19843, 18745, 33830, 31663, 21709, 23623, 30639, 32931, 18727, 37206, 38980, 21309, 30630, 18737, 38596, 30647, 37072, 37162, 30936, 33838, 36193, 36218, 34719, 40812, 37088, 15596, 22970, 18200, 19952, 14972, 21194, 23926, 24167, 36996, 37361, 37178, 39913, 30789, 37020, 36832, 34428 or 34805.
2. The marker according to claim 1, characterized in that, with reference to the human ribosomal DNA repeat unit reference sequence U13369.1, the marker comprises at least one of the CpG sites or modified CpG sites at positions 38974, 37148, 37013, 37028, 37076, 32936, 21740, 23407, 34657 and 28277; and comprises at least one of the CpG sites or modified CpG sites at positions 38982, 21695, 19811, 32927, 32906, 32920, 36988, 32964, 30940, 19819, 39004, 31075, 38940, 19843, 18745, 33830, 31663, 21709, 23623, 30639, 32931, 18727, 37206, 38980, 21309, 30630, 18737, 38596, 30647, 37072, 37162, 30936, 33838, 36193, 36218, 34719, 40812, 37088, 15596, 22970, 18200, 19952, 14972, 21194 and 23926;
Optionally, with reference to the human ribosomal DNA repeat unit reference sequence U13369.1, the marker comprises at least one of the CpG sites or modified CpG sites at positions 38974, 37148, 37013, 37028, 37076, 32936 and 21740; and at least one of the CpG sites or modified CpG sites at positions 21740, 23407, 34657 and 28277.
3. The marker according to claim 1 or 2, characterized in that, with reference to the human ribosomal DNA repeat unit reference sequence U13369.1, the marker comprises at least two of the CpG sites or modified CpG sites at positions 38974, 37148, 37013, 37028, 37076 and 32936;
Optionally, with reference to the human ribosomal DNA repeat unit reference sequence U13369.1, the marker comprises at least one of the CpG sites or modified CpG sites at positions 38974, 37148, 37013, 37028, 37076 and 32936; and at least two of the CpG sites or modified CpG sites at positions 21740, 23407, 34657 and 28277;
Optionally, the modified CpG site carries a 5-methylation modification or a 5-hydroxymethylation modification.
4. A primer sequence, characterized in that the primer sequence takes as its target sequence the nucleotide sequence in which the marker of any one of claims 1 to 3 is located, and is used for specific amplification of the target sequence.
5. A probe, the probe being free in solution or immobilized on a chip, characterized in that the probe can specifically capture the nucleotide sequence in which the marker of any one of claims 1 to 3 is located.
6. A kit, characterized in that the kit is used for diagnosing cancer and contains reagents for detecting the marker of any one of claims 1 to 3;
Optionally, the kit further comprises the primer sequence of claim 4 or the probe of claim 5.
7. Use of the marker of any one of claims 1 to 3, the primer sequence of claim 4, or the probe of claim 5 in the preparation of a cancer diagnostic kit;
Optionally, the cancer includes liver cancer, lung cancer, colorectal cancer, breast cancer, nasopharyngeal carcinoma and/or head and neck cancer.
8. A method for determining the methylation of target sites in a sample to be tested, the target sites being the CpG sites in the marker of any one of claims 1 to 3, the method comprising:
(1) subjecting the cell-free DNA in the peripheral blood of the sample to be tested to conversion treatment, so that unmethylated cytosine is converted to thymine, to obtain a conversion-treated sample;
(2) constructing a sequencing library based on the conversion-treated sample and sequencing it to obtain sequencing data;
(3) aligning the sequencing data to a reference sequence and determining the methylation result of the target sites in the sequencing data based on the alignment result;
Optionally, the reference sequence is the human ribosomal DNA repeat unit reference sequence U13369.1;
Optionally, the sequencing is performed by a second-generation sequencing method or a third-generation sequencing method;
Optionally, the sequencing is performed by at least one selected from HiSeq 2000, SOLiD, 454, and a single-molecule sequencing device.
9. A system for diagnosing cancer or predicting the risk of cancer recurrence, characterized in that it comprises:
a conversion treatment device, the conversion treatment device being used to subject the cell-free DNA in the peripheral blood of a sample to be tested to conversion treatment, so that unmethylated cytosine is converted to thymine, to obtain a conversion-treated sample;
a sequencing device, the sequencing device being connected to the conversion treatment device, the sequencing device constructing a sequencing library based on the conversion-treated sample and sequencing it to obtain sequencing data;
an alignment device, the alignment device being connected to the sequencing device, the alignment device being used to align the sequencing data to a reference sequence and to determine, based on the alignment result, the methylation result of the CpG sites of a marker in the sequencing data;
a result determination device, the result determination device being connected to the alignment device, the result determination device determining, based on the methylation result of the CpG sites of the marker in the sequencing data and by statistical model analysis, whether the sample to be tested has cancer, or predicting whether the sample to be tested is susceptible to cancer or whether cancer recurs after surgery;
wherein the marker is the marker of any one of claims 1 to 3.
10. The system according to claim 9, characterized in that the reference sequence is the human ribosomal DNA repeat unit reference sequence U13369.1;
Optionally, the statistical model is a multivariate statistical model;
Optionally, the statistical model is established based on a plurality of cancer patients and the methylation results of the CpG sites in the plurality of cancer patients, the CpG sites being the CpG sites in the marker of any one of claims 1 to 3;
Optionally, the multivariate statistical model is at least one of a logistic regression model, a random forest model, and an SVM model, preferably a logistic regression model;
Optionally, the alignment is performed using the software bs-seeker2, and the alignment mode selected in the software is local alignment.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910445136.6A CN110195107B (en) | 2019-05-27 | 2019-05-27 | Ribosomal DNA methylation marker for detecting cancer in peripheral blood and application thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110195107A true CN110195107A (en) | 2019-09-03 |
CN110195107B CN110195107B (en) | 2023-04-14 |
Family
ID=67753136
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910445136.6A Active CN110195107B (en) | 2019-05-27 | 2019-05-27 | Ribosomal DNA methylation marker for detecting cancer in peripheral blood and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110195107B (en) |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104611410A (en) * | 2013-11-04 | 2015-05-13 | 北京贝瑞和康生物技术有限公司 | Noninvasive cancer detection method and its kit |
Non-Patent Citations (2)
Title |
---|
K. C. Allen Chan et al.: "Noninvasive detection of cancer-associated genomewide hypomethylation and copy number aberrations by plasma DNA bisulfite sequencing", PNAS * |
Xianglin Zhang et al.: "Ribosomal DNA methylation as stable biomarkers for detection of cancer in plasma", bioRxiv * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110982907A (en) * | 2020-02-27 | 2020-04-10 | 上海鹍远生物技术有限公司 | Thyroid nodule-related rDNA methylation marker and application thereof |
CN110982907B (en) * | 2020-02-27 | 2020-07-03 | 上海鹍远生物技术有限公司 | Thyroid nodule-related rDNA methylation marker and application thereof |
CN113308540A (en) * | 2020-02-27 | 2021-08-27 | 上海鹍远生物技术有限公司 | Thyroid nodule-related rDNA methylation marker and application thereof |
CN112375822A (en) * | 2020-06-01 | 2021-02-19 | 广州市基准医疗有限责任公司 | Methylation biomarker for detecting breast cancer and application thereof |
WO2021244423A1 (en) * | 2020-06-01 | 2021-12-09 | 广州市基准医疗有限责任公司 | Methylated biomarker for detecting breast cancer, and use thereof |
Also Published As
Publication number | Publication date |
---|---|
CN110195107B (en) | 2023-04-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||