CN102061337A - Method and system for detecting tissue-specific differentially methylated region (tDMR) - Google Patents

Method and system for detecting tissue-specific differentially methylated region (tDMR)

Info

Publication number
CN102061337A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010105571311A
Other languages
Chinese (zh)
Other versions
CN102061337B (en
Inventor
余昶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Technology Solutions Co Ltd
Original Assignee
BGI Shenzhen Co Ltd
Application filed by BGI Shenzhen Co Ltd
Priority to CN2010105571311A
Publication of CN102061337A
Application granted
Publication of CN102061337B
Legal status: Active

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a method and a system for detecting tissue-specific differentially methylated regions (tDMRs). The method comprises the following steps: obtaining single-site methylation information across the whole genome by whole-genome sequencing; determining seed tDMRs on the whole genome from that information under preselection conditions; extending each seed tDMR in both directions and obtaining candidate tDMRs based on extension termination conditions; and filtering the candidate tDMRs based on filter conditions to obtain the tDMR result. With the method and system, the methylation state of every site on the whole genome can be determined, so tDMRs can be identified genome-wide, which greatly improves detection efficiency and lowers cost. Subsequent validation shows that the tDMRs found have an accuracy above 85%, and validation in several species (not limited to mammals) shows that the method is more accurate and more sensitive than other methods such as that of Vardhman Rakyan et al.

Description

Method and system for detecting tissue-specific differentially methylated regions
Technical field
The present invention relates to genomics and bioinformatics, and in particular to a method and system for detecting tissue-specific differentially methylated regions (tDMRs).
Background
In mammals, DNA methylation is necessary for genome function. Many whole-genome studies have shown that mammalian DNA methylation profiles are tissue-specific. tDMRs (reference [1]) are an important mechanism by which functional differences between tissues are realized, so finding the tDMRs between different tissues is significant for the study of genome function.
Conventional methods usually examine methylation only on individual genes or sub-regions of the genome. Existing chip-based experimental techniques for detecting tDMRs are complicated to operate, have a low success rate, and are costly. At present, chip technology can only study the regulatory regions of a few known genes and cannot analyze the methylation of large numbers of genes, especially unknown ones; chip fabrication and the detection workflow, including the preparation of target genes and probes, are also complicated. The approach of Vardhman Rakyan et al. [1] uses chips to find regions of interest and defines a region as a tDMR if its average methylation rate is above 60% in one tissue and below 40% in another. This threshold is chosen arbitrarily; the positive predictive value and sensitivity of the resulting tDMRs are 78% and 61% respectively, which leaves a large error.
References:
[1] Vardhman Rakyan, Thomas Down, Natalie Thorne, et al. An integrated resource for genome-wide identification and analysis of human tissue-specific differentially methylated regions (tDMRs). Genome Res. Published online June 24, 2008.
[2] Petra Hajkova, Osman El-Maarri. DNA-Methylation Analysis by the Bisulfite-Assisted Genomic Sequencing Method. Methods in Molecular Biology, 2002, Volume 200, 143-154. DOI: 10.1385/1-59259-182-5:143
Summary of the invention
One technical problem addressed by the present invention is to provide a method for detecting tissue-specific differentially methylated regions that performs tDMR detection across the whole genome with high accuracy.
The invention provides a method for detecting tissue-specific differentially methylated regions (tDMRs), comprising:
obtaining single-site methylation information across the whole genome by whole-genome sequencing;
determining seed tDMRs on the whole genome from the single-site methylation information, based on preselection conditions;
extending each seed tDMR in both directions and obtaining candidate tDMRs based on extension termination conditions;
filtering the candidate tDMRs based on filter conditions to obtain the tDMR result.
According to one embodiment of the detection method, determining seed tDMRs on the whole genome from the single-site methylation information based on preselection conditions comprises: scanning the single-site methylation information across the whole genome with a sliding window, and determining seed tDMRs based on the preselection conditions.
According to one embodiment of the detection method, the single-site methylation information is scanned with a sliding window that is 5 CpGs long and advances in steps of 1 CpG. The preselection conditions comprise:
(1) p <= 0.05 in a chi-square test (combined with Fisher's exact test);
(2) a two-fold difference in methylation level; and
(3) a methylation level of 20% or more in at least one sample.
According to one embodiment of the detection method, the extension termination conditions comprise:
(1) the distance between two consecutive CpGs exceeds 200 bp;
(2) the average methylation levels of the two samples differ by less than two-fold;
(3) the methylation levels of both samples in the region are below 20%;
(4) p > 0.01 in the chi-square test.
The filter conditions comprise:
(1) FDR <= 0.05;
(2) the average coverage of the resulting tDMR region is greater than 20 reads;
(3) the coverage of each individual CG site is greater than 10 reads;
(4) the accuracy of a sampling-with-replacement test on the CpG sites in the resulting tDMR is above 95%.
According to one embodiment of the detection method, obtaining single-site methylation information across the whole genome by whole-genome sequencing comprises: treating genomic DNA with bisulfite so that unmethylated cytosines are deaminated and converted to uracil while methylated cytosines remain unchanged; then sequencing the treated genome and comparing it with the untreated whole-genome sequence to determine the methylated CpG sites on the whole genome.
The detection method provided by the invention obtains single-site methylation information across the whole genome of each sample by whole-genome sequencing and analyzes two sequenced samples on that basis, so tDMRs can be extracted genome-wide. The steps of seed tDMR selection, seed tDMR extension, and candidate tDMR filtering improve the accuracy of detection.
Another technical problem addressed by the present invention is to provide a system for detecting tissue-specific differentially methylated regions that performs tDMR detection across the whole genome with high accuracy.
The invention provides a system for detecting tissue-specific differentially methylated regions (tDMRs), comprising:
a methylation information acquisition module, for obtaining single-site methylation information across the whole genome by whole-genome sequencing;
a seed region determination module, for determining seed tDMRs on the whole genome from the single-site methylation information, based on preselection conditions;
a seed region extension module, for extending each seed tDMR in both directions and obtaining candidate tDMRs based on extension termination conditions;
a candidate region filter module, for filtering the candidate tDMRs based on filter conditions to obtain the tDMR result.
According to one embodiment of the detection system, the seed region determination module scans the single-site methylation information across the whole genome with a sliding window and determines seed tDMRs based on the preselection conditions. The sliding window is 5 CpGs long and advances in steps of 1 CpG. The preselection conditions comprise: (1) p <= 0.05 in a chi-square test (combined with Fisher's exact test); (2) a two-fold difference in methylation level; (3) a methylation level of 20% or more in at least one sample.
According to one embodiment of the detection system, the extension termination conditions comprise: (1) the distance between two consecutive CpGs exceeds 200 bp; (2) the average methylation levels of the two samples differ by less than two-fold; (3) the methylation levels of both samples in the region are below 20%; (4) p > 0.01 in the chi-square test. The filter conditions comprise: (1) FDR <= 0.05; (2) the average coverage of the resulting tDMR region is greater than 20 reads; (3) the coverage of each individual CG site is greater than 10 reads; (4) the accuracy of a sampling-with-replacement test on the CpG sites in the resulting tDMR is above 95%.
According to one embodiment of the detection system, the methylation information acquisition module comprises: a sodium bisulfite treatment device, for treating genomic DNA with bisulfite so that unmethylated cytosines are deaminated and converted to uracil while methylated cytosines remain unchanged; and a whole-genome comparison device, for sequencing the treated genome and comparing it with the untreated whole-genome sequence to determine the methylated CpG sites on the whole genome.
In the detection system provided by the invention, the methylation information acquisition module obtains single-site methylation information across the whole genome of each sample by whole-genome sequencing, and the subsequent modules analyze two sequenced samples on that basis, so tDMRs can be extracted genome-wide. Seed tDMR selection by the seed region determination module, seed tDMR extension by the seed region extension module, and candidate tDMR filtering by the candidate region filter module improve the accuracy of detection.
Description of the drawings
Fig. 1 is a flowchart of one embodiment of the method of the invention for detecting tissue-specific differentially methylated regions;
Fig. 2 is a flowchart of another embodiment of the method of the invention for detecting tissue-specific differentially methylated regions;
Fig. 3 is a block diagram of one embodiment of the system of the invention for detecting tissue-specific differentially methylated regions;
Fig. 4 is a block diagram of another embodiment of the system of the invention for detecting tissue-specific differentially methylated regions.
Detailed description
The present invention is described more fully below with reference to the accompanying drawings, which illustrate exemplary embodiments of the invention. In the drawings, identical reference numerals denote identical or similar components or elements.
Fig. 1 is a flowchart of one embodiment of the method of the invention for detecting tissue-specific differentially methylated regions.
As shown in Fig. 1, in step 102, single-site methylation information across the whole genome of each sample is obtained by whole-genome sequencing. For example, on the basis of second-generation high-throughput whole-genome sequencing, the single-site methylation information of a sample is obtained by bisulfite sequencing (reference [2]). After step 102, steps 104 to 108 below extract the differentially methylated regions between two samples.
In step 104, seed tDMRs on the whole genomes of the two samples are determined from the single-site methylation information of the two samples, based on preselection conditions.
In step 106, each seed tDMR is extended in both directions, and candidate tDMRs are obtained based on extension termination conditions.
In step 108, the candidate tDMRs are filtered based on filter conditions to obtain the final tDMR result.
Existing chip-based tDMR detection is complicated to operate, has a low success rate, and is costly. To address these problems, the above embodiment obtains single-site methylation information across the whole genome of each sample by whole-genome sequencing and analyzes two sequenced samples on that basis, so tDMRs can be searched for across the whole genome and extracted simply and quickly, which greatly improves detection efficiency and reduces cost. In addition, the steps of seed tDMR selection, seed tDMR extension, and candidate tDMR filtering improve the accuracy and sensitivity of detection.
Fig. 2 is a flowchart of another embodiment of the method of the invention for detecting tissue-specific differentially methylated regions.
As shown in Fig. 2, in step 202, genomic DNA is treated with bisulfite so that unmethylated cytosines are deaminated and converted to uracil while methylated cytosines remain unchanged.
In step 204, the treated genome is sequenced and compared with the untreated whole-genome sequence to determine the methylated CpG sites on the whole genome.
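Steps 202 and 204 can be illustrated with a minimal Python sketch. It relies on the fact that uracil produced by bisulfite treatment is read as T after amplification and sequencing, so at a reference cytosine a C in a read indicates methylation and a T indicates no methylation. The function names, the read representation as (start, sequence) pairs, and the restriction to forward-strand reads are assumptions made for illustration and are not taken from the original text.

```python
def bisulfite_convert(seq, methylated_positions):
    """In silico bisulfite conversion: unmethylated C reads as T, methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def methylation_level(reference, reads, pos):
    """Count methylated and total informative reads at reference position pos.

    reads is a list of (start, sequence) pairs for forward-strand reads
    that are already aligned to the reference (hypothetical representation).
    Returns (methylated_reads, informative_reads, level or None).
    """
    if reference[pos] != "C":
        raise ValueError("position is not a cytosine in the reference")
    methylated = unmethylated = 0
    for start, read in reads:
        offset = pos - start
        if 0 <= offset < len(read):
            if read[offset] == "C":      # protected from conversion: methylated
                methylated += 1
            elif read[offset] == "T":    # converted: unmethylated
                unmethylated += 1
    total = methylated + unmethylated
    return methylated, total, (methylated / total if total else None)


reference = "ACGTCGA"
reads = [
    (0, bisulfite_convert(reference, {1})),    # molecule methylated at position 1
    (0, bisulfite_convert(reference, set())),  # fully unmethylated molecule
]
print(methylation_level(reference, reads, 1))
print(methylation_level(reference, reads, 4))
```

The per-site pair of methylated and total read counts is the kind of single-site information that the later steps pool over windows and regions.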
In step 206, the single-site methylation information across the whole genome is scanned with a sliding window, and seed tDMRs on the whole genome are determined based on preselection conditions.
The sliding window is 5 CpGs long and advances in steps of 1 CpG. The preselection conditions comprise:
(1) p <= 0.05 in a chi-square test (combined with Fisher's exact test);
when p <= 0.05, the methylation difference between the two samples in this region can be considered significant;
(2) a two-fold difference in methylation level; and
(3) a methylation level of 20% or more in at least one sample; requiring the methylation rate of at least one sample in the tDMR region to be 20% or more ensures that the regions found are biologically meaningful.
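The seed selection in step 206 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the invention: the input format (one list of (methylated_reads, total_reads) pairs per sample, aligned by CpG index), the pooling of read counts over the window, and the rule for combining the two tests (Fisher's exact test when any expected count is below 5, chi-square otherwise) are all assumed, since the text does not specify them.

```python
import math


def chi2_p(a, b, c, d):
    """p-value of Pearson's chi-square test (1 d.f.) on the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return 1.0  # an empty margin carries no evidence of a difference
    stat = n * (a * d - b * c) ** 2 / margins
    return math.erfc(math.sqrt(stat / 2))  # chi-square survival function, 1 d.f.


def fisher_p(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the same 2x2 table."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = math.comb(n, col1)

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))
    return min(1.0, total)


def window_p(m1, t1, m2, t2):
    """Assumed combination rule: exact test when any expected count is below 5."""
    n, meth = t1 + t2, m1 + m2
    expected = [t * k / n for t in (t1, t2) for k in (meth, n - meth)]
    test = fisher_p if min(expected) < 5 else chi2_p
    return test(m1, t1 - m1, m2, t2 - m2)


def find_seeds(sample_a, sample_b, window=5, alpha=0.05, fold=2.0, min_level=0.20):
    """Return (first, last) CpG index pairs of windows meeting conditions (1) to (3)."""
    seeds = []
    for i in range(len(sample_a) - window + 1):
        m1 = sum(m for m, _ in sample_a[i:i + window])
        t1 = sum(t for _, t in sample_a[i:i + window])
        m2 = sum(m for m, _ in sample_b[i:i + window])
        t2 = sum(t for _, t in sample_b[i:i + window])
        if t1 == 0 or t2 == 0:
            continue  # no coverage in one sample
        hi, lo = max(m1 / t1, m2 / t2), min(m1 / t1, m2 / t2)
        if hi < min_level:   # condition (3): at least one sample at 20% or more
            continue
        if hi < fold * lo:   # condition (2): two-fold difference
            continue
        if window_p(m1, t1, m2, t2) <= alpha:  # condition (1)
            seeds.append((i, i + window - 1))
    return seeds


sample_a = [(18, 20)] * 5 + [(1, 20)] * 5
sample_b = [(2, 20)] * 5 + [(1, 20)] * 5
print(find_seeds(sample_a, sample_b))
```

Because the window advances one CpG at a time, neighbouring seeds overlap heavily; the extension step that follows is what merges them into larger regions.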
In step 208, each seed tDMR is extended in both directions to obtain candidate tDMRs. The extension termination conditions are:
(1) the distance between two consecutive CpGs exceeds 200 bp;
if two consecutive CpGs are too far apart, they are only weakly correlated, so extension stops in that case to keep the detection result as reliable as possible;
(2) the average methylation levels of the two samples differ by less than two-fold;
(3) the methylation levels of both samples in the region are below 20%;
(4) p > 0.01 in the chi-square test.
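The two-way extension in step 208 can be sketched as below. The text lists the termination conditions but not how they are evaluated, so this Python sketch assumes that the region is grown one CpG at a time on each side and that conditions (2) to (4) are checked on read counts pooled over the whole extended region. The site tuple layout (position, methylated_a, total_a, methylated_b, total_b) and the function names are hypothetical.

```python
import math


def chi2_p(a, b, c, d):
    """p-value of Pearson's chi-square test (1 d.f.) on the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return 1.0
    stat = n * (a * d - b * c) ** 2 / margins
    return math.erfc(math.sqrt(stat / 2))


def region_still_differs(sites, lo, hi, fold=2.0, min_level=0.20, alpha=0.01):
    """False as soon as termination condition (2), (3) or (4) holds for sites[lo..hi]."""
    ma = sum(s[1] for s in sites[lo:hi + 1])
    ta = sum(s[2] for s in sites[lo:hi + 1])
    mb = sum(s[3] for s in sites[lo:hi + 1])
    tb = sum(s[4] for s in sites[lo:hi + 1])
    if ta == 0 or tb == 0:
        return False
    level_a, level_b = ma / ta, mb / tb
    if max(level_a, level_b) < min_level:                       # condition (3)
        return False
    if max(level_a, level_b) < fold * min(level_a, level_b):    # condition (2)
        return False
    return chi2_p(ma, ta - ma, mb, tb - mb) <= alpha            # condition (4)


def extend_seed(sites, start, end, max_gap=200):
    """Extend the seed sites[start..end] in both directions; return (first, last)."""
    lo, hi = start, end
    # Leftward: stop on a gap above max_gap (condition 1) or when the region fails.
    while (lo > 0 and sites[lo][0] - sites[lo - 1][0] <= max_gap
           and region_still_differs(sites, lo - 1, hi)):
        lo -= 1
    # Rightward: same checks on the other side.
    while (hi < len(sites) - 1 and sites[hi + 1][0] - sites[hi][0] <= max_gap
           and region_still_differs(sites, lo, hi + 1)):
        hi += 1
    return lo, hi


sites = [
    (100, 1000, 2000, 1000, 2000),  # deeply covered site with equal methylation
    (150, 18, 20, 2, 20),
    (200, 18, 20, 2, 20),
    (250, 18, 20, 2, 20),
    (300, 18, 20, 2, 20),
    (350, 18, 20, 2, 20),
    (400, 18, 20, 2, 20),
    (700, 18, 20, 2, 20),           # differential, but 300 bp from its neighbour
]
print(extend_seed(sites, 2, 6))
```

Under the pooled-region assumption a single weakly covered, non-differential CpG does not stop the extension; a stricter per-site check would be an equally defensible reading of the conditions.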
In step 210, the candidate tDMRs are filtered based on filter conditions, which comprise:
(1) FDR (false discovery rate) <= 0.05;
(2) the average coverage of the resulting tDMR region is greater than 20 reads (a read is a DNA sequence of a certain length obtained by next-generation sequencing);
(3) the coverage of each individual CpG site is greater than 10 reads;
(4) a sampling-with-replacement test is performed on the CpG sites in the resulting tDMR (that is, a site is drawn at random from all the CpG sites and tested, then returned to the pool so that it can be drawn again next time), and the accuracy of the result must be above 95%.
In step 212, the final tDMR result is obtained from the filtering.
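Filter conditions (1) to (3) can be sketched in Python as follows. The text does not say how the FDR is computed; the Benjamini-Hochberg adjustment is assumed here because it is a common choice. The candidate record layout (a dict with an "id", a region p-value "p", and a list of per-site read depths "site_cov") is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues


def filter_candidates(candidates, max_fdr=0.05, min_mean_cov=20, min_site_cov=10):
    """Keep candidates that satisfy filter conditions (1), (2) and (3)."""
    qvalues = benjamini_hochberg([c["p"] for c in candidates])
    kept = []
    for cand, q in zip(candidates, qvalues):
        cov = cand["site_cov"]
        if q > max_fdr:                            # condition (1): FDR <= 0.05
            continue
        if sum(cov) / len(cov) <= min_mean_cov:    # condition (2): mean coverage > 20 reads
            continue
        if min(cov) <= min_site_cov:               # condition (3): every site > 10 reads
            continue
        kept.append(cand)
    return kept


candidates = [
    {"id": "r1", "p": 0.001, "site_cov": [30, 25, 22, 28, 31]},
    {"id": "r2", "p": 0.002, "site_cov": [30, 25, 9, 28, 31]},    # one site too shallow
    {"id": "r3", "p": 0.8, "site_cov": [40, 40, 40, 40, 40]},     # not significant
    {"id": "r4", "p": 0.003, "site_cov": [15, 16, 14, 15, 12]},   # mean coverage too low
]
print([c["id"] for c in filter_candidates(candidates)])
```

A single coverage list per candidate is used for brevity; with two samples, one option is to apply the same thresholds to each sample separately.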
Note that the above embodiment detects CG sites. Those skilled in the art will understand that the method of the invention applies equally to CHH and CHG sites, where H represents any one of A, C, and T.
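The three contexts can be told apart from the two bases that follow a cytosine. The Python sketch below shows this for the forward strand only; the function name is hypothetical, and handling of the reverse strand is left out for brevity.

```python
from typing import Optional

H_BASES = {"A", "C", "T"}  # H stands for any base other than G


def cytosine_context(seq: str, pos: int) -> Optional[str]:
    """Classify the forward-strand cytosine at seq[pos] as "CG", "CHG" or "CHH".

    Returns None if the base is not C, or if the downstream bases needed
    for the classification are missing or ambiguous (for example N).
    """
    seq = seq.upper()
    if pos < 0 or pos >= len(seq) or seq[pos] != "C":
        return None
    first = seq[pos + 1] if pos + 1 < len(seq) else ""
    second = seq[pos + 2] if pos + 2 < len(seq) else ""
    if first == "G":
        return "CG"
    if first in H_BASES and second == "G":
        return "CHG"
    if first in H_BASES and second in H_BASES:
        return "CHH"
    return None


for sequence, position in [("ACGT", 1), ("CAG", 0), ("CTT", 0), ("CCG", 0), ("CCG", 1)]:
    print(sequence, position, cytosine_context(sequence, position))
```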
In the above embodiment, the specific preselection conditions, extension termination conditions, and filter conditions for tDMRs were determined through extensive experimental research and creative work, and give high accuracy. Subsequent validation shows that the accuracy of the tDMRs found by the above method is above 85%.
Fig. 3 is a block diagram of one embodiment of the system of the invention for detecting tissue-specific differentially methylated regions. As shown in Fig. 3, the detection system in this embodiment comprises a methylation information acquisition module 31, a seed region determination module 32, a seed region extension module 33, and a candidate region filter module 34. The methylation information acquisition module 31 obtains single-site methylation information across the whole genome by whole-genome sequencing; the seed region determination module 32 determines seed tDMRs on the whole genome from the single-site methylation information, based on preselection conditions; the seed region extension module 33 extends each seed tDMR in both directions and obtains candidate tDMRs based on extension termination conditions; the candidate region filter module 34 filters the candidate tDMRs based on filter conditions to obtain the tDMR result.
In the above embodiment, the methylation information acquisition module obtains single-site methylation information across the whole genome of each sample by whole-genome sequencing, and the subsequent modules analyze two sequenced samples on that basis, so tDMRs can be searched for across the whole genome and extracted simply and quickly, which greatly improves detection efficiency and reduces cost. Seed tDMR selection by the seed region determination module, seed tDMR extension by the seed region extension module, and candidate tDMR filtering by the candidate region filter module improve the accuracy and sensitivity of detection.
In one embodiment, the seed region determination module 32 scans the single-site methylation information across the whole genome with a sliding window and determines seed tDMRs based on preselection conditions. The sliding window is 5 CpGs long and advances in steps of 1 CpG. The preselection conditions comprise: (1) p <= 0.05 in a chi-square test (combined with Fisher's exact test); (2) a two-fold difference in methylation level; (3) a methylation level of 20% or more in at least one sample. According to one embodiment of the detection system, the extension termination conditions comprise: the distance between two consecutive CpGs exceeds 200 bp; the average methylation levels of the two samples differ by less than two-fold; the methylation levels of both samples in the region are below 20%; p > 0.01 in the chi-square test. According to one embodiment of the detection system, the filter conditions comprise: FDR <= 0.05; the average coverage of the resulting tDMR region is greater than 20 reads; the coverage of each individual CG site is greater than 10 reads; the accuracy of a sampling-with-replacement test on the CpG sites in the resulting tDMR is above 95%.
In the above embodiment, the specific preselection conditions, extension termination conditions, and filter conditions for tDMRs were determined through extensive experimental research and creative work, and give high accuracy. Subsequent validation shows that the accuracy of the tDMRs found by the above method is above 85%.
Fig. 4 is a block diagram of another embodiment of the system of the invention for detecting tissue-specific differentially methylated regions. Modules in Fig. 4 that carry the same reference numerals as in Fig. 3 are covered by the corresponding description of Fig. 3 and, for brevity, are not described again here. Compared with Fig. 3, the methylation information acquisition module 41 in Fig. 4 comprises a sodium bisulfite treatment device 411 and a whole-genome comparison device 412. The sodium bisulfite treatment device 411 treats genomic DNA with bisulfite so that unmethylated cytosines are deaminated and converted to uracil while methylated cytosines remain unchanged; the whole-genome comparison device 412 sequences the treated genome and compares it with the untreated whole-genome sequence to determine the methylated CpG sites on the whole genome.
In the above embodiment, the sodium bisulfite treatment device uses bisulfite to deaminate unmethylated cytosines in the DNA and convert them to uracil while methylated cytosines remain unchanged; the whole-genome comparison device sequences the treated genome and compares it with the untreated sequence to decide whether each CpG site is methylated. The methylation state of every CpG site on the genome can thus be determined with high reliability and precision.
Validation in multiple species (not limited to mammals) shows that the method and system of the invention are more accurate and more sensitive than the method of Vardhman Rakyan et al., and are highly reliable.
An application example of the present invention is described below.
The sample data used in this application example are the FASTA data of the fibroblast cell line IMR90 and of YH No. 1. The download addresses of the two data sets are:
imr90:http://neomorph.salk.edu/human_methylome/data.html
YH:http://www.ncbi.nlm.nih.gov/Traces/wgs/?val=ADDF
In this application example, several processing steps of the technical scheme are implemented in software. The software can run on a Unix/Linux operating system and is invoked from the Unix/Linux command line. The command-line parameters used are given along with the description below.
Data preparation is carried out first. After download, the data go through alignment, duplicate removal, and extraction of methylation information, yielding cout files (files that record the methylation status of cytosine sites) for YH and IMR90. These two cout files are the sample input files for tDMR extraction. Note that the method of the invention can be used for any species for which cout files can be obtained and is not limited to what is listed in the embodiments, so its range of application is very wide.
The format of the cout file is as follows:
[Table 1 (image BSA00000358124200101)]
Step 1: select seed tDMRs
The single-site methylation information across the whole genome contained in the cout files of the two samples is scanned with a sliding window that is 5 CpGs long and advances in steps of 1 CpG. The preselection conditions are: (1) p <= 0.05 in a chi-square test (combined with Fisher's exact test); (2) a two-fold difference in methylation level; (3) a methylation level of 20% or more in at least one sample.
The command line is:
"./tdmr slide -c CG YH.cout/chr$i.cout imr90.cout/chr$i.cout > outfile/CG/chr$i.CG; echo slide done;
The parameter -c specifies the cytosine context to search for. There are usually three options: CG, CHH, and CHG. CpG sites are studied here, so CG is selected.
Output: 916949 seed tDMRs in total.
The format of the output file is as follows:
[Table 2 (image BSA00000358124200111)]
In this step, seed tDMRs can be searched for in parallel, one chromosome at a time, across the whole genome, which reduces running time and improves efficiency.
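Per-chromosome parallelism of this kind can be sketched in Python as below. The argument list mirrors the slide command shown above, with the chromosome name substituted for chr$i; the helper names and the use of a thread pool are assumptions. Threads are sufficient when each worker only launches and waits for an external program; pure-Python computation would need a process pool instead.

```python
from concurrent.futures import ThreadPoolExecutor


def build_slide_command(chrom, context="CG"):
    """Argument list of the seed-selection call for one chromosome, e.g. "chr1"."""
    return ["./tdmr", "slide", "-c", context,
            f"YH.cout/{chrom}.cout", f"imr90.cout/{chrom}.cout"]


def run_per_chromosome(chromosomes, worker, max_workers=4):
    """Apply worker to every chromosome concurrently; return {chromosome: result}.

    chromosomes must be a list, because it is traversed twice.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(chromosomes, pool.map(worker, chromosomes)))


# Here the worker only builds the command; a real worker could pass the
# argument list to subprocess.run and redirect stdout to outfile/CG/<chrom>.CG.
commands = run_per_chromosome(["chr1", "chr2", "chr3"], build_slide_command)
for chrom, command in commands.items():
    print(chrom, " ".join(command))
```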
Step 2: two-way extension of seed tDMRs
Each seed tDMR is extended in both directions to obtain candidate tDMRs. The extension termination conditions are: (1) the distance between two consecutive CpGs exceeds 200 bp; (2) the average methylation levels of the two samples differ by less than two-fold; (3) the methylation levels of both samples in the region are below 20%; (4) p > 0.01 in the chi-square test.
The command line is:
./tdmr extend -t 8 -c CG YH.cout/chr$i.cout imr90.cout/chr$i.cout outfile/CG/chr$i.CG > outfile/CG/chr$i.CG.ext; echo extend done;
The parameter -t specifies the number of CPUs to use while the program runs.
The parameter -c specifies the cytosine context (three options: CG, CHH, and CHG; CG is chosen here).
Output: 279004 tDMRs in total after extension.
The first 10 columns of the output file format are as follows:
[Table 3 (image BSA00000358124200121)]
The last 8 columns are as follows:
[Table 4 (image BSA00000358124200122)]
Step 3: filter candidate tDMRs
The candidate tDMRs are filtered based on filter conditions, which comprise: (1) FDR <= 0.05; (2) the average coverage of the resulting tDMR region is greater than 20 reads; (3) the coverage of each individual CG site is greater than 10 reads; (4) the accuracy of a sampling-with-replacement test on the CpG sites in the resulting tDMR is above 95%.
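Condition (4) is not spelled out further in the text. One possible reading, sketched below in Python purely as an assumption, is a bootstrap over the CpG sites of a tDMR: the sites are resampled with replacement many times, and the "accuracy" is the fraction of resamples in which the pooled methylation levels still differ two-fold in the original direction. The site tuple layout (methylated_a, total_a, methylated_b, total_b), the number of resamples, and the fixed random seed are all illustrative choices.

```python
import random


def pooled_levels(sites):
    """Pooled methylation levels of samples A and B over a list of sites."""
    ma = sum(s[0] for s in sites)
    ta = sum(s[1] for s in sites)
    mb = sum(s[2] for s in sites)
    tb = sum(s[3] for s in sites)
    return ma / ta, mb / tb  # coverage is assumed to be non-zero after filtering


def resampling_accuracy(sites, n_resamples=1000, fold=2.0, seed=0):
    """Fraction of with-replacement resamples that reproduce the two-fold difference."""
    rng = random.Random(seed)  # fixed seed so that repeated runs agree
    level_a, level_b = pooled_levels(sites)
    a_is_higher = level_a > level_b
    hits = 0
    for _ in range(n_resamples):
        # Draw as many sites as the region has, returning each to the pool.
        resample = [rng.choice(sites) for _ in sites]
        ra, rb = pooled_levels(resample)
        high, low = (ra, rb) if a_is_higher else (rb, ra)
        if high > low and high >= fold * low:
            hits += 1
    return hits / n_resamples


consistent = [(18, 20, 2, 20)] * 5
mixed = [(18, 20, 2, 20), (2, 20, 18, 20), (10, 20, 10, 20)]
print(resampling_accuracy(consistent))
print(resampling_accuracy(mixed) > 0.95)
```

A region whose sites all point the same way scores close to 1, whereas a region driven by one or two sites scores much lower and would be removed by the 95% threshold.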
The tDMRs are filtered with standard Linux commands:
sort -k 2 -n outfile/CG/chr$i.CG.ext | awk '\$16>0.95{a=\$1;for(i=2;i<16;i++){a=a\"\t\"\$i}{print a}}' | uniq > outfile/CG/chr$i.CG.ext.filter; echo filter done"
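For readers less familiar with awk, the pipeline above does three things: it sorts the records numerically on column 2, keeps only records whose column 16 exceeds 0.95 while printing their first 15 columns separated by tabs, and drops adjacent duplicate lines. A rough Python equivalent is sketched below. It assumes whitespace-separated records with a numeric column 2, and the interpretation of column 16 as the resampling accuracy of filter condition (4) is an inference from the 0.95 threshold, not something the text states. The order of records that tie on column 2 may differ from that of sort.

```python
def filter_extended(lines, min_accuracy=0.95):
    """Rough Python equivalent of the sort | awk | uniq pipeline above."""
    # Split on whitespace like awk; ignore records with fewer than 16 columns.
    rows = [fields for fields in (line.split() for line in lines) if len(fields) >= 16]
    rows.sort(key=lambda fields: float(fields[1]))        # sort -k 2 -n
    kept = []
    for fields in rows:
        if float(fields[15]) > min_accuracy:              # awk: $16 > 0.95
            record = "\t".join(fields[:15])               # awk: print columns 1 to 15
            if not kept or kept[-1] != record:            # uniq: drop adjacent duplicates
                kept.append(record)
    return kept


filler = ["f%d" % k for k in range(3, 16)]  # placeholder values for columns 3 to 15


def make_row(position, accuracy):
    return "\t".join(["chr1", str(position)] + filler + [str(accuracy)])


example = [make_row(200, 0.99), make_row(100, 0.90), make_row(50, 0.97), make_row(200, 0.99)]
for record in filter_extended(example):
    print(record)
```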
Output: 36924 tDMRs in total remain after filtering. The result was validated through the correlation between tDMRs and gene expression: all genes occurring in the tDMRs were identified, bioinformatic analysis was used to check whether the relationship between their expression levels and methylation rates agrees with the known relationship, and the agreement was tallied. The accuracy of the tDMRs found by this method can be above 85%.
The output file has 15 columns in total.
The first 8 columns of the output file format are as follows:
[Table 5 (image BSA00000358124200131)]
The last 7 columns are as follows:
[Table 6]
In the above application example, the methylation state of every CpG site on the whole genome is determined, and the tDMR software developed on this basis analyzes two sequenced samples at the whole-genome level and finds the differentially methylated regions between tissues simply and quickly, which greatly improves detection efficiency and reduces cost. The processing can be run in parallel, one chromosome at a time, across the whole genome, which further improves speed and efficiency.
The description of the present invention is given for the purposes of illustration and description and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to better explain the principles and practical application of the invention, and to enable those of ordinary skill in the art to understand the invention and design various embodiments, with various modifications, suited to particular uses.

Claims (10)

1. A method for detecting tissue-specific differentially methylated regions (tDMRs), characterized in that it comprises:
obtaining single-site methylation information across the whole genome by whole-genome sequencing;
determining seed tDMRs on the whole genome from said single-site methylation information, based on preselection conditions;
extending said seed tDMRs in both directions and obtaining candidate tDMRs based on extension termination conditions;
filtering said candidate tDMRs based on filter conditions to obtain the tDMR result.
2. The detection method according to claim 1, characterized in that determining seed tDMRs on the whole genome from said single-site methylation information based on preselection conditions comprises:
scanning said single-site methylation information across the whole genome with a sliding window, and determining seed tDMRs on the whole genome based on the preselection conditions.
3. The detection method according to claim 2, characterized in that said single-site methylation information is scanned with a sliding window that is 5 CpGs long and advances in steps of 1 CpG; said preselection conditions comprise:
(1) p <= 0.05 in a chi-square test;
(2) a two-fold difference in methylation level; and
(3) a methylation level of 20% or more in at least one sample.
4. The detection method according to claim 1, characterized in that said extension termination conditions comprise:
(1) the distance between two consecutive CpGs exceeds 200 bp;
(2) the average methylation levels of the two samples differ by less than two-fold;
(3) the methylation levels of both samples in the region are below 20%;
(4) p > 0.01 in the chi-square test.
5. The detection method according to claim 1, characterized in that said filter conditions comprise:
(1) FDR <= 0.05;
(2) the average coverage of the resulting tDMR region is greater than 20 reads;
(3) the coverage of each individual CG site is greater than 10 reads;
(4) the accuracy of a sampling-with-replacement test on the CpG sites in the resulting tDMR is above 95%.
6. The detection method according to claim 1, characterized in that obtaining single-site methylation information across the whole genome by whole-genome sequencing comprises:
treating genomic DNA with bisulfite so that unmethylated cytosines are deaminated and converted to uracil while methylated cytosines remain unchanged;
sequencing the treated genome and comparing it with the untreated whole-genome sequence to determine the methylated CpG sites on the whole genome.
7. A system for detecting tissue-specific differentially methylated regions (tDMRs), characterized by comprising:
A methylation information acquisition module, configured to obtain single-point methylation information on the whole genome by whole-genome sequencing;
A seed region determination module, configured to determine seed tDMRs on the whole genome based on a pre-selection condition according to the single-point methylation information on the whole genome;
A seed region extension module, configured to extend the seed tDMRs to both sides and obtain candidate tDMRs based on an extension termination condition;
A candidate region filtering module, configured to filter the candidate tDMRs based on a filtering condition to obtain the tDMR result.
8. The detection system according to claim 7, characterized in that the seed region determination module scans the genome-wide single-point methylation information with a sliding window and determines seed tDMRs on the whole genome based on the pre-selection condition; wherein the scan uses a sliding window 5 CpGs long with a step size of 1 CpG; the pre-selection condition comprises:
(1) a chi-square test p-value <= 0.05;
(2) a significant two-fold difference in methylation level; and
(3) a methylation level above 20% in at least one sample.
9. The detection system according to claim 7, characterized in that the extension termination condition comprises:
(1) the distance between two consecutive CpGs exceeds 200 bp;
(2) the average methylation levels of the two samples differ by less than two-fold;
(3) the methylation levels of both samples in the region are below 20%;
(4) a chi-square test p-value > 0.01;
and/or
the filtering condition comprises:
(1) FDR <= 0.05;
(2) the average coverage of the obtained tDMR region is greater than 20 reads;
(3) the coverage of each single CG site obtained is greater than 10 reads;
(4) a sampling-with-replacement test performed on the CpG sites in the obtained tDMR gives an accuracy above 95%.
10. The detection system according to claim 7, characterized in that the methylation information acquisition module comprises:
A sodium bisulfite treatment device, configured to treat the whole-genome DNA with bisulfite so that unmethylated cytosines are deaminated and converted to uracil, while methylated cytosines remain unchanged;
A whole-genome alignment device, configured to sequence the treated whole genome and compare it with the untreated whole-genome sequence to determine the methylated CpG sites on the whole genome.
CN2010105571311A 2010-11-24 2010-11-24 Method and system for detecting tissue-specific differentially methylated region (tDMR) Active CN102061337B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010105571311A CN102061337B (en) 2010-11-24 2010-11-24 Method and system for detecting tissue-specific differentially methylated region (tDMR)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010105571311A CN102061337B (en) 2010-11-24 2010-11-24 Method and system for detecting tissue-specific differentially methylated region (tDMR)

Publications (2)

Publication Number Publication Date
CN102061337A true CN102061337A (en) 2011-05-18
CN102061337B CN102061337B (en) 2013-11-20

Family

ID=43996843

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010105571311A Active CN102061337B (en) 2010-11-24 2010-11-24 Method and system for detecting tissue-specific differentially methylated region (tDMR)

Country Status (1)

Country Link
CN (1) CN102061337B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982253A (en) * 2011-09-02 2013-03-20 深圳华大基因科技有限公司 Detection method and device of methylation difference of multiple samples
WO2013097061A1 (en) * 2011-12-31 2013-07-04 深圳华大基因研究院 Bs and rrbs sequencing-based bioinformatics analysis method and device
CN107451420A (en) * 2017-07-26 2017-12-08 同济大学 The differential methylation parser of purity effect is considered based on DNA methylation data
CN109637583A (en) * 2018-12-20 2019-04-16 中国科学院昆明植物研究所 A kind of detection method in Plant Genome differential methylation region
WO2019129200A1 (en) * 2017-12-28 2019-07-04 安诺优达基因科技(北京)有限公司 C-site extraction method and apparatus
WO2020155623A1 (en) * 2019-01-31 2020-08-06 郑州云海信息技术有限公司 Sequence alignment filtering processing method, system and device, and readable storage medium
CN113454219A (en) * 2020-08-10 2021-09-28 华大数极生物科技(深圳)有限公司 Methylation markers for detection and diagnosis of liver cancer
WO2021238441A1 (en) * 2020-05-27 2021-12-02 广州市基准医疗有限责任公司 Vectorized representation method and apparatus for methylation level, and method and apparatus for testing specific sequencing window
WO2023082140A1 (en) * 2021-11-11 2023-05-19 华大数极生物科技(深圳)有限公司 Nucleic acid detection kit for diagnosing liver cancer

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101205559A (en) * 2006-12-19 2008-06-25 上海生物芯片有限公司 Oligonucleotide chip for detecting complete genome CpG island and uses thereof

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101205559A (en) * 2006-12-19 2008-06-25 上海生物芯片有限公司 Oligonucleotide chip for detecting complete genome CpG island and uses thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KIM HC ET AL: "CpG island methylation as an early event during adenoma progression in carcinogenesis of sporadic colorectal cancer", 《JOURNAL OF GASTROENTEROLOGY AND HEPATOLOGY》 *
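ZHAO Guisen et al.: "Screening techniques for genomic differentially methylated fragments", 《医学分子生物学杂志》 (Journal of Medical Molecular Biology) *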

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982253A (en) * 2011-09-02 2013-03-20 深圳华大基因科技有限公司 Detection method and device of methylation difference of multiple samples
CN102982253B (en) * 2011-09-02 2015-11-25 深圳华大基因科技服务有限公司 Methylation differential detection method and device between a kind of multisample
WO2013097061A1 (en) * 2011-12-31 2013-07-04 深圳华大基因研究院 Bs and rrbs sequencing-based bioinformatics analysis method and device
CN107451420A (en) * 2017-07-26 2017-12-08 同济大学 The differential methylation parser of purity effect is considered based on DNA methylation data
CN109979534A (en) * 2017-12-28 2019-07-05 安诺优达基因科技(北京)有限公司 A kind of site C extracting method and device
WO2019129200A1 (en) * 2017-12-28 2019-07-04 安诺优达基因科技(北京)有限公司 C-site extraction method and apparatus
CN109979534B (en) * 2017-12-28 2021-07-09 浙江安诺优达生物科技有限公司 C site extraction method and device
CN109637583A (en) * 2018-12-20 2019-04-16 中国科学院昆明植物研究所 A kind of detection method in Plant Genome differential methylation region
CN109637583B (en) * 2018-12-20 2020-06-16 中国科学院昆明植物研究所 Method for detecting differential methylation region of plant genome
WO2020155623A1 (en) * 2019-01-31 2020-08-06 郑州云海信息技术有限公司 Sequence alignment filtering processing method, system and device, and readable storage medium
WO2021238441A1 (en) * 2020-05-27 2021-12-02 广州市基准医疗有限责任公司 Vectorized representation method and apparatus for methylation level, and method and apparatus for testing specific sequencing window
CN113454219A (en) * 2020-08-10 2021-09-28 华大数极生物科技(深圳)有限公司 Methylation markers for detection and diagnosis of liver cancer
WO2022032429A1 (en) * 2020-08-10 2022-02-17 华大数极生物科技(深圳)有限公司 Methylation markers for liver cancer detection and diagnosis
CN113454219B (en) * 2020-08-10 2024-03-08 华大数极生物科技(深圳)有限公司 Methylation marker for liver cancer detection and diagnosis
WO2023082140A1 (en) * 2021-11-11 2023-05-19 华大数极生物科技(深圳)有限公司 Nucleic acid detection kit for diagnosing liver cancer

Also Published As

Publication number Publication date
CN102061337B (en) 2013-11-20

Similar Documents

Publication Publication Date Title
CN102061337B (en) Method and system for detecting tissue-specific differentially methylated region (tDMR)
AU2003241607B2 (en) Microarrays and method for running hybridization reaction for multiple samples on a single microarray
US9051616B2 (en) Diagnosing fetal chromosomal aneuploidy using massively parallel genomic sequencing
CN103014137B (en) Gene expression quantification analysis method
KR102667912B1 (en) Systems and methods for determining microsatellite instability
CN110211633B (en) Detection method for MGMT gene promoter methylation, processing method for sequencing data and processing device
CN116189763A (en) Single sample copy number variation detection method based on second generation sequencing
KR20210040714A (en) Method and appartus for detecting false positive variants in nucleic acid sequencing analysis
WO2006064631A1 (en) Method, program and system for the standardization of gene expression amount
CN102982253B (en) Methylation differential detection method and device between a kind of multisample
CN113355401A (en) NGS-based CNV analysis and detection method for glioma chromosomes
Kekeç et al. New generation genome sequencing methods
Van Paemel et al. Minimally invasive classification of pediatric solid tumors using reduced representation bisulfite sequencing of cell-free DNA: a proof-of-principle study
CN117344014B (en) Pancreatic cancer early diagnosis kit, method and device thereof
AU2013203079B2 (en) Diagnosing fetal chromosomal aneuploidy using genomic sequencing
Townsend December 2012 Biochem 218 A critical review of ChIP-seq enrichment analysis tools
CN116403641A (en) Method for eliminating base sequencing errors, method for identifying low-frequency mutation, and related device
CN116779036A (en) Rapid analysis method for sequencing targeting pathogen nanopores based on multiplex PCR
CA3099612A1 (en) Method of cancer prognosis by assessing tumor variant diversity by means of establishing diversity indices
AU2008278843B2 (en) Diagnosing fetal chromosomal aneuploidy using genomic sequencing
Rezaeian et al. A new multi-level thresholding algorithm for finding peaks in ChIP-Seq data
Prados et al. Feature extraction from mass spectra for classification
Rezaeian et al. Finding genomic features from enriched regions in ChIP-Seq data
Chhatbar et al. Evaluation of CA repeat arrays as targets for MeCP2 function in the brain
expression profiles from Lopez-Rios SUPPLEMENTARY METHODS Datasets and samples

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
ASS Succession or assignment of patent right

Owner name: BGI TECHNOLOGY SOLUTIONS CO., LTD.

Free format text: FORMER OWNER: BGI-SHENZHEN CO., LTD.

Effective date: 20130422

C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20130422

Address after: 518083 science and Technology Pioneer Park, comprehensive building, Beishan Industrial Zone, Yantian District, Guangdong, Shenzhen 201

Applicant after: BGI Technology Solutions Co., Ltd.

Address before: Beishan Industrial Zone Building, Yantian District, Shenzhen, Guangdong Province 518083

Applicant before: BGI-Shenzhen Co., Ltd.

C14 Grant of patent or utility model
GR01 Patent grant