CN104131093A - DNase high-throughput sequencing detection signal processing method of DNA protein binding sites - Google Patents

DNase high-throughput sequencing detection signal processing method of DNA protein binding sites

Info

Publication number
CN104131093A
Authority
CN
China
Prior art keywords
dnase
seq
dna
checking
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410352942.6A
Other languages
Chinese (zh)
Other versions
CN104131093B (en)
Inventor
冯伟兴
廉德源
刘晓龙
宋锋飞
贺波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University
Priority to CN201410352942.6A
Publication of CN104131093A
Application granted
Publication of CN104131093B
Legal status: Expired - Fee Related

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a DNase high-throughput sequencing detection signal processing method for DNA protein binding sites. The method comprises the following steps: obtaining basic gene information together with the DNase-Seq and ChIP-Seq high-throughput sequencing detection data of the DNA protein binding sites; assessing the quality of the DNase-Seq high-throughput sequencing detection data and screening out the credible sequencing data; retaining, for each credible sequencing read, only the sequencing start position, which directly reflects the protein binding site; obtaining a DNase-Seq detection sample data set; normalizing the DNase-Seq detection sample data set; subdividing the DNase-Seq detection sample data set; and summing the data in the two resulting subsets vertically, separately for the front and back directions, to complete the operation. The invention greatly improves the recognition precision and recognition resolution of DNA protein binding sites.

Description

DNase high-throughput sequencing detection signal processing method for DNA protein binding sites
Technical field
The invention belongs to the field of processing detection signals of DNA protein binding sites, and in particular relates to a DNase high-throughput sequencing detection signal processing method for DNA protein binding sites.
Background art
At present, DNA protein binding sites are mainly detected by chromatin immunoprecipitation (ChIP). ChIP-Seq, which combines ChIP experiments with high-throughput sequencing, can efficiently detect, across the whole genome, the DNA segments bound by a protein of specific function. The principle of ChIP-Seq is as follows. First, chromatin immunoprecipitation uses a binding agent specific to the target protein to enrich the DNA fragments bound by that protein, and these fragments are purified and built into a library. The enriched DNA fragments are then sequenced at high throughput, and the millions of resulting reads are mapped precisely onto the genome, which yields the genome-wide DNA regions bound by the target protein. Various analysis algorithms then derive the precise DNA binding sites of the target protein. Although the methods for analyzing DNA protein binding sites from ChIP-Seq detection data are mature, the technique has shortcomings. First, the binding agent that enriches the target protein must be specific, so some proteins cannot be detected because no suitable specific binding agent can be found. Second, one experiment can detect only one protein, which is time-consuming, laborious, and costly, so the technique cannot be used on a large scale. Third, and more importantly, the DNA fragments bound to the target protein are relatively long, so only part of each fragment, at its two ends, can be sequenced. Because the sequenced region is not the binding site itself, the positioning resolution of the target protein binding site reaches at best a few tens of bases, even though the sequencing data themselves have single-base resolution.
To address these problems, a new technique for detecting DNA protein binding sites has emerged in recent years: detection based on DNase high-throughput sequencing information, that is, DNase-Seq. The technique, also called DNase footprinting, can precisely identify where DNA-binding proteins bind on the DNA molecule. Its principle is as follows. First, the DNA is digested with the DNase nuclease. DNA regions with no bound protein are cut by the DNase nuclease randomly and uniformly, whereas DNA regions with bound protein are specifically protected from cutting because the bound protein blocks the enzyme. The digested DNA fragments are then purified, built into a library, and sequenced, which yields the genome-wide cleavage information of the DNase nuclease. In this cleavage information, the signal at protein binding sites is specifically weakened, as if footprints were left on the DNA one after another, so the binding sites of DNA-binding proteins on the DNA molecule can be identified precisely. Compared with ChIP-Seq, the newer DNase-Seq high-throughput sequencing technique has marked advantages. First, because it does not depend on protein specificity, DNase-Seq can detect the binding sites of all DNA-binding proteins across the whole genome in a single experiment. Second, because all binding sites are detected at once, DNase-Seq greatly improves detection efficiency and lowers detection cost, which makes large-scale detection of DNA protein binding sites feasible. Third, and more importantly, because the sequencing start position of a DNase-Seq read is the cleavage position itself, DNase-Seq can reach single-base resolution in detecting DNA protein binding sites, and such high resolution is very helpful for follow-up studies. In-depth research on and analysis of DNase-Seq signal processing is therefore necessary.
Since DNase-based techniques for precisely detecting DNA protein binding sites were first proposed, they have been used to provide more precise experimental verification for research on DNA binding sites. Their experimental data were mostly processed by simple counting and then analyzed and interpreted through intuitive graphical displays. In 2006, Crawford and Sabo simultaneously proposed the DNase-Chip technique in Nature Methods, which uses microarrays for high-throughput measurement of DNase detection signals and opened the stage of large-scale, genome-wide detection and analysis of protein binding sites with DNase techniques. In 2010, Crawford further proposed the DNase-Seq technique, which can detect protein binding sites across the whole genome.
After DNase-Seq was proposed, many analysis methods followed. In 2010, Chen analyzed DNA protein binding sites from DNase-Seq data using a dynamic Bayesian network. In 2011, Fletez analyzed DNA protein binding sites from DNase-Seq data using support vector machines. In 2012, Pique proposed the CENTIPEDE method, which designs a rigorous statistical model to analyze the statistical characteristics of DNase-Seq data at DNA protein binding sites and then identifies the binding sites. In 2013, Jason proposed the Wellington method, which identifies DNA protein binding sites from the characteristics of DNase-Seq data in the open chromatin regions around them. In 2014, Sherwood identified DNA protein binding sites from the distinctive magnitude and shape features of the binding site regions.
However, all of the analysis methods proposed so far preprocess DNase-Seq detection data in the same way as ChIP-Seq detection data. In fact, because the detection principles differ markedly, DNase-Seq data offer a far higher detection resolution for DNA protein binding sites than ChIP-Seq data.
Summary of the invention
The object of the invention is to provide a DNase high-throughput sequencing detection signal processing method for DNA protein binding sites that improves the recognition precision and recognition resolution of DNA protein binding sites.
The invention is realized by the following technical solution:
A DNase high-throughput sequencing detection signal preprocessing method for DNA protein binding sites, comprising the following steps:
Step 1: obtain basic gene information, which comprises the base sequence of the genomic DNA and the positions of genes on the DNA, and obtain the DNase-Seq high-throughput sequencing detection data and the ChIP-Seq high-throughput sequencing detection data of the DNA protein binding sites;
Step 2: assess the quality of the DNase-Seq high-throughput sequencing detection data, screen out the credible sequencing data in which the quality score of every base site is 20 or above, and find the genomic origin of each credible sequencing read by mapping;
Step 3: for each credible sequencing read, retain only the sequencing start position, which directly reflects the protein binding site, to obtain the updated DNase-Seq data;
Step 4: count the updated DNase-Seq data points at each DNA base site and take the count as the DNase-Seq detection value of that base site; use the ChIP-Seq data to obtain the regions that contain binding sites of the DNA-binding protein of interest, and extract the complete DNase-Seq detection values within those regions to obtain the DNase-Seq detection sample data set;
Step 5: normalize the DNase-Seq detection sample data set, that is, divide the DNase-Seq detection value of each DNA base site by the sum of the DNase-Seq detection values of all DNA base sites in the DNase-Seq detection sample data set;
Step 6: subdivide the DNase-Seq detection sample data set;
The DNase-Seq detection sample data set is divided into a positive-strand forward sequencing subset, a positive-strand reverse sequencing subset, a negative-strand forward sequencing subset, and a negative-strand reverse sequencing subset. The positive-strand forward sequencing subset and the negative-strand reverse sequencing subset are merged, after mutual alignment, into the front detection data subset of the DNA protein binding sites, and the positive-strand reverse sequencing subset and the negative-strand forward sequencing subset are merged, after mutual alignment, into the back detection data subset of the DNA protein binding sites;
Step 7: sum the data in the front detection data subset and in the back detection data subset vertically, separately for the front and back directions, which completes the operation.
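Steps 5 and 7 can be written compactly as follows. The notation is an assumption introduced here for illustration and does not appear in the patent: x_{s,i} is the DNase-Seq detection value at base site i of sample s, L is the number of base sites in a sample, and F and B are the front and back detection data subsets of step 6. The normalization is taken per sample, following the detailed description.

```latex
% Step 5: per-sample normalization (assumed notation)
\tilde{x}_{s,i} = \frac{x_{s,i}}{\sum_{j=1}^{L} x_{s,j}}

% Step 7: vertical summation over the aligned samples of each subset
F_i = \sum_{s \in \mathcal{F}} \tilde{x}_{s,i}, \qquad
B_i = \sum_{s \in \mathcal{B}} \tilde{x}_{s,i}, \qquad i = 1, \dots, L
```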
Beneficial effects of the invention:
Because the detection principles differ markedly, DNase-Seq data offer a far higher detection resolution for DNA protein binding sites than ChIP-Seq data. The invention develops a processing method tailored to DNase-Seq detection data and markedly improves the recognition precision and recognition resolution of DNA protein binding sites.
Processing the DNase-Seq detection data highlights the detection information of DNA protein binding sites and lays the foundation for subsequent high-precision identification of protein binding sites.
Description of the drawings
Fig. 1: block diagram of the DNase high-throughput sequencing detection signal preprocessing method for DNA protein binding sites;
Fig. 2: operating flow of the conventional preprocessing;
Fig. 3: operating flow of the special preprocessing;
Fig. 4: DNase-Seq positive-strand forward sequencing detection information of the DNA protein binding sites, ATF1 protein in K562 cells;
Fig. 5: DNase-Seq negative-strand reverse sequencing detection information of the DNA protein binding sites, ATF1 protein in K562 cells;
Fig. 6: DNase-Seq positive-strand reverse sequencing detection information of the DNA protein binding sites, ATF1 protein in K562 cells;
Fig. 7: DNase-Seq negative-strand forward sequencing detection information of the DNA protein binding sites, ATF1 protein in K562 cells;
Fig. 8: DNase-Seq front detection information of the DNA protein binding sites, ATF1 protein in K562 cells;
Fig. 9: DNase-Seq back detection information of the DNA protein binding sites, ATF1 protein in K562 cells.
Detailed description
The invention is described in further detail below with reference to the accompanying drawings.
The DNase high-throughput sequencing detection data of DNA protein binding sites are obtained, analyzed in depth, and processed in a targeted way, with the final aim of highlighting the detection information of the protein binding sites. As shown in Fig. 1, the method comprises the following steps:
1. Data acquisition
The basic genome base sequence information is obtained from the international bioinformatics website UCSC, and the basic information on DNA-binding proteins is obtained from the TRANSFAC database of BIOBASE. The DNase-Seq and ChIP-Seq detection data of the DNA protein binding sites all come from data generated by the ENCODE project and published on the UCSC website. The DNase-Seq detection data are downloaded from the hg19 data provided on the ENCODE website by the DUKE and UW laboratories; the ChIP-Seq detection data are downloaded from the data provided by the SYDH laboratory.
2. Conventional preprocessing
As shown in Fig. 2, the DNase-Seq high-throughput sequencing detection data of DNA protein binding sites are short-read data and are generally produced on the high-throughput sequencing platform of Illumina. Data from this platform follow the FASTQ format, in which each read is stored in four lines. The first line carries information from the sequencing platform and begins with "@"; the second line is the sequence; the third line begins with "+", optionally followed by the same content as the first line, which may be omitted; the fourth line holds the base quality scores of the sequence, with one quality score for each base site. Here every base site in a credible sequencing read is required to have a quality score of 20 or above, which means that the probability of a wrong call at any base site is at most 1%.
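The base-quality screen described above can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: it assumes the Phred+33 quality encoding (older Illumina pipelines used Phred+64), reads the threshold as inclusive (Q >= 20, since Q = -10·log10(P) gives P = 1% at Q = 20), and the function names are hypothetical.

```python
from typing import Iterator, List, Tuple

PHRED_OFFSET = 33       # assumption: Sanger / Illumina 1.8+ encoding
MIN_BASE_QUALITY = 20   # threshold stated in the text (read here as inclusive)


def parse_fastq(lines: List[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality_string) from four-line FASTQ records."""
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = (s.rstrip("\n") for s in lines[i:i + 4])
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at line {i + 1}")
        yield header[1:], seq, qual


def phred_scores(qual: str) -> List[int]:
    """Decode a quality string into one integer Phred score per base."""
    return [ord(c) - PHRED_OFFSET for c in qual]


def is_credible(qual: str) -> bool:
    """A read is credible only if every base meets the quality threshold."""
    return all(q >= MIN_BASE_QUALITY for q in phred_scores(qual))


def filter_reads(lines: List[str]) -> List[Tuple[str, str]]:
    """Return (read_id, sequence) for every read that passes the screen."""
    return [(rid, seq) for rid, seq, qual in parse_fastq(lines) if is_credible(qual)]
```

A single low-quality base is enough to reject a read, which matches the requirement that all base sites meet the threshold.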
Next comes the mapping stage, in which the genomic origin of each sequencing read is found by mapping. Because reads from the Illumina sequencing platform have a uniform length, the BWA alignment software is suitable for mapping the reads to the genome.
After mapping, the mapping quality of each read must also be assessed. A credible sequencing read is required to have a mapping quality (MAPQ) score of 20 or above, meaning that the probability of an incorrect mapping is at most 1%, and to contain no more than 2 mismatched bases.
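The mapping-quality screen can be sketched as a simple filter over alignment records. The `Alignment` record below is a hypothetical stand-in for whatever an aligner's output parser provides (in SAM output, the mapping quality is the MAPQ column and the mismatch count is commonly derived from the NM tag); both thresholds are read as inclusive, which is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Alignment:
    """Hypothetical minimal record for one mapped read."""
    chrom: str
    start: int        # 0-based leftmost reference coordinate (assumed convention)
    length: int       # number of reference bases covered by the read
    strand: str       # "+" or "-"
    mapq: int         # mapping quality score
    mismatches: int   # number of mismatched bases


def credible_alignments(
    alignments: Iterable[Alignment],
    min_mapq: int = 20,
    max_mismatches: int = 2,
) -> List[Alignment]:
    """Keep reads with MAPQ >= min_mapq and at most max_mismatches mismatches."""
    return [
        a for a in alignments
        if a.mapq >= min_mapq and a.mismatches <= max_mismatches
    ]
```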
3. Special preprocessing
As shown in Fig. 3, the sequencing start position of a DNase-Seq read is the cleavage position itself and reflects the DNA protein binding site at single-base resolution. In the preprocessing stage, each DNase-Seq read can therefore be reduced from a line to a point: only the sequencing start position, which directly reflects the protein binding site, is retained, which effectively highlights the binding site information of DNA-binding proteins.
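Reducing a read "from a line to a point" amounts to keeping its 5' end in reference coordinates. The sketch below assumes 0-based coordinates in which `start` is the leftmost reference base of the alignment; under that convention the 5' end of a read mapped to the negative strand is its rightmost base. The function name and the coordinate convention are assumptions, not taken from the patent.

```python
def cut_site(start: int, length: int, strand: str) -> int:
    """Return the reference position of a read's sequencing start (5' end).

    start  : 0-based leftmost reference coordinate of the alignment (assumed)
    length : number of reference bases covered by the read
    strand : "+" for the positive strand, "-" for the negative strand
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if strand == "+":
        return start                 # 5' end is the leftmost base
    if strand == "-":
        return start + length - 1    # 5' end is the rightmost base
    raise ValueError(f"unknown strand: {strand!r}")
```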
Next, after the DNase-Seq reads have been reduced to points, the number of DNase-Seq data points at each DNA base site is taken as the DNase-Seq detection value of that base site. The ChIP-Seq data are then used to obtain the regions that contain a binding site of the DNA-binding protein of interest. Where such a region exists (its resolution is only a few tens of bases), the complete DNase-Seq detection values in the region are extracted, and the DNA protein binding site is located precisely, at single-base resolution, using the base preference of the protein. This yields a set of DNase-Seq detection samples that are known to contain a binding site of this DNA-binding protein.
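Counting cut sites per base and extracting the values around a located binding site can be sketched as follows. The key layout `(chrom, position, strand)`, the fixed-width window, and the function names are illustrative assumptions; the patent does not specify a data structure or a window size, and locating the site by base preference is not shown.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Site = Tuple[str, int, str]  # (chromosome, 0-based position, read strand)


def count_cut_sites(sites: Iterable[Site]) -> Counter:
    """DNase-Seq detection value = number of read start points at each base site."""
    return Counter(sites)


def extract_window(
    counts: Counter, chrom: str, strand: str, center: int, half_width: int
) -> List[int]:
    """Return detection values for positions center-half_width .. center+half_width.

    Positions with no read start get the value 0.
    """
    return [
        counts.get((chrom, pos, strand), 0)
        for pos in range(center - half_width, center + half_width + 1)
    ]
```

Keeping the read strand in the key preserves the information needed later, when samples are subdivided by sequencing direction.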
In addition, so that different samples do not contribute unequally in subsequent analysis, the sample data should also be normalized: the DNase-Seq detection value of each base site in a sample is divided by the sum of the DNase-Seq detection values of all base sites in that sample region.
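The normalization is a division of each value by the sample total. The sketch below applies it per sample, as described in the paragraph above; returning all zeros for a sample with no cut sites is an assumption made here to avoid division by zero, since the patent does not address that case.

```python
from typing import List, Sequence


def normalize(values: Sequence[float]) -> List[float]:
    """Divide each detection value by the sum of all values in the sample."""
    total = sum(values)
    if total == 0:
        # Assumption: an empty sample contributes nothing rather than raising.
        return [0.0] * len(values)
    return [v / total for v in values]
```

After this step every non-empty sample sums to 1, so a deeply sequenced region cannot dominate the later summation.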
The sample set is then subdivided. First, the samples are divided into two classes, positive strand and negative strand, according to the strand on which the DNA protein binding site lies. Second, each class is further divided into two parts, positive strand and negative strand, according to the direction of its sequencing reads. All samples in the sample set are thus divided into four parts: positive-strand forward sequencing, positive-strand reverse sequencing, negative-strand forward sequencing, and negative-strand reverse sequencing. Positive-strand forward sequencing denotes the detection values obtained from the positive strand when the DNA protein binding site lies on the positive strand; positive-strand reverse sequencing denotes the detection values obtained from the negative strand when the binding site lies on the positive strand; negative-strand forward sequencing denotes the detection values obtained from the positive strand when the binding site lies on the negative strand; and negative-strand reverse sequencing denotes the detection values obtained from the negative strand when the binding site lies on the negative strand. According to the biological detection mechanism of DNase-Seq at DNA protein binding sites, positive-strand forward sequencing and negative-strand reverse sequencing both detect the binding site from the front, whereas positive-strand reverse sequencing and negative-strand forward sequencing both detect it from the back. Therefore, positive-strand forward sequencing and negative-strand reverse sequencing can be merged, after mutual alignment, into the front detection data subset of the DNA protein binding sites, and positive-strand reverse sequencing and negative-strand forward sequencing can be merged, after mutual alignment, into the back detection data subset.
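The four-way subdivision and the merge into front and back subsets can be sketched as follows. Per the definitions above, a sample belongs to the front subset when the read strand equals the strand of the binding site, and to the back subset otherwise. The alignment step is an interpretation: this sketch simply reverses the profile of every negative-strand site so that all profiles run from the 5' end to the 3' end of the binding site. The `Sample` record and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sample:
    """Hypothetical record for one detection sample."""
    site_strand: str      # strand on which the binding site lies: "+" or "-"
    read_strand: str      # strand of the reads that were counted: "+" or "-"
    profile: List[float]  # detection values in genomic (left-to-right) order


def face(sample: Sample) -> str:
    """'front' if reads come from the same strand as the site, else 'back'."""
    return "front" if sample.site_strand == sample.read_strand else "back"


def oriented_profile(sample: Sample) -> List[float]:
    """Orient the profile 5'->3' relative to the binding site (assumed alignment)."""
    if sample.site_strand == "-":
        return sample.profile[::-1]
    return list(sample.profile)


def split_front_back(samples: List[Sample]) -> Dict[str, List[List[float]]]:
    """Merge the four sample classes into front and back detection subsets."""
    subsets: Dict[str, List[List[float]]] = {"front": [], "back": []}
    for s in samples:
        subsets[face(s)].append(oriented_profile(s))
    return subsets
```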
Finally, all samples are summed vertically, separately for the front and back directions, which removes noise on a statistical basis and highlights the overall DNase-Seq detection information at the DNA protein binding sites. This information can be used to extract and build the DNase-Seq signal pattern characteristic of the DNA-binding protein of interest, and to identify and detect the binding sites of this protein genome-wide with high precision and high resolution in subsequent experiments.
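Vertical summation is a column-wise sum over aligned profiles of equal width. The sketch below is a minimal version; requiring equal widths and returning an empty list for an empty subset are assumptions made for the example.

```python
from typing import List, Sequence


def vertical_sum(profiles: Sequence[Sequence[float]]) -> List[float]:
    """Sum aligned profiles position by position (column-wise)."""
    if not profiles:
        return []
    width = len(profiles[0])
    if any(len(p) != width for p in profiles):
        raise ValueError("all profiles must have the same width")
    return [sum(column) for column in zip(*profiles)]
```

Applied once to the front subset and once to the back subset, this gives the two aggregate signals that the figures of the experimental section display.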
4. Experimental verification
The DNase-Seq and ChIP-Seq detection data for the DNA binding sites of the DNA-binding protein ATF1 in the K562 cell line were obtained from the public UCSC bioinformatics website. Figs. 4 to 9 show the DNase-Seq detection data for the ATF1 protein after processing by the preprocessing method of the invention. The horizontal axis is the base position in the DNA region containing the DNA protein binding site, with the 5' end of the DNA region on the left and the 3' end on the right, and the vertical axis is the DNase-Seq detection value. For clarity of display, as shown in Fig. 4 and Fig. 5, the negative-strand reverse sequencing and positive-strand reverse sequencing plots are flipped horizontally relative to the conventional display. The region between the two vertical lines in each figure is the ATF1 binding site (DNA protein binding is directional, running from the upstream 5' end of the DNA toward the downstream 3' end).
Figs. 4 to 7 clearly show that, from the 5' end to the 3' end of the DNA, the DNase-Seq detection signals at the DNA binding site are essentially the same between positive-strand forward sequencing and negative-strand reverse sequencing, and likewise between positive-strand reverse sequencing and negative-strand forward sequencing. Between the two groups (positive-strand forward and negative-strand reverse sequencing versus positive-strand reverse and negative-strand forward sequencing), however, the DNase-Seq detection signals at the DNA binding site differ. Subdividing the DNase-Seq samples thus effectively highlights the detection information of DNA protein binding sites and achieves the purpose of preprocessing.
Finally, the positive-strand forward sequencing data and the negative-strand reverse sequencing data are aligned and added to reflect the detection of binding from the front of the DNA binding site, as shown in Fig. 8. Correspondingly, the positive-strand reverse sequencing data and the negative-strand forward sequencing data are aligned and added to reflect the detection of binding from the back of the DNA binding site, as shown in Fig. 9. This processing further highlights the detection information of DNA protein binding sites.
The experimental results show that the DNase-Seq signal preprocessing method proposed by the invention effectively highlights the detection information of DNA protein binding sites and lays a good foundation for subsequently extracting the recognition pattern of DNA protein binding sites precisely and for identifying those binding sites.

Claims (1)

  1. A DNase high-throughput sequencing detection signal preprocessing method for DNA protein binding sites, characterized in that it comprises the following steps:
    Step 1: obtain basic gene information, which comprises the base sequence of the genomic DNA and the positions of genes on the DNA, and obtain the DNase-Seq high-throughput sequencing detection data and the ChIP-Seq high-throughput sequencing detection data of the DNA protein binding sites;
    Step 2: assess the quality of the DNase-Seq high-throughput sequencing detection data, screen out the credible sequencing data in which the quality score of every base site is 20 or above, and find the genomic origin of each credible sequencing read by mapping;
    Step 3: for each credible sequencing read, retain only the sequencing start position, which directly reflects the protein binding site, to obtain the updated DNase-Seq data;
    Step 4: count the updated DNase-Seq data points at each DNA base site and take the count as the DNase-Seq detection value of that base site; use the ChIP-Seq data to obtain the regions that contain binding sites of the DNA-binding protein of interest, and extract the complete DNase-Seq detection values within those regions to obtain the DNase-Seq detection sample data set;
    Step 5: normalize the DNase-Seq detection sample data set, that is, divide the DNase-Seq detection value of each DNA base site by the sum of the DNase-Seq detection values of all DNA base sites in the DNase-Seq detection sample data set;
    Step 6: subdivide the DNase-Seq detection sample data set;
    The DNase-Seq detection sample data set is divided into a positive-strand forward sequencing subset, a positive-strand reverse sequencing subset, a negative-strand forward sequencing subset, and a negative-strand reverse sequencing subset. The positive-strand forward sequencing subset and the negative-strand reverse sequencing subset are merged, after mutual alignment, into the front detection data subset of the DNA protein binding sites, and the positive-strand reverse sequencing subset and the negative-strand forward sequencing subset are merged, after mutual alignment, into the back detection data subset of the DNA protein binding sites;
    Step 7: sum the data in the front detection data subset and in the back detection data subset vertically, separately for the front and back directions, which completes the operation.
CN201410352942.6A 2014-07-23 2014-07-23 DNase high-throughput sequencing detection signal processing method of DNA protein binding sites Expired - Fee Related CN104131093B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410352942.6A CN104131093B (en) 2014-07-23 2014-07-23 DNase high-throughput sequencing detection signal processing method of DNA protein binding sites

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410352942.6A CN104131093B (en) 2014-07-23 2014-07-23 DNase high-throughput sequencing detection signal processing method of DNA protein binding sites

Publications (2)

Publication Number Publication Date
CN104131093A true CN104131093A (en) 2014-11-05
CN104131093B CN104131093B (en) 2015-12-09

Family

ID=51803942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410352942.6A Expired - Fee Related CN104131093B (en) 2014-07-23 2014-07-23 DNase high-throughput sequencing detection signal processing method of DNA protein binding sites

Country Status (1)

Country Link
CN (1) CN104131093B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650313A (en) * 2016-09-29 2017-05-10 哈尔滨工程大学 Method for filtering DNA base tendentiousness deviation in DNase high-throughput sequencing data
CN110335640A (en) * 2019-07-09 2019-10-15 河南师范大学 Prediction method for drug-DBPs binding sites

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1980618A2 (en) * 1995-02-24 2008-10-15 Genentech, Inc. Human DNASE I variants

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TERRENCE S. FUREY: "ChIP–seq and beyond: new and improved methodologies to detect and characterize protein–DNA interactions", 《NATURE REVIEWS GENETICS》, vol. 13, 31 December 2012 (2012-12-31), pages 840 - 852, XP055235590, DOI: doi:10.1038/nrg3306 *
沈圣等: "下一代测序技术在表观遗传学研究中的重要应用及进展", 《遗传》, vol. 36, no. 3, 31 March 2014 (2014-03-31), pages 256 - 275 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650313A (en) * 2016-09-29 2017-05-10 哈尔滨工程大学 Method for filtering DNA base tendentiousness deviation in DNase high-throughput sequencing data
CN106650313B (en) * 2016-09-29 2019-10-18 哈尔滨工程大学 Method for filtering DNA base tendentiousness deviation in DNase high-throughput sequencing data
CN110335640A (en) * 2019-07-09 2019-10-15 河南师范大学 Prediction method for drug-DBPs binding sites

Also Published As

Publication number Publication date
CN104131093B (en) 2015-12-09

Similar Documents

Publication Publication Date Title
US10347365B2 (en) Systems and methods for visualizing a pattern in a dataset
Oh et al. Comparison of accuracy of whole-exome sequencing with formalin-fixed paraffin-embedded and fresh frozen tissue samples
CN111341383B (en) Method, device and storage medium for detecting copy number variation
Wang et al. The effect of methanol fixation on single-cell RNA sequencing data
CN112111565A (en) Mutation analysis method and device for cell free DNA sequencing data
US11773429B2 (en) Reduction of bias in genomic coverage measurements
Melsted et al. Fusion detection and quantification by pseudoalignment
Andrews et al. DeepSNVMiner: a sequence analysis tool to detect emergent, rare mutations in subsets of cell populations
CN101914619A (en) RNA (Ribonucleic Acid) sequencing quality control method and device relating to gene expression
CN110875082B (en) Microorganism detection method and device based on targeted amplification sequencing
CN117947163A (en) Method for evaluating background level of variant nucleic acid sample
CN107267613A (en) Sequencing data processing system and SMN gene detection systems
CN113066533A (en) mNGS pathogen data analysis method
CN104131093B (en) DNase high-throughput sequencing detection signal processing method of DNA protein binding sites
US20150142328A1 (en) Calculation method for interchromosomal translocation position
US20150310166A1 (en) Method and system for processing data for evaluating a quality level of a dataset
US20170206315A1 (en) Analysis method and information processing device
KR20210040714A (en) Method and appartus for detecting false positive variants in nucleic acid sequencing analysis
CN107653299A (en) Method for obtaining gene chip probe sequences based on high-throughput sequencing
CN114708908A (en) Method, computing device and storage medium for detecting micro residual focus of solid tumor
US20170046480A1 (en) Device and method for detecting the presence or absence of nucleic acid amplification
Li et al. Multi-platform and cross-methodological reproducibility of transcriptome profiling by RNA-seq in the ABRF next-generation sequencing study
Chong et al. SeqControl: process control for DNA sequencing
CN105205350A (en) Determination method for length of poly-basic group in Ion Torrent sequencing data
Wainer-Katsir et al. BIRD: identifying cell doublets via biallelic expression from single cells

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20151209

Termination date: 20210723