CN103853936A - Data processing method for chromatin immunoprecipitation high-throughput sequencing - Google Patents

Data processing method for chromatin immunoprecipitation high-throughput sequencing

Info

Publication number
CN103853936A
Authority
CN
China
Prior art keywords
sequence
chromatin
file
immunoprecipitation
sequence data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310610854.7A
Other languages
Chinese (zh)
Inventor
王立山
曹鑫恺
臧卫东
王媛媛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
FENGHE (SHANGHAI) INFORMATION TECHNOLOGY Co Ltd
Original Assignee
FENGHE (SHANGHAI) INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by FENGHE (SHANGHAI) INFORMATION TECHNOLOGY Co Ltd
Priority to CN201310610854.7A
Publication of CN103853936A
Legal status: Pending

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a data processing method for chromatin immunoprecipitation high-throughput sequencing (ChIP-seq) and belongs to the technical field of molecular biology. The method first removes low-quality sequence data from an initial sequence file, then aligns the filtered sequence data to a reference genome, counts the number and density distribution of signal peaks in different types of chromosomal region according to a classification of the reference genome, determines the genes neighboring each signal peak, performs gene ontology functional enrichment analysis on those genes, and finally generates a gene ontology functional enrichment result text file and a corresponding graphical display file. The method provides an efficient high-throughput data analysis workflow that integrates the individual sequencing analysis steps. It helps researchers quickly complete upstream read quality control and read filtering for a set of high-throughput data, reflects the quality of a ChIP-seq experiment through statistics on the aligned reads, and reveals how the reads are distributed along the chromosomes, thereby improving the efficiency of sequencing work.

Description

Data processing method for chromatin immunoprecipitation high-throughput sequencing
Technical field
The present invention relates to the technical field of molecular biology, in particular to the analysis of chromatin sequencing data, and specifically to a data processing method for chromatin immunoprecipitation high-throughput sequencing (ChIP-seq).
Background art
The emergence of next-generation high-throughput sequencing has greatly expanded the range of molecular biology approaches available for studying patterns of change in cells. Many high-throughput sequencing techniques, including ChIP-seq, RNA-seq, ChIRP-seq, Hi-C, MeDIP-seq and DNA-seq, are now used in molecular biology and basic medical research. Among them, ChIP-seq is a high-throughput method whose main purpose is to study interactions between proteins and chromosomal DNA. Its experimental part consists of two stages: preparation of the chromatin immunoprecipitation (ChIP) sample and deep sequencing. To prevent uneven quality of the raw sequences (reads) produced during sample preparation and sequencing from affecting downstream results, many laboratories currently carry out read quality control and filtering with custom scripts and small tools such as FastQC, the FASTX-Toolkit and Picard. However, there is still no effective, generally available solution showing practitioners how to apply these small tools within a ChIP-seq data analysis workflow.
Summary of the invention
The object of the invention is to overcome the above shortcomings of the prior art by providing a high-throughput data analysis workflow, designed for researchers, that integrates the individual sequencing analysis steps. The workflow helps researchers quickly complete upstream read quality control and read filtering for a set of high-throughput data, reflects the quality of a ChIP-seq experiment through statistics on the aligned reads, and reveals the distribution of reads along the chromosomes. It thereby streamlines data quality assessment by researchers and data analysts and improves working efficiency.
To achieve this object, the data processing method for chromatin immunoprecipitation high-throughput sequencing of the present invention comprises the following steps:
(1) the system obtains an initial ChIP-seq sequence file;
(2) the system removes low-quality sequence data from the initial ChIP-seq sequence file to obtain filtered sequence data;
(3) the system aligns the filtered sequence data to a reference genome and, according to the alignment results, retains only sequence data that align to a unique position on the genome with no more than 2 base mismatches;
(4) the system detects signal peak regions in the retained sequence data;
(5) the system classifies the reference genome into intergenic (Intergenic) regions and coding (Coding) regions, and further divides the coding regions into exon (Exon) regions, intron (Intron) regions, 5' untranslated (5' UTR) regions and 3' untranslated (3' UTR) regions;
(6) the system counts the number and density distribution of ChIP-seq signal peaks in the different types of chromosomal region, and computes the sequencing depth of the ChIP-seq sequence file in the different types of chromosomal region and the coverage per unit interval;
(7) the system examines the surroundings of each ChIP-seq signal peak, determines the genes neighboring each signal peak, performs gene ontology functional enrichment analysis on the basis of those neighboring genes, and generates a gene ontology functional enrichment result text file and a corresponding graphical display file. These files indicate the chromosomal location of the target protein in each sample, the potential target genes regulated by the target protein, and the molecular biological functions of the target protein in the sample.
In this method, where the initial ChIP-seq sequence file comprises a high-throughput sequence data set with multiple sets of replicate experimental data, the method further comprises the following step between steps (6) and (7):
(7-0) the system selects each not-yet-processed set of data in the high-throughput sequence data set, processes it according to steps (2) to (6), and integrates the results of all sets.
In this method, step (2) specifically comprises the following steps:
(21) the system removes low-quality sequence data from the initial ChIP-seq sequence file according to preset values for the lower limit of the base sequencing quality score (Phred score) within a single read, the upper limit of the percentage of low-quality bases within a single read, and the upper limit of the percentage of undetermined bases within a single read;
(22) the system removes the low-quality bases at the 3' end of each read to obtain the filtered sequence data.
In this method, step (4) is specifically: the system calculates the distance between the 5' ends of the retained ChIP-seq reads on the positive and negative strands, and detects signal peak regions according to the result of that calculation.
In this method, the gene ontology functional enrichment result text file and the corresponding graphical display file are those of any one of the three gene ontology categories: biological process (BP), molecular function (MF) or cellular component (CC).
In this method, the initial ChIP-seq sequence file is in FASTQ format.
With the data processing method of this invention, low-quality sequence data are first removed from the initial sequence file; the filtered sequence data are then aligned to the reference genome; the number and density distribution of signal peaks in the different types of chromosomal region are counted according to the classification of the reference genome; the genes neighboring each signal peak are determined and used as the basis for gene ontology functional enrichment analysis; and finally a gene ontology functional enrichment result text file and a corresponding graphical display file are generated. The method provides an efficient high-throughput data analysis workflow that integrates the individual sequencing analysis steps. It helps researchers quickly complete upstream read quality control and read filtering for a set of high-throughput data, reflects the quality of a ChIP-seq experiment through statistics on the aligned reads, and reveals the distribution of reads along the chromosomes. It thereby streamlines data quality assessment by researchers and data analysts and markedly improves the efficiency of ChIP-seq work.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the data processing method for chromatin immunoprecipitation high-throughput sequencing of the present invention.
Detailed description of the embodiments
To make the technical content of the present invention clearer, the following embodiments are described in detail.
Refer to Fig. 1, the flow chart of the steps of the data processing method of the present invention.
In one embodiment, as shown in Fig. 1, the data processing method for chromatin immunoprecipitation high-throughput sequencing comprises the following steps:
(1) the system obtains an initial ChIP-seq sequence file in FASTQ format;
(2) the system removes low-quality sequence data from the initial ChIP-seq sequence file to obtain filtered sequence data;
(3) the system aligns the filtered sequence data to a reference genome and, according to the alignment results, retains only sequence data that align to a unique position on the genome with no more than 2 base mismatches;
(4) the system detects signal peak regions in the retained sequence data;
(5) the system classifies the reference genome into intergenic (Intergenic) regions and coding (Coding) regions, and further divides the coding regions into exon (Exon) regions, intron (Intron) regions, 5' untranslated (5' UTR) regions and 3' untranslated (3' UTR) regions;
(6) the system counts the number and density distribution of ChIP-seq signal peaks in the different types of chromosomal region, and computes the sequencing depth of the ChIP-seq sequence file in the different types of chromosomal region and the coverage per unit interval;
(7) the system examines the surroundings of each ChIP-seq signal peak, determines the genes neighboring each signal peak, performs gene ontology functional enrichment analysis on the basis of those neighboring genes, and generates a gene ontology functional enrichment result text file and a corresponding graphical display file. These files indicate the chromosomal location of the target protein in each sample, the potential target genes regulated by the target protein, and the molecular biological functions of the target protein in the sample.
If the initial ChIP-seq sequence file comprises a high-throughput sequence data set with multiple sets of replicate experimental data, the method further comprises the following step between steps (6) and (7):
(7-0) the system selects each not-yet-processed set of data in the high-throughput sequence data set, processes it according to steps (2) to (6), and integrates the results of all sets.
In a more preferred embodiment, step (2) specifically comprises the following steps:
(21) the system removes low-quality sequence data from the initial ChIP-seq sequence file according to preset values for the lower limit of the base sequencing quality score (Phred score) within a single read, the upper limit of the percentage of low-quality bases within a single read, and the upper limit of the percentage of undetermined bases within a single read;
(22) the system removes the low-quality bases at the 3' end of each read to obtain the filtered sequence data.
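Sub-steps (21) and (22) can be illustrated with a minimal Python sketch. The patent does not disclose its threshold values or script code, so the defaults (`min_phred`, `max_low_frac`, `max_n_frac`), the Phred+33 quality encoding and all function names here are assumptions made for illustration only.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string (Phred+33 encoding is an assumption)."""
    return [ord(ch) - offset for ch in quality]


def passes_filter(seq, quality, min_phred=20, max_low_frac=0.5, max_n_frac=0.1):
    """Sub-step (21): reject reads with too many low-quality or undetermined bases."""
    if not seq:
        return False
    scores = phred_scores(quality)
    # Fraction of bases whose quality score is below the lower limit.
    low_frac = sum(q < min_phred for q in scores) / len(scores)
    # Fraction of undetermined (N) bases in the read.
    n_frac = seq.upper().count("N") / len(seq)
    return low_frac <= max_low_frac and n_frac <= max_n_frac


def trim_3prime(seq, quality, min_phred=20):
    """Sub-step (22): strip trailing bases whose quality is below min_phred."""
    scores = phred_scores(quality)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_phred:
        end -= 1
    return seq[:end], quality[:end]
```

A read would first be tested with `passes_filter` and, if kept, shortened with `trim_3prime`, mirroring the order of the two sub-steps.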
In a further preferred embodiment, step (4) is specifically: the system calculates the distance between the 5' ends of the retained ChIP-seq reads on the positive and negative strands, and detects signal peak regions according to the result of that calculation.
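One common way to estimate the distance between positive-strand and negative-strand 5' ends is a strand cross-correlation. The sketch below is a deliberately simplified, unnormalised version written for illustration; it is not the procedure of any particular tool, and the function name and the `max_shift` parameter are hypothetical.

```python
from collections import Counter


def estimate_strand_shift(plus_starts, minus_starts, max_shift=500):
    """Return the shift d (in bases) that maximises the overlap between
    plus-strand 5' ends moved downstream by d and minus-strand 5' ends."""
    plus = Counter(plus_starts)
    minus = Counter(minus_starts)
    best_shift, best_score = 0, -1
    for d in range(max_shift + 1):
        # Count coincidences between shifted plus-strand ends and minus-strand ends.
        score = sum(n * minus.get(pos + d, 0) for pos, n in plus.items())
        if score > best_score:
            best_shift, best_score = d, score
    return best_shift
```

The returned shift can then be supplied to a peak caller as the estimated spacing parameter.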
In a preferred embodiment, the gene ontology functional enrichment result text file and the corresponding graphical display file are those of any one of the three gene ontology categories: biological process (BP), molecular function (MF) or cellular component (CC).
In practical applications, the toolkit used by the system implementing the method of the present invention comprises 5 Python scripts and 6 R scripts in total, named as follows:
(1)PROGRAM_clean_reads_gen.py
(2)PROGRAM_fastq_trimmer.py
(3)PROGRAM_genomic_bin_gen.py
(4)PROGRAM_genomic_feature_gen.py
(5)PROGRAM_identical_reads_collapser.py
(6)PROGRAM_ChIPpeakAnno_GO_analysis_output_processing.r
(7)PROGRAM_ChIP-seq_peak_annotation.r
(8)PROGRAM_genomic_bin_seq_depth_breadth_stat.r
(9)PROGRAM_merging_peak_from_two_samples.r
(10)PROGRAM_peak_dens_in_diff_regions.r
(11)PROGRAM_reads_dens_in_diff_region.r
Each of the above scripts can be run independently or embedded in an existing data analysis workflow, which makes the toolkit very flexible.
The scripts are written in Python and R and can be used on Linux and MacOS platforms. They consume few system resources while running and can be used on any personal computer, workstation or server.
The data processing workflow of the method of the present invention is implemented with the above scripts as follows:
The workflow takes a ChIP-seq high-throughput data file in FASTQ format as its initial input file.
In the first step, PROGRAM_clean_reads_gen.py is used to filter the initial ChIP-seq FASTQ file. Low-quality reads are removed by setting the lower limit of the base sequencing quality score (Phred score) within a single read, the upper limit of the percentage of low-quality bases within a single read, and the upper limit of the percentage of undetermined bases within a single read. PROGRAM_fastq_trimmer.py is used to remove low-quality bases at the 3' end of each read. PROGRAM_identical_reads_collapser.py is used to remove PCR duplicates, and the reads that meet the filtering conditions are retained for subsequent analysis.
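The duplicate-removal part of the first step could look roughly like the following sketch. It assumes that a PCR duplicate is a read with an identical sequence and that the first copy seen is kept. The patent does not state how its collapser script defines or resolves duplicates, so both choices, and the helper names, are illustrative assumptions.

```python
def read_fastq(lines):
    """Yield (name, sequence, quality) from an iterable of FASTQ lines
    (four lines per record assumed)."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines between or after records
        seq = next(it, "").strip()
        next(it, "")  # '+' separator line
        qual = next(it, "").strip()
        yield header.strip()[1:], seq, qual


def collapse_identical_reads(records):
    """Keep one record per distinct sequence, preserving first-seen order."""
    seen = set()
    unique = []
    for name, seq, qual in records:
        if seq not in seen:
            seen.add(seq)
            unique.append((name, seq, qual))
    return unique
```

For a real file, `read_fastq` can be given an open file handle, since a handle is itself an iterable of lines.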
In the second step, the open-source software bowtie is integrated to align the filtered reads to the reference genome and, based on the alignment result of each read, only reads that align to a unique position on the genome with no more than 2 base mismatches are retained. The open-source software SPP and MaSC are integrated to estimate the distance between the 5' ends of the retained ChIP-seq reads on the positive and negative strands, and the resulting parameter is passed to the open-source software MASC to detect read signal peak regions.
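The retention rule in the second step (unique alignment position, at most 2 mismatches) can be expressed independently of any aligner. The sketch below assumes the alignments have already been parsed into simple tuples; the tuple layout and function name are hypothetical, and parsing of the aligner's actual output is left out.

```python
from collections import defaultdict


def keep_unique_low_mismatch(alignments, max_mismatches=2):
    """Keep reads that align to exactly one position with <= max_mismatches.

    `alignments` is an iterable of (read_name, chrom, pos, strand, mismatches).
    """
    by_read = defaultdict(list)
    for aln in alignments:
        by_read[aln[0]].append(aln)
    kept = []
    for hits in by_read.values():
        # A read is retained only if it has a single hit within the mismatch limit.
        if len(hits) == 1 and hits[0][4] <= max_mismatches:
            kept.append(hits[0])
    return kept
```

In practice an aligner can often enforce both conditions itself through its own options, in which case a post-filter like this serves only as a check.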
In the third step, PROGRAM_genomic_bin_gen.py and PROGRAM_genomic_feature_gen.py are used to divide the reference genome file into intervals and classify them by type. The reference genome is divided into Intergenic regions and Coding regions, and the Coding regions are further subdivided, with the help of the refGene file, into Exon regions, Intron regions, 5' UTR regions and 3' UTR regions. The processed results are used in the subsequent analysis.
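The region classification of the third step can be sketched for a single transcript as follows. The code assumes 0-based, half-open coordinates and per-transcript exon and CDS boundaries of the kind found in refGene-style annotation tables. The function names and the returned dictionary keys are hypothetical and do not describe the patent's scripts.

```python
def classify_transcript(strand, cds_start, cds_end, exon_starts, exon_ends):
    """Split one transcript into exon, intron, 5' UTR and 3' UTR intervals."""
    exons = sorted(zip(exon_starts, exon_ends))
    # Introns are the gaps between consecutive exons.
    introns = [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]
    left_utr, right_utr = [], []
    for start, end in exons:
        if start < cds_start:   # exonic part to the left of the CDS
            left_utr.append((start, min(end, cds_start)))
        if end > cds_end:       # exonic part to the right of the CDS
            right_utr.append((max(start, cds_end), end))
    # On the minus strand the 5' UTR lies at the right-hand end.
    five, three = (left_utr, right_utr) if strand == "+" else (right_utr, left_utr)
    return {"exon": exons, "intron": introns, "5UTR": five, "3UTR": three}


def intergenic_intervals(chrom_length, gene_intervals):
    """Return the gaps between (possibly overlapping) gene intervals."""
    gaps, cursor = [], 0
    for start, end in sorted(gene_intervals):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < chrom_length:
        gaps.append((cursor, chrom_length))
    return gaps
```

Everything not covered by a gene interval is reported as intergenic, which matches the two-level split described in the text.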
In the fourth step, BEDTools is used to process the intermediate results output by the second and third steps. On the BEDTools results, PROGRAM_reads_dens_in_diff_region.r is used to count the number and density distribution of ChIP-seq signal peaks in the different types of chromosomal region; PROGRAM_genomic_bin_seq_depth_breadth_stat.r is used to compute the sequencing depth of the ChIP-seq high-throughput reads file in the different types of chromosomal region and the coverage per unit interval; and PROGRAM_peak_dens_in_diff_regions.r is used to count the number and density distribution of ChIP-seq signal peaks in the different types of chromosomal region.
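The statistics of the fourth step reduce to a few simple ratios. The following sketch shows one plausible definition of each: peak density as peaks per megabase of a region type, depth as mean per-base coverage of a bin, and breadth as the covered fraction of a bin. These definitions and the function names are assumptions; the patent does not give formulas.

```python
def peak_density_by_region(peak_counts, region_lengths, per=1_000_000):
    """Peaks per `per` bases for each region type."""
    return {name: peak_counts.get(name, 0) * per / region_lengths[name]
            for name in region_lengths}


def depth_and_breadth(bin_start, bin_end, reads):
    """Mean depth and covered fraction of one genomic bin.

    `reads` is an iterable of (start, end) half-open alignment intervals.
    """
    size = bin_end - bin_start
    coverage = [0] * size
    for start, end in reads:
        # Only the part of the read that falls inside the bin is counted.
        for pos in range(max(start, bin_start), min(end, bin_end)):
            coverage[pos - bin_start] += 1
    depth = sum(coverage) / size
    breadth = sum(c > 0 for c in coverage) / size
    return depth, breadth
```

The per-base loop is fine for small bins; whole-genome data would normally be handled with interval tools rather than explicit arrays.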
In the fifth step, for a ChIP-seq high-throughput read data set with experimental replicates, each set of data is first processed separately through the first to fourth steps. PROGRAM_merging_peak_from_two_samples.r is then used to integrate all the data, and the result is written to the file "Merged_peaks.bed".
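How the replicate peaks are integrated is not specified in the text. The sketch below assumes the simplest option, a union in which overlapping or touching peaks from any replicate are fused into one interval, and formats the result as BED lines. The function names are hypothetical.

```python
def merge_peaks(*peak_sets):
    """Merge overlapping peaks from several replicates.

    Each peak is (chrom, start, end); overlapping or touching peaks on the
    same chromosome are fused into a single interval.
    """
    peaks = sorted(p for peak_set in peak_sets for p in peak_set)
    merged = []
    for chrom, start, end in peaks:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(p) for p in merged]


def to_bed(peaks):
    """Format peaks as tab-separated BED lines."""
    return "".join(f"{c}\t{s}\t{e}\n" for c, s, e in peaks)
```

An intersection-based rule (keeping only peaks seen in every replicate) would be a stricter alternative.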
In the sixth step, PROGRAM_ChIP-seq_peak_annotation.r is used to examine the surroundings of each ChIP-seq signal peak. By setting the species from which the ChIP-seq data originate, the gene annotation file of that species and the search range upstream and downstream of each gene's TSS, the genes neighboring each ChIP-seq signal peak are determined, and GO (gene ontology) functional enrichment analysis is performed on the basis of these genes. PROGRAM_ChIPpeakAnno_GO_analysis_output_processing.r is then used to filter and output the generated results. Depending on the input file supplied by the user, GO functional enrichment result text files and corresponding graphical display files can be generated for each of the three categories BP, MF and CC.
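The two ideas in the sixth step, assigning each peak to a nearby gene and testing GO terms for enrichment, can be sketched as below. The nearest-TSS rule, the `max_distance` default and the use of a hypergeometric test are assumptions chosen because they are common in this kind of analysis; the text does not state which statistic the R scripts use.

```python
from math import comb


def nearest_gene(peak_mid, tss_list, max_distance=5000):
    """Return the gene whose TSS is closest to the peak midpoint, or None
    if no TSS lies within max_distance. `tss_list` holds (position, gene)."""
    best = min(tss_list, key=lambda t: abs(t[0] - peak_mid), default=None)
    if best is None or abs(best[0] - peak_mid) > max_distance:
        return None
    return best[1]


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the GO term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)
```

Here `n` would be the number of peak-neighboring genes, `k` the number of those annotated with the term, and `N` and `K` the corresponding counts in the whole annotated genome.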
Finally, this ChIP-seq data processing and analysis workflow yields the chromosomal location of the target protein in each sample, the potential target genes regulated by the target protein, and the molecular biological functions of the target protein in the sample.
The method of the present invention can help research institutions and clinical medicine analyze the regulatory functions of disease-associated transcription factors, identify the downstream target genes regulated by those transcription factors, and systematically understand the underlying molecular mechanisms of disease. Combined with different types of chromatin immunoprecipitation experiment, it can also be used to study epigenetic regulation in research subjects across a range of frontier fields such as organismal development, cell differentiation and aging.
In this specification, the invention has been described with reference to specific embodiments. It is nevertheless evident that various modifications and changes can be made without departing from the spirit and scope of the invention. The specification and drawings are therefore to be regarded as illustrative rather than restrictive.

Claims (6)

1. A data processing method for chromatin immunoprecipitation high-throughput sequencing (ChIP-seq), characterized in that the method comprises the following steps:
(1) the system obtains an initial ChIP-seq sequence file;
(2) the system removes low-quality sequence data from the initial ChIP-seq sequence file to obtain filtered sequence data;
(3) the system aligns the filtered sequence data to a reference genome and, according to the alignment results, retains only sequence data that align to a unique position on the genome with no more than 2 base mismatches;
(4) the system detects signal peak regions in the retained sequence data;
(5) the system classifies the reference genome into intergenic regions and coding regions, and further divides the coding regions into exon regions, intron regions, 5' untranslated regions and 3' untranslated regions;
(6) the system counts the number and density distribution of ChIP-seq signal peaks in the different types of chromosomal region, and computes the sequencing depth of the ChIP-seq sequence file in the different types of chromosomal region and the coverage per unit interval;
(7) the system examines the surroundings of each ChIP-seq signal peak, determines the genes neighboring each signal peak, performs gene ontology functional enrichment analysis on the basis of those neighboring genes, and generates a gene ontology functional enrichment result text file and a corresponding graphical display file, which indicate the chromosomal location of the target protein in each sample, the potential target genes regulated by the target protein, and the molecular biological functions of the target protein in the sample.
2. The data processing method according to claim 1, characterized in that the initial ChIP-seq sequence file comprises a high-throughput sequence data set with multiple sets of replicate experimental data, and the method further comprises the following step between steps (6) and (7):
(7-0) the system selects each not-yet-processed set of data in the high-throughput sequence data set, processes it according to steps (2) to (6), and integrates the results of all sets.
3. The data processing method according to claim 1 or 2, characterized in that step (2) specifically comprises the following steps:
(21) the system removes low-quality sequence data from the initial ChIP-seq sequence file according to preset values for the lower limit of the base sequencing quality score within a single read, the upper limit of the percentage of low-quality bases within a single read, and the upper limit of the percentage of undetermined bases within a single read;
(22) the system removes the low-quality bases at the 3' end of each read to obtain the filtered sequence data.
4. The data processing method according to claim 1 or 2, characterized in that step (4) is specifically:
the system calculates the distance between the 5' ends of the retained ChIP-seq reads on the positive and negative strands, and detects signal peak regions according to the result of that calculation.
5. The data processing method according to claim 1 or 2, characterized in that the gene ontology functional enrichment result text file and the corresponding graphical display file are those of any one of the gene ontology categories biological process, molecular function or cellular component.
6. The data processing method according to claim 1, characterized in that the initial ChIP-seq sequence file is in FASTQ format.
CN201310610854.7A 2013-11-27 2013-11-27 Data processing method for chromatin immunoprecipitation high-throughput sequencing Pending CN103853936A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310610854.7A CN103853936A (en) 2013-11-27 2013-11-27 Data processing method for chromatin immunoprecipitation high-throughput sequencing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310610854.7A CN103853936A (en) 2013-11-27 2013-11-27 Data processing method for chromatin immunoprecipitation high-throughput sequencing

Publications (1)

Publication Number Publication Date
CN103853936A true CN103853936A (en) 2014-06-11

Family

ID=50861584

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310610854.7A Pending CN103853936A (en) 2013-11-27 2013-11-27 Data processing method for chromatin immunoprecipitation high-throughput sequencing

Country Status (1)

Country Link
CN (1) CN103853936A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650308A (en) * 2016-11-07 2017-05-10 为朔医学数据科技(北京)有限公司 Processing method and system for mitochondrial high-throughput sequencing data
CN107248039A (en) * 2016-12-30 2017-10-13 吉林金域医学检验所有限公司 Real-time quality control method and device based on medical specimen detection project result
WO2019153852A1 (en) * 2018-02-07 2019-08-15 北京大学 Micro cell chip method
CN115083517A (en) * 2022-07-07 2022-09-20 南华大学附属第一医院 Data processing method and system for identifying enhancer and super enhancer

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C53 Correction of patent for invention or patent application
CB02 Change of applicant information

Address after: 200241 Shanghai City, Minhang District science and Technology Park of Cangyuan Jianchuan Road No. 951 building A Room 102

Applicant after: FENGHE (SHANGHAI) INFORMATION TECHNOLOGY CO., LTD.

Address before: 201108, room 4, building 508, No. 208 East Spring Road, Shanghai, Minhang District

Applicant before: FENGHE (SHANGHAI) INFORMATION TECHNOLOGY CO., LTD.

RJ01 Rejection of invention patent application after publication

Application publication date: 20140611

RJ01 Rejection of invention patent application after publication