CN107368704A - Interactive analysis system and method for transcriptome projects with a reference genome, based on a cloud computing platform - Google Patents

Publication number
CN107368704A
Authority
CN
China
Prior art keywords
analysis
project
module
gene group
interactive
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710598342.1A
Other languages
Chinese (zh)
Inventor
任一
余果
郭权
韩畅
史彩萍
仝颜丽
刘彬旭
石今
曾静
周玄
董亚晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sangge Information Technology Co Ltd
Original Assignee
Shanghai Sangge Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sangge Information Technology Co Ltd
Priority to CN201710598342.1A (patent CN107368704A)
Publication of CN107368704A
Priority to CN201810802816.4A (patent CN109086567A)
Legal status: Pending

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00: ICT programming tools or database systems specially adapted for bioinformatics
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses an interactive analysis system and method, based on a cloud computing platform, for transcriptome projects that have a reference genome. In the project management module, after an analysis project is created, sequencing data are uploaded to a local cluster server, and, according to actual need, the user either uploads a private reference genome database or selects a public database configured on the platform; within the module the user may lock the project or selectively share it with others for operation and management. In the basic analysis task submission module, the user sets parameters and runs analyses on the sequencing data through a visual interface. Before an analysis runs, the system checks whether data quality control (QC) meets the standard requirements: if not, an error message is returned directly; if so, the analysis runs with the specified parameters. The analysis generates a corresponding project file, which is sent to the interactive result analysis module for interactive analysis and for generation of an intuitive, standardized report.

Description

Interactive analysis system and method for transcriptome projects with a reference genome, based on a cloud computing platform
Technical Field
The present invention relates to the technical field of bioinformatics analysis, and in particular to an interactive analysis system and method, based on a cloud computing platform, for transcriptome projects that have a reference genome.
Background
The transcriptome is the set of all RNA transcribed by a given species, tissue, or cell in a particular state, including mRNA and non-coding RNA. It is the link between the genome, which carries the genetic material, and the proteome, which performs biological functions. Regulation at the transcriptional level is the most important and most widely studied mode of regulation; compared with research at the genome level, transcriptome research can provide more efficient and more accurate information.
Although high-throughput sequencing has made huge advances over conventional sequencing, decoding animal and plant genomes remains very difficult and expensive. Transcriptome sequencing applies high-throughput sequencing mainly to the transcription product mRNA in order to obtain transcript information. The technique offers high throughput, wide coverage, and high precision. It can be used to study gene function and gene structure at the whole-transcriptome level, to find genes that are differentially expressed in cells, tissues, or individuals under different physiological or pathological states, and to analyze the transcriptome of any species. It is now widely used in basic biological research, clinical diagnosis, molecular breeding, drug development, and other fields.
Biological big-data analysis is the most critical step in applying high-throughput sequencing to transcriptome studies with a reference genome. For such projects the Illumina HiSeq sequencing platform is recommended. A single Illumina HiSeq run can produce up to 1000 G of data, which personal computers and workstations clearly cannot process.
Adjusting, filtering, aligning, and annotating the relevant data during high-throughput sequencing data processing requires researchers to be highly skilled at writing shell scripts. Existing bioinformatics analysis of transcriptome projects with a reference genome consists mainly of three parts: standard bioinformatics analysis, advanced bioinformatics analysis, and personalized bioinformatics analysis. Standard bioinformatics analysis is the foundation of the whole project; its results include sample information statistics, sample QC statistics, alignment statistics, sequencing saturation analysis, redundant sequence analysis, coverage analysis, chromosome distribution statistics, new transcript detail charts, and an annotation overview. Advanced bioinformatics analysis results include gene expression profile information, gene expression matrices, gene expression Venn diagrams, gene expression correlation analysis, gene expression PCA, differential expression volcano plots, differential expression scatter plots, differential expression cluster analysis, GO enrichment analysis, and KEGG enrichment analysis. Further advanced analyses include gene co-expression network analysis, protein-protein interaction network analysis, alternative splicing analysis, SNP analysis, RNA editing analysis, and gene fusion analysis.
In the prior art the workflow is carried out manually, so efficiency is low and cannot meet market demand for efficient output.
Summary of the Invention
To address the shortcomings of the above technology, the present invention provides an interactive analysis system and method, based on a cloud computing platform, for transcriptome projects that have a reference genome. It addresses two problems: big-data processing and analysis that personal computers and workstations cannot complete, and the low efficiency of the existing manual workflow.
To achieve this object, the technical solution adopted by the present invention is an interactive analysis system, based on a cloud computing platform, for transcriptome projects that have a reference genome, including:
A project management module, for viewing, editing, and managing project details, and for managing all ongoing analysis projects in an integrated way by project, application, task, file, and so on;
A basic analysis task submission module, for setting task parameters, running the analysis after submission, and integrating and distributing the output results and the raw data to the corresponding project file according to a preset format. The basic analysis task submission module covers data QC statistics, gene function annotation, alignment to the reference genome, transcriptome quality assessment, new transcript prediction, expression analysis, differential expression analysis, and gene structure analysis;
An interactive result analysis module, for further refining the analysis results according to the user's individual requirements and presenting the results visually, including advanced bioinformatics analysis and personalized bioinformatics analysis;
The project management module is connected to the interactive result analysis module through the basic analysis task submission module.
Specifically, the advanced and personalized bioinformatics analyses in the interactive result analysis module include sequencing saturation analysis, redundant sequence analysis, gene coverage analysis, coverage region distribution analysis, differential gene expression cluster analysis, GO enrichment analysis, KEGG enrichment analysis, gene co-expression network analysis, protein-protein interaction network analysis, alternative splicing analysis, SNP analysis, RNA editing analysis, and gene fusion analysis.
Specifically, the interactive result analysis module is also used to change the grouping scheme, select the samples to analyze, select the clustering algorithm, and so on.
Specifically, the interactive result analysis module includes a plotting tool that can change the color scheme, plot style, and bar orientation. Samples can be selected; legends, point labels, cluster trees, and environmental factors can be shown selectively; and figure titles can be edited. Result figures can be downloaded in PNG, JPEG, PDF, and SVG formats, and can be saved to and displayed in the report. Reports from the interactive result analysis module are in HTML and PDF formats.
The present invention also provides an interactive analysis method, based on a cloud computing platform, for transcriptome projects that have a reference genome, comprising the following steps:
Step 0, create a project;
Step 1, upload sequencing data to the local cluster server and, at the same time, either upload the user's private reference genome database to the local cluster server or select a public database provided on the platform; within the module the user may lock the project or selectively share it with others for operation and management;
Step 2, create a task;
Step 3, in the basic analysis task submission module, the user sets parameters and runs analyses on the sequencing data through a visual interface. Before the analysis runs, the system checks whether data QC meets the standard requirements; if not, an error message is returned directly; if so, the analysis runs with the specified parameters and generates a corresponding project file;
Step 4, the generated project file is sent to the interactive result analysis module for interactive analysis; secondary analysis and statistics are performed on the project file according to the user's individual needs, and an intuitive interactive analysis report is generated.
Specifically, when setting parameters and running analyses on the sequencing data in step 3, the user can set the data QC standard; select a public reference genome database on the analysis platform or the user's private reference genome; set the software used for alignment, assembly, differential expression analysis, and alternative splicing analysis; and, after choosing the content to analyze, run all selected analyses.
Specifically, the project management module can also be used to view, edit, and manage files uploaded by the user or generated by analysis, and supports uploading, searching, copying, moving, deleting, and downloading such files. The project management module is used to view the status and log information of running tasks. It can be used to mark project progress, which may be not started, in progress, completed, terminated, or problem. It is also used to share projects and to manage member permissions.
Specifically, the reference genome databases are stored on the local cluster server. The reference genome databases available on the platform include animal, plant, and fungal genome databases; custom databases can also be uploaded.
Specifically, the operation of the project management module, the basic analysis task submission module, and the interactive result analysis module is based on a PHP + MySQL + MongoDB server back end and an HTML + CSS + jQuery front-end page. In the interactive analysis module, user interaction with the front-end page triggers a task execution command; the task parameters are submitted to the server back end, which calls server-side scripts written in Perl, C, Python, and R to analyze the sequencing data, and the server then returns the results to the front-end page for display.
Specifically, at each stage of running analyses on the sequencing data, the basic analysis task submission module selects the corresponding software from its stored analysis software to perform bioinformatics analysis on the sequencing data.
The beneficial effects of the present invention are as follows. Through cloud computing technology, the basic computing resources required for biological big-data analysis can be obtained conveniently over the network, meeting researchers' large demand for such resources in a big-data setting. The interactive analysis method also provides a highly integrated data analysis workflow: users do not need to integrate analysis software or build analysis pipelines by hand, so the analysis is genuinely one-stop. Biological researchers who have no computing background but need bioinformatics analysis can mine biological big data in depth and obtain a good result report without learning any programming language. In addition, the method supports multiple algorithms, custom grouping, and flexible choice of visual charts and interactive reports. Because it runs on a high-performance cloud computing platform, the interactive transcriptome analysis offers one-click data sharing, which improves the integrated management of collaborative projects. Finally, the analysis content is comprehensive: it covers not only the basic and advanced analyses of transcriptome projects with a reference genome but also some personalized analyses, meeting users' higher demands for bioinformatics analysis.
Brief Description of the Drawings
Fig. 1 is a block diagram of the interactive analysis system of the present invention for transcriptome projects with a reference genome, based on a cloud computing platform;
Fig. 2 is a flow chart of the interactive analysis method of the present invention for transcriptome projects with a reference genome, based on a cloud computing platform;
Fig. 3 is a schematic diagram of the basic analysis task submission module for a transcriptome project with a reference genome in the present invention;
Fig. 4 is a schematic diagram of creating a new project for a transcriptome project with a reference genome in the present invention;
Fig. 5 is a schematic diagram of back-end task parameter submission for a transcriptome project with a reference genome in the present invention;
Fig. 6 is a schematic diagram of interactive analysis for a transcriptome project with a reference genome in the present invention;
Fig. 7 is a schematic diagram of the PCA plotting tool for a transcriptome project with a reference genome in the present invention;
Fig. 8 is a schematic diagram of the analysis report for a transcriptome project with a reference genome in the present invention.
The main reference numerals are as follows:
10: project management module; 11: basic analysis task submission module;
12: interactive result analysis module.
Detailed Description
To describe the present invention more clearly, it is further explained below with reference to the accompanying drawings.
Referring to Fig. 1, the interactive analysis system of the present invention for transcriptome projects with a reference genome, based on a cloud computing platform, includes:
A project management module 10, for viewing, editing, and managing project details, and for managing all ongoing analysis projects in an integrated way by project, application, task, file, and so on;
A basic analysis task submission module 11, for setting the basic parameters of a task, running the analysis after submission, and integrating and distributing the output results and the raw data to the corresponding project file according to a preset format;
An interactive result analysis module 12, for further refining the analysis results according to the user's individual requirements and presenting the results visually;
The project management module is connected to the interactive result analysis module through the basic analysis task submission module.
First, an analysis project is created in the project management module and sequencing data are uploaded to the local cluster server; at the same time, the user either uploads a private reference genome to the local cluster server or selects a public reference genome database on the analysis platform. Within the module the user may lock the project or selectively share it with others for operation and management. The sequencing data are files in FASTQ format. A private reference genome must include a reference genome file in FASTA format and a reference genome annotation file in GFF format.
Then, in the basic analysis task submission module, the user sets parameters and runs analyses on the sequencing data through a visual interface. Before the analysis runs, the system checks whether data QC meets the standard requirements; if not, an error message is returned directly; if so, the analysis runs with the specified parameters and generates a corresponding project file.
Finally, the generated project file is sent to the interactive result analysis module for interactive analysis; secondary analysis and statistics are performed on the project file according to the user's individual needs, and an intuitive interactive analysis report is generated.
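The upload requirements described above (FASTQ sequencing data, and for a private reference genome a FASTA file plus a GFF annotation file) can be illustrated with a minimal validation sketch. The file-extension sets and the function names `has_ext` and `check_upload` are assumptions made for illustration; the patent names only the formats, not how they are checked.

```python
# Assumed extension sets; the source names only the formats (FASTQ, FASTA, GFF).
FASTQ_EXT = (".fastq", ".fq", ".fastq.gz", ".fq.gz")
FASTA_EXT = (".fasta", ".fa", ".fna")
GFF_EXT = (".gff", ".gff3")


def has_ext(name, extensions):
    """Case-insensitive check of a file name against allowed extensions."""
    return name.lower().endswith(extensions)


def check_upload(sequencing_files, private_reference=None):
    """Return a list of problems; an empty list means the upload looks valid.

    private_reference is an optional (fasta_name, gff_name) pair. When it is
    None, the project is assumed to use a public reference from the platform.
    """
    problems = []
    if not sequencing_files:
        problems.append("no sequencing data uploaded")
    for name in sequencing_files:
        if not has_ext(name, FASTQ_EXT):
            problems.append(f"not a FASTQ file: {name}")
    if private_reference is not None:
        fasta_name, gff_name = private_reference
        if not has_ext(fasta_name, FASTA_EXT):
            problems.append(f"reference genome is not FASTA: {fasta_name}")
        if not has_ext(gff_name, GFF_EXT):
            problems.append(f"annotation is not GFF: {gff_name}")
    return problems
```

A real platform would presumably inspect file contents as well as names; checking extensions only keeps the sketch short.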
Referring further to Fig. 2, the present invention also provides an interactive analysis method, based on a cloud computing platform, for transcriptome projects that have a reference genome, comprising the following steps:
Step S0, create a project;
Step S1, upload sequencing data to the local cluster server and, at the same time, either upload the user's private reference genome database to the local cluster server or select a public database provided on the platform; within the module the user may lock the project or selectively share it with others for operation and management;
Step S2, create a task;
Step S3, in the basic analysis task submission module, the user sets parameters and runs analyses on the sequencing data through a visual interface. Before the analysis runs, the system checks whether data QC meets the standard requirements; if not, an error message is returned directly; if so, the analysis runs with the specified parameters and generates a corresponding project file;
Step S4, the generated project file is sent to the interactive result analysis module for interactive analysis; secondary analysis and statistics are performed on the project file according to the user's individual needs, and an intuitive interactive analysis report is generated.
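The gating behaviour of step S3 (check QC first, return an error if it fails, otherwise run the selected analyses and produce a project file) can be sketched as follows. The QC criteria, the shape of `sample_stats`, and the names `QCError`, `qc_precheck`, and `run_basic_analysis` are hypothetical; the patent does not state which criteria the pre-check applies.

```python
class QCError(Exception):
    """Raised when sequencing data fails the QC pre-check."""


def qc_precheck(sample_stats, min_mean_quality=20.0, min_reads=1):
    """Return the names of samples that fail the (assumed) QC criteria."""
    failed = []
    for name, stats in sample_stats.items():
        if stats["mean_quality"] < min_mean_quality or stats["reads"] < min_reads:
            failed.append(name)
    return failed


def run_basic_analysis(sample_stats, params, analyses):
    """Gate the analysis on the QC pre-check, then run each selected step.

    analyses maps a step name to a callable taking (sample_stats, params).
    The returned dict stands in for the generated project file.
    """
    failed = qc_precheck(sample_stats, params.get("min_mean_quality", 20.0))
    if failed:
        # Mirrors "an error message is returned directly" in step S3.
        raise QCError("QC failed for: " + ", ".join(sorted(failed)))
    return {step: func(sample_stats, params) for step, func in analyses.items()}
```

Raising an exception rather than returning a status code is a design choice of this sketch: it guarantees that no analysis step runs on data that did not pass the pre-check.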
Compared with existing analysis techniques, the interactive analysis system and method provided by the present invention consist mainly of three modules: the project management module 10, the basic analysis task submission module 11, and the interactive result analysis module 12. Through cloud computing technology, the system and method make the basic computing resources required for biological big-data analysis conveniently available over the network, meeting researchers' large demand for such resources in a big-data setting. The interactive analysis method also provides a highly integrated data analysis workflow: users do not need to integrate analysis software or build analysis pipelines by hand, so the analysis is genuinely one-stop. Biological researchers who have no computing background but need bioinformatics analysis can mine biological big data in depth and obtain a good result report without learning any programming language. In addition, the method supports multiple algorithms, custom grouping, and flexible choice of visual charts and interactive reports. Because it runs on a high-performance cloud computing platform, the interactive transcriptome analysis offers one-click data sharing, which improves the integrated management of collaborative projects. Finally, the analysis content is comprehensive: it covers not only the basic and advanced analyses of transcriptome projects with a reference genome but also some personalized analyses, meeting users' higher demands for bioinformatics analysis.
In this embodiment, when setting parameters and running analyses on the sequencing data in step S3, the user can set the data QC standard; select a public reference genome database on the analysis platform or the user's private reference genome; set the software used for alignment, assembly, differential expression analysis, and alternative splicing analysis; confirm the protein regulation database; and, after choosing the content to analyze, run all selected analyses.
In this embodiment, the project management module 10 can also be used to view, edit, and manage files uploaded by the user or generated by analysis, and supports uploading, searching, copying, moving, deleting, and downloading such files. The project management module is used to view the status and log information of running tasks. It can be used to mark project progress, which may be not started, in progress, completed, terminated, or problem. It is also used to share projects and to manage member permissions.
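The project states, locking, and sharing described above can be modelled with a small data structure. This is a sketch only: the five status values come from the text, but the permission names ("view", "edit", "manage") and the rule that a locked project is read-only for everyone are assumptions, since the patent says only that projects can be locked and member permissions managed.

```python
from enum import Enum


class ProjectStatus(Enum):
    # The five progress states named in the text.
    NOT_STARTED = "not started"
    IN_PROGRESS = "in progress"
    COMPLETED = "completed"
    TERMINATED = "terminated"
    PROBLEM = "problem"


class Project:
    # Permission names are assumptions; the source only says that
    # member permissions can be managed.
    PERMISSIONS = ("view", "edit", "manage")

    def __init__(self, name, owner):
        self.name = name
        self.status = ProjectStatus.NOT_STARTED
        self.locked = False
        self.members = {owner: "manage"}

    def share(self, actor, member, permission):
        """Grant a member access; only users with "manage" may share."""
        if self.members.get(actor) != "manage":
            raise PermissionError(f"{actor} cannot share {self.name}")
        if permission not in self.PERMISSIONS:
            raise ValueError(f"unknown permission: {permission}")
        self.members[member] = permission

    def can_edit(self, user):
        """Assumed rule: a locked project is read-only for everyone."""
        return not self.locked and self.members.get(user) in ("edit", "manage")
```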
In this embodiment, the reference genome databases are stored on the local cluster server. The reference genome databases available on the platform include animal, plant, and fungal genome databases; custom databases can also be uploaded.
Referring further to Fig. 3, the basic analysis task submission module can be used for data QC statistics, gene function annotation, alignment to the reference genome, transcriptome quality assessment, new transcript prediction, expression analysis, differential expression analysis, gene structure analysis, and so on.
Data QC statistics performs quality control and statistics on the selected FASTQ files, and sets the quality value used for quality trimming and the minimum length of reads to retain;
Gene function annotation extracts sequences from the reference genome and aligns them against databases such as NR, GO (Gene Ontology), COG (Clusters of Orthologous Groups), KEGG (Kyoto Encyclopedia of Genes and Genomes), and Swiss-Prot to assess the annotation comprehensively;
Reference genome file selection specifies the reference genome against which the reads are aligned;
Transcriptome quality assessment reflects sequencing depth, sequencing bias, redundancy distribution frequency, and so on;
New transcript prediction yields novel transcripts that have no annotation information;
Expression analysis computes expression levels with software such as featureCounts, HTSeq, and Kallisto to obtain transcript expression levels;
Differential expression analysis reflects the differential expression of all genes across all samples; on the interactive analysis page the user can further select different samples, clustering methods, and distance algorithms for the analysis of differences;
Gene structure analysis includes alternative splicing, SNP, indel, RNA editing, and gene fusion analysis. RNA editing analysis and gene fusion analysis currently support only human transcriptomes. Results generated by the basic analysis task submission module can be viewed visually in the interactive analysis module, and the corresponding result files can also be found in the project file.
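The differential expression step above lets the user choose among distance algorithms when clustering samples. The patent does not name the algorithms offered, so the sketch below uses two common choices, Euclidean distance and one minus the Pearson correlation, purely as illustrative assumptions; `distance_matrix` and the `DISTANCES` registry are hypothetical names.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_distance(x, y):
    """1 minus the Pearson correlation; 0 means perfectly correlated."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return 1 - cov / (sd_x * sd_y)


# Registry that a user-facing "distance algorithm" selector could map onto.
DISTANCES = {"euclidean": euclidean_distance, "pearson": pearson_distance}


def distance_matrix(profiles, metric="euclidean"):
    """Pairwise distances between samples, keyed by (sample_a, sample_b)."""
    dist = DISTANCES[metric]
    names = sorted(profiles)
    return {
        (a, b): dist(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
```

The resulting pairwise distances are the usual input to a hierarchical clustering routine; which clustering methods the platform exposes is not specified in the source.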
In this embodiment, the interactive result analysis module is used for advanced and personalized bioinformatics analysis, including sequencing saturation analysis, redundant sequence analysis, gene coverage analysis, coverage region distribution analysis, differential gene expression cluster analysis, GO enrichment analysis, KEGG enrichment analysis, gene co-expression network analysis, protein-protein interaction network analysis, alternative splicing analysis, SNP analysis, RNA editing analysis, and gene fusion analysis.
The interactive result analysis module is also used to change the grouping scheme, select the samples to analyze, select the clustering algorithm, and so on.
The interactive result analysis module includes a plotting tool that can change the color scheme, plot style, and bar orientation. Samples can be selected; legends, point labels, cluster trees, and environmental factors can be shown selectively; and figure titles can be edited. Result figures can be downloaded in PNG, JPEG, PDF, and SVG formats. Analysis results can be saved to the report and displayed in it. Reports from the interactive result analysis module can be in HTML and PDF formats. The operation of the project management module, the basic analysis task submission module, and the interactive result analysis module is based on a PHP + MySQL + MongoDB server back end and an HTML + CSS + jQuery front-end page.
In the interactive analysis module, user interaction with the front-end page triggers a task execution command; the task parameters are submitted to the server back end, which calls server-side scripts written in Perl, C, Python, and R to analyze the sequencing data, and the server then returns the results to the front-end page for display.
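The request flow just described (the front end submits task parameters, the back end calls a script in Perl, C, Python, or R, and the result goes back to the page) can be sketched as a small dispatcher. The patent's back end is PHP; Python is used here only for illustration. The JSON request shape, the script registry, the `--key value` argument convention, and the script names in the usage note are all hypothetical. The sketch builds the command line but does not execute it.

```python
import json
import os

# Interpreter chosen by script extension; a compiled C program has no
# extension here and is invoked directly. This mapping is an assumption.
INTERPRETERS = {".pl": ["perl"], ".py": ["python"], ".R": ["Rscript"]}


def build_command(script, params):
    """Turn a script path and a parameter dict into an argv list."""
    extension = os.path.splitext(script)[1]
    command = list(INTERPRETERS.get(extension, [])) + [script]
    for key in sorted(params):
        command += [f"--{key}", str(params[key])]
    return command


def handle_request(body, registry):
    """Map a JSON task request from the front end to a queued command.

    registry maps task names to server-side script paths.
    """
    request = json.loads(body)
    task = request["task"]
    if task not in registry:
        return json.dumps({"status": "error", "message": f"unknown task: {task}"})
    command = build_command(registry[task], request.get("params", {}))
    return json.dumps({"status": "queued", "command": command})
```

For example, `build_command("diff_expr.R", {"fdr": 0.05})` yields `["Rscript", "diff_expr.R", "--fdr", "0.05"]`. Passing the command as an argument list rather than a single shell string keeps user-supplied parameter values from being interpreted by a shell.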
At each stage of running analyses on the sequencing data, the basic analysis task submission module selects the corresponding software from its stored analysis software to perform bioinformatics analysis on the sequencing data.
Referring further to Fig. 4, the steps for creating a project and a task in the present invention are as follows: on the analysis platform, click to enter My Projects, click New Project, enter the project name and project description, and select a field label and a species label. Then click the name of the created project and create a new task.
Referring to Fig. 5, basic analysis parameter setting mainly covers the parameters for data QC statistics, gene function annotation, alignment to the reference genome, transcriptome quality assessment, new transcript prediction, expression analysis, differential expression analysis, and gene structure analysis.
Here the user can select the input FASTQ sequence files, or select a folder containing the FASTQ sequence files of each individual sample. The minimum quality value and the minimum fragment length after trimming can also be set. The user also either uploads a private reference genome to the local cluster server or selects a public reference genome database on the analysis platform. The existing reference genome databases include those of common eukaryotic model organisms, and the user can alternatively choose a private reference genome database for sequence alignment.
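The two QC parameters mentioned above, a minimum quality value and a minimum fragment length after trimming, can be illustrated with a minimal read filter. The patent does not say which trimming tool or strategy the platform uses; this sketch assumes Phred+33 quality encoding and simple trimming of low-quality bases from the 3' end, and the function names are hypothetical.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string (Phred+33 encoding assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def trim_read(seq, qual, min_quality=20, min_length=50):
    """Trim low-quality bases from the 3' end.

    Returns (sequence, quality) or None when the trimmed read is shorter
    than min_length.
    """
    scores = phred_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    if end < min_length:
        return None
    return seq[:end], qual[:end]


def filter_fastq(lines, min_quality=20, min_length=50):
    """Yield (header, sequence, quality) for reads that survive trimming.

    lines is an iterable of FASTQ text lines in 4-line records.
    """
    records = [line.rstrip("\n") for line in lines]
    for i in range(0, len(records) - 3, 4):
        header, seq, _, qual = records[i:i + 4]
        trimmed = trim_read(seq, qual, min_quality, min_length)
        if trimmed is not None:
            yield header, trimmed[0], trimmed[1]
```

The default thresholds of 20 and 50 are placeholders, not values taken from the source.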
For gene function annotation, E-value thresholds can be set for the NR, COG, KEGG, and Swiss-Prot databases; these thresholds control the accuracy of the functional annotation.
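Per-database E-value thresholds of this kind amount to a filter over alignment hits, which the following sketch illustrates. The hit record layout (`gene`, `database`, `evalue` keys), the default threshold of 1e-5, and the function names are assumptions for illustration; the patent does not give threshold values or a record format.

```python
def filter_hits(hits, thresholds, default_threshold=1e-5):
    """Keep hits whose E-value is at or below the threshold for their database.

    hits is a list of dicts with "gene", "database", and "evalue" keys.
    thresholds maps a database name to its E-value cutoff.
    """
    kept = []
    for hit in hits:
        limit = thresholds.get(hit["database"], default_threshold)
        if hit["evalue"] <= limit:
            kept.append(hit)
    return kept


def best_hit_per_gene(hits):
    """For each (gene, database) pair keep the hit with the lowest E-value."""
    best = {}
    for hit in hits:
        key = (hit["gene"], hit["database"])
        if key not in best or hit["evalue"] < best[key]["evalue"]:
            best[key] = hit
    return best
```

A lower E-value threshold keeps fewer, more reliable annotations; a higher one annotates more genes at the cost of more spurious matches.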
Alignment to the reference genome may use the sequence alignment program TopHat2 or HISAT2, and transcriptome quality is assessed from four angles: sequencing saturation, redundant sequences, coverage, and region distribution.
New transcript prediction offers two mainstream assembly programs, Cufflinks and StringTie, for assembling and predicting new transcripts.
Expression analysis computes expression-level statistics with different analysis programs such as featureCounts, HTSeq, and Kallisto.
Gene structure analysis includes alternative splicing, SNP, InDel analysis, RNA editing analysis, and gene fusion analysis; corresponding analysis software is provided for the different alternative splicing schemes.
For the analyses of the interactive result analysis module, refer to Fig. 6. They mainly include sequencing data quality control, transcriptome quality assessment, assembly and new transcript prediction, functional annotation overview, expression analysis, differential expression analysis, network analysis, gene structure analysis, and transcription factor analysis.
Sequencing data quality control includes two parts: sample information statistics and sample quality-control statistics. Sample information statistics apply statistical methods to the base distribution and quality fluctuation of all sequencing reads at each cycle, which gives a direct, high-level view of the sequencing quality and library construction quality of each sample. The raw sequencing data of each sample are assessed for sequencing quality, and the base quality distribution plot, base error rate distribution plot, and similar plots of the raw sequencing data can be drawn. Sample quality-control statistics safeguard the accuracy of the subsequent bioinformatic analysis and provide statistics and a quality assessment of the sample data volume after quality control. On the base quality distribution plot, the zoom tool in the lower right corner of the figure can be used to enlarge the whole picture. Clicking the Save to Report button saves the picture to the corresponding position in the report.
Transcriptome quality assessment includes six analyses: alignment result statistics, sequencing saturation analysis, redundant sequence analysis, coverage analysis, region distribution statistics, and chromosome distribution statistics. Alignment result statistics consist of an alignment result table, which reports Total Reads, Total basepairs, Total mapped, Multiple mapped, Uniquely mapped, Total unmapped, and similar information for each sample. Sequencing saturation analysis consists of a sequencing saturation curve and a sequencing saturation box plot; clicking the chart tool allows the user to select samples and set the color scheme and shape scheme, and once confirmed the generated picture can be downloaded and saved locally, or saved into the static report by clicking Save to Report. Redundant sequence analysis consists of a redundant sequence distribution plot that shows how redundant sequences are distributed; clicking the chart tool allows a subset of samples to be displayed. Coverage analysis consists of a gene coverage distribution plot, which summarizes sequencing coverage across the 5' to 3' regions of all genes in a sample; clicking the chart tool allows the user to select samples and set the color scheme. Region distribution statistics consist of a reads region distribution pie chart and a reads region distribution table. The pie chart shows the proportion of reads in each region; clicking the chart tool allows the user to select samples and set the color scheme. The table shows, for each sample, the number of reads falling in introns, exons, coding regions, 3' UTRs, and 5' UTRs. Chromosome distribution statistics consist of a chromosome distribution bar chart, a chromosome distribution chord diagram, and a chromosome distribution table. The bar chart counts the sequences aligned to each chromosome, and the chord diagram shows the distribution of sequenced reads across the chromosomes more intuitively; clicking the chart tool allows the user to select samples and set the color scheme. The table reports the number of sequences aligned to each chromosome in tabular form and can be saved locally by clicking Download.
Assembly and new transcript prediction includes an assembly overview and new transcript prediction. The assembly overview consists of a transcript length distribution bar chart and a transcript length distribution table. The bar chart reflects the length intervals of all transcripts in a sample; clicking the chart tool allows the user to set the interval step, select a color scheme, and draw a bar chart showing how the sample's transcripts are distributed across the length intervals. New transcript prediction consists of a predicted new transcript type distribution plot and a detailed table of new transcript annotation information. The type distribution plot visualizes the number of new transcripts of each type, and the generated picture can be downloaded and saved locally, or saved into the static report by clicking Save to Report. The new transcript annotation information can be searched by fragment type, fragment start position, fragment end position, assembly expression score, and so on.
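The interval step setting for the transcript length distribution can be illustrated with a short binning sketch. The function name, the half-open interval convention, and the default step of 500 bp are assumptions chosen for the example.

```python
from collections import Counter


def length_distribution(lengths, step=500):
    """Count transcripts per half-open length interval [k*step, (k+1)*step)."""
    if step <= 0:
        raise ValueError("step must be positive")
    counts = Counter((length // step) * step for length in lengths)
    return {(start, start + step): counts[start] for start in sorted(counts)}


# Four transcripts binned with a 500 bp step.
print(length_distribution([120, 480, 500, 1499], step=500))
```

Changing `step` regroups the same lengths into wider or narrower intervals, which is the effect of adjusting the step in the chart tool.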
Functional annotation overview includes NR annotation, GO annotation, COG annotation, KEGG annotation, Swiss-Prot annotation statistics, and annotation query. NR annotation consists of information statistics, an E-value distribution pie chart, and an NR similarity distribution pie chart; it presents the results of aligning the reference genome or the newly assembled transcripts against the NCBI protein sequence database (NR), and the results are also annotated against the NCBI species taxonomy database. In the information statistics, the data can be switched and the classification level filtered. The E-value distribution pie chart is used to assess the reliability of the matches; the generated picture can be downloaded and saved locally, or saved into the static report by clicking Save to Report. GO annotation consists of a GO annotation overview and a GO level statistics table. The GO annotation overview lets the user choose to view the GO annotation information of genes or transcripts; the generated picture can be downloaded and saved locally, or saved into the static report by clicking Save to Report. COG annotation consists of a COG classification statistics table and a COG classification bar chart. COG annotation performs functional annotation and classification of genes or transcripts, and switching the data type displays different functional classification bar charts. KEGG annotation consists of a pathway information table and a pathway distribution bar chart.
The pathway information table shows pathway statistics, and the pathway map of any pathway can be opened by clicking on it. The chart tool of the pathway bar chart can display the top N pathways by gene count and set the color scheme. Swiss-Prot annotation consists of a Swiss-Prot annotation information table, which shows in tabular form the results of aligning the reference genome or transcripts against that database. Annotation query integrates the information from the five databases above and supports searching by transcript length, sequence name, species name, ID number, and so on; the resulting data table can be downloaded and saved locally or saved to the project folder.
Expression analysis includes two parts: expression-level statistics and between-sample gene expression analysis. Expression-level statistics consist of the gene expression distribution and gene expression information. The gene expression distribution draws the probability density distribution of the expression levels of all genes based on the FPKM results; the available FPKM calculation programs are featureCounts, RSEM, Kallisto, and HTSeq. After running, an FPKM analysis chart of the genes is generated, and clicking the chart tool allows the user to select samples, select the plot type, modify the main title, adjust the color scheme, and so on. The sample gene expression information shows the details of the expression analysis of a single sample, such as gene ID, chromosome location of the gene, start and end positions, sequence count, FPKM value, and whether it is a new transcript, and the results can be filtered by expression level. Between-sample gene expression analysis consists of correlation analysis and PCA. Correlation analysis draws a heat map of sample correlation coefficients based on the gene expression matrix; the user can select a subset of samples, the clustering method, the distance algorithm, and the hierarchical clustering mode, and the corresponding picture is generated after running. Clicking the chart tool allows the user to select samples, set the color scheme, display the clustering tree, and modify the main title; once confirmed, the generated picture can be downloaded and saved locally, or saved into the static report by clicking Save to Report. PCA is operated in a similar way to correlation analysis; it can be used to identify outlier samples and to distinguish clusters of highly similar samples.
Differential expression analysis includes difference statistics, differential analysis, differential gene GO analysis, and differential gene KEGG analysis.
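Since the expression charts above are all built on FPKM values, a short worked sketch of the standard FPKM definition may help: fragments mapped to a gene, per kilobase of gene length, per million mapped fragments. The function and argument names are illustrative; the listed quantification programs compute their own values, and this block only shows the normalization arithmetic.

```python
def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = count * 1e9 / (gene length in bp * total mapped fragments)
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


# 100 fragments on a 2,000 bp gene in a library of 10 million mapped fragments.
print(fpkm(100, 2000, 10_000_000))
```

In the example, 100 fragments over 2 kb is 50 fragments per kilobase, and dividing by 10 (million mapped fragments) gives an FPKM of 5.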
Differential analysis draws differential expression charts based on the gene expression table. The available differential analysis programs are DESeq, edgeR, and DEGseq2. After the grouping scheme and the control group are set and the analysis is run, a differential expression analysis table is generated; switching the group displays the differential expression scatter plot or volcano plot between different samples or sample groups, the differential gene expression pattern heat map, and the differential gene Venn diagram. For the expression pattern heat map, the distance algorithm, clustering method, and expression pattern selection scheme can be set, and the corresponding result is generated after running. Differential gene GO analysis consists of GO classification statistics and GO enrichment analysis. GO classification statistics use the GO database to classify genes or transcripts by the biological processes they participate in, the cellular components they form, and the molecular functions they perform, and compile GO annotation statistics for the differentially expressed genes or transcripts of each pairwise group. With one sample as the control, the results can be drawn as a GO annotation bar chart of up- and down-regulated genes or transcripts; clicking the chart tool sets the display colors of the up- and down-regulated genes or transcripts and the functional categories, and once confirmed the generated picture can be downloaded and saved locally, or saved into the static report by clicking Save to Report. GO enrichment analysis describes functional enrichment between samples at the level of gene or transcript function. After selecting the pairwise comparison scheme and setting the regulation type, significance level, and multiple-testing correction method, clicking Run generates the GO enrichment statistics table and draws the GO enrichment bar chart, GO enrichment bubble chart, and GO directed acyclic graph; the chart tool can display the top N genes or transcripts by enrichment degree and set the category color scheme, and once confirmed the generated picture can be downloaded and saved locally, or saved into the static report by clicking Save to Report. Differential gene KEGG analysis consists of KEGG statistical analysis and KEGG enrichment analysis. KEGG statistical analysis produces a KEGG regulation statistics table and shows the expression pattern distribution of differential genes or transcripts in KEGG pathways. KEGG enrichment analysis uses KOBAS to perform KEGG PATHWAY enrichment analysis, computed with Fisher's exact test; the user can select the pairwise comparison scheme and set the regulation type and the multiple-testing correction method, the available correction methods being BH, BY, and Q-value. Clicking Run generates the corresponding KEGG enrichment statistics table, KEGG enrichment bar chart, and KEGG enrichment bubble chart; the chart tool can display the top N genes or transcripts by enrichment degree and set the category color scheme, and once confirmed the generated picture can be downloaded and saved locally, or saved into the static report by clicking Save to Report.
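The enrichment statistics described above combine a Fisher's exact test per term with a multiple-testing correction such as BH. The sketch below shows one common formulation, a one-sided test computed from the hypergeometric distribution followed by the Benjamini-Hochberg adjustment. It is a self-contained illustration under those assumptions and is not KOBAS's implementation; all function and argument names are hypothetical.

```python
from math import comb


def enrichment_p(k, n, K, N):
    """One-sided Fisher's exact p-value, P(X >= k), for term enrichment.

    N: background genes, K: background genes annotated to the term,
    n: differential genes, k: differential genes annotated to the term.
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


raw = [enrichment_p(8, 20, 40, 1000), enrichment_p(1, 20, 60, 1000)]
print(benjamini_hochberg(raw))
```

Terms whose adjusted p-value falls below the chosen significance level would be the ones reported as enriched.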
Network analysis includes co-expression network analysis and protein interaction network analysis. Co-expression network analysis can reveal mechanisms of transcriptional regulation: a group of genes/transcripts is selected, the correlation coefficients of their expression levels across different samples are analyzed, and a co-expression network among the genes/transcripts is built to clarify the interactions among them. For co-expression network analysis, the soft power threshold (from 1 to 20) can be set and the expression pattern similarity threshold modified; clicking Run generates the corresponding network table and network graph. The chart tool can set the color scheme, and switching to a different module displays the detailed network chart of that module; once confirmed, the generated picture can be downloaded and saved locally, or saved into the static report by clicking Save to Report. Protein interaction network analysis uses the principle of homology mapping to predict the protein interaction network of the differential genes and analyzes the topological properties of the network. For the differential gene interaction network graph, the user can select the pairwise comparison scheme and set the interaction probability, logFC value, and significance level; clicking Run generates the differential protein interaction network, the network centrality distribution plot, and the network node degree distribution plot. The chart tool can set the color scheme, edge length, gravity, X-axis title, Y-axis title, and main title; once confirmed, the generated picture can be downloaded and saved locally, or saved into the static report by clicking Save to Report.
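One way to read the soft power threshold and similarity threshold described above is as a weighted co-expression adjacency: the absolute Pearson correlation between two genes is raised to the soft power, and only pairs above a minimum weight become edges. The patent does not name the underlying algorithm, so the sketch below is an assumption in the style of weighted co-expression networks, with hypothetical names and defaults.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expr, soft_power=6, min_weight=0.5):
    """Return (gene1, gene2, weight) edges with weight = |r| ** soft_power."""
    genes = sorted(expr)
    edges = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            weight = abs(pearson(expr[g1], expr[g2])) ** soft_power
            if weight >= min_weight:
                edges.append((g1, g2, weight))
    return edges


expr = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, 0, 1, 0]}
print(coexpression_edges(expr))
```

Raising the correlation to a higher power suppresses weak correlations more strongly, which is why the power is exposed as a tunable setting.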
Gene structure analysis includes alternative splicing analysis, SNP analysis, RNA editing analysis, and gene fusion analysis. For some genes, a single mRNA precursor produces different mRNA splice isoforms through different splicing patterns (selection of different splice sites); this process is called alternative splicing. Alternative splicing analysis may use the rMATS or MATS software and generates a classification statistics table of differential alternative splicing events and an expression statistics table of differential alternative splicing events. SNP analysis concerns single-nucleotide variation in the genome, including transitions, transversions, deletions, and insertions. For SNP statistics, the alignment software STAR and the analysis software GATK can be set; the analysis generates an SNP result table, an SNP region distribution pie chart, and an SNP type distribution bar chart, and the results can be filtered by sample, chromosome, mutation type, mutated base, functional region, and so on. RNA editing refers to the modification and processing of RNA molecules after transcription, during maturation, which changes the genetic information carried by the RNA; RNA editing analysis uses the RDDpred software to generate a detailed table of editing sites. Gene fusion analysis uses SOAPfuse to analyze gene fusion events in the transcriptome sequencing data; the minimum number of reads supporting a fusion can be set, and after running, a gene fusion site display plot and a gene fusion distribution table are generated. Clicking the chart tool allows the gene fusion site display plot of different samples to be selected; once confirmed, the generated picture can be downloaded and saved locally, or saved into the static report by clicking Save to Report.
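For the SNP type distribution mentioned above, single-base substitutions are conventionally split into transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine or the reverse). The following sketch classifies a substitution accordingly; the function name is illustrative and the code is not taken from GATK or the platform.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


for ref, alt in [("A", "G"), ("C", "T"), ("A", "C")]:
    print(ref, alt, classify_substitution(ref, alt))
```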
Transcription factor analysis can be used to find proteins that bind to specific DNA sequences and to characterize those proteins, which provides a sound basis for predictions in research on gene expression regulation mechanisms. In transcription factor analysis the user selects a reference database; the available transcription factor databases are PlantTFDB, iTAK, and AnimalTFDB. After running, a transcription factor alignment result table and an alignment result statistics chart are generated. The statistics chart is a pie chart that visualizes the percentages of the matched transcription factors; the generated picture can be downloaded and saved locally, or saved into the static report by clicking Save to Report.
For the analysis report of the present invention, refer to Figs. 7-8. On the interactive analysis page, clicking the Save to Report button saves the analysis result figure or table to the corresponding position in the static report, and result figures of several different dimensions can be saved for the same analysis. In the static report, the software and algorithms chosen for the analysis and the biological significance of the analysis can be viewed; the static report supports online preview, editing, printing, and downloading for local storage.
In the interactive analysis of a transcriptome project with a reference genome based on a cloud computing platform according to the present invention, the output analysis result files are organized and distributed into the corresponding project folders according to a preset format. The output analysis results can be downloaded for use in subsequent in-depth analysis. In addition, traditional data storage is limited by hardware quality and lifespan, whereas data stored in the cloud is not lost and is more secure.
In the interactive analysis method for a transcriptome project with a reference genome based on a cloud computing platform according to the present invention, the user can freely set the required parameters, select sequencing data, set groups, and filter samples; the configuration file is used to run the basic bioinformatic analysis on the sequencing data, and the results are presented as charts and a static report. Compared with the traditional manual analysis mode, the automated analysis mode of the present invention therefore not only reduces learning cost but also improves the analysis efficiency of transcriptome projects with a reference genome. In embodiments of the present invention, interactive analysis includes advanced bioinformatic analysis and personalized bioinformatic analysis, which mine the data in a more targeted, deeper, and more multidimensional way on the basis of the basic bioinformatic analysis. As a result, the analysis of transcriptomes with a reference genome is no longer limited to the uniformity of a traditional commercial pipeline, and analysis efficiency and data utilization are improved. Based on a single set of basic bioinformatic analysis data, advanced and personalized analyses can be performed an unlimited number of times, which substantially shortens the research cycle and reduces research cost.
In this embodiment, the interactive analysis method for a transcriptome project with a reference genome based on a cloud computing platform has a clean and user-friendly interactive analysis interface and tightly integrates classic analysis software in the field, such as the quality-control programs SeqPrep and Sickle and the alignment programs TopHat2 and HISAT2, built into a complete workflow; the charts and result files generated by the analysis meet the publication requirements of professional journals. In the workflow interface, following steps 1 to 8 in order, the user selects the FASTQ sequence files to be analyzed, selects the quality-control standard, the reference genome database, and the alignment software, uploads the grouping information table, sets the analysis parameters, and clicks Save and Run to carry out the basic bioinformatic analysis of the transcriptome project with a reference genome. On the interactive analysis page, the user can freely reset the distance algorithm, change the clustering method, select samples, change groups, change colors, rewrite figure titles, and so on, to explore the value of the data from multiple angles, comprehensively, and in depth. This also saves the user valuable time otherwise spent on communication, avoids misunderstanding of the analysis requirements, and substantially shortens the project cycle.
The above discloses only several specific embodiments of the present invention, but the present invention is not limited to them; any variation that a person skilled in the art can conceive shall fall within the protection scope of the present invention.

Claims (10)

1. An interactive analysis system for transcriptome projects with a reference genome based on a cloud computing platform, characterized in that it comprises:
a project management module for viewing, editing, and managing the details of a project, and for the integrated management of all ongoing analysis projects through projects, applications, tasks, files, and the like;
a basic analysis task submission module for setting task parameters, running the analysis after submission, and organizing and distributing the output analysis results and raw data into the corresponding project folders according to a preset format; the task parameters set by the basic analysis task submission module include data quality-control statistics, gene function annotation, alignment to the reference genome, transcriptome quality assessment, new transcript prediction, expression analysis, differential expression analysis, and gene structure analysis;
an interactive result analysis module for further optimizing the analysis results according to the user's personalized requirements and presenting the results visually, including advanced bioinformatic analysis and personalized bioinformatic analysis;
wherein the project management module is connected to the interactive result analysis module through the basic analysis task submission module.
2. The interactive analysis system for transcriptome projects with a reference genome based on a cloud computing platform according to claim 1, characterized in that the advanced bioinformatic analysis and personalized bioinformatic analysis in the interactive result analysis module include sequencing saturation analysis, redundant sequence analysis, gene coverage analysis, coverage region distribution analysis, gene differential expression clustering analysis, GO enrichment analysis, KEGG enrichment analysis, gene co-expression network analysis, protein interaction network analysis, alternative splicing analysis, SNP analysis, RNA editing analysis, and gene fusion analysis.
3. The interactive analysis system for transcriptome projects with a reference genome based on a cloud computing platform according to claim 1, characterized in that the interactive result analysis module is further used to change the grouping scheme, select the samples to analyze, select the clustering algorithm, and the like.
4. The interactive analysis system for transcriptome projects with a reference genome based on a cloud computing platform according to claim 1, characterized in that the interactive result analysis module includes a chart tool for changing the color scheme, graphic scheme, and bar orientation; samples can be selected, the legend, point titles, clustering tree, and environmental factors can be selectively displayed, and the figure title can be modified; analysis result figures can be downloaded in PNG, JPEG, PDF, and SVG formats and, by clicking Save to Report, displayed in the report; the report formats of the interactive result analysis module are HTML and PDF.
5. An analysis method using the system of claim 1, characterized in that it comprises the following steps:
Step 0, create a project;
Step 1, upload the sequencing data to the local cluster server, and at the same time either upload the user's private reference genome database to the local cluster server or choose to use the public database in the platform; within this module the user can lock a project or selectively share a project with others for operation and management;
Step 2, create a task;
Step 3, in the basic analysis task submission module, the user sets parameters for the sequencing data in a visual interface and runs the analysis; before the analysis is run, it is checked whether the data quality control meets the standardization requirements; if not, an error message is returned directly; if so, the analysis is run with the specified parameters and the corresponding project files are generated after the run;
Step 4, the generated project files are sent to the interactive result analysis module for interactive analysis, in which secondary analysis and statistics are performed on the project files according to the user's personalized requirements and an intuitive interactive analysis report is generated.
6. The interactive analysis method for transcriptome projects with a reference genome based on a cloud computing platform according to claim 5, characterized in that, when setting parameters and running the analysis on the sequencing data in step 3, the user can set the data quality-control standard, select the public reference genome database in the analysis platform or the user's private reference genome, and set the software for alignment, assembly, differential expression analysis, and alternative splicing analysis; after the content to be analyzed is selected, all of the selected analyses are run.
7. The interactive analysis method for transcriptome projects with a reference genome based on a cloud computing platform according to claim 5, characterized in that the project management module can also be used to view, edit, and manage files uploaded by the user or generated by analysis, and to upload, search, copy, move, delete, and download such files; the project management module is used to view the status and log information of running tasks; the project management module can be used to mark the progress status of a project, which may be not started, in progress, completed, terminated, or problematic; and the project management module is further used to share projects and manage member permissions.
8. The interactive analysis method for transcriptome projects with a reference genome based on a cloud computing platform according to claim 5, characterized in that the reference genome database is stored on the local cluster server; the reference genome databases available in the platform include animal genome databases, plant genome databases, and fungal genome databases, and also include uploaded custom databases.
9. The interactive analysis method for transcriptome projects with a reference genome based on a cloud computing platform according to claim 5, characterized in that the project management module, the basic analysis task submission module, and the interactive result analysis module all run on a PHP+MySQL+MongoDB server back end and an HTML+CSS+jQuery front-end page. In the interactive analysis module, a user interaction on the front-end page triggers a task execution command; the task parameters are submitted to the server back end, which calls server-side scripts written in Perl, C, Python, and R to analyze the sequencing data, and the server then returns the results to the front-end page for display.
10. The interactive analysis method for transcriptome projects with a reference genome based on a cloud computing platform according to claim 5, characterized in that, at each stage of running an analysis on the sequencing data, the basic analysis task submission module selects the corresponding program from its stored analysis software to perform bioinformatic analysis on the sequencing data.
CN201710598342.1A 2017-07-21 2017-07-21 The interactive analysis system and method for the transcriptome project for having reference gene group based on cloud computing platform Pending CN107368704A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710598342.1A CN107368704A (en) 2017-07-21 2017-07-21 The interactive analysis system and method for the transcriptome project for having reference gene group based on cloud computing platform
CN201810802816.4A CN109086567A (en) 2017-07-21 2018-07-20 The interactive analysis system and method for having the transcriptome project with reference to genome based on cloud computing platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710598342.1A CN107368704A (en) 2017-07-21 2017-07-21 The interactive analysis system and method for the transcriptome project for having reference gene group based on cloud computing platform

Publications (1)

Publication Number Publication Date
CN107368704A true CN107368704A (en) 2017-11-21

Family

ID=60307060

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201710598342.1A Pending CN107368704A (en) 2017-07-21 2017-07-21 The interactive analysis system and method for the transcriptome project for having reference gene group based on cloud computing platform
CN201810802816.4A Pending CN109086567A (en) 2017-07-21 2018-07-20 The interactive analysis system and method for having the transcriptome project with reference to genome based on cloud computing platform

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201810802816.4A Pending CN109086567A (en) 2017-07-21 2018-07-20 The interactive analysis system and method for having the transcriptome project with reference to genome based on cloud computing platform

Country Status (1)

Country Link
CN (2) CN107368704A (en)

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108694305A (en) * 2018-03-30 2018-10-23 武汉光谷创赢生物技术开发有限公司 Analysis of biological information platform based on cloud computing
CN109086570A (en) * 2018-06-29 2018-12-25 迈凯基因科技有限公司 Multi-database sequential interaction method and device
CN109215742A (en) * 2018-08-30 2019-01-15 武汉古奥基因科技有限公司 biological information visualization device and method
CN109448788A (en) * 2018-10-24 2019-03-08 广州基迪奥生物科技有限公司 On-line analysis platform architecture of microbiology of genomics and bioinformatics
CN109582292A (en) * 2018-11-01 2019-04-05 广州基迪奥生物科技有限公司 Online interaction cloud platform based on genomics and bioinformatics
CN109584962A (en) * 2018-10-26 2019-04-05 广州基迪奥生物科技有限公司 A kind of RNA-seq on-line analysis reporting system and its generation method
CN110010203A (en) * 2019-03-29 2019-07-12 广州基迪奥生物科技有限公司 Interactive dynamic QTL analysis system and method based on biological cloud platform
CN110008427A (en) * 2019-03-29 2019-07-12 广州基迪奥生物科技有限公司 Interactive biological information cloud analysis platform integrating multi-group knowledge base
CN110060741A (en) * 2019-04-29 2019-07-26 哈尔滨工业大学 Interaction network page biology big data method for visualizing based on JavaScript
CN110428867A (en) * 2019-07-30 2019-11-08 中国科学院心理研究所 A kind of human brain gene spatial and temporal expression profile on-line analysis system and its method
CN110490450A (en) * 2019-08-15 2019-11-22 安诺优达生命科学研究院 Biological information management system based on mixed cloud
CN110659252A (en) * 2019-08-12 2020-01-07 安诺优达生命科学研究院 Cloud-based biological information data delivery method and device and electronic equipment
CN110838338A (en) * 2018-08-15 2020-02-25 上海美吉生物医药科技有限公司 System, method, storage medium, and electronic device for creating biological analysis item
CN111009289A (en) * 2019-11-28 2020-04-14 广州基迪奥生物科技有限公司 RNA-seq online report flow analysis method and system based on cloud computing
CN111276190A (en) * 2020-01-07 2020-06-12 广州基迪奥生物科技有限公司 Dynamic interaction enrichment analysis method and system based on biological cloud platform
CN111402955A (en) * 2020-04-09 2020-07-10 德州学院 Biological information measuring method, system, storage medium and terminal
CN112037847A (en) * 2020-09-15 2020-12-04 中国科学院微生物研究所 Microbial strain genome analysis method and device and electronic equipment
CN113377765A (en) * 2021-07-09 2021-09-10 深圳华大基因科技服务有限公司 Multi-omics data analysis system and data conversion method thereof
CN113886674A (en) * 2020-07-01 2022-01-04 北京达佳互联信息技术有限公司 Resource recommendation method and device, electronic equipment and storage medium
CN115440305A (en) * 2022-08-29 2022-12-06 新疆碳智干细胞库有限公司 Human genetic resource gene data management system and method
CN115472298A (en) * 2022-10-28 2022-12-13 方寸慧医(江苏)生物科技有限公司 AI-based high-throughput sequencing data intelligent analysis system and method
CN115881225A (en) * 2022-12-28 2023-03-31 云舟生物科技(广州)股份有限公司 Method for analyzing biological information sequence, computer storage medium, and electronic device

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110993033A (en) * 2019-11-14 2020-04-10 北京诺禾致源科技股份有限公司 Method, system and device for processing genome data
CN111428159A (en) * 2020-03-17 2020-07-17 中国建设银行股份有限公司 Online classification method and device
CN111696629B (en) * 2020-06-29 2023-04-18 电子科技大学 Method for calculating gene expression quantity of RNA sequencing data
CN116386736B (en) * 2023-04-11 2024-04-05 南京派森诺基因科技有限公司 Fully automatic analysis method for eukaryotic reference-based transcriptome products based on second-generation sequencing

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102277351A (en) * 2010-06-10 2011-12-14 中国科学院上海生命科学研究院 Method for acquiring gene information and function genes from species without genome referenced sequences
CN104331640B (en) * 2014-10-17 2018-04-17 北京百迈客生物科技有限公司 Project concluding report analysis system and method based on biological cloud platform
CN105653900B (en) * 2015-12-25 2019-03-26 北京百迈客生物科技有限公司 Reference-free transcriptome analysis system and method
CN105447336B (en) * 2015-12-29 2018-06-19 北京百迈客生物科技有限公司 Analysis of Microbial Diversity system based on biological cloud platform
CN106021979A (en) * 2016-05-12 2016-10-12 北京百迈客云科技有限公司 Analysis system and method for human genome re-sequencing data

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108694305A (en) * 2018-03-30 2018-10-23 武汉光谷创赢生物技术开发有限公司 Analysis of biological information platform based on cloud computing
CN108694305B (en) * 2018-03-30 2021-06-11 武汉生物样本库有限公司 Biological information analysis system based on cloud computing
CN109086570A (en) * 2018-06-29 2018-12-25 迈凯基因科技有限公司 Multi-database sequential interaction method and device
CN109086570B (en) * 2018-06-29 2020-09-04 迈凯基因科技有限公司 Multi-database sequential interaction method and device
CN110838338A (en) * 2018-08-15 2020-02-25 上海美吉生物医药科技有限公司 System, method, storage medium, and electronic device for creating biological analysis item
CN110838338B (en) * 2018-08-15 2023-09-29 上海美吉生物医药科技有限公司 Biological analysis item establishment system, biological analysis item establishment method, storage medium, and electronic device
CN109215742A (en) * 2018-08-30 2019-01-15 武汉古奥基因科技有限公司 biological information visualization device and method
CN109448788A (en) * 2018-10-24 2019-03-08 广州基迪奥生物科技有限公司 On-line analysis platform architecture of microbiology of genomics and bioinformatics
CN109584962A (en) * 2018-10-26 2019-04-05 广州基迪奥生物科技有限公司 A kind of RNA-seq on-line analysis reporting system and its generation method
CN109582292B (en) * 2018-11-01 2022-02-18 广州基迪奥生物科技有限公司 Online interaction cloud platform based on genomics and bioinformatics
CN109582292A (en) * 2018-11-01 2019-04-05 广州基迪奥生物科技有限公司 Online interaction cloud platform based on genomics and bioinformatics
CN110008427A (en) * 2019-03-29 2019-07-12 广州基迪奥生物科技有限公司 Interactive biological information cloud analysis platform integrating multi-group knowledge base
CN110010203A (en) * 2019-03-29 2019-07-12 广州基迪奥生物科技有限公司 Interactive dynamic QTL analysis system and method based on biological cloud platform
CN110008427B (en) * 2019-03-29 2023-03-21 广州基迪奥生物科技有限公司 Interactive biological information cloud analysis platform integrating multi-group knowledge base
CN110010203B (en) * 2019-03-29 2022-05-27 广州基迪奥生物科技有限公司 Interactive dynamic QTL analysis system and method based on biological cloud platform
CN110060741A (en) * 2019-04-29 2019-07-26 哈尔滨工业大学 Interaction network page biology big data method for visualizing based on JavaScript
CN110428867B (en) * 2019-07-30 2021-09-17 中国科学院心理研究所 Human brain gene space-time expression mode online analysis system and method thereof
CN110428867A (en) * 2019-07-30 2019-11-08 中国科学院心理研究所 A kind of human brain gene spatial and temporal expression profile on-line analysis system and its method
CN110659252A (en) * 2019-08-12 2020-01-07 安诺优达生命科学研究院 Cloud-based biological information data delivery method and device and electronic equipment
CN110490450A (en) * 2019-08-15 2019-11-22 安诺优达生命科学研究院 Biological information management system based on mixed cloud
CN111009289A (en) * 2019-11-28 2020-04-14 广州基迪奥生物科技有限公司 RNA-seq online report flow analysis method and system based on cloud computing
CN111009289B (en) * 2019-11-28 2024-02-06 广州基迪奥生物科技有限公司 RNA-seq online report flow analysis method and system based on cloud computing
CN111276190B (en) * 2020-01-07 2023-09-12 广州基迪奥生物科技有限公司 Dynamic interactive enrichment analysis method and system based on biological cloud platform
CN111276190A (en) * 2020-01-07 2020-06-12 广州基迪奥生物科技有限公司 Dynamic interaction enrichment analysis method and system based on biological cloud platform
CN111402955A (en) * 2020-04-09 2020-07-10 德州学院 Biological information measuring method, system, storage medium and terminal
CN113886674A (en) * 2020-07-01 2022-01-04 北京达佳互联信息技术有限公司 Resource recommendation method and device, electronic equipment and storage medium
CN112037847A (en) * 2020-09-15 2020-12-04 中国科学院微生物研究所 Microbial strain genome analysis method and device and electronic equipment
CN113377765A (en) * 2021-07-09 2021-09-10 深圳华大基因科技服务有限公司 Multi-omics data analysis system and data conversion method thereof
CN115440305A (en) * 2022-08-29 2022-12-06 新疆碳智干细胞库有限公司 Human genetic resource gene data management system and method
CN115472298B (en) * 2022-10-28 2023-04-07 方寸慧医(江苏)生物科技有限公司 AI-based high-throughput sequencing data intelligent analysis system and method
CN115472298A (en) * 2022-10-28 2022-12-13 方寸慧医(江苏)生物科技有限公司 AI-based high-throughput sequencing data intelligent analysis system and method
CN115881225A (en) * 2022-12-28 2023-03-31 云舟生物科技(广州)股份有限公司 Method for analyzing biological information sequence, computer storage medium, and electronic device
CN115881225B (en) * 2022-12-28 2024-01-26 云舟生物科技(广州)股份有限公司 Analysis method of biological information sequence, computer storage medium and electronic device

Also Published As

Publication number Publication date
CN109086567A (en) 2018-12-25

Similar Documents

Publication Publication Date Title
CN107368704A (en) The interactive analysis system and method for the transcriptome project for having reference gene group based on cloud computing platform
Ji et al. TSCAN: Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis
Preud’Homme et al. Head-to-head comparison of clustering methods for heterogeneous data: a simulation-driven benchmark
CN107391963A (en) Interactive analysis system and method for eukaryotic reference-free transcriptomes based on cloud computing platform
CN107368700A (en) Interactive analysis system and method for microbial diversity based on cloud computing platform
Shen et al. BarleyBase—an expression profiling database for plant genomics
Nobre et al. Lineage: Visualizing multivariate clinical data in genealogy graphs
Caudai et al. AI applications in functional genomics
CN108140025A (en) For the interpretation of result of graphic hotsopt
US20030218634A1 (en) System and methods for visualizing diverse biological relationships
Pehkonen et al. Theme discovery from gene lists for identification and viewing of multiple functional groups
Hale et al. FunSet: an open-source software and web server for performing and displaying Gene Ontology enrichment analysis
US20050188294A1 (en) Systems, tools and methods for constructing interactive biological diagrams
Kim et al. Indelign: a probabilistic framework for annotation of insertions and deletions in a multiple alignment
CN108388775A (en) Genetic analysis guidance system and its method
Štajdohar et al. Interactive network exploration with Orange
Bandi SynVisio: a multiscale tool to explore genomic conservation
Edlund et al. Design of the MCAW compute service for food safety bioinformatics
Ghulam et al. Comprehensive analysis of features and annotations of pathway databases
Huson et al. Visualizing incompatibilities in phylogenetic trees using consensus outlines
Reis et al. A Conceptual Architecture for AI-based Big Data Analysis and Visualization Supporting Metagenomics Research.
Lamba et al. Tools and techniques for text mining and visualization
Lane et al. Eyeing the patterns: Data visualization using doubly-seriated color heatmaps
Hashiguchi et al. Impact of supervisors' research style on young biomedical scientists' capacity development as measured by REDi, a novel index of crossdisciplinarity
Nicholls et al. TraitLab: A MatLab package for fitting and simulating binary tree-like data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20171121