CN109979527A - Method and system for association analysis of transcriptome and metabolome data - Google Patents

Method and system for association analysis of transcriptome and metabolome data

Info

Publication number
CN109979527A
Authority
CN
China
Prior art keywords
gene
metabolite
analysis
transcriptome
differential
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910176587.4A
Other languages
Chinese (zh)
Inventor
夏昊强
周煌凯
高川
张羽
陶勇
罗玥
邢燕
张秋雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Gene Denovo Biotechnology Co ltd
Original Assignee
Guangzhou Gene Denovo Biotechnology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Gene Denovo Biotechnology Co ltd
Priority to CN201910176587.4A
Publication of CN109979527A
Legal status: Pending

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a method and system for association analysis of transcriptome and metabolome data. The method comprises the following steps: step S1, perform transcriptome sequencing on the samples and perform bioinformatics analysis on the transcriptome data to obtain differentially expressed genes; step S2, perform metabolome profiling on the samples and perform bioinformatics analysis on the metabolome data to obtain differential metabolites; step S3, based on the obtained differentially expressed genes and differential metabolites, perform bioinformatics analysis of their association features. The invention addresses the prior-art problem that data from a single omics layer are one-sided and partly unreliable. Because biomolecules pass information to one another and act in coordination before a specific function is finally expressed, the integrated multi-omics analysis of the invention is more systematic and better suited to revealing complex functional mechanisms.

Description

Method and system for association analysis of transcriptome and metabolome data
Technical field
The present invention relates to the technical field of transcriptomics and metabolomics, and more particularly to a method and system for association analysis of transcriptome and metabolome data based on a combination of bioinformatics algorithms.
Background
The metabolome is an important component of systems biology. Metabolites are the final products of life activity and directly reflect the effects of environmental, physiological or pathological changes on the organism. Transcriptome sequencing can yield a large number of differentially expressed genes and regulated metabolic pathways. However, transcription and metabolism do not occur independently in a biological system. Data from a single omics layer are one-sided and partly unreliable, and may fail to describe the whole biological process. In addition, because genes are difficult to link directly to phenotypes, key signaling pathways are hard to identify, and single-omics studies often fall short of their research goals. Because genes and metabolites that take part in the same biological process tend to show the same or similar patterns of change, transcriptome and metabolome data can be integrated and analyzed jointly to mine the genes and metabolites involved in a regulatory process, reveal the true gene expression regulatory network, and obtain a more complete account of pathways and mechanisms. Meanwhile, with the rapid development of high-throughput sequencing, transcriptome sequencing and metabolome profiling produce massive amounts of biological data, which places higher demands on computation and analysis. Data integration and mining need to run on high-performance computing clusters, and high-throughput integrative analysis is an essential means of obtaining the final biological information.
Chinese invention patent publication No. CN107832585 discloses an RNAseq data analysis method, but it does not disclose a method for association analysis of two omics data sets, so the problem that single-omics data are one-sided and partly unreliable remains.
Summary of the invention
To overcome the above deficiencies of the prior art, the object of the present invention is to provide a method and system for association analysis of transcriptome and metabolome data, so as to solve the prior-art problem that single-omics data are one-sided and partly unreliable. By describing the whole biological process, the invention mines the genes and metabolites involved in a regulatory process, reveals the true gene expression regulatory network, identifies key signaling pathways, and obtains a more complete account of pathways and mechanisms.
To achieve the above and other objects, the present invention proposes a method for association analysis of transcriptome and metabolome data, comprising the following steps:
Step S1: perform transcriptome sequencing on the samples and perform bioinformatics analysis on the transcriptome data to obtain differentially expressed genes;
Step S2: perform metabolome profiling on the samples and perform bioinformatics analysis on the metabolome data to obtain differential metabolites;
Step S3: based on the obtained differentially expressed genes and differential metabolites, perform bioinformatics analysis of their association features.
Preferably, in step S3, the analyses performed on the obtained gene expression levels and metabolite abundance data include, but are not limited to, pathway function model analysis, O2PLS model analysis and correlation coefficient model analysis. The pathway function model analysis queries the KEGG metabolic pathways shared by genes and metabolites and analyzes the association features of the genes and metabolites in the shared pathways. The O2PLS model analysis builds an O2PLS model from the gene expression and metabolite abundance data and uses model prediction to obtain sets of associated genes and metabolites for joint analysis. The correlation coefficient model analysis calculates the Pearson correlation coefficients between gene expression levels and metabolite abundances and outputs them for display.
Preferably, the pathway function model analysis includes, but is not limited to, analysis of the metabolic pathways shared by between-group differential genes and between-group differential metabolites, analysis of the metabolic pathways shared by between-group differential genes and all metabolites, and analysis of the metabolic pathways shared by all genes and all metabolites.
Preferably, in each analysis type, pathway annotation is performed first: the genes and metabolites to be analyzed are matched against the gene and metabolite information contained in the pathways of the KEGG database to obtain the pathways in which each gene and metabolite is located. The genes and metabolites present in the same pathway are then shown as graphical statistics. Genes and metabolites annotated to the same pathway are considered to have a potential functional association, so the association information between genes and metabolites is finally obtained in the form of shared pathways.
Preferably, the pathway function model analysis also draws and outputs metabolic pathway maps in which genes and metabolites are associated, so that the functional features linking genes and metabolites are presented intuitively.
Preferably, the O2PLS model analysis includes the following steps:
Step 2.1: build the model several times using cross-validation, calculate the prediction error of each run, and select the most suitable model among these preliminary models. In general, a smaller prediction error indicates a more reasonable model. The most suitable O2PLS model is obtained through these repeated preliminary runs;
Step 2.2: calculate the contribution of the effective parts of the transcriptome and metabolome to the model, in order to assess how well the model has been built;
Step 2.3: calculate the contribution of each gene and each metabolite of the joint part to the whole model. The size of the contribution is expressed by the loading value. For all transcriptome and metabolome data, draw a loading plot for each omics layer;
Step 2.4: based on the element loading values obtained in step 2.3, screen genes and metabolites and draw an integrated loading plot that shows the genes and metabolites with the highest degree of association, and draw and output the between-omics association loading plot.
Preferably, the correlation coefficient calculation includes, but is not limited to, the Pearson coefficients between the expression levels of all differential genes and the abundances of all differential metabolites, and the Pearson coefficients between the expression levels of all genes and the abundances of all metabolites.
Preferably, the correlation coefficient model analysis also uses the calculated coefficients to draw and output a correlation heatmap and a correlation network.
Preferably, the correlation coefficient model analysis includes the following steps:
Step 3.1: calculate the Pearson coefficients between gene expression levels and metabolite abundances;
Step 3.2: based on the Pearson coefficients between differential gene expression levels and differential metabolite abundances calculated in step 3.1, sort by absolute correlation from high to low, take the top-ranked genes whose absolute correlation is also greater than a preset value, and show the correlations between the selected genes and all differential metabolites as a heatmap;
Step 3.3: based on the Pearson coefficients between differential gene expression levels and differential metabolite abundances calculated in step 3.1, filter the data and draw and output a correlation network that shows the genes or metabolites occupying important positions in the network.
To achieve the above objects, the present invention also provides a system for association analysis of transcriptome and metabolome data, comprising:
a differentially expressed gene analysis unit, configured to perform transcriptome sequencing on the samples and perform bioinformatics analysis on the transcriptome data to obtain differentially expressed genes;
a differential metabolite analysis unit, configured to perform metabolome profiling on the samples and perform bioinformatics analysis on the metabolome data to obtain differential metabolites;
an association analysis unit, configured to perform bioinformatics analysis of association features based on the differentially expressed genes and differential metabolites.
Compared with the prior art, the method and system of the present invention perform transcriptome sequencing and metabolome profiling on the samples separately and carry out association analysis on the two resulting data sets, gene expression levels and metabolite abundances, to analyze their association features. This addresses the problem that single-omics data are one-sided and partly unreliable, describes the whole biological process, mines the genes and metabolites involved in a regulatory process, reveals the true gene expression regulatory network, identifies key signaling pathways, and obtains a more complete account of pathways and mechanisms.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the method for association analysis of transcriptome and metabolome data of the present invention;
Fig. 2 is a system architecture diagram of the system for association analysis of transcriptome and metabolome data of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described below through specific examples and with reference to the drawings. Those skilled in the art can readily understand further advantages and effects of the invention from the content disclosed in this specification. The invention can also be implemented or applied through other specific examples, and the details in this specification can be modified and changed in various ways for different viewpoints and applications without departing from the spirit of the invention.
Analysis shows that association analysis of two omics data sets (transcriptome and metabolome) gives better biological results and better supports research into underlying mechanisms. With a multi-omics association analysis method covering the metabolome and the transcriptome, the many temporally expressed genes and the differentially accumulated metabolites can be analyzed together and, combined with molecular biology techniques, used to explain the biological phenotype of interest at the molecular level and to explore the mechanisms of growth, development, and physiological and pathological responses. The present invention therefore proposes a method for association analysis of transcriptome and metabolome data based on a combination of bioinformatics algorithms.
Fig. 1 is a flow chart of the steps of the method for association analysis of transcriptome and metabolome data of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S1: perform transcriptome sequencing on the samples and perform bioinformatics analysis on the transcriptome data to obtain differentially expressed genes. Specifically, screen for the set of genes that affect sample grouping.
Step S2: perform metabolome profiling on the samples and perform bioinformatics analysis on the metabolome data to obtain differential metabolites. Specifically, screen for the set of metabolites that affect sample grouping.
Step S3: perform bioinformatics analysis of association features based on the differentially expressed genes and differential metabolites. In a specific embodiment of the invention, this analysis runs on cloud computing, and the following three model analyses are performed on the two data sets, gene expression levels and metabolite abundances:
1. Pathway function model analysis: query the KEGG metabolic pathways shared by genes and metabolites, and analyze the association features of the genes and metabolites in the shared pathways.
Specifically, the pathway function model analysis includes, but is not limited to, shared metabolic pathway analysis and display of results. Between-group difference analysis has already yielded differentially expressed genes from the transcriptome data and differentially expressed metabolites from the metabolome data, and KEGG enrichment analysis has been performed for each omics layer separately. In the association analysis, the KEGG metabolic pathways shared by genes and metabolites are then analyzed.
In a specific embodiment of the invention, the pathway function model analysis includes, but is not limited to, the following three types:
1) analysis of the metabolic pathways shared by between-group differential genes and between-group differential metabolites;
2) analysis of the metabolic pathways shared by between-group differential genes and all metabolites, performed because there may be very few kinds of between-group differential metabolites or because the differential metabolites may not include the target metabolite;
3) analysis of the metabolic pathways shared by all genes and all metabolites, which serves as the basis for screening in other customized analyses.
Specifically, in each analysis type, pathway annotation is performed first: the genes and metabolites to be analyzed are matched against the gene and metabolite information contained in the pathways of the KEGG database to obtain the pathways in which each gene and metabolite is located. The genes and metabolites present in the same pathway are then shown as graphical statistics. Genes and metabolites annotated to the same pathway are considered to have a potential functional association. The association information between genes and metabolites is finally obtained in the form of shared pathways.
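The shared-pathway lookup described above can be sketched as a set intersection over pathway annotations. This is a minimal illustration, not the patented implementation: the function name `shared_pathways`, the input dictionaries and the pathway identifiers are hypothetical, and the annotations are assumed to have already been retrieved from KEGG by some other means.

```python
from collections import defaultdict


def shared_pathways(gene_to_pathways, metabolite_to_pathways):
    """Return {pathway: (genes, metabolites)} for pathways annotated to both.

    Both inputs map an identifier to the list of pathway IDs it is annotated to.
    """
    genes_by_pathway = defaultdict(set)
    for gene, pathways in gene_to_pathways.items():
        for pathway in pathways:
            genes_by_pathway[pathway].add(gene)

    metabolites_by_pathway = defaultdict(set)
    for metabolite, pathways in metabolite_to_pathways.items():
        for pathway in pathways:
            metabolites_by_pathway[pathway].add(metabolite)

    # A pathway is "shared" when at least one gene and one metabolite map to it.
    common = genes_by_pathway.keys() & metabolites_by_pathway.keys()
    return {
        pathway: (sorted(genes_by_pathway[pathway]),
                  sorted(metabolites_by_pathway[pathway]))
        for pathway in sorted(common)
    }


# Hypothetical annotations (placeholder identifiers, not real KEGG entries).
gene_annotation = {
    "geneA": ["pathway_1", "pathway_2"],
    "geneB": ["pathway_2"],
    "geneC": ["pathway_9"],
}
metabolite_annotation = {
    "metX": ["pathway_2"],
    "metY": ["pathway_1", "pathway_5"],
}
result = shared_pathways(gene_annotation, metabolite_annotation)
```

The three analysis types listed above would differ only in which gene and metabolite sets are passed in (differential versus all).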
Specifically, the results are displayed by outputting a metabolic pathway map that shows the differential metabolites together with their associated genes. To present the relationship between genes, metabolites and the biological question intuitively, the map marks the metabolites whose abundance is up- or down-regulated and the genes whose expression is up- or down-regulated.
2. O2PLS (bidirectional orthogonal projections to latent structures) model analysis: build an O2PLS model from the gene expression and metabolite abundance data, remove the noise part of the model, and use model prediction to obtain sets of associated genes and metabolites for joint analysis.
Specifically, the O2PLS analysis is performed with OmicsPLS on all transcriptome data and all metabolome data. O2PLS is a generalization of OPLS that supports bidirectional modeling and prediction between two data matrices. This analysis mines the internal links between the transcriptome and the metabolome, determines the degree of association between the two data sets, and identifies the main genes or metabolites that drive the association.
In a specific embodiment of the invention, the O2PLS model analysis includes, but is not limited to, the following steps:
Step 2.1, model building: because both under-fitting and over-fitting affect the analysis, the model is first built several times using cross-validation before the formal analysis. The prediction error of each run is calculated and the most suitable preliminary model is selected. In general, a smaller prediction error indicates a more reasonable model. The most suitable O2PLS model obtained in this way splits the transcriptome and metabolome data into six main parts: the joint part of the transcriptome (mainly the genes strongly associated with metabolites), the orthogonal part of the transcriptome (genes that have no influence on metabolite abundance and affect only the transcriptome data), the noise part of the transcriptome (genes that affect neither the transcriptome data nor the metabolome data), and the corresponding joint, orthogonal and noise parts of the metabolome.
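The selection in step 2.1 amounts to choosing the candidate model with the lowest cross-validated prediction error. The sketch below shows only that selection loop. It assumes the tuning parameters are the numbers of joint and orthogonal components, which the source does not state, and it takes the error calculation as a caller-supplied function `cv_error`; both that function and `select_component_counts` are hypothetical names, not part of OmicsPLS.

```python
from itertools import product


def select_component_counts(cv_error, joint_range, orth_x_range, orth_y_range):
    """Return ((n_joint, n_orth_x, n_orth_y), error) with the smallest CV error.

    cv_error is any callable that fits a model with the given component
    counts under cross-validation and returns its prediction error.
    """
    best_params, best_error = None, None
    for n_joint, n_orth_x, n_orth_y in product(joint_range, orth_x_range, orth_y_range):
        error = cv_error(n_joint, n_orth_x, n_orth_y)
        if best_error is None or error < best_error:
            best_params, best_error = (n_joint, n_orth_x, n_orth_y), error
    return best_params, best_error


# Toy stand-in for a real cross-validation routine, for illustration only.
def toy_cv_error(n_joint, n_orth_x, n_orth_y):
    return (n_joint - 2) ** 2 + (n_orth_x - 1) ** 2 + n_orth_y


best_params, best_error = select_component_counts(
    toy_cv_error, range(1, 4), range(0, 3), range(0, 3)
)
```

In practice the callable would wrap an actual O2PLS fit on held-out folds; the loop itself is independent of how the error is obtained.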
Step 2.2, model evaluation: after the O2PLS model has been built, calculate the contribution of the effective parts (the joint and orthogonal parts) of the transcriptome and metabolome to the model, in order to assess how well the model has been built. The larger the contribution of the effective parts, the more reasonable the model.
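The source gives no formula for the six parts or for the contribution of the effective parts. One common way to write the O2PLS decomposition, with symbols chosen here for illustration (X is the samples-by-genes expression matrix, Y the samples-by-metabolites abundance matrix), is the following, where the first term of each line is the joint part, the second the orthogonal part, and E and F are the noise parts. The contribution of the effective parts can then be read as the fraction of the total sum of squares that the joint and orthogonal parts account for.

```latex
\begin{aligned}
X &= T W^{\top} + T_{\perp} P_{\perp}^{\top} + E, \\
Y &= U C^{\top} + U_{\perp} Q_{\perp}^{\top} + F, \\
R^2_X &= \frac{\lVert T W^{\top} \rVert_F^2 + \lVert T_{\perp} P_{\perp}^{\top} \rVert_F^2}{\lVert X \rVert_F^2},
\qquad
R^2_Y = \frac{\lVert U C^{\top} \rVert_F^2 + \lVert U_{\perp} Q_{\perp}^{\top} \rVert_F^2}{\lVert Y \rVert_F^2}.
\end{aligned}
```

Under this reading, values of R²_X and R²_Y close to 1 mean that little of the data is left in the noise parts.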
Step 2.3, element contribution analysis: using the O2PLS model, calculate the contribution of each gene and each metabolite of the joint part to the whole model. The size of the contribution is expressed by the loading value. The larger the absolute loading value of a gene or metabolite, the stronger its association with the metabolites or genes of the other omics layer. To allow the importance of the elements in both omics layers to be checked visually, a loading plot is drawn and output for each omics layer, covering all transcriptome and metabolome data.
Step 2.4, between-omics element association analysis: based on the element loading values obtained in the previous step, select the genes and metabolites in the top 2.5% at both ends (2.5% on the positive side and 2.5% on the negative side, 5% in total) and draw an integrated loading plot that shows the genes and metabolites with the highest degree of association, and draw and output the between-omics association loading plot.
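The two-sided 2.5% screen in step 2.4 can be sketched with quantile cut-offs. This is one possible reading, assuming the tails are taken from the distribution of signed loading values of a single joint component; the function name `select_two_tailed` and the example data are hypothetical.

```python
import numpy as np


def select_two_tailed(names, loadings, tail_fraction=0.025):
    """Keep elements whose loading lies in the lowest or highest tail."""
    loadings = np.asarray(loadings, dtype=float)
    lower = np.quantile(loadings, tail_fraction)        # cut-off for the negative end
    upper = np.quantile(loadings, 1.0 - tail_fraction)  # cut-off for the positive end
    mask = (loadings <= lower) | (loadings >= upper)
    return [name for name, keep in zip(names, mask) if keep]


# Hypothetical loading values for 200 genes, evenly spread from -1 to 1.
gene_names = [f"g{i}" for i in range(200)]
gene_loadings = np.linspace(-1.0, 1.0, 200)
selected = select_two_tailed(gene_names, gene_loadings)
```

The same function would be applied separately to the metabolite loadings before the two selections are combined in one plot.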
3. Correlation coefficient model analysis: calculate the Pearson correlation coefficients between gene expression levels and metabolite abundances, and display them as a heatmap and a network. Note that, in a specific embodiment of the invention, the correlation coefficient model analysis is performed only when there are at least 3 sample groups.
Specifically, the correlation coefficient model analysis includes, but is not limited to, the following steps:
Step 3.1, calculate the Pearson correlation coefficient: the Pearson correlation coefficient measures the correlation between two variables, that is, how strongly the two variables vary together, and takes values in [-1, +1]. The Pearson coefficients between gene expression levels and metabolite abundances are calculated with the cor.test function of the R language to assess the correlation between genes and metabolites. In a specific embodiment of the invention, two types of Pearson correlation coefficient are calculated: 1) the Pearson coefficients between the expression levels of all differential genes and the abundances of all differential metabolites; 2) the Pearson coefficients between the expression levels of all differential genes and the abundances of all metabolites.
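The source computes the coefficients with R's cor.test. As an illustration only, the same gene-by-metabolite coefficient matrix can be sketched in Python with NumPy. The function name `pearson_matrix` and the layout (rows are genes or metabolites, columns are the same samples in the same order) are assumptions, and no p-values are computed here.

```python
import numpy as np


def pearson_matrix(expression, abundance):
    """Pearson r between every gene (row of expression) and every metabolite
    (row of abundance). Both matrices must share the same sample columns.

    Rows with zero variance would cause a division by zero and should be
    removed beforehand.
    """
    expression = np.asarray(expression, dtype=float)
    abundance = np.asarray(abundance, dtype=float)

    # Center each row, then scale it to unit length.
    expr_centered = expression - expression.mean(axis=1, keepdims=True)
    abun_centered = abundance - abundance.mean(axis=1, keepdims=True)
    expr_unit = expr_centered / np.linalg.norm(expr_centered, axis=1, keepdims=True)
    abun_unit = abun_centered / np.linalg.norm(abun_centered, axis=1, keepdims=True)

    # The dot product of two centered unit vectors is their Pearson r.
    return expr_unit @ abun_unit.T


# Hypothetical data: 2 genes and 1 metabolite measured in 4 samples.
expression = [[1.0, 2.0, 3.0, 4.0],
              [4.0, 3.0, 2.0, 1.0]]
abundance = [[2.0, 4.0, 6.0, 8.0]]
r = pearson_matrix(expression, abundance)
```

The result has one row per gene and one column per metabolite, which is the shape the heatmap and network steps consume.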
Step 3.2, draw the correlation heatmap: based on the Pearson coefficients between differential gene expression levels and differential metabolite abundances calculated in step 3.1, sort by absolute correlation from high to low, take the top-ranked genes (for example the top 10) whose absolute correlation is also greater than a preset value (for example 0.5), and show the correlations between the selected genes and all differential metabolites (the union of the between-group differential metabolites) as a heatmap.
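The gene screen in step 3.2 can be sketched as follows. The source does not say how a single ranking score is derived for a gene that has one coefficient per metabolite, so this sketch assumes each gene is scored by its largest absolute coefficient. The function name `select_genes_for_heatmap` is hypothetical, and the defaults mirror the example values (top 10, 0.5) given in the text.

```python
import numpy as np


def select_genes_for_heatmap(gene_names, r, top_n=10, min_abs_r=0.5):
    """Pick up to top_n genes by their strongest |r| with any metabolite,
    keeping only genes whose strongest |r| exceeds min_abs_r.

    r is a genes x metabolites matrix of Pearson coefficients.
    Returns the selected gene names and the matching rows of r.
    """
    r = np.asarray(r, dtype=float)
    score = np.abs(r).max(axis=1)              # assumed per-gene ranking score
    order = np.argsort(-score, kind="stable")  # highest score first
    keep = np.array([i for i in order[:top_n] if score[i] > min_abs_r], dtype=int)
    return [gene_names[i] for i in keep], r[keep, :]


# Hypothetical coefficients for 3 genes and 2 differential metabolites.
names = ["g1", "g2", "g3"]
coefficients = [[0.9, -0.2],
                [0.1, 0.3],
                [-0.7, 0.6]]
selected_names, heatmap_values = select_genes_for_heatmap(names, coefficients, top_n=2)
```

The returned sub-matrix can be passed directly to any heatmap plotting routine.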
Step 3.3, draw the correlation network: based on the Pearson coefficients between differential gene expression levels and differential metabolite abundances calculated in step 3.1, filter the data and draw a correlation network that shows the genes or metabolites occupying important positions in the network.
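The source does not specify how the data are filtered or how an important position is measured. As a hedged sketch, the network can be built as an edge list of gene-metabolite pairs whose absolute coefficient passes a threshold, with node degree as a simple stand-in for importance. The threshold of 0.8 and the names `correlation_edges` and `node_degrees` are assumptions.

```python
from collections import Counter


def correlation_edges(gene_names, metabolite_names, r, min_abs_r=0.8):
    """Return (gene, metabolite, r) edges whose |r| is at least min_abs_r."""
    edges = []
    for i, gene in enumerate(gene_names):
        for j, metabolite in enumerate(metabolite_names):
            if abs(r[i][j]) >= min_abs_r:
                edges.append((gene, metabolite, r[i][j]))
    return edges


def node_degrees(edges):
    """Count how many edges touch each node; high-degree nodes are hubs."""
    degrees = Counter()
    for gene, metabolite, _ in edges:
        degrees[gene] += 1
        degrees[metabolite] += 1
    return degrees


# Hypothetical coefficients for 2 genes and 2 metabolites.
genes = ["g1", "g2"]
metabolites = ["m1", "m2"]
coefficient_matrix = [[0.9, -0.85],
                      [0.2, 0.95]]
edges = correlation_edges(genes, metabolites, coefficient_matrix)
degrees = node_degrees(edges)
```

The edge list can be exported to any network drawing tool; the sign of each coefficient distinguishes positive from negative associations.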
Fig. 2 is a system architecture diagram of the system for association analysis of transcriptome and metabolome data of the present invention. As shown in Fig. 2, the system comprises:
A differentially expressed gene analysis unit 201, configured to perform transcriptome sequencing on the samples and perform bioinformatics analysis on the transcriptome data to obtain differentially expressed genes. Specifically, the differentially expressed gene analysis unit 201 screens for the set of genes that affect sample grouping.
A differential metabolite analysis unit 202, configured to perform metabolome profiling on the samples and perform bioinformatics analysis on the metabolome data to obtain differential metabolites. Specifically, the differential metabolite analysis unit 202 screens for the set of metabolites that affect sample grouping.
An association analysis unit 203, configured to perform bioinformatics analysis of association features based on the differentially expressed genes and differential metabolites. In a specific embodiment of the invention, the association analysis unit 203 runs this analysis on cloud computing and performs the following three model analyses on the two data sets, gene expression levels and metabolite abundances:
1. Pathway function model analysis: query the KEGG metabolic pathways shared by genes and metabolites, and analyze the association features of the genes and metabolites in the shared pathways.
Specifically, the pathway function model analysis includes, but is not limited to, shared metabolic pathway analysis and display of results. Between-group difference analysis has already yielded differentially expressed genes from the transcriptome data and differentially expressed metabolites from the metabolome data, and KEGG enrichment analysis has been performed for each omics layer separately. In the association analysis, the KEGG metabolic pathways shared by genes and metabolites are then analyzed.
In a specific embodiment of the invention, the pathway function model analysis includes, but is not limited to, the following three types:
1) analysis of the metabolic pathways shared by between-group differential genes and between-group differential metabolites;
2) analysis of the metabolic pathways shared by between-group differential genes and all metabolites, performed because there may be very few kinds of between-group differential metabolites or because the differential metabolites may not include the target metabolite;
3) analysis of the metabolic pathways shared by all genes and all metabolites, which serves as the basis for screening in other customized analyses.
Preferably, the results are displayed by outputting a metabolic pathway map that shows the differential metabolites together with their associated genes. To present the functional features linking genes and metabolites intuitively, the map marks the metabolites whose abundance is up- or down-regulated and the genes whose expression is up- or down-regulated.
2. O2PLS (bidirectional orthogonal projections to latent structures) model analysis: build an O2PLS model from the gene expression and metabolite abundance data, remove the noise part of the model, and use model prediction to obtain sets of associated genes and metabolites for joint analysis.
Specifically, the O2PLS analysis is performed with OmicsPLS on all transcriptome data and all metabolome data. O2PLS is a generalization of OPLS that supports bidirectional modeling and prediction between two data matrices. This analysis mines the internal links between the transcriptome and the metabolome, determines the degree of association between the two data sets, and identifies the main genes or metabolites that drive the association.
In a specific embodiment of the invention, the O2PLS model analysis includes, but is not limited to, the following procedure:
1) Model building: because both under-fitting and over-fitting affect the analysis, the model is first built several times using cross-validation before the formal analysis. The prediction error of each run is calculated and the most suitable preliminary model is selected. In general, a smaller prediction error indicates a more reasonable model. The most suitable O2PLS model obtained in this way splits the transcriptome and metabolome data into six main parts: the joint part of the transcriptome (mainly the genes strongly associated with metabolites), the orthogonal part of the transcriptome (genes that have no influence on metabolite abundance and affect only the transcriptome data), the noise part of the transcriptome (genes that affect neither the transcriptome data nor the metabolome data), and the corresponding joint, orthogonal and noise parts of the metabolome.
2) Model evaluation: after the O2PLS model has been built, calculate the contribution of the effective parts (the joint and orthogonal parts) of the transcriptome and metabolome to the model, in order to assess how well the model has been built. The larger the contribution of the effective parts, the more reasonable the model.
3) Element contribution analysis: using the O2PLS model, calculate the contribution of each gene and each metabolite of the joint part to the whole model. The size of the contribution is expressed by the loading value. The larger the absolute loading value of a gene or metabolite, the stronger its association with the metabolites or genes of the other omics layer. To allow the importance of the elements in both omics layers to be checked visually, a loading plot is drawn and output for each omics layer, covering all transcriptome and metabolome data.
4) Between-omics element association analysis: based on the element loading values obtained in the previous step, select the genes and metabolites in the top 2.5% at both ends (2.5% on the positive side and 2.5% on the negative side, 5% in total) and draw an integrated loading plot that shows the genes and metabolites with the highest degree of association, and draw and output the between-omics association loading plot.
3. Correlation coefficient model analysis: calculate the Pearson correlation coefficients between gene expression levels and metabolite abundances, and display them as a heatmap and a network. Note that, in a specific embodiment of the invention, the correlation coefficient model analysis is performed only when there are at least 3 sample groups.
Specifically, the correlation coefficient model analysis includes, but is not limited to, the following procedure:
3.1 Calculate the Pearson correlation coefficient: the Pearson correlation coefficient measures the correlation between two variables, that is, how strongly the two variables vary together, and takes values in [-1, +1]. The Pearson coefficients between gene expression levels and metabolite abundances are calculated to assess the correlation between genes and metabolites. In a specific embodiment of the invention, two types of Pearson correlation coefficient are calculated: 1) the Pearson coefficients between the expression levels of all differential genes and the abundances of all differential metabolites; 2) the Pearson coefficients between the expression levels of all differential genes and the abundances of all metabolites.
3.2 Draw the correlation heatmap: based on the Pearson coefficients between differential gene expression levels and differential metabolite abundances calculated in 3.1, sort by absolute correlation from high to low, take the top-ranked genes (for example the top 10) whose absolute correlation is also greater than a preset value (for example 0.5), and show the correlations between the selected genes and all differential metabolites (the union of the between-group differential metabolites) as a heatmap.
3.3 Draw the correlation network: based on the Pearson coefficients between differential gene expression levels and differential metabolite abundances calculated in 3.1, filter the data and draw a correlation network that shows the genes or metabolites occupying important positions in the network.
In conclusion a kind of transcript profile of the present invention and metabolism group data relation analysis method and system are by distinguishing sample Transcript profile sequencing and the sequencing of metabolism group are carried out, two groups of data of gene expression amount and metabolin abundance based on acquisition are associated point It analyses, analyzing and associating feature, can solve the problems, such as that there are one-sidedness and partial data unreliability for the single data for organizing sequencing, originally The entire biological process of invention complete picture excavates the gene and metabolin for participating in regulation process, discloses true gene table Up to regulated and control network, crucial signal path is determined, obtain more complete access and mechanism parsing, pass through information between biomolecule Transmitting, matching coordinative finally show specific function, and the confluence analysis of multiple groups is more systematic, and it is complicated to be more advantageous to announcement Functional mechanism.
The above-described embodiments merely illustrate the principles and effects of the present invention, and is not intended to limit the present invention.Any Without departing from the spirit and scope of the present invention, modifications and changes are made to the above embodiments by field technical staff.Therefore, The scope of the present invention, should be as listed in the claims.

Claims (10)

1. A transcriptome and metabolome data association analysis method, comprising the following steps:
Step S1: perform transcriptome sequencing on a sample and carry out bioinformatics analysis on the transcriptome data to obtain differentially expressed genes;
Step S2: perform metabolome sequencing on the sample and carry out bioinformatics analysis on the metabolome data to obtain differential metabolites;
Step S3: based on the obtained differentially expressed genes and differential metabolites, carry out bioinformatics analysis of their associated features.
2. The transcriptome and metabolome data association analysis method according to claim 1, characterized in that: in step S3, the analyses performed on the obtained gene expression and metabolite abundance data include, but are not limited to, Pathway functional model analysis, O2PLS model analysis and correlation coefficient model analysis; the Pathway functional model analysis is used to query the KEGG metabolic pathways shared by genes and metabolites and to analyze the associated features of the genes and metabolites in the shared metabolic pathways; the O2PLS model analysis is used to build an O2PLS model from the gene expression and metabolite abundance data and thereby predict a set of correlated genes and metabolites for joint analysis; and the correlation coefficient model analysis is used to calculate the Pearson correlation coefficients between gene expression levels and metabolite abundances and to output them for display.
3. The transcriptome and metabolome data association analysis method according to claim 2, characterized in that: the Pathway functional model analysis includes, but is not limited to, analysis of the metabolic pathways shared by between-group differential genes and between-group differential metabolites, analysis of the metabolic pathways shared by between-group differential genes and all metabolites, and analysis of the metabolic pathways shared by all genes and all metabolites.
4. The transcriptome and metabolome data association analysis method according to claim 3, characterized in that: in each analysis type, pathway annotation is performed first, in which the genes and metabolites to be analyzed are compared and matched against the gene and metabolite information contained in the pathways of the KEGG database, to obtain the pathways in which the genes and metabolites are located; the genes and metabolites present in the same pathway are then shown as statistical charts; genes and metabolites annotated to the same pathway are considered likely to have a potential functional association; and the association information between genes and metabolites is finally obtained in the form of shared pathways.
5. The transcriptome and metabolome data association analysis method according to claim 3, characterized in that: the Pathway functional model analysis further draws and outputs metabolic pathway maps in which genes and metabolites are associated, to present intuitively the associated functional features of the genes and metabolites.
6. The transcriptome and metabolome data association analysis method according to claim 2, characterized in that the O2PLS model analysis comprises the following steps:
Step 2.1: build models repeatedly by cross-validation, calculate the prediction error of each model, and select the most suitable of the preliminary models; in general, a smaller prediction error indicates a more reasonable model; the most suitable O2PLS model is obtained through this repeated preliminary modeling;
Step 2.2: calculate the contribution of the effective parts of the transcriptome and the metabolome to the model, to assess how well the model has been constructed;
Step 2.3: calculate the contribution of each gene and each metabolite of the associated part to the whole model, the size of the contribution being represented by the loading value; for all transcriptome and metabolome data, draw the loading plot of each omics data set separately;
Step 2.4: based on the element loading values obtained in step 2.3, screen the genes and metabolites and draw an integrated loading plot to show the genes and metabolites with the greatest degree of association, and draw and output a joint loading plot of the two omics data sets.
7. The transcriptome and metabolome data association analysis method according to claim 2, characterized in that: the correlation coefficient calculation includes, but is not limited to, the Pearson coefficients between the expression levels of all differential genes and the abundances of all differential metabolites, and the Pearson coefficients between the expression levels of all genes and the abundances of all metabolites.
8. The transcriptome and metabolome data association analysis method according to claim 7, characterized in that: the correlation coefficient model analysis is further used to draw and output a correlation heat map and a network diagram from the calculated coefficients.
9. The transcriptome and metabolome data association analysis method according to claim 8, characterized in that the correlation coefficient model analysis comprises the following steps:
Step 3.1: calculate the Pearson coefficients between gene expression levels and metabolite abundances;
Step 3.2: based on the Pearson coefficients between differential gene expression levels and differential metabolite abundances calculated in step 3.1, sort by absolute correlation value from high to low, take the top-ranked genes whose absolute correlation value is greater than a preset value, and then display the correlation between the selected genes and all differential metabolites as a heat map;
Step 3.3: based on the Pearson coefficients between differential gene expression levels and differential metabolite abundances calculated in step 3.1, filter the data and draw and output a correlation network diagram to show the genes or metabolites that occupy important positions in the correlation network.
10. A transcriptome and metabolome data association analysis system, comprising:
a differentially expressed gene analysis unit, configured to perform transcriptome sequencing on a sample and carry out bioinformatics analysis on the transcriptome data to obtain differentially expressed genes;
a differential metabolite analysis unit, configured to perform metabolome sequencing on the sample and carry out bioinformatics analysis on the metabolome data to obtain differential metabolites;
an association analysis unit, configured to carry out bioinformatics analysis of associated features based on the differentially expressed genes and the differential metabolites.
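As an illustration only, the pathway annotation recited in claim 4 (matching genes and metabolites against pathway membership and reporting the pathways they share) can be sketched in Python as follows. The function name, the dictionary-based annotation tables and the placeholder identifiers are hypothetical; in practice the membership tables would be derived from the KEGG database, which this sketch does not query.

```python
from collections import defaultdict


def shared_pathways(gene_pathways, metabolite_pathways, genes, metabolites):
    """Return {pathway: (genes, metabolites)} for pathways annotated to at
    least one gene and at least one metabolite from the input lists."""
    genes_by_pathway = defaultdict(set)
    for gene in genes:
        for pathway in gene_pathways.get(gene, ()):
            genes_by_pathway[pathway].add(gene)

    metabolites_by_pathway = defaultdict(set)
    for metabolite in metabolites:
        for pathway in metabolite_pathways.get(metabolite, ()):
            metabolites_by_pathway[pathway].add(metabolite)

    # A pathway counts as shared only if both omics layers map to it.
    common = genes_by_pathway.keys() & metabolites_by_pathway.keys()
    return {
        pathway: (
            sorted(genes_by_pathway[pathway]),
            sorted(metabolites_by_pathway[pathway]),
        )
        for pathway in sorted(common)
    }


if __name__ == "__main__":
    # Hypothetical annotation tables with placeholder pathway identifiers.
    gene_pathways = {"g1": ["pA", "pB"], "g2": ["pB"], "g3": ["pC"]}
    metabolite_pathways = {"m1": ["pB"], "m2": ["pD"]}
    result = shared_pathways(
        gene_pathways, metabolite_pathways, ["g1", "g2", "g3"], ["m1", "m2"]
    )
    for pathway, (pathway_genes, pathway_metabolites) in result.items():
        print(pathway, pathway_genes, pathway_metabolites)
```

Passing differential genes with differential metabolites, differential genes with all metabolites, or all genes with all metabolites as the two input lists would correspond to the three analysis types listed in claim 3.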
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910176587.4A CN109979527A (en) 2019-03-08 2019-03-08 A kind of transcript profile and metabolism group data relation analysis method and system

Publications (1)

Publication Number Publication Date
CN109979527A true CN109979527A (en) 2019-07-05

Family

ID=67078287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910176587.4A Pending CN109979527A (en) 2019-03-08 2019-03-08 A kind of transcript profile and metabolism group data relation analysis method and system

Country Status (1)

Country Link
CN (1) CN109979527A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103558354A (en) * 2013-11-15 2014-02-05 南京大学 Water toxicity analysis method based on biologic omics integrated technology
CN108103176A (en) * 2018-01-02 2018-06-01 中国药科大学 Method based on metabolism group and transcription group association analysis screening fritillaria alkaloid synthesis key gene

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
金玉等: "转录组-代谢组分析方法及其在药物作用机理研究中的应用", 《生物技术通报》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110970116A (en) * 2019-12-05 2020-04-07 吉林省蒲川生物医药有限公司 Transcriptomics-based traditional Chinese medicine pharmacological mechanism analysis method
CN110970116B (en) * 2019-12-05 2023-09-01 吉林省蒲川生物医药有限公司 Traditional Chinese medicine pharmacological mechanism analysis method based on transcriptome
CN112986411A (en) * 2019-12-17 2021-06-18 中国科学院地理科学与资源研究所 Biological metabolite screening method
CN111061818A (en) * 2019-12-27 2020-04-24 北京百迈客生物科技有限公司 Metabolic group and other omics combined analysis method and device
CN111061818B (en) * 2019-12-27 2023-06-30 北京百迈客生物科技有限公司 Metabolic group and other group combined analysis method and device
CN111292809A (en) * 2020-01-20 2020-06-16 至本医疗科技(上海)有限公司 Method, electronic device, and computer storage medium for detecting RNA level gene fusion
CN111709219A (en) * 2020-04-28 2020-09-25 上海欧易生物医学科技有限公司 Method for personalized display of single omics and multi-group science KEGG PATHWAY map expression heatmaps and application
CN114333994A (en) * 2020-09-30 2022-04-12 天津现代创新中药科技有限公司 Method and system for determining differential gene pathways based on reference-free transcriptome sequencing
CN113707221A (en) * 2021-08-31 2021-11-26 中国水产科学研究院南海水产研究所 Fish sauce flavor forming functional microbial exoenzyme mining method based on multi-dimensional data
CN116129991A (en) * 2023-04-17 2023-05-16 南京派森诺基因科技有限公司 Non-targeted metabolic component analysis method based on qualitative and quantitative data of metabolites

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190705