CN109801676A - Method and device for evaluating the activation effect of a compound on gene pathways - Google Patents
- Publication number
- CN109801676A (application CN201910142574.5A)
- Authority
- CN
- China
- Prior art keywords
- gene
- pathway
- compound
- effect
- profile data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
This application discloses a method for evaluating the activation effect of a compound on gene pathways, comprising: obtaining transcriptome data of a control group and transcriptome data of a compound study group; determining differential expression fold-change data from the transcriptome data of the control group and of the compound study group; clustering the related genes so that co-expressed genes fall into the same group, to obtain a plurality of gene co-expression units; obtaining gene pathways, assigning each gene in a gene pathway a weight coefficient according to the promoting, inhibiting, phosphorylating, or dephosphorylating role it plays in that pathway, and thereby determining a gene pathway topology coefficient matrix; and determining, from the differential expression fold changes, the gene co-expression units, and the gene pathway topology coefficient matrix, a scoring result used to evaluate the activation effect of the compound on the gene pathways.
Description
Technical field
This application relates to the technical field of bioinformatics, and in particular to a method and device for evaluating the activation effect of a compound on gene pathways.
Background
Over the past few decades, with the advent of genetic engineering, a great deal of research and funding has gone into genomics and gene-based personalized medicine. With the widespread use of deep learning and machine learning algorithms, large-scale transcriptome data have been put to effective use, which has substantially improved traditional disease classification, personalized medicine, and other areas.
However, these classical clinical applications are still constrained by several widely recognized challenges and limitations. First, one of the most relevant challenges in transcriptome data analysis is the inherent complexity of gene network interactions, which remains the main obstacle to building comprehensive models from transcriptome data. In addition, the great diversity of experimental platforms, the difficulty of interpreting the values obtained, and the inconsistency of data from different types of equipment may also lead to misinterpretation of the underlying biological processes.
Despite these challenges, transcriptome data analysis algorithms have developed rapidly in both academia and industry, and some have already been tried in clinical settings, in particular for predicting patients' responses to various cancer treatments. These methods predict the response to a cancer treatment by identifying genes that are differentially expressed between sample groups. Although they can identify potential gene biomarkers and expression signatures during research, they struggle to capture the subtle differences between samples that arise from dynamic interactions between genes at the signaling network level.
The IPANDA method, developed in 2016, incorporates gene pathways and greatly reduces the dimensionality of the biological data, but it does not assess the role each gene plays on a gene pathway accurately enough.
Summary of the invention
The embodiments of this application provide a method for evaluating the activation effect of a compound on gene pathways that can evaluate this effect accurately while also reducing the dimensionality of the biological data.
In view of this, a first aspect of this application provides a method for evaluating the activation effect of a compound on gene pathways, the method comprising:
obtaining transcriptome data of a control group and transcriptome data of a compound study group;
obtaining differential expression fold-change data from the transcriptome data of the control group and the transcriptome data of the compound study group;
clustering the related genes so that co-expressed genes fall into the same group, to obtain a plurality of gene co-expression units;
obtaining gene pathways and, according to the role each gene plays in a gene pathway, assigning a weight coefficient to each gene in the gene pathway to obtain a gene pathway topology coefficient matrix, wherein the roles a gene can play in a gene pathway include promotion, inhibition, phosphorylation, and dephosphorylation;
determining a scoring result for the compound on each gene pathway from the differential expression fold-change data, the gene co-expression units, and the gene pathway topology coefficient matrix, wherein the scoring result is used to evaluate the activation effect of the compound on the gene pathway.
Optionally, assigning a weight coefficient to each gene in the gene pathway according to the role the gene plays in the gene pathway comprises:
setting the weight coefficient of a gene that promotes the gene pathway to +1, and setting the weight coefficient of a gene that inhibits the gene pathway to -1;
setting the weight coefficient of a gene that has a phosphorylation effect on the gene pathway to +2, and setting the weight coefficient of a gene that has a dephosphorylation effect on the gene pathway to -2.
Optionally, obtaining the gene pathway topology coefficient matrix comprises:
calculating, from the weight coefficient of each gene, the topology coefficient of the gene on each gene pathway using the R packages KEGGgraph and RBGL.
Optionally, clustering the genes so that co-expressed genes fall into the same group, to obtain a plurality of gene co-expression units, comprises:
performing a first clustering on the co-expressed genes, and performing a second clustering on the result of the first clustering, to obtain the gene co-expression units.
Optionally, clustering the genes so that co-expressed genes fall into the same group, to obtain a plurality of gene co-expression units, comprises:
using a density-based clustering method and/or a hierarchical clustering method.
Optionally, the density-based clustering method includes DBSCAN and OPTICS;
the hierarchical clustering method includes BIRCH.
A second aspect of this application provides a device for evaluating the activation effect of a compound on gene pathways, the device comprising:
a transcriptome data acquisition module, configured to obtain transcriptome data of a control group and transcriptome data of a compound study group;
a differential expression fold-change data acquisition module, configured to obtain differential expression fold-change data from the transcriptome data of the control group and the transcriptome data of the compound study group;
a gene co-expression unit acquisition module, configured to cluster the related genes so that co-expressed genes fall into the same group, to obtain a plurality of gene co-expression units;
a gene pathway topology coefficient matrix acquisition module, configured to obtain gene pathways and, according to the role each gene plays in a gene pathway, assign a weight coefficient to each gene in the gene pathway to obtain a gene pathway topology coefficient matrix, wherein the roles a gene can play in a gene pathway include promotion, inhibition, phosphorylation, and dephosphorylation;
a scoring module, configured to determine a scoring result for the compound on each gene pathway from the differential expression fold-change data, the gene co-expression units, and the gene pathway topology coefficient matrix, wherein the scoring result is used to evaluate the activation effect of the compound on the gene pathway.
Optionally, the gene pathway topology coefficient matrix acquisition module is specifically configured to:
set the weight coefficient of a gene that promotes the gene pathway to +1, and set the weight coefficient of a gene that inhibits the gene pathway to -1;
set the weight coefficient of a gene that has a phosphorylation effect on the gene pathway to +2, and set the weight coefficient of a gene that has a dephosphorylation effect on the gene pathway to -2.
Optionally, the gene pathway topology coefficient matrix acquisition module is specifically configured to:
calculate, from the weight coefficient of each gene, the topology coefficient of the gene on each gene pathway using the R packages KEGGgraph and RBGL.
Optionally, the gene co-expression unit acquisition module is specifically configured to:
perform a first clustering on the co-expressed genes, and perform a second clustering on the result of the first clustering, to obtain the gene co-expression units.
Optionally, the gene co-expression unit acquisition module is specifically configured to:
use a density-based clustering method and/or a hierarchical clustering method.
Optionally, the density-based clustering method includes DBSCAN and OPTICS;
the hierarchical clustering method includes BIRCH.
A third aspect of this application provides equipment for evaluating the activation effect of a compound on gene pathways, the equipment comprising a processor and a memory:
the memory is configured to store program code and to transmit the program code to the processor;
the processor is configured to execute, according to instructions in the program code, the steps of the method for evaluating the activation effect of a compound on gene pathways described in the first aspect above.
A fourth aspect of this application provides a computer-readable storage medium configured to store program code, the program code being used to execute the method for evaluating the activation effect of a compound on gene pathways described in the first aspect above.
As can be seen from the above technical solutions, the embodiments of this application have the following advantages:
The embodiments of this application provide a method for evaluating the activation effect of a compound on gene pathways. In this method, transcriptome data of a control group and of a compound study group are obtained first. Differential expression fold-change data are then determined from these two sets of transcriptome data. The related genes are clustered so that co-expressed genes fall into the same group, yielding a plurality of gene co-expression units. Gene pathways are obtained, each gene in a gene pathway is assigned a weight coefficient according to the promoting, inhibiting, phosphorylating, or dephosphorylating role it plays in that pathway, and a gene pathway topology coefficient matrix is determined from these weights. Finally, the IPANDA method is applied to the differential expression fold changes, the gene co-expression units, and the gene pathway topology coefficient matrix to determine a scoring result for the compound on each gene pathway, and this scoring result can be used to evaluate the activation effect of the compound on the gene pathway. Because promotion, inhibition, phosphorylation, and dephosphorylation are all taken into account when the gene pathway topology coefficient matrix is determined, the role of each gene on a gene pathway is assessed accurately, and the scoring result subsequently derived from that matrix characterizes the activation effect of the compound on the gene pathway more accurately.
Brief description of the drawings
To explain the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of this application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the method for evaluating the activation effect of a compound on gene pathways provided by an embodiment of this application;
Fig. 2 is a schematic structural diagram of the device for evaluating the activation effect of a compound on gene pathways provided by an embodiment of this application;
Fig. 3 is a schematic structural diagram of the equipment for evaluating the activation effect of a compound on gene pathways provided by an embodiment of this application.
Detailed description of the embodiments
To help those skilled in the art better understand the solutions of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. Evidently, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in this application without creative effort fall within the scope of protection of this application.
The terms "first", "second", "third", "fourth", and the like (if present) in the description, the claims, and the above drawings of this application are used to distinguish similar objects and are not used to describe a particular order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of this application described here can be implemented in an order other than the one illustrated or described. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or piece of equipment that contains a series of steps or units is not necessarily limited to the steps or units explicitly listed, and may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or equipment.
In the prior art, when the IPANDA method is used to assess the activation effect of a compound, the role each gene plays on a gene pathway often cannot be assessed accurately, so the accuracy of the final activation assessment is low.
To solve this technical problem of the prior art, the embodiments of this application provide a method for assessing the activation effect of a compound on gene pathways. The method can accurately assess the activation effect a compound has on a gene pathway while still reducing the dimensionality of the biological data.
Specifically, in the method for evaluating the activation effect of a compound on gene pathways provided by the embodiments of this application, transcriptome data of a control group and of a compound study group are obtained first. Differential expression fold-change data are then calculated from the acquired transcriptome data of the two groups. Next, the related genes are clustered so that co-expressed genes fall into the same group, yielding a plurality of gene co-expression units. Gene pathways are then obtained, each gene in a gene pathway is assigned a weight coefficient according to the promoting, inhibiting, phosphorylating, or dephosphorylating role it plays in that pathway, and a gene pathway topology coefficient matrix is determined from the weight coefficients of the genes in the pathway. Finally, a scoring result for the compound on each gene pathway is determined from the differential expression fold-change data, the gene co-expression units, and the gene pathway topology coefficient matrix. This scoring result can be used to evaluate the activation effect of the compound on the gene pathway.
When determining the gene pathway topology coefficient matrix, the above method takes into account the promoting, inhibiting, phosphorylating, and dephosphorylating roles that genes play in a gene pathway. In other words, the role of each gene in the pathway is assessed accurately while the matrix is being determined. As a result, the scoring result of the compound on a gene pathway, subsequently determined from the differential expression fold-change data, the gene co-expression units, and the gene pathway topology coefficient matrix, characterizes the activation effect of the compound on the gene pathway more accurately.
The method for evaluating the activation effect of a compound on gene pathways provided by this application is described in detail below by way of an embodiment:
Referring to Fig. 1, Fig. 1 is a schematic flowchart of the method for evaluating the activation effect of a compound on gene pathways provided by an embodiment of this application. As shown in Fig. 1, the method includes the following steps:
Step 101: obtain transcriptome data of a control group and transcriptome data of a compound study group.
Here, the transcriptome data of the control group are transcriptome data not affected by the compound, and the transcriptome data of the compound study group are transcriptome data affected by the compound. The transcriptome data of different compound study groups may correspond to different compound doses, different compound types, and/or different administration times. That is, experiments can be run with different compounds at the same dose, with the same compound at different doses, or with different compounds at different doses, and each experiment generates one corresponding set of transcriptome data. On top of these experimental conditions, the administration time can also be varied to generate transcriptome data. No restriction is placed here on the conditions under which the transcriptome data are generated.
In practice, one or more sets of control group transcriptome data can be obtained as needed, together with one or more corresponding sets of compound study group transcriptome data. The transcriptome data of the control group and of the compound study group can be obtained by experiment, or they can be obtained from online or offline transcriptome data sets. No specific restriction is placed here on how these data are obtained.
In practice, the acquired transcriptome data of the control group and of the compound study group take the form shown in Table 1:
Table 1
Gene | Compound study group 1 | Compound study group 2 | Normal group 1 | Normal group 2 |
TSPAN6 | 737.88411 | 789.4028003 | 734.65068 | 774.0787405 |
TNMD | 0 | 0 | 0 | 0 |
DPM1 | 685.1781021 | 659.2014157 | 607.2866174 | 648.4525964 |
SCYL3 | 177.2012333 | 181.1893394 | 179.9709581 | 173.6596697 |
C1orf112 | 364.3984336 | 385.1411586 | 379.3234039 | 345.4718961 |
FGR | 0 | 0 | 0 | 0 |
CFH | 79.96773606 | 62.82444432 | 90.44694302 | 79.44006167 |
FUCA2 | 1105.917441 | 1074.389048 | 1091.823812 | 978.2212245 |
GCLC | 3505.858247 | 3347.905533 | 3424.062843 | 3341.101198 |
NFYA | 603.3929175 | 656.4699182 | 674.6603607 | 706.6470602 |
STPG1 | 146.304608 | 150.2323668 | 168.8958222 | 165.3461749 |
NIPAL3 | 245.3555538 | 295.9122377 | 275.955469 | 247.5574015 |
Here, the two columns Compound study group 1 and Compound study group 2 are the transcriptome data of the compound study group, and the two columns Normal group 1 and Normal group 2 are the transcriptome data of the corresponding control group.
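As an illustration of how data laid out like Table 1 might be loaded for the later steps, the following is a minimal Python sketch, not part of the original disclosure. The function name `parse_expression_table` and the rule that columns whose header starts with "Compound" are study samples while all remaining sample columns are controls are assumptions made for this example; only three rows of Table 1 are reproduced.

```python
# Illustrative sketch: parse a pipe-delimited expression table laid out like
# Table 1 into per-gene lists of study and control values.
# The header-prefix convention used to tell the groups apart is an assumption.
TABLE = """\
Gene | Compound study group 1 | Compound study group 2 | Normal group 1 | Normal group 2 |
TSPAN6 | 737.88411 | 789.4028003 | 734.65068 | 774.0787405 |
TNMD | 0 | 0 | 0 | 0 |
CFH | 79.96773606 | 62.82444432 | 90.44694302 | 79.44006167 |
"""


def parse_expression_table(text):
    """Return {gene: {"study": [...], "control": [...]}} from a pipe-delimited table."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.strip().splitlines()
    ]
    header, body = rows[0], rows[1:]
    # Column 0 holds the gene symbol; the remaining columns are samples.
    study_cols = [i for i, name in enumerate(header) if name.startswith("Compound")]
    control_cols = [i for i in range(1, len(header)) if i not in study_cols]
    table = {}
    for row in body:
        table[row[0]] = {
            "study": [float(row[i]) for i in study_cols],
            "control": [float(row[i]) for i in control_cols],
        }
    return table


expression = parse_expression_table(TABLE)
```

Keeping the two groups as separate lists per gene makes the per-gene comparison between study and control samples in the next step straightforward.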
Step 102: obtain differential expression fold-change data from the transcriptome data of the control group and the transcriptome data of the compound study group.
After the transcriptome data of the control group and of the compound study group have been obtained, each set of transcriptome data can first be preprocessed, and the differential expression fold-change data of the compound study group relative to the control group can then be calculated.
Many mature ways of calculating differential expression fold-change data already exist. In a specific application, a suitable calculation can be chosen as needed to determine the differential expression fold-change data from the transcriptome data of the control group and of the compound study group. No specific restriction is placed here on how the differential expression fold-change data are calculated.
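Since the text leaves the fold-change calculation open, the following Python sketch shows one common choice purely as an illustration: the log2 ratio of the mean study expression to the mean control expression. The pseudocount of 1, which avoids division by zero for unexpressed genes such as TNMD, and the function name `log2_fold_change` are assumptions of this example; the input values are copied from Table 1.

```python
import math


def log2_fold_change(study, control, pseudocount=1.0):
    """log2 ratio of mean study expression to mean control expression.

    The pseudocount is an illustrative choice that keeps the ratio defined
    when a gene has zero expression in one or both groups.
    """
    mean_study = sum(study) / len(study)
    mean_control = sum(control) / len(control)
    return math.log2((mean_study + pseudocount) / (mean_control + pseudocount))


# Values copied from Table 1: (study replicates, control replicates).
samples = {
    "DPM1": ([685.1781021, 659.2014157], [607.2866174, 648.4525964]),
    "TNMD": ([0.0, 0.0], [0.0, 0.0]),
    "CFH": ([79.96773606, 62.82444432], [90.44694302, 79.44006167]),
}

fold_changes = {
    gene: log2_fold_change(study, control)
    for gene, (study, control) in samples.items()
}
```

With these inputs, DPM1 gets a positive value (higher in the study group), CFH a negative value, and TNMD exactly zero. A real analysis would normally add normalization and a statistical test, which this sketch omits.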
Step 103: cluster the related genes so that co-expressed genes fall into the same group, to obtain a plurality of gene co-expression units.
Next, the related genes in the transcriptome data of the control group and of the compound study group are clustered. Related genes here are genes whose expression level changes markedly under the action of the compound. Among the related genes, those that are controlled by the same expression factor and/or show significantly coordinated expression are clustered into the same group. These are the co-expressed genes, and clustering them yields the gene co-expression units. Grouping genes by this coordinated behavior produces a plurality of gene co-expression units.
When the co-expressed genes are clustered, a single clustering pass can be applied directly to obtain the corresponding gene co-expression units. To improve the clustering result, two passes can be applied instead: a first clustering is performed on the co-expressed genes, and a second clustering is then performed on the result of the first clustering to obtain the gene co-expression units.
It should be noted that the co-expressed genes can be clustered with a density-based clustering method and/or a hierarchical clustering method. Specifically, density-based clustering methods include DBSCAN (Density-Based Spatial Clustering of Applications with Noise) and OPTICS (Ordering Points To Identify the Clustering Structure); hierarchical clustering methods include BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies).
When only one clustering pass is applied, any one of the above clustering methods can be used to cluster the co-expressed genes and obtain the gene co-expression units. When several passes are applied, a single one of the above methods can be used repeatedly, or several of the methods can be combined. For example, a density-based clustering method can be used for the first clustering of the co-expressed genes, and a hierarchical clustering method can then be used for the second clustering on the result of the first, to obtain the gene co-expression units.
Taking two clustering passes as an example, the process of generating the gene co-expression units is described below:
Specifically, the density-based clustering method OPTICS can be used for the first clustering of the co-expressed genes. OPTICS does not require the two parameters neighborhood radius and minimum number of neighborhood points to be entered manually, and the resulting clusters are relatively insensitive to these two parameters. After the first clustering result has been obtained, the similarity between the genes in that result is determined, and the genes whose similarity is higher than a preset threshold are selected as the input to the second clustering. For example, only genes with a similarity higher than 0.3, 0.4, 0.5, 0.6, or 0.7 are retained; more specifically, only genes with a similarity higher than 0.5 are retained.
Then, the hierarchical clustering method BIRCH is used for the second clustering on the result of the first clustering, generating the gene co-expression units. BIRCH is suited to large data sets, clusters large-scale data efficiently, and can run normally with any given amount of memory.
By combining these two clustering methods effectively, accurate gene co-expression units can be obtained in a relatively short time and with a small amount of computing resources.
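The similarity filter applied between the two clustering passes can be sketched as follows. This is an illustrative Python example under stated assumptions, not the procedure of the original disclosure: the text does not define the similarity measure, so Pearson correlation between expression profiles is assumed, a gene is retained when its mean correlation with the other members of its first-pass cluster exceeds the threshold, and the gene names and profiles are invented. The OPTICS and BIRCH passes themselves are not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def filter_cluster(profiles, threshold=0.5):
    """Keep genes whose mean correlation with the other cluster members exceeds threshold."""
    genes = list(profiles)
    kept = []
    for gene in genes:
        others = [pearson(profiles[gene], profiles[h]) for h in genes if h != gene]
        if others and sum(others) / len(others) > threshold:
            kept.append(gene)
    return kept


# Hypothetical first-pass cluster: three co-varying genes and one unrelated gene.
first_pass_cluster = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [1.5, 2.4, 3.6, 4.4],
    "geneD": [5.0, 3.0, 3.0, 5.0],
}

retained = filter_cluster(first_pass_cluster, threshold=0.5)
```

Here geneD is dropped because its profile is essentially uncorrelated with the other three, and only the retained genes would be passed to the second clustering.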
Step 104: obtain gene pathways and, according to the role each gene plays in a gene pathway, assign a weight coefficient to each gene in the gene pathway to obtain a gene pathway topology coefficient matrix. The roles a gene can play in a gene pathway include promotion, inhibition, phosphorylation, and dephosphorylation.
Gene pathways are obtained from a gene pathway database such as KEGG (Kyoto Encyclopedia of Genes and Genomes), and each gene in a gene pathway is assigned a weight coefficient according to the role it plays in that pathway. The topology coefficient of each gene on each gene pathway is then calculated from the weight coefficients, and the gene pathway topology coefficient matrix is determined.
It should be noted that, when weight coefficients are assigned, the roles mainly considered for a gene in a gene pathway are promotion, inhibition, phosphorylation, and dephosphorylation.
Specifically, if a gene promotes the gene pathway, its weight coefficient can be set to +1; if a gene inhibits the gene pathway, its weight coefficient can be set to -1. The addition or removal of a phosphate group acts as a biological "switch" for many reactions, that is, phosphorylation and dephosphorylation play a "switch" role in biology. Accordingly, if a gene has a phosphorylation effect on the gene pathway, its weight coefficient can be set to +2; if a gene has a dephosphorylation effect on the gene pathway, its weight coefficient can be set to -2.
It should be understood that, in practice, the promotion, inhibition, phosphorylation, and dephosphorylation effects of a gene on a gene pathway can be considered as needed and given corresponding weight coefficients, and the weight coefficients can also be set to other values commonly used in the art. No restriction is placed here on the specific values of the weight coefficients.
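The weight assignment described above can be sketched in Python as a simple lookup. This is a minimal illustration only: the role labels, the gene names, and the function `assign_weights` are hypothetical, and the optional override reflects the statement that other values may be used.

```python
# Default role-to-weight mapping, following the values given in the text.
DEFAULT_ROLE_WEIGHTS = {
    "promotion": 1,
    "inhibition": -1,
    "phosphorylation": 2,
    "dephosphorylation": -2,
}


def assign_weights(gene_roles, role_weights=None):
    """Map each gene to the weight of the role it plays in one pathway.

    role_weights optionally overrides individual defaults, since the text
    allows other values to be chosen.
    """
    weights = dict(DEFAULT_ROLE_WEIGHTS)
    if role_weights:
        weights.update(role_weights)
    return {gene: weights[role] for gene, role in gene_roles.items()}


# Hypothetical annotation of four genes on one pathway.
gene_roles = {
    "GENE_A": "promotion",
    "GENE_B": "inhibition",
    "GENE_C": "phosphorylation",
    "GENE_D": "dephosphorylation",
}

default_weights = assign_weights(gene_roles)
custom_weights = assign_weights(
    gene_roles, {"phosphorylation": 3, "dephosphorylation": -3}
)
```

Keeping the mapping in one table means that changing the weighting scheme does not touch the rest of the pipeline.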
After the role of each gene in the gene pathway has been considered and a weight coefficient has been assigned to each gene accordingly, the R packages KEGGgraph and RBGL can be used to calculate, from these weight coefficients, the topology coefficient of each gene on each gene pathway. The calculated topology coefficients then form the gene pathway topology coefficient matrix.
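To make the shape of such a matrix concrete, the following Python sketch assembles a gene-by-pathway matrix from per-pathway coefficients. It is an assumption-laden illustration: it does not reproduce what KEGGgraph and RBGL compute, the coefficient values are taken as given inputs, the pathway and gene names are invented, and using 0 for a gene that does not appear on a pathway is a choice made for this example.

```python
def build_topology_matrix(pathway_coefficients):
    """Assemble a gene-by-pathway matrix from {pathway: {gene: coefficient}}.

    Rows follow the sorted gene list and columns the sorted pathway list.
    A gene absent from a pathway gets 0 (an illustrative convention).
    """
    genes = sorted({g for coeffs in pathway_coefficients.values() for g in coeffs})
    pathways = sorted(pathway_coefficients)
    matrix = [
        [pathway_coefficients[p].get(g, 0) for p in pathways]
        for g in genes
    ]
    return genes, pathways, matrix


# Hypothetical coefficients for two pathways.
pathway_coefficients = {
    "pathway_1": {"GENE_A": 1, "GENE_B": -1},
    "pathway_2": {"GENE_B": 2, "GENE_C": -2},
}

genes, pathways, matrix = build_topology_matrix(pathway_coefficients)
```

In this layout the same gene can carry different coefficients on different pathways, as GENE_B does here.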
The resulting gene pathway topology coefficient matrix takes the form shown in Table 2:
Table 2
It should be noted that, in practice, the execution order of step 102, step 103, and step 104 is not limited to the order described above. In a specific implementation, step 102 can be executed first, step 103 can be executed first, or step 104 can be executed first, and the three steps can also be executed at the same time. No specific restriction is placed here on the execution order of step 102, step 103, and step 104.
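Because the three steps may run in any order or at the same time, they can be scheduled concurrently. The Python sketch below illustrates this with the standard library; the three functions are stand-ins that only return labels and do not implement the steps, and their names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


# Stand-in functions: each represents one step and only returns a label.
def step_102_fold_changes():
    return "fold-change data"


def step_103_coexpression_units():
    return "co-expression units"


def step_104_topology_matrix():
    return "topology coefficient matrix"


steps = {
    "step_102": step_102_fold_changes,
    "step_103": step_103_coexpression_units,
    "step_104": step_104_topology_matrix,
}

# Submit all three steps at once, then wait for every result before scoring.
with ThreadPoolExecutor(max_workers=len(steps)) as pool:
    futures = {name: pool.submit(fn) for name, fn in steps.items()}
    results = {name: future.result() for name, future in futures.items()}
```

Step 105 would start only after all three results are available, since it consumes the outputs of all of them.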
Step 105: determine a scoring result for the compound on each gene pathway from the differential expression fold-change data, the gene co-expression units, and the gene pathway topology coefficient matrix. The scoring result is used to evaluate the activation effect of the compound on the gene pathway.
After the differential expression fold-change data, the gene co-expression units, and the gene pathway topology coefficient matrix have been obtained, the IPANDA method can be applied to them to calculate the scoring result of the compound on each gene pathway. This scoring result is used to evaluate the activation effect of the compound on the gene pathway.
The scoring results finally determined for the compound on each gene pathway are shown in Table 3:
Table 3
Here, a positive value indicates that the compound strengthens the corresponding gene pathway, and a negative value indicates that the compound weakens it. The larger the absolute value, the stronger the effect.
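To illustrate how the three inputs combine into a signed score, the following Python sketch computes a simplified weighted sum for one pathway. It is explicitly not the IPANDA formula, which the text does not spell out: the assumption here is that genes in the same co-expression unit share the mean fold change of their unit, and that the score is the sum of coefficient times fold change over the genes on the pathway. All names and values are hypothetical.

```python
def pathway_score(fold_changes, units, coefficients):
    """Simplified signed score for one pathway (not the IPANDA formula).

    fold_changes: {gene: log fold change}
    units:        {unit name: [member genes]} co-expression units
    coefficients: {gene: topology coefficient on this pathway}
    """
    unit_of = {gene: unit for unit, members in units.items() for gene in members}
    unit_mean = {
        unit: sum(fold_changes[g] for g in members) / len(members)
        for unit, members in units.items()
    }
    score = 0.0
    for gene, coeff in coefficients.items():
        # Genes in a co-expression unit contribute their unit's mean fold change.
        if gene in unit_of:
            value = unit_mean[unit_of[gene]]
        else:
            value = fold_changes[gene]
        score += coeff * value
    return score


fold_changes = {"GENE_A": 1.0, "GENE_B": 3.0, "GENE_C": -2.0}
units = {"unit_1": ["GENE_A", "GENE_B"]}
coefficients = {"GENE_A": 1, "GENE_B": -1, "GENE_C": 2}

score = pathway_score(fold_changes, units, coefficients)
```

In this toy case the score is -4.0: the promoting and inhibiting genes of unit_1 cancel out, and the down-regulated GENE_C with coefficient 2 makes the total negative, which under the sign convention above would be read as a weakening effect on the pathway.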
It should be understood that, in practice, the scoring result of each compound study group is determined from that group's transcriptome data, and it indicates the activation effect on the gene pathways of the compound type, compound dose, and/or compound exposure time used in that study group.
When determining the gene pathway topology coefficient matrix, the method for evaluating the activation effect of a compound on gene pathways provided by the embodiments of this application takes into account the promoting, inhibiting, phosphorylating, and dephosphorylating roles that genes play in a gene pathway. In other words, the role of each gene in the pathway is assessed accurately while the matrix is being determined. As a result, the scoring result of the compound on a gene pathway, subsequently determined from the differential expression fold-change data, the gene co-expression units, and the gene pathway topology coefficient matrix, characterizes the activation effect of the compound on the gene pathway more accurately.
Corresponding to the method described above for evaluating the activation effect of a compound on a gene pathway, the embodiments of the present application also provide a device for evaluating the activation effect of a compound on a gene pathway.
Referring to Fig. 2, Fig. 2 is a schematic structural diagram of a device for evaluating the activation effect of a compound on a gene pathway provided by an embodiment of the present application. As shown in Fig. 2, the device includes:
a transcriptome data obtaining module 201, configured to obtain transcriptome data of a control group and transcriptome data of a compound study group;
a transcriptional differential expression fold-change data obtaining module 202, configured to obtain transcriptional differential expression fold-change data according to the transcriptome data of the control group and the transcriptome data of the compound study group;
a gene co-expression unit obtaining module 203, configured to perform clustering on related genes so that co-expressed genes are clustered into the same group, to obtain a plurality of gene co-expression units;
a gene pathway topology coefficient matrix obtaining module 204, configured to obtain gene pathways and assign a weight coefficient to each gene in a gene pathway according to the role the gene plays in the gene pathway, to obtain a gene pathway topology coefficient matrix; the role a gene plays in a gene pathway includes: promotion, inhibition, phosphorylation and dephosphorylation;
a scoring module 205, configured to determine a scoring result of the compound on each gene pathway according to the transcriptional differential expression fold-change data, the gene co-expression units and the gene pathway topology coefficient matrix; the scoring result is used to evaluate the activation effect of the compound on the gene pathway.
Optionally, the gene pathway topology coefficient matrix obtaining module 204 is specifically configured to:
set the weight coefficient of a gene that promotes the gene pathway to +1, and set the weight coefficient of a gene that inhibits the gene pathway to -1;
set the weight coefficient of a gene that has a phosphorylation effect on the gene pathway to +2, and set the weight coefficient of a gene that has a dephosphorylation effect on the gene pathway to -2.
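The weight assignment above amounts to a lookup from role to coefficient, as the following minimal sketch shows. The four values come from the text; the role labels, the function name `assign_weights` and the gene names in the usage note are hypothetical.

```python
# Weight coefficients as stated in the text; the role labels are illustrative.
ROLE_WEIGHTS = {
    "promotion": 1,
    "inhibition": -1,
    "phosphorylation": 2,
    "dephosphorylation": -2,
}

def assign_weights(gene_roles):
    """Return gene -> weight coefficient for one pathway.

    gene_roles maps each gene in the pathway to the role it plays there.
    An unknown role raises KeyError instead of silently receiving a weight.
    """
    return {gene: ROLE_WEIGHTS[role] for gene, role in gene_roles.items()}
```

For example, `assign_weights({"geneA": "promotion", "geneB": "dephosphorylation"})` yields `{"geneA": 1, "geneB": -2}`. How these per-gene weights are turned into topology coefficients is left to the KEGGgraph and RBGL packages named in the text and is not reproduced here.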
Optionally, the gene pathway topology coefficient matrix obtaining module 204 is specifically configured to:
calculate, according to the weight coefficient of each gene, the topology coefficient of the gene on each gene pathway by using the R packages KEGGgraph and RBGL.
Optionally, the gene co-expression unit obtaining module 203 is specifically configured to:
perform a first clustering on the co-expressed genes, and perform a second clustering on the result of the first clustering, to obtain the gene co-expression units.
Optionally, the gene co-expression unit obtaining module 203 is specifically configured to:
use a density-based clustering method and/or a hierarchical clustering method.
Optionally, the density-based clustering method includes: DBSCAN, OPTICS;
the hierarchical clustering method includes: BIRCH.
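The two-stage clustering can be sketched in plain Python. The first stage is a small textbook DBSCAN. The second stage is a simple single-linkage merge of the first-stage cluster centroids, used here as a stand-in for a hierarchical method; it is not BIRCH. The pairing of a density-based first pass with a hierarchical second pass, the parameters `eps`, `min_pts` and `merge_dist`, and all function names are assumptions made for illustration.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: one label per point, cluster id >= 0 or -1 for noise."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # noise for now; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # core point: keep growing the cluster
    return labels

def centroid(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def merge_clusters(centroids, merge_dist):
    """Single-linkage merge of centroids: first-stage id -> merged unit id."""
    ids = sorted(centroids)
    parent = {c: c for c in ids}

    def find(c):
        while parent[c] != c:
            c = parent[c]
        return c

    for a in ids:
        for b in ids:
            if a < b and math.dist(centroids[a], centroids[b]) <= merge_dist:
                parent[find(b)] = find(a)
    return {c: find(c) for c in ids}

def coexpression_units(profiles, eps, min_pts, merge_dist):
    """profiles: gene -> expression vector. Returns sorted lists of gene names."""
    genes = sorted(profiles)
    labels = dbscan([profiles[g] for g in genes], eps, min_pts)
    groups = {}
    for gene, label in zip(genes, labels):
        if label != -1:  # genes labelled as noise join no unit
            groups.setdefault(label, []).append(gene)
    cents = {lab: centroid([profiles[g] for g in ms]) for lab, ms in groups.items()}
    mapping = merge_clusters(cents, merge_dist)
    units = {}
    for lab, members in groups.items():
        units.setdefault(mapping[lab], []).extend(members)
    return sorted(sorted(unit) for unit in units.values())
```

The second pass lets two tight first-stage clusters that lie close together end up in one co-expression unit, while genes that DBSCAN marks as noise are left out of every unit.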
In the device for evaluating the activation effect of a compound on a gene pathway provided by the embodiments of the present application, the promotion, inhibition, phosphorylation and dephosphorylation roles that genes play in a gene pathway are considered together when the gene pathway topology coefficient matrix is determined; that is, the role each gene plays in the gene pathway is evaluated accurately during that determination. This in turn ensures that the scoring result of the compound on the gene pathway, subsequently determined from the transcriptional differential expression fold-change data, the gene co-expression units and the gene pathway topology coefficient matrix, characterizes the activation effect of the compound on the gene pathway more accurately.
The present application also provides equipment for evaluating the activation effect of a compound on a gene pathway. The equipment may specifically be a server or a terminal device; the equipment is described below taking a server as an example.
Referring to Fig. 3, Fig. 3 is a schematic structural diagram of a server for evaluating the activation effect of a compound on a gene pathway provided by an embodiment of the present application. The server 300 may vary considerably depending on its configuration or performance, and may include one or more central processing units (CPUs) 322 (for example, one or more processors), a memory 332, and one or more storage media 330 (for example, one or more mass storage devices) that store application programs 342 or data 344. The memory 332 and the storage medium 330 may provide transient or persistent storage. The program stored in the storage medium 330 may include one or more modules (not marked in the figure), and each module may include a series of instruction operations on the server. Further, the central processing unit 322 may be configured to communicate with the storage medium 330 and to execute, on the server 300, the series of instruction operations in the storage medium 330.
The server 300 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™ and so on.
The steps performed by the server in the above embodiments may be based on the server structure shown in Fig. 3.
The CPU 322 is configured to execute the following steps:
obtaining transcriptome data of a control group and transcriptome data of a compound study group;
obtaining transcriptional differential expression fold-change data according to the transcriptome data of the control group and the transcriptome data of the compound study group;
performing clustering on related genes so that co-expressed genes are clustered into the same group, to obtain a plurality of gene co-expression units;
obtaining gene pathways, and assigning a weight coefficient to each gene in a gene pathway according to the role the gene plays in the gene pathway, to obtain a gene pathway topology coefficient matrix; the role a gene plays in a gene pathway includes: promotion, inhibition, phosphorylation and dephosphorylation;
determining a scoring result of the compound on each gene pathway according to the transcriptional differential expression fold-change data, the gene co-expression units and the gene pathway topology coefficient matrix; the scoring result is used to evaluate the activation effect of the compound on the gene pathway.
Optionally, the CPU 322 may also execute the method steps of any specific implementation of the method for evaluating the activation effect of a compound on a gene pathway shown in Fig. 2.
The embodiments of the present application also provide a computer-readable storage medium for storing program code, the program code being used to execute any one of the implementations of the method for evaluating the activation effect of a compound on a gene pathway described in the foregoing embodiments.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
As stated above, the above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some of the technical features therein; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.
Claims (10)
1. A method for evaluating the activation effect of a compound on a gene pathway, characterized in that the method comprises:
obtaining transcriptome data of a control group and transcriptome data of a compound study group;
obtaining transcriptional differential expression fold-change data according to the transcriptome data of the control group and the transcriptome data of the compound study group;
performing clustering on related genes so that co-expressed genes are clustered into the same group, to obtain a plurality of gene co-expression units;
obtaining gene pathways, and assigning a weight coefficient to each gene in a gene pathway according to the role the gene plays in the gene pathway, to obtain a gene pathway topology coefficient matrix; the role a gene plays in a gene pathway includes: promotion, inhibition, phosphorylation and dephosphorylation;
determining a scoring result of the compound on each gene pathway according to the transcriptional differential expression fold-change data, the gene co-expression units and the gene pathway topology coefficient matrix; the scoring result is used to evaluate the activation effect of the compound on the gene pathway.
2. The method according to claim 1, characterized in that assigning a weight coefficient to each gene in a gene pathway according to the role the gene plays in the gene pathway comprises:
setting the weight coefficient of a gene that promotes the gene pathway to +1, and setting the weight coefficient of a gene that inhibits the gene pathway to -1;
setting the weight coefficient of a gene that has a phosphorylation effect on the gene pathway to +2, and setting the weight coefficient of a gene that has a dephosphorylation effect on the gene pathway to -2.
3. The method according to claim 1 or 2, characterized in that obtaining the gene pathway topology coefficient matrix comprises:
calculating, according to the weight coefficient of each gene, the topology coefficient of the gene on each gene pathway by using the R packages KEGGgraph and RBGL.
4. The method according to claim 1, characterized in that performing clustering on the genes so that co-expressed genes are clustered into the same group, to obtain a plurality of gene co-expression units, comprises:
performing a first clustering on the co-expressed genes, and performing a second clustering on the result of the first clustering, to obtain the gene co-expression units.
5. The method according to claim 1, characterized in that performing clustering on the genes so that co-expressed genes are clustered into the same group, to obtain a plurality of gene co-expression units, comprises:
using a density-based clustering method and/or a hierarchical clustering method.
6. The method according to claim 5, characterized in that the density-based clustering method includes: DBSCAN, OPTICS;
the hierarchical clustering method includes: BIRCH.
7. A device for evaluating the activation effect of a compound on a gene pathway, characterized in that the device comprises:
a transcriptome data obtaining module, configured to obtain transcriptome data of a control group and transcriptome data of a compound study group;
a transcriptional differential expression fold-change data obtaining module, configured to obtain transcriptional differential expression fold-change data according to the transcriptome data of the control group and the transcriptome data of the compound study group;
a gene co-expression unit obtaining module, configured to perform clustering on related genes so that co-expressed genes are clustered into the same group, to obtain a plurality of gene co-expression units;
a gene pathway topology coefficient matrix obtaining module, configured to obtain gene pathways and assign a weight coefficient to each gene in a gene pathway according to the role the gene plays in the gene pathway, to obtain a gene pathway topology coefficient matrix; the role a gene plays in a gene pathway includes: promotion, inhibition, phosphorylation and dephosphorylation;
a scoring module, configured to determine a scoring result of the compound on each gene pathway according to the transcriptional differential expression fold-change data, the gene co-expression units and the gene pathway topology coefficient matrix; the scoring result is used to evaluate the activation effect of the compound on the gene pathway.
8. The device according to claim 7, characterized in that the gene pathway topology coefficient matrix obtaining module is specifically configured to:
set the weight coefficient of a gene that promotes the gene pathway to +1, and set the weight coefficient of a gene that inhibits the gene pathway to -1;
set the weight coefficient of a gene that has a phosphorylation effect on the gene pathway to +2, and set the weight coefficient of a gene that has a dephosphorylation effect on the gene pathway to -2.
9. Equipment for evaluating the activation effect of a compound on a gene pathway, characterized in that the equipment comprises a processor and a memory:
the memory is configured to store program code and to transmit the program code to the processor;
the processor is configured to execute, according to instructions in the program code, the steps of the method for evaluating the activation effect of a compound on a gene pathway according to any one of claims 1 to 6.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium is configured to store program code, the program code being used to execute the method for evaluating the activation effect of a compound on a gene pathway according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910142574.5A CN109801676B (en) | 2019-02-26 | 2019-02-26 | Method and device for evaluating activation effect of compound on gene pathway |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109801676A (en) | 2019-05-24 |
CN109801676B (en) | 2021-01-01 |
Family
ID=66561331
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910142574.5A Active CN109801676B (en) | 2019-02-26 | 2019-02-26 | Method and device for evaluating activation effect of compound on gene pathway |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109801676B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110444248A (en) * | 2019-07-22 | 2019-11-12 | 山东大学 | Cancer Biology molecular marker screening technique and system based on network topology parameters |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101553492A (en) * | 2006-08-31 | 2009-10-07 | 阵列生物制药公司 | RAF inhibitor compounds and methods of use thereof |
CN103093119A (en) * | 2013-01-24 | 2013-05-08 | 南京大学 | Method for recognizing significant biologic pathway through utilization of network structural information |
CN103608036A (en) * | 2011-06-19 | 2014-02-26 | 瓦克西尼私人有限公司 | Vaccine adjuvant composition comprising inulin particles |
US20150073719A1 (en) * | 2013-08-22 | 2015-03-12 | Genomoncology, Llc | Computer-based systems and methods for analyzing genomes based on discrete data structures corresponding to genetic variants therein |
CN104968646A (en) * | 2012-12-13 | 2015-10-07 | 葛兰素史密斯克莱有限责任公司 | Enhancer of Zeste homolog 2 inhibitors |
US20170277826A1 (en) * | 2016-03-27 | 2017-09-28 | Insilico Medicine, Inc. | System, method and software for robust transcriptomic data analysis |
CN108763864A (en) * | 2018-05-04 | 2018-11-06 | 温州大学 | A method of evaluation biological pathway sample state |
CN108753915A (en) * | 2018-05-12 | 2018-11-06 | 内蒙古农业大学 | The assay method of millet enzymatic activity |
US20190030078A1 (en) * | 2017-07-25 | 2019-01-31 | Insilico Medicine, Inc. | Multi-stage personalized longevity therapeutics |
WO2019034576A1 (en) * | 2017-08-18 | 2019-02-21 | Koninklijke Philips N.V. | Methods for sequencing biomolecules |
Non-Patent Citations (2)
Title |
---|
AFSANEH MOHAMMADNEJAD ET AL: "Weighted gene co-expression network analysis of microarray mRNA expression profiling in response to electroacupuncture", 《2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM)》 * |
沈慧丽等: "HER-2阴性乳腺癌新靶向基因的生物信息学分析", 《基因组学与应用生物学》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110444248A (en) * | 2019-07-22 | 2019-11-12 | 山东大学 | Cancer Biology molecular marker screening technique and system based on network topology parameters |
CN110444248B (en) * | 2019-07-22 | 2021-09-24 | 山东大学 | Cancer biomolecule marker screening method and system based on network topology parameters |
Also Published As
Publication number | Publication date |
---|---|
CN109801676B (en) | 2021-01-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||