CN107220526B - Method for identifying gene pathway based on PADOG - Google Patents

Method for identifying gene pathway based on PADOG

Publication number
CN107220526B
Authority
CN
China
Prior art keywords
gene
channel
genes
signal
frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710300900.1A
Other languages
Chinese (zh)
Other versions
CN107220526A (en
Inventor
刘文斌
沈良忠
昝乡镇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou University
Original Assignee
Guangzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou University
Priority to CN201710300900.1A
Publication of CN107220526A
Application granted
Publication of CN107220526B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00: ICT specially adapted for sequence analysis involving nucleotides or amino acids

Abstract

The embodiment of the invention discloses a PADOG-based method for identifying gene pathways. The method comprises: obtaining a sample and determining its signaling pathways and the genes they contain; ranking the genes contained in all the signaling pathways, and determining the gene frequency and the gene out-degree of each gene; determining the gene frequency weight and the correction score of each gene, and calculating the pathway score of each signaling pathway; determining the gene out-degree weight of each ranked gene; selecting the gene out-degree weights of the genes in the same signaling pathway, revising the pathway score of that pathway according to those weights, ranking the revised pathway scores, and determining that the signaling pathway with the largest revised pathway score is the most likely to have changed. By implementing the invention, the greater importance of genes that regulate many genes in a pathway, compared with genes that regulate only a few, is taken into account, so that the accuracy of pathway identification is improved.

Description

Method for identifying gene pathway based on PADOG
Technical Field
The invention relates to the technical field of systems biology research, in particular to a PADOG-based method for identifying gene pathways.
Background
High-throughput microarray technology generates a large amount of gene expression data. How to gain insight from these data, and thereby better understand the mechanisms of life phenomena, remains a serious challenge for scientists around the world. A biological pathway is a set of interactions among a group of genes that together fulfill a specific function; the main types are signaling pathways and metabolic pathways. In a signaling pathway, a node represents a gene (or gene product) and an edge represents a signal that is transduced from one gene to another. In a metabolic pathway, nodes represent biochemical compounds and edges represent biochemical reactions between compounds, catalyzed by enzymes that are encoded by genes. Common pathway databases are KEGG and Reactome, which provide a visual format for the interactions between genes. Over the past decade, researchers have developed many gene pathway identification methods to identify pathways associated with various cancers or diseases.
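To make the signaling-pathway representation above concrete (nodes are genes, directed edges are signals passed from one gene to another), a minimal Python sketch could store a pathway as an adjacency mapping. The pathway, the gene names, and the helper functions are hypothetical and serve only as illustration.

```python
# Hypothetical toy pathway: each key is a gene, each value is the set of
# downstream genes it signals to (directed edges).
toy_pathway = {
    "A": {"B", "C"},
    "B": {"D"},
    "C": {"D"},
    "D": set(),
}


def genes_in(pathway):
    """Return every gene that appears as a node, upstream or downstream."""
    genes = set(pathway)
    for targets in pathway.values():
        genes |= targets
    return genes


def edges_of(pathway):
    """Return the directed edges as sorted (source, target) pairs."""
    return sorted(
        (src, dst) for src, targets in pathway.items() for dst in targets
    )
```

A metabolic pathway could be stored the same way, with compounds as keys and reactions as edges.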
From the perspective of systems biology, the interactions between genes and the changes in their dynamics are the main causes of various diseases and cancers, so gene pathway identification has become a common method for identifying cancer-related pathways. Since the topological features of a pathway reflect the position, importance and interactions of its genes, pathway identification should take into account the various kinds of information carried by the genes in the pathway, such as their upstream and downstream positions, the number of genes they regulate, and the functional relationships between genes.
In 2005, PNAS published two important pathway analysis methods. One is the function-based significant pathway analysis method proposed by Tian et al., which jointly considers the significance of the difference between the expression of genes inside a gene set and that of genes outside the set (row permutation) and the significance of the correlation between the expression of the gene set and the phenotype (column permutation). The other is the well-known gene set enrichment analysis (GSEA) method proposed by Subramanian et al., whose main idea is to rank all genes according to the correlation between their expression and a given phenotype, and then to score a given pathway P by the degree to which its Kolmogorov-Smirnov statistic approaches an extreme value in the ranked list. In this method, the significance of the Kolmogorov-Smirnov statistic is determined by column permutation of the samples. In 2006, Zahn et al. used the Van der Waerden statistic instead of the Kolmogorov-Smirnov statistic and replaced the permutation test with bootstrap sampling, which takes into account the correlation between the expression levels of pairs of genes in the pathway as well as correlations with other factors. In the same year, Efron et al. used the max-mean statistic instead of the Kolmogorov-Smirnov statistic to calculate the pathway score, normalized the score by row permutation, and finally tested the significance of the pathway score by column permutation; this is the well-known gene set analysis (GSA) method.
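The ranked-list scoring idea described for GSEA can be illustrated with a simplified, unweighted Kolmogorov-Smirnov-style running sum. This Python sketch is an assumption-laden illustration rather than the published GSEA procedure, which uses a correlation-weighted statistic and permutation testing for significance; the function name and inputs are hypothetical.

```python
def running_sum_es(ranked_genes, gene_set):
    """Unweighted KS-style enrichment score for one gene set.

    Walk down the ranked list, stepping up by 1/n_hit for genes in the set
    and down by 1/n_miss otherwise; return the largest deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best
```

A positive score means the set's genes cluster near the top of the ranking, a negative score near the bottom.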
Building on the gene set enrichment analysis method GSEA and the gene set analysis method GSA described above, researchers have also proposed the signaling pathway impact analysis method SPIA and the overlapping-gene down-weighting method PADOG. SPIA considers only the influence of the upstream and downstream positions of genes on the propagation of a perturbation signal; it ignores the fact that genes which regulate many genes in a pathway are more important than genes which regulate only a few, and that their differential expression has a greater influence on the function of the pathway. PADOG, which builds on the GSA method, reduces the influence of "common genes" that appear frequently in many pathways, but it likewise does not take into account that genes which regulate many genes in a pathway are more important than genes which regulate only a few.
Therefore, it is necessary to take into account the greater importance of genes that regulate many genes in a pathway, compared with genes that regulate only a few, and to improve the accuracy of pathway identification on this basis.
Disclosure of Invention
The embodiment of the invention aims to provide a PADOG-based method for identifying gene pathways that takes into account the greater importance of genes that regulate many genes in a pathway, compared with genes that regulate only a few, thereby improving the accuracy of pathway identification.
In order to solve the above technical problem, an embodiment of the present invention provides a PADOG-based method for identifying gene pathways, the method including:
a. obtaining a sample, determining the signaling pathways of the sample and the genes contained in each signaling pathway, ranking the genes contained in all the signaling pathways according to the correlation between each gene and a phenotype, and then determining the gene frequency and the gene out-degree of each ranked gene; wherein the gene frequency is the total number of times a gene appears in the determined signaling pathways, and the gene out-degree is the number of downstream genes that the gene regulates in the determined signaling pathways;
b. finding the maximum gene frequency and the minimum gene frequency among the determined gene frequencies, and determining the gene frequency weight of each gene according to the maximum and minimum gene frequencies found;
c. determining the total number of genes contained in each signaling pathway and the correction score of each ranked gene, and calculating the pathway score of each signaling pathway according to the total number of genes it contains, the correction score of each ranked gene and the corresponding gene frequency weight;
d. finding the maximum gene out-degree and the minimum gene out-degree among the determined gene out-degrees, and calculating the gene out-degree weight of each gene according to its gene out-degree and the maximum and minimum gene out-degrees found;
e. selecting the gene out-degree weights of the genes contained in the same signaling pathway, revising the calculated pathway score of that signaling pathway according to those weights, ranking the revised pathway scores of all signaling pathways, and determining that the signaling pathway with the largest revised pathway score is the most likely to have changed.
Wherein, step b specifically comprises:
acquiring the gene frequency of each gene, and finding the maximum gene frequency max(f) and the minimum gene frequency min(f);
according to the formula
[Formula image BDA0001284271870000041, not reproduced in this text]
obtaining the gene frequency weight of each gene; wherein f(g_j) is the gene frequency of the ranked gene g_j, and w_f(g_j) is the gene frequency weight of the ranked gene g_j.
Wherein the calculation of the pathway score of each signaling pathway in step c is performed according to the formula
[Formula image BDA0001284271870000042, not reproduced in this text]
wherein ES_0(S) is the pathway score of the signaling pathway S in which the ranked gene g_j is located; M is the total number of genes contained in the signaling pathway S; and T(g_j) is the correction score of the ranked gene g_j.
Wherein, step d specifically comprises:
acquiring the gene out-degree of each gene, and finding the maximum gene out-degree max(d) and the minimum gene out-degree min(d);
according to the formula
[Formula image BDA0001284271870000051, not reproduced in this text]
obtaining the gene out-degree weight of each gene; wherein d(g_j) is the gene out-degree of the ranked gene g_j, and w_d(g_j) is the gene out-degree weight of the ranked gene g_j.
Wherein the gene out-degree weight of each gene lies in the range [1, 2].
Wherein, step e specifically comprises:
selecting the gene out-degree weights of the genes contained in the same signaling pathway and multiplying them together, the resulting product being used as the correction coefficient of that signaling pathway;
and multiplying the correction coefficient of each signaling pathway by the pathway score of that pathway, taking the product as the revised pathway score of the pathway, ranking the revised pathway scores of all signaling pathways, and determining that the signaling pathway with the largest revised pathway score is the most likely to have changed.
The embodiment of the invention has the following beneficial effects:
In the embodiment of the invention, the genes are ranked according to the correlation between each gene and the phenotype, and the pathway score of each signaling pathway is calculated using the gene frequency. The importance of regulatory genes is then taken fully into account: the calculated pathway score of each signaling pathway is revised using the gene out-degree of each gene, and the importance of a pathway is identified from its revised score, thereby improving the accuracy of pathway identification.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is within the scope of the present invention for those skilled in the art to obtain other drawings based on the drawings without inventive exercise.
FIG. 1 is a flowchart of a PADOG-based gene pathway identification method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in FIG. 1, an embodiment of the present invention provides a PADOG-based method for identifying gene pathways, the method including:
Step S1, obtaining a sample, determining the signaling pathways of the sample and the genes contained in each signaling pathway, ranking the genes contained in all the signaling pathways according to the correlation between each gene and the phenotype, and then determining the gene frequency and the gene out-degree of each ranked gene; wherein the gene frequency is the total number of times a gene appears in the determined signaling pathways, and the gene out-degree is the number of downstream genes that the gene regulates in the determined signaling pathways;
Specifically, a sample is obtained, the signaling pathways of the sample and the genes contained in each signaling pathway are determined, and the gene frequency distribution and the gene out-degree distribution of the genes are then determined. The frequency with which a gene occurs across pathways (i.e., the gene frequency) reflects the specificity of the gene: genes that occur in many pathways are "common genes" whose influence on any one pathway is relatively small, whereas genes that occur in only one or a few pathways are highly specific, and their differential expression has a large influence on the pathway. Similarly, the gene out-degree indicates the number of downstream genes regulated by a gene, so the larger the gene out-degree, the greater the gene's influence on the pathway.
Meanwhile, the genes contained in all the signaling pathways are also ranked according to the correlation between each gene and the phenotype, so that the correction score of each gene can be calculated. Assume that the total number of genes is N and that a given signaling pathway S contains M genes. The N genes are ordered according to the correlation r (or the t statistic) between each gene and the phenotype, giving the ranked list [g_1, ..., g_j, ..., g_N].
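Step S1 can be sketched in Python as follows. The sketch assumes that each pathway is given as a mapping from a gene to the set of downstream genes it regulates, that the out-degree counts distinct downstream genes across all pathways, and that genes are ordered by descending correlation r. None of these choices is specified in the text, and the toy data and function names are hypothetical.

```python
from collections import Counter

# Hypothetical toy input: pathway name -> {gene: set of downstream genes}.
toy_pathways = {
    "P1": {"A": {"B", "C"}, "B": {"D"}},
    "P2": {"A": {"E"}, "E": {"D"}},
}


def gene_frequency(pathways):
    """Count the number of pathways in which each gene appears."""
    freq = Counter()
    for graph in pathways.values():
        members = set(graph)
        for targets in graph.values():
            members |= targets
        freq.update(members)  # each gene counted once per pathway
    return dict(freq)


def gene_out_degree(pathways):
    """Count distinct downstream genes regulated by each gene (assumed union
    over all pathways)."""
    downstream = {}
    for graph in pathways.values():
        for gene, targets in graph.items():
            downstream.setdefault(gene, set()).update(targets)
            for target in targets:
                downstream.setdefault(target, set())
    return {gene: len(targets) for gene, targets in downstream.items()}


def rank_genes(correlation):
    """Order genes by descending correlation with the phenotype (assumed)."""
    return sorted(correlation, key=lambda gene: correlation[gene], reverse=True)
```

With the toy input, gene A appears in both pathways and regulates three distinct genes, so it has frequency 2 and out-degree 3.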
Step S2, finding the maximum gene frequency and the minimum gene frequency among the determined gene frequencies, and determining the gene frequency weight of each gene according to the maximum and minimum gene frequencies found;
Specifically, the gene frequency of each gene is acquired, and the maximum gene frequency max(f) and the minimum gene frequency min(f) are found;
according to the formula
[Formula image BDA0001284271870000071, not reproduced in this text]
the gene frequency weight of each gene is obtained; wherein f(g_j) is the gene frequency of the ranked gene g_j, and w_f(g_j) is the gene frequency weight of the ranked gene g_j. This value reflects the specificity of the gene in the pathway: the larger the value, the more specific the gene, and the smaller the value, the less specific. The value of w_f(g_j) lies in the range [1, 2], i.e., the gene frequency weight of each gene lies in the range [1, 2].
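The formula for the gene frequency weight appears only as an image in the source and is not reproduced here. As a hedged sketch, the Python below assumes the weighting published for PADOG, w_f(g) = 1 + sqrt((max(f) - f(g)) / (max(f) - min(f))), which is consistent with the stated range [1, 2] and with rarer genes receiving larger weights. The function name and the handling of the case max(f) = min(f) are illustrative assumptions.

```python
import math


def frequency_weights(freq):
    """Map each gene frequency to a weight in [1, 2].

    Assumed formula: w_f(g) = 1 + sqrt((max_f - f(g)) / (max_f - min_f)).
    Genes that appear in few pathways get weights near 2; genes that appear
    in many pathways get weights near 1.
    """
    f_max = max(freq.values())
    f_min = min(freq.values())
    span = f_max - f_min
    if span == 0:
        # Assumption: with identical frequencies, every gene gets weight 1.
        return {gene: 1.0 for gene in freq}
    return {
        gene: 1.0 + math.sqrt((f_max - f) / span) for gene, f in freq.items()
    }
```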
Step S3, determining the total number of genes contained in each signaling pathway and the correction score of each ranked gene, and calculating the pathway score of each signaling pathway according to the total number of genes it contains, the correction score of each ranked gene and the corresponding gene frequency weight;
Specifically, the pathway score of each signaling pathway is calculated from the weighted sum of the absolute correction scores of all genes in the pathway, according to the formula
[Formula image BDA0001284271870000081, not reproduced in this text]
wherein ES_0(S) is the pathway score of the signaling pathway S in which the ranked gene g_j is located; M is the total number of genes contained in the signaling pathway S; and T(g_j) is the correction score of the ranked gene g_j.
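Because the pathway-score formula is available only as an image, the following Python sketch reconstructs it from the prose alone: it assumes ES_0(S) is the sum of |T(g_j)| * w_f(g_j) over the M genes of S, divided by M. The division by M is an assumption based on M being named as an input; the function and argument names are hypothetical.

```python
def pathway_score(genes, t_score, w_f):
    """Frequency-weighted mean of absolute correction scores for one pathway.

    genes   : iterable of the genes contained in pathway S
    t_score : gene -> correction score T(g)
    w_f     : gene -> gene frequency weight w_f(g)
    """
    genes = list(genes)
    m = len(genes)
    if m == 0:
        raise ValueError("pathway contains no genes")
    return sum(abs(t_score[g]) * w_f[g] for g in genes) / m
```

Taking absolute values means that strongly up-regulated and strongly down-regulated genes both raise the score.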
Step S4, acquiring the gene out-degree of each gene, finding the maximum gene out-degree and the minimum gene out-degree, and calculating the gene out-degree weight of each gene according to its gene out-degree and the maximum and minimum gene out-degrees found;
Specifically, the gene out-degree of each gene is acquired, and the maximum gene out-degree max(d) and the minimum gene out-degree min(d) are found among the acquired values;
according to the formula
[Formula image BDA0001284271870000082, not reproduced in this text]
the gene out-degree weight of each gene is obtained; wherein d(g_j) is the gene out-degree of the ranked gene g_j, and w_d(g_j) is the gene out-degree weight of the ranked gene g_j. This value reflects the importance of the gene in the pathway: the larger the value, the more important the gene, and the smaller the value, the less important. The value of w_d(g_j) lies in the range [1, 2], i.e., the gene out-degree weight of each gene lies in the range [1, 2].
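The out-degree weight formula is likewise available only as an image. The Python sketch below assumes, by analogy with the frequency weight, a min-max scaling with a square root, w_d(g) = 1 + sqrt((d(g) - min(d)) / (max(d) - min(d))), which satisfies the two stated properties: the range is [1, 2] and a larger out-degree gives a larger weight. The exact functional form in the source may differ.

```python
import math


def out_degree_weights(out_degree):
    """Map each gene out-degree to a weight in [1, 2].

    Assumed formula: w_d(g) = 1 + sqrt((d(g) - min_d) / (max_d - min_d)).
    Genes that regulate many downstream genes get weights near 2.
    """
    d_max = max(out_degree.values())
    d_min = min(out_degree.values())
    span = d_max - d_min
    if span == 0:
        # Assumption: with identical out-degrees, every gene gets weight 1.
        return {gene: 1.0 for gene in out_degree}
    return {
        gene: 1.0 + math.sqrt((d - d_min) / span)
        for gene, d in out_degree.items()
    }
```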
Step S5, selecting the gene out-degree weights of the genes contained in the same signaling pathway, revising the calculated pathway score of that signaling pathway according to those weights, ranking the revised pathway scores of all signaling pathways, and determining that the signaling pathway with the largest revised pathway score is the most likely to have changed.
Specifically, the gene out-degree weights of the genes contained in the same signaling pathway are selected and multiplied together, and the resulting product is used as the correction coefficient of that signaling pathway;
the correction coefficient of each signaling pathway is then multiplied by the pathway score of that pathway, and the product is taken as the revised pathway score of the pathway. The revised pathway scores of all signaling pathways are ranked, and the signaling pathway with the largest revised pathway score is determined to be the most likely to have changed; that is, the higher a pathway's revised score ranks, the more the pathway merits further study.
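Step S5 as described can be sketched in Python: the correction coefficient of a pathway is the product of the out-degree weights of its genes, the revised score is that coefficient times the pathway score, and pathways are ranked by revised score in descending order. The function and argument names are hypothetical.

```python
import math


def revise_and_rank(pathway_genes, base_score, w_d):
    """Revise each pathway score by its correction coefficient and rank.

    pathway_genes : pathway name -> iterable of genes in that pathway
    base_score    : pathway name -> pathway score ES_0(S)
    w_d           : gene -> gene out-degree weight
    Returns (pathway name, revised score) pairs, largest score first.
    """
    revised = {}
    for name, genes in pathway_genes.items():
        # Correction coefficient: product of the out-degree weights.
        coefficient = math.prod(w_d[g] for g in genes)
        revised[name] = coefficient * base_score[name]
    return sorted(revised.items(), key=lambda item: item[1], reverse=True)
```

Since every weight is at least 1, the coefficient can only raise a score, and it tends to grow with the number of genes in the pathway; this follows directly from the product form described in the text.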
It will be understood by those skilled in the art that all or part of the steps in the method for implementing the above embodiments may be implemented by relevant hardware instructed by a program, and the program may be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (5)

1. A PADOG-based method for identifying gene pathways, the method comprising:
a. obtaining a sample, determining the signaling pathways of the sample and the genes contained in each signaling pathway, ranking the genes contained in all the signaling pathways according to the correlation between each gene and a phenotype, and then determining the gene frequency and the gene out-degree of each ranked gene; wherein the gene frequency is the total number of times a gene appears in the determined signaling pathways, and the gene out-degree is the number of downstream genes that the gene regulates in the determined signaling pathways;
b. finding the maximum gene frequency and the minimum gene frequency among the determined gene frequencies, and determining the gene frequency weight of each gene according to the maximum and minimum gene frequencies found;
c. determining the total number of genes contained in each signaling pathway and the correction score of each ranked gene, and calculating the pathway score of each signaling pathway according to the total number of genes it contains, the correction score of each ranked gene and the corresponding gene frequency weight;
d. finding the maximum gene out-degree and the minimum gene out-degree among the determined gene out-degrees, and calculating the gene out-degree weight of each gene according to its gene out-degree and the maximum and minimum gene out-degrees found;
e. selecting the gene out-degree weights of the genes contained in the same signaling pathway, revising the calculated pathway score of that signaling pathway according to those weights, ranking the revised pathway scores of all signaling pathways, and determining that the signaling pathway with the largest revised pathway score is the most likely to have changed;
wherein step b specifically comprises:
acquiring the gene frequency of each gene, and finding the maximum gene frequency max(f) and the minimum gene frequency min(f);
according to the formula
[Formula image FDA0002494039600000011, not reproduced in this text]
obtaining the gene frequency weight of each gene; wherein f(g_j) is the gene frequency of the ranked gene g_j, and w_f(g_j) is the gene frequency weight of the ranked gene g_j.
2. The method of claim 1, wherein the calculation of the pathway score of each signaling pathway in step c is performed according to the formula
[Formula image FDA0002494039600000021, not reproduced in this text]
wherein ES_0(S) is the pathway score of the signaling pathway S in which the ranked gene g_j is located; M is the total number of genes contained in the signaling pathway S in which the ranked gene g_j is located; and T(g_j) is the correction score of the ranked gene g_j.
3. The method according to claim 1, wherein step d specifically comprises:
acquiring the gene out-degree of each gene, and finding the maximum gene out-degree max(d) and the minimum gene out-degree min(d);
according to the formula
[Formula image FDA0002494039600000022, not reproduced in this text]
obtaining the gene out-degree weight of each gene; wherein d(g_j) is the gene out-degree of the ranked gene g_j, and w_d(g_j) is the gene out-degree weight of the ranked gene g_j.
4. The method of claim 3, wherein the gene out-degree weight of each gene lies in the range [1, 2].
5. The method according to claim 1, wherein step e specifically comprises:
selecting the gene out-degree weights of the genes contained in the same signaling pathway and multiplying them together, the resulting product being used as the correction coefficient of that signaling pathway;
and multiplying the correction coefficient of each signaling pathway by the pathway score of that pathway, taking the product as the revised pathway score of the pathway, ranking the revised pathway scores of all signaling pathways, and determining that the signaling pathway with the largest revised pathway score is the most likely to have changed.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710300900.1A CN107220526B (en) 2017-05-02 2017-05-02 Method for identifying gene pathway based on PADOG

Publications (2)

Publication Number Publication Date
CN107220526A CN107220526A (en) 2017-09-29
CN107220526B true CN107220526B (en) 2020-08-25

Family

ID=59943758

Country Status (1)

Country Link
CN (1) CN107220526B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108763862B (en) * 2018-05-04 2021-06-29 温州大学 Method for deducing gene pathway activity
CN109817337B (en) * 2019-01-30 2020-09-08 中南大学 Method for evaluating channel activation degree of single disease sample and method for distinguishing similar diseases

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102289432A (en) * 2010-06-17 2011-12-21 上海其明信息技术有限公司 Whole-cell protein and gene interacting network analysis system
WO2012122106A2 (en) * 2011-03-04 2012-09-13 H. Lee Moffitt Cancer Center And Research Institute, Inc. Compositions and methods apc, creb, and bad pathways to assess and affect cancer
CN103093119A (en) * 2013-01-24 2013-05-08 南京大学 Method for recognizing significant biologic pathway through utilization of network structural information
CN105279393A (en) * 2015-10-12 2016-01-27 厦门大学 Method for evaluating adverse drug reactions based on weighting network
CN105393253A (en) * 2012-12-28 2016-03-09 赛尔文塔公司 Quantitative assessment of biological impact using mechanistic network models
KR20160059099A (en) * 2014-11-17 2016-05-26 대한민국(농촌진흥청장) A composition for prediction of carcass weight in cow and predicting method using the same

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
Comparative study on gene set and pathway topology-based enrichment methods;Michaela Bayerlová 等;《BMC Bioinformatics》;20151022;1-15 *
Diet-induced weight loss leads to a switch in gene regulatory network control in the rectal mucosa;Ashley J. Vargas 等;《Genomics》;20160811;126-133 *
DNA调控网络分析与数字化表征;徐德文;《中国优秀硕士学位论文全文数据库 医药卫生科技辑》;20160315(第03期);E059-102 *
Down-weighting overlapping genes improves gene set analysis;Adi Laurentiu Tarca 等;《BMC Bioinformatics 2012》;20120619;1-14 *
Environmental exposure to BDE47 is associated with increased diabetes prevalence: Evidence from community-based case-control studies and an animal experiment;Zhan Zhang 等;《Scientific Reports》;20160613;1-10 *
Subpathway Analysis based on Signaling-Pathway Impact Analysis of Signaling Pathway;Xianbin Li 等;《PLOS ONE》;20150724;第10卷(第7期);1-19 *
儿童原发性肾病综合征糖皮质激素耐药机制研究进展;林娜 等;《中国现代医学杂志》;20090930;第19卷(第17期);2626-2630 *
基于动态任务优先级的网格任务调度算法研究;孟宪福 等;《大连理工大学学报》;20120331;第52卷(第2期);277-284 *

Also Published As

Publication number Publication date
CN107220526A (en) 2017-09-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20200729
Address after: 511400 No. 230 West Ring Road, Guangzhou University, Guangzhou, Guangdong, Panyu District
Applicant after: Guangzhou University
Address before: 325000 Zhejiang, Ouhai, South East Road, No. 38, Wenzhou National University Science Park Incubator
Applicant before: Wenzhou University
GR01 Patent grant