CN114093426B - Marker screening method based on gene regulation network construction - Google Patents

Publication number
CN114093426B
Authority
CN
China
Legal status
Active
Application number
CN202111330308.9A
Other languages
Chinese (zh)
Other versions
CN114093426A (en)
Inventor
黄晓然
林晓惠
东坤杰
Current Assignee
Dalian University of Technology
Original Assignee
Dalian University of Technology
Application filed by Dalian University of Technology
Priority to CN202111330308.9A
Publication of CN114093426A
Application granted
Publication of CN114093426B

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B35/00 ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
    • G16B35/20 Screening of libraries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00 ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Library & Information Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biotechnology (AREA)
  • Physiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Chemical & Material Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Genetics & Genomics (AREA)
  • Biochemistry (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a marker screening method based on gene regulatory network construction, which screens biomarkers by constructing a differential network and belongs to the technical field of biological data analysis. The method first runs a genetic algorithm globally over all tasks and then clusters the tasks according to the optimal individual of each task, so that similar tasks fall into the same class. The genetic algorithm is then run within each class, and information migration is performed within each task to further optimize the inference of regulatory relationships and obtain the final regulatory network. Finally, regulatory networks are constructed separately on normal and disease samples to obtain a differential network, and genes are screened as markers through that differential network. The core of the invention is to mine the inherent relationships between genes hidden in gene expression data by combining a floating-point-coded genetic algorithm with multitask optimization, to establish an effective gene relationship inference model, to construct a gene regulatory network, and to screen markers through the differential regulatory network.

Description

Marker screening method based on gene regulation network construction
Technical Field
The invention belongs to the technical field of biological data analysis and discloses a method that analyzes the relationships among gene expression data, infers the potential regulatory relationships among genes, constructs gene regulatory networks, builds a differential network from them, and screens biomarkers.
Background
In biological cells, genes are expressed through transcription and translation, and their expression products can activate or inhibit the expression of other genes; this is regulation among genes. The regulatory relationships between genes differ across physiological states such as health, disease, and post-operative intervention. Analyzing the regulatory relationships among genes under different physiological states and identifying how they differ therefore makes it possible to find the genes responsible for the difference in physiological state, namely biomarkers, which is of great significance for disease treatment, drug research and development, and related work.
Methods for constructing gene regulatory networks are diverse. For example, a co-expression network can be built from the Pearson correlation coefficient by calculating the degree of correlation between genes in the expression data; the partial correlation coefficient measures the correlation of two genes after removing the effects of other genes; and mutual information measures the relationship between two genes in expression data containing nonlinear relationships. In addition, the genetic algorithm (Genetic Algorithm, GA) can be used for gene regulatory network inference. Given n genes, the GA treats each gene in turn as the regulated gene (i.e., the target gene) and searches the other n-1 genes for the combination of genes that regulates it, through a series of crossover, mutation, and selection operations. This approach has achieved good results on many gene regulatory network construction problems.
The invention approaches the problem from the perspective of multitask optimization: the inference of the regulatory relationships of each target gene is treated as a separate task, and the similarity between tasks is fully considered and exploited while inferring the gene regulatory network, so as to improve the accuracy and efficiency of inference. A genetic algorithm is used to solve the resulting multitask optimization problem, and the idea of transfer learning is introduced so that information is transferred more effectively between tasks and the gene regulatory network is inferred more accurately. Finally, the method is used to construct regulatory networks on normal and disease samples separately, a differential regulatory network is obtained from the difference between the two networks, and the corresponding markers are screened through that network.
Disclosure of Invention
The invention aims to infer a gene regulatory network with a genetic algorithm based on the idea of multitask optimization, and thereby to construct a differential network and screen biomarkers. In the implementation, the inference of the regulatory relationships corresponding to each target gene is regarded as one task, and the genetic algorithms of the tasks are run and optimized in parallel.
In order to achieve the above object, the present invention adopts the following technical scheme:
The marker screening method based on gene regulation network construction comprises the following steps:
Let $F=\{f_1,f_2,\dots,f_n\}$ denote the gene set, where $n$ is the number of genes, and let $X=\{x_1,x_2,\dots,x_m\}$ denote the sample set, where $m$ is the number of samples. Let $V=(V_1,V_2,\dots,V_i,\dots,V_n)$ denote the expression levels of the $n$ genes over the $m$ samples, where $V_i=(v_{i1},v_{i2},\dots,v_{im})$ is the expression level of the $i$-th gene over the $m$ samples.
(1) Randomly initializing a population of size N, i.e. generating a number N of individuals, each individual comprising the following information:
① Factor cost (Factorial Cost): the factor cost of individual $p_i$ on task $T_j$ is the objective function value of $p_i$ on $T_j$;
② Factor grade (Factorial Rank): for task $T_j$, the individuals are sorted by factor cost, and the factor grade of individual $p_i$ on task $T_j$ is defined as the index of $p_i$ in the sorted list;
③ Skill factor (Skill Factor): among all tasks, the index of the task on which individual $p_i$ performs best is defined as the skill factor $\tau_i$ of $p_i$, also known as its "main task"; that is, $\tau_i$ is the index $j$ for which the factor grade of $p_i$ on $T_j$ is lowest.
In addition, the dimension of an individual equals the number of genes n. Unlike the common binary encoding, each dimension holds a floating-point value that represents the weight of the gene corresponding to that dimension; the larger the weight, the more likely it is that a regulatory relationship exists between that gene and the target gene of the current individual. The factor cost, factor grade, and skill factor are updated after each iteration;
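The three per-individual quantities can be illustrated with a small sketch. It assumes that individuals are ranked in ascending order of factor cost (a smaller objective value is better) and that ties in factor grade are broken by the lowest task index; the cost table and the function names `factor_ranks` and `skill_factors` are hypothetical and do not come from the patent.

```python
def factor_ranks(costs):
    """costs[i][j] is the factor cost of individual i on task j.

    Returns ranks[i][j], the 1-based position of individual i when the
    population is sorted by ascending cost on task j (assumed ordering).
    """
    n_ind, n_task = len(costs), len(costs[0])
    ranks = [[0] * n_task for _ in range(n_ind)]
    for j in range(n_task):
        order = sorted(range(n_ind), key=lambda i: costs[i][j])
        for position, i in enumerate(order, start=1):
            ranks[i][j] = position
    return ranks


def skill_factors(ranks):
    """0-based index of the task on which each individual ranks best.

    Ties are broken by the lowest task index (an assumption).
    """
    return [min(range(len(r)), key=lambda j: r[j]) for r in ranks]


# Hypothetical factor costs: 3 individuals x 3 tasks.
costs = [
    [0.9, 0.2, 0.5],
    [0.4, 0.6, 0.1],
    [0.7, 0.3, 0.8],
]
ranks = factor_ranks(costs)
taus = skill_factors(ranks)
print(ranks)  # [[3, 1, 2], [1, 3, 1], [2, 2, 3]]
print(taus)   # [1, 0, 0]
```

Individual 0 ranks first only on task 1, so task 1 becomes its main task; individuals 1 and 2 both end up with task 0.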
(2) Randomly extracting parent individuals from the current population to perform crossover and mutation: a number of parent pairs are drawn at random from all individuals. For a pair of parents $p_a$, $p_b$, a value $h$ is generated at random in (0, 1). If $\tau_a=\tau_b$, or if $h$ is smaller than the balance factor between crossover and mutation, the two individuals are crossed to generate children $c_a$, $c_b$, each of which retains its original main task; otherwise, the two individuals are mutated to generate children $c_a$, $c_b$, each of which likewise retains its original main task.
Crossover:
$c_a=\beta p_b+(1-\beta)p_a$ (1)
$c_b=\beta p_a+(1-\beta)p_b$ (2)
where $\beta$ is the crossover constant, with a value in (0, 1).
Mutation (formula (3)):
where $k$ is the mutation constant, with a value in (0, 1), $p_{max}$ is the upper limit of the value of each dimension of the individual, $p_{min}$ is the lower limit, and $r$ is a random number generated by rand();
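The operator choice and crossover formulas (1) and (2) can be sketched as follows. The function names and the `balance` parameter name are illustrative assumptions; the parent vectors and β = 0.3 are the ones used in the worked example of the detailed description.

```python
import random


def choose_operator(tau_a, tau_b, balance, rng=random):
    """Crossover if the parents share a main task or h < balance, else mutation."""
    h = rng.random()  # h drawn uniformly from [0, 1)
    return "crossover" if tau_a == tau_b or h < balance else "mutation"


def crossover(p_a, p_b, beta):
    """Formulas (1) and (2): each child is a convex combination of the parents."""
    c_a = [beta * b + (1 - beta) * a for a, b in zip(p_a, p_b)]
    c_b = [beta * a + (1 - beta) * b for a, b in zip(p_a, p_b)]
    return c_a, c_b


p1 = [0.2, 0.4, 0.1, 0.3, 0.2]
p2 = [0.3, 0.3, 0.2, 0.5, 0.1]
c1, c2 = crossover(p1, p2, beta=0.3)
c1 = [round(x, 2) for x in c1]  # rounded only to hide float noise
c2 = [round(x, 2) for x in c2]
print(c1)  # [0.23, 0.37, 0.13, 0.36, 0.17]
print(c2)  # [0.27, 0.33, 0.17, 0.44, 0.13]
```

Because β lies in (0, 1), every component of a child lies between the corresponding components of its parents, so crossover alone never leaves the range already spanned by the pair.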
(3) Screening individuals in the current population to produce offspring: the objective function value of each individual on its main task $\tau$ is calculated. For each of the m samples, the expression values of the remaining n-1 genes are multiplied by the corresponding weights of those genes and summed; the objective function value is the sum, over the samples, of the squared residuals between the expression value of the task's target gene and this weighted sum (formula (4)).
Let the main task of individual $p_i$ be $\tau_i=T_{ma}$. Formula (4) then gives the objective function value of $p_i$ on its main task $T_{ma}$, where $V_{\setminus ma}=(V_1,V_2,\dots,V_{ma-1},V_{ma+1},\dots,V_n)$ denotes the matrix formed by the expression value vectors of the n-1 genes other than the $ma$-th gene. The smaller the objective function value of an individual on a task, the more accurately that individual infers the regulatory relationships of the task's target gene. For the optimal individual of each task, an identical copy is generated and added to the current population while the original is kept; the main task of the copy is replaced by one chosen at random from the remaining tasks, and its objective function value on the new main task is calculated. The N individuals with the highest fitness, i.e. the smallest objective function values, are then selected from the current population to form the offspring, and the remaining individuals are eliminated.
Steps (2) and (3) are iterated. After each round, the accuracy of the currently inferred regulatory relationships is measured by the mean objective function value of the current optimal individuals of all tasks. If this mean reaches the set threshold, or the number of iterations reaches its maximum, iteration stops and step (4) is executed;
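The objective function and the selection step can be sketched from the verbal definition above. Formula (4) itself is not reproduced in this text, so the code below is a reconstruction that assumes a plain, unnormalized sum of squared residuals; the expression data `V`, the two individuals, and the function names are hypothetical.

```python
def objective(individual, target, V):
    """Sum of squared residuals for one individual on one task.

    individual: list of n weights (the entry at `target` is ignored)
    target:     0-based index of the task's target gene
    V:          V[g][s] is the expression of gene g in sample s
    """
    total = 0.0
    for s in range(len(V[0])):
        predicted = sum(w * V[g][s]
                        for g, w in enumerate(individual) if g != target)
        total += (V[target][s] - predicted) ** 2
    return total


def select(population, V, N):
    """Keep the N (individual, main_task) pairs with the smallest objective."""
    return sorted(population, key=lambda it: objective(it[0], it[1], V))[:N]


# Hypothetical expression data: 3 genes x 2 samples (not the patent's Table 1).
V = [[1.0, 2.0],
     [2.0, 0.0],
     [0.0, 4.0]]
good = [0.9, 0.5, 0.5]  # predicts gene 0 exactly: 0.5*2 + 0.5*0 = 1, 0.5*0 + 0.5*4 = 2
poor = [0.0, 1.0, 0.0]  # residuals -1 and 2, so the objective is 1 + 4 = 5
survivors = select([(poor, 0), (good, 0)], V, N=1)
print(objective(good, 0, V), objective(poor, 0, V))  # 0.0 5.0
```

The weight stored at the target gene's own position never enters the sum, which is why `good` can carry an arbitrary value (0.9) there.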
(4) Clustering the tasks: clustering optimal individuals corresponding to each task by using a K-Means algorithm, so that n tasks are divided into several major classes;
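A minimal K-Means sketch for this step is shown below. It is plain Lloyd's algorithm with the first k points as initial centers, a deterministic choice made here only so that the example is reproducible; the patent does not specify the initialization, the number of classes, or the distance measure (squared Euclidean distance is assumed). The four "optimal individuals" are hypothetical.

```python
def kmeans(points, k, iterations=20):
    """Cluster points with Lloyd's algorithm; returns one label per point."""
    centers = [list(p) for p in points[:k]]  # assumed deterministic initialization
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        for idx, p in enumerate(points):
            labels[idx] = min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])),
            )
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical optimal individuals of four tasks (three genes each).
best_individuals = [
    [0.9, 0.1, 0.1],
    [0.1, 0.9, 0.8],
    [0.8, 0.2, 0.1],
    [0.2, 0.8, 0.9],
]
task_classes = kmeans(best_individuals, k=2)
print(task_classes)  # [0, 1, 0, 1]
```

Tasks 0 and 2 have similar weight vectors and land in one class, tasks 1 and 3 in the other; in the method, the genetic algorithm of step (5) would then run separately inside each class.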
(5) Extracting parent individuals from the populations of similar tasks to run the genetic evolution process: step (4) clusters all tasks so that tasks within a class are highly similar and tasks in different classes are as dissimilar as possible. In this step, steps (2) and (3) are executed again, except that the algorithm runs only within a class, and the algorithms of the different major classes run in parallel;
(6) Information migration among different dimensions within the same task: from the individuals sharing the same main task, an individual $p_r$ and a group of n-1 further individuals are drawn at random; the n-1 individuals form the individual set $S$. A dimension $l$ is selected, and the $l$-th-dimension values of the individuals in $S$ form an (n-1)-dimensional vector. The values of this vector are used in sequence to replace the n-1 dimensions of $p_r$ other than the dimension of the target gene of $p_r$'s main task, generating a new individual $q_r$. During replacement, if replacing a dimension improves the fitness of $p_r$, i.e. reduces the objective function value, the replacement is kept; otherwise it is not made and the process moves on to the next dimension. The process is illustrated in FIG. 1.
The above process is also described in pseudocode, in which $P_i$ is the current population corresponding to task $T_i$ and a separate symbol denotes the value of the $l$-th dimension of individual $p_j$;
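The dimension-wise migration can be sketched as below. The pseudocode of the original is not reproduced in this text, so this is an illustrative reading: each candidate replacement is compared against the current, partially updated individual. The vectors are those of the worked example in the detailed description; `toy_cost` is a hypothetical stand-in for formula (4) (Table 1 is not available here) that simply prefers small weights, chosen so that the accept/reject pattern matches the one described in the example.

```python
def migrate(p_r, S, l, target, cost):
    """Try to copy the l-th components of the individuals in S into p_r.

    p_r:    individual to improve (list of n weights)
    S:      list of n-1 individuals with the same main task
    l:      0-based dimension whose values are collected from S
    target: 0-based index of the main task's target gene (never replaced)
    cost:   callable returning the objective value of an individual
    """
    z = [ind[l] for ind in S]  # (n-1)-dimensional donor vector
    q = list(p_r)
    dims = [d for d in range(len(p_r)) if d != target]
    for d, value in zip(dims, z):
        trial = list(q)
        trial[d] = value
        if cost(trial) < cost(q):  # keep the replacement only if it helps
            q = trial
    return q


def toy_cost(ind, target=0):
    """Hypothetical cost, NOT formula (4): sum of squared non-target weights."""
    return sum(w * w for d, w in enumerate(ind) if d != target)


p_r = [0.2, 0.4, 0.1, 0.3, 0.2]
S = [
    [0.02, 0.12, 0.45, 0.71, 0.36],
    [0.66, 0.11, 0.56, 0.03, 0.3],
    [0.78, 0.11, 0.44, 0.01, 0.23],
    [0.02, 0.12, 0.05, 0.81, 0.16],
]
q_r = migrate(p_r, S, l=2, target=0, cost=toy_cost)  # l = 3 in 1-based terms
print(q_r)  # [0.2, 0.4, 0.1, 0.3, 0.05]
```

Only the last donor value (0.05) lowers the cost, so only the fifth dimension changes, giving the same $q_r$ as the worked example.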
(7) Acquiring the regulatory relationship matrix: for each task, the individual with the best objective function value is selected as the final screening result. Each task has its own target gene, so arranging these individuals as rows in the order of their target genes, with the columns following the order of the genes represented by each dimension, yields a matrix $D$ of individual vectors. The larger the value in row $i$, column $j$ of the matrix, the stronger the regulatory effect of gene $j$ on gene $i$, and the more likely an edge exists between the two in the constructed regulatory network;
(8) Constructing the differential network and screening markers: steps (1) to (7) are executed separately on the normal samples and the disease samples to obtain two regulatory relationship matrices $D_n$ and $D_i$. Taking the absolute value of their element-wise difference gives the differential regulation matrix $D_s$:
$D_s=|D_n-D_i|$ (5)
The larger the value in row $i$, column $j$ of $D_s$, the greater the difference in the regulatory effect of gene $j$ on gene $i$ before and after the disease, and the more likely an edge exists between the two in the constructed differential regulatory network. The greater the degree of a gene in the differential network, the more likely it is an important disease-related gene, and thus the more likely it is to be screened as a marker.
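Formula (5) and the degree-based screening can be sketched as follows. The text does not state how edges are chosen from $D_s$, so the fixed `threshold` and the treatment of edges as undirected are assumptions made for this sketch; the two 3 x 3 matrices are hypothetical.

```python
def difference_matrix(D_n, D_i):
    """Formula (5): element-wise absolute difference of two matrices."""
    return [[abs(a - b) for a, b in zip(row_n, row_i)]
            for row_n, row_i in zip(D_n, D_i)]


def degrees(D_s, threshold):
    """Degree of each gene when an undirected edge joins genes i and j
    whenever either D_s[i][j] or D_s[j][i] exceeds the threshold
    (the thresholding rule is an assumption)."""
    n = len(D_s)
    deg = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if max(D_s[i][j], D_s[j][i]) > threshold:
                deg[i] += 1
                deg[j] += 1
    return deg


# Hypothetical regulatory matrices for normal and disease samples.
D_n = [[0.0, 0.9, 0.1],
       [0.2, 0.0, 0.1],
       [0.1, 0.8, 0.0]]
D_i = [[0.0, 0.1, 0.2],
       [0.1, 0.0, 0.1],
       [0.1, 0.1, 0.0]]
D_s = difference_matrix(D_n, D_i)
deg = degrees(D_s, threshold=0.5)
ranking = sorted(range(len(deg)), key=lambda g: -deg[g])
print(deg)      # [1, 2, 1]
print(ranking)  # [1, 0, 2]
```

Gene index 1 has the highest degree in this toy differential network and would be the first marker candidate.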
The samples generally refer to tissues and organs in different physiological states, and the data refer to the expression levels of multiple genes in the different samples.
The invention has the following beneficial effects. Individuals in the genetic algorithm are encoded with continuous values while the gene regulatory network is inferred, so the strength of regulatory relationships can be inferred more accurately. The similarity between the regulatory relationships of different genes is fully considered: migrating optimal individuals between tasks widens the search range of the algorithm, reduces the probability of being trapped in a local optimum, and improves inference accuracy. In addition, before the higher-level genetic algorithm is executed, the tasks are clustered using the results of the lower-level genetic algorithm, so that similar problems are placed in the same group; this raises the similarity of tasks within a group and the difference between groups, so that in the next stage information is shared among more similar tasks, knowledge sharing is more effective, and convergence is faster. Finally, the differential network makes the change in gene regulatory activity between the two physiological states before and after the disease directly visible, improving the accuracy and efficiency of finding key disease-related gene markers.
Drawings
FIG. 1 is a diagram of information sharing among individual dimensions within a task.
Detailed Description
The following further describes embodiments of the invention with reference to the technical scheme and a set of simulation data; the data serve only to illustrate the invention and do not limit it.
Table 1 gives the simulation data of the invention, which has two labels, gene f and sample x; there are 5 genes and 3 samples.
TABLE 1 Gene expression simulation data
(1) A population is randomly initialized. Taking the individual $p_1=(0.2,0.4,0.1,0.3,0.2)$ in the population as an example, its factor cost on task $T_1$ is computed by formula (4); similarly, the factor cost of the individual $p_2=(0.3,0.3,0.2,0.5,0.1)$ on $T_1$ is computed. The factor grade of $p_1$ on $T_1$ is then lower than that of $p_2$. If, over the tasks $j=1,2,\dots,5$, $p_1$ attains its lowest factor grade on $T_2$, then the skill factor of $p_1$ is $\tau_1=2$, i.e. its main task is the regulatory relationship inference problem with $f_2$ as the target gene;
(2) Taking $p_1$, $p_2$ from step (1) as an example, with $\beta=0.3$, $k=0.2$, $r=0.6$, and rand() % 2 = 0: performing the crossover operation, formulas (1) and (2) give $c_1=(0.23,0.37,0.13,0.36,0.17)$ and $c_2=(0.27,0.33,0.17,0.44,0.13)$; performing the mutation operation, formula (3) gives $c_1=(0.224,0.4,0.136,0.312,0.52)$;
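Formula (3) itself is not reproduced in this text, so the mutation rule can only be guessed at. The sketch below assumes, purely for illustration, the upward branch $c = p + k \cdot r \cdot (p_{max} - p)$ with $p_{max}=0.4$ (the largest component of $p_1$); neither this form nor the value of $p_{max}$ is stated in the source. With $k=0.2$ and $r=0.6$ the assumption reproduces the first four printed components of the mutated individual, while for the fifth component it yields 0.224 rather than the printed 0.52.

```python
def mutate_up(p, k, r, p_max):
    """Assumed upward mutation branch: move each weight toward p_max.

    This is a guess at one branch of formula (3), not the patent's
    stated formula.
    """
    return [x + k * r * (p_max - x) for x in p]


p1 = [0.2, 0.4, 0.1, 0.3, 0.2]
mutated = mutate_up(p1, k=0.2, r=0.6, p_max=max(p1))
mutated = [round(x, 3) for x in mutated]  # rounded only to hide float noise
print(mutated)  # [0.224, 0.4, 0.136, 0.312, 0.224]
```

A symmetric downward branch toward $p_{min}$ for the other outcome of rand() % 2 would be a natural counterpart, but that too would be an assumption.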
(3) The objective function value of each individual in the current population on its main task is calculated according to formula (4) (see step (1)); new individuals are generated by copying the optimal individuals and replacing their main tasks; after each iteration, the N individuals with the highest fitness, i.e. the smallest objective function values, are selected from the population as offspring, and the remaining individuals are eliminated;
(4), (5) The optimal individuals of the tasks, i.e. those with the smallest objective function values, are clustered with the K-Means algorithm, dividing the tasks into several major classes; steps (2) and (3) are executed again within each major class, and the algorithms of the different major classes run in parallel;
(6) Suppose the individual $p_r=(0.2,0.4,0.1,0.3,0.2)$ is drawn at random from all individuals whose main task is $T_1$, together with the randomly drawn individual set $S=\{(0.02,0.12,0.45,0.71,0.36),(0.66,0.11,0.56,0.03,0.3),(0.78,0.11,0.44,0.01,0.23),(0.02,0.12,0.05,0.81,0.16)\}$. When $l=3$, the third components of the individuals in $S$ form the vector $z=(0.45,0.56,0.44,0.05)$. The objective function value of $p_r$ is computed by formula (4). If 0.45 from $z$ replaces 0.4 in $p_r$, the objective function value becomes larger, so the replacement is not made; similarly, 0.56 does not replace 0.1 and 0.44 does not replace 0.3, while 0.05 does replace 0.2. Hence $q_r=(0.2,0.4,0.1,0.3,0.05)$;
(7) For each task, the individual with the best objective function value is selected as the final screening result. Suppose that, among the individuals whose main task is $T_1$, the one with the smallest objective function value is $p_3=(0.15,0.69,0.45,0.88,0.06)$, and that the optimal individuals for $T_2$, $T_3$, $T_4$, and $T_5$ are $p_9=(0.02,0.12,0.45,0.71,0.36)$, $p_2=(0.66,0.11,0.56,0.03,0.3)$, $p_{26}=(0.78,0.11,0.44,0.01,0.23)$, and $p_{15}=(0.02,0.12,0.05,0.81,0.16)$ respectively. Stacking these individuals as rows in task order gives the following matrix:
$$D=\begin{pmatrix}0.15&0.69&0.45&0.88&0.06\\0.02&0.12&0.45&0.71&0.36\\0.66&0.11&0.56&0.03&0.3\\0.78&0.11&0.44&0.01&0.23\\0.02&0.12&0.05&0.81&0.16\end{pmatrix}$$
The columns of the matrix correspond in order to the 5 genes $\{f_1,f_2,f_3,f_4,f_5\}$. Taking the element 0.88 in the first row, fourth column as an example, its large value means that gene $f_4$ has a strong regulatory effect on gene $f_1$, so an edge between them is more likely in the resulting regulatory network;
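Assembling the matrix from the five optimal individuals listed in step (7) can be sketched as below. The helper `strongest_regulator` is hypothetical; it skips the target gene's own column, on the assumption that the self-weight carries no regulatory meaning because the objective function ignores it.

```python
# Optimal individual of each task, keyed by 1-based task index (values from step (7)).
best = {
    1: [0.15, 0.69, 0.45, 0.88, 0.06],  # p3
    2: [0.02, 0.12, 0.45, 0.71, 0.36],  # p9
    3: [0.66, 0.11, 0.56, 0.03, 0.3],   # p2
    4: [0.78, 0.11, 0.44, 0.01, 0.23],  # p26
    5: [0.02, 0.12, 0.05, 0.81, 0.16],  # p15
}

# Row i of D is the optimal individual of task T_i.
D = [best[task] for task in sorted(best)]


def strongest_regulator(D, i):
    """1-based index of the gene with the largest weight in row i (1-based),
    excluding gene i itself (an assumption of this sketch)."""
    row = D[i - 1]
    candidates = [j for j in range(len(row)) if j != i - 1]
    return max(candidates, key=lambda j: row[j]) + 1


strongest = [strongest_regulator(D, i) for i in range(1, 6)]
print(D[0][3])    # 0.88, the weight of f4 on target f1
print(strongest)  # [4, 4, 1, 1, 4]
```

In this example $f_4$ is the heaviest-weighted candidate regulator for $f_1$, $f_2$, and $f_5$, and $f_1$ for $f_3$ and $f_4$.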
(8) Assuming $x_1$ is a normal sample and $x_2$, $x_3$ are disease samples, steps (1) to (7) are executed on the two types of samples separately to obtain two matrices $D_n$ and $D_i$, and the differential regulation matrix $D_s$ is obtained according to formula (5).
Taking the element 0.82 in the third row, fourth column of $D_s$ as an example, its large value indicates a large difference in the regulatory effect of gene $f_4$ on gene $f_3$ before and after the disease, so an edge between them is more likely in the constructed differential regulatory network. If a node has a high degree in the differential regulatory network, the gene corresponding to that node tends to be screened as a disease-related marker.

Claims (1)

1. The marker screening method based on gene regulation network construction is characterized by comprising the following steps:
let $F=\{f_1,f_2,\dots,f_n\}$ denote the gene set, where $n$ is the number of genes, and let $X=\{x_1,x_2,\dots,x_m\}$ denote the sample set, where $m$ is the number of samples; let $V=(V_1,V_2,\dots,V_i,\dots,V_n)$ denote the expression levels of the $n$ genes over the $m$ samples, where $V_i=(v_{i1},v_{i2},\dots,v_{im})$ is the expression level of the $i$-th gene over the $m$ samples;
(1) Randomly initializing a population of size N, i.e. generating a number N of individuals, each individual comprising the following information:
① Factor cost: the factor cost of individual $p_i$ on task $T_j$ is the objective function value of $p_i$ on $T_j$;
② Factor grade: for task $T_j$, the individuals are sorted by factor cost, and the factor grade of individual $p_i$ on task $T_j$ is defined as the index of $p_i$ in the sorted list;
③ Skill factor: among all tasks, the index of the task on which individual $p_i$ performs best is defined as the skill factor $\tau_i$ of $p_i$, also known as its "main task"; that is, $\tau_i$ is the index $j$ for which the factor grade of $p_i$ on $T_j$ is lowest;
The individual dimension is the gene number n, the value corresponding to each dimension adopts a floating point number coding mode, the value of a certain dimension represents the weight of the gene corresponding to the dimension, and the larger the weight is, the greater the possibility that the regulation and control relationship exists between the gene corresponding to the dimension and the target gene corresponding to the current individual is; factor cost, factor level and skill factor are updated after each iteration;
(2) Randomly extracting parent individuals from the current population to perform crossover and mutation: a number of parent pairs are drawn at random from all individuals; for a pair of parents $p_a$, $p_b$, a value $h$ is generated at random in (0, 1); if $\tau_a=\tau_b$ or $h$ is smaller than the balance factor between crossover and mutation, the two individuals are crossed to generate children $c_a$, $c_b$, which respectively retain the original main tasks; otherwise, the two individuals are mutated to generate children $c_a$, $c_b$, which respectively retain the original main tasks; crossover and mutation are performed as follows:
Crossover:
$c_a=\beta p_b+(1-\beta)p_a$ (1)
$c_b=\beta p_a+(1-\beta)p_b$ (2)
where $\beta$ is the crossover constant, with a value in (0, 1);
Mutation (formula (3)):
where $k$ is the mutation constant, with a value in (0, 1), $p_{max}$ is the upper limit of the value of each dimension of the individual, $p_{min}$ is the lower limit, and $r$ is a random number generated by rand();
(3) Screening individuals in the current population to produce offspring: the objective function value of each individual on its main task $\tau$ is calculated; for each of the m samples, the expression values of the remaining n-1 genes are multiplied by the corresponding weights of those genes and summed, and the objective function value is the sum, over the samples, of the squared residuals between the expression value of the task's target gene and this weighted sum (formula (4));
let the main task of individual $p_i$ be $\tau_i=T_{ma}$; formula (4) then gives the objective function value of $p_i$ on its main task $T_{ma}$, where $V_{\setminus ma}=(V_1,V_2,\dots,V_{ma-1},V_{ma+1},\dots,V_n)$ denotes the matrix formed by the expression value vectors of the n-1 genes other than the $ma$-th gene; the smaller the objective function value of an individual on a task, the more accurately that individual infers the regulatory relationships of the task's target gene; for the optimal individual of each task, an identical copy is generated and added to the current population while the original is kept, the main task of the copy is replaced by one chosen at random from the remaining tasks, and its objective function value on the new main task is calculated; the N individuals with the highest fitness, i.e. the smallest objective function values, are selected from the current population to form the offspring, and the remaining individuals are eliminated;
Step (2) and step (3) are iterated, after each round, the accuracy degree of the current inferred regulation and control relation is measured by using the target function value mean value of the current corresponding optimal individual of each task, if the target function value mean value reaches the set threshold value or the iteration reaches the maximum times, the iteration is stopped, and step (4) is executed;
(4) Clustering the tasks: clustering optimal individuals corresponding to each task by using a K-Means algorithm, so that n tasks are divided into several major classes;
(5) Extracting parent individuals from the population corresponding to the similar tasks to execute a genetic evolution process: in the step (4), all tasks are clustered, so that the tasks in the same class have higher similarity, and the tasks among the classes have the similarity as low as possible; in this step, steps (2) and (3) are executed again, except that the algorithm will be executed only within a class, and algorithms between multiple major classes are executed in parallel;
(6) Information migration among different dimensions within the same task: from the individuals sharing the same main task, an individual $p_r$ and a group of n-1 further individuals are drawn at random, the n-1 individuals forming the individual set $S$; a dimension $l$ is selected, and the $l$-th-dimension values of the individuals in $S$ form an (n-1)-dimensional vector; the values of this vector are used in sequence to replace the n-1 dimensions of $p_r$ other than the dimension of the target gene of $p_r$'s main task, generating a new individual $q_r$; during replacement, if replacing a dimension improves the fitness of $p_r$, i.e. reduces the objective function value, the replacement is kept; otherwise it is not made and the process moves on to the next dimension;
(7) Acquiring a regulation and control relation matrix: selecting an individual with the optimal objective function value on each task as a final screening result; the corresponding target genes exist in each task, so that the individuals are arranged according to the sequence of the genes represented by each dimension, a matrix D formed by individual vectors can be obtained, and the larger the numerical value of the ith row and the jth column in the matrix is, the stronger the regulation and control effect of the represented gene j on the gene i is, and the more likely the connected edges exist between the two in the constructed regulation and control network;
(8) Constructing the differential network and screening markers: steps (1) to (7) are executed separately on the normal samples and the disease samples to obtain two regulatory relationship matrices $D_n$ and $D_i$; taking the absolute value of their element-wise difference gives the differential regulation matrix $D_s$:
$D_s=|D_n-D_i|$ (5)
The larger the value in row $i$, column $j$ of $D_s$, the greater the difference in the regulatory effect of gene $j$ on gene $i$ before and after the disease, and the more likely an edge exists between the two in the constructed differential regulatory network; the greater the degree of a gene in the differential network, the more likely it is an important disease-related gene, and thus the more likely it is to be screened as a marker.
CN202111330308.9A 2021-11-11 2021-11-11 Marker screening method based on gene regulation network construction Active CN114093426B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111330308.9A CN114093426B (en) 2021-11-11 2021-11-11 Marker screening method based on gene regulation network construction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111330308.9A CN114093426B (en) 2021-11-11 2021-11-11 Marker screening method based on gene regulation network construction

Publications (2)

Publication Number Publication Date
CN114093426A CN114093426A (en) 2022-02-25
CN114093426B CN114093426B (en) 2024-05-07

Family

ID=80299754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111330308.9A Active CN114093426B (en) 2021-11-11 2021-11-11 Marker screening method based on gene regulation network construction

Country Status (1)

Country Link
CN (1) CN114093426B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117409962B (en) * 2023-12-14 2024-03-29 北京科技大学 Screening method of microbial markers based on gene regulation network

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106022473A (en) * 2016-05-23 2016-10-12 大连理工大学 Construction method for gene regulatory network by combining particle swarm optimization (PSO) with genetic algorithm
CN108197432A (en) * 2017-11-29 2018-06-22 东北电力大学 A kind of gene regulatory network reconstructing method based on gene expression data
CN109308934A (en) * 2018-08-20 2019-02-05 唐山照澜海洋科技有限公司 A kind of gene regulatory network construction method based on integration characteristic importance and chicken group's algorithm
WO2019136892A1 (en) * 2018-01-15 2019-07-18 大连民族大学 Complex network community detection method
CN113486952A (en) * 2021-07-06 2021-10-08 大连海事大学 Multi-factor model optimization method of gene regulation and control network

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105718999B (en) * 2016-01-25 2018-05-29 深圳大学 A kind of construction method and system of heuristic metabolism coexpression network
US20180166170A1 (en) * 2016-12-12 2018-06-14 Konstantinos Theofilatos Generalized computational framework and system for integrative prediction of biomarkers
US10325673B2 (en) * 2017-07-25 2019-06-18 Insilico Medicine, Inc. Deep transcriptomic markers of human biological aging and methods of determining a biological aging clock

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
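Gene regulatory network construction algorithm based on partial mutual information and a Bayesian scoring function; 刘飞; 张绍武; 高红艳; Journal of Northwestern Polytechnical University (西北工业大学学报); 20171231; 35(005); full text *
Network construction and analysis methods for genomics data; 王文杰; 侯艳; 李康; Chinese Journal of Health Statistics (中国卫生统计); 20171231(001); full text *
Applications and research progress of machine learning in bioinformatics; 汤胜男; 辛学刚; Artificial Intelligence (人工智能); 20200210(01); full text *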

Also Published As

Publication number Publication date
CN114093426A (en) 2022-02-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant