CN111243658B - Biomolecular network construction and optimization method based on deep learning - Google Patents

Biomolecular network construction and optimization method based on deep learning

Info

Publication number
CN111243658B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010013935.9A
Other languages
Chinese (zh)
Other versions
CN111243658A (en)
Inventor
余国先
严杨扬
王峻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest University
Original Assignee
Southwest University
Application filed by Southwest University
Priority to CN202010013935.9A
Publication of CN111243658A
Application granted
Publication of CN111243658B

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00 ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Abstract

The invention relates to a deep learning-based method for constructing and optimizing a multi-level biomolecular network. The method belongs to the field of artificial intelligence and comprises the following steps. Step one: building a software and hardware environment. Step two: collecting and preprocessing multi-level biomolecular network data and preliminarily establishing a multi-level biomolecular network. Step three: collecting biomolecular feature data for each biological level, encoding the features accordingly, and dividing the data into a training set and a test set. Step four: constructing a network optimization model according to the network optimization objective and the available features. Step five: training the model with the multi-level biomolecular network data and the processed biomolecular feature data of each level, solving for the parameters of each layer of the model, and stopping training and saving the parameters of each layer once the expected performance is reached. Step six: deploying the trained neural network model for multi-level biomolecular network optimization. The invention addresses the problems that existing biomolecular networks scale poorly and cannot describe a complex biological system in depth.

Description

Biomolecular network construction and optimization method based on deep learning
Technical Field
The invention belongs to the technical field of deep learning and systems biology, and relates to a biomolecular network construction and optimization method based on deep learning.
Background
Biomolecular networks are a core research area of systems biology, the basis for efficient integrated analysis of biological big data, and one of the innovative application areas of artificial intelligence in biological data mining. A biological system arises from complex interactions among many kinds of biomolecules: genes are transcribed into RNA, which is then translated and modified to form various protein isoforms. These isoforms differ in structure and therefore perform different biological functions, and the dynamic interactions among the various biomolecules make up the complex biological system and its functional mechanisms.
Various high-throughput techniques exist for identifying molecular interactions, which are represented by different network models. A multi-layer biomolecular network constructed from multi-omics data can describe the functional relationships between molecules and the spatiotemporal state of a biological system well. Many studies have shown that molecules with the same function tend to form local clusters and modules in a complex network, and this local modular structure greatly benefits research on molecular function, on the organization of biomolecules within cells, and on precision medicine. However, many known biomolecular networks provide only partial information about complex physiological phenomena. For example, typical complex diseases (breast cancer, lung cancer, etc.) are usually not caused by a variation in a single gene or by the loss of a single pairwise gene interaction; they are caused by abnormalities in multiple genes and in intracellular and intercellular molecular interactions.
Biomolecular network construction and modeling have been studied for more than 20 years. However, existing network biology research generally still considers molecular networks at the genome, transcriptome, metabolome, or proteome level in isolation, which makes it difficult to analyze the pathology of complex diseases comprehensively from the perspective of a multi-level molecular network. Constructing, optimizing, and visually analyzing multi-level biomolecular networks helps reveal the biomolecular mechanisms of complex diseases and various life phenomena, and also supports fields such as precision treatment of complex diseases and drug research and development.
Keras is an open-source artificial neural network library written in Python. It can serve as a high-level application programming interface to TensorFlow, Microsoft CNTK, and Theano for designing, debugging, evaluating, applying, and visualizing deep learning models. Keras code is object-oriented, fully modular, and extensible, and it supports the mainstream algorithms of modern artificial intelligence. In terms of hardware and development environment, Keras supports multi-GPU parallel computing on multiple operating systems, which helps optimize a multi-level biomolecular network more efficiently.
Disclosure of Invention
In view of the above, the present invention aims to provide a more comprehensive and reasonably designed method for constructing and optimizing a multi-level biomolecular network. The method uses a deep learning algorithm to build a model that learns from the known interaction information of biological nodes while also taking into account node features such as biological sequences and expression levels, and then uses the model to optimize the multi-level biomolecular network. It therefore has the advantage of comprehensively considering interaction relationships across multiple biological levels.
In order to achieve the purpose, the invention provides the following technical scheme:
A biomolecular network construction and optimization method based on deep learning, comprising the following steps:
step one: building a software and hardware environment suitable for running Keras deep learning;
step two: collecting and preprocessing multi-level biomolecular network data, aligning the networks of the different levels, and preliminarily establishing a multi-level biomolecular network;
step three: collecting biomolecular feature data for each biological level, encoding the features accordingly, and dividing the data set into a training set and a test set;
step four: constructing a network optimization model with a deep learning algorithm according to the network optimization objective and the available features;
step five: in the Keras deep learning environment that was set up, training the model built in step four with the aligned multi-level biomolecular network data and the processed biomolecular feature data of each level, solving for the parameters of each layer of the model, and stopping training and saving the parameters of each layer once the expected performance is reached;
step six: deploying the trained neural network model for multi-level biomolecular network optimization.
Further, the software and hardware environment built in step one for running Keras deep learning comprises: hardware consisting of a server with 32 GB of memory and two discrete NVIDIA Tesla K40C graphics cards with 12 GB of video memory each, or a higher configuration; and software consisting of the 64-bit Ubuntu 16.04 operating system and the third-party libraries on which Keras depends.
Further, in step two, multi-level biomolecular interaction association data are collected from several public databases; the specific data are shown in Table 1:
Table 1. Biomolecular interaction association data (reproduced as an image in the original publication; the same table appears as text in the Detailed Description)
After the biomolecular interaction association data have been collected and organized, the data dimensions must first be aligned, because they differ between data sources. The method is as follows: using the genes as the reference, take the intersection of the genes across the data sources and align the dimensions of the remaining biomolecules accordingly to preliminarily establish the multi-level biomolecular network; then one-hot encode the node association data of each level.
Further, in step three, biological sequence data of each level are also collected and feature-encoded, and expression level (FPKM) data of each biomolecule are collected and concatenated with the biological sequence features as supplementary features.
Further, in step four, a multi-level biological network optimization algorithm is built on Keras deep learning: the model structure is determined according to the network optimization task, and the add() function is called repeatedly to insert convolutional layers, max pooling layers, fully connected layers, and activation functions into the model container created by the Sequential() function, which yields a deep convolutional neural network model.
Further, in step five, the parameters of each layer of the deep network are updated iteratively by a gradient descent algorithm on the training data set, the performance of the trained model is evaluated on the test data set, and once the expected performance is reached, training is stopped and the parameters of each layer are saved.
Further, in step six, the network optimization model that was built and the model parameters obtained by training in step five are used to optimize the established multi-level biomolecular network by adding association edges with high predicted probability.
The beneficial effects of the invention are as follows: the disclosed method builds a multi-level biomolecular network optimization model with a deep learning algorithm and overcomes the limitation that existing biomolecular network construction generally integrates molecular data from only one or two physiological levels. It comprehensively considers the interaction relationships among biomolecular nodes of multiple levels and introduces features such as biomolecular sequence information and expression level information, which gives it the advantage of more accurate network optimization.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof.
Drawings
For a better understanding of the objects, aspects and advantages of the present invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flow chart of an embodiment of the present invention;
FIG. 2 is a technical route diagram for collecting and processing multi-level biomolecular network data according to the present invention;
FIG. 3 is a flow chart of training a multi-level network optimization model according to the present invention;
FIG. 4 is the network structure of the multi-level network optimization model of the present invention.
Detailed Description
The following embodiments of the present invention are provided by way of specific examples, and other advantages and effects of the present invention will be readily apparent to those skilled in the art from the disclosure herein. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are for illustration only and are not intended to limit the invention; they are schematic representations rather than actual product drawings. To better illustrate the embodiments of the invention, some parts of the drawings may be omitted, enlarged, or reduced, and they do not represent the size of an actual product. Those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings.
The same or similar reference numerals in the drawings of the embodiments correspond to the same or similar components. In the description of the invention, terms indicating an orientation or positional relationship, such as "upper", "lower", "left", "right", "front", and "rear", are based on the orientation or positional relationship shown in the drawings. They are used only for convenience and simplification of the description and do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation. Such terms are therefore illustrative only and are not to be construed as limiting the invention; those skilled in the art can understand their specific meaning according to the circumstances.
As shown in FIGS. 1 to 4, the deep learning-based multi-level biomolecular network construction and optimization method of this embodiment comprises the following steps:
step one: building a software and hardware environment suitable for running Keras deep learning;
step two: collecting and preprocessing multi-level biomolecular network data, aligning the networks of the different levels, and preliminarily establishing a multi-level biomolecular network;
step three: collecting biomolecular feature data for each biological level, encoding the features accordingly, and dividing the data set into a training set and a test set;
step four: constructing a network optimization model with a deep learning algorithm according to the network optimization objective and the available features;
step five: in the Keras deep learning environment that was set up, training the model built in step four with the aligned multi-level biomolecular network data and the processed biomolecular feature data of each level, solving for the parameters of each layer of the model, and stopping training and saving the parameters of each layer once the expected performance is reached;
step six: deploying the trained neural network model for multi-level biomolecular network optimization.
Optionally, the software and hardware environment built in step one for running Keras deep learning is as follows: the hardware is a server with 32 GB of memory and two discrete NVIDIA Tesla K40C graphics cards with 12 GB of video memory each, or a higher configuration; the software is the 64-bit Ubuntu 16.04 operating system and the third-party libraries on which Keras depends.
Optionally, in step two, multi-level biomolecular interaction association data are collected from several public databases. The specific data are shown in Table 1:
Table 1. Biomolecular interaction association data
No.  Association type  Sample dimensions after alignment  Data source
1    LncRNA-Disease    240×412                            LncRNADisease
2    LncRNA-miRNA      240×495                            starBase
3    LncRNA-Gene       240×12663                          LncRNA2Target
4    LncRNA-GO         240×6428                           GeneRIF
5    miRNA-Disease     495×412                            HMDD
6    miRNA-Gene        495×12663                          miRTarBase
7    Gene-Gene         12663×12663                        BioGRID
8    Gene-GO           12663×6428                         Gene Ontology
9    Gene-Disease      12663×412                          DisGeNET
10   Isoform-Gene      31408×12663                        ENCODE
After the biomolecular interaction association data have been collected and organized, the data dimensions must first be aligned, because they differ between data sources. For example, given the Gene-Disease association data collected from the DisGeNET database and the Isoform-Gene association data collected from the ENCODE database, the genes are used as the reference, the intersection of the genes in Gene-Disease and Isoform-Gene is taken, and the dimensions of the remaining biomolecules are aligned accordingly to preliminarily establish the multi-level biomolecular network. The node association data of each level are then one-hot encoded.
Optionally, in step three, biological sequence data of each level are also collected and feature-encoded, and expression level (FPKM) data of each biomolecule are collected and concatenated with the biological sequence features as supplementary features.
Optionally, in step four, a multi-level biological network optimization algorithm is built on Keras deep learning. The model structure is determined according to the network optimization task, and the add() function is called repeatedly to insert convolutional layers, max pooling layers, fully connected layers, and activation functions into the model container created by the Sequential() function, which yields a deep convolutional neural network model.
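The Sequential()/add() construction described in step four can be sketched as follows. This is only an illustrative sketch: the patent does not give layer counts, kernel sizes, or input shapes, so the use of one-dimensional convolution, the number of filters, the kernel and pooling sizes, and the names `build_cnn`, `n_features`, and `n_labels` are all assumptions. The import is done inside the function so that the file can be loaded on a machine without TensorFlow installed.

```python
def build_cnn(n_features, n_labels):
    """Build a small CNN with Sequential() and repeated add() calls.

    All hyperparameters below are illustrative assumptions, not values
    taken from the patent.
    """
    # Lazy import: raises ImportError only when the function is called
    # on a machine without TensorFlow/Keras.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential()
    # Each sample is a 1-D feature vector treated as a single-channel signal.
    model.add(keras.Input(shape=(n_features, 1)))
    model.add(layers.Conv1D(filters=8, kernel_size=3, strides=1))  # convolutional layer
    model.add(layers.Activation("relu"))                           # nonlinear layer
    model.add(layers.MaxPooling1D(pool_size=2))                    # max pooling layer
    model.add(layers.Flatten())
    model.add(layers.Dense(n_labels))                              # fully connected layer
    model.add(layers.Activation("sigmoid"))                        # per-label probability
    # Plain stochastic gradient descent with a binary cross-entropy loss.
    model.compile(optimizer="sgd", loss="binary_crossentropy")
    return model
```

Calling `build_cnn(n_features=64, n_labels=5)` would return a model whose output has one sigmoid unit per candidate association label.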
Optionally, in step five, the parameters of each layer of the deep network are updated iteratively by a gradient descent algorithm on the training data set, the performance of the trained model is evaluated on the test data set, and once the expected performance is reached, training is stopped and the parameters of each layer are saved.
Optionally, in step six, the network optimization model that was built and the model parameters obtained by training in step five are used to optimize the established multi-level biomolecular network by adding association edges with high predicted probability.
In this embodiment, a convolutional neural network model for multi-level biomolecular network optimization is built on the Keras deep learning framework. For data preparation, multi-level biomolecular association data are collected from the public databases listed in Table 1. Because the association data from the different databases differ in dimension and use their own biological naming schemes, the data are first renamed uniformly according to the databases' name mapping files. The data dimensions are then aligned. Taking Disease-Gene-Isoform as an example: given the Gene-Disease association data collected from the DisGeNET database and the Isoform-Gene association data collected from the ENCODE database, the genes are used as the reference, the intersection of the genes in Gene-Disease and Isoform-Gene is taken, and the dimensions of the remaining biomolecules are aligned accordingly, which yields an aligned Disease-Gene-Isoform network. The remaining biomolecular networks are aligned in the same way to preliminarily establish the multi-level biomolecular network. The node association data of each level are then one-hot encoded.
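A minimal sketch of the gene-based alignment described above, assuming the association data are available as sets of name pairs that have already been mapped to a common naming scheme. The function name `align_on_genes` and the example identifiers are hypothetical; the encoding shown is a binary association matrix, which is one plausible reading of the "one-hot" coding mentioned in the text.

```python
def align_on_genes(gene_disease, isoform_gene):
    """Align two association data sets on their shared genes.

    gene_disease: set of (gene, disease) pairs
    isoform_gene: set of (isoform, gene) pairs
    Returns the shared genes, the remaining diseases and isoforms, and two
    binary association matrices indexed consistently by those lists.
    """
    # Genes are the reference: keep only those present in both sources.
    genes = sorted({g for g, _ in gene_disease} & {g for _, g in isoform_gene})
    shared = set(genes)
    # Keep only the diseases and isoforms still linked to a shared gene.
    diseases = sorted({d for g, d in gene_disease if g in shared})
    isoforms = sorted({i for i, g in isoform_gene if g in shared})
    # Binary (0/1) association matrices with aligned dimensions.
    gd = [[1 if (g, d) in gene_disease else 0 for d in diseases] for g in genes]
    ig = [[1 if (i, g) in isoform_gene else 0 for g in genes] for i in isoforms]
    return genes, diseases, isoforms, gd, ig


# Hypothetical toy data for illustration only.
gene_disease = {("TP53", "breast cancer"), ("BRCA1", "breast cancer"),
                ("EGFR", "lung cancer")}
isoform_gene = {("TP53-201", "TP53"), ("BRCA1-203", "BRCA1"),
                ("KRAS-201", "KRAS")}
genes, diseases, isoforms, gd, ig = align_on_genes(gene_disease, isoform_gene)
```

In this toy case EGFR and KRAS are dropped because each appears in only one source, so both matrices share the same two-gene axis.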
Biomolecular association data alone are not sufficient to optimize a multi-level biomolecular network effectively. Sequence data of each level are therefore also collected and feature-encoded, and expression level (FPKM) data of the biomolecules of each level are collected and concatenated with the sequence features as supplementary features. Sample data are defined as (X, Y). Taking the Isoform-Disease association network as an example, X is the concatenation of the association relationships (one-hot encoded) between an Isoform molecule and other biomolecules with the sequence and expression level features of that Isoform molecule, and Y is the association relationship (one-hot encoded) in the Isoform-Disease association network.
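The construction of one (X, Y) sample can be illustrated as below. The patent does not state how sequences are encoded or how FPKM values are scaled, so the k-mer frequency encoding, the log2(FPKM + 1) transform, and the names `kmer_frequencies` and `build_sample` are assumptions made purely for this sketch.

```python
import math
from itertools import product


def kmer_frequencies(seq, k=2, alphabet="ACGT"):
    """Encode a nucleotide sequence as normalized k-mer frequencies (assumed encoding)."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for start in range(len(seq) - k + 1):
        sub = seq[start:start + k]
        if sub in counts:  # skip k-mers containing other symbols such as N
            counts[sub] += 1
    total = sum(counts.values()) or 1  # avoid division by zero for short sequences
    return [counts[km] / total for km in kmers]


def build_sample(assoc_row, seq, fpkm, label_row):
    """Concatenate association, sequence, and expression features into X; Y is the label row."""
    x = list(assoc_row) + kmer_frequencies(seq) + [math.log2(fpkm + 1.0)]
    return x, list(label_row)


# Hypothetical isoform: 3 known associations, a short sequence, FPKM of 3.0.
x, y = build_sample([1, 0, 1], "ACGTAC", 3.0, [0, 1])
```

Here X has 3 association entries, 16 dinucleotide frequencies, and 1 expression value, for 20 features in total.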
The convolutional neural network model mainly consists of convolutional layers, pooling layers, fully connected layers, and a sigmoid layer for classification.
Convolutional layer: each convolutional layer consists of several convolution kernels and serves to extract features. Its most important parameters are the size and the number of the convolution kernels. A convolution kernel of size $m \times n$ is denoted $C_{m \times n}$, and the stride with which the kernel moves is denoted $s$.
The convolution operation can be described as:
$$x_j^{l} = f_{\mathrm{nonlinear}}\Big(\sum_{i} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\Big)$$
Here, $x_i^{l-1}$ is the output of the $i$-th convolution kernel of layer $l-1$ (so that $x^{l-1}$ is the input to layer $l$), $x_j^{l}$ is the output of the $j$-th convolution kernel of the current layer, $k_{ij}^{l}$ denotes the parameters of the $j$-th convolution kernel, $b_j^{l}$ is the bias parameter corresponding to that convolution kernel, and $f_{\mathrm{nonlinear}}$ is the nonlinear operation applied to the convolved data. Common nonlinear activation functions are ReLU, sigmoid, and tanh.
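As a rough illustration of the convolutional layer described above, the following sketch applies a single one-dimensional kernel with a configurable stride, adds a bias, and passes the result through ReLU. It is a simplified stand-in (one input channel, one kernel, no padding), and like most deep learning libraries it slides the kernel without flipping it; the function name `conv1d` is hypothetical.

```python
def conv1d(x, kernel, bias=0.0, stride=1):
    """Slide one kernel over a 1-D signal, add the bias, then apply ReLU."""
    m = len(kernel)
    out = []
    for start in range(0, len(x) - m + 1, stride):
        # Weighted sum of the window covered by the kernel.
        z = sum(x[start + t] * kernel[t] for t in range(m)) + bias
        out.append(max(0.0, z))  # ReLU as the nonlinear operation
    return out


signal = [1, 3, 2, 5, 4]
rising = conv1d(signal, [-1, 0, 1])           # responds to increases over two steps
falling = conv1d(signal, [1, 0, -1])          # responds to decreases; ReLU clips to zero
strided = conv1d(signal, [-1, 0, 1], stride=2)
```

A larger stride produces fewer outputs because the kernel skips positions.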
Pooling layer: the pooling layer uses a pooling kernel to downsample the output of the preceding convolutional layer, which reduces the dimensionality of the convolutional output and ultimately the scale of the model parameters. The main parameters of the pooling layer are the size of the pooling kernel, its stride, and the pooling mode. The invention uses max pooling, which takes the maximum value within the range of the pooling kernel as the output; this mode greatly reduces the shift of the estimated mean caused by parameter errors in the convolutional layer. The max pooling operation can be described as:
$$y_R = \max_{x_i \in R} x_i$$
Here, $R$ is the region of the feature map covered by the pooling kernel, and $x_i$ ranges over all feature values within that region.
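Max pooling as described above can be sketched in a few lines. This is a one-dimensional illustration only; the function name `max_pool1d` and the default kernel size and stride are assumptions.

```python
def max_pool1d(x, size=2, stride=2):
    """Take the maximum of each window covered by the pooling kernel."""
    return [max(x[start:start + size])
            for start in range(0, len(x) - size + 1, stride)]


pooled = max_pool1d([1, 3, 2, 5, 4, 0])                      # windows [1,3], [2,5], [4,0]
overlapping = max_pool1d([1, 3, 2, 5], size=3, stride=1)     # windows [1,3,2], [3,2,5]
```

With size 2 and stride 2 the output is half as long as the input, which is the dimensionality reduction the text refers to.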
Nonlinear layer: a nonlinear operation is applied to the data to increase the complexity of the functions the network can represent. Common nonlinear operations are ReLU, sigmoid, and tanh. The nonlinear operation used by the invention is ReLU, which can be described as:
f(x)=max(0,x)
Fully connected layer: in a fully connected layer, every neuron of the previous layer is connected to every neuron of the next layer. The number of outputs of the last fully connected layer equals the number of categories in the data, so each output corresponds to one category label. This fully connected layer is used for supervised classification.
Sigmoid activation function: the sigmoid function serves as the activation function of the output fully connected layer and smoothly maps the real numbers to the interval (0, 1). The function value can be interpreted as the probability of belonging to the predicted class. In addition, the sigmoid function is monotonically increasing and continuously differentiable, and its derivative has a very simple form, so it is widely used in classification tasks.
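The two activation functions discussed above, together with the simple derivative of the sigmoid, s'(x) = s(x)(1 - s(x)), can be written as follows. The branch on the sign of x is a common numerical-stability measure and is an addition of this sketch, not something stated in the text.

```python
import math


def relu(x):
    """ReLU: f(x) = max(0, x)."""
    return max(0.0, x)


def sigmoid(x):
    """Sigmoid, computed so that exp() never receives a large positive argument."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_derivative(x):
    """Derivative of the sigmoid expressed through the function value itself."""
    s = sigmoid(x)
    return s * (1.0 - s)
```

At x = 0 the sigmoid equals 0.5 and its derivative reaches its maximum of 0.25.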
The convolutional neural network model operates in a forward pass and a backward pass. In the forward pass, the input data go through several convolution, pooling, and nonlinear operations and the fully connected layers to produce predicted category labels, which are compared with the true category labels; the resulting error serves as the loss function. The backward pass is error backpropagation: starting from the obtained error, the gradients are propagated back through the fully connected, nonlinear, pooling, and convolutional layers, layer by layer in reverse order, which gives the gradient of the error with respect to every trainable parameter.
The convolutional neural network model is trained with a gradient descent algorithm: the gradient of the error is computed for each layer by error backpropagation, and the parameters of each layer are updated in the direction in which the error decreases fastest (the negative gradient) until training converges. The network training procedure is shown in FIG. 3.
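The train, evaluate, and stop loop can be illustrated with a deliberately tiny stand-in: a single sigmoid unit trained by batch gradient descent on synthetic data, with training stopped once a target test accuracy is reached. This is not the patent's network; the data, the deterministic 80/20 split, the learning rate, the 0.95 target, and all function names are assumptions chosen to keep the sketch short and reproducible.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def mean_loss(w, b, data):
    """Binary cross-entropy averaged over the samples."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in data:
        p = predict(w, b, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(data)


def accuracy(w, b, data):
    hits = sum((predict(w, b, x) >= 0.5) == (y == 1) for x, y in data)
    return hits / len(data)


def train(samples, lr=0.5, max_epochs=200, target_acc=0.95):
    # Deterministic split for reproducibility: every fifth sample is held out.
    test_set = samples[4::5]
    train_set = [s for i, s in enumerate(samples) if i % 5 != 4]
    n = len(train_set[0][0])
    w, b = [0.0] * n, 0.0
    history = [mean_loss(w, b, train_set)]  # loss before any update
    acc = accuracy(w, b, test_set)
    for _ in range(max_epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in train_set:
            err = predict(w, b, x) - y  # gradient of the loss w.r.t. the pre-activation
            for i in range(n):
                grad_w[i] += err * x[i]
            grad_b += err
        # Step along the negative gradient.
        w = [wi - lr * g / len(train_set) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(train_set)
        history.append(mean_loss(w, b, train_set))
        acc = accuracy(w, b, test_set)
        if acc >= target_acc:  # expected performance reached: stop training
            break
    return w, b, acc, history


# Synthetic data: label is 1 when the first feature exceeds the second.
samples = [([i / 10, j / 10], 1 if i > j else 0)
           for i in range(10) for j in range(10) if i != j]
w, b, acc, history = train(samples)
```

The returned `w` and `b` play the role of the saved layer parameters, and `history` records the training loss before and after each update.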
A network structure description file is then written according to the determined network structure of the deep multi-level biomolecular network optimization model and the parameters of each network layer; the network structure is shown in FIG. 4. Finally, the test set data are input into the trained optimization model, the model outputs a probability value for each Isoform-Disease association, and the associations with high predicted probability are added to the original network.
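The final supplementation step can be sketched as a simple thresholding of the predicted probabilities. The patent only says that associations with a "high" predicted probability are added, so the threshold of 0.9 and the name `supplement_edges` are assumptions.

```python
def supplement_edges(adjacency, probabilities, threshold=0.9):
    """Add predicted edges whose probability reaches the threshold.

    adjacency: 0/1 matrix of known associations (rows: isoforms, columns: diseases)
    probabilities: matrix of the same shape with the model's predicted probabilities
    Returns the updated matrix and the list of newly added (row, column, probability).
    """
    updated = [row[:] for row in adjacency]  # leave the original network untouched
    added = []
    for i, row in enumerate(probabilities):
        for j, p in enumerate(row):
            if updated[i][j] == 0 and p >= threshold:
                updated[i][j] = 1
                added.append((i, j, p))
    return updated, added


known = [[1, 0], [0, 0]]
predicted = [[0.99, 0.95], [0.20, 0.91]]
optimized, new_edges = supplement_edges(known, predicted)
```

Existing edges are never removed; only missing edges with a sufficiently confident prediction are filled in.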
The deep learning-based method for constructing and optimizing a multi-level biomolecular network of the invention builds a multi-level biomolecular network optimization model with a deep learning algorithm and overcomes the limitation that existing biomolecular network construction generally integrates molecular data from only one or two physiological levels. It comprehensively considers the interaction relationships among biomolecular nodes of multiple levels and introduces features such as biomolecular sequence information and expression level information, which gives it the advantage of more accurate network optimization.
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.

Claims (6)

1. A biomolecular network construction and optimization method based on deep learning, characterized in that the method comprises the following steps:
step one: building a software and hardware environment suitable for running Keras deep learning;
step two: collecting and preprocessing multi-level biomolecular network data, aligning the networks of the different levels, and preliminarily establishing a multi-level biomolecular network; in step two, multi-level biomolecular interaction association data are collected from several public databases, and the specific data comprise:
Type of biomolecule  Post-alignment sample dimension  Data source
LncRNA-Disease       240×412                          Lncrnadisease
LncRNA-miRNA         240×495                          Starbase
LncRNA-Gene          240×12663                        Lncrna2target
LncRNA-GO            240×6428                         GeneRIF
miRNA-Disease        495×412                          Hmdd
miRNA-Gene           495×12663                        Mirtarbase
Gene-Gene            12663×12663                      Biogrid
Gene-GO              12663×6428                       Geneontology
Gene-Disease         12663×412                        Disgenet
Isoform-Gene         31408×12663                      ENCODE
after the biomolecular interaction association data have been collected and organized, the data dimensions are first aligned, because they differ between data sources, by the following method: using the genes as the reference, taking the intersection of the genes across the data sources, aligning the dimensions of the remaining biomolecules accordingly to preliminarily establish the multi-level biomolecular network, and one-hot encoding the node association data of each level;
step three: collecting biomolecular feature data for each biological level, encoding the features accordingly, and dividing the data set into a training set and a test set;
step four: constructing a network optimization model with a deep learning algorithm according to the network optimization objective and the available features;
step five: in the Keras deep learning environment that was set up, training the model built in step four with the aligned multi-level biomolecular network data and the processed biomolecular feature data of each level, solving for the parameters of each layer of the model, and stopping training and saving the parameters of each layer once the expected performance is reached;
step six: deploying the trained neural network model for multi-level biomolecular network optimization.
2. The biomolecular network construction and optimization method based on deep learning of claim 1, wherein: the software and hardware environment suitable for running Keras deep learning constructed in step one comprises: hardware consisting of a server with 32 GB of memory and two NVIDIA Tesla K40C discrete graphics cards, each with 12 GB of memory, or a higher configuration; and software consisting of the 64-bit Ubuntu 16.04 operating system and the other third-party libraries on which Keras depends.
3. The biomolecular network construction and optimization method based on deep learning of claim 1, wherein: in step three, biological sequence data of each level are further collected for feature encoding, and the expression data of each biomolecule are collected at the same time and concatenated with the biological sequence features as supplementary features.
4. The biomolecular network construction and optimization method based on deep learning of claim 1, wherein: in step four, a multi-level biological network optimization algorithm based on Keras deep learning is constructed; the model structure is determined according to the network optimization task, and the add() function is called repeatedly to insert convolutional layers, max pooling layers, fully connected layers and activation functions into the model container created by the Sequential() function, thereby constructing a deep convolutional neural network model.
5. The biomolecular network construction and optimization method based on deep learning of claim 1, wherein: in step five, the parameters of each layer of the deep network are updated iteratively on the training data set using a gradient descent algorithm, the performance of the trained model is evaluated on the test data set, and once the expected performance is reached, training is stopped and the parameters of each layer are saved.
6. The biomolecular network construction and optimization method based on deep learning of claim 1, wherein: in step six, the established multi-level biomolecular network is optimized using the constructed network optimization model and the model parameters obtained by training in step five, and associated edges with high predicted probability values are added to the network.
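For illustration only, the gene-intersection alignment of step two and the edge supplementation of step six can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names (shared_genes, association_matrix, supplement_edges), the toy identifiers, the reading of the one-hot encoding as a 0/1 association matrix, and the 0.9 probability threshold are all hypothetical and do not appear in the patent. The predicted scores stand in for the output of the trained model.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]  # (biomolecule id, gene id); the gene is assumed to be the second element


def shared_genes(sources: Dict[str, Set[Pair]]) -> List[str]:
    """Return the genes that occur in every data source (the alignment reference)."""
    gene_sets = [{gene for _, gene in pairs} for pairs in sources.values()]
    return sorted(set.intersection(*gene_sets))


def association_matrix(pairs: Set[Pair], rows: List[str], cols: List[str]) -> List[List[int]]:
    """Build a 0/1 matrix: 1 where (row, col) is a known association, otherwise 0."""
    return [[1 if (r, c) in pairs else 0 for c in cols] for r in rows]


def supplement_edges(matrix: List[List[int]],
                     scores: List[List[float]],
                     threshold: float = 0.9) -> List[List[int]]:
    """Keep known edges and add edges whose predicted probability reaches the threshold."""
    return [
        [1 if known == 1 or p >= threshold else 0 for known, p in zip(m_row, s_row)]
        for m_row, s_row in zip(matrix, scores)
    ]


# Toy association data from two hypothetical sources.
lncrna_gene = {("lnc1", "g1"), ("lnc1", "g2"), ("lnc2", "g3")}
mirna_gene = {("mir1", "g2"), ("mir2", "g3"), ("mir2", "g4")}
sources = {"LncRNA-Gene": lncrna_gene, "miRNA-Gene": mirna_gene}

# Step two: keep only genes present in every source, then align the matrix to them.
genes = shared_genes(sources)
lncrnas = sorted({lnc for lnc, gene in lncrna_gene if gene in genes})
aligned = association_matrix(lncrna_gene, lncrnas, genes)

# Step six: placeholder probabilities standing in for the trained model's predictions.
predicted = [[0.99, 0.95], [0.10, 0.97]]
optimized = supplement_edges(aligned, predicted, threshold=0.9)

print(genes)      # genes shared by both sources
print(aligned)    # aligned LncRNA-Gene association matrix
print(optimized)  # matrix after adding high-probability edges
```

In this sketch the threshold controls how aggressively new edges are added; a real deployment would choose it from the test-set evaluation described in step five.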
CN202010013935.9A 2020-01-07 2020-01-07 Biomolecular network construction and optimization method based on deep learning Active CN111243658B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010013935.9A CN111243658B (en) 2020-01-07 2020-01-07 Biomolecular network construction and optimization method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010013935.9A CN111243658B (en) 2020-01-07 2020-01-07 Biomolecular network construction and optimization method based on deep learning

Publications (2)

Publication Number Publication Date
CN111243658A CN111243658A (en) 2020-06-05
CN111243658B CN111243658B (en) 2022-07-22

Family

ID=70864793

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010013935.9A Active CN111243658B (en) 2020-01-07 2020-01-07 Biomolecular network construction and optimization method based on deep learning

Country Status (1)

Country Link
CN (1) CN111243658B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111695702B (en) * 2020-06-16 2023-11-03 腾讯科技(深圳)有限公司 Training method, device, equipment and storage medium of molecular generation model
CN115101119A (en) * 2022-06-27 2022-09-23 山东大学 Isoform function prediction system based on network embedding

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1492019A4 (en) * 2002-03-11 2005-11-16 Inst Med Molecular Design Inc Method of forming molecule function network
US20190095778A1 (en) * 2013-12-27 2019-03-28 Charles L. Buchanan Biological organism development system
CN105718744B (en) * 2016-01-25 2018-05-29 深圳大学 A kind of metabolism mass spectrum screening method and system based on deep learning
CN106021990B (en) * 2016-06-07 2019-06-25 广州麦仑信息科技有限公司 A method of biological gene is subjected to classification and Urine scent with specific character
EP3479304A4 (en) * 2016-07-04 2020-04-08 Deep Genomics Incorporated Systems and methods for generating and training convolutional neural networks using biological sequences and relevance scores derived from structural, biochemical, population and evolutionary data
CN105930686B (en) * 2016-07-05 2019-05-07 四川大学 A kind of secondary protein structure prediction method based on deep neural network
CN108198625B (en) * 2016-12-08 2021-07-20 推想医疗科技股份有限公司 Deep learning method and device for analyzing high-dimensional medical data
CA3098321A1 (en) * 2018-06-01 2019-12-05 Grail, Inc. Convolutional neural network systems and methods for data classification
CN109147936B (en) * 2018-07-26 2021-07-30 刘滨 Prediction method for association between non-coding RNA and diseases based on deep learning
CN109448795B (en) * 2018-11-12 2021-04-16 山东农业大学 Method and device for recognizing circRNA
CN110033822B (en) * 2019-03-29 2020-12-08 华中科技大学 Protein coding method and protein posttranslational modification site prediction method and system
CN110136773A (en) * 2019-04-02 2019-08-16 上海交通大学 A kind of phytoprotein interaction network construction method based on deep learning
CN110400600A (en) * 2019-08-01 2019-11-01 枣庄学院 A kind of disease associated prediction technique of miRNA- based on rotation forest algorithm

Also Published As

Publication number Publication date
CN111243658A (en) 2020-06-05

Similar Documents

Publication Publication Date Title
Wang et al. Multi-objective feature selection based on artificial bee colony: An acceleration approach with variable sample size
US8594941B2 (en) System, method and apparatus for causal implication analysis in biological networks
Pirim et al. Clustering of high throughput gene expression data
Li et al. Computational approaches for detecting protein complexes from protein interaction networks: a survey
CN112364880B (en) Omics data processing method, device, equipment and medium based on graph neural network
US20090313189A1 (en) Method, system and apparatus for assembling and using biological knowledge
CN109887540A (en) A kind of drug targets interaction prediction method based on heterogeneous network insertion
CN111243658B (en) Biomolecular network construction and optimization method based on deep learning
CN112599187B (en) Method for predicting drug and target protein binding fraction based on double-flow neural network
CN110021340A (en) A kind of RNA secondary structure generator and its prediction technique based on convolutional neural networks and planning dynamic algorithm
Erfanian et al. Deep learning applications in single-cell genomics and transcriptomics data analysis
CN114463596A (en) Small sample image identification method, device and equipment of hypergraph neural network
Liu et al. A Network Hierarchy-Based method for functional module detection in protein–protein interaction networks
CN116543832A (en) disease-miRNA relationship prediction method, model and application based on multi-scale hypergraph convolution
Shen et al. Identifying protein complexes based on brainstorming strategy
CN115410642A (en) Biological relation network information modeling method and system
CN113223620A (en) Protein solubility prediction method based on multi-dimensional sequence embedding
CN114093414A (en) ADMET property prediction method of ER alpha antagonist based on MMS _ ResNet _1d model
CN114496109A (en) Ligand-receptor complex conformation affinity prediction method based on deep learning
Gong et al. Hs-dti: Drug-target interaction prediction based on hierarchical networks and multi-order sequence effect
CN114512188B (en) DNA binding protein recognition method based on improved protein sequence position specificity matrix
CN113838519B (en) Gene selection method and system based on adaptive gene interaction regularization elastic network model
CN116564435A (en) Counterfactual subgraph mining method based on stream generation model
Trajkovski Functional interpretation of gene expression data
CN113870950A (en) Identification system and identification method for key sRNA of rice blast fungus infected rice

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Yu Guoxian

Inventor after: Yan Yangyang

Inventor after: Wang Jun

Inventor before: Yan Yangyang

Inventor before: Yu Guoxian

Inventor before: Wang Jun

Inventor before: Zhou Guangjie

Inventor before: Wang Yuehui

Inventor before: Huang Qiuyue

Inventor before: Zeng Jie

GR01 Patent grant