CN114242237A - Graph neural network-based prediction of miRNA-disease association - Google Patents

Graph neural network-based prediction of miRNA-disease association

Info

Publication number
CN114242237A
Authority
CN
China
Prior art keywords
mirna
disease
similarity
neural network
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111557995.8A
Other languages
Chinese (zh)
Inventor
庞善臣
庄雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Petroleum East China
Original Assignee
China University of Petroleum East China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Petroleum East China
Priority to CN202111557995.8A
Publication of CN114242237A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B45/00 ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Public Health (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Epidemiology (AREA)
  • Databases & Information Systems (AREA)
  • Biotechnology (AREA)
  • Bioethics (AREA)
  • Pathology (AREA)
  • Primary Health Care (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides a method for predicting miRNA-disease associations based on a graph neural network. Conventional neural network models cannot process the irregular, non-Euclidean data found in miRNA-disease association prediction, so the GraphSAGE model is selected to extract the features of graph nodes. First, the integrated disease similarity and the integrated miRNA similarity are mapped into the same feature space, and known association data are used to construct a miRNA-disease bipartite graph as the input to GraphSAGE. The GraphSAGE model aggregates information from neighbor nodes, enriching the node feature representations and providing effective data for the downstream prediction task. Finally, the learned latent features of miRNAs and diseases are combined by weighted concatenation and fed to a deep neural network prediction model, which outputs the association score. The model parameters are trained by backpropagation using a cross-entropy loss function.

Description

Graph neural network-based prediction of miRNA-disease association
Technical Field
The invention relates to a feature extraction method, in particular to a local feature extraction method based on a graph neural network.
Background
Research shows that miRNAs, as non-coding RNAs, participate in the regulation of life activities at all levels and in most pathological processes. Identifying disease-related miRNAs is of great significance for the diagnosis and treatment of disease, but traditional biological experiments carry considerable uncertainty and are time-consuming and labor-intensive, so advanced computational models are needed to solve the problem. At present, miRNA-disease association prediction is mainly carried out with scoring models, machine learning algorithms, and deep learning algorithms.
Conventional neural network models have been very successful at extracting features from Euclidean data but struggle with irregular non-Euclidean data. Graph neural networks emerged to address this. Their main idea is to first find the neighbor nodes of a central node and then aggregate the information carried by those neighbors into the central node by some method. Features learned in this way carry richer information and also preserve the topological structure of the graph to a certain extent. The invention therefore selects the GraphSAGE model to extract miRNA and disease features.
Disclosure of Invention
In view of this, the invention proposes miRNA-disease association prediction based on a graph neural network. The invention uses the local information of the graph to learn a rich feature representation for each miRNA-disease pair.
The technical scheme adopted by the invention is as follows:
A. Calculate the initial feature representation of diseases based on disease semantic similarity and Gaussian interaction profile kernel similarity, and calculate the initial feature representation of miRNAs based on miRNA functional similarity and Gaussian interaction profile kernel similarity.
B. Construct the input data of the graph neural network encoder based on the initial feature representations of miRNAs and diseases.
C. Extract the latent features of miRNAs and diseases based on GraphSAGE.
D. Construct a score prediction model based on a deep neural network.
E. Train the model parameters by backpropagation based on a cross-entropy loss function.
In step A, the initial feature representation of diseases is calculated from disease semantic similarity and Gaussian interaction profile kernel similarity, and the initial feature representation of miRNAs is calculated from miRNA functional similarity and Gaussian interaction profile kernel similarity. The invention downloads known miRNA-disease association data from the HMDD database, downloads disease semantic descriptions from the MeSH database, and constructs a directed acyclic graph. Disease semantic similarity and miRNA functional similarity are calculated from the constructed directed acyclic graph, the Gaussian interaction profile kernel similarity of diseases and of miRNAs is calculated from the known association matrix, and finally the two similarities are integrated.
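The kernel similarity and the integration step can be illustrated with a minimal NumPy sketch. The bandwidth normalization (dividing by the mean squared norm of the profiles) and the fusion rule (use the semantic or functional similarity where it is non-zero, otherwise fall back to the kernel similarity) are common conventions assumed here; the text above does not specify either. The function names `gip_kernel` and `integrate` and the toy matrices are hypothetical.

```python
import numpy as np

def gip_kernel(profiles: np.ndarray) -> np.ndarray:
    """Gaussian interaction profile kernel between the rows of `profiles`."""
    sq_norms = (profiles ** 2).sum(axis=1)
    # Assumed convention: bandwidth is 1 / (mean squared norm of the profiles).
    mean_sq = sq_norms.mean()
    gamma = 1.0 / mean_sq if mean_sq > 0 else 1.0
    # Squared Euclidean distance between every pair of rows.
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dist = np.maximum(sq_dist, 0.0)  # clip tiny negatives caused by round-off
    return np.exp(-gamma * sq_dist)

def integrate(primary: np.ndarray, gip: np.ndarray) -> np.ndarray:
    """Assumed fusion rule: keep `primary` where it is non-zero, else use `gip`."""
    return np.where(primary > 0, primary, gip)

# Toy association matrix: 3 miRNAs (rows) x 2 diseases (columns); values are made up.
A = np.array([[1, 0],
              [1, 1],
              [0, 0]], dtype=float)

mirna_gip = gip_kernel(A)      # shape (3, 3)
disease_gip = gip_kernel(A.T)  # shape (2, 2)

# Placeholder semantic similarity; a real one would be derived from the MeSH DAG.
disease_sem = np.array([[1.0, 0.0],
                        [0.0, 1.0]])
disease_sim = integrate(disease_sem, disease_gip)
```

Each row of the association matrix serves as the interaction profile of one miRNA, and each column as the profile of one disease, which is why the disease kernel is computed on the transpose.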
In step B, the input data of the graph neural network encoder is constructed from the initial feature representations of miRNAs and diseases. Because the invention uses the DGL framework to build the graph neural network model, the model's input is the graph together with the feature representations of its nodes, and these feature representations must share the same embedding dimension. The initial feature representations of diseases and miRNAs are therefore mapped into the same dimension.
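A framework-agnostic sketch of this step is given below. It does not use DGL; it only shows, in NumPy, how two similarity matrices of different sizes might be projected to one embedding dimension and how known associations could be laid out as a bipartite adjacency matrix. The linear projections, the toy sizes, and the association pairs are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_mirna, n_disease, embed_dim = 4, 3, 8  # toy sizes chosen for illustration

# Stand-ins for the integrated similarity matrices (one row per node).
mirna_sim = rng.random((n_mirna, n_mirna))
disease_sim = rng.random((n_disease, n_disease))

# One projection matrix per node type; in a trained model these would be learnable.
W_mirna = rng.normal(scale=0.1, size=(n_mirna, embed_dim))
W_disease = rng.normal(scale=0.1, size=(n_disease, embed_dim))

mirna_feat = mirna_sim @ W_mirna        # (n_mirna, embed_dim)
disease_feat = disease_sim @ W_disease  # (n_disease, embed_dim)

# Known associations as (miRNA index, disease index) pairs; made-up values.
pairs = np.array([[0, 0], [1, 0], [1, 2], [3, 1]])

# Put both node types in one index space: miRNAs first, then diseases.
n_nodes = n_mirna + n_disease
adj = np.zeros((n_nodes, n_nodes))
src = pairs[:, 0]
dst = pairs[:, 1] + n_mirna
adj[src, dst] = 1.0
adj[dst, src] = 1.0  # undirected: messages can flow in both directions

node_feat = np.vstack([mirna_feat, disease_feat])  # (n_nodes, embed_dim)
```

Because edges only connect a miRNA index to a disease index, the miRNA-miRNA and disease-disease blocks of `adj` stay zero, which is what makes the graph bipartite.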
In step C, the latent features of miRNAs and diseases are extracted with GraphSAGE. The invention constructs a three-layer GraphSAGE network. The first step of GraphSAGE selects the neighbor nodes of a central node, and the second step aggregates the neighbor information into the central node. In the aggregation phase, the invention uses the MEAN aggregator.
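The mean aggregation can be sketched in plain NumPy as follows. This is an illustration under stated assumptions rather than the DGL-based implementation: it aggregates over all neighbors instead of a sampled subset, uses separate weight matrices for the node itself and for its neighborhood mean, applies ReLU after every layer except the last, and uses random weights. The function name `sage_mean_layer` and the layer widths are hypothetical.

```python
import numpy as np

def sage_mean_layer(h, adj, w_self, w_neigh, activate=True):
    """One GraphSAGE-style layer with a mean aggregator over all neighbors."""
    deg = adj.sum(axis=1, keepdims=True)
    # Mean of neighbor features; isolated nodes receive a zero vector.
    neigh_mean = (adj @ h) / np.maximum(deg, 1.0)
    out = h @ w_self + neigh_mean @ w_neigh
    return np.maximum(out, 0.0) if activate else out

rng = np.random.default_rng(0)

# Tiny bipartite graph: nodes 0-1 are miRNAs, nodes 2-3 are diseases.
adj = np.array([[0, 0, 1, 0],
                [0, 0, 1, 1],
                [1, 1, 0, 0],
                [0, 1, 0, 0]], dtype=float)

h = rng.normal(size=(4, 8))  # initial node features
dims = [8, 16, 16, 8]        # three layers; widths are illustrative

for i, (d_in, d_out) in enumerate(zip(dims[:-1], dims[1:])):
    w_self = rng.normal(scale=0.1, size=(d_in, d_out))
    w_neigh = rng.normal(scale=0.1, size=(d_in, d_out))
    # ReLU after the first two layers, linear output on the last one.
    h = sage_mean_layer(h, adj, w_self, w_neigh, activate=(i < len(dims) - 2))

embeddings = h  # one latent vector per node
```

With three stacked layers, each node's final embedding depends on nodes up to three hops away, so a miRNA's representation also reflects the other miRNAs that share diseases with it.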
In step D, the score prediction model is constructed based on a deep neural network. The invention constructs a three-layer fully connected network with two hidden layers and ReLU activation functions between the layers. The final layer is the score prediction output layer, whose output is passed through a sigmoid activation function to give the score.
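A minimal forward-pass sketch of such a predictor is shown below. The weighting coefficient `alpha`, the hidden-layer widths, and the random parameters are assumptions for illustration; the text above states only that there are two hidden layers with ReLU and a sigmoid-activated output layer.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_scores(mirna_emb, disease_emb, params, alpha=0.5):
    """Score each (miRNA, disease) pair given row-aligned embeddings."""
    # Weighted concatenation: alpha and (1 - alpha) are hypothetical weights.
    x = np.concatenate([alpha * mirna_emb, (1.0 - alpha) * disease_emb], axis=1)
    h1 = relu(x @ params["W1"] + params["b1"])   # hidden layer 1
    h2 = relu(h1 @ params["W2"] + params["b2"])  # hidden layer 2
    logits = h2 @ params["W3"] + params["b3"]    # output layer
    return sigmoid(logits).ravel()

rng = np.random.default_rng(0)
embed_dim, hidden1, hidden2 = 8, 16, 8  # illustrative sizes

params = {
    "W1": rng.normal(scale=0.1, size=(2 * embed_dim, hidden1)),
    "b1": np.zeros(hidden1),
    "W2": rng.normal(scale=0.1, size=(hidden1, hidden2)),
    "b2": np.zeros(hidden2),
    "W3": rng.normal(scale=0.1, size=(hidden2, 1)),
    "b3": np.zeros(1),
}

# Five candidate pairs: row i of each array belongs to the same pair.
mirna_emb = rng.normal(size=(5, embed_dim))
disease_emb = rng.normal(size=(5, embed_dim))
scores = predict_scores(mirna_emb, disease_emb, params)
```

The sigmoid keeps every score strictly between 0 and 1, so it can be read as an association probability and compared directly with a 0/1 label.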
In step E, the model parameters are trained by backpropagation based on the cross-entropy loss function. The method uses the cross-entropy loss function to measure the difference between the predicted value and the label, and backpropagates this loss to train the model and obtain the optimal parameters.
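The training signal can be illustrated on the smallest possible case, a single sigmoid output unit trained by gradient descent. This is a sketch under stated assumptions: the synthetic data, the learning rate, and the number of steps are made up, and a full model would propagate the same gradient further back through the hidden layers and the GraphSAGE encoder.

```python
import numpy as np

def bce(p, y, eps=1e-12):
    """Mean binary cross-entropy between probabilities p and 0/1 labels y."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

rng = np.random.default_rng(0)

# Synthetic features and labels, standing in for pair features and known associations.
x = rng.normal(size=(32, 4))
true_w = np.array([1.5, -2.0, 0.5, 0.0])
y = (x @ true_w > 0).astype(float)

w = np.zeros(4)
b = 0.0
lr = 0.5  # illustrative learning rate
losses = []

for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # forward pass
    losses.append(bce(p, y))
    # For sigmoid followed by mean cross-entropy, d(loss)/d(logit) = (p - y) / n.
    grad_logits = (p - y) / len(y)
    w -= lr * (x.T @ grad_logits)           # backward pass and update
    b -= lr * grad_logits.sum()
```

The first recorded loss equals ln 2, since zero weights predict 0.5 for every pair, and the loss falls from there as the weights are updated.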
The technical scheme provided by the invention has the following beneficial effects:
The method applies the GraphSAGE model to miRNA-disease association prediction, predicts unknown miRNA-disease associations from a small amount of known association data, reduces the cost of traditional biological experiments, and greatly shortens the time needed for association prediction. The method has practical significance: it can predict potentially related miRNAs for a disease and provides a reference for the diagnosis and treatment of diseases and for the research and development of new drugs.
Drawings
FIG. 1 is a schematic flow chart of the graph neural network-based miRNA-disease association prediction method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the feature extraction method of the present invention is described in further detail below with reference to the accompanying drawing.
In the data preprocessing stage, data are obtained from the HMDD and MeSH databases, the initial feature representations of disease and miRNA nodes are derived from these two databases, and a miRNA-disease association bipartite graph is constructed.
The constructed bipartite graph is input to the GraphSAGE encoder, which enriches the embedded representation of each central node by aggregating information from its local neighbors in the graph topology, thereby improving the accuracy of model prediction.
The learned latent features of miRNAs and diseases are combined by weighted concatenation to form the input data of the deep neural network. The network predicts the miRNA-disease association score, the loss is calculated with the cross-entropy loss function, and backpropagation is performed to train the model parameters.

Claims (6)

1. A graph neural network-based miRNA-disease association prediction method, comprising the following parts:
A. Calculate the initial feature representation of diseases based on disease semantic similarity and Gaussian interaction profile kernel similarity, and calculate the initial feature representation of miRNAs based on miRNA functional similarity and Gaussian interaction profile kernel similarity.
B. Construct the input data of the graph neural network encoder based on the initial feature representations of miRNAs and diseases.
C. Extract the latent features of miRNAs and diseases based on GraphSAGE.
D. Construct a score prediction model based on a deep neural network.
E. Train the model parameters by backpropagation based on a cross-entropy loss function.
2. The method of claim 1, wherein the initial feature representation of diseases is calculated based on disease semantic similarity and Gaussian interaction profile kernel similarity, and the initial feature representation of miRNAs is calculated based on miRNA functional similarity and Gaussian interaction profile kernel similarity. The invention downloads known miRNA-disease association data from the HMDD database, downloads disease semantic descriptions from the MeSH database, and constructs a directed acyclic graph. Disease semantic similarity and miRNA functional similarity are calculated from the constructed directed acyclic graph, the Gaussian interaction profile kernel similarity of diseases and of miRNAs is calculated from the known association matrix, and finally the two similarities are integrated.
3. The method of claim 1, wherein the input data of the graph neural network encoder is constructed based on the initial feature representations of miRNAs and diseases. Because the invention uses the DGL framework to build the graph neural network model, the model's input is the graph together with the feature representations of its nodes, and these feature representations must share the same embedding dimension. The initial feature representations of diseases and miRNAs are therefore mapped into the same dimension.
4. The method of claim 1, wherein the latent features of miRNAs and diseases are extracted based on GraphSAGE. The invention constructs a three-layer GraphSAGE network. The first step of GraphSAGE selects the neighbor nodes of a central node, and the second step aggregates the neighbor information into the central node. In the aggregation phase, the invention uses the MEAN aggregator.
5. The method of claim 1, wherein the score prediction model is constructed based on a deep neural network. The invention constructs a three-layer fully connected network with two hidden layers and ReLU activation functions between the layers. The final layer is the score prediction output layer, whose output is passed through a sigmoid activation function to give the score.
6. The method of claim 1, wherein the model parameters are trained by backpropagation based on the cross-entropy loss function. The method uses the cross-entropy loss function to measure the difference between the predicted value and the label, and backpropagates this loss to train the model and obtain the optimal parameters.
CN202111557995.8A 2021-12-20 2021-12-20 Graph neural network-based prediction of miRNA-disease association Pending CN114242237A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111557995.8A CN114242237A (en) 2021-12-20 2021-12-20 Graph neural network-based prediction of miRNA-disease association

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111557995.8A CN114242237A (en) 2021-12-20 2021-12-20 Graph neural network-based prediction of miRNA-disease association

Publications (1)

Publication Number Publication Date
CN114242237A true CN114242237A (en) 2022-03-25

Family

ID=80758750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111557995.8A Pending CN114242237A (en) 2021-12-20 2021-12-20 Graph neural network-based prediction of miRNA-disease association

Country Status (1)

Country Link
CN (1) CN114242237A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115798598A (en) * 2022-11-16 2023-03-14 大连海事大学 Hypergraph-based miRNA-disease association prediction model and method
CN115798598B (en) * 2022-11-16 2023-11-14 大连海事大学 Hypergraph-based miRNA-disease association prediction model and method

Legal Events

Date Code Title Description
PB01 Publication