CN112364983B - Protein interaction network node classification method based on multichannel graph convolutional neural network

Publication number
CN112364983B
Authority
CN
China
Prior art keywords
channel
protein interaction
proteins
neural network
convolutional neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011260336.3A
Other languages
Chinese (zh)
Other versions
CN112364983A (en)
Inventor
杨旭华
马钢峰
徐新黎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-11-12
Filing date
2020-11-12
Publication date
2024-03-22
Application filed by Zhejiang University of Technology ZJUT
Priority to CN202011260336.3A
Publication of CN112364983A
Application granted
Publication of CN112364983B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/20 Protein or domain folding

Abstract

A protein interaction network node classification method based on a multichannel graph convolutional neural network improves classification by incorporating high-order neighborhood information. A protein interaction network is constructed from protein interaction data, and a multichannel graph convolutional neural network model with a two-layer structure is built. Using different combinations of graph convolution kernels, the model performs semi-supervised classification from the data of a small number of labeled proteins and obtains the classes of the unlabeled proteins. By combining multichannel high-order neighborhood graph convolutions, the invention extracts the high-order information of the protein interaction network and improves protein classification accuracy at a lower computational cost.

Description

Protein interaction network node classification method based on multichannel graph convolutional neural network
Technical Field
The invention relates to the field of protein classification, in particular to a protein interaction network node classification method based on a multichannel graph convolutional neural network.
Background
Proteins are the material basis of life. Almost every component of the human body is inseparable from proteins, which have long been a focus of research. Proteins often take part in vital processes such as cellular metabolism and the regulation of gene expression through their interactions, and these interactions form protein interaction networks. A protein interaction network represents the relationships among proteins as a network, which facilitates research and analysis and plays a very important role in understanding biological composition and the causes of some diseases at the molecular level.
A graph convolutional network performs convolution on irregular, complex network data. In semi-supervised learning, graph convolution achieves good classification performance from a small labeled training set and trains quickly, so it is widely applied to datasets with network structure. However, aggregating high-order neighborhood information makes the node features over-smoothed, so an ordinary graph convolutional network can aggregate only 2nd- to 3rd-order neighborhood feature information. The relationships among proteins in a protein interaction network are comparatively tight, and aggregating low-order information alone is insufficient. Moreover, protein interaction network data are often large and complex. It is therefore necessary to capture higher-order neighborhood information while controlling the network depth, that is, with fewer parameters, in order to obtain better protein classification performance.
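The over-smoothing effect mentioned in the background can be illustrated with a small sketch. It assumes a toy 5-node path graph and a row-normalized propagation matrix D^-1 (A + I); neither the graph nor this normalization is taken from the patent, and the helper name `propagate_spread` is hypothetical. The spread (maximum minus minimum) of a feature across nodes never grows and shrinks as more propagation steps are applied, which is why stacking many plain graph convolutions blurs the differences between nodes.

```python
import numpy as np

def propagate_spread(adjacency, feature, steps):
    """Return the feature spread (max - min) after 0..steps propagation steps.

    Assumption: propagation uses the row-normalized matrix D^-1 (A + I),
    a common choice that is not taken from the patent text.
    """
    n = adjacency.shape[0]
    a_tilde = adjacency + np.eye(n)                    # add self-loops
    p = a_tilde / a_tilde.sum(axis=1, keepdims=True)   # each row sums to 1
    x = feature.astype(float)
    spreads = [float(x.max() - x.min())]
    for _ in range(steps):
        x = p @ x                                      # average over the neighborhood
        spreads.append(float(x.max() - x.min()))
    return spreads

# Toy path graph 0-1-2-3-4 (hypothetical data).
A = np.zeros((5, 5))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[u, v] = A[v, u] = 1.0

x0 = np.array([1.0, 0.0, 0.0, 0.0, 0.0])  # only node 0 carries signal
spreads = propagate_spread(A, x0, steps=10)
```

Because every propagated value is a weighted average of neighboring values, the spread is non-increasing from one step to the next.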
Disclosure of Invention
To address the large deviation in existing protein interaction network classification results, the invention provides a more accurate protein interaction network node classification method based on a multichannel graph convolutional neural network.
The technical solution adopted by the invention to solve this technical problem is as follows:
A protein interaction network node classification method based on a multichannel graph convolutional neural network comprises the following steps:
Step one: construct a protein interaction network model G(V, E) from protein interaction data, where V is the node set, E is the edge set, and A denotes the adjacency matrix. Each node represents one protein, and the node set V = {v_1, v_2, ..., v_N} represents the set of proteins; if two proteins interact, an edge connects the two corresponding nodes. N is the number of proteins. The initial feature vector of each protein is a one-hot vector, the identity matrix X is the combination of all the protein initial feature vectors, and C is the number of protein classes. A small portion of the proteins have known class labels, and most proteins have no class label;
Step two: construct a multi-channel graph convolutional neural network model with a two-layer structure. The first layer contains k channels, and the i-th order convolution kernel SGC_i is used on the i-th channel, i ∈ {1, 2, ..., k}. The second layer also contains k channels, and the (k+1-j)-th order convolution kernel SGC_{k+1-j} is used on the j-th channel, j ∈ {1, 2, ..., k}. The i-th channel of the network model consists of the i-th channel of the first layer and the i-th channel of the second layer, where the output of the i-th channel of the first layer is the input of the i-th channel of the second layer;
Step three: compute the i-th order convolution kernel SGC_i,
where GCN denotes a graph convolutional neural network without an activation function and 1 ≤ i ≤ k;
Step four: compute the output of the i-th channel of the multi-channel graph convolutional neural network model
y(i) = SGC_{k+1-i}(f(SGC_i(X, A)), A),
where 1 ≤ i ≤ k and f is the relu function;
Step five: compute the model output Q of the multi-channel graph convolutional neural network model,
where g is the softmax activation function;
Step six: compute the loss value of the semi-supervised classification,
where μ is the set of labeled nodes and Y_ij is the classification label of a labeled node;
Step seven: repeat steps three to six until the loss value converges, and take the resulting Q as the classification result of the protein interaction network.
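The forward computation in steps one to five can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the formulas for the kernel and the model output are not reproduced in this text, so the sketch assumes that the order-i kernel applies i activation-free propagation steps with the normalized adjacency D^-1/2 (A + I) D^-1/2 followed by one weight matrix, and that the channel outputs are summed before the softmax g (the description of Fig. 1 speaks of accumulating and activating the results). All function names, the random weights, and the toy graph are hypothetical.

```python
import numpy as np

def build_adjacency(num_proteins, interactions):
    """Step one: symmetric 0/1 adjacency matrix from (u, v) interaction pairs."""
    a = np.zeros((num_proteins, num_proteins))
    for u, v in interactions:
        a[u, v] = a[v, u] = 1.0
    return a

def normalize(a):
    """Assumed propagation matrix D^-1/2 (A + I) D^-1/2 (not stated in the source)."""
    a_tilde = a + np.eye(a.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def sgc(h, a_hat, weight, order):
    """Assumed order-i kernel: `order` activation-free propagation steps, then a linear map."""
    for _ in range(order):
        h = a_hat @ h
    return h @ weight

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(x, a, first_weights, second_weights):
    """Steps four and five; summing the channel outputs before softmax is an assumption."""
    a_hat = normalize(a)
    k = len(first_weights)
    total = 0.0
    for i in range(1, k + 1):
        hidden = np.maximum(sgc(x, a_hat, first_weights[i - 1], i), 0.0)      # f = relu
        total = total + sgc(hidden, a_hat, second_weights[i - 1], k + 1 - i)  # y(i)
    return softmax(total)

# Hypothetical toy data: 6 proteins, 3 channels, 4 hidden units, 2 classes.
rng = np.random.default_rng(0)
N, k, hidden_dim, C = 6, 3, 4, 2
A = build_adjacency(N, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)])
X = np.eye(N)                                   # one-hot initial features
W1 = [rng.normal(size=(N, hidden_dim)) for _ in range(k)]
W2 = [rng.normal(size=(hidden_dim, C)) for _ in range(k)]
Q = forward(X, A, W1, W2)
predicted_class = Q.argmax(axis=1)
```

Each row of `Q` is a probability distribution over the C classes, and the predicted class of a protein is the index of the largest entry in its row.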
The technical concept of the invention is as follows: building on a shallow neural network, the invention aggregates high-order information while using multiple channels to combine different arrangements of convolution orders, which effectively improves the classification performance and accuracy for proteins in the protein interaction network.
The beneficial effects of the invention are as follows: the protein interaction network is processed by combining multi-channel high-order neighborhood graph convolution information, which improves protein classification accuracy at a lower computational cost.
Drawings
Fig. 1 is a schematic diagram of the neural network model. For ease of understanding, k = 3 is taken: the input features are fed into the different channels for convolution, and after two layers of graph convolution the results of the channels are accumulated and activated to obtain the final output.
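For k = 3, the pairing of kernel orders across the two layers can be enumerated with a short sketch. The helper name `channel_orders` is hypothetical. Assuming that an order-i kernel aggregates an i-hop neighborhood, every channel covers i + (k + 1 - i) = k + 1 hops in total while the network stays two layers deep.

```python
def channel_orders(k):
    """Return (first-layer order, second-layer order) for channels 1..k."""
    return [(i, k + 1 - i) for i in range(1, k + 1)]

orders = channel_orders(3)
# Each channel spans i + (k + 1 - i) = k + 1 hops in total.
hops_per_channel = [first + second for first, second in orders]
```

With k = 3 the channels use the order pairs (1, 3), (2, 2) and (3, 1), so the channels differ only in how the same number of hops is split between the two layers.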
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1, a protein interaction network node classification method based on a multi-channel graph convolutional neural network includes the following steps:
Step one: construct a protein interaction network model G(V, E) from protein interaction data, where V is the node set, E is the edge set, and A denotes the adjacency matrix. Each node represents one protein, and the node set V = {v_1, v_2, ..., v_N} represents the set of proteins; if two proteins interact, an edge connects the two corresponding nodes. N is the number of proteins. The initial feature vector of each protein is a one-hot vector, the identity matrix X is the combination of all the protein initial feature vectors, and C is the number of protein classes. A small portion of the proteins have known class labels, and most proteins have no class label;
Step two: construct a multi-channel graph convolutional neural network model with a two-layer structure. The first layer contains k channels, and the i-th order convolution kernel SGC_i is used on the i-th channel, i ∈ {1, 2, ..., k}. The second layer also contains k channels, and the (k+1-j)-th order convolution kernel SGC_{k+1-j} is used on the j-th channel, j ∈ {1, 2, ..., k}. The i-th channel of the network model consists of the i-th channel of the first layer and the i-th channel of the second layer, where the output of the i-th channel of the first layer is the input of the i-th channel of the second layer;
Step three: compute the i-th order convolution kernel SGC_i,
where GCN denotes a graph convolutional neural network without an activation function and 1 ≤ i ≤ k;
Step four: compute the output of the i-th channel of the multi-channel graph convolutional neural network model
y(i) = SGC_{k+1-i}(f(SGC_i(X, A)), A),
where 1 ≤ i ≤ k and f is the relu function;
Step five: compute the model output Q of the multi-channel graph convolutional neural network model,
where g is the softmax activation function; the model is shown in Fig. 1;
Step six: compute the loss value of the semi-supervised classification,
where μ is the set of labeled nodes and Y_ij is the classification label of a labeled node;
Step seven: repeat steps three to six until the loss value converges, and take the resulting Q as the classification result of the protein interaction network.
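Steps six and seven can be illustrated with the sketch below. The loss formula is not reproduced in this text, so the sketch assumes the cross-entropy commonly used for semi-supervised node classification, summed over the labeled node set μ and the C classes: loss = -Σ_{l∈μ} Σ_j Y_lj ln Q_lj. The stopping rule, the function names, and the toy data are hypothetical.

```python
import numpy as np

def masked_cross_entropy(q, y_onehot, labeled_idx):
    """Assumed loss: -sum over labeled nodes l and classes j of Y[l, j] * ln Q[l, j]."""
    q_labeled = np.clip(q[labeled_idx], 1e-12, 1.0)   # avoid log(0)
    return float(-(y_onehot[labeled_idx] * np.log(q_labeled)).sum())

def has_converged(losses, tol=1e-4):
    """Hypothetical stopping rule: the last two loss values differ by less than tol."""
    return len(losses) >= 2 and abs(losses[-1] - losses[-2]) < tol

# Hypothetical toy data: 4 proteins, 3 classes, proteins 0 and 2 are labeled.
Y = np.zeros((4, 3))
Y[0, 1] = 1.0
Y[2, 0] = 1.0
labeled = np.array([0, 2])

# Uniform predictions: each labeled node contributes ln(3).
Q_uniform = np.full((4, 3), 1.0 / 3.0)
loss_uniform = masked_cross_entropy(Q_uniform, Y, labeled)

# Predictions that match the labels exactly give a loss of zero.
Q_perfect = np.full((4, 3), 1.0 / 3.0)
Q_perfect[0] = [0.0, 1.0, 0.0]
Q_perfect[2] = [1.0, 0.0, 0.0]
loss_perfect = masked_cross_entropy(Q_perfect, Y, labeled)
```

Only the labeled rows of Q enter the loss; the unlabeled proteins influence training solely through the graph propagation inside the model.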
The specific implementation steps described above make the invention clearer. Any modifications and changes made to the invention within the spirit of the invention and the scope of the appended claims fall within the protection scope of the invention.

Claims (1)

1. A protein interaction network node classification method based on a multichannel graph convolutional neural network, characterized in that the method comprises the following steps:
Step one: construct a protein interaction network model G(V, E) from protein interaction data, where V is the node set, E is the edge set, and A denotes the adjacency matrix. Each node represents one protein, and the node set V = {v_1, v_2, ..., v_N} represents the set of proteins; if two proteins interact, an edge connects the two corresponding nodes. N is the number of proteins. The initial feature vector of each protein is a one-hot vector, the identity matrix X is the combination of all the protein initial feature vectors, and C is the number of protein classes. A small portion of the proteins have known class labels, and most proteins have no class label;
Step two: construct a multi-channel graph convolutional neural network model with a two-layer structure. The first layer contains k channels, and the i-th order convolution kernel SGC_i is used on the i-th channel, i ∈ {1, 2, ..., k}. The second layer also contains k channels, and the (k+1-j)-th order convolution kernel SGC_{k+1-j} is used on the j-th channel, j ∈ {1, 2, ..., k}. The i-th channel of the network model consists of the i-th channel of the first layer and the i-th channel of the second layer, where the output of the i-th channel of the first layer is the input of the i-th channel of the second layer;
Step three: compute the i-th order convolution kernel SGC_i,
where GCN denotes a graph convolutional neural network without an activation function and 1 ≤ i ≤ k;
Step four: compute the output of the i-th channel of the multi-channel graph convolutional neural network model
y(i) = SGC_{k+1-i}(f(SGC_i(X, A)), A),
where 1 ≤ i ≤ k and f is the relu function;
Step five: compute the model output Q of the multi-channel graph convolutional neural network model,
where g is the softmax activation function;
Step six: compute the loss value of the semi-supervised classification,
where μ is the set of labeled nodes and Y_ij is the classification label of a labeled node;
Step seven: repeat steps three to six until the loss value converges, and take the resulting Q as the classification result of the protein interaction network.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202011260336.3A CN112364983B (en) | 2020-11-12 | 2020-11-12 | Protein interaction network node classification method based on multichannel graph convolutional neural network

Publications (2)

Publication Number | Publication Date
CN112364983A (en) | 2021-02-12
CN112364983B (en) | 2024-03-22

Family

ID=74515357

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202011260336.3A (Active) CN112364983B (en) | Protein interaction network node classification method based on multichannel graph convolutional neural network | 2020-11-12 | 2020-11-12

Country Status (1)

Country | Link
CN (1) | CN112364983B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113241114A * | 2021-03-24 | 2021-08-10 | 辽宁大学 | LncRNA-protein interaction prediction method based on graph convolution neural network
CN113053457B * | 2021-03-25 | 2022-04-05 | 湖南大学 | Drug target prediction method based on multi-pass graph convolution neural network
CN113539381B * | 2021-07-16 | 2023-09-05 | 中国海洋大学 | Molecular dynamics result analysis method based on residue interaction and PEN
CN115312119B * | 2022-10-09 | 2023-04-07 | 之江实验室 | Method and system for identifying protein structural domain based on protein three-dimensional structure image

Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109522953A * | 2018-11-13 | 2019-03-26 | 北京师范大学 | Method for classifying graph-structured data based on network embedding algorithm and CNN
CN109977232A * | 2019-03-06 | 2019-07-05 | 中南大学 | A graph neural network visual analysis method based on force-directed graph
CN110889015A * | 2019-10-31 | 2020-03-17 | 天津工业大学 | Independent decoupling convolutional neural network characterization algorithm for graph data
CN111563533A * | 2020-04-08 | 2020-08-21 | 华南理工大学 | Test subject classification method based on graph convolution neural network fusion of multiple human brain maps
CN111916144A * | 2020-07-27 | 2020-11-10 | 西安电子科技大学 | Protein classification method based on self-attention neural network and coarsening algorithm

Also Published As

Publication number | Publication date
CN112364983A (en) | 2021-02-12

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant