CN112364983B - Protein interaction network node classification method based on multichannel graph convolutional neural network - Google Patents
- Publication number
- CN112364983B (application CN202011260336.3A)
- Authority
- CN
- China
- Prior art keywords
- channel
- protein interaction
- proteins
- neural network
- convolutional neural
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
- G16B15/20—Protein or domain folding
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computing Systems (AREA)
- Crystallography & Structural Chemistry (AREA)
- Biotechnology (AREA)
- Chemical & Material Sciences (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A protein interaction network node classification method based on a multichannel graph convolutional neural network improves classification by incorporating high-order neighborhood information. A protein interaction network is constructed from protein interaction data, and a multichannel graph convolutional neural network model with a two-layer structure is built. Using different combinations of graph convolution kernels, the model performs semi-supervised classification from a small number of labeled proteins and predicts the classes of the unlabeled proteins. By combining multichannel high-order neighborhood graph convolutions, the invention extracts high-order information from the protein interaction network and improves protein classification accuracy at a lower computational cost.
Description
Technical Field
The invention relates to the field of protein classification, in particular to a protein interaction network node classification method based on a multichannel graph convolutional neural network.
Background
Proteins are the material basis of life. Almost every component of the human body depends on proteins, which have long been a focus of research. Proteins often take part in vital processes such as cellular metabolism and the regulation of gene expression through their interactions, and these interactions form protein interaction networks. A protein interaction network represents the relationships between proteins as a network, which facilitates research and analysis and plays an important role in understanding biological composition and the causes of some diseases at the molecular level.
Graph convolutional networks are designed to perform convolution on irregular, complex network data. In semi-supervised learning, graph convolution can achieve good classification performance from a small labeled training set and trains quickly, so it is widely applied to network-structured data sets. However, aggregating high-order neighborhood information can over-smooth the features, so an ordinary graph convolutional network can aggregate only 2nd- to 3rd-order neighborhood information. Proteins in a protein interaction network are relatively tightly connected, so aggregating only low-order information is insufficient. In addition, protein interaction network data are often large and complex. It is therefore desirable to capture higher-order neighborhood information while keeping the network shallow, that is, with fewer parameters, in order to obtain better protein classification performance.
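The over-smoothing effect mentioned above can be illustrated with a small numerical sketch. It assumes the symmetric normalization with self-loops that is common for graph convolution (the text does not specify a normalization), and the 5-node path graph and the helper names `normalized_adjacency` and `row_spread` are hypothetical.

```python
import numpy as np

def normalized_adjacency(A):
    """ASSUMED normalization: D^-1/2 (A + I) D^-1/2 with self-loops."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def row_spread(H):
    """Mean pairwise distance between unit-normalized node feature rows."""
    Hn = H / np.linalg.norm(H, axis=1, keepdims=True)
    diffs = Hn[:, None, :] - Hn[None, :, :]
    return float(np.linalg.norm(diffs, axis=2).mean())

# Toy 5-node path graph (illustrative data, not from the patent).
A = np.zeros((5, 5))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[u, v] = A[v, u] = 1.0

A_hat = normalized_adjacency(A)
X = np.eye(5)  # one-hot initial features

# Propagate the features over 1, 8 and 64 hops and measure how
# distinguishable the node representations remain.
spread = {
    order: row_spread(np.linalg.matrix_power(A_hat, order) @ X)
    for order in (1, 8, 64)
}
```

In this toy case the spread shrinks toward zero as the propagation order grows, meaning the node features become nearly indistinguishable, which is the difficulty that motivates capturing high-order information without simply stacking more layers.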
Disclosure of Invention
To address the large deviation in existing protein interaction network classification results, the invention provides a more accurate protein interaction network node classification method based on a multichannel graph convolutional neural network.
The technical solution adopted by the invention to solve this technical problem is as follows:
A protein interaction network node classification method based on a multichannel graph convolutional neural network comprises the following steps:
step one: construct a protein interaction network model $G(V, E)$ from protein interaction data, where $V$ is the set of nodes, $E$ is the set of connecting edges, and the adjacency matrix is denoted by $A$; each node represents a protein, and the node set $V = \{v_1, v_2, \ldots, v_N\}$ represents the set of proteins; if two proteins interact, there is a connecting edge between the two corresponding nodes; $N$ is the number of proteins; the initial feature vector of each protein is a one-hot vector, so the matrix $X$ formed by combining all initial feature vectors is the identity matrix; $C$ is the number of protein classes; a small portion of the proteins have known class labels, and most proteins have no class labels;
step two: construct a multichannel graph convolutional neural network model with a two-layer structure; the first layer contains $k$ channels, and the $i$-order convolution kernel $\mathrm{SGC}_i$ is used on the $i$-th channel, $i \in \{1, 2, \ldots, k\}$; the second layer contains $k$ three-dimensional convolution kernels, and the $(k+1-j)$-order convolution kernel $\mathrm{SGC}_{k+1-j}$ is used on the $j$-th channel, $j \in \{1, 2, \ldots, k\}$; the $i$-th channel of the network model consists of the $i$-th channel of the first layer and the $i$-th channel of the second layer, and the output of the $i$-th channel of the first layer is the input of the $i$-th channel of the second layer;
step three: compute the $i$-order convolution kernel $\mathrm{SGC}_i$ [formula not reproduced in this text],
where GCN denotes a graph convolutional neural network without an activation function, and $1 \le i \le k$;
step four: compute the output of the $i$-th channel of the multichannel graph convolutional neural network model
$$y(i) = \mathrm{SGC}_{k+1-i}\big(f(\mathrm{SGC}_i(X, A)),\, A\big),$$
where $1 \le i \le k$ and $f$ is the ReLU function;
step five: compute the model output $Q$ of the multichannel graph convolutional neural network model [formula not reproduced in this text],
where $g$ is the softmax activation function;
step six: compute the loss value of the semi-supervised classification [formula not reproduced in this text],
where $\mu$ is the set of labeled nodes and $Y_{ij}$ refers to the nodes that have classification labels;
step seven: repeat steps three to six until the loss value converges, and take the resulting $Q$ as the classification result of the protein interaction network.
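The forward pass of steps one to five can be sketched as follows. The formulas for the kernel and the model output are not reproduced in this text, so the sketch rests on two loudly stated assumptions: the $i$-order kernel is read as $i$ rounds of normalized-adjacency propagation followed by a single linear map (GCN layers without activation, collapsed), and the model output is read as softmax applied to the sum of the channel outputs. The function names, the toy graph and the hidden width are hypothetical.

```python
import numpy as np

def normalized_adjacency(A):
    """ASSUMED normalization: D^-1/2 (A + I) D^-1/2 with self-loops."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def sgc(order, H, A_hat, W):
    """ASSUMED i-order kernel: propagate `order` times, then one linear map."""
    return np.linalg.matrix_power(A_hat, order) @ H @ W

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)  # subtract row max for stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def forward(A, X, weights):
    """weights: list of k (W1, W2) pairs, one pair per channel."""
    k = len(weights)
    A_hat = normalized_adjacency(A)
    total = 0.0
    for i, (W1, W2) in enumerate(weights, start=1):
        hidden = np.maximum(sgc(i, X, A_hat, W1), 0.0)      # f = ReLU
        total = total + sgc(k + 1 - i, hidden, A_hat, W2)   # y(i), step four
    return softmax(total)  # ASSUMED aggregation: sum channels, then softmax

# Step one on toy data: 6 proteins, interactions as undirected edges.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
N, C, hidden_dim, k = 6, 2, 4, 3
A = np.zeros((N, N))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
X = np.eye(N)  # one-hot features, so X is the identity matrix

rng = np.random.default_rng(0)
weights = [
    (rng.normal(size=(N, hidden_dim)), rng.normal(size=(hidden_dim, C)))
    for _ in range(k)
]
Q = forward(A, X, weights)  # one row of class probabilities per protein
```

With untrained random weights the probabilities are meaningless. The sketch only shows how channel $i$ pairs an order-$i$ kernel with an order-$(k+1-i)$ kernel while the network itself stays two layers deep.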
The technical concept of the invention is as follows: building on a shallow neural network, the invention uses multiple channels to combine different convolution arrangements while aggregating high-order information, which effectively improves the classification accuracy of proteins in the protein interaction network.
The beneficial effects of the invention are as follows: the protein interaction network is processed by combining multichannel high-order neighborhood graph convolution information, which improves protein classification accuracy at a lower computational cost.
Drawings
Fig. 1 is a schematic diagram of the neural network model. For ease of understanding, let k = 3: the input features are fed into the different channels for convolution, the results obtained through the two-layer graph convolution are accumulated and activated, and the output result is finally obtained.
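For the k = 3 case, the pairing of kernel orders per channel follows directly from the rule that channel i uses order i in the first layer and order k + 1 - i in the second. A minimal sketch (the helper name `channel_orders` is hypothetical):

```python
def channel_orders(k):
    """Return (first-layer order, second-layer order) for channels 1..k."""
    return [(i, k + 1 - i) for i in range(1, k + 1)]

orders = channel_orders(3)
# Sum of the two orders on each channel; always k + 1 by construction.
total_order = [first + second for first, second in orders]
```

Here `orders` is `[(1, 3), (2, 2), (3, 1)]`, and the two orders on every channel add up to k + 1 = 4. If an order-i kernel is read as reaching i hops (an assumption, since the kernel formula is not reproduced in this text), every channel covers the same 4-hop neighborhood but splits it differently around the ReLU.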
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1, a protein interaction network node classification method based on a multi-channel graph convolutional neural network includes the following steps:
step one: construct a protein interaction network model $G(V, E)$ from protein interaction data, where $V$ is the set of nodes, $E$ is the set of connecting edges, and the adjacency matrix is denoted by $A$; each node represents a protein, and the node set $V = \{v_1, v_2, \ldots, v_N\}$ represents the set of proteins; if two proteins interact, there is a connecting edge between the two corresponding nodes; $N$ is the number of proteins; the initial feature vector of each protein is a one-hot vector, so the matrix $X$ formed by combining all initial feature vectors is the identity matrix; $C$ is the number of protein classes; a small portion of the proteins have known class labels, and most proteins have no class labels;
step two: construct a multichannel graph convolutional neural network model with a two-layer structure; the first layer contains $k$ channels, and the $i$-order convolution kernel $\mathrm{SGC}_i$ is used on the $i$-th channel, $i \in \{1, 2, \ldots, k\}$; the second layer contains $k$ three-dimensional convolution kernels, and the $(k+1-j)$-order convolution kernel $\mathrm{SGC}_{k+1-j}$ is used on the $j$-th channel, $j \in \{1, 2, \ldots, k\}$; the $i$-th channel of the network model consists of the $i$-th channel of the first layer and the $i$-th channel of the second layer, and the output of the $i$-th channel of the first layer is the input of the $i$-th channel of the second layer;
step three: compute the $i$-order convolution kernel $\mathrm{SGC}_i$ [formula not reproduced in this text],
where GCN denotes a graph convolutional neural network without an activation function, and $1 \le i \le k$;
step four: compute the output of the $i$-th channel of the multichannel graph convolutional neural network model
$$y(i) = \mathrm{SGC}_{k+1-i}\big(f(\mathrm{SGC}_i(X, A)),\, A\big),$$
where $1 \le i \le k$ and $f$ is the ReLU function;
step five: compute the model output $Q$ of the multichannel graph convolutional neural network model [formula not reproduced in this text],
where $g$ is the softmax activation function; the model is shown in Fig. 1;
step six: compute the loss value of the semi-supervised classification [formula not reproduced in this text],
where $\mu$ is the set of labeled nodes and $Y_{ij}$ refers to the nodes that have classification labels;
step seven: repeat steps three to six until the loss value converges, and take the resulting $Q$ as the classification result of the protein interaction network.
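Steps six and seven can be sketched as below. The loss formula is not reproduced in this text, so the code assumes the cross-entropy over labeled nodes that is standard for semi-supervised graph convolution, $-\sum_{i \in \mu} \sum_{j=1}^{C} Y_{ij} \ln Q_{ij}$, with $Y$ read as a one-hot label matrix. The stopping rule `has_converged`, its tolerance and the toy values are hypothetical.

```python
import numpy as np

def masked_cross_entropy(Q, Y, labeled_idx, eps=1e-12):
    """ASSUMED loss: cross-entropy summed over the labeled nodes only."""
    Q_l = Q[labeled_idx]
    Y_l = Y[labeled_idx]
    return float(-(Y_l * np.log(Q_l + eps)).sum())

def has_converged(losses, tol=1e-4, patience=5):
    """Hypothetical rule: loss varied by less than tol over recent steps."""
    if len(losses) <= patience:
        return False
    recent = losses[-(patience + 1):]
    return max(recent) - min(recent) < tol

# Toy model output for 4 proteins and 2 classes; only proteins 0 and 1
# carry labels, so proteins 2 and 3 do not contribute to the loss.
Q = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5], [0.3, 0.7]])
Y = np.array([[1, 0], [0, 1], [0, 0], [0, 0]], dtype=float)
labeled_idx = np.array([0, 1])

loss = masked_cross_entropy(Q, Y, labeled_idx)
predicted_class = Q.argmax(axis=1)  # class assigned to every protein
```

In a full training loop the weights would be updated by a gradient-based optimizer after each loss evaluation, and the loop would stop once a rule such as `has_converged` fires. The unlabeled proteins then receive the class with the highest probability in the corresponding row of Q.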
The specific implementation steps described above make the invention clearer. Any modifications and changes made to the invention within its spirit and the scope of the appended claims fall within the protection scope of the invention.
Claims (1)
1. A protein interaction network node classification method based on a multichannel graph convolutional neural network, characterized by comprising the following steps:
step one: construct a protein interaction network model $G(V, E)$ from protein interaction data, where $V$ is the set of nodes, $E$ is the set of connecting edges, and the adjacency matrix is denoted by $A$; each node represents a protein, and the node set $V = \{v_1, v_2, \ldots, v_N\}$ represents the set of proteins; if two proteins interact, there is a connecting edge between the two corresponding nodes; $N$ is the number of proteins; the initial feature vector of each protein is a one-hot vector, so the matrix $X$ formed by combining all initial feature vectors is the identity matrix; $C$ is the number of protein classes; a small portion of the proteins have known class labels, and most proteins have no class labels;
step two: construct a multichannel graph convolutional neural network model with a two-layer structure; the first layer contains $k$ channels, and the $i$-order convolution kernel $\mathrm{SGC}_i$ is used on the $i$-th channel, $i \in \{1, 2, \ldots, k\}$; the second layer contains $k$ three-dimensional convolution kernels, and the $(k+1-j)$-order convolution kernel $\mathrm{SGC}_{k+1-j}$ is used on the $j$-th channel, $j \in \{1, 2, \ldots, k\}$; the $i$-th channel of the network model consists of the $i$-th channel of the first layer and the $i$-th channel of the second layer, and the output of the $i$-th channel of the first layer is the input of the $i$-th channel of the second layer;
step three: compute the $i$-order convolution kernel $\mathrm{SGC}_i$ [formula not reproduced in this text],
where GCN denotes a graph convolutional neural network without an activation function, and $1 \le i \le k$;
step four: compute the output of the $i$-th channel of the multichannel graph convolutional neural network model
$$y(i) = \mathrm{SGC}_{k+1-i}\big(f(\mathrm{SGC}_i(X, A)),\, A\big),$$
where $1 \le i \le k$ and $f$ is the ReLU function;
step five: compute the model output $Q$ of the multichannel graph convolutional neural network model [formula not reproduced in this text],
where $g$ is the softmax activation function;
step six: compute the loss value of the semi-supervised classification [formula not reproduced in this text],
where $\mu$ is the set of labeled nodes and $Y_{ij}$ refers to the nodes that have classification labels;
step seven: repeat steps three to six until the loss value converges, and take the resulting $Q$ as the classification result of the protein interaction network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011260336.3A CN112364983B (en) | 2020-11-12 | 2020-11-12 | Protein interaction network node classification method based on multichannel graph convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112364983A | 2021-02-12 |
CN112364983B | 2024-03-22 |
Family
ID=74515357
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011260336.3A Active CN112364983B (en) | 2020-11-12 | 2020-11-12 | Protein interaction network node classification method based on multichannel graph convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112364983B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113241114A (en) * | 2021-03-24 | 2021-08-10 | 辽宁大学 | LncRNA-protein interaction prediction method based on graph convolution neural network |
CN113053457B (en) * | 2021-03-25 | 2022-04-05 | 湖南大学 | Drug target prediction method based on multi-pass graph convolution neural network |
CN113539381B (en) * | 2021-07-16 | 2023-09-05 | 中国海洋大学 | Molecular dynamics result analysis method based on residue interaction and PEN |
CN115312119B (en) * | 2022-10-09 | 2023-04-07 | 之江实验室 | Method and system for identifying protein structural domain based on protein three-dimensional structure image |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109522953A (en) * | 2018-11-13 | 2019-03-26 | 北京师范大学 | The method classified based on internet startup disk algorithm and CNN to graph structure data |
CN109977232A (en) * | 2019-03-06 | 2019-07-05 | 中南大学 | A kind of figure neural network visual analysis method for leading figure based on power |
CN110889015A (en) * | 2019-10-31 | 2020-03-17 | 天津工业大学 | Independent decoupling convolutional neural network characterization algorithm for graph data |
CN111563533A (en) * | 2020-04-08 | 2020-08-21 | 华南理工大学 | Test subject classification method based on graph convolution neural network fusion of multiple human brain maps |
CN111916144A (en) * | 2020-07-27 | 2020-11-10 | 西安电子科技大学 | Protein classification method based on self-attention neural network and coarsening algorithm |
Also Published As
Publication number | Publication date |
---|---|
CN112364983A (en) | 2021-02-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||