CN113806546B - Graph neural network adversarial defense method and system based on co-training - Google Patents


Info

Publication number
CN113806546B
CN113806546B
Authority
CN
China
Prior art keywords
graph
data
training
model
node
Prior art date
Legal status
Active
Application number
CN202111166143.6A
Other languages
Chinese (zh)
Other versions
CN113806546A (en)
Inventor
卢凯
邬会军
吴旭刚
王睿伯
张文喆
董勇
张伟
谢旻
周恩强
迟万庆
吴振伟
李佳鑫
Current Assignee
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN202111166143.6A priority Critical patent/CN113806546B/en
Publication of CN113806546A publication Critical patent/CN113806546A/en
Application granted granted Critical
Publication of CN113806546B publication Critical patent/CN113806546B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Algebra (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a graph neural network adversarial defense method and system based on co-training. Each training round comprises: dividing the graph data into data of different views; selecting a corresponding sub-model for each view; training each sub-model on its view's data, feeding unlabeled data into the trained sub-models, and ranking the unlabeled data by the sub-models' prediction confidence; labeling the selected unlabeled data with pseudo labels, adding it to the training data, and co-training the sub-models under each view again with the new training data. Compared with other adversarial defense mechanisms for graph neural networks, the invention moves beyond the empirical, heuristic use of node feature information and defends with a co-training-based fusion model, so that graph structure and node feature information are fully fused within one training framework, yielding a more robust fusion model.

Description

Graph neural network adversarial defense method and system based on co-training
Technical Field
The invention relates to the field of artificial intelligence security, and in particular to a graph neural network adversarial defense method and system based on co-training.
Background
Unlike conventional objects of study, graph data does not live in Euclidean space and is unordered; graph structures can be used to describe a wide variety of real-world problems. For example, social network communities, citations between academic papers, protein molecular structures, and the control flow of computer programs can all be modeled as graphs. Graph neural networks apply deep learning to the analysis of graph data and can be used for tasks such as graph classification, node classification, and link prediction. Typical graph neural networks include graph convolutional networks (GCN), graph sample-and-aggregate networks (GraphSAGE), and graph attention networks (GAT). On graph data, graph neural networks tend to achieve better performance than other types of deep learning models. Like the convolutional neural networks that process images, however, graph neural networks are also vulnerable to adversarial examples. A typical adversarial attack on a graph neural network can be described as: misleading the model into producing wrong predictions with as little perturbation as possible. Taking graph node classification as an example, an adversarial attack misclassifies target nodes by tampering with the attributes of a small number of nodes or with the connection relationships between nodes. Let the graph neural network model be $f$ with loss function $\mathcal{L}$, let the adjacency matrix of the graph data be $A$ and its feature matrix be $X$, and restrict the adversarial attack to minimal changes of the adjacency and feature matrices. The objective of the adversarial attack can then be formalized as:

$$\max_{\hat{A},\hat{X}}\ \sum_{u\in V_t}\mathcal{L}\!\left(f_{\theta^*}(\hat{A},\hat{X})_u,\,y_u\right)\qquad\text{s.t.}\quad\theta^*=\arg\min_{\theta}\,\mathcal{L}_{\text{train}}\!\left(f_{\theta}(A,X)\right),\quad\lVert\hat{A}-A\rVert+\lVert\hat{X}-X\rVert\leq\Delta$$

In the above, $\hat{A}$ denotes the perturbed adjacency matrix, $\hat{X}$ the perturbed feature matrix, $V_t$ the set of attack target nodes, $\mathcal{L}$ the loss function, $f_{\theta^*}(\hat{A},\hat{X})_u$ the trained graph neural network's prediction for target node $u$, $y_u$ the original label of the target node, $\theta^*$ the trained model parameters, $\mathcal{L}_{\text{train}}$ the loss function of the training process, $f_{\theta}(A,X)$ the model's node predictions during training, $A$ the graph data adjacency matrix, $X$ the graph data feature matrix, and $\Delta$ the upper bound on the total perturbation.
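For concreteness, the following minimal sketch evaluates this attack objective and its budget constraint. The `model` callable, the L0-style counting of changed entries, and all names are illustrative assumptions, not part of the patent:

```python
import numpy as np

def perturbation_budget(A, X, A_hat, X_hat):
    """Total perturbation: number of changed adjacency entries plus
    number of changed feature entries (an L0-style budget)."""
    return int(np.sum(A != A_hat) + np.sum(X != X_hat))

def attack_objective(model, A_hat, X_hat, targets, labels, loss_fn):
    """Sum of the trained model's losses on the target nodes under the
    perturbed graph -- the quantity the attacker tries to maximize."""
    preds = model(A_hat, X_hat)            # predictions for all nodes
    return sum(loss_fn(preds[u], labels[u]) for u in targets)

# A perturbation (A_hat, X_hat) is admissible only while
# perturbation_budget(A, X, A_hat, X_hat) <= Delta.
```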
The perturbations available to an attacker generally fall into four categories: a. adding/deleting edges: the attack is carried out by inserting or removing some edges; b. modifying node/edge attributes: the attack is carried out by altering the attributes of some nodes or edges; c. edge rewiring: reconnecting existing edges between different nodes; d. fake node injection: adding nodes that do not exist in the original graph data. Related studies show that although message-passing graph neural networks take both node feature information and structure information into account, in node classification they in practice lean heavily on structure information. Correspondingly, node feature information has little influence on the model's decision, so useful information goes underexploited. This gives adversarial attacks on existing graph neural networks two characteristics: a. perturbing the graph structure is more effective than perturbing node features, so attackers tend to target structure information; b. connecting highly dissimilar nodes in the graph becomes the most effective form of adversarial attack.
Against the above means of attacking graph neural networks, existing defense techniques can be summarized as follows:
and (5) denoising the graph. Before the graph neural network processes the graph data, the graph data is subjected to denoising processing based on the characteristics of normal graph data and attacked graph data, so that some interference edges are attempted to be eliminated, the graph neural network processes the cleaned graph, and errors are avoided. For example, the LowRank defense means observes that the attack component of the graph often appears as a small eigenvalue, and further appears as a high rank part on the eigenvalue spectrum. Based on this, the authors first approximate the graph components, removing the high rank part and making noise reduction, thereby achieving defense. Jaccard-GCN achieves the effect of defending against sample attacks without significantly affecting the model performance by deleting edges between all nodes with very low similarity based on the observation that an attacker tends to connect two dissimilar edges for attack.
Attention mechanisms. Attention can be viewed as soft denoising. Unlike graph denoising methods that delete perturbed edges outright, attention-based defenses are integrated into model training: each edge is assigned a learned attention weight (high for trustworthy edges, low for perturbed edges), so that during aggregation the graph neural network focuses on the parts of the graph structure most meaningful for correct classification, improving the robustness of the model.
Adversarial training. Adversarial training generates adversarial examples during model training and adds them, together with the original samples, to the training data. In the image domain, adversarial training has proven effective at improving model robustness; the conventional procedure adds adversarial examples to the training set to increase the model's robustness against them. A common problem with this defense is that generating adversarial examples is too expensive; this can be mitigated by generating them on the manifold of the first hidden layer, which reduces computation and sidesteps the discreteness of graph data.
However, existing defenses against adversarial attacks share a shortcoming: they leave the message-passing learning and prediction paradigm of the graph neural network essentially unchanged. They therefore still underuse node features, missing the potential contribution of node feature information to defending against adversarial examples.
Disclosure of Invention
The technical problem the invention aims to solve: addressing the problems in the prior art, the invention provides a graph neural network adversarial defense method and system based on co-training, which fuses a model driven mainly by graph structure information with a model driven mainly by node feature information, thereby achieving a more effective defense for graph neural networks and improving both the robustness and the prediction accuracy of the graph neural network model under adversarial attack.
In order to solve the technical problems, the invention adopts the following technical scheme:
a co-training based graph neural network challenge defense method, comprising:
1) Dividing the graph data into data of different views;
2) Selecting corresponding sub-models for the data of different views;
3) Training corresponding sub-models based on the data of different views, inputting unlabeled data into the trained sub-models, and sequencing the confidence of the unlabeled data according to the sub-models;
4) Selecting K unlabeled data with highest confidence, marking corresponding pseudo labels for the selected unlabeled data according to the prediction result of the sub-model, and adding the pseudo labels to training data, wherein K is a preset super parameter;
5) Re-performing collaborative training on the sub-models under each view by using new training data;
6) Judging whether the round of cooperative training reaches the preset number of cooperative training rounds or not, if not, jumping to execute the step 1) to continue the next round of training; if yes, judging that fusion of the two sub-models is completed, obtaining a mixed model obtained by fusion of the sub-models, and taking the average value of the output of each sub-model as the final output result of the mixed model.
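The following minimal sketch illustrates one round-by-round realization of steps 1) to 6). The sub-model objects `f_feat` and `f_struct` with `fit()`/`predict_proba()` methods are hypothetical, and their signatures are assumptions rather than an API defined by the invention:

```python
import numpy as np

def co_train(f_feat, f_struct, A, X, y, labeled, unlabeled, K, rounds):
    """Each round: train both sub-models on their own views, let each
    pseudo-label its K most confident unlabeled nodes, and grow the
    training set (steps 3)-5) above)."""
    labeled, unlabeled = set(labeled), set(unlabeled)
    y = np.array(y)
    for _ in range(rounds):
        idx = sorted(labeled)
        f_feat.fit(X[idx], y[idx])              # node feature view
        f_struct.fit(A, X, idx, y[idx])         # graph structure view
        for predict in (lambda nodes: f_feat.predict_proba(X[nodes]),
                        lambda nodes: f_struct.predict_proba(A, X)[nodes]):
            cand = sorted(unlabeled)
            proba = predict(cand)
            top = np.argsort(proba.max(axis=1))[::-1][:K]
            for i in top:
                u = cand[i]
                y[u] = int(proba[i].argmax())   # pseudo label
                labeled.add(u)
                unlabeled.discard(u)
    return f_feat, f_struct

def hybrid_predict(f_feat, f_struct, A, X):
    """Step 6): the hybrid model averages the sub-models' outputs."""
    return (f_feat.predict_proba(X) + f_struct.predict_proba(A, X)) / 2.0
```

In a full implementation, the pseudo-label selection inside the loop would also apply the class-proportional screening described further below.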
Optionally, the data of the different views in step 1) comprises graph data X of a node feature view and graph data A of a graph structure view, where the graph data X of the node feature view reflects the features of the nodes themselves, and the graph data A of the graph structure view reflects the structural information of the nodes in the graph topology.
Optionally, selecting corresponding sub-models for the data of the different views in step 2) means: for the graph data X of the node feature view, selecting a node feature model $f_{feat}$ that determines node classes from node features and is insensitive to graph structure information; for the graph data A of the graph structure view, selecting a graph structure model $f_{struct}$ that determines node classes from graph structure information and is insensitive to node features.
Optionally, the node feature model $f_{feat}$ is a multi-layer perceptron model, formalized as:

$$H^{(l+1)}=\sigma\left(\theta_l H^{(l)}+b_l\right)$$

where $H^{(l+1)}$ is the node representation at layer $l+1$, $H^{(l)}$ is the node representation at layer $l$, $\sigma$ is the nonlinear activation function employed, and $\theta_l$ and $b_l$ are the parameters of the multi-layer perceptron to be trained.
Optionally, the node feature model $f_{feat}$ is a k-nearest-neighbor model: the k-nearest-neighbor algorithm finds, for each node, the k most similar nodes and connects them to construct a graph structure, and this graph structure together with the node features is fed into the graph structure model $f_{struct}$ for classification.
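A sketch of the k-nearest-neighbor graph construction follows; cosine similarity is an assumption here, since the text does not fix the similarity measure:

```python
import numpy as np

def knn_graph(X, k):
    """Connect each node to its k most similar nodes, yielding the
    graph structure that is then fed, with the node features, into
    the graph structure model f_struct."""
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    S = Xn @ Xn.T                        # pairwise cosine similarity
    np.fill_diagonal(S, -np.inf)         # exclude self loops
    A = np.zeros(S.shape)
    for u in range(X.shape[0]):
        for v in np.argsort(S[u])[::-1][:k]:
            A[u, v] = A[v, u] = 1.0      # undirected kNN edge
    return A
```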
Optionally, the graph structure model $f_{struct}$ is a graph convolutional neural network model, formalized as:

$$X^{(l+1)}=\sigma\left(\hat{A}X^{(l)}W^{(l)}\right)$$

where $X^{(l+1)}$ is the node representation at layer $l+1$, $\sigma$ is the nonlinear activation function employed, $\hat{A}$ is the regularized adjacency matrix, $X^{(l)}$ is the node representation at layer $l$, and $W^{(l)}$ is the parameter of layer $l$ to be trained.
Optionally, the graph structure model $f_{struct}$ is a graph spectral model whose feature extraction operates on both the graph topology A and the second-order adjacency matrix $A^2$ of the graph topology.
Optionally, labeling the selected unlabeled data with pseudo labels and adding it to the training data in step 4) specifically means: following the class distribution of the original data in the sub-models' predictions, selecting, per class and in the original proportions, the unlabeled samples with the highest confidence and adding them to the training data.
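A possible sketch of this class-proportional screening, assuming `proba` holds a sub-model's class probabilities for the unlabeled samples and `class_share` holds each class's fraction in the original labeled data (both names are illustrative):

```python
import numpy as np

def proportional_pseudo_labels(proba, class_share, K):
    """Select up to K unlabeled samples: per predicted class, take the
    highest-confidence samples, with a quota proportional to that
    class's share of the original data."""
    pred = proba.argmax(axis=1)
    conf = proba.max(axis=1)
    chosen = []
    for c, share in enumerate(class_share):
        quota = int(round(K * share))              # per-class quota
        members = np.where(pred == c)[0]
        best = members[np.argsort(conf[members])[::-1][:quota]]
        chosen.extend(best.tolist())
    return chosen          # indices; the pseudo label of i is pred[i]
```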
In addition, the invention also provides a co-training based graph neural network adversarial defense system, comprising a microprocessor and a memory connected with each other, wherein the microprocessor is programmed or configured to execute the steps of the co-training based graph neural network adversarial defense method.
Furthermore, the invention provides a computer-readable storage medium storing a computer program programmed or configured to perform the co-training based graph neural network adversarial defense method.
Compared with the prior art, the invention has the following advantages. Each training round comprises dividing the graph data into data of different views; selecting corresponding sub-models for the data of the different views; training the corresponding sub-models on the data of the different views, feeding unlabeled data into the trained sub-models, and ranking the unlabeled data by the sub-models' prediction confidence; selecting the K most confident unlabeled samples, labeling them with pseudo labels according to the sub-models' predictions, adding them to the training data, and co-training the sub-models under each view again with the new training data. Compared with other adversarial defense mechanisms for graph neural networks, the invention moves beyond the empirical, heuristic use of node feature information and defends with a co-training-based fusion model, so that graph structure and node feature information are fully fused within one training framework, yielding a more robust fusion model.
Drawings
FIG. 1 is a schematic diagram of the basic flow of the method of the present invention.
Fig. 2 is a schematic diagram of a method according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a collaborative training process according to an embodiment of the present invention.
Detailed Description
The invention aims to fuse a model driven mainly by graph structure information with a model driven mainly by node feature information, so as to achieve a more effective adversarial defense for graph neural networks and improve both the robustness and the prediction accuracy of the graph neural network model under adversarial attack. Model fusion is an important approach to adversarial defense. For a model, its adversarial subspace is defined as the region of sample space containing the adversarial examples that affect it. A model's adversarial examples tend to concentrate in a contiguous region, and the size of this region determines how easily adversarial examples transfer between models. The core idea of defending through model fusion is to shrink the adversarial subspace shared between sub-models, making it harder for an adversarial example to transfer between them; since the classification result of the fusion model is decided by all sub-models together, the reduced transferability of adversarial examples improves the robustness of the model.
The crux of adversarial defense by model fusion is ensuring that the adversarial subspace shared between models is small enough. There are currently two main ideas. The first shrinks the shared adversarial subspace by promoting gradient diversity among the sub-models during training: a large shared adversarial subspace arises when the loss gradients of the sub-models are highly aligned, so introducing a regularization term that makes the sub-models' loss gradients diverse and uncorrelated reduces the shared adversarial space between them and makes the fusion model more robust. The second shrinks the shared adversarial subspace by promoting diversity in the sub-models' non-maximal classification outputs: when different sub-models behave differently on the non-maximal class scores (labels), they err in diverse ways, which improves the robustness of the fusion model. For graph data, since the data itself carries two mutually orthogonal kinds of information (graph structure information and node feature information), models trained separately on each are in principle independent of each other; their adversarial subspaces are likewise mutually orthogonal, so combining the two models effectively improves the robustness of the fusion model.
Co-training is a classical model fusion and semi-supervised learning method. It assumes the data possess different views, each of which can be used to train a different classifier, and that classifiers trained on different views complement one another, improving the overall classification. Co-training requires the views to satisfy two properties: 1. Sufficiency: a view is sufficient if a good classification result can be obtained from that view alone. 2. Conditional independence: views are conditionally independent if the cues each classifier draws on are independent across views. Sufficiency guarantees that an effective classifier can be obtained from each view, while conditional independence guarantees that classifiers trained from different views are complementary. The graph data processed by a graph neural network naturally has two sufficient and independent views: the graph structure information view and the node feature information view. Co-training can therefore be used to fuse the two sub-models under these views.
Embodiment one:
As shown in fig. 1, the co-training based graph neural network adversarial defense method of this embodiment comprises:
1) Dividing the graph data into data of different views;
2) Selecting corresponding sub-models for the data of the different views;
3) Training the corresponding sub-models on the data of the different views, feeding unlabeled data into the trained sub-models, and ranking the unlabeled data by the sub-models' prediction confidence;
4) Selecting the K unlabeled samples with the highest confidence, labeling them with pseudo labels according to the sub-models' predictions, and adding them to the training data, where K is a preset hyperparameter;
5) Co-training the sub-models under each view again with the new training data;
6) Judging whether the preset number of co-training rounds has been reached; if not, jumping back to step 1) to continue with the next round of training; if so, the fusion of the two sub-models is complete, yielding a hybrid model obtained by fusing the sub-models, whose final output is the average of the sub-models' outputs.
Referring to fig. 2, in step 1) of this embodiment, the data of the different views comprises graph data X of a node feature view and graph data A of a graph structure view, where the graph data X of the node feature view reflects the features of the nodes themselves, and the graph data A of the graph structure view reflects the structural information of the nodes in the graph topology.
In step 2) of this embodiment, selecting corresponding sub-models for the data of the different views means: for the graph data X of the node feature view, selecting a node feature model $f_{feat}$ that determines node classes from node features and is insensitive to graph structure information; for the graph data A of the graph structure view, selecting a graph structure model $f_{struct}$ that determines node classes from graph structure information and is insensitive to node features.
Since the goal is to train a fusion model, selecting suitable sub-models for the two views of the graph data (the node feature view and the graph structure view) is critical. The node feature model and the graph structure model used in this embodiment are described below.
In this embodiment, the node feature model $f_{feat}$ is a multi-layer perceptron model, formalized as:

$$H^{(l+1)}=\sigma\left(\theta_l H^{(l)}+b_l\right)$$

where $H^{(l+1)}$ is the node representation at layer $l+1$, $H^{(l)}$ is the node representation at layer $l$, $\sigma$ is the nonlinear activation function employed, and $\theta_l$ and $b_l$ are the parameters of the multi-layer perceptron to be trained.
Since node features take the form of vectors, the method of this embodiment first considers a multi-layer perceptron (MLP), a classical approach to processing vector inputs. The multi-layer perceptron is a feedforward artificial neural network composed of multiple layers of nodes; each layer is fully connected to the next, and a nonlinear activation function follows each node. During training, the model parameters are fitted to the training data by backpropagation.
In this embodiment, the graph structure model $f_{struct}$ is a graph convolutional neural network model, formalized as:

$$X^{(l+1)}=\sigma\left(\hat{A}X^{(l)}W^{(l)}\right)$$

where $X^{(l+1)}$ is the node representation at layer $l+1$, $\sigma$ is the nonlinear activation function employed, $\hat{A}$ is the regularized adjacency matrix, $X^{(l)}$ is the node representation at layer $l$, and $W^{(l)}$ is the parameter of layer $l$ to be trained. The graph convolutional neural network is a classical model for processing graph data, and the method of this embodiment uses it as the model for capturing graph structure information.
After the sub-models are selected, how to integrate them into a fusion model is the other key of the method of this embodiment, which fuses the two sub-models by co-training. Referring to fig. 2 and 3, the method first divides the graph data into two views, a graph structure view and a node feature view, and the data under each view is sent to the corresponding sub-model for training. After one training stage ends, each sub-model produces predictions for the unlabeled nodes, and the method labels the K nodes in which each of the two sub-models is most confident with the corresponding pseudo labels and adds them to the training data, where K is a preset hyperparameter related to the data size. The next training stage repeats this action, adding new pseudo-labeled data until the preset number of iteration rounds is reached. Conventional co-training selects pseudo-labeled nodes purely by model confidence, which easily skews the class balance of the training data. To prevent such data imbalance during training, when pseudo-labeled data is added, the highest-confidence unlabeled samples are selected per class in proportion to the class distribution of the original data. Therefore, in this embodiment, labeling the selected unlabeled data with pseudo labels and adding it to the training data in step 4) specifically means: following the class distribution of the original data in the sub-models' predictions, selecting, per class and in the original proportions, the unlabeled samples with the highest confidence and adding them to the training data.
In summary, the problem this embodiment addresses is that mainstream graph neural network models train and predict through a message-passing mechanism. Taking node classification as an example, this makes the model rely heavily on graph structure information at prediction time, while the aggregation process keeps blurring node feature information, so the features contribute too little information to the classification. Adversarial attacks can therefore succeed by tampering with a small amount of graph structure information. Existing adversarial defenses exploit the characteristics of such structural tampering and use node feature information only indirectly, through node similarity and the like, defending by deleting edges and similar means. Such use of feature information is typically empirical and may not generalize across diverse graph data. This embodiment therefore introduces a node feature model and organically integrates the graph structure model and the node feature model through co-training. Because the sub-models are trained from two mutually orthogonal views, their shared adversarial subspace is small, and the corresponding fusion model defends better when facing adversarial attacks.
The method specifically comprises: first, the graph data is divided into a node feature view reflecting the features of the nodes themselves and a graph structure view reflecting the structural information of the nodes in the graph topology. After the two views are obtained, a model is selected for each. The model under the node feature view determines node classes mainly from node features and is insensitive to graph structure information; correspondingly, the model under the graph structure view determines node classes mainly from graph structure information and is insensitive to node features. Hence, when the model under one view is attacked, the model under the other view is unaffected, and the knowledge of the unaffected model can be transferred through the subsequent model fusion process to correct the attacked model. After the models are selected, they are first trained independently under the two views (as shown in fig. 2). Once the trained sub-models are obtained, the unlabeled data is fed into them and ranked by the sub-models' prediction confidence. The K most confident unlabeled samples (K is a preset hyperparameter) are selected, labeled with pseudo labels according to the sub-models' predictions, and added to the training data. The sub-models under each view are then trained again with the new training data, and the above steps repeat until the preset number of co-training rounds is reached. Compared with other adversarial defense mechanisms for graph neural networks, this embodiment moves beyond the empirical, heuristic use of node feature information and defends with a co-training-based fusion model, so that graph structure and node feature information are fully fused within one training framework, yielding a more robust fusion model.
In addition, this embodiment also provides a co-training based graph neural network adversarial defense system, comprising a microprocessor and a memory connected with each other, wherein the microprocessor is programmed or configured to perform the steps of the foregoing co-training based graph neural network adversarial defense method.
Furthermore, this embodiment also provides a computer-readable storage medium storing a computer program programmed or configured to perform the foregoing co-training based graph neural network adversarial defense method.
Embodiment two:
This embodiment is basically the same as the first embodiment; the main difference is the node feature model $f_{feat}$ selected. In this embodiment, the node feature model $f_{feat}$ is a k-nearest-neighbor model: the k-nearest-neighbor algorithm finds, for each node, the k most similar nodes and connects them to construct a graph structure, and this graph structure together with the node features is fed into the graph structure model $f_{struct}$ for classification. With this alternative technical means, the method of this embodiment can likewise solve the technical problem of the first embodiment and achieve substantially the same technical effect.
It should be noted that, in light of the gist of the invention, the key of steps 1) to 2) is dividing the graph data into data of different views and selecting corresponding sub-models for them; thus, even though the data of the different views is defined as the graph data X of the node feature view, reflecting the features of the nodes themselves, and the graph data A of the graph structure view, reflecting the structural information of the nodes in the graph topology, the node feature view should not be understood as depending on any specific node feature model $f_{feat}$, and this is not elaborated further here.
In addition, this embodiment also provides a co-training based graph neural network adversarial defense system, comprising a microprocessor and a memory connected with each other, wherein the microprocessor is programmed or configured to perform the steps of the foregoing co-training based graph neural network adversarial defense method.
Furthermore, this embodiment also provides a computer-readable storage medium storing a computer program programmed or configured to perform the foregoing co-training based graph neural network adversarial defense method.
Embodiment three:
This embodiment is basically the same as the first embodiment; the main difference is the graph structure model $f_{struct}$ selected. In this embodiment, the graph structure model $f_{struct}$ is a graph spectral model whose feature extraction operates on both the graph topology A (the adjacency matrix of the nodes) and its second-order adjacency matrix $A^2$. With this alternative technical means, the method of this embodiment can likewise solve the technical problem of the first embodiment and achieve substantially the same technical effect. Moreover, since the conventional spectral method extracts features only from the graph topology A, the method of this embodiment also extracts features from the second-order adjacency matrix $A^2$ to capture stronger local graph structure information. The graph spectrum is an important attribute reflecting graph structure, often used to capture structural information in tasks such as clustering and community discovery. Concretely: given the graph topology A, first compute the eigendecomposition of its corresponding Laplacian matrix:

$$D^{-1}Ly=\lambda y,$$

where $D$ is the diagonal matrix whose entries are the degrees of the corresponding nodes, $L$ is the graph Laplacian matrix, $y$ is an eigenvector, and $\lambda$ the corresponding eigenvalue. Here $D^{-1}L=I-D^{-1}A$ is the regularized Laplacian matrix, $I$ the identity matrix, and $A$ the graph topology. Writing $\lambda_0\leq\lambda_1\leq\dots\leq\lambda_k$ for the smallest eigenvalues and $y_0,y_1,\dots,y_k$ for the corresponding eigenvectors, each node $u$ can then be characterized as a $k$-dimensional vector $a_u=(y_{1u},y_{2u},\dots,y_{ku})$. With this characterization, the node representations are fed as input into a classification model, e.g., a multi-layer perceptron, for classification.
It should be noted that, in light of the gist of the invention, the key of steps 1) to 2) is dividing the graph data into data of different views and selecting corresponding sub-models for them; thus, even though the data of the different views is defined as above, the graph structure view should not be understood as depending on any specific graph structure model $f_{struct}$, and this is not elaborated further here.
In addition, this embodiment also provides a co-training based graph neural network adversarial defense system, comprising a microprocessor and a memory connected with each other, wherein the microprocessor is programmed or configured to perform the steps of the foregoing co-training based graph neural network adversarial defense method.
Furthermore, this embodiment also provides a computer-readable storage medium storing a computer program programmed or configured to perform the foregoing co-training based graph neural network adversarial defense method.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description covers only preferred embodiments of the invention, and the protection scope of the invention is not limited to the above examples; all technical solutions within the concept of the invention fall within its protection scope. It should be noted that modifications and adaptations that those skilled in the art may make without departing from the principles of the invention are also within the protection scope of the invention.

Claims (7)

1. A graph neural network adversarial defense method based on co-training, comprising:
1) Dividing the graph data into data of different views; the graph data is constructed from protein molecular structures, and the graph neural network analyzes the graph data for graph classification; the data of the different views comprises graph data X of a node feature view and graph data A of a graph structure view, where the graph data X of the node feature view reflects the features of the nodes themselves, and the graph data A of the graph structure view reflects the structural information of the nodes in the graph topology;
2) Selecting corresponding sub-models for the data of the different views, namely: for the graph data X of the node feature view, selecting a node feature model $f_{feat}$ that determines node classes from node features and is insensitive to graph structure information; for the graph data A of the graph structure view, selecting a graph structure model $f_{struct}$ that determines node classes from graph structure information and is insensitive to node features;
3) Training the corresponding sub-models on the data of the different views, feeding unlabeled data into the trained sub-models, and ranking the unlabeled data by the sub-models' prediction confidence;
4) Selecting the K unlabeled samples with the highest confidence, labeling them with pseudo labels according to the sub-models' predictions, and adding them to the training data, where K is a preset hyperparameter; labeling the selected unlabeled data with pseudo labels and adding it to the training data specifically means: following the class distribution of the original data in the sub-models' predictions, selecting, per class and in the original proportions, the unlabeled samples with the highest confidence and adding them to the training data;
5) Co-training the sub-models under each view again with the new training data;
6) Judging whether the preset number of co-training rounds has been reached; if not, jumping back to step 1) to continue with the next round of training; if so, the fusion of the two sub-models is complete, yielding a hybrid model obtained by fusing the sub-models, whose final output is the average of the sub-models' outputs.
2. The co-training based graph neural network adversarial defense method of claim 1, wherein the node feature model $f_{feat}$ is a multi-layer perceptron model, formalized as:

$$H^{(l+1)}=\sigma\left(\theta_l H^{(l)}+b_l\right)$$

where $H^{(l+1)}$ is the node representation at layer $l+1$, $H^{(l)}$ is the node representation at layer $l$, $\sigma$ is the nonlinear activation function employed, and $\theta_l$ and $b_l$ are the parameters of the multi-layer perceptron to be trained.
3. The co-training based graph neural network adversarial defense method of claim 1, wherein the node feature model $f_{feat}$ is a k-nearest-neighbor model: the k-nearest-neighbor algorithm finds, for each node, the k most similar nodes and connects them to construct a graph structure, and this graph structure together with the node features is fed into the graph structure model $f_{struct}$ for classification.
4. The co-training based graph neural network adversarial defense method of claim 1, wherein the graph structure model $f_{struct}$ is a graph convolutional neural network model, formalized as:

$$X^{(l+1)}=\sigma\left(\hat{A}X^{(l)}W^{(l)}\right)$$

where $X^{(l+1)}$ is the node representation at layer $l+1$, $\sigma$ is the nonlinear activation function employed, $\hat{A}$ is the regularized adjacency matrix, $X^{(l)}$ is the node representation at layer $l$, and $W^{(l)}$ is the parameter of layer $l$ to be trained.
5. The co-training based graph neural network adversarial defense method of claim 1, wherein the graph structure model $f_{struct}$ is a graph spectral model whose feature extraction operates on both the graph topology represented by the graph data A of the graph structure view and the second-order adjacency matrix $A^2$ of that topology.
6. A co-training based graph neural network adversarial defense system comprising a microprocessor and a memory connected to each other, wherein the microprocessor is programmed or configured to perform the steps of the co-training based graph neural network adversarial defense method of any of claims 1-5.
7. A computer-readable storage medium storing a computer program programmed or configured to perform the co-training based graph neural network adversarial defense method of any of claims 1-5.
CN202111166143.6A 2021-09-30 2021-09-30 Graph neural network adversarial defense method and system based on co-training Active CN113806546B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111166143.6A CN113806546B (en) Graph neural network adversarial defense method and system based on co-training

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111166143.6A CN113806546B (en) Graph neural network adversarial defense method and system based on co-training

Publications (2)

Publication Number Publication Date
CN113806546A CN113806546A (en) 2021-12-17
CN113806546B true CN113806546B (en) 2024-04-05

Family

ID=78939140

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111166143.6A Active CN113806546B (en) Graph neural network adversarial defense method and system based on co-training

Country Status (1)

Country Link
CN (1) CN113806546B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114510966B (en) * 2022-01-14 2023-04-28 电子科技大学 End-to-end brain causal network construction method based on graph neural network
CN114629798B (en) * 2022-01-27 2023-08-18 清华大学 Multi-agent collaborative planning method and device, electronic equipment and storage medium
CN114722407B (en) * 2022-03-03 2024-05-24 中国人民解放军战略支援部队信息工程大学 Image protection method based on endogenic type countermeasure sample
CN117010448A (en) * 2022-05-20 2023-11-07 腾讯科技(深圳)有限公司 Collaborative training method and device for evidence neural network model
CN115268481B (en) * 2022-07-06 2023-06-20 中国航空工业集团公司沈阳飞机设计研究所 Unmanned aerial vehicle countermeasure policy decision-making method and system thereof
CN114912717B (en) * 2022-07-13 2022-10-25 成都秦川物联网科技股份有限公司 Smart city guarantee housing application risk assessment method and system based on Internet of things
CN115906980B (en) * 2022-11-11 2023-06-30 中南大学 Pedestrian detection method
CN117312810B (en) * 2023-11-30 2024-02-23 中国人民解放军国防科技大学 Incomplete information attack and defense game opponent identification method based on game history tree

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738438A (en) * 2020-07-17 2020-10-02 支付宝(杭州)信息技术有限公司 Method, device and system for training neural network model
CN113269228A (en) * 2021-04-20 2021-08-17 重庆邮电大学 Method, device and system for training graph network classification model and electronic equipment
CN113378160A (en) * 2021-06-11 2021-09-10 浙江工业大学 Graph neural network model defense method and device based on generative confrontation network

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738438A (en) * 2020-07-17 2020-10-02 支付宝(杭州)信息技术有限公司 Method, device and system for training neural network model
CN113269228A (en) * 2021-04-20 2021-08-17 重庆邮电大学 Method, device and system for training graph network classification model and electronic equipment
CN113378160A (en) * 2021-06-11 2021-09-10 浙江工业大学 Graph neural network model defense method and device based on generative confrontation network

Also Published As

Publication number Publication date
CN113806546A (en) 2021-12-17

Similar Documents

Publication Publication Date Title
CN113806546B (en) Graph neural network adversarial defense method and system based on co-training
US11562186B2 (en) Capturing network dynamics using dynamic graph representation learning
Zhu et al. Hard sample aware noise robust learning for histopathology image classification
CN110334742B (en) Graph confrontation sample generation method based on reinforcement learning and used for document classification and adding false nodes
CN115907029B (en) Method and system for defending against federal learning poisoning attack
Sun et al. Can shape structure features improve model robustness under diverse adversarial settings?
Jie et al. Anytime recognition with routing convolutional networks
Ding et al. A semantics-guided graph convolutional network for skeleton-based action recognition
CN117596011A (en) Power grid flow anomaly detection method and system based on countermeasure convolutional neural network
Xu et al. Adversarial robustness in graph-based neural architecture search for edge ai transportation systems
KR102039244B1 (en) Data clustering method using firefly algorithm and the system thereof
Zhang et al. An intrusion detection method based on stacked sparse autoencoder and improved gaussian mixture model
Xu et al. GenDroid: A query-efficient black-box android adversarial attack framework
Wang et al. Attention‐guided black‐box adversarial attacks with large‐scale multiobjective evolutionary optimization
Chen et al. EGC2: Enhanced graph classification with easy graph compression
Sheng et al. Network traffic anomaly detection method based on chaotic neural network
Shao et al. Labeling malicious communication samples based on semi-supervised deep neural network
CN117172875A (en) Fraud detection method, apparatus, device and storage medium
Devasthale et al. Adversarially robust deepfake video detection
Peng et al. Evaluating deep learning for image classification in adversarial environment
Yang et al. DeMAC: Towards detecting model poisoning attacks in federated learning system
Yan et al. Improved SiamFC Target Tracking Algorithm Based on Anti‐Interference Module
Yu et al. Rafl: A robust and adaptive federated meta-learning framework against adversaries
CN115481215A (en) Partner prediction method and prediction system based on temporal partner knowledge graph
Wu et al. Evolving deep parallel neural networks for multi-task learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant