CN115018057A - Robust neural architecture searching method and system for graph neural network - Google Patents

Robust neural architecture searching method and system for graph neural network

Info

Publication number
CN115018057A
CN115018057A
Authority
CN
China
Prior art keywords
gnn
search
architecture
searching
framework
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210744891.6A
Other languages
Chinese (zh)
Inventor
孙亚楠
冯雨麒
王聪
陈红阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Zhejiang Lab
Original Assignee
Sichuan University
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University, Zhejiang Lab filed Critical Sichuan University
Priority to CN202210744891.6A priority Critical patent/CN115018057A/en
Publication of CN115018057A publication Critical patent/CN115018057A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a robust neural architecture search method and system for a graph neural network. The search method comprises: defining a search unit, a search space and a GNN architecture; searching a number of GNN architectures in the search space and training all of them; generating adversarial samples with an adversarial attack method, inputting them into each trained GNN architecture, and taking the output accuracy as the robustness index; according to the robustness indices of GNN architectures with adjacent subscripts, finding the search units that satisfy a preset condition and storing them in a search unit set B*; applying a probability-enhanced search strategy to the search units in the set B* to find the operation corresponding to the maximum weight on each edge of the search space; and constructing the GNN neural network architecture from the operation with the maximum weight found on each edge.

Description

Robust neural architecture searching method and system for graph neural network
Technical Field
The invention relates to the technical field of neural networks, and in particular to a robust neural architecture search method and system for a graph neural network.
Background
In recent years, with the development of NAS technology, efficient and automated neural architecture design methods have been receiving more and more attention from GNN-related researchers. Currently, methods for designing GNN models based on NAS fall into three main categories: reinforcement learning based methods, evolutionary computation based methods, and gradient optimization based methods.
Although the above methods can all search for GNN models with high accuracy on a variety of graph-related tasks, they mainly target the accuracy of GNN models deployed in the absence of adversarial attacks. In practical applications, however, owing to the existence of adversarial samples (data generated by adding perturbations to the input data with a specific algorithm; for graph data the perturbations are mainly the addition or deletion of edges and nodes), the robustness of the GNN model is also a problem that must be considered.
The existing NAS methods for GNNs do not consider the robustness of the searched GNN model, so the searched GNN models show poor robustness when confronted with adversarial samples. For example, when a GNN is applied to identify fraudulent users in a financial system, a fraudulent user can modify the input data of the GNN by establishing connections with users of high credit, so that the GNN predicts it as a user with good credit, which greatly reduces the security of the financial system.
Furthermore, when GNNs are applied to the identification and filtering of spam mail and advertisements, the producers of such content can associate it with normal mail and advertisements, causing the GNN filter to make incorrect predictions, which leads to the failure of the spam and advertisement filtering system.
Therefore, a neural architecture search method that considers both the accuracy and the adversarial robustness of the searched GNN architecture needs to be proposed.
Disclosure of Invention
Aiming at the above defects in the prior art, the robust neural architecture search method and system for a graph neural network provided by the invention solve the problem of the poor robustness of existing methods.
In order to achieve the purpose of the invention, the invention adopts the following technical scheme:
In a first aspect, a robust neural architecture search method for a graph neural network is provided, which includes:
S1, defining a search unit as a two-tuple formed by one edge and an operation contained on that edge, wherein the search space S and a GNN architecture are each a set formed by a number of search units;
S2, searching a number of GNN architectures in the search space, and training all the GNN architectures;
S3, generating adversarial samples by an adversarial attack method, inputting them into each trained GNN architecture, and taking the output accuracy as its robustness index;
S4, according to the robustness indices of GNN architectures with adjacent subscripts, finding the search units that satisfy a preset condition and storing them in a search unit set;
S5, applying a probability-enhanced search strategy to the search units in the search unit set B* to find the operation corresponding to the maximum weight on each edge of the search space;
and S6, constructing the GNN neural network architecture from the operation corresponding to the maximum weight found on each edge.
Further, the step S4 includes:
S41, traversing GNN architectures with adjacent subscripts in the architecture set, and determining whether the robustness index of architecture A_i is greater than that of architecture A_{i+1};
S42, if yes, computing the set difference A_i - A_{i+1} and storing the resulting search units in a set B_1;
S43, otherwise, computing A_{i+1} - A_i and storing the resulting search units in a set B_2;
S44, taking the intersection of the sets B_1 and B_2 and storing the search units in that intersection into the search unit set B*.
Further, the step S5 includes:
S51, reading the weight of each operation on each edge in the search space to form a weight matrix α, and constructing an initial matrix α_R with the same dimensions as α and all elements zero;
S52, setting to 1 the entries of α_R at the positions corresponding to the search units in the search unit set B*, and computing the distance metric ψ(α) between α_R and the weight matrix α;
S53, obtaining, according to the distance metric, the weight matrix α after probability enhancement of the search units in B*:

min_α L_val(ω*(α), α) + λψ(α)
s.t. ω*(α) = argmin_ω L_train(ω, α)

wherein L_train(ω, α) and L_val(ω, α) are the losses of the GNN architecture on the training and validation sets, respectively; λ is a weight variable; ω denotes the trainable parameters in the GNN architecture; and ω*(α) is the value of ω that minimizes the training loss L_train;
S54, selecting, according to the probability-enhanced weight matrix α, the operation corresponding to the maximum weight on each edge of the search space:

o^(i,j) = argmax_{o ∈ O} α_o^(i,j)

wherein o^(i,j) is the operation corresponding to the maximum weight on edge (i, j); argmax returns the value of the argument at which a function attains its maximum; O is the set of all candidate operations on one edge; α_o^(i,j) is the weight of operation o on edge (i, j); and i and j are the start and end nodes of the edge, respectively.
Further, the distance metric ψ(α) is calculated by the formula:

ψ(α) = ‖α - α_R‖_2

wherein ‖·‖_2 denotes the Euclidean (L2) norm.
Further, the number of search units in the search space S is greater than the number of search units in the GNN architecture.
Further, when the constructed GNN neural network architecture is used to classify scientific documents, the datasets for training the GNN architecture are the Cora dataset and the Citeseer dataset;
when the constructed GNN neural network architecture is used to classify spam in a mail system, the dataset for training the GNN architecture is the Spambase dataset;
when the constructed GNN neural network architecture is used to identify fraudulent users in a financial system, the dataset for training the GNN architecture is the PaySim dataset;
and when the constructed GNN neural network architecture is used in a recommendation system to defend against malicious users who inject large amounts of fake user rating information in an attempt to change recommendation results, the dataset for training the GNN architecture is the Retailrocket dataset.
In a second aspect, a robust neural architecture search system for a graph neural network is provided, which includes:
the definition module, used for defining a search unit as a two-tuple formed by one edge and an operation contained on that edge, wherein the search space S and a GNN architecture are each a set formed by a number of search units;
the GNN architecture search training module, used for searching a number of GNN architectures in the search space and training all the GNN architectures;
the robustness index generation module, used for generating adversarial samples by an adversarial attack method, inputting them into each trained GNN architecture, and taking the output accuracy as the robustness index;
the search module, used for finding, according to the robustness indices of GNN architectures with adjacent subscripts, the search units that satisfy a preset condition and storing them in a search unit set;
the probability enhancement module, used for applying a probability-enhanced search strategy to the search units in the search unit set B* to find the operation corresponding to the maximum weight on each edge of the search space;
and the network architecture generation module, used for constructing the GNN neural network architecture from the operation corresponding to the maximum weight found on each edge.
Further, the search module includes:
the search judgment module, used for traversing GNN architectures with adjacent subscripts in the architecture set and determining whether the robustness index of architecture A_i is greater than that of architecture A_{i+1};
the first search unit calculation module, used for computing A_i - A_{i+1} when the output of the search judgment module is yes, and storing the resulting search units in the set B_1;
the second search unit calculation module, used for computing A_{i+1} - A_i otherwise, and storing the resulting search units in the set B_2;
and the search unit determination module, used for taking the intersection of the sets B_1 and B_2 and storing it into the search unit set B*.
Further, the probability enhancement module comprises:
the matrix construction module, used for reading the weight of each operation on each edge in the search space to form a weight matrix α, and constructing an initial matrix α_R with the same dimensions as α and all elements zero;
the distance metric calculation module, used for setting to 1 the entries of α_R at the positions corresponding to the search units in the search unit set B*, and computing the distance metric ψ(α) between α_R and the weight matrix α;
the probability enhancement submodule, used for obtaining, according to the distance metric, the weight matrix α after probability enhancement of the search units in B*:

min_α L_val(ω*(α), α) + λψ(α)
s.t. ω*(α) = argmin_ω L_train(ω, α)

wherein L_train(ω, α) and L_val(ω, α) are the losses of the GNN architecture on the training and validation sets, respectively; λ is a weight variable; ω denotes the trainable parameters in the GNN architecture; and ω*(α) is the value of ω that minimizes the training loss L_train;
and the operation selection module, used for selecting, according to the probability-enhanced weight matrix α, the operation corresponding to the maximum weight on each edge of the search space:

o^(i,j) = argmax_{o ∈ O} α_o^(i,j)

wherein o^(i,j) is the operation corresponding to the maximum weight on edge (i, j); argmax returns the value of the argument at which a function attains its maximum; O is the set of all candidate operations on one edge; α_o^(i,j) is the weight of operation o on edge (i, j); and i and j are the start and end nodes of the edge, respectively.
The invention has the beneficial effects that: from the perspective of the search space, this scheme can mark the robust search units in the search space. Based on the marked robust search units, the search process applies probability enhancement to them while preserving architecture accuracy, so that as many robust search units as possible are selected into the final architecture, which improves the robustness of the final GNN neural network architecture.
Drawings
Fig. 1 is a flowchart of a robust neural architecture search method for a graph neural network.
FIG. 2 is a diagram illustrating how the search units in the search unit set B* are obtained.
Fig. 3 is a comparison of the performance of GNN architectures as the attack strength increases.
Detailed Description
The following description of the embodiments of the present invention is provided to facilitate understanding by those skilled in the art, but it should be understood that the invention is not limited to the scope of these embodiments. To those skilled in the art, various changes are apparent within the spirit and scope of the invention as defined by the appended claims, and all matter produced using the inventive concept is protected.
Referring to fig. 1, which shows a flow diagram of the robust neural architecture search method for a graph neural network: as shown in fig. 1, the method includes steps S1 to S6.
In step S1, a search unit is defined as a two-tuple formed by one edge and an operation contained on that edge; the search space S and a GNN architecture are each a set formed by a number of search units, and the number of search units in the search space S is greater than the number of search units in a GNN architecture. The specific implementation of step S1 is as follows:
In this scheme, a search unit is defined as a two-tuple formed by an edge e and an operation o contained on that edge, denoted (e, o). Based on this definition, the search space S can be defined as a set of search units:

S = {(e, o) | e ∈ ε, o ∈ O}

where ε denotes the set of all edges in the search space and O denotes the set of all operations that may be contained on one edge. Similarly, a GNN architecture is also defined as a set of search units, and an architecture containing n search units can be denoted {b_1, b_2, ..., b_n | b_i ∈ S, i = 1, 2, ..., n}.
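To make the representation concrete, the following minimal Python sketch shows one way to encode search units, the search space S and a GNN architecture as sets of (edge, operation) two-tuples; the operation names and the edge list are assumptions chosen for illustration, not taken from the patent.

```python
from itertools import product

# Hypothetical candidate operations on one edge (the patent does not fix this list).
OPS = ["gcn", "gat", "sage", "skip"]

# Edges of a small DAG-shaped cell, written as (start node i, end node j).
EDGES = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]

# Search space S: every (edge, operation) two-tuple.
S = {(e, o) for e, o in product(EDGES, OPS)}

# One GNN architecture: a subset of S choosing one operation per edge,
# so it always contains fewer search units than S.
arch = {((0, 1), "gcn"), ((0, 2), "gat"), ((1, 2), "skip"),
        ((0, 3), "sage"), ((1, 3), "gcn"), ((2, 3), "gat")}
assert arch <= S and len(arch) < len(S)
```

Representing architectures as plain sets makes the set differences and intersections used later in step S4 direct to compute.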
In step S2, a number of GNN architectures are searched in the search space, and all GNN architectures are trained. This scheme carries out the GNN architecture search with an existing search algorithm; assuming 50 search epochs are set, 50 GNN architectures can be obtained (one GNN architecture per epoch).
During implementation, when training the GNN architectures, this scheme preferably selects the dataset as follows:
when the constructed GNN neural network architecture is used to classify scientific documents, the datasets for training the GNN architecture are the Cora dataset and the Citeseer dataset;
when the constructed GNN neural network architecture is used to classify spam in a mail system, the dataset for training the GNN architecture is the Spambase dataset;
when the constructed GNN neural network architecture is used to identify fraudulent users in a financial system, the dataset for training the GNN architecture is the PaySim dataset;
and when the constructed GNN neural network architecture is used in a recommendation system to defend against malicious users who inject large amounts of fake user rating information in an attempt to change recommendation results, the dataset for training the GNN architecture is the Retailrocket dataset.
In step S3, adversarial samples are generated by an adversarial attack method and input into each trained GNN architecture, and the output accuracy is taken as its robustness index; a sketch of this step is given below.
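The following Python sketch illustrates the idea: perturb a small fraction of the graph's edges, feed the perturbed graph to a trained architecture, and report the resulting accuracy as the robustness index. A random edge flip stands in for the attack methods used in the experiments (Random attack, DICE, node embedding attack), and `gnn.predict` together with the `graph.nodes`/`graph.edges` attributes are assumed interfaces, not APIs from the patent.

```python
import copy
import random

def robustness_index(gnn, graph, labels, perturb_rate=0.05):
    """Perturb perturb_rate of the edges (add or delete), then measure
    the trained GNN's accuracy on the perturbed graph."""
    g = copy.deepcopy(graph)           # graph: object with .nodes (list) and .edges (set)
    n_flip = max(1, int(perturb_rate * len(g.edges)))
    for _ in range(n_flip):
        u, v = random.sample(g.nodes, 2)
        if (u, v) in g.edges:
            g.edges.remove((u, v))     # delete an existing edge
        else:
            g.edges.add((u, v))        # add a spurious edge
    preds = gnn.predict(g)             # assumed inference API of the trained architecture
    correct = sum(int(p == y) for p, y in zip(preds, labels))
    return correct / len(labels)
```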
In step S4, according to the robustness indices of GNN architectures with adjacent subscripts, the search units satisfying a preset condition are found and stored in the search unit set.
In one embodiment of the present invention, the step S4 includes:
S41, traversing GNN architectures with adjacent subscripts in the architecture set, and determining whether the robustness index of architecture A_i is greater than that of architecture A_{i+1};
S42, if yes, computing the set difference A_i - A_{i+1} and storing the resulting search units in the set B_1;
S43, otherwise, computing A_{i+1} - A_i and storing the resulting search units in the set B_2;
S44, taking the intersection of the sets B_1 and B_2 and storing the search units in that intersection into the search unit set B*.
In this way, the scheme obtains the search units with better robustness, which are used to construct a robust search space; searching on the basis of this search space is more conducive to finding a robust GNN architecture.
To facilitate understanding of how the search units in the search unit set B* are obtained, the specific implementation of step S4 is described below with reference to fig. 2; a code sketch follows the example.
For architectures A_m and A_{m+1} in fig. 2, suppose their robustness satisfies the relation R_m > R_{m+1}; then the search units in A_m - A_{m+1} should be added to the set B_1. Architectures A_m and A_{m+1} differ in one search unit, indicated by the dashed arrow in the figure, so the search unit ((0, 3), operation 3) is added to B_1.
Similarly, for architectures A_n and A_{n+1}, their robustness satisfies R_n < R_{n+1}, so the search units in A_{n+1} - A_n should be added to the set B_2. Architectures A_n and A_{n+1} also differ in one search unit, indicated by the dashed arrow in the figure; A_{n+1} - A_n contains the search unit ((0, 3), operation 3), which is added to B_2.
Finally, when the intersection of the sets B_1 and B_2 is taken, the search unit ((0, 3), operation 3) is stored in the search unit set B*.
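The set arithmetic of steps S41 to S44 reduces to a few lines of Python; the function and variable names below are illustrative, not from the patent.

```python
def collect_robust_units(archs, robustness):
    """archs: list [A_1, ..., A_k] of architectures, each a set of
    (edge, operation) search units, ordered by search epoch;
    robustness[i]: robustness index of archs[i]."""
    B1, B2 = set(), set()
    for i in range(len(archs) - 1):
        if robustness[i] > robustness[i + 1]:
            B1 |= archs[i] - archs[i + 1]     # units only the more robust A_i has
        else:
            B2 |= archs[i + 1] - archs[i]     # units only the more robust A_{i+1} has
    return B1 & B2                            # B*: units confirmed from both collections
```

In the FIG. 2 example, the unit ((0, 3), operation 3) enters B1 from the pair (A_m, A_{m+1}) and B2 from the pair (A_n, A_{n+1}), so it survives the intersection and lands in B*.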
In step S5, a probability-enhanced search strategy is applied to the search units in the search unit set B* to find the operation corresponding to the maximum weight on each edge of the search space.
in one embodiment of the present invention, the step S5 includes:
s51, reading the weight of each operation of each edge in the search space to form a weight matrix alpha, and constructing an initial matrix alpha which has the same dimension as the weight matrix alpha and has zero elements R
S52, initializing the matrix alpha R Corresponds to a search unit set B * The weight of the position of the middle search cell is set to 1, and an initial matrix alpha is calculated R Distance metric ψ (α) from weight matrix α:
Figure BDA0003719149600000091
wherein | · | purple sweet 2 Is the euclidean distance.
S53, obtaining a pair search unit set B according to the distance measurement * The weight matrix alpha after the probability enhancement is carried out by the medium search unit:
Figure BDA0003719149600000092
s.t.ω * (α)=argmin ω L train (ω,α)
wherein L is train (omega, alpha) and L val (ω,α) is the loss of GNN framework on the training/validation set, respectively; λ is a weight variable; ω is a trainable parameter in the GNN architecture; omega * (α) is the loss L of the GNN framework on the training set train The value of the trainable parameter ω corresponding to the minimum;
s54, selecting the operation corresponding to the maximum weight of each edge in the search space according to the weight matrix alpha after probability enhancement:
Figure BDA0003719149600000101
wherein o is (i,j) An operation corresponding to the maximum weight of the edge; argmax is a function of the value of an independent variable when a function is solved and the maximum value is obtained;
Figure BDA0003719149600000102
a set of all operations for one edge;
Figure BDA0003719149600000103
is the weight of operation o on edge (i, j); i and j are the start and end points of the edge, respectively.
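An illustrative PyTorch sketch of steps S51 to S54 follows. It assumes a DARTS-style setup in which the inner problem ω*(α) is already approximated inside `val_loss_fn` (for example by one SGD step on L_train); `unit_index`, `val_loss_fn` and the other names are assumptions for illustration, not APIs from the patent.

```python
import torch

def probability_enhanced_step(alpha, B_star, unit_index, val_loss_fn, lam=0.01):
    """One optimization step on the architecture weights alpha.
    alpha: [num_edges, num_ops] tensor with requires_grad=True.
    B_star: set of robust search units ((i, j), op).
    unit_index: maps a search unit to its (row, col) position in alpha."""
    # S51/S52: zero matrix alpha_R, with 1 at the positions of units in B*.
    alpha_R = torch.zeros_like(alpha)
    for unit in B_star:
        r, c = unit_index[unit]
        alpha_R[r, c] = 1.0
    # Distance metric psi(alpha) = ||alpha - alpha_R||_2; minimizing it pulls
    # the weights of the robust units toward 1 (the probability enhancement).
    psi = torch.norm(alpha - alpha_R, p=2)
    # S53: minimize L_val(w*(alpha), alpha) + lambda * psi(alpha) w.r.t. alpha.
    loss = val_loss_fn(alpha) + lam * psi
    loss.backward()
    return loss

def select_operations(alpha, edges, ops):
    """S54/S6: on every edge, keep the operation with the largest weight."""
    best = alpha.argmax(dim=1)
    return {edges[i]: ops[int(best[i])] for i in range(len(edges))}
```

Adding λψ(α) to the validation loss is what trades off accuracy against robustness: the smaller ψ(α) becomes, the more probable it is that the units in B* win the final argmax.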
In step S6, the GNN neural network architecture is constructed from the operation corresponding to the maximum weight found on each edge.
The scheme also provides a robust neural architecture search system for the graph neural network, which comprises:
the definition module, used for defining a search unit as a two-tuple formed by one edge and an operation contained on that edge, wherein the search space S and a GNN architecture are each a set formed by a number of search units;
the GNN architecture search training module, used for searching a number of GNN architectures in the search space and training all the GNN architectures;
the robustness index generation module, used for generating adversarial samples by an adversarial attack method, inputting them into each trained GNN architecture, and taking the output accuracy as the robustness index;
the search module, used for finding, according to the robustness indices of GNN architectures with adjacent subscripts, the search units that satisfy a preset condition and storing them in a search unit set.
In implementation, the search module in this scheme preferably comprises:
the search judgment module, used for traversing GNN architectures with adjacent subscripts in the architecture set and determining whether the robustness index of architecture A_i is greater than that of architecture A_{i+1};
the first search unit calculation module, used for computing A_i - A_{i+1} when the output of the search judgment module is yes, and storing the resulting search units in the set B_1;
the second search unit calculation module, used for computing A_{i+1} - A_i otherwise, and storing the resulting search units in the set B_2;
and the search unit determination module, used for taking the intersection of the sets B_1 and B_2 and storing it into the search unit set B*.
The probability enhancement module is used for applying a probability-enhanced search strategy to the search units in the search unit set B* to find the operation corresponding to the maximum weight on each edge of the search space.
in one embodiment of the present invention, the probability enhancing module comprises:
a matrix construction module for reading the weight of each operation of each edge in the search space to form a weight matrix alpha and constructing an initial matrix alpha which has the same dimensionality as the weight matrix alpha and has zero elements R
A distance metric calculation module for calculating an initial matrix alpha R Corresponds to a search unit set B * The weight of the position of the middle search cell is set to 1, and an initial matrix alpha is calculated R A distance metric ψ (α) from the weight matrix α;
a probability enhancement submodule for obtaining a search unit set B according to the distance measurement * The weight matrix alpha after the probability enhancement is carried out by the medium search unit:
Figure BDA0003719149600000111
s.t.ω * (α)=argmin ω L train (ω,α)
wherein L is train (omega, alpha) and L val (ω, α) is the loss of GNN framework on the training/validation set, respectively; λ is a weight variable; ω is a trainable parameter in the GNN architecture; omega * (α) is the loss L of the GNN framework on the training set train The value of the trainable parameter ω corresponding to the minimum;
an operation selection module, configured to select, according to the weight matrix α after probability enhancement, an operation corresponding to the maximum weight of each edge in the search space:
Figure BDA0003719149600000112
wherein o is (i,j) An operation corresponding to the maximum weight of the edge; argmax is a function of the value of an independent variable when a function is solved and the maximum value is obtained;
Figure BDA0003719149600000113
a set of all operations for one edge;
Figure BDA0003719149600000114
is the weight of operation o on edge (i, j); i and j are the start and end points of the edge, respectively.
And the network architecture generating module is used for constructing and forming the GNN neural network architecture by adopting the operation corresponding to the searched maximum weight of each edge.
The performance of the GNN neural network architecture searched by this scheme is evaluated below with specific cases.
A. Data set construction and evaluation index
The performance of the GNN neural network architecture constructed by this scheme was evaluated using the standard GNN datasets Cora and CiteSeer, where the evaluation is divided into two parts: 1) the original data in the dataset are input into the GNN and the output accuracy is obtained, mainly evaluating the precision of the GNN architecture; 2) data obtained by adding perturbations to the original data with an adversarial attack algorithm are input into the GNN and the output accuracy is obtained, mainly evaluating the robustness of the GNN architecture. The Cora and CiteSeer datasets contain scientific publications of multiple categories and their citations; each node in a dataset represents a scientific publication, and each edge represents a citation relationship between two publications.
The classification accuracy on the node classification task is adopted as the evaluation index of GNN architecture performance, and the architectures are evaluated under two conditions: adversarial training and standard training. Adversarial training means generating adversarial samples with an adversarial attack algorithm and using them as the model's training data, mainly evaluating the robustness of the GNN architecture when defensive measures exist. Standard training uses the raw data in the dataset, mainly evaluating the robustness of the GNN architecture itself. In addition, to demonstrate the transferability of the architecture, the architecture searched on the Cora dataset was trained and evaluated directly on the CiteSeer dataset.
When evaluating the robustness of the architectures, this scheme adopts three attack methods: Random attack, Delete Internally, Connect Externally (DICE), and Node embedding attack, all with a perturbation rate of 5% of the number of edges in the graph.
B. Details of the experiment
When evaluating the performance of the method, it is applied to SANE, a commonly used gradient-based NAS method for GNNs, to search for a robust GNN architecture, and the performance of the searched GNN architecture is then evaluated.
C. Algorithm parameter setting
Search process: following the general setup of search algorithms based on gradient optimization, the number of epochs is set to 200, and the learnable parameters of the architecture are optimized using stochastic gradient descent (SGD) with a learning rate of 0.025, a momentum of 0.9 and a weight decay of 0.0003. Adam is used to optimize the architecture parameters, with a learning rate of 0.0003 and a weight decay of 0.001. The weight parameter λ in the objective function is set to 0.01. These settings can be expressed as the sketch below.
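For concreteness, the hyperparameters above translate into the following PyTorch sketch; the parameter shapes and variable names are placeholders, not taken from the patent.

```python
import torch

# Placeholder parameters standing in for the architecture's trainable weights w
# and the operation-weight matrix alpha ([edges, ops]).
w_params = [torch.nn.Parameter(torch.randn(16, 16))]
alpha = torch.nn.Parameter(torch.zeros(6, 4))

w_opt = torch.optim.SGD(w_params, lr=0.025, momentum=0.9, weight_decay=3e-4)
alpha_opt = torch.optim.Adam([alpha], lr=3e-4, weight_decay=1e-3)
LAMBDA = 0.01   # weight of the distance metric psi(alpha) in the objective
EPOCHS = 200    # search epochs
```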
Evaluation process: the final performance of the model was evaluated after 600 epochs of training using SGD with a learning rate of 0.025, a momentum of 0.9 and a weight decay of 0.0003. To achieve a fair comparison, all GNN models were trained with the same settings.
D. Comparison of results
The method of this scheme is denoted REP. The comparison methods include manually constructed GNN models (GCN, GAT and GraphSAGE) and a model searched by a gradient-optimization-based method (SANE). All data are the mean and standard deviation of the node classification accuracy (%) recorded over five training runs of each model. The best results are marked in bold.
Table 1 Evaluation results of adversarial training under the Cora dataset
[Table 1 is provided as an image in the original publication; its data are not reproduced here.]
As can be seen from the results in Table 1, when all GNN models are adversarially trained, the classification accuracy of the GNN model searched by this method is higher than that of the other GNN models both with and without adversarial attack. This shows that, when defensive measures exist, the GNN model searched by this method has better accuracy and robustness than the manually constructed GNN models and the GNN model searched by the gradient optimization method.
Table 2 Evaluation results of standard training under the Cora dataset
[Table 2 is provided as an image in the original publication; its data are not reproduced here.]
From the results in Table 2, when all models undergo standard training, the model searched by the method of the present invention still achieves higher accuracy than the other GNN models both on the original data and under adversarial attack. When the influence of adversarial training on robustness is removed, the differences in robustness among the models come mainly from their internal structures. The experimental data in Table 2 therefore show that, in terms of model structure, the GNN model searched by the method of the present invention is more robust than the other GNN models.
Table 3 Evaluation results of transferability under the CiteSeer dataset
[Table 3 is provided as an image in the original publication; its data are not reproduced here.]
From the results in Table 3, when the GNN model searched on the Cora dataset by the method of this embodiment is trained directly on the CiteSeer dataset, it achieves the highest classification accuracy both on the original data and under adversarial attack. In the experiments, each GNN model was kept the same as in the experiments on the Cora dataset to ensure fairness of the comparison. The experimental results show that the GNN model searched by this method has good transferability and can maintain good classification accuracy and robustness on other datasets.
Meanwhile, to demonstrate the superior robustness of REP more fully, the perturbation rate of the DICE adversarial attack algorithm was set to 0%, 10%, 20%, 30%, 40% and 50% in turn, so as to compare the performance of the GNN architectures as the attack strength gradually increases; the comparison results are shown in FIG. 3.
From the results of fig. 3, it is easy to see that the performance of GCN, GAT, GraphSAGE and SANE degrades very significantly as the strength of the adversarial attack increases. In contrast, the GNN architecture searched by the REP method declines at a much smaller rate and always maintains the highest accuracy among all the models, which shows that it has good robustness.
E. Advantages in practical scenarios
Similar to the node classification task on the Cora and CiteSeer datasets, the problems to be solved when GNNs are applied to practical scenarios, such as detecting fraudulent users in financial systems or detecting and filtering spam mail and advertisements, are still node classification problems. Specifically, in the detection of fraudulent users in financial systems, the task of the GNN is to classify each node in the input data as a fraudulent user or a user with good credit; in the detection and filtering of spam mail and advertisements, the task of the GNN is to classify the nodes in the input data as normal mail and advertisements or spam mail and advertisements.
Compared with traditional manually constructed GNN models such as GCN, GAT and GraphSAGE, and with GNN models searched by gradient-optimization-based NAS methods such as SANE, the method of this scheme can search GNN models with good classification accuracy and robustness on the node classification task. Thus, even after the graph data input to the GNN are perturbed, for example when a fraudulent user in the input data establishes links with users of good credit, or when spam mail and advertisements are associated with normal mail and advertisements, the searched GNN model can still make correct predictions.
In summary, since the GNN architecture searched by the method of the present invention has good robustness, applying the method in practical scenarios is very advantageous for improving the security of the systems in those scenarios.

Claims (9)

1. A robust neural architecture search method for a graph neural network, characterized by comprising the following steps:
S1, defining a search unit as a two-tuple formed by one edge and an operation contained on that edge, wherein the search space S and a GNN architecture are each a set formed by a number of search units;
S2, searching a number of GNN architectures in the search space, and training all the GNN architectures;
S3, generating adversarial samples by an adversarial attack method, inputting them into each trained GNN architecture, and taking the output accuracy as its robustness index;
S4, according to the robustness indices of GNN architectures with adjacent subscripts, finding the search units that satisfy a preset condition and storing them in a search unit set;
S5, applying a probability-enhanced search strategy to the search units in the search unit set B* to find the operation corresponding to the maximum weight on each edge of the search space;
and S6, constructing the GNN neural network architecture from the operation corresponding to the maximum weight found on each edge.
2. The robust neural architecture search method for a graph neural network according to claim 1, wherein the step S4 includes:
S41, traversing GNN architectures with adjacent subscripts in the architecture set, and determining whether the robustness index of architecture A_i is greater than that of architecture A_{i+1};
S42, if yes, computing the set difference A_i - A_{i+1} and storing the resulting search units in a set B_1;
S43, otherwise, computing A_{i+1} - A_i and storing the resulting search units in a set B_2;
S44, taking the intersection of the sets B_1 and B_2 and storing the search units in that intersection into the search unit set B*.
3. The robust neural architecture search method for a graph neural network according to claim 1, wherein the step S5 includes:
S51, reading the weight of each operation on each edge in the search space to form a weight matrix α, and constructing an initial matrix α_R with the same dimensions as α and all elements zero;
S52, setting to 1 the entries of α_R at the positions corresponding to the search units in the search unit set B*, and computing the distance metric ψ(α) between α_R and the weight matrix α;
S53, obtaining, according to the distance metric, the weight matrix α after probability enhancement of the search units in B*:

min_α L_val(ω*(α), α) + λψ(α)
s.t. ω*(α) = argmin_ω L_train(ω, α)

wherein L_train(ω, α) and L_val(ω, α) are the losses of the GNN architecture on the training and validation sets, respectively; λ is a weight variable; ω denotes the trainable parameters in the GNN architecture; and ω*(α) is the value of ω that minimizes the training loss L_train;
S54, selecting, according to the probability-enhanced weight matrix α, the operation corresponding to the maximum weight on each edge of the search space:

o^(i,j) = argmax_{o ∈ O} α_o^(i,j)

wherein o^(i,j) is the operation corresponding to the maximum weight on edge (i, j); argmax returns the value of the argument at which a function attains its maximum; O is the set of all candidate operations on one edge; α_o^(i,j) is the weight of operation o on edge (i, j); and i and j are the start and end nodes of the edge, respectively.
4. The robust neural architecture search method for a graph neural network according to claim 3, wherein the distance metric ψ(α) is calculated by the formula:

ψ(α) = ‖α - α_R‖_2

wherein ‖·‖_2 denotes the Euclidean (L2) norm.
5. The robust neural architecture search method for a graph neural network according to claim 1, wherein the number of search units in the search space S is greater than the number of search units in the GNN architecture.
6. The robust neural architecture search method for a graph neural network according to claim 1, wherein, when the constructed GNN neural network architecture is used to classify scientific documents, the datasets for training the GNN architecture are the Cora dataset and the Citeseer dataset;
when the constructed GNN neural network architecture is used to classify spam in a mail system, the dataset for training the GNN architecture is the Spambase dataset;
when the constructed GNN neural network architecture is used to identify fraudulent users in a financial system, the dataset for training the GNN architecture is the PaySim dataset;
and when the constructed GNN neural network architecture is used in a recommendation system to defend against malicious users who inject large amounts of fake user rating information in an attempt to change recommendation results, the dataset for training the GNN architecture is the Retailrocket dataset.
7. A robust neural architecture search system for a graph neural network, characterized by comprising:
the definition module, used for defining a search unit as a two-tuple formed by one edge and an operation contained on that edge, wherein the search space S and a GNN architecture are each a set formed by a number of search units;
the GNN architecture search training module, used for searching a number of GNN architectures in the search space and training all the GNN architectures;
the robustness index generation module, used for generating adversarial samples by an adversarial attack method, inputting them into each trained GNN architecture, and taking the output accuracy as the robustness index;
the search module, used for finding, according to the robustness indices of GNN architectures with adjacent subscripts, the search units that satisfy a preset condition and storing them in a search unit set;
the probability enhancement module, used for applying a probability-enhanced search strategy to the search units in the search unit set B* to find the operation corresponding to the maximum weight on each edge of the search space;
and the network architecture generation module, used for constructing the GNN neural network architecture from the operation corresponding to the maximum weight found on each edge.
8. The robust neural architecture search system for a graph neural network according to claim 7, wherein the search module comprises:
the search judgment module, used for traversing GNN architectures with adjacent subscripts in the architecture set and determining whether the robustness index of architecture A_i is greater than that of architecture A_{i+1};
the first search unit calculation module, used for computing A_i - A_{i+1} when the output of the search judgment module is yes, and storing the resulting search units in a set B_1;
the second search unit calculation module, used for computing A_{i+1} - A_i otherwise, and storing the resulting search units in a set B_2;
and the search unit determination module, used for taking the intersection of the sets B_1 and B_2 and storing it into the search unit set B*.
9. The robust neural architecture search system for a graph neural network according to claim 7, wherein the probability enhancement module comprises:
the matrix construction module, used for reading the weight of each operation on each edge in the search space to form a weight matrix α, and constructing an initial matrix α_R with the same dimensions as α and all elements zero;
the distance metric calculation module, used for setting to 1 the entries of α_R at the positions corresponding to the search units in the search unit set B*, and computing the distance metric ψ(α) between α_R and the weight matrix α;
the probability enhancement submodule, used for obtaining, according to the distance metric, the weight matrix α after probability enhancement of the search units in B*:

min_α L_val(ω*(α), α) + λψ(α)
s.t. ω*(α) = argmin_ω L_train(ω, α)

wherein L_train(ω, α) and L_val(ω, α) are the losses of the GNN architecture on the training and validation sets, respectively; λ is a weight variable; ω denotes the trainable parameters in the GNN architecture; and ω*(α) is the value of ω that minimizes the training loss L_train;
and the operation selection module, used for selecting, according to the probability-enhanced weight matrix α, the operation corresponding to the maximum weight on each edge of the search space:

o^(i,j) = argmax_{o ∈ O} α_o^(i,j)

wherein o^(i,j) is the operation corresponding to the maximum weight on edge (i, j); argmax returns the value of the argument at which a function attains its maximum; O is the set of all candidate operations on one edge; α_o^(i,j) is the weight of operation o on edge (i, j); and i and j are the start and end nodes of the edge, respectively.
CN202210744891.6A 2022-06-28 2022-06-28 Robust neural architecture searching method and system for graph neural network Pending CN115018057A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210744891.6A CN115018057A (en) 2022-06-28 2022-06-28 Robust neural architecture searching method and system for graph neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210744891.6A CN115018057A (en) 2022-06-28 2022-06-28 Robust neural architecture searching method and system for graph neural network

Publications (1)

Publication Number Publication Date
CN115018057A true CN115018057A (en) 2022-09-06

Family

ID=83077597

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210744891.6A Pending CN115018057A (en) 2022-06-28 2022-06-28 Robust neural architecture searching method and system for graph neural network

Country Status (1)

Country Link
CN (1) CN115018057A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115329088A (en) * 2022-10-12 2022-11-11 中国人民解放军国防科技大学 Robustness analysis method of graph neural network event detection model
CN117648946A (en) * 2024-01-30 2024-03-05 南湖实验室 DNN model automatic generation method for safety key system
CN117876221A (en) * 2024-03-12 2024-04-12 大连理工大学 Robust image splicing method based on neural network structure search

Similar Documents

Publication Publication Date Title
CN115018057A (en) Robust neural architecture searching method and system for graph neural network
CN108846422B (en) Account number association method and system across social networks
US11330009B2 (en) Systems and methods for machine learning-based digital content clustering, digital content threat detection, and digital content threat remediation in machine learning task-oriented digital threat mitigation platform
US8949158B2 (en) Cost-sensitive alternating decision trees for record linkage
Li et al. Hierarchical rough decision theoretic framework for text classification
Baek et al. Bankruptcy prediction for credit risk using an auto-associative neural network in Korean firms
CN112287997A (en) Depth map convolution model defense method based on generative confrontation network
CN112417176B (en) Method, equipment and medium for mining implicit association relation between enterprises based on graph characteristics
CN113407728B (en) Knowledge graph construction and query recommendation system in radio signal attack and defense field
CN117201122A (en) Unsupervised attribute network anomaly detection method and system based on view level graph comparison learning
CN116467666A (en) Graph anomaly detection method and system based on integrated learning and active learning
CN113159976B (en) Identification method for important users of microblog network
CN115936104A (en) Method and apparatus for training machine learning models
CN114722920A (en) Deep map convolution model phishing account identification method based on map classification
CN114494753A (en) Clustering method, clustering device, electronic equipment and computer-readable storage medium
Zhao An evolutionary intelligent data analysis in promoting smart community
Verma et al. Detailed analysis of intrusion detection using machine learning algorithms
CN117171639A (en) Deep clustering-based efficient attack resistance method for graph neural network
Zhao et al. An Attention-based Long Short-Term Memory Framework for Detection of Bitcoin Scams
Rath Adaptive Modelling Approach for Row-Type Dependent Predictive Analysis (RTDPA): A Framework for Designing Machine Learning Models for Credit Risk Analysis in Banking Sector
CN116596539B (en) Money backwashing method and system
Musa et al. Investigation the Effect of Dataset Size on the Performance of Different algorithm in Phishing Website Detection
Mahapatra et al. Leveraging Multiple Online Sources for Accurate Income Verification
CN117829634A (en) Information policy processing method and terminal
Yaman Kanmaz Classification of Imbalanced Credit Data Sets with Borrower-Specific Cost-Sensitive Algorithms

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220906