CN114897161A - Mask-based graph classification backdoor attack defense method and system, electronic equipment and storage medium


Info

Publication number
CN114897161A
CN114897161A
Authority
CN
China
Prior art keywords
mask
graph
network
adjacency matrix
matrix
Prior art date
Legal status
Granted
Application number
CN202210540676.4A
Other languages
Chinese (zh)
Other versions
CN114897161B (en)
Inventor
魏薇
景慧昀
牛金行
周凡棣
辛鑫
Current Assignee
China Academy of Information and Communications Technology CAICT
Original Assignee
China Academy of Information and Communications Technology CAICT
Priority date
Filing date
Publication date
Application filed by China Academy of Information and Communications Technology CAICT filed Critical China Academy of Information and Communications Technology CAICT
Priority to CN202210540676.4A priority Critical patent/CN114897161B/en
Publication of CN114897161A publication Critical patent/CN114897161A/en
Application granted granted Critical
Publication of CN114897161B publication Critical patent/CN114897161B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G06F18/24 Classification techniques


Abstract

The invention provides a mask-based graph classification backdoor attack defense method and system, electronic equipment, and a storage medium. The method comprises the following steps: random masks are used to mask the adjacency matrix of the graph neural network. Each masking operation masks out part of the network topology, destroying any local trigger structure in the network. At the same time, by pooling multiple superposed mask adjacency matrices, the original topology of the network is preserved to the greatest extent, so that a trigger embedded by an attacker in the training data is rendered invalid while the model retains its normal performance.

Description

Mask-based graph classification backdoor attack defense method and system, electronic equipment and storage medium
Technical Field
The invention belongs to the field of graph data processing, and particularly relates to a mask-based graph classification backdoor attack defense method and system, electronic equipment, and a storage medium.
Background
With the rapid development of the digital economy and artificial intelligence, graph networks have become an important branch of data analysis. Most real-world systems can be represented as graph data, and graph classification is a fundamental graph analysis tool. Graph classification is the problem of mapping graph networks to their corresponding labels, and it has many practical applications, such as molecular property determination, new drug discovery, and fraud detection. For example, in the field of pharmaceutical molecular compounds, researchers model molecular structures as graph networks and study molecular chemistry as a graph classification task.
While graph neural networks complete downstream tasks with high quality, the robustness of the model is also a concern. The excellent performance of graph neural network models on most tasks derives from large amounts of supporting data, and several backdoor attack methods targeting the model training phase have accordingly been proposed. A backdoor attack occurs during the training phase: the attacker trains the model on training data into which triggers have been planted, so that during the use phase the model responds to trigger-embedded inputs in a highly predictable way, producing a preset result, while it operates normally on other, benign samples. Once a trigger is set in the training phase, the model is equivalent to leaving a backdoor for the attacker, who inputs data with the embedded trigger during the use phase, which can lead to extremely serious consequences.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a mask-based graph classification backdoor attack defense method and system, an electronic device, and a storage medium.
The invention discloses a mask-based graph classification backdoor attack defense method, which comprises the following steps:
step S1, acquiring a graph neural network model and a training data set of the graph neural network model, wherein the training data set is composed of a plurality of graph networks; the training data set is divided into a model training set, a verification set and a test set according to a proportion;
step S2, constructing a mask adjacency matrix of the graph network in the model training set;
step S3, pooling the mask adjacency matrix, and applying the pooled mask adjacency matrix to process the graph network to obtain a processed graph network;
step S4, processing the model training set according to the method of the step S2 to the step S3 to obtain a processed model training set;
and step S5, inputting the processed model training set into the graph neural network model for model training.
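The steps S1 to S5 above can be sketched for a single graph as follows. This is a minimal illustration, not code from the patent; the function name `defend_graph` and the default of T = 5 masks are hypothetical choices for the example.

```python
import numpy as np

def defend_graph(A, X, T=5, mode="max", rng=None):
    """Mask-based defense for one graph G = {A, X} (steps S2-S3 of the method)."""
    if rng is None:
        rng = np.random.default_rng(0)
    N = A.shape[0]
    # Step S2: generate T random binary masks and element-wise multiply each with A
    masked = [A * rng.integers(0, 2, size=(N, N)) for _ in range(T)]
    # Step S3: pool the T mask adjacency matrices (max or average) and use the
    # result in place of the original adjacency matrix A
    stacked = np.stack(masked)
    A_mask = stacked.max(axis=0) if mode == "max" else stacked.mean(axis=0)
    return A_mask, X

A = np.ones((4, 4))   # toy fully-connected adjacency matrix
X = np.zeros((4, 3))  # toy node feature matrix
A_mask, X_out = defend_graph(A, X)
```

Applying this to every graph in the training set (step S4) yields the processed training set used for model training in step S5.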
According to the method of the first aspect of the present invention, in the step S2, the method of constructing the mask adjacency matrix of the graph network in the model training set includes:
randomly generating T mask matrices {Mask_1, Mask_2, ..., Mask_T}, wherein
Mask_i ∈ R^(N×N), i ∈ {1, 2, ..., T}, and N is the number of nodes of the graph network;
and embedding the T mask matrices into the graph network to obtain a mask adjacency matrix.
According to the method of the first aspect of the present invention, in the step S2, the values in the mask matrix are randomly set to 0 or 1.
According to the method of the first aspect of the present invention, in step S2, the method of embedding the T mask matrices into the graph network to obtain the mask adjacency matrices comprises: performing element-wise (dot) multiplication of the adjacency matrix A with each mask matrix Mask_i to obtain the mask adjacency matrices, according to the following formula:
{A_mask1, A_mask2, ..., A_maskT} = Mix(A, {Mask_1, Mask_2, ..., Mask_T})
wherein A_maski ∈ R^(N×N), i ∈ {1, 2, ..., T}, are the mask adjacency matrices, Mix(·) denotes the operation of element-wise multiplying the adjacency matrix with every mask matrix, and N is the number of nodes of the network.
According to the method of the first aspect of the present invention, in the step S3, the method of pooling the mask adjacency matrices includes:
replacing each position of the T mask adjacency matrices {A_mask1, A_mask2, ..., A_maskT} ∈ R^(T×N×N) with the maximum value at that position to obtain the pooled matrix A_mask ∈ R^(N×N), wherein T is the number of mask adjacency matrices and N is the number of nodes of the network.
According to the method of the first aspect of the present invention, in the step S3, the method for pooling the mask adjacency matrices further includes:
superposing the values at each position of the T mask adjacency matrices {A_mask1, A_mask2, ..., A_maskT} ∈ R^(T×N×N) and taking their average to obtain the pooled matrix A_mask ∈ R^(N×N), wherein T is the number of mask adjacency matrices and N is the number of nodes of the network.
According to the method of the first aspect of the present invention, in step S3, the method for processing the graph network by applying the pooled mask adjacency matrices to obtain a processed graph network includes:
the pooled mask adjacency matrix A mask And replacing the adjacency matrix A in the graph network to obtain the processed graph network.
The second aspect of the invention discloses a mask-based graph classification backdoor attack defense system, which comprises:
a first processing module configured to obtain a graph neural network model and a training data set of the graph neural network model, wherein the training data set is composed of a plurality of graph networks and is divided into a model training set, a verification set, and a test set according to a proportion;
a second processing module configured to construct a mask adjacency matrix for a graph network in the model training set;
a third processing module configured to pool the mask adjacency matrix and apply the pooled mask adjacency matrix to process the graph network to obtain a processed graph network;
the fourth processing module is configured to process the model training set according to the second processing module and the third processing module to obtain a processed model training set;
and the fifth processing module is configured to input the processed model training set into the graph neural network model for model training.
According to the system of the second aspect of the present invention, the second processing module is configured to construct the mask adjacency matrix of the graph networks in the model training set, including:
randomly generating T mask matrices {Mask_1, Mask_2, ..., Mask_T}, wherein
Mask_i ∈ R^(N×N), i ∈ {1, 2, ..., T}, and N is the number of nodes of the graph network;
and embedding the T mask matrices into the graph network to obtain a mask adjacency matrix.
According to the system of the second aspect of the present invention, the second processing module is configured to randomly set the values in the mask matrix to 0 or 1.
According to the system of the second aspect of the present invention, the second processing module is configured to embed the T mask matrices into the graph network to obtain the mask adjacency matrices by: performing element-wise (dot) multiplication of the adjacency matrix A with each mask matrix Mask_i, according to the following formula:
{A_mask1, A_mask2, ..., A_maskT} = Mix(A, {Mask_1, Mask_2, ..., Mask_T})
wherein A_maski ∈ R^(N×N), i ∈ {1, 2, ..., T}, are the mask adjacency matrices, Mix(·) denotes the operation of element-wise multiplying the adjacency matrix with every mask matrix, and N is the number of nodes of the network.
According to the system of the second aspect of the present invention, the third processing module is configured to pool the mask adjacency matrices by:
replacing each position of the T mask adjacency matrices {A_mask1, A_mask2, ..., A_maskT} ∈ R^(T×N×N) with the maximum value at that position to obtain the pooled matrix A_mask ∈ R^(N×N), wherein T is the number of mask adjacency matrices and N is the number of nodes of the network.
According to the system of the second aspect of the present invention, the third processing module is configured to pool the mask adjacency matrix further comprises:
superposing the values at each position of the T mask adjacency matrices {A_mask1, A_mask2, ..., A_maskT} ∈ R^(T×N×N) and taking their average to obtain the pooled matrix A_mask ∈ R^(N×N), wherein T is the number of mask adjacency matrices and N is the number of nodes of the network.
According to the system of the second aspect of the present invention, the third processing module is configured to process the graph network by applying the pooled mask adjacency matrix to obtain the processed graph network, including:
the pooled mask adjacency matrix A mask And replacing the adjacency matrix A in the graph network to obtain the processed graph network.
A third aspect of the invention discloses an electronic device. The electronic device comprises a memory and a processor, the memory stores a computer program, and the processor implements the steps of the mask-based graph classification backdoor attack defense method according to any one of the first aspect of the disclosure when executing the computer program.
A fourth aspect of the invention discloses a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the mask-based graph classification backdoor attack defense method according to any one of the first aspect of the present disclosure.
The scheme provided by the invention can directly destroy the trigger structure inserted into the graph data, so that the trigger cannot take effect, while normal samples are unaffected and the model still exhibits its expected performance on normal inputs.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flowchart of a mask-based graph classification backdoor attack defense method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a mask-based graph neural network backdoor attack defense method according to an embodiment of the present invention;
FIG. 3 is a block diagram of a mask-based graph classification backdoor attack defense system according to an embodiment of the present invention;
fig. 4 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention discloses a mask-based graph classification backdoor attack defense method. Fig. 1 is a flowchart of a mask-based graph classification backdoor attack defense method according to an embodiment of the present invention, as shown in fig. 1 and fig. 2, the method includes:
step S1, acquiring a graph neural network model and a training data set of the graph neural network model, wherein the training data set is composed of a plurality of graph networks; the training data set is divided into a model training set, a verification set and a test set according to a proportion;
step S2, constructing a mask adjacency matrix of the graph network in the model training set;
step S3, pooling the mask adjacency matrix, and applying the pooled mask adjacency matrix to process the graph network to obtain a processed graph network;
step S4, processing the model training set according to the method of the step S2 to the step S3 to obtain a processed model training set;
and step S5, inputting the processed model training set into the graph neural network model for model training.
In step S1, a graph neural network model and a training data set of the graph neural network model are obtained, where the training data set is composed of a plurality of graph networks; the training data set is divided into a model training set, a verification set and a test set according to a proportion.
Specifically, in step S11, the graph neural network model M_oracle is obtained; the downstream task performed by the graph neural network model is a graph classification task. A graph network is typically represented as G = {V, E}, where V = {v_1, ..., v_N} denotes a set of N nodes, and e_{i,j} = <v_i, v_j> ∈ E denotes a connecting edge between node v_i and node v_j. In general, the information contained in the node set V and the edge set E is represented by an adjacency matrix A ∈ R^(N×N): when node v_i and node v_j are directly connected by an edge, A_{i,j} ≠ 0; otherwise A_{i,j} = 0. X denotes the feature matrix of the graph network. In the graph classification task, a graph set composed of M graphs is written as G = {G_1, ..., G_M}, and Y_n denotes the class label corresponding to graph G_n. M_oracle is an untrained graph classifier model f: G → {0, 1, ..., y_i}, where G is the corresponding input sample and {0, 1, ..., y_i} are the prediction labels of the classifier.
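The adjacency-matrix notation above can be made concrete with a short sketch; the helper `build_adjacency` is a hypothetical name introduced for illustration, not part of the patent.

```python
import numpy as np

def build_adjacency(num_nodes, edges):
    """Build the adjacency matrix A in R^(N x N) from a list of undirected edges."""
    A = np.zeros((num_nodes, num_nodes))
    for i, j in edges:
        A[i, j] = 1  # A[i, j] != 0 when v_i and v_j share an edge
        A[j, i] = 1  # undirected graph: the adjacency matrix is symmetric
    return A

# Example: a 4-node path graph v0 - v1 - v2 - v3
A = build_adjacency(4, [(0, 1), (1, 2), (2, 3)])
```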
In step S12, the training data set Data_oracle of model M_oracle is obtained. The data sets used to train the model are the MUTAG data set and the NCI1 data set from the biochemical field and the PROTEINS data set from the protein domain, containing 188, 4110, and 1113 samples respectively; they are downloaded from the network. These data sets are used for the graph classification task, and each data set contains a number of graphs G_i, each with a corresponding classification label y_i. A graph consists of nodes and connecting edges; the graph data is denoted G, and its structural information is represented by an adjacency matrix A: if there is a connecting edge between nodes i and j, the corresponding adjacency-matrix entry e_ij has the value 1, and if no connecting edge exists, e_ij is 0. Each node has a feature drawn from the same distribution, X ~ U(0, 1). The nodes and edges of each data set have corresponding meanings; in the MUTAG data set, for example, each graph network sample represents a nitro compound molecule, in which atoms serve as nodes and chemical bonds between atoms serve as edges, and each sample has a corresponding label indicating whether the aromatic or heteroaromatic compound is mutagenic, denoted 0 or 1.
step S13, acquired Data set Data oracle And are divided according to proportion, wherein the model training set Data train Verification set Data val And test set Data test 70%, 10%, 20%, respectively.
At step S2, a mask adjacency matrix of the graph networks in the model training set is constructed.
In some embodiments, in the step S2, the method of constructing a mask adjacency matrix of a graph network in the model training set includes:
randomly generating T mask matrices {Mask_1, Mask_2, ..., Mask_T}, wherein
Mask_i ∈ R^(N×N), i ∈ {1, 2, ..., T}, where N is the number of nodes of the graph network and the values in the mask matrices are randomly set to 0 or 1;
and embedding the T mask matrices into the graph network to obtain a mask adjacency matrix.
The method of embedding the T mask matrices into the graph network to obtain the mask adjacency matrices comprises: performing element-wise (dot) multiplication of the adjacency matrix A with each mask matrix Mask_i, according to the following formula:
{A_mask1, A_mask2, ..., A_maskT} = Mix(A, {Mask_1, Mask_2, ..., Mask_T})
wherein A_maski ∈ R^(N×N), i ∈ {1, 2, ..., T}, are the mask adjacency matrices, Mix(·) denotes the operation of element-wise multiplying the adjacency matrix with every mask matrix, and N is the number of nodes of the network.
Specifically, in step S21, for an arbitrary graph network G = {A, X} ∈ Data_train, T mask matrices {Mask_1, Mask_2, ..., Mask_T} are randomly generated, Mask_i ∈ R^(N×N), i ∈ {1, 2, ..., T}, where N is the number of nodes of the network and the values in Mask_i are randomly set to 0 or 1.
In step S22, the T mask matrices {Mask_1, Mask_2, ..., Mask_T} are embedded into the graph network G = {A, X} to obtain the mask adjacency matrices {A_mask1, A_mask2, ..., A_maskT}, as follows:
{A_mask1, A_mask2, ..., A_maskT} = Mix(A, {Mask_1, Mask_2, ..., Mask_T})
wherein A_maski ∈ R^(N×N), i ∈ {1, 2, ..., T}, are the mask adjacency matrices, Mix(·) denotes the operation of element-wise multiplying the adjacency matrix with every mask matrix, and N is the number of nodes of the network.
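Steps S21 and S22 can be sketched as follows. The helper names `make_masks` and `mix` are hypothetical; the element-wise product implements the Mix(·) operation described above.

```python
import numpy as np

def make_masks(num_nodes, T, rng):
    """Step S21: randomly generate T binary mask matrices Mask_i in {0,1}^(N x N)."""
    return [rng.integers(0, 2, size=(num_nodes, num_nodes)) for _ in range(T)]

def mix(A, masks):
    """Step S22, Mix(.): element-wise multiply A with every mask matrix."""
    return [A * M for M in masks]

rng = np.random.default_rng(0)
A = np.ones((5, 5))            # toy fully-connected adjacency matrix
masks = make_masks(5, T=3, rng=rng)
masked = mix(A, masks)         # the T mask adjacency matrices A_mask1 ... A_maskT
```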
At step S3, the mask adjacency matrices are pooled, and the pooled mask adjacency matrices are applied to process the graph network, resulting in a processed graph network.
In some embodiments, in the step S3, the method of pooling the mask adjacency matrices includes:
replacing each position of the T mask adjacency matrices {A_mask1, A_mask2, ..., A_maskT} ∈ R^(T×N×N) with the maximum value at that position to obtain the pooled matrix A_mask ∈ R^(N×N), wherein T is the number of mask adjacency matrices and N is the number of nodes of the network.
The method of pooling the mask adjacency matrix further comprises:
superposing the values at each position of the T mask adjacency matrices {A_mask1, A_mask2, ..., A_maskT} ∈ R^(T×N×N) and taking their average to obtain the pooled matrix A_mask ∈ R^(N×N), wherein T is the number of mask adjacency matrices and N is the number of nodes of the network.
The method for processing the graph network by applying the pooled mask adjacency matrix to obtain the processed graph network comprises the following steps:
the pooled mask adjacency matrix A mask And replacing the adjacency matrix A in the graph network to obtain the processed graph network.
Specifically, in step S31, the T mask adjacency matrices {A_mask1, A_mask2, ..., A_maskT} obtained in step S22 are sent into a pooling layer, where one of two pooling modes, maximum pooling or average pooling, can be selected to obtain the pooled adjacency matrix A_mask.
In step S32, the maximum-pooling mode of step S31 proceeds as follows: each position of the T mask adjacency matrices {A_mask1, A_mask2, ..., A_maskT} ∈ R^(T×N×N) is replaced by the maximum value at that position to obtain A_mask ∈ R^(N×N), where T is the number of mask adjacency matrices and N is the number of nodes of the network. Expressed as a formula,
A_mask = Pool_max({A_mask1, A_mask2, ..., A_maskT})
where A_mask is the pooled adjacency matrix, {A_mask1, A_mask2, ..., A_maskT} are the T mask adjacency matrices, and Pool_max denotes maximum pooling.
In step S33, the average-pooling mode of step S31 proceeds as follows: the values at each position of the T mask adjacency matrices {A_mask1, A_mask2, ..., A_maskT} ∈ R^(T×N×N) are superposed and their average replaces the corresponding position of A_mask, giving A_mask ∈ R^(N×N), where T is the number of mask adjacency matrices and N is the number of nodes of the network. Expressed as a formula,
A_mask = Pool_ave({A_mask1, A_mask2, ..., A_maskT})
where A_mask is the pooled adjacency matrix, {A_mask1, A_mask2, ..., A_maskT} are the T mask adjacency matrices, and Pool_ave denotes average pooling.
In step S34, the pooled adjacency matrix A_mask obtained in step S32 or S33 replaces the adjacency matrix A in the graph network G = {A, X}, giving the processed graph network G' = {A_mask, X}.
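The two pooling modes of steps S32 and S33 can be sketched as below; `pool_max` and `pool_ave` are illustrative names for the Pool_max and Pool_ave operations above.

```python
import numpy as np

def pool_max(masked):
    """Pool_max (step S32): element-wise maximum over the T mask adjacency matrices."""
    return np.max(np.stack(masked), axis=0)

def pool_ave(masked):
    """Pool_ave (step S33): element-wise average of the T mask adjacency matrices."""
    return np.mean(np.stack(masked), axis=0)

# Toy example with T = 3 mask adjacency matrices; the pooled result A_mask then
# replaces A in the graph network (step S34): G' = {A_mask, X}
masked = [np.eye(3), np.zeros((3, 3)), np.ones((3, 3))]
A_mask = pool_max(masked)
```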
In step S4, the model training set is processed according to the method of step S2 to step S3, and a processed model training set is obtained.
Specifically, all graph networks in Data_train are processed according to steps S21 to S34, and the processed graph networks G' = {A_mask, X} form the processed model training set Data_mask_train.
In step S5, the processed model training set is input to the neural network model for model training.
Specifically, in step S51, the processed model training set Data_mask_train obtained in step S4 is input normally into the graph classification model M_oracle for model training.
In step S52, given the processed model training set Data_mask_train and the graph classification model M_oracle from step S51, Data_mask_train is input into M_oracle for training. Mini-Batch Gradient Descent (MBGD) is adopted as the training method: a batch of data is randomly selected from the training set for each training step, which avoids both the training oscillation caused by Stochastic Gradient Descent (SGD) and the excessive resource consumption of Batch Gradient Descent (BGD); the batch size is chosen to be 128. The training objective is to adjust the structural parameters of the network through forward propagation and backward propagation of the gradient, continuously reducing the loss-function value of the model.
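The mini-batch selection of step S52 can be sketched as follows; the iterator name and fixed seed are illustrative, and only the batching itself is shown, not the model's forward/backward pass.

```python
import random

def minibatch_iter(dataset, batch_size=128, seed=0):
    """Yield shuffled mini-batches of size 128, as used by MBGD in step S52."""
    idx = list(range(len(dataset)))
    random.Random(seed).shuffle(idx)  # random selection, as opposed to BGD's full pass
    for start in range(0, len(idx), batch_size):
        yield [dataset[i] for i in idx[start:start + batch_size]]

# A toy 700-sample training set yields 5 full batches and one final batch of 60
batches = list(minibatch_iter(list(range(700))))
```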
To prevent chance from interfering with the experiment, ten-fold cross validation is adopted: the data set is divided into 10 parts, of which 9 are selected for training and one for testing in each round. After training finishes, the graph classification model with defense processing against backdoor attacks has been trained.
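The ten-fold scheme can be sketched as below; the patent does not specify how folds are assigned, so the round-robin assignment here is an illustrative assumption.

```python
def ten_fold_indices(n):
    """Split n sample indices into 10 folds; each round trains on 9 and tests on 1."""
    folds = [list(range(k, n, 10)) for k in range(10)]  # round-robin fold assignment
    for t in range(10):
        test = folds[t]
        train = [i for k, f in enumerate(folds) if k != t for i in f]
        yield train, test

splits = list(ten_fold_indices(100))
```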
In summary, the scheme provided by the invention can directly destroy the trigger structure inserted into the graph data, so that the trigger cannot take effect, while normal samples are unaffected and the model still exhibits its expected performance on normal inputs.
The invention discloses a mask-based graph classification backdoor attack defense system in a second aspect. FIG. 3 is a block diagram of a mask-based graph classification backdoor attack defense system according to an embodiment of the present invention; as shown in fig. 3, the system 100 includes:
a first processing module 101, configured to obtain a graph neural network model and a training data set of the graph neural network model, where the training data set is composed of a plurality of graph networks; the training data set is divided into a model training set, a verification set and a test set according to a proportion;
a second processing module 102 configured to construct a mask adjacency matrix of a graph network in the model training set;
a third processing module 103, configured to pool the mask adjacency matrix, and apply the pooled mask adjacency matrix to process the graph network, so as to obtain a processed graph network;
a fourth processing module 104, configured to process the model training set according to the second processing module and the third processing module, so as to obtain a processed model training set;
a fifth processing module 105, configured to input the processed model training set to the neural network model for model training.
According to the system of the second aspect of the present invention, the second processing module 102 is configured to construct the mask adjacency matrix of the graph networks in the model training set, including:
randomly generating T mask matrices {Mask_1, Mask_2, ..., Mask_T}, wherein
Mask_i ∈ R^(N×N), i ∈ {1, 2, ..., T}, and N is the number of nodes of the graph network;
and embedding the T mask matrices into the graph network to obtain a mask adjacency matrix.
According to the system of the second aspect of the present invention, the second processing module 102 is configured to randomly set the values in the mask matrix to 0 or 1.
According to the system of the second aspect of the present invention, the second processing module 102 is configured to embed the T mask matrices into the graph network to obtain the mask adjacency matrices by: performing element-wise (dot) multiplication of the adjacency matrix A with each mask matrix Mask_i, according to the following formula:
{A_mask1, A_mask2, ..., A_maskT} = Mix(A, {Mask_1, Mask_2, ..., Mask_T})
wherein A_maski ∈ R^(N×N), i ∈ {1, 2, ..., T}, are the mask adjacency matrices, Mix(·) denotes the operation of element-wise multiplying the adjacency matrix with every mask matrix, and N is the number of nodes of the network.
According to the system of the second aspect of the present invention, the third processing module 103 is configured to pool the mask adjacency matrix including:
replacing each position of the T mask adjacency matrices {A_mask1, A_mask2, ..., A_maskT} ∈ R^(T×N×N) with the maximum value at that position to obtain the pooled matrix A_mask ∈ R^(N×N), wherein T is the number of mask adjacency matrices and N is the number of nodes of the network.
According to the system of the second aspect of the present invention, the third processing module 103 is configured to pool the mask adjacency matrix further comprises:
superposing the values at each position of the T mask adjacency matrices {A_mask1, A_mask2, ..., A_maskT} ∈ R^(T×N×N) and taking the average value to perform the pooling, obtaining A_mask ∈ R^(N×N), where T is the number of mask adjacency matrices and N is the number of nodes of the graph network.
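The averaging variant instead superposes the stack and divides by T, yielding fractional edge weights. A minimal NumPy sketch (names and toy values are illustrative, not from the patent):

```python
import numpy as np

def mean_pool_masks(masked_adjs):
    """Average pooling: superpose the values at each position across the
    T mask adjacency matrices and take the mean, reducing (T, N, N) to (N, N)."""
    return masked_adjs.mean(axis=0)

# Toy stack of T = 2 mask adjacency matrices over N = 2 nodes.
stack = np.array([[[1., 0.], [0., 1.]],
                  [[1., 0.], [1., 0.]]])
A_mask = mean_pool_masks(stack)
assert A_mask.tolist() == [[1., 0.], [0.5, 0.5]]
```

Unlike max pooling, the averaged A_mask down-weights edges that only a few masks kept, so a trigger edge surviving in a minority of masks contributes with reduced weight rather than at full strength.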
According to the system of the second aspect of the present invention, the third processing module 103 is configured to process the graph network by applying the pooled mask adjacency matrices, and obtain a processed graph network, including:
replacing the adjacency matrix A in the graph network with the pooled mask adjacency matrix A_mask to obtain the processed graph network.
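Putting the second and third processing modules together, the per-graph preprocessing can be sketched end to end as follows. This is a hedged sketch, not the patented implementation: the function name `defend_graph`, the default T, and the choice of max pooling are illustrative assumptions.

```python
import numpy as np

def defend_graph(A, T=8, rng=None):
    """Mask-based preprocessing of a single graph (name and default T are
    illustrative): generate T random binary mask matrices, dot-multiply them
    with the adjacency matrix A, max-pool the results, and return the pooled
    matrix A_mask that replaces A in the graph network."""
    rng = rng if rng is not None else np.random.default_rng()
    N = A.shape[0]
    masks = rng.integers(0, 2, size=(T, N, N)).astype(A.dtype)
    masked = A[None, :, :] * masks        # {A_mask1, ..., A_maskT}
    return masked.max(axis=0)             # pooled A_mask, same shape as A

A = np.ones((4, 4))                       # toy fully-connected graph
A_mask = defend_graph(A, T=6, rng=np.random.default_rng(1))
assert A_mask.shape == A.shape
```

Applying this routine to every graph in the model training set yields the processed training set that the fifth processing module feeds to the graph neural network model.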
A third aspect of the invention discloses an electronic device. The electronic device comprises a memory and a processor, the memory stores a computer program, and the processor executes the computer program to realize the steps of the mask-based graph classification backdoor attack defense method in any one of the first aspect of the disclosure.
Fig. 4 is a block diagram of an electronic device according to an embodiment of the present invention, and as shown in fig. 4, the electronic device includes a processor, a memory, a communication interface, a display screen, and an input device, which are connected by a system bus. Wherein the processor of the electronic device is configured to provide computing and control capabilities. The memory of the electronic equipment comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the electronic device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, Near Field Communication (NFC) or other technologies. The display screen of the electronic equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the electronic equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the electronic equipment, an external keyboard, a touch pad or a mouse and the like.
It will be understood by those skilled in the art that the structure shown in fig. 4 is only a partial block diagram related to the technical solution of the present disclosure and does not constitute a limitation of the electronic device to which the solution of the present application is applied; a specific electronic device may include more or fewer components than those shown in the drawing, combine certain components, or have a different arrangement of components.
A fourth aspect of the invention discloses a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps of the mask-based graph classification backdoor attack defense method according to any one of the first aspect of the present disclosure.
It should be noted that the technical features of the above embodiments can be combined arbitrarily; for brevity, not all possible combinations of these technical features are described, yet as long as a combination of technical features involves no contradiction, it should be considered within the scope of the present description. The above examples express only several embodiments of the present application, and their description is relatively specific and detailed, but this should not be construed as limiting the scope of the invention. It should also be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A mask-based graph classification backdoor attack defense method, the method comprising:
step S1, acquiring a graph neural network model and a training data set of the graph neural network model, wherein the training data set is composed of a plurality of graph networks; the training data set is divided into a model training set, a verification set and a test set according to a proportion;
step S2, constructing a mask adjacency matrix of the graph network in the model training set;
step S3, pooling the mask adjacency matrix, and applying the pooled mask adjacency matrix to process the graph network to obtain a processed graph network;
step S4, processing the model training set according to the method of the step S2 to the step S3 to obtain a processed model training set;
and step S5, inputting the processed model training set into the graph neural network model for model training.
2. The mask-based graph classification backdoor attack defense method according to claim 1, wherein in the step S2, the method for constructing the mask adjacency matrix of the graph network in the model training set comprises:
randomly generating T mask matrices, i.e. {Mask_1, Mask_2, ..., Mask_T}, where Mask_i ∈ R^(N×N), i ∈ {1, 2, ..., T}, and N is the number of nodes of the graph network;
and embedding the T mask matrixes into a graph network to obtain a mask adjacency matrix.
3. The method as claimed in claim 2, wherein in step S2, the values in the mask matrix are randomly set to 0 or 1.
4. The method as claimed in claim 2, wherein in step S2, the method of embedding the T mask matrices into the graph network to obtain the mask adjacency matrices comprises: dot-multiplying (element-wise) the adjacency matrix A with each mask matrix Mask_i to obtain the mask adjacency matrices, with the specific formula:
{A_mask1, A_mask2, ..., A_maskT} = Mix(A, {Mask_1, Mask_2, ..., Mask_T})
where A_maski ∈ R^(N×N), i ∈ {1, 2, ..., T}, is a mask adjacency matrix, Mix(·) denotes the operation of dot-multiplying the adjacency matrix by every mask matrix, and N is the number of nodes of the graph network.
5. The mask-based graph classification backdoor attack defense method according to claim 1, wherein in step S3, the method for pooling the mask adjacency matrices comprises:
taking the maximum value at each position of the T mask adjacency matrices {A_mask1, A_mask2, ..., A_maskT} ∈ R^(T×N×N) to perform the pooling, obtaining A_mask ∈ R^(N×N), where T is the number of mask adjacency matrices and N is the number of nodes of the graph network.
6. The mask-based graph classification backdoor attack defense method according to claim 1, wherein in step S3, the method for pooling the mask adjacency matrices further comprises:
superposing the values at each position of the T mask adjacency matrices {A_mask1, A_mask2, ..., A_maskT} ∈ R^(T×N×N) and taking the average value to perform the pooling, obtaining A_mask ∈ R^(N×N), where T is the number of mask adjacency matrices and N is the number of nodes of the graph network.
7. The method as claimed in claim 6, wherein in step S3, the method for processing the graph network by applying the pooled mask adjacency matrix to obtain the processed graph network includes:
replacing the adjacency matrix A in the graph network with the pooled mask adjacency matrix A_mask to obtain the processed graph network.
8. A system for classifying backdoor attack defense based on masked graphs, the system comprising:
the device comprises a first processing module, a second processing module and a third processing module, wherein the first processing module is configured to obtain a graph neural network model and a training data set of the graph neural network model, and the training data set is composed of a plurality of graph networks; the training data set is divided into a model training set, a verification set and a test set according to a proportion;
a second processing module configured to construct a mask adjacency matrix for a graph network in the model training set;
a third processing module configured to pool the mask adjacency matrix and apply the pooled mask adjacency matrix to process the graph network to obtain a processed graph network;
the fourth processing module is configured to process the model training set according to the second processing module and the third processing module to obtain a processed model training set;
and the fifth processing module is configured to input the processed model training set into the graph neural network model for model training.
9. An electronic device, comprising a memory storing a computer program and a processor which, when executing the computer program, implements the steps of the mask-based graph classification backdoor attack defense method according to any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the mask-based graph classification backdoor attack defense method according to any one of claims 1 to 7.
CN202210540676.4A 2022-05-17 2022-05-17 Mask-based graph classification backdoor attack defense method and system, electronic equipment and storage medium Active CN114897161B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210540676.4A CN114897161B (en) 2022-05-17 2022-05-17 Mask-based graph classification backdoor attack defense method and system, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210540676.4A CN114897161B (en) 2022-05-17 2022-05-17 Mask-based graph classification backdoor attack defense method and system, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114897161A true CN114897161A (en) 2022-08-12
CN114897161B CN114897161B (en) 2023-02-07

Family

ID=82723208

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210540676.4A Active CN114897161B (en) 2022-05-17 2022-05-17 Mask-based graph classification backdoor attack defense method and system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114897161B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117235584A (en) * 2023-11-15 2023-12-15 之江实验室 Picture data classification method, device, electronic device and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200019699A1 (en) * 2018-07-10 2020-01-16 International Business Machines Corporation Defending Against Model Inversion Attacks on Neural Networks
US20200137083A1 (en) * 2018-10-24 2020-04-30 Nec Laboratories America, Inc. Unknown malicious program behavior detection using a graph neural network
CN111161535A (en) * 2019-12-23 2020-05-15 山东大学 Attention mechanism-based graph neural network traffic flow prediction method and system
CN111260059A (en) * 2020-01-23 2020-06-09 复旦大学 Back door attack method of video analysis neural network model
CN112765607A (en) * 2021-01-19 2021-05-07 电子科技大学 Neural network model backdoor attack detection method
CN112905379A (en) * 2021-03-10 2021-06-04 南京理工大学 Traffic big data restoration method based on graph self-encoder of self-attention mechanism
CN112925977A (en) * 2021-02-26 2021-06-08 中国科学技术大学 Recommendation method based on self-supervision graph representation learning
CN112989438A (en) * 2021-02-18 2021-06-18 上海海洋大学 Detection and identification method for backdoor attack of privacy protection neural network model
CN113283590A (en) * 2021-06-11 2021-08-20 浙江工业大学 Defense method for backdoor attack
CN113297571A (en) * 2021-05-31 2021-08-24 浙江工业大学 Detection method and device for backdoor attack of orientation graph neural network model

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200019699A1 (en) * 2018-07-10 2020-01-16 International Business Machines Corporation Defending Against Model Inversion Attacks on Neural Networks
US20200137083A1 (en) * 2018-10-24 2020-04-30 Nec Laboratories America, Inc. Unknown malicious program behavior detection using a graph neural network
CN111161535A (en) * 2019-12-23 2020-05-15 山东大学 Attention mechanism-based graph neural network traffic flow prediction method and system
CN111260059A (en) * 2020-01-23 2020-06-09 复旦大学 Back door attack method of video analysis neural network model
US20220027462A1 (en) * 2020-01-23 2022-01-27 Fudan University System and Method for Video Backdoor Attack
CN112765607A (en) * 2021-01-19 2021-05-07 电子科技大学 Neural network model backdoor attack detection method
CN112989438A (en) * 2021-02-18 2021-06-18 上海海洋大学 Detection and identification method for backdoor attack of privacy protection neural network model
CN112925977A (en) * 2021-02-26 2021-06-08 中国科学技术大学 Recommendation method based on self-supervision graph representation learning
CN112905379A (en) * 2021-03-10 2021-06-04 南京理工大学 Traffic big data restoration method based on graph self-encoder of self-attention mechanism
CN113297571A (en) * 2021-05-31 2021-08-24 浙江工业大学 Detection method and device for backdoor attack of orientation graph neural network model
CN113283590A (en) * 2021-06-11 2021-08-20 浙江工业大学 Defense method for backdoor attack

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YONGDUO SUI ET AL.: "Deconfounded Training for Graph Neural Networks", 《HTTPS://ARXIV.ORG/ABS/2112.15089V1》 *
ZHAOHAN XI ET AL.: "Graph Backdoor", 《HTTPS://ARXIV.ORG/ABS/2006.11890V5,1-18》 *
CHEN Jinyin et al.: "Survey of adversarial attacks and defenses for graph neural networks", Chinese Journal of Network and Information Security *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117235584A (en) * 2023-11-15 2023-12-15 之江实验室 Picture data classification method, device, electronic device and storage medium
CN117235584B (en) * 2023-11-15 2024-04-02 之江实验室 Picture data classification method, device, electronic device and storage medium

Also Published As

Publication number Publication date
CN114897161B (en) 2023-02-07

Similar Documents

Publication Publication Date Title
Li et al. Blockchain assisted decentralized federated learning (BLADE-FL): Performance analysis and resource allocation
CN110929047B (en) Knowledge graph reasoning method and device for focusing on neighbor entity
EP3620990A1 (en) Capturing network dynamics using dynamic graph representation learning
WO2022022274A1 (en) Model training method and apparatus
CN111126668B (en) Spark operation time prediction method and device based on graph convolution network
CN110366734A (en) Optimization neural network framework
WO2018068421A1 (en) Method and device for optimizing neural network
Zhou et al. A priori trust inference with context-aware stereotypical deep learning
EP4322056A1 (en) Model training method and apparatus
CN111930932B (en) Knowledge graph representation learning method and device in network space security field
Slimacek et al. Nonhomogeneous Poisson process with nonparametric frailty and covariates
CN114897161B (en) Mask-based graph classification backdoor attack defense method and system, electronic equipment and storage medium
CN115618008A (en) Account state model construction method and device, computer equipment and storage medium
CN112613435A (en) Face image generation method, device, equipment and medium
CN113609345A (en) Target object association method and device, computing equipment and storage medium
CN115439192A (en) Medical commodity information pushing method and device, storage medium and computer equipment
CN114997036A (en) Network topology reconstruction method, device and equipment based on deep learning
WO2024074072A1 (en) Spiking neural network accelerator learning method and apparatus, terminal, and storage medium
CN111639523B (en) Target detection method, device, computer equipment and storage medium
WO2021068249A1 (en) Method and apparatus for hardware simulation and emulation during running, and device and storage medium
CN112507323A (en) Model training method and device based on unidirectional network and computing equipment
CN110601909B (en) Network maintenance method and device, computer equipment and storage medium
WO2024027068A1 (en) Attack method and device for evaluating robustness of object detection model
KR102126795B1 (en) Deep learning-based image on personal information image processing system, apparatus and method therefor
De Vita et al. µ-ff: On-device forward-forward training algorithm for microcontrollers

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant