CN114579962A - AI safety attack and defense test method - Google Patents

AI safety attack and defense test method

Info

Publication number
CN114579962A
CN114579962A
Authority
CN
China
Prior art keywords
attack
defense
model
stage
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210137681.0A
Other languages
Chinese (zh)
Inventor
梁炜 (Liang Wei)
秦湛 (Qin Zhan)
任奎 (Ren Kui)
姚宏伟 (Yao Hongwei)
林博涵 (Lin Bohan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202210137681.0A
Publication of CN114579962A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F 21/55: Detecting local intrusion or implementing counter-measures
    • G06F 21/554: Detecting local intrusion or implementing counter-measures involving event detection and direct action
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/21: Design, administration or maintenance of databases
    • G06F 16/215: Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G06F 16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F 16/284: Relational databases
    • G06F 16/288: Entity relationship models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an AI safety attack and defense test method, which comprises: a software-hardware collaborative security test flow and a full-cycle automated security test method. Aiming at the defects that large-scale, complex AI systems are difficult to subject to comprehensive security detection and that current security test methods are single-purpose and poorly generalizable, the method summarizes the stages of the AI system life cycle, analyzes the attack methods and corresponding defense strategies present at each stage of the AI system with the full-cycle automated adaptive security test method, constructs a security test flow, and supports comprehensive security defense of the AI system in its actual deployment environment.

Description

AI safety attack and defense test method
Technical Field
The invention belongs to the technical field of artificial intelligence (AI) security, and particularly relates to an AI safety attack and defense testing method.
Background
In recent years, the rapid development of artificial intelligence technology has profoundly changed human society and driven the economy and society from digitalization and networking toward intelligentization. At the same time, the latent security hazards of artificial intelligence technology have gradually been exposed. The overall development of artificial intelligence is still at an early stage, and AI techniques represented by deep learning generally suffer from limited robustness and interpretability; consequently, the inherent vulnerability of highly complex AI systems is insufficiently understood, effective means of security verification, testing, and enhancement are lacking, and the broad practical deployment of AI technology is seriously hindered. In actual usage scenarios, AI systems further face complex data-link environments, diverse model-execution software frameworks and hardware platforms, and a wide variety of AI attack modes.
Disclosure of Invention
In view of the deficiencies of the prior art, the invention aims to provide an AI safety attack and defense testing method.
The purpose of the invention is achieved by the following technical scheme: an AI safety attack and defense test method comprising the following steps:
(1) inputting an AI model file to be tested and the characteristic data of the AI model;
(2) executing a full-cycle automated security test method on the AI model characteristic data input in step (1): constructing an attack and defense knowledge graph by a knowledge graph method, and performing knowledge reasoning to obtain the attack and defense algorithms for the input AI model in the data collection and preprocessing stage and the training and testing stage, as well as the hardware adaptation environment for the deployment stage;
(3) performing the first-stage test: performing the attack and defense test of the AI model data collection and preprocessing stage according to the first-stage attack and defense methods in the knowledge reasoning result of step (2);
(4) performing the second-stage test: performing the attack and defense test of the AI model training and testing stage according to the second-stage attack and defense methods in the knowledge reasoning result of step (2);
(5) performing the online test of AI system deployment: performing simulation adaptation between the AI model and the system hardware environment best suited to the AI model to be tested according to the knowledge reasoning result of step (2), obtaining the AI model software-hardware adaptation result;
(6) integrating the test results of the data collection and preprocessing stage, the training and testing stage, and the deployment stage of the AI life cycle into a test report, and providing security defense strategies and secure deployment methods for the different stages.
Further, in step (1), the characteristic data of the AI model include the data set type, model type, framework type, and training parameters.
Further, step (2) comprises the following sub-steps:
(2.1) determining the node types and relationship types of the AI model attack and defense knowledge graph to form the knowledge graph node-relationship rules;
8 node types, 6 relationship types, and 9 triple types are established for the knowledge graph; the node types are 'model ontology', 'data set type', 'model type', 'framework type', 'training parameters', 'system hardware environment', 'attack algorithm', and 'defense algorithm'; the relationship types are 'belongs to', 'first-stage attack', 'second-stage attack', 'first-stage defense', 'second-stage defense', and 'software-hardware adaptation'; the data set type, model type, framework type, and training parameters each form a 'belongs to' relationship with the model ontology, giving 4 triple types; the attack algorithm forms 'first-stage attack' and 'second-stage attack' relationships with the model ontology, giving 2 triple types; the defense algorithm forms 'first-stage defense' and 'second-stage defense' relationships with the model ontology, giving 2 triple types; and the system hardware environment forms a 'software-hardware adaptation' relationship with the model ontology, giving 1 triple type;
(2.2) acquiring the data for constructing the graph via a web crawler according to the node-relationship rules of step (2.1);
text information is crawled from an academic paper database according to AI paper keywords; the text information contains node and relationship instances of the node types and relationship types of step (2.1), which are extracted into regularized data through regularization extraction and entity extraction;
(2.3) forming triples from the data through the node-relationship rules to form the AI model attack and defense knowledge graph;
the extracted regularized data are formed into a triple set according to the node-relationship rules of step (2.1) and stored in a database to form the complete knowledge graph;
meanwhile, the triples of the whole attack and defense knowledge graph are converted into matrix form through knowledge embedding to obtain an embedding matrix, which exists as another form of the knowledge graph;
(2.4) matching the characteristic data of the AI model to be tested against the attack and defense knowledge graph constructed in step (2.3), reasoning out the attack and defense methods for the corresponding life-cycle stages through the relationships between nodes, and testing with the corresponding attack and defense methods.
Further, in the matching and reasoning step of (2.4):
(2.4.1) the characteristic data of the AI model to be tested are converted into vector form through knowledge embedding to obtain a feature vector;
(2.4.2) from the feature vector obtained in step (2.4.1) and the relation vectors in the embedding matrix obtained from the attack and defense knowledge graph in step (2.3), the first-stage and second-stage attack and defense methods and the system hardware environment matching the AI model to be tested are computed via the L1 distance as the knowledge reasoning result; the first-stage attack methods comprise data poisoning attacks or adversarial example attacks, and the first-stage defense methods comprise abnormal data analysis and abnormal data cleaning; the second-stage attack methods comprise backdoor attacks or adversarial attacks, the second-stage defense methods comprise robustness enhancement and model hardening, and the attack and defense success rates of the attack and defense algorithms on the AI model and their influence on the AI model are analyzed using standardized unit testing and robustness formal verification;
(2.4.3) the knowledge reasoning results of step (2.4.2) are fed into the corresponding stages of the software-hardware collaborative security test flow for the corresponding tests.
Further, step (3) comprises the following sub-steps:
(3.1) according to the knowledge reasoning result of step (2), using the first-stage attack method therein to construct an attack-sample-contaminated version of the data set used by the AI model, realizing the attack of the data collection and preprocessing stage;
(3.2) selecting an automatic abnormal-data analysis method, inputting the data of the attacked AI model's data set, analyzing the sample characteristics and distribution of the data set, and automatically cleaning the abnormal data of the attacked data set with an automatic abnormal-data cleaning method;
(3.3) collecting the success rates of the attack method before and after defense to obtain the attack and defense success rates, analyzing the effect of the different defense methods, and obtaining the attack and defense test result of the data collection and preprocessing stage.
Further, step (3.2) comprises the following sub-steps:
(3.2.1) in the abnormal data detection step: first, the type of the abnormal data is analyzed, detecting it by comparing the distribution characteristics of the abnormal and normal data; then the abnormal data are processed, choosing to repair or discard the data according to the abnormal data type; the data to be repaired are passed to the automatic abnormal-data cleaning step;
(3.2.2) in the automatic abnormal-data cleaning step:
the first-stage defense method is obtained from the knowledge reasoning result of step (2) of the full-cycle automated security test method;
the abnormal data to be repaired detected in the abnormal data detection step (3.2.1) are cleaned using this first-stage defense method.
Further, step (4) comprises the following sub-steps:
(4.1) according to the knowledge reasoning result of step (2), using the obtained backdoor attack or adversarial attack to implant a backdoor trigger or perturbation in the training and testing stage, realizing the attack of the training and testing stage;
(4.2) in the training step, according to the knowledge reasoning result of step (2), performing robustness enhancement and model hardening on the AI model with the obtained second-stage defense method, defending against the second-stage attack method;
(4.3) in the model testing step, performing standardized unit testing and robustness formal verification on the AI model, and analyzing the influence of the attacks and defenses on the AI model during model training and testing;
(4.4) collecting the success rates of the attack method before and after defense to obtain the attack and defense success rates, analyzing the effect of the different defense methods, and obtaining the attack and defense test result of the model training and testing stage.
Further, in step (4.3), the standardized unit test comprises:
a1. module unit test: the functions of different modules of the model are tested separately; input data with predictable results are constructed, and the predicted standard result is compared with the actual module output to achieve the test purpose;
a2. neuron coverage test: the output range of each neuron is divided into several intervals of equal length, each interval representing one characteristic behavior of the neuron; whether a logical behavior is covered by the test data is judged by whether the neuron's output value falls within the corresponding interval.
Further, in step (4.3), the robustness formal verification comprises:
b1. selecting a suitable formal verification method according to the activation function type and application scenario of the AI model, using the knowledge reasoning result of step (2);
b2. performing formal verification on the AI model to be tested with the method selected in step b1 to obtain a robustness evaluation result.
Further, in step (6), the test report includes the vulnerable methods, defense strategies, and attack and defense success rates of the AI model data collection and preprocessing stage; the vulnerable methods, defense strategies, and attack and defense success rates of the training and testing stage; and the AI model software-hardware adaptation result.
The invention has the following beneficial effects: the invention innovates a security attack and defense test method by combining a knowledge graph with attack and defense methods, implements defense means suited to the characteristics of the different stages of the AI model life cycle, and realizes full-cycle security attack and defense testing of the AI model; it can effectively defend against various AI attack methods, constructs a security test flow, supports comprehensive security defense of the AI system in its actual deployment environment, and ensures the robustness and security of the AI model.
Drawings
FIG. 1 is a flow chart of the AI safety attack and defense testing method of the present invention;
FIG. 2 is a flow diagram of the full-cycle automated security test portion of the method.
Detailed Description
The present invention is described in detail below with reference to the accompanying drawings.
As shown in FIG. 1, the invention provides an AI safety attack and defense test method that combines software-hardware collaborative security testing with full-cycle automated security testing, analyzes the attack modes and defense strategies at the different stages of the AI life cycle, and realizes AI security defense and secure deployment. It is realized through the following steps:
(1) Inputting the AI model to be tested and its model information.
The AI model file to be tested is uploaded, and the characteristic data of the AI model (data set type, model type, framework type, and training parameters) are input. The AI model file is used for the attack and defense tests, and the input characteristic data are used for attack and defense knowledge graph reasoning.
(2) Executing the full-cycle automated security test method on the AI model characteristic data input in step (1). As shown in FIG. 2, an attack and defense knowledge graph is constructed by a knowledge graph method, and knowledge reasoning is performed to obtain the attack and defense algorithms of the input AI model for the data collection and preprocessing stage and the training and testing stage, as well as the hardware adaptation environment for the deployment stage, for use in the subsequent test steps. The specific steps are as follows:
(2.1) Determining the node types and relationship types of the AI model attack and defense knowledge graph to form the knowledge graph node-relationship rules.
8 node types, 6 relationship types, and 9 triple types are established for the knowledge graph. The node types are 'model ontology', 'data set type', 'model type', 'framework type', 'training parameters', 'system hardware environment', 'attack algorithm', and 'defense algorithm'. The relationship types are 'belongs to', 'first-stage attack', 'second-stage attack', 'first-stage defense', 'second-stage defense', and 'software-hardware adaptation'. The data set type, model type, framework type, and training parameters each form a 'belongs to' relationship with the model ontology, giving 4 triple types; the attack algorithm forms 'first-stage attack' and 'second-stage attack' relationships with the model ontology, giving 2 triple types; the defense algorithm forms 'first-stage defense' and 'second-stage defense' relationships with the model ontology, giving 2 triple types; and the system hardware environment forms a 'software-hardware adaptation' relationship with the model ontology, giving 1 triple type.
The data collection and preprocessing stage is the first stage, the training and testing stage is the second stage, and the deployment stage is the third stage.
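For concreteness, the node-relationship rules above could be encoded as follows. This is a minimal Python sketch; all identifiers (NODE_TYPES, TRIPLE_PATTERNS, and so on) are illustrative assumptions, as the patent does not prescribe a data format.

    # Hypothetical encoding of the attack and defense knowledge graph schema.
    NODE_TYPES = [
        "model_ontology", "dataset_type", "model_type", "framework_type",
        "training_params", "hardware_env", "attack_algorithm", "defense_algorithm",
    ]

    RELATION_TYPES = [
        "belongs_to",
        "first_stage_attack", "second_stage_attack",
        "first_stage_defense", "second_stage_defense",
        "hw_sw_adaptation",
    ]

    # The 9 admissible (head-type, relation, tail-type) triple patterns.
    TRIPLE_PATTERNS = [
        ("dataset_type",      "belongs_to",           "model_ontology"),
        ("model_type",        "belongs_to",           "model_ontology"),
        ("framework_type",    "belongs_to",           "model_ontology"),
        ("training_params",   "belongs_to",           "model_ontology"),
        ("attack_algorithm",  "first_stage_attack",   "model_ontology"),
        ("attack_algorithm",  "second_stage_attack",  "model_ontology"),
        ("defense_algorithm", "first_stage_defense",  "model_ontology"),
        ("defense_algorithm", "second_stage_defense", "model_ontology"),
        ("hardware_env",      "hw_sw_adaptation",     "model_ontology"),
    ]

    def is_valid_triple(head_type: str, relation: str, tail_type: str) -> bool:
        """Check a candidate triple against the node-relationship rules."""
        return (head_type, relation, tail_type) in TRIPLE_PATTERNS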
(2.2) Acquiring the data for constructing the graph via a web crawler according to the node-relationship rules of step (2.1).
Text information is crawled from an academic paper database according to AI paper keywords. The text information contains node and relationship instances of the node types and relationship types of step (2.1), which are extracted into regularized data through regularization extraction and entity extraction.
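As a hypothetical illustration of the regularization-extraction step, the sketch below pulls (attack, relation, model) triples out of crawled text with a regular expression; the sentence pattern and helper names are assumptions, not the patent's actual extraction rules.

    import re

    # Assumed sentence shape: "... <attack-name> attack against <model-name> ..."
    ATTACK_PATTERN = re.compile(
        r"(?P<attack>[\w\-]+) attacks? (?:on|against) (?P<model>[\w\-]+)"
    )

    def extract_attack_triples(text: str):
        """Extract (attack_algorithm, 'attacks', model) triples from crawled text."""
        return [
            (m.group("attack"), "attacks", m.group("model"))
            for m in ATTACK_PATTERN.finditer(text)
        ]

    # Example: prints [('backdoor', 'attacks', 'ResNet-18')]
    print(extract_attack_triples(
        "The BadNets backdoor attack against ResNet-18 is mitigated by fine-pruning."
    ))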
(2.3) Forming triples from the data through the node-relationship rules to form the AI model attack and defense knowledge graph.
The extracted regularized data are formed into a triple set according to the node-relationship rules of step (2.1) and stored in a database to form the complete knowledge graph.
Meanwhile, the triples of the whole attack and defense knowledge graph are converted into matrix form through knowledge embedding to obtain an embedding matrix, which exists as another form of the knowledge graph.
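The patent does not name a particular knowledge-embedding model. As one common choice, the following sketch trains toy TransE-style embeddings, which use exactly the L1 geometry that the reasoning step below relies on; dimensions, learning rate, and function names are illustrative.

    import numpy as np

    def train_transe(triples, entities, relations, dim=32, epochs=200,
                     lr=0.01, margin=1.0, seed=0):
        """Toy TransE training: learn embeddings so that e_h + r is close to
        e_t under the L1 norm for each (head, relation, tail) triple.
        Returns the entity matrix E and relation matrix R (the 'embedding
        matrix' form of the knowledge graph), plus index maps."""
        rng = np.random.default_rng(seed)
        ent = {e: i for i, e in enumerate(entities)}
        rel = {r: i for i, r in enumerate(relations)}
        E = rng.normal(scale=0.1, size=(len(entities), dim))
        R = rng.normal(scale=0.1, size=(len(relations), dim))
        for _ in range(epochs):
            for h, r, t in triples:
                hi, ri, ti = ent[h], rel[r], ent[t]
                tj = int(rng.integers(len(entities)))    # corrupted tail (negative sample)
                pos = np.abs(E[hi] + R[ri] - E[ti]).sum()
                neg = np.abs(E[hi] + R[ri] - E[tj]).sum()
                if pos + margin > neg:                   # margin ranking loss is active
                    g = np.sign(E[hi] + R[ri] - E[ti])   # subgradient of the L1 term
                    E[hi] -= lr * g; R[ri] -= lr * g; E[ti] += lr * g
                    g = np.sign(E[hi] + R[ri] - E[tj])
                    E[hi] += lr * g; R[ri] += lr * g; E[tj] -= lr * g
        return E, R, ent, rel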
(2.4) Matching the characteristic data of the AI model to be tested against the attack and defense knowledge graph constructed in step (2.3), reasoning out the attack and defense methods for the corresponding life-cycle stages through the relationships between nodes, and testing with the corresponding attack and defense methods.
In the matching and reasoning step:
(2.4.1) The characteristic data of the AI model to be tested (namely the data set type, model type, framework type, and training parameters of the AI model) are converted into vector form through knowledge embedding to obtain a feature vector.
(2.4.2) From the feature vector obtained in step (2.4.1) and the relation vectors in the embedding matrix obtained from the attack and defense knowledge graph in step (2.3), the first-stage and second-stage attack and defense methods and the system hardware environment matching the AI model to be tested are computed via the L1 distance (the Manhattan distance: the sum of the absolute coordinate differences between two points of Euclidean space, i.e., of the projections of the segment between them onto the coordinate axes) as the knowledge reasoning result. The first-stage attack methods comprise data poisoning attacks or adversarial example attacks, and the first-stage defense methods comprise abnormal data analysis and abnormal data cleaning; the second-stage attack methods comprise backdoor attacks or adversarial attacks, the second-stage defense methods comprise robustness enhancement and model hardening, and the attack and defense success rates of the attack and defense algorithms on the AI model and their influence on the AI model are analyzed using standardized unit testing and robustness formal verification.
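Continuing the TransE-style sketch above, knowledge reasoning then reduces to a nearest-neighbor search under the L1 distance: for each relation of interest, the candidates whose translated embeddings lie closest to the feature vector of the model under test are selected. Function and variable names remain illustrative.

    import numpy as np

    def infer_heads(model_vec, relation, candidates, E, R, ent, rel, k=3):
        """Rank candidate head entities h for the triple (h, relation, model)
        by the L1 (Manhattan) distance || E[h] + R[relation] - model_vec ||_1;
        the k best-scoring candidates form the knowledge reasoning result."""
        r = R[rel[relation]]
        scored = sorted(
            (np.abs(E[ent[h]] + r - model_vec).sum(), h) for h in candidates
        )
        return [h for _, h in scored[:k]]

    # Hypothetical usage, one query per relation type:
    # first_stage_attacks  = infer_heads(v, "first_stage_attack",  attack_names,  E, R, ent, rel)
    # first_stage_defenses = infer_heads(v, "first_stage_defense", defense_names, E, R, ent, rel)
    # hardware_env         = infer_heads(v, "hw_sw_adaptation",    env_names,     E, R, ent, rel, k=1)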
(2.4.3) The knowledge reasoning results of step (2.4.2) are fed into the corresponding stages of the software-hardware collaborative security test flow for the corresponding tests.
(3) Performing the first-stage test, namely the attack and defense test of the AI model data collection and preprocessing stage.
(3.1) According to the knowledge reasoning result of step (2.4.2), the data poisoning or adversarial example attack method therein is used to construct an attack-sample-contaminated version of the data set used by the AI model, realizing the attack of the data collection and preprocessing stage.
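As one concrete stand-in for whichever first-stage attack the reasoning selects, the sketch below contaminates a training set by label flipping, a simple form of data poisoning; the poisoning rate and helper name are assumptions.

    import numpy as np

    def label_flip_poison(X, y, num_classes, rate=0.1, seed=0):
        """Contaminate a training set by flipping the labels of a random
        fraction of samples: a simple data poisoning attack for the data
        collection and preprocessing stage (illustrative only)."""
        rng = np.random.default_rng(seed)
        y_poisoned = y.copy()
        n_poison = int(rate * len(y))
        idx = rng.choice(len(y), size=n_poison, replace=False)
        # shift each chosen label to a different, random class
        y_poisoned[idx] = (y[idx] + rng.integers(1, num_classes, size=n_poison)) % num_classes
        return X, y_poisoned, idx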
(3.2) An automatic abnormal-data analysis method is selected, the data of the attacked AI model's data set are input, the sample characteristics and distribution of the data are analyzed, and the abnormal data of the attacked data set are cleaned automatically with an automatic abnormal-data cleaning method.
(3.2.1) In the abnormal data detection step: first, the type of the abnormal data is analyzed, detecting it by comparing the distribution characteristics of the abnormal and normal data; then the abnormal data are processed, choosing to repair or discard the data according to the abnormal data type; the data to be repaired are passed to the automatic abnormal-data cleaning step.
(3.2.2) In the automatic abnormal-data cleaning step:
The first-stage defense method is obtained from the knowledge reasoning result of step (2.4.2) of the full-cycle automated security test method.
The abnormal data to be repaired detected in the abnormal data detection step (3.2.1) are cleaned using this first-stage defense method.
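A minimal sketch of the detect-then-clean loop of steps (3.2.1) and (3.2.2), using a 3-sigma z-score rule as a hypothetical distribution-based anomaly analysis; a real run would substitute the first-stage defense method returned by step (2.4.2).

    import numpy as np

    def detect_anomalies(X, threshold=3.0):
        """Flag samples whose features deviate from the per-feature mean by
        more than `threshold` standard deviations (z-score rule of thumb)."""
        mu, sigma = X.mean(axis=0), X.std(axis=0) + 1e-12
        z = np.abs((X - mu) / sigma)
        return np.where(z.max(axis=1) > threshold)[0]

    def clean_dataset(X, y, repairable=None, threshold=3.0):
        """Drop detected anomalies, or repair those marked repairable by
        clipping their features back into the 3-sigma band."""
        mu, sigma = X.mean(axis=0), X.std(axis=0) + 1e-12
        bad = detect_anomalies(X, threshold)
        if repairable is not None:
            fix = np.intersect1d(bad, repairable)
            X = X.copy()
            X[fix] = np.clip(X[fix], mu - threshold * sigma, mu + threshold * sigma)
            bad = np.setdiff1d(bad, fix)
        keep = np.setdiff1d(np.arange(len(X)), bad)
        return X[keep], y[keep]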
(3.3) The success rates of the attack method before and after defense are collected to obtain the attack and defense success rates, the effect of the different defense methods is analyzed, and the attack and defense test result of the data collection and preprocessing stage is obtained.
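The attack and defense success rates reduce to simple error-rate comparisons. A sketch, assuming a classifier object with an sklearn-style predict method:

    import numpy as np

    def attack_success_rate(model, x_attacked, y_true):
        """Fraction of attacked inputs that the model now misclassifies."""
        return float(np.mean(model.predict(x_attacked) != y_true))

    # Collected once before and once after the defense is applied:
    # asr_before = attack_success_rate(model, x_attacked, y)
    # asr_after  = attack_success_rate(defended_model, x_attacked, y)
    # The gap asr_before - asr_after measures the defense's effect.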
(4) Performing the second-stage test, namely the attack and defense test of the AI model training and testing stage.
(4.1) According to the knowledge reasoning result of step (2.4.2), the obtained backdoor attack or adversarial attack is used to implant a backdoor trigger or perturbation in the training and testing stage, realizing the attack of the training and testing stage.
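As an illustrative second-stage attack, the following stamps a BadNets-style pixel-patch trigger into a fraction of training images and relabels them; the patent does not fix a particular backdoor method, so the patch size, rate, and names are assumptions.

    import numpy as np

    def add_trigger(images, labels, target_class, rate=0.05, patch=3, value=1.0, seed=0):
        """BadNets-style backdoor: stamp a small bright patch into the corner
        of a fraction of images and relabel them to the attacker's target class.
        Assumes image tensors shaped (N, H, W) or (N, H, W, C) in [0, 1]."""
        rng = np.random.default_rng(seed)
        imgs, lbls = images.copy(), labels.copy()
        idx = rng.choice(len(imgs), size=int(rate * len(imgs)), replace=False)
        imgs[idx, -patch:, -patch:] = value   # bottom-right trigger patch
        lbls[idx] = target_class
        return imgs, lbls, idx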
(4.2) In the training step, robustness enhancement and model hardening are performed on the AI model to train a robust model.
In the robustness enhancement and model hardening phases of the AI model training step, the defense methods obtained from the knowledge reasoning result of step (2.4.2), namely the inferred robustness enhancement and model hardening algorithms, are used to defend against the backdoor attack or adversarial attack.
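One widely used robustness-enhancement instance is adversarial training. The sketch below performs a single FGSM adversarial training step in PyTorch; it illustrates the kind of model-hardening algorithm the reasoning step might return, not the patent's prescribed defense.

    import torch
    import torch.nn.functional as F

    def fgsm_adversarial_training_step(model, x, y, optimizer, eps=8 / 255):
        """One robustness-enhancement step: craft FGSM perturbations on the
        fly and train on them (a common form of adversarial training)."""
        model.train()
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        grad, = torch.autograd.grad(loss, x)
        x_adv = (x + eps * grad.sign()).clamp(0, 1).detach()

        optimizer.zero_grad()
        adv_loss = F.cross_entropy(model(x_adv), y)
        adv_loss.backward()
        optimizer.step()
        return adv_loss.item()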
(4.3) In the model testing step, standardized unit testing and robustness formal verification are performed on the AI model, and the influence of the attacks and defenses on the AI model during model training and testing is analyzed.
(4.3.1) The standardized unit test includes the following test criteria:
a1. module unit test criterion: the functions of different modules of the model are tested separately; input data with predictable results are constructed, and the predicted standard result is compared with the actual module output to achieve the test purpose;
a2. neuron coverage test criterion: the output range of each neuron is divided into several intervals of equal length, each interval representing one characteristic behavior of the neuron; whether a logical behavior is covered by the test data is judged by whether the neuron's output value falls within the corresponding interval.
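The neuron coverage criterion a2 corresponds to what the literature calls k-multisection neuron coverage. A sketch, assuming the per-neuron output ranges have been profiled beforehand (e.g., on training data):

    import numpy as np

    def k_multisection_coverage(test_acts, lo, hi, k=10):
        """k-multisection neuron coverage: each neuron's output range
        [lo, hi] is split into k equal-length intervals; report the fraction
        of (neuron, interval) pairs hit by the test activations.

        test_acts: array of shape (num_test_inputs, num_neurons);
        lo, hi: per-neuron range bounds, each of shape (num_neurons,)."""
        width = (hi - lo) / k + 1e-12
        bins = np.clip(((test_acts - lo) / width).astype(int), 0, k - 1)
        covered = np.zeros((test_acts.shape[1], k), dtype=bool)
        for row in bins:
            covered[np.arange(row.size), row] = True
        return covered.mean()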
(4.3.2) The robustness formal verification comprises the following steps:
b1. A suitable formal verification method is selected according to the activation function type and application scenario of the AI model, using the knowledge reasoning result of step (2.4.2).
b2. Formal verification is performed on the AI model to be tested with the method selected in step b1 to obtain a robustness evaluation result.
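The patent leaves the choice of formal verification method open. As one example suited to ReLU networks, the following sketch propagates interval bounds through a small fully connected network to certify that no L-infinity perturbation of radius eps can change the prediction; layer representation and names are assumptions.

    import numpy as np

    def interval_bounds(layers, x, eps):
        """Interval bound propagation (IBP) through affine + ReLU layers.
        layers: list of (W, b) pairs; returns elementwise lower and upper
        bounds on the logits over the L-infinity ball of radius eps around x."""
        lo, hi = x - eps, x + eps
        for i, (W, b) in enumerate(layers):
            Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
            lo, hi = Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b
            if i < len(layers) - 1:                  # ReLU on hidden layers only
                lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)
        return lo, hi

    def certified_robust(layers, x, eps, label):
        """Sound but incomplete check: the prediction provably cannot change
        if the true logit's lower bound beats every other logit's upper bound."""
        lo, hi = interval_bounds(layers, x, eps)
        return all(lo[label] > hi[j] for j in range(lo.size) if j != label)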
(4.4) The success rates of the attack method before and after defense are collected to obtain the attack and defense success rates, the effect of the different defense methods is analyzed, and the attack and defense test result of the model training and testing stage is obtained.
(5) Performing the online test of AI system deployment: the hardware system environment in which the AI model is deployed is analyzed and adapted. Simulation adaptation is performed between the AI model and the system hardware environment best suited to the AI model to be tested according to the knowledge reasoning result of step (2.4.2), obtaining the AI model software-hardware adaptation result.
(6) The test results of the data collection and preprocessing stage, the training and testing stage, and the deployment stage of the AI life cycle are integrated into a test report, and security defense strategies and secure deployment methods are provided for the different stages.
The test report specifically comprises the vulnerable methods, defense strategies, and attack and defense success rates of the AI model data collection and preprocessing stage; the vulnerable methods, defense strategies, and attack and defense success rates of the training and testing stage; and the AI model software-hardware adaptation result.

Claims (10)

1. An AI safety attack and defense test method, characterized by comprising the following steps:
(1) inputting an AI model file to be tested and the characteristic data of the AI model;
(2) executing a full-cycle automated security test method on the AI model characteristic data input in step (1): constructing an attack and defense knowledge graph by a knowledge graph method, and performing knowledge reasoning to obtain the attack and defense algorithms for the input AI model in the data collection and preprocessing stage and the training and testing stage, as well as the hardware adaptation environment for the deployment stage;
(3) performing the first-stage test: performing the attack and defense test of the AI model data collection and preprocessing stage according to the first-stage attack and defense methods in the knowledge reasoning result of step (2);
(4) performing the second-stage test: performing the attack and defense test of the AI model training and testing stage according to the second-stage attack and defense methods in the knowledge reasoning result of step (2);
(5) performing the online test of AI system deployment: performing simulation adaptation between the AI model and the system hardware environment best suited to the AI model to be tested according to the knowledge reasoning result of step (2), obtaining the AI model software-hardware adaptation result;
(6) integrating the test results of the data collection and preprocessing stage, the training and testing stage, and the deployment stage of the AI life cycle into a test report, and providing security defense strategies and secure deployment methods for the different stages.
2. The AI safety attack and defense test method according to claim 1, characterized in that, in step (1), the characteristic data of the AI model include the data set type, model type, framework type, and training parameters.
3. The AI safety attack and defense test method according to claim 1, characterized in that step (2) comprises the sub-steps of:
(2.1) determining the node types and relationship types of the AI model attack and defense knowledge graph to form the knowledge graph node-relationship rules;
8 node types, 6 relationship types, and 9 triple types are established for the knowledge graph; the node types are 'model ontology', 'data set type', 'model type', 'framework type', 'training parameters', 'system hardware environment', 'attack algorithm', and 'defense algorithm'; the relationship types are 'belongs to', 'first-stage attack', 'second-stage attack', 'first-stage defense', 'second-stage defense', and 'software-hardware adaptation'; the data set type, model type, framework type, and training parameters each form a 'belongs to' relationship with the model ontology, giving 4 triple types; the attack algorithm forms 'first-stage attack' and 'second-stage attack' relationships with the model ontology, giving 2 triple types; the defense algorithm forms 'first-stage defense' and 'second-stage defense' relationships with the model ontology, giving 2 triple types; and the system hardware environment forms a 'software-hardware adaptation' relationship with the model ontology, giving 1 triple type;
(2.2) acquiring the data for constructing the graph according to the node-relationship rules of step (2.1);
text information is crawled from an academic paper database according to AI paper keywords; the text information contains node and relationship instances of the node types and relationship types of step (2.1), which are extracted into regularized data through regularization extraction and entity extraction;
(2.3) forming triples from the data through the node-relationship rules to form the AI model attack and defense knowledge graph;
the extracted regularized data are formed into a triple set according to the node-relationship rules of step (2.1) and stored in a database to form the complete knowledge graph;
meanwhile, the triples of the whole attack and defense knowledge graph are converted into matrix form through knowledge embedding to obtain an embedding matrix, which exists as another form of the knowledge graph;
(2.4) matching the characteristic data of the AI model to be tested against the attack and defense knowledge graph constructed in step (2.3), reasoning out the attack and defense methods for the corresponding life-cycle stages through the relationships between nodes, and testing with the corresponding attack and defense methods.
4. The AI safety attack and defense test method according to claim 3, characterized in that, in the matching and reasoning step of (2.4):
(2.4.1) the characteristic data of the AI model to be tested are converted into vector form through knowledge embedding to obtain a feature vector;
(2.4.2) from the feature vector obtained in step (2.4.1) and the relation vectors in the embedding matrix obtained from the attack and defense knowledge graph in step (2.3), the first-stage and second-stage attack and defense methods and the system hardware environment matching the AI model to be tested are computed via the L1 distance as the knowledge reasoning result; the first-stage attack methods comprise data poisoning attacks or adversarial example attacks, and the first-stage defense methods comprise abnormal data analysis and abnormal data cleaning; the second-stage attack methods comprise backdoor attacks or adversarial attacks, the second-stage defense methods comprise robustness enhancement and model hardening, and the attack and defense success rates of the attack and defense algorithms on the AI model and their influence on the AI model are analyzed using standardized unit testing and robustness formal verification;
(2.4.3) the knowledge reasoning results of step (2.4.2) are fed into the corresponding stages of the software-hardware collaborative security test flow for the corresponding tests.
5. The AI safety attack and defense test method according to claim 1, characterized in that step (3) comprises the sub-steps of:
(3.1) according to the knowledge reasoning result of step (2), using the first-stage attack method therein to construct an attack-sample-contaminated version of the data set used by the AI model, realizing the attack of the data collection and preprocessing stage;
(3.2) selecting an automatic abnormal-data analysis method, inputting the data of the attacked AI model's data set, analyzing the sample characteristics and distribution of the data set, and automatically cleaning the abnormal data of the attacked data set with an automatic abnormal-data cleaning method;
(3.3) collecting the success rates of the attack method before and after defense to obtain the attack and defense success rates, analyzing the effect of the different defense methods, and obtaining the attack and defense test result of the data collection and preprocessing stage.
6. The AI safety attack and defense test method according to claim 5, characterized in that step (3.2) comprises the sub-steps of:
(3.2.1) in the abnormal data detection step: first, the type of the abnormal data is analyzed, detecting it by comparing the distribution characteristics of the abnormal and normal data; then the abnormal data are processed, choosing to repair or discard the data according to the abnormal data type; the data to be repaired are passed to the automatic abnormal-data cleaning step;
(3.2.2) in the automatic abnormal-data cleaning step:
the first-stage defense method is obtained from the knowledge reasoning result of step (2) of the full-cycle automated security test method;
the abnormal data to be repaired detected in the abnormal data detection step (3.2.1) are cleaned using this first-stage defense method.
7. The AI safety attack and defense test method according to claim 1, characterized in that step (4) comprises the sub-steps of:
(4.1) according to the knowledge reasoning result of step (2), using the obtained backdoor attack or adversarial attack to implant a backdoor trigger or perturbation in the training and testing stage, realizing the attack of the training and testing stage;
(4.2) in the training step, according to the knowledge reasoning result of step (2), performing robustness enhancement and model hardening on the AI model with the obtained second-stage defense method, defending against the second-stage attack method;
(4.3) in the model testing step, performing standardized unit testing and robustness formal verification on the AI model, and analyzing the influence of the attacks and defenses on the AI model during model training and testing;
(4.4) collecting the success rates of the attack method before and after defense to obtain the attack and defense success rates, analyzing the effect of the different defense methods, and obtaining the attack and defense test result of the model training and testing stage.
8. The AI safety attack and defense test method according to claim 7, characterized in that, in step (4.3), the standardized unit test comprises:
a1. module unit test: the functions of different modules of the model are tested separately; input data with predictable results are constructed, and the predicted standard result is compared with the actual module output to achieve the test purpose;
a2. neuron coverage test: the output range of each neuron is divided into several intervals of equal length, each interval representing one characteristic behavior of the neuron; whether a logical behavior is covered by the test data is judged by whether the neuron's output value falls within the corresponding interval.
9. The AI safety attack and defense test method according to claim 7, characterized in that, in step (4.3), the robustness formal verification comprises:
b1. selecting a suitable formal verification method according to the activation function type and application scenario of the AI model, using the knowledge reasoning result of step (2);
b2. performing formal verification on the AI model to be tested with the method selected in step b1 to obtain a robustness evaluation result.
10. The AI safety attack and defense test method according to claim 1, characterized in that, in step (6), the test report includes the vulnerable methods, defense strategies, and attack and defense success rates of the AI model data collection and preprocessing stage; the vulnerable methods, defense strategies, and attack and defense success rates of the training and testing stage; and the AI model software-hardware adaptation result.
Application CN202210137681.0A, filed 2022-02-15 (priority 2022-02-15): AI safety attack and defense test method. Status: Pending. Published as CN114579962A (en).

Priority Applications (1)

Application Number: CN202210137681.0A; Priority Date: 2022-02-15; Filing Date: 2022-02-15; Title: AI safety attack and defense test method

Applications Claiming Priority (1)

Application Number: CN202210137681.0A; Priority Date: 2022-02-15; Filing Date: 2022-02-15; Title: AI safety attack and defense test method

Publications (1)

Publication Number: CN114579962A; Publication Date: 2022-06-03

Family

Family ID: 81770316

Family Applications (1)

Application Number: CN202210137681.0A; Title: AI safety attack and defense test method; Priority Date: 2022-02-15; Filing Date: 2022-02-15; Status: Pending

Country Status (1)

Country: CN (1); Publication: CN114579962A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
WO2023246237A1 * (priority date 2022-06-22, publication date 2023-12-28; assignee: Alipay (Hangzhou) Information Technology Co., Ltd.): Attack-defense confrontation simulation test method and system for network model


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination