CN114579962A - AI safety attack and defense test method - Google Patents

AI safety attack and defense test method

Info

Publication number
CN114579962A
CN114579962A
Authority
CN
China
Prior art keywords
attack
defense
model
stage
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210137681.0A
Other languages
Chinese (zh)
Inventor
梁炜 (Liang Wei)
秦湛 (Qin Zhan)
任奎 (Ren Kui)
姚宏伟 (Yao Hongwei)
林博涵 (Lin Bohan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202210137681.0A
Publication of CN114579962A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F 21/55: Detecting local intrusion or implementing counter-measures
    • G06F 21/554: Detecting local intrusion or implementing counter-measures involving event detection and direct action
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/21: Design, administration or maintenance of databases
    • G06F 16/215: Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G06F 16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F 16/284: Relational databases
    • G06F 16/288: Entity relationship models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an AI safety attack and defense test method, which comprises: a software-hardware collaborative security test flow and a full-cycle automated security test method. Aiming at the defects that large-scale, complex AI systems are difficult to subject to comprehensive security detection and that current security test methods are single-purpose and poorly generalizable, the method summarizes the stages of the AI system life cycle, analyzes the attack methods and corresponding defense strategies present at each stage of the AI system with the full-cycle automated adaptive security test method, constructs a security test flow, and supports comprehensive security defense of the AI system in its actual deployment environment.

Description

AI safety attack and defense test method
Technical Field
The invention belongs to the technical field of artificial intelligence (AI) security, and particularly relates to an AI safety attack and defense testing method.
Background
In recent years, the rapid development of artificial intelligence technology has profoundly changed human society and driven the economy and society from digitalization and networking toward intelligentization. At the same time, the latent security hazards of artificial intelligence technology have gradually been exposed. The overall development of artificial intelligence is still at an early stage, and AI techniques represented by deep learning generally suffer from limited robustness and interpretability; consequently, the inherent vulnerability of highly complex AI systems is insufficiently understood, effective means of security verification, testing, and enhancement are lacking, and the broad practical deployment of AI technology is seriously hindered. In actual usage scenarios, AI systems further face complex data-link environments, diverse model-execution software frameworks and hardware platforms, and a wide variety of AI attack modes.
Disclosure of Invention
In view of the deficiencies of the prior art, the invention aims to provide an AI safety attack and defense testing method.
The purpose of the invention is achieved by the following technical scheme: an AI safety attack and defense test method comprising the following steps:
(1) inputting an AI model file to be tested and the characteristic data of the AI model;
(2) executing a full-cycle automated security test method on the AI model characteristic data input in step (1): constructing an attack and defense knowledge graph by a knowledge graph method, and performing knowledge reasoning to obtain the attack and defense algorithms for the input AI model in the data collection and preprocessing stage and the training and testing stage, as well as the hardware adaptation environment for the deployment stage;
(3) performing the first-stage test: performing the attack and defense test of the AI model data collection and preprocessing stage according to the first-stage attack and defense methods in the knowledge reasoning result of step (2);
(4) performing the second-stage test: performing the attack and defense test of the AI model training and testing stage according to the second-stage attack and defense methods in the knowledge reasoning result of step (2);
(5) performing the online test of AI system deployment: performing simulation adaptation between the AI model and the system hardware environment best suited to the AI model to be tested according to the knowledge reasoning result of step (2), obtaining the AI model software-hardware adaptation result;
(6) integrating the test results of the data collection and preprocessing stage, the training and testing stage, and the deployment stage of the AI life cycle into a test report, and providing security defense strategies and secure deployment methods for the different stages.
Further, in step (1), the characteristic data of the AI model include the data set type, model type, framework type, and training parameters.
Further, step (2) comprises the following sub-steps:
(2.1) determining the node types and relationship types of the AI model attack and defense knowledge graph to form the knowledge graph node-relationship rules;
8 node types, 6 relationship types, and 9 triple types are established for the knowledge graph; the node types are 'model ontology', 'data set type', 'model type', 'framework type', 'training parameters', 'system hardware environment', 'attack algorithm', and 'defense algorithm'; the relationship types are 'belongs to', 'first-stage attack', 'second-stage attack', 'first-stage defense', 'second-stage defense', and 'software-hardware adaptation'; the data set type, model type, framework type, and training parameters each form a 'belongs to' relationship with the model ontology, giving 4 triple types; the attack algorithm forms 'first-stage attack' and 'second-stage attack' relationships with the model ontology, giving 2 triple types; the defense algorithm forms 'first-stage defense' and 'second-stage defense' relationships with the model ontology, giving 2 triple types; and the system hardware environment forms a 'software-hardware adaptation' relationship with the model ontology, giving 1 triple type;
(2.2) acquiring the data for constructing the graph via a web crawler according to the node-relationship rules of step (2.1);
text information is crawled from an academic paper database according to AI paper keywords; the text information contains node and relationship instances of the node types and relationship types of step (2.1), which are extracted into regularized data through regularization extraction and entity extraction;
(2.3) forming triples from the data through the node-relationship rules to form the AI model attack and defense knowledge graph;
the extracted regularized data are formed into a triple set according to the node-relationship rules of step (2.1) and stored in a database to form the complete knowledge graph;
meanwhile, the triples of the whole attack and defense knowledge graph are converted into matrix form through knowledge embedding to obtain an embedding matrix, which exists as another form of the knowledge graph;
(2.4) matching the characteristic data of the AI model to be tested against the attack and defense knowledge graph constructed in step (2.3), reasoning out the attack and defense methods for the corresponding life-cycle stages through the relationships between nodes, and testing with the corresponding attack and defense methods.
Further, in the matching and reasoning step of (2.4):
(2.4.1) the characteristic data of the AI model to be tested are converted into vector form through knowledge embedding to obtain a feature vector;
(2.4.2) from the feature vector obtained in step (2.4.1) and the relation vectors in the embedding matrix obtained from the attack and defense knowledge graph in step (2.3), the first-stage and second-stage attack and defense methods and the system hardware environment matching the AI model to be tested are computed via the L1 distance as the knowledge reasoning result; the first-stage attack methods comprise data poisoning attacks or adversarial example attacks, and the first-stage defense methods comprise abnormal data analysis and abnormal data cleaning; the second-stage attack methods comprise backdoor attacks or adversarial attacks, the second-stage defense methods comprise robustness enhancement and model hardening, and the attack and defense success rates of the attack and defense algorithms on the AI model and their influence on the AI model are analyzed using standardized unit testing and robustness formal verification;
(2.4.3) the knowledge reasoning results of step (2.4.2) are fed into the corresponding stages of the software-hardware collaborative security test flow for the corresponding tests.
Further, step (3) comprises the following sub-steps:
(3.1) according to the knowledge reasoning result of step (2), using the first-stage attack method therein to construct an attack-sample-contaminated version of the data set used by the AI model, realizing the attack of the data collection and preprocessing stage;
(3.2) selecting an automatic abnormal-data analysis method, inputting the data of the attacked AI model's data set, analyzing the sample characteristics and distribution of the data set, and automatically cleaning the abnormal data of the attacked data set with an automatic abnormal-data cleaning method;
(3.3) collecting the success rates of the attack method before and after defense to obtain the attack and defense success rates, analyzing the effect of the different defense methods, and obtaining the attack and defense test result of the data collection and preprocessing stage.
Further, step (3.2) comprises the following sub-steps:
(3.2.1) in the abnormal data detection step: first, the type of the abnormal data is analyzed, detecting it by comparing the distribution characteristics of the abnormal and normal data; then the abnormal data are processed, choosing to repair or discard the data according to the abnormal data type; the data to be repaired are passed to the automatic abnormal-data cleaning step;
(3.2.2) in the automatic abnormal-data cleaning step:
the first-stage defense method is obtained from the knowledge reasoning result of step (2) of the full-cycle automated security test method;
the abnormal data to be repaired detected in the abnormal data detection step (3.2.1) are cleaned using this first-stage defense method.
Further, step (4) comprises the following sub-steps:
(4.1) according to the knowledge reasoning result of step (2), using the obtained backdoor attack or adversarial attack to implant a backdoor trigger or perturbation in the training and testing stage, realizing the attack of the training and testing stage;
(4.2) in the training step, according to the knowledge reasoning result of step (2), performing robustness enhancement and model hardening on the AI model with the obtained second-stage defense method, defending against the second-stage attack method;
(4.3) in the model testing step, performing standardized unit testing and robustness formal verification on the AI model, and analyzing the influence of the attacks and defenses on the AI model during model training and testing;
(4.4) collecting the success rates of the attack method before and after defense to obtain the attack and defense success rates, analyzing the effect of the different defense methods, and obtaining the attack and defense test result of the model training and testing stage.
Further, in step (4.3), the standardized unit test comprises:
a1. module unit test: the functions of different modules of the model are tested separately; input data with predictable results are constructed, and the predicted standard result is compared with the actual module output to achieve the test purpose;
a2. neuron coverage test: the output range of each neuron is divided into several intervals of equal length, each interval representing one characteristic behavior of the neuron; whether a logical behavior is covered by the test data is judged by whether the neuron's output value falls within the corresponding interval.
Further, in step (4.3), the robustness formal verification comprises:
b1. selecting a suitable formal verification method according to the activation function type and application scenario of the AI model, using the knowledge reasoning result of step (2);
b2. performing formal verification on the AI model to be tested with the method selected in step b1 to obtain a robustness evaluation result.
Further, in step (6), the test report includes the vulnerable methods, defense strategies, and attack and defense success rates of the AI model data collection and preprocessing stage; the vulnerable methods, defense strategies, and attack and defense success rates of the training and testing stage; and the AI model software-hardware adaptation result.
The invention has the following beneficial effects: the invention innovates a security attack and defense test method by combining a knowledge graph with attack and defense methods, implements defense means suited to the characteristics of the different stages of the AI model life cycle, and realizes full-cycle security attack and defense testing of the AI model; it can effectively defend against various AI attack methods, constructs a security test flow, supports comprehensive security defense of the AI system in its actual deployment environment, and ensures the robustness and security of the AI model.
Drawings
FIG. 1 is a flow chart of the AI safety attack and defense testing method of the present invention;
FIG. 2 is a flow diagram of the full-cycle automated security test portion of the method.
Detailed Description
The present invention is described in detail below with reference to the accompanying drawings.
As shown in FIG. 1, the invention provides an AI safety attack and defense test method that combines software-hardware collaborative security testing with full-cycle automated security testing, analyzes the attack modes and defense strategies at the different stages of the AI life cycle, and realizes AI security defense and secure deployment. It is realized through the following steps:
(1) Inputting the AI model to be tested and its model information.
The AI model file to be tested is uploaded, and the characteristic data of the AI model (data set type, model type, framework type, and training parameters) are input. The AI model file is used for the attack and defense tests, and the input characteristic data are used for attack and defense knowledge graph reasoning.
(2) Executing the full-cycle automated security test method on the AI model characteristic data input in step (1). As shown in FIG. 2, an attack and defense knowledge graph is constructed by a knowledge graph method, and knowledge reasoning is performed to obtain the attack and defense algorithms of the input AI model for the data collection and preprocessing stage and the training and testing stage, as well as the hardware adaptation environment for the deployment stage, for use in the subsequent test steps. The specific steps are as follows:
(2.1) Determining the node types and relationship types of the AI model attack and defense knowledge graph to form the knowledge graph node-relationship rules.
8 node types, 6 relationship types, and 9 triple types are established for the knowledge graph. The node types are 'model ontology', 'data set type', 'model type', 'framework type', 'training parameters', 'system hardware environment', 'attack algorithm', and 'defense algorithm'. The relationship types are 'belongs to', 'first-stage attack', 'second-stage attack', 'first-stage defense', 'second-stage defense', and 'software-hardware adaptation'. The data set type, model type, framework type, and training parameters each form a 'belongs to' relationship with the model ontology, giving 4 triple types; the attack algorithm forms 'first-stage attack' and 'second-stage attack' relationships with the model ontology, giving 2 triple types; the defense algorithm forms 'first-stage defense' and 'second-stage defense' relationships with the model ontology, giving 2 triple types; and the system hardware environment forms a 'software-hardware adaptation' relationship with the model ontology, giving 1 triple type.
The data collection and preprocessing stage is the first stage, the training and testing stage is the second stage, and the deployment stage is the third stage.
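For concreteness, the node-relationship rules above could be encoded as follows. This is a minimal Python sketch; all identifiers (NODE_TYPES, TRIPLE_PATTERNS, and so on) are illustrative assumptions, as the patent does not prescribe a data format.

    # Hypothetical encoding of the attack and defense knowledge graph schema.
    NODE_TYPES = [
        "model_ontology", "dataset_type", "model_type", "framework_type",
        "training_params", "hardware_env", "attack_algorithm", "defense_algorithm",
    ]

    RELATION_TYPES = [
        "belongs_to",
        "first_stage_attack", "second_stage_attack",
        "first_stage_defense", "second_stage_defense",
        "hw_sw_adaptation",
    ]

    # The 9 admissible (head-type, relation, tail-type) triple patterns.
    TRIPLE_PATTERNS = [
        ("dataset_type",      "belongs_to",           "model_ontology"),
        ("model_type",        "belongs_to",           "model_ontology"),
        ("framework_type",    "belongs_to",           "model_ontology"),
        ("training_params",   "belongs_to",           "model_ontology"),
        ("attack_algorithm",  "first_stage_attack",   "model_ontology"),
        ("attack_algorithm",  "second_stage_attack",  "model_ontology"),
        ("defense_algorithm", "first_stage_defense",  "model_ontology"),
        ("defense_algorithm", "second_stage_defense", "model_ontology"),
        ("hardware_env",      "hw_sw_adaptation",     "model_ontology"),
    ]

    def is_valid_triple(head_type: str, relation: str, tail_type: str) -> bool:
        """Check a candidate triple against the node-relationship rules."""
        return (head_type, relation, tail_type) in TRIPLE_PATTERNS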
(2.2) Acquiring the data for constructing the graph via a web crawler according to the node-relationship rules of step (2.1).
Text information is crawled from an academic paper database according to AI paper keywords. The text information contains node and relationship instances of the node types and relationship types of step (2.1), which are extracted into regularized data through regularization extraction and entity extraction.
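As a hypothetical illustration of the regularization-extraction step, the sketch below pulls (attack, relation, model) triples out of crawled text with a regular expression; the sentence pattern and helper names are assumptions, not the patent's actual extraction rules.

    import re

    # Assumed sentence shape: "... <attack-name> attack against <model-name> ..."
    ATTACK_PATTERN = re.compile(
        r"(?P<attack>[\w\-]+) attacks? (?:on|against) (?P<model>[\w\-]+)"
    )

    def extract_attack_triples(text: str):
        """Extract (attack_algorithm, 'attacks', model) triples from crawled text."""
        return [
            (m.group("attack"), "attacks", m.group("model"))
            for m in ATTACK_PATTERN.finditer(text)
        ]

    # Example: prints [('backdoor', 'attacks', 'ResNet-18')]
    print(extract_attack_triples(
        "The BadNets backdoor attack against ResNet-18 is mitigated by fine-pruning."
    ))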
(2.3) Forming triples from the data through the node-relationship rules to form the AI model attack and defense knowledge graph.
The extracted regularized data are formed into a triple set according to the node-relationship rules of step (2.1) and stored in a database to form the complete knowledge graph.
Meanwhile, the triples of the whole attack and defense knowledge graph are converted into matrix form through knowledge embedding to obtain an embedding matrix, which exists as another form of the knowledge graph.
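The patent does not name a particular knowledge-embedding model. As one common choice, the following sketch trains toy TransE-style embeddings, which use exactly the L1 geometry that the reasoning step below relies on; dimensions, learning rate, and function names are illustrative.

    import numpy as np

    def train_transe(triples, entities, relations, dim=32, epochs=200,
                     lr=0.01, margin=1.0, seed=0):
        """Toy TransE training: learn embeddings so that e_h + r is close to
        e_t under the L1 norm for each (head, relation, tail) triple.
        Returns the entity matrix E and relation matrix R (the 'embedding
        matrix' form of the knowledge graph), plus index maps."""
        rng = np.random.default_rng(seed)
        ent = {e: i for i, e in enumerate(entities)}
        rel = {r: i for i, r in enumerate(relations)}
        E = rng.normal(scale=0.1, size=(len(entities), dim))
        R = rng.normal(scale=0.1, size=(len(relations), dim))
        for _ in range(epochs):
            for h, r, t in triples:
                hi, ri, ti = ent[h], rel[r], ent[t]
                tj = int(rng.integers(len(entities)))    # corrupted tail (negative sample)
                pos = np.abs(E[hi] + R[ri] - E[ti]).sum()
                neg = np.abs(E[hi] + R[ri] - E[tj]).sum()
                if pos + margin > neg:                   # margin ranking loss is active
                    g = np.sign(E[hi] + R[ri] - E[ti])   # subgradient of the L1 term
                    E[hi] -= lr * g; R[ri] -= lr * g; E[ti] += lr * g
                    g = np.sign(E[hi] + R[ri] - E[tj])
                    E[hi] += lr * g; R[ri] += lr * g; E[tj] -= lr * g
        return E, R, ent, rel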
(2.4) Matching the characteristic data of the AI model to be tested against the attack and defense knowledge graph constructed in step (2.3), reasoning out the attack and defense methods for the corresponding life-cycle stages through the relationships between nodes, and testing with the corresponding attack and defense methods.
In the matching and reasoning step:
(2.4.1) The characteristic data of the AI model to be tested (namely the data set type, model type, framework type, and training parameters of the AI model) are converted into vector form through knowledge embedding to obtain a feature vector.
(2.4.2) From the feature vector obtained in step (2.4.1) and the relation vectors in the embedding matrix obtained from the attack and defense knowledge graph in step (2.3), the first-stage and second-stage attack and defense methods and the system hardware environment matching the AI model to be tested are computed via the L1 distance (the Manhattan distance: the sum of the absolute coordinate differences between two points of Euclidean space, i.e., of the projections of the segment between them onto the coordinate axes) as the knowledge reasoning result. The first-stage attack methods comprise data poisoning attacks or adversarial example attacks, and the first-stage defense methods comprise abnormal data analysis and abnormal data cleaning; the second-stage attack methods comprise backdoor attacks or adversarial attacks, the second-stage defense methods comprise robustness enhancement and model hardening, and the attack and defense success rates of the attack and defense algorithms on the AI model and their influence on the AI model are analyzed using standardized unit testing and robustness formal verification.
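Continuing the TransE-style sketch above, knowledge reasoning then reduces to a nearest-neighbor search under the L1 distance: for each relation of interest, the candidates whose translated embeddings lie closest to the feature vector of the model under test are selected. Function and variable names remain illustrative.

    import numpy as np

    def infer_heads(model_vec, relation, candidates, E, R, ent, rel, k=3):
        """Rank candidate head entities h for the triple (h, relation, model)
        by the L1 (Manhattan) distance || E[h] + R[relation] - model_vec ||_1;
        the k best-scoring candidates form the knowledge reasoning result."""
        r = R[rel[relation]]
        scored = sorted(
            (np.abs(E[ent[h]] + r - model_vec).sum(), h) for h in candidates
        )
        return [h for _, h in scored[:k]]

    # Hypothetical usage, one query per relation type:
    # first_stage_attacks  = infer_heads(v, "first_stage_attack",  attack_names,  E, R, ent, rel)
    # first_stage_defenses = infer_heads(v, "first_stage_defense", defense_names, E, R, ent, rel)
    # hardware_env         = infer_heads(v, "hw_sw_adaptation",    env_names,     E, R, ent, rel, k=1)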
(2.4.3) The knowledge reasoning results of step (2.4.2) are fed into the corresponding stages of the software-hardware collaborative security test flow for the corresponding tests.
(3) Performing the first-stage test, namely the attack and defense test of the AI model data collection and preprocessing stage.
(3.1) According to the knowledge reasoning result of step (2.4.2), the data poisoning or adversarial example attack method therein is used to construct an attack-sample-contaminated version of the data set used by the AI model, realizing the attack of the data collection and preprocessing stage.
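As one concrete stand-in for whichever first-stage attack the reasoning selects, the sketch below contaminates a training set by label flipping, a simple form of data poisoning; the poisoning rate and helper name are assumptions.

    import numpy as np

    def label_flip_poison(X, y, num_classes, rate=0.1, seed=0):
        """Contaminate a training set by flipping the labels of a random
        fraction of samples: a simple data poisoning attack for the data
        collection and preprocessing stage (illustrative only)."""
        rng = np.random.default_rng(seed)
        y_poisoned = y.copy()
        n_poison = int(rate * len(y))
        idx = rng.choice(len(y), size=n_poison, replace=False)
        # shift each chosen label to a different, random class
        y_poisoned[idx] = (y[idx] + rng.integers(1, num_classes, size=n_poison)) % num_classes
        return X, y_poisoned, idx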
(3.2) An automatic abnormal-data analysis method is selected, the data of the attacked AI model's data set are input, the sample characteristics and distribution of the data are analyzed, and the abnormal data of the attacked data set are cleaned automatically with an automatic abnormal-data cleaning method.
(3.2.1) In the abnormal data detection step: first, the type of the abnormal data is analyzed, detecting it by comparing the distribution characteristics of the abnormal and normal data; then the abnormal data are processed, choosing to repair or discard the data according to the abnormal data type; the data to be repaired are passed to the automatic abnormal-data cleaning step.
(3.2.2) In the automatic abnormal-data cleaning step:
The first-stage defense method is obtained from the knowledge reasoning result of step (2.4.2) of the full-cycle automated security test method.
The abnormal data to be repaired detected in the abnormal data detection step (3.2.1) are cleaned using this first-stage defense method.
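A minimal sketch of the detect-then-clean loop of steps (3.2.1) and (3.2.2), using a 3-sigma z-score rule as a hypothetical distribution-based anomaly analysis; a real run would substitute the first-stage defense method returned by step (2.4.2).

    import numpy as np

    def detect_anomalies(X, threshold=3.0):
        """Flag samples whose features deviate from the per-feature mean by
        more than `threshold` standard deviations (z-score rule of thumb)."""
        mu, sigma = X.mean(axis=0), X.std(axis=0) + 1e-12
        z = np.abs((X - mu) / sigma)
        return np.where(z.max(axis=1) > threshold)[0]

    def clean_dataset(X, y, repairable=None, threshold=3.0):
        """Drop detected anomalies, or repair those marked repairable by
        clipping their features back into the 3-sigma band."""
        mu, sigma = X.mean(axis=0), X.std(axis=0) + 1e-12
        bad = detect_anomalies(X, threshold)
        if repairable is not None:
            fix = np.intersect1d(bad, repairable)
            X = X.copy()
            X[fix] = np.clip(X[fix], mu - threshold * sigma, mu + threshold * sigma)
            bad = np.setdiff1d(bad, fix)
        keep = np.setdiff1d(np.arange(len(X)), bad)
        return X[keep], y[keep]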
(3.3) The success rates of the attack method before and after defense are collected to obtain the attack and defense success rates, the effect of the different defense methods is analyzed, and the attack and defense test result of the data collection and preprocessing stage is obtained.
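The attack and defense success rates reduce to simple error-rate comparisons. A sketch, assuming a classifier object with an sklearn-style predict method:

    import numpy as np

    def attack_success_rate(model, x_attacked, y_true):
        """Fraction of attacked inputs that the model now misclassifies."""
        return float(np.mean(model.predict(x_attacked) != y_true))

    # Collected once before and once after the defense is applied:
    # asr_before = attack_success_rate(model, x_attacked, y)
    # asr_after  = attack_success_rate(defended_model, x_attacked, y)
    # The gap asr_before - asr_after measures the defense's effect.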
(4) Performing the second-stage test, namely the attack and defense test of the AI model training and testing stage.
(4.1) According to the knowledge reasoning result of step (2.4.2), the obtained backdoor attack or adversarial attack is used to implant a backdoor trigger or perturbation in the training and testing stage, realizing the attack of the training and testing stage.
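As an illustrative second-stage attack, the following stamps a BadNets-style pixel-patch trigger into a fraction of training images and relabels them; the patent does not fix a particular backdoor method, so the patch size, rate, and names are assumptions.

    import numpy as np

    def add_trigger(images, labels, target_class, rate=0.05, patch=3, value=1.0, seed=0):
        """BadNets-style backdoor: stamp a small bright patch into the corner
        of a fraction of images and relabel them to the attacker's target class.
        Assumes image tensors shaped (N, H, W) or (N, H, W, C) in [0, 1]."""
        rng = np.random.default_rng(seed)
        imgs, lbls = images.copy(), labels.copy()
        idx = rng.choice(len(imgs), size=int(rate * len(imgs)), replace=False)
        imgs[idx, -patch:, -patch:] = value   # bottom-right trigger patch
        lbls[idx] = target_class
        return imgs, lbls, idx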
(4.2) In the training step, robustness enhancement and model hardening are performed on the AI model to train a robust model.
In the robustness enhancement and model hardening phases of the AI model training step, the defense methods obtained from the knowledge reasoning result of step (2.4.2), namely the inferred robustness enhancement and model hardening algorithms, are used to defend against the backdoor attack or adversarial attack.
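One widely used robustness-enhancement instance is adversarial training. The sketch below performs a single FGSM adversarial training step in PyTorch; it illustrates the kind of model-hardening algorithm the reasoning step might return, not the patent's prescribed defense.

    import torch
    import torch.nn.functional as F

    def fgsm_adversarial_training_step(model, x, y, optimizer, eps=8 / 255):
        """One robustness-enhancement step: craft FGSM perturbations on the
        fly and train on them (a common form of adversarial training)."""
        model.train()
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        grad, = torch.autograd.grad(loss, x)
        x_adv = (x + eps * grad.sign()).clamp(0, 1).detach()

        optimizer.zero_grad()
        adv_loss = F.cross_entropy(model(x_adv), y)
        adv_loss.backward()
        optimizer.step()
        return adv_loss.item()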
(4.3) In the model testing step, standardized unit testing and robustness formal verification are performed on the AI model, and the influence of the attacks and defenses on the AI model during model training and testing is analyzed.
(4.3.1) The standardized unit test includes the following test criteria:
a1. module unit test criterion: the functions of different modules of the model are tested separately; input data with predictable results are constructed, and the predicted standard result is compared with the actual module output to achieve the test purpose;
a2. neuron coverage test criterion: the output range of each neuron is divided into several intervals of equal length, each interval representing one characteristic behavior of the neuron; whether a logical behavior is covered by the test data is judged by whether the neuron's output value falls within the corresponding interval.
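The neuron coverage criterion a2 corresponds to what the literature calls k-multisection neuron coverage. A sketch, assuming the per-neuron output ranges have been profiled beforehand (e.g., on training data):

    import numpy as np

    def k_multisection_coverage(test_acts, lo, hi, k=10):
        """k-multisection neuron coverage: each neuron's output range
        [lo, hi] is split into k equal-length intervals; report the fraction
        of (neuron, interval) pairs hit by the test activations.

        test_acts: array of shape (num_test_inputs, num_neurons);
        lo, hi: per-neuron range bounds, each of shape (num_neurons,)."""
        width = (hi - lo) / k + 1e-12
        bins = np.clip(((test_acts - lo) / width).astype(int), 0, k - 1)
        covered = np.zeros((test_acts.shape[1], k), dtype=bool)
        for row in bins:
            covered[np.arange(row.size), row] = True
        return covered.mean()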
(4.3.2) The robustness formal verification comprises the following steps:
b1. A suitable formal verification method is selected according to the activation function type and application scenario of the AI model, using the knowledge reasoning result of step (2.4.2).
b2. Formal verification is performed on the AI model to be tested with the method selected in step b1 to obtain a robustness evaluation result.
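The patent leaves the choice of formal verification method open. As one example suited to ReLU networks, the following sketch propagates interval bounds through a small fully connected network to certify that no L-infinity perturbation of radius eps can change the prediction; layer representation and names are assumptions.

    import numpy as np

    def interval_bounds(layers, x, eps):
        """Interval bound propagation (IBP) through affine + ReLU layers.
        layers: list of (W, b) pairs; returns elementwise lower and upper
        bounds on the logits over the L-infinity ball of radius eps around x."""
        lo, hi = x - eps, x + eps
        for i, (W, b) in enumerate(layers):
            Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
            lo, hi = Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b
            if i < len(layers) - 1:                  # ReLU on hidden layers only
                lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)
        return lo, hi

    def certified_robust(layers, x, eps, label):
        """Sound but incomplete check: the prediction provably cannot change
        if the true logit's lower bound beats every other logit's upper bound."""
        lo, hi = interval_bounds(layers, x, eps)
        return all(lo[label] > hi[j] for j in range(lo.size) if j != label)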
(4.4) The success rates of the attack method before and after defense are collected to obtain the attack and defense success rates, the effect of the different defense methods is analyzed, and the attack and defense test result of the model training and testing stage is obtained.
(5) Performing the online test of AI system deployment: the hardware system environment in which the AI model is deployed is analyzed and adapted. Simulation adaptation is performed between the AI model and the system hardware environment best suited to the AI model to be tested according to the knowledge reasoning result of step (2.4.2), obtaining the AI model software-hardware adaptation result.
(6) The test results of the data collection and preprocessing stage, the training and testing stage, and the deployment stage of the AI life cycle are integrated into a test report, and security defense strategies and secure deployment methods are provided for the different stages.
The test report specifically comprises the vulnerable methods, defense strategies, and attack and defense success rates of the AI model data collection and preprocessing stage; the vulnerable methods, defense strategies, and attack and defense success rates of the training and testing stage; and the AI model software-hardware adaptation result.

Claims (10)

1. An AI safety attack and defense test method, characterized by comprising the following steps:
(1) inputting an AI model file to be tested and the characteristic data of the AI model;
(2) executing a full-cycle automated security test method on the AI model characteristic data input in step (1): constructing an attack and defense knowledge graph by a knowledge graph method, and performing knowledge reasoning to obtain the attack and defense algorithms for the input AI model in the data collection and preprocessing stage and the training and testing stage, as well as the hardware adaptation environment for the deployment stage;
(3) performing the first-stage test: performing the attack and defense test of the AI model data collection and preprocessing stage according to the first-stage attack and defense methods in the knowledge reasoning result of step (2);
(4) performing the second-stage test: performing the attack and defense test of the AI model training and testing stage according to the second-stage attack and defense methods in the knowledge reasoning result of step (2);
(5) performing the online test of AI system deployment: performing simulation adaptation between the AI model and the system hardware environment best suited to the AI model to be tested according to the knowledge reasoning result of step (2), obtaining the AI model software-hardware adaptation result;
(6) integrating the test results of the data collection and preprocessing stage, the training and testing stage, and the deployment stage of the AI life cycle into a test report, and providing security defense strategies and secure deployment methods for the different stages.
2. The AI safety attack and defense test method according to claim 1, characterized in that, in step (1), the characteristic data of the AI model include the data set type, model type, framework type, and training parameters.
3. The AI safety attack and defense test method according to claim 1, characterized in that step (2) comprises the sub-steps of:
(2.1) determining the node types and relationship types of the AI model attack and defense knowledge graph to form the knowledge graph node-relationship rules;
8 node types, 6 relationship types, and 9 triple types are established for the knowledge graph; the node types are 'model ontology', 'data set type', 'model type', 'framework type', 'training parameters', 'system hardware environment', 'attack algorithm', and 'defense algorithm'; the relationship types are 'belongs to', 'first-stage attack', 'second-stage attack', 'first-stage defense', 'second-stage defense', and 'software-hardware adaptation'; the data set type, model type, framework type, and training parameters each form a 'belongs to' relationship with the model ontology, giving 4 triple types; the attack algorithm forms 'first-stage attack' and 'second-stage attack' relationships with the model ontology, giving 2 triple types; the defense algorithm forms 'first-stage defense' and 'second-stage defense' relationships with the model ontology, giving 2 triple types; and the system hardware environment forms a 'software-hardware adaptation' relationship with the model ontology, giving 1 triple type;
(2.2) acquiring the data for constructing the graph according to the node-relationship rules of step (2.1);
text information is crawled from an academic paper database according to AI paper keywords; the text information contains node and relationship instances of the node types and relationship types of step (2.1), which are extracted into regularized data through regularization extraction and entity extraction;
(2.3) forming triples from the data through the node-relationship rules to form the AI model attack and defense knowledge graph;
the extracted regularized data are formed into a triple set according to the node-relationship rules of step (2.1) and stored in a database to form the complete knowledge graph;
meanwhile, the triples of the whole attack and defense knowledge graph are converted into matrix form through knowledge embedding to obtain an embedding matrix, which exists as another form of the knowledge graph;
(2.4) matching the characteristic data of the AI model to be tested against the attack and defense knowledge graph constructed in step (2.3), reasoning out the attack and defense methods for the corresponding life-cycle stages through the relationships between nodes, and testing with the corresponding attack and defense methods.
4. The AI safety attack and defense test method according to claim 3, characterized in that, in the matching and reasoning step of (2.4):
(2.4.1) the characteristic data of the AI model to be tested are converted into vector form through knowledge embedding to obtain a feature vector;
(2.4.2) from the feature vector obtained in step (2.4.1) and the relation vectors in the embedding matrix obtained from the attack and defense knowledge graph in step (2.3), the first-stage and second-stage attack and defense methods and the system hardware environment matching the AI model to be tested are computed via the L1 distance as the knowledge reasoning result; the first-stage attack methods comprise data poisoning attacks or adversarial example attacks, and the first-stage defense methods comprise abnormal data analysis and abnormal data cleaning; the second-stage attack methods comprise backdoor attacks or adversarial attacks, the second-stage defense methods comprise robustness enhancement and model hardening, and the attack and defense success rates of the attack and defense algorithms on the AI model and their influence on the AI model are analyzed using standardized unit testing and robustness formal verification;
(2.4.3) the knowledge reasoning results of step (2.4.2) are fed into the corresponding stages of the software-hardware collaborative security test flow for the corresponding tests.
5. The AI safety attack and defense test method according to claim 1, characterized in that step (3) comprises the sub-steps of:
(3.1) according to the knowledge reasoning result of step (2), using the first-stage attack method therein to construct an attack-sample-contaminated version of the data set used by the AI model, realizing the attack of the data collection and preprocessing stage;
(3.2) selecting an automatic abnormal-data analysis method, inputting the data of the attacked AI model's data set, analyzing the sample characteristics and distribution of the data set, and automatically cleaning the abnormal data of the attacked data set with an automatic abnormal-data cleaning method;
(3.3) collecting the success rates of the attack method before and after defense to obtain the attack and defense success rates, analyzing the effect of the different defense methods, and obtaining the attack and defense test result of the data collection and preprocessing stage.
6. The AI safety attack and defense test method according to claim 5, characterized in that step (3.2) comprises the sub-steps of:
(3.2.1) in the abnormal data detection step: first, the type of the abnormal data is analyzed, detecting it by comparing the distribution characteristics of the abnormal and normal data; then the abnormal data are processed, choosing to repair or discard the data according to the abnormal data type; the data to be repaired are passed to the automatic abnormal-data cleaning step;
(3.2.2) in the automatic abnormal-data cleaning step:
the first-stage defense method is obtained from the knowledge reasoning result of step (2) of the full-cycle automated security test method;
the abnormal data to be repaired detected in the abnormal data detection step (3.2.1) are cleaned using this first-stage defense method.
7. The AI safety attack and defense test method according to claim 1, characterized in that step (4) comprises the sub-steps of:
(4.1) according to the knowledge reasoning result of step (2), using the obtained backdoor attack or adversarial attack to implant a backdoor trigger or perturbation in the training and testing stage, realizing the attack of the training and testing stage;
(4.2) in the training step, according to the knowledge reasoning result of step (2), performing robustness enhancement and model hardening on the AI model with the obtained second-stage defense method, defending against the second-stage attack method;
(4.3) in the model testing step, performing standardized unit testing and robustness formal verification on the AI model, and analyzing the influence of the attacks and defenses on the AI model during model training and testing;
(4.4) collecting the success rates of the attack method before and after defense to obtain the attack and defense success rates, analyzing the effect of the different defense methods, and obtaining the attack and defense test result of the model training and testing stage.
8. The AI safety attack and defense test method according to claim 7, characterized in that, in step (4.3), the standardized unit test comprises:
a1. module unit test: the functions of different modules of the model are tested separately; input data with predictable results are constructed, and the predicted standard result is compared with the actual module output to achieve the test purpose;
a2. neuron coverage test: the output range of each neuron is divided into several intervals of equal length, each interval representing one characteristic behavior of the neuron; whether a logical behavior is covered by the test data is judged by whether the neuron's output value falls within the corresponding interval.
9. The AI safety attack and defense test method according to claim 7, characterized in that, in step (4.3), the robustness formal verification comprises:
b1. selecting a suitable formal verification method according to the activation function type and application scenario of the AI model, using the knowledge reasoning result of step (2);
b2. performing formal verification on the AI model to be tested with the method selected in step b1 to obtain a robustness evaluation result.
10. The AI safety attack and defense test method according to claim 1, characterized in that, in step (6), the test report includes the vulnerable methods, defense strategies, and attack and defense success rates of the AI model data collection and preprocessing stage; the vulnerable methods, defense strategies, and attack and defense success rates of the training and testing stage; and the AI model software-hardware adaptation result.
Application CN202210137681.0A, filed 2022-02-15 (priority 2022-02-15): AI safety attack and defense test method. Status: Pending. Published as CN114579962A (en).

Priority Applications (1)

Application Number: CN202210137681.0A; Priority Date: 2022-02-15; Filing Date: 2022-02-15; Title: AI safety attack and defense test method

Applications Claiming Priority (1)

Application Number: CN202210137681.0A; Priority Date: 2022-02-15; Filing Date: 2022-02-15; Title: AI safety attack and defense test method

Publications (1)

Publication Number: CN114579962A; Publication Date: 2022-06-03

Family

Family ID: 81770316

Family Applications (1)

Application Number: CN202210137681.0A; Title: AI safety attack and defense test method; Priority Date: 2022-02-15; Filing Date: 2022-02-15; Status: Pending

Country Status (1)

Country: CN (1); Publication: CN114579962A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
WO2023246237A1 * (priority date 2022-06-22, publication date 2023-12-28; assignee: Alipay (Hangzhou) Information Technology Co., Ltd.): Attack-defense confrontation simulation test method and system for network model


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination