CN113609482A - Back door detection and restoration method and system for image classification model - Google Patents
- Publication number
- CN113609482A
- Authority
- CN
- China
- Prior art keywords
- model
- trigger
- back door
- potential
- backdoor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/061—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Computer Security & Cryptography (AREA)
- Mathematical Physics (AREA)
- Neurology (AREA)
- Computer Hardware Design (AREA)
- Microelectronics & Electronic Packaging (AREA)
- Virology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a backdoor detection and restoration method and system for an image classification model, belonging to the technical fields of software technology and information security. The method uses model pruning, transfer learning and shallow-model training to obtain a series of comparison models that perform the same task as the backdoor model but contain no backdoor; reverses each class of the backdoor model by optimizing an objective function, with the help of the comparison models, to obtain a series of potential triggers; refines the potential triggers using contribution heatmaps, keeping only the key features that influence the model's classification result; distinguishes backdoor triggers from adversarial patches among the refined potential triggers based on the difference in their transferability to the comparison models; and adds the identified backdoor triggers to a clean dataset, removing the backdoor from the backdoor model through adversarial training. Using only a small amount of clean data, the method can detect and repair the backdoor of an image classification model and produce a normal model.
Description
Technical Field
The invention belongs to the technical fields of software technology and information security, relates to security technology for artificial intelligence, and in particular relates to a backdoor detection and restoration method and system for deep neural network image classification models.
Background
In recent years, deep neural networks (DNNs) have been widely used in fields such as computer vision, speech recognition and natural language processing because of their accurate predictions. Because their accuracy sometimes exceeds even that of human experts, deep neural networks are also used in security-critical areas such as access control systems, autonomous driving and medical diagnosis.
However, alongside this widespread use, deep neural networks face serious security issues, such as data poisoning attacks, adversarial attacks and backdoor attacks. In particular, an attacker may inject a backdoor into a deep neural network during model training to control the model's behavior. A DNN model with an injected backdoor behaves essentially like a backdoor-free model on normal input data, but when a special "trigger" (a special pattern overlaid on the original image) appears in the input, the abnormal behavior of the model is triggered, producing the result the attacker desires. Backdoor attacks therefore pose a latent safety hazard to deep neural networks. For example, a backdoor can be injected into a DNN model so that a stop sign bearing a special sticker (the trigger) is misrecognized as a speed-limit sign. If an autonomous car is equipped with such a backdoor model, a fatal traffic accident may occur.
Disclosure of Invention
The invention aims to provide a backdoor detection and restoration method for deep neural network image classification models. Without knowing the backdoor trigger or the target of the backdoor attack, and using only a small amount of clean data, the method can detect a backdoor that may exist in the model and repair the detected backdoor to produce a normal model.
In order to achieve the purpose, the invention adopts the following technical scheme:
a backdoor detection and restoration method for an image classification model comprises the following steps:
based on a clean dataset, obtaining a series of comparison models that perform the same task as the backdoor model but contain no backdoor, using model pruning, transfer learning and shallow-model training;
reversing each class of the backdoor model by optimizing an objective function, with the help of the comparison models and the clean dataset, to obtain a series of potential triggers, where the potential triggers comprise backdoor triggers and adversarial patches;
computing contribution heatmaps from the clean dataset and the potential triggers, and refining the potential triggers using the contribution heatmaps so that only the key features influencing the model's classification result are kept;
distinguishing backdoor triggers from adversarial patches among the refined potential triggers, based on the difference in their transferability to the comparison models;
adding the identified backdoor triggers to the clean dataset and removing the backdoor from the backdoor model through adversarial training.
Further, the clean dataset comes from the contaminated training set of the backdoor attack, or is a dataset whose data distribution is similar to that of the contaminated training set, where "similar" means the similarity of the data distributions exceeds a preset index; the amount of data in the clean dataset is 10%-20% of the contaminated training set.
Further, the model pruning method is: removing the backdoor by pruning neurons with low activation rates in the backdoor model, and restoring the model's classification accuracy through fine-tuning;
the transfer learning method is: obtaining a comparison model through transfer learning, starting from a neural network model whose classification task is similar to that of the backdoor model;
the shallow-model training method is: simplifying the structure of the backdoor model, and training a comparison model on the simplified model structure.
Further, the objective function is optimized while adjusting the weights of its loss terms; the formula is as follows:

$$\min_{\Delta,\, m}\ \alpha L_{backdoor} + \beta L_{clean} + \gamma L_{noise}$$

$$L_{backdoor} = \frac{1}{n}\sum_{i=1}^{n} CE\big(f_b(\Delta \odot m + x_i \odot (J-m)),\ y_t\big)$$

$$L_{clean} = \frac{1}{n}\sum_{i=1}^{n} CE\big(f_c(\Delta \odot m + x_i \odot (J-m)),\ y_i\big)$$

$$L_{noise} = \sum_{j}\sum_{k}\sum_{(a,b)} \big|\, m_{j+a,\,k+b} - m_{j,k} \big|$$

where the loss terms $L_{backdoor}$ and $L_{clean}$ represent the influence of the backdoor trigger on the classification results of the backdoor model and the comparison model respectively, and $L_{noise}$ is a noise-reduction term applied to m; α, β and γ are the weight coefficients of the loss terms; Δ and m are the two variables being optimized, three-dimensional matrices of the same size as the images in the clean dataset, where Δ holds the pattern of the potential trigger and m is a transparency matrix that controls the location of the potential trigger; $x_i$ is an image randomly selected from the clean dataset; J is an all-ones matrix with the same dimensions as Δ; $\Delta \odot m + x_i \odot (J-m)$ (with ⊙ denoting element-wise multiplication) denotes overlaying the trigger on image $x_i$; $f_b$ and $f_c$ are the prediction functions of the backdoor model and the comparison model respectively; CE is the cross-entropy loss function; n is the total number of images in the clean dataset; i is the index of the current image. On the backdoor model, the image carrying the trigger is classified into the target class $y_t$, while on the comparison model it is classified into its correct class $y_i$; j and k index the rows and columns of the matrix m, and a and b are the summation offsets ranging over adjacent pixels.
Further, the step of computing contribution heatmaps from the clean dataset and the potential triggers comprises:
randomly selecting a set of images from the clean dataset and overlaying them with the potential trigger;
computing, for all of these images, a heatmap representing each pixel's contribution to the classification result, i.e., the contribution heatmap.
Further, the step of refining the potential triggers using the contribution heatmaps comprises:
averaging all contribution heatmaps to obtain an average heatmap;
removing the region of the potential trigger with the lowest current contribution according to the average heatmap;
computing the current attack success rate of the potential trigger; if it is below a threshold, ending, and otherwise continuing to remove the region of the potential trigger with the lowest current contribution.
Further, the step of distinguishing backdoor triggers from adversarial patches among the refined potential triggers comprises:
randomly selecting a set of images from the clean dataset and overlaying them with the potential trigger;
computing the attack success rate of the potential trigger on the backdoor model; if it is below a threshold, judging the potential trigger to be an adversarial patch and ending;
if it is not below the threshold, computing the attack success rate of the potential trigger on all comparison models; if the attack success rate on any comparison model exceeds another threshold, judging the potential trigger to be an adversarial patch, and otherwise judging it to be a backdoor trigger.
Further, a certain proportion of images is first randomly selected from the clean dataset and overlaid with the backdoor trigger; the identified backdoor trigger is thereby added to the clean dataset.
Further, the step of removing the backdoor from the backdoor model through adversarial training comprises: adding the identified backdoor trigger to the clean dataset while keeping the class labels of the images unchanged, obtaining an adversarial training dataset; and fine-tuning the backdoor model on the adversarial training dataset to remove the backdoor from the backdoor model.
A backdoor detection and restoration system for an image classification model comprises a memory storing a computer program and a processor; when the processor executes the program, the steps of the above method are carried out.
A computer-readable storage medium stores a computer program which, when executed by a processor, carries out the steps of the above method.
Compared with the prior art, the invention has the following positive effects:
the invention has stronger detection capability on the back door, has wider detection range on the back doors of different types of triggers, is less influenced by factors such as area occupation ratio, position, shape, pattern and the like of the triggers, and has lower false alarm rate and missing report rate. Compared with the existing backdoor detection method (such as neural clean, ABS and TABOR), the method has the advantage that the assumption is put forward and limited on the area proportion of the trigger, so that an attacker can avoid detection by adopting the trigger with larger area proportion (more than 10%) at the cost of sacrificing the concealment of the trigger, and the method can still maintain the detection capability when the area proportion of the trigger reaches 25% and is more difficult to attack adaptively.
Drawings
Fig. 1 is an overall flowchart of the backdoor detection and restoration method for an image classification model according to the present invention.
Fig. 2 is a flowchart of potential trigger refinement.
Fig. 3 is a flowchart of backdoor trigger identification.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, embodiments are described in detail below with reference to the accompanying figures.
This embodiment discloses a backdoor detection and restoration method for an image classification model, as shown in Fig. 1, with the following steps:
1. The invention comprises the following key points:
1.1. Comparison model generation: a series of comparison models that perform the same task as the backdoor model but contain no backdoor are obtained by applying model pruning, transfer learning and shallow-model training simultaneously.
1.2. Potential trigger reversal: an objective function is designed with the help of the comparison models and a clean dataset, and each class of the backdoor model is reversed to obtain a series of potential triggers (consisting of backdoor triggers and adversarial patches).
1.3. Potential trigger refinement: the potential triggers are refined using contribution heatmaps, removing their redundant features to obtain refined potential triggers.
1.4. Backdoor trigger identification: the refined potential triggers are divided into two categories, backdoor triggers and adversarial patches, based on the difference in their transferability to the comparison models.
1.5. Backdoor model repair: the backdoor triggers are added to the clean dataset, and the backdoor is removed from the backdoor model through adversarial training, yielding a normal model without a backdoor.
2. The comparison models are generated in the following three ways, all used simultaneously (a code sketch of model pruning follows the list):
2.1. Model pruning: the backdoor is removed by pruning neurons with low activation rates in the model, while the model's classification accuracy is restored through fine-tuning.
2.2. Transfer learning: a comparison model is trained through transfer learning, starting from a model whose classification task is similar to that of the backdoor model.
2.3. Shallow-model training: the structure of the backdoor model is simplified, and a comparison model is trained on the simplified model structure.
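The following is a minimal sketch of the pruning route (2.1), assuming a PyTorch convolutional backdoor model; the layer name, pruning ratio and data loader are illustrative placeholders rather than values fixed by the patent, and the returned model still has to be fine-tuned on the clean dataset as described above.

```python
import copy
import torch

def build_pruned_comparison_model(backdoor_model, clean_loader, layer_name,
                                  prune_ratio=0.2, device="cpu"):
    """Cut the least-activated channels of one conv layer of the backdoor model."""
    model = copy.deepcopy(backdoor_model).to(device).eval()
    layer = dict(model.named_modules())[layer_name]

    total, batches = None, 0
    def hook(_module, _inputs, output):
        nonlocal total, batches
        act = output.detach().abs().mean(dim=(0, 2, 3))  # mean activation per channel
        total = act if total is None else total + act
        batches += 1
    handle = layer.register_forward_hook(hook)
    with torch.no_grad():                                # measure activations on clean data
        for x, _ in clean_loader:
            model(x.to(device))
    handle.remove()

    mean_act = total / batches
    k = int(prune_ratio * mean_act.numel())
    dead = mean_act.argsort()[:k]                        # channels with the lowest activation
    with torch.no_grad():                                # zeroing weights removes the channel's
        layer.weight[dead] = 0                           # contribution, i.e. "cuts" the neuron
        if layer.bias is not None:
            layer.bias[dead] = 0
    return model  # fine-tune on the clean dataset afterwards to restore accuracy
```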
3. Potential trigger reversal is completed by optimizing the objective function:

$$\min_{\Delta,\, m}\ \alpha L_{backdoor} + \beta L_{clean} + \gamma L_{noise}$$

$$L_{backdoor} = \frac{1}{n}\sum_{i=1}^{n} CE\big(f_b(\Delta \odot m + x_i \odot (J-m)),\ y_t\big)$$

$$L_{clean} = \frac{1}{n}\sum_{i=1}^{n} CE\big(f_c(\Delta \odot m + x_i \odot (J-m)),\ y_i\big)$$

$$L_{noise} = \sum_{j}\sum_{k}\sum_{(a,b)} \big|\, m_{j+a,\,k+b} - m_{j,k} \big|$$

Δ and m are the two variables being optimized, both three-dimensional matrices of the same size as the images in the clean dataset. Δ holds the pattern of the potential trigger; m is a transparency matrix that controls the location of the potential trigger. The objective function consists of three loss terms, balanced by the three weights α, β and γ. $x_i$ is an image randomly selected from the clean dataset. J is an all-ones matrix with the same dimensions as Δ. $\Delta \odot m + x_i \odot (J-m)$ denotes overlaying the trigger on image $x_i$. $f_b$ and $f_c$ are the prediction functions of the backdoor model and the comparison model respectively. CE is the cross-entropy loss function. n is the total number of images in the clean dataset and i is the index of the current image. $L_{backdoor}$ and $L_{clean}$ represent the influence of the backdoor trigger on the classification results of the backdoor model and the comparison model respectively: on the backdoor model the image with the trigger should be classified into the target class $y_t$, while on the comparison model it should be classified into its correct class $y_i$. Only one comparison model needs to be used here. $L_{noise}$ is the noise-reduction term applied to m: j and k index the rows and columns of the matrix m, and a and b are the summation offsets over adjacent pixels, so $L_{noise}$ sums the absolute differences between adjacent pixels of m and thereby smooths the transparency matrix.
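Below is a minimal PyTorch sketch of this reversal for one target class, implementing the objective above; the weights alpha, beta and gamma, the epoch count and the learning rate are illustrative hyperparameters, not values fixed by the patent.

```python
import torch
import torch.nn.functional as F

def reverse_trigger(f_b, f_c, clean_loader, target_class, img_shape,
                    alpha=1.0, beta=1.0, gamma=1e-3, epochs=10, lr=0.1):
    """Optimize a trigger pattern (delta) and transparency mask (m) for one class."""
    delta = torch.rand(img_shape, requires_grad=True)    # trigger pattern Δ
    m_raw = torch.zeros(img_shape, requires_grad=True)   # transparency m, pre-sigmoid
    opt = torch.optim.Adam([delta, m_raw], lr=lr)

    for _ in range(epochs):
        for x, y in clean_loader:
            m = torch.sigmoid(m_raw)                     # keep m inside [0, 1]
            x_adv = delta * m + x * (1 - m)              # Δ⊙m + x_i⊙(J−m)
            y_t = torch.full((x.size(0),), target_class, dtype=torch.long)
            l_backdoor = F.cross_entropy(f_b(x_adv), y_t)  # trigger fools the backdoor model
            l_clean = F.cross_entropy(f_c(x_adv), y)       # but not the comparison model
            l_noise = (m[:, 1:, :] - m[:, :-1, :]).abs().sum() \
                    + (m[:, :, 1:] - m[:, :, :-1]).abs().sum()  # adjacent-pixel smoothness
            loss = alpha * l_backdoor + beta * l_clean + gamma * l_noise
            opt.zero_grad()
            loss.backward()
            opt.step()
    return delta.detach(), torch.sigmoid(m_raw).detach()
```

In this sketch m is parameterized through a sigmoid so the optimizer can run unconstrained, a common trick in trigger-reversal implementations.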
4. The flow of potential trigger refinement is shown in Fig. 2 and includes the following steps (a code sketch follows the list):
4.1. A set of original images is randomly selected from the clean dataset and overlaid with the potential trigger.
4.2. For all of these images, a heatmap representing the contribution to the classification result is computed (a two-dimensional matrix of the same size as the original image; the larger a value in the matrix, the larger the contribution of the pixel at the same position in the original image to the classification result), and all heatmaps are averaged to obtain an average heatmap.
4.3. The region of the potential trigger with the lowest current contribution is removed according to the average heatmap.
4.4. The current attack success rate of the potential trigger is computed; if it is below a threshold (95% of the attack success rate of the original, unrefined potential trigger), the refinement ends; otherwise, jump back to step 4.3.
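The following is a minimal sketch of this refinement loop, assuming a saliency routine such as Grad-CAM is supplied as `compute_heatmap`; the per-iteration removal fraction is an illustrative choice, while the 95% retention threshold follows step 4.4.

```python
import torch

def refine_trigger(delta, m, f_b, images, target_class, compute_heatmap,
                   drop_frac=0.05, keep_ratio=0.95):
    """Iteratively remove the lowest-contribution parts of the trigger mask m."""
    def attack_success_rate(mask):
        x_adv = delta * mask + images * (1 - mask)
        return (f_b(x_adv).argmax(dim=1) == target_class).float().mean().item()

    # steps 4.1-4.2: average contribution heatmap over trigger-stamped images
    heatmaps = [compute_heatmap(f_b, delta * m + img * (1 - m), target_class)
                for img in images]                     # each heatmap: (H, W)
    avg_map = torch.stack(heatmaps).mean(dim=0)

    baseline = attack_success_rate(m)
    mask = m.clone()
    while True:
        active = (mask.amax(dim=0) > 0).nonzero()      # pixels still in the trigger
        if active.numel() == 0:
            break
        scores = avg_map[active[:, 0], active[:, 1]]
        n_drop = max(1, int(drop_frac * len(scores)))
        drop = active[scores.argsort()[:n_drop]]       # step 4.3: lowest contribution
        trial = mask.clone()
        trial[:, drop[:, 0], drop[:, 1]] = 0
        if attack_success_rate(trial) < keep_ratio * baseline:
            break                                      # step 4.4: stop before ASR collapses
        mask = trial
    return delta, mask
```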
5. The process of backdoor trigger identification is shown in Fig. 3 and includes the following steps (a code sketch follows the list):
5.1. A set of images is randomly selected from the clean dataset and overlaid with the potential trigger.
5.2. The attack success rate of the potential trigger on the backdoor model is computed.
5.3. If this attack success rate is below a threshold (a preset hyperparameter, set to 60%), the potential trigger is judged to be an adversarial patch and the process ends; otherwise, go to step 5.4.
5.4. The attack success rate of the potential trigger on all comparison models is computed.
5.5. If the attack success rate on any comparison model exceeds another threshold (a preset hyperparameter related to the number of classes: 40% on datasets with few classes such as MNIST and GTSRB, 20% on datasets with many classes such as Youtube-Face and VGG-Face), the potential trigger is judged to be an adversarial patch; otherwise it is judged to be a backdoor trigger.
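The following is a minimal sketch of this identification step; the 60% and 40% thresholds follow steps 5.3 and 5.5, and the remaining names are illustrative. The intuition is that a true backdoor trigger exploits a flaw planted only in the backdoor model, so it should not transfer to backdoor-free comparison models, whereas an adversarial patch tends to transfer.

```python
import torch

def identify_trigger(delta, mask, f_b, comparison_models, images, target_class,
                     asr_threshold=0.60, transfer_threshold=0.40):
    """Classify a refined potential trigger as backdoor trigger or adversarial patch."""
    def attack_success_rate(model):
        x_adv = delta * mask + images * (1 - mask)     # step 5.1: overlay the trigger
        return (model(x_adv).argmax(dim=1) == target_class).float().mean().item()

    if attack_success_rate(f_b) < asr_threshold:       # steps 5.2-5.3
        return "adversarial_patch"
    for f_c in comparison_models:                      # steps 5.4-5.5: transferability test
        if attack_success_rate(f_c) > transfer_threshold:
            return "adversarial_patch"
    return "backdoor_trigger"
```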
6. The backdoor model is repaired as follows (a code sketch follows the list):
6.1. A certain proportion of images is randomly selected from the clean dataset and overlaid with the backdoor trigger.
6.2. These images are added to the clean dataset with their class labels kept unchanged, yielding an adversarial training dataset.
6.3. The backdoor model is fine-tuned on the adversarial training dataset, removing the backdoor from the backdoor model.
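The following is a minimal sketch of this repair step, assuming PyTorch; the stamping proportion, batch size, learning rate and epoch count are illustrative hyperparameters. Because the stamped images keep their original labels, fine-tuning teaches the model to ignore the trigger, which unlearns the backdoor.

```python
import torch
import torch.nn.functional as F

def repair_backdoor_model(f_b, clean_images, clean_labels, delta, mask,
                          stamp_frac=0.2, epochs=5, lr=1e-4, batch_size=64):
    """Fine-tune the backdoor model on clean + trigger-stamped (correctly labeled) data."""
    n = len(clean_images)
    idx = torch.randperm(n)[: int(stamp_frac * n)]
    stamped = delta * mask + clean_images[idx] * (1 - mask)  # step 6.1: overlay trigger
    x = torch.cat([clean_images, stamped])                   # step 6.2: original labels
    y = torch.cat([clean_labels, clean_labels[idx]])         # are kept unchanged

    opt = torch.optim.Adam(f_b.parameters(), lr=lr)
    f_b.train()
    for _ in range(epochs):                                  # step 6.3: fine-tuning
        order = torch.randperm(len(x))
        for start in range(0, len(x), batch_size):
            batch = order[start : start + batch_size]
            loss = F.cross_entropy(f_b(x[batch]), y[batch])
            opt.zero_grad()
            loss.backward()
            opt.step()
    return f_b
```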
In this embodiment, from the perspective of a backdoor attacker, 60 backdoor models were first generated using the two mainstream backdoor attack modes, training-set contamination (BadNets) and pre-trained model modification (TrojanNN), on four datasets from three application fields: handwritten digit classification (MNIST), traffic sign classification (GTSRB) and face classification (Youtube-Face and VGG-Face). At the same time, 30 normal (backdoor-free) models were generated with ordinary model training. The "triggers" of the backdoor models are special patterns covering 2%-25% of the original image, with varying positions, shapes and patterns. On these 90 models, the invention achieves both a false-positive rate (normal models falsely detected as having a backdoor / total normal models) and a false-negative rate (backdoor models whose backdoor goes undetected / total backdoor models) below 10%.
The above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it; a person skilled in the art may modify the technical solution or substitute equivalents for it, and the protection scope of the present invention is defined by the claims.
Claims (10)
1. A backdoor detection and restoration method for an image classification model, characterized by comprising the following steps:
based on a clean dataset, obtaining a series of comparison models that perform the same task as the backdoor model but contain no backdoor, using model pruning, transfer learning and shallow-model training;
reversing each class of the backdoor model by optimizing an objective function, with the help of the comparison models and the clean dataset, to obtain a series of potential triggers, the potential triggers comprising backdoor triggers and adversarial patches;
computing contribution heatmaps from the clean dataset and the potential triggers, and refining the potential triggers using the contribution heatmaps so that only the key features influencing the model's classification result are kept;
distinguishing backdoor triggers from adversarial patches among the refined potential triggers based on the difference in their transferability to the comparison models;
adding the identified backdoor triggers to the clean dataset and removing the backdoor from the backdoor model through adversarial training.
2. The method of claim 1, wherein the model pruning method is: removing the backdoor by pruning neurons with low activation rates in the backdoor model, and restoring the model's classification accuracy through fine-tuning;
the transfer learning method is: obtaining a comparison model through transfer learning, starting from a neural network model whose classification task is similar to that of the backdoor model;
the shallow-model training method is: simplifying the structure of the backdoor model, and training a comparison model on the simplified model structure.
3. The method of claim 1, wherein the objective function is optimized while adjusting the weights of its loss terms; the formula is as follows:

$$\min_{\Delta,\, m}\ \alpha L_{backdoor} + \beta L_{clean} + \gamma L_{noise}$$

$$L_{backdoor} = \frac{1}{n}\sum_{i=1}^{n} CE\big(f_b(\Delta \odot m + x_i \odot (J-m)),\ y_t\big)$$

$$L_{clean} = \frac{1}{n}\sum_{i=1}^{n} CE\big(f_c(\Delta \odot m + x_i \odot (J-m)),\ y_i\big)$$

$$L_{noise} = \sum_{j}\sum_{k}\sum_{(a,b)} \big|\, m_{j+a,\,k+b} - m_{j,k} \big|$$

wherein the loss terms $L_{backdoor}$ and $L_{clean}$ represent the influence of the backdoor trigger on the classification results of the backdoor model and the comparison model respectively, and $L_{noise}$ is a noise-reduction term applied to m; α, β and γ are the weight coefficients of the loss terms; Δ and m are the two variables being optimized, three-dimensional matrices of the same size as the images in the clean dataset, where Δ holds the pattern of the potential trigger and m is a transparency matrix that controls the location of the potential trigger; $x_i$ is an image randomly selected from the clean dataset; J is an all-ones matrix with the same dimensions as Δ; $\Delta \odot m + x_i \odot (J-m)$ denotes overlaying the trigger on image $x_i$; $f_b$ and $f_c$ are the prediction functions of the backdoor model and the comparison model respectively; CE is the cross-entropy loss function; n is the total number of images in the clean dataset; i is the index of the current image; on the backdoor model, the image carrying the trigger is classified into the target class $y_t$, while on the comparison model it is classified into its correct class $y_i$; j and k index the rows and columns of the matrix m, and a and b are the summation offsets ranging over adjacent pixels.
4. The method of claim 1, wherein the step of computing contribution heatmaps from the clean dataset and the potential triggers comprises:
randomly selecting a set of images from the clean dataset and overlaying them with the potential trigger;
computing, for all of these images, a heatmap representing each pixel's contribution to the classification result, i.e., the contribution heatmap.
5. The method of claim 1 or 4, wherein the step of refining the potential triggers using the contribution heatmaps comprises:
averaging all contribution heatmaps to obtain an average heatmap;
removing the region of the potential trigger with the lowest current contribution according to the average heatmap;
computing the current attack success rate of the potential trigger; if it is below a threshold, ending, and otherwise continuing to remove the region of the potential trigger with the lowest current contribution.
6. The method of claim 1, wherein the step of distinguishing backdoor triggers from adversarial patches among the refined potential triggers comprises:
randomly selecting a set of images from the clean dataset and overlaying them with the potential trigger;
computing the attack success rate of the potential trigger on the backdoor model; if it is below a threshold, judging the potential trigger to be an adversarial patch and ending;
if it is not below the threshold, computing the attack success rate of the potential trigger on all comparison models; if the attack success rate on any comparison model exceeds another threshold, judging the potential trigger to be an adversarial patch, and otherwise judging it to be a backdoor trigger.
7. The method of claim 1, wherein a certain proportion of images is first randomly selected from the clean dataset and overlaid with the backdoor trigger; the identified backdoor trigger is thereby added to the clean dataset.
8. The method of claim 1, wherein the step of removing the backdoor from the backdoor model through adversarial training comprises: adding the identified backdoor trigger to the clean dataset while keeping the class labels of the images unchanged, obtaining an adversarial training dataset; and fine-tuning the backdoor model on the adversarial training dataset to remove the backdoor from the backdoor model.
9. A backdoor detection and restoration system for an image classification model, comprising a memory storing a computer program and a processor, wherein the processor, when executing the program, carries out the steps of the method of any one of claims 1 to 8.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, carries out the steps of the method of any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110796626.8A CN113609482B (en) | 2021-07-14 | 2021-07-14 | Back door detection and restoration method and system for image classification model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110796626.8A CN113609482B (en) | 2021-07-14 | 2021-07-14 | Back door detection and restoration method and system for image classification model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113609482A true CN113609482A (en) | 2021-11-05 |
CN113609482B CN113609482B (en) | 2023-10-17 |
Family
ID=78304643
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110796626.8A Active CN113609482B (en) | 2021-07-14 | 2021-07-14 | Back door detection and restoration method and system for image classification model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113609482B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114003511A (en) * | 2021-12-24 | 2022-02-01 | 支付宝(杭州)信息技术有限公司 | Evaluation method and device for model interpretation tool |
CN114154589A (en) * | 2021-12-13 | 2022-03-08 | 成都索贝数码科技股份有限公司 | Similarity-based module branch reduction method |
CN116091871A (en) * | 2023-03-07 | 2023-05-09 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Physical countermeasure sample generation method and device for target detection model |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190318099A1 (en) * | 2018-04-16 | 2019-10-17 | International Business Machines Corporation | Using Gradients to Detect Backdoors in Neural Networks |
US20200410098A1 (en) * | 2019-06-26 | 2020-12-31 | Hrl Laboratories, Llc | System and method for detecting backdoor attacks in convolutional neural networks |
CN112989438A (en) * | 2021-02-18 | 2021-06-18 | 上海海洋大学 | Detection and identification method for backdoor attack of privacy protection neural network model |
CN113111349A (en) * | 2021-04-25 | 2021-07-13 | 浙江大学 | Backdoor attack defense method based on thermodynamic diagram, reverse engineering and model pruning |
- 2021-07-14: CN application CN202110796626.8A filed, granted as CN113609482B, status Active.
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190318099A1 (en) * | 2018-04-16 | 2019-10-17 | International Business Machines Corporation | Using Gradients to Detect Backdoors in Neural Networks |
US20200410098A1 (en) * | 2019-06-26 | 2020-12-31 | Hrl Laboratories, Llc | System and method for detecting backdoor attacks in convolutional neural networks |
CN112989438A (en) * | 2021-02-18 | 2021-06-18 | 上海海洋大学 | Detection and identification method for backdoor attack of privacy protection neural network model |
CN113111349A (en) * | 2021-04-25 | 2021-07-13 | 浙江大学 | Backdoor attack defense method based on thermodynamic diagram, reverse engineering and model pruning |
Non-Patent Citations (1)
Title |
---|
HUANG X et al.: "NeuronInspect: Detecting backdoors in neural networks via output explanations", arXiv, pages 1-7 *
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114154589A (en) * | 2021-12-13 | 2022-03-08 | 成都索贝数码科技股份有限公司 | Similarity-based module branch reduction method |
CN114154589B (en) * | 2021-12-13 | 2023-09-29 | 成都索贝数码科技股份有限公司 | Module branch reduction method based on similarity |
CN114003511A (en) * | 2021-12-24 | 2022-02-01 | 支付宝(杭州)信息技术有限公司 | Evaluation method and device for model interpretation tool |
CN114003511B (en) * | 2021-12-24 | 2022-04-15 | 支付宝(杭州)信息技术有限公司 | Evaluation method and device for model interpretation tool |
CN116091871A (en) * | 2023-03-07 | 2023-05-09 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Physical countermeasure sample generation method and device for target detection model |
CN116091871B (en) * | 2023-03-07 | 2023-08-25 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Physical countermeasure sample generation method and device for target detection model |
Also Published As
Publication number | Publication date |
---|---|
CN113609482B (en) | 2023-10-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113609482A (en) | Back door detection and restoration method and system for image classification model | |
CN112597993B (en) | Patch detection-based countermeasure model training method | |
Liu et al. | Visualization of driving behavior using deep sparse autoencoder | |
CN113111349B (en) | Backdoor attack defense method based on thermodynamic diagram, reverse engineering and model pruning | |
CN109086797A (en) | A kind of accident detection method and system based on attention mechanism | |
CN112434599B (en) | Pedestrian re-identification method based on random occlusion recovery of noise channel | |
CN110991568A (en) | Target identification method, device, equipment and storage medium | |
CN112016499A (en) | Traffic scene risk assessment method and system based on multi-branch convolutional neural network | |
WO2024051183A1 (en) | Backdoor detection method based on decision shortcut search | |
CN113609784A (en) | Traffic limit scene generation method, system, equipment and storage medium | |
CN111814644B (en) | Video abnormal event detection method based on disturbance visual interpretation | |
CN114332829A (en) | Driver fatigue detection method based on multiple strategies | |
CN113537284A (en) | Deep learning implementation method and system based on mimicry mechanism | |
Parasnis et al. | RoadScan: A Novel and Robust Transfer Learning Framework for Autonomous Pothole Detection in Roads | |
CN116071797B (en) | Sparse face comparison countermeasure sample generation method based on self-encoder | |
CN115098855A (en) | Trigger sample detection method based on custom back door behavior | |
CN113283520B (en) | Feature enhancement-based depth model privacy protection method and device for membership inference attack | |
CN113807541B (en) | Fairness repair method, system, equipment and storage medium for decision system | |
CN110796237B (en) | Method and device for detecting attack resistance of deep neural network | |
CN108647592A (en) | Group abnormality event detecting method and system based on full convolutional neural networks | |
CN118587561B (en) | Action recognition migration attack method based on self-adaptive gradient time sequence characteristic pruning | |
Chen et al. | A Defense Method against Backdoor Attacks in Neural Networks Using an Image Repair Technique | |
CN114639007B (en) | Fire detection model training method and detection method based on improved DETR | |
Chen et al. | Investigating the Backdoor on DNNs Based on Recolorization and Reconstruction: From A Multi-Channel Perspective | |
Chen et al. | Functional safety of deep learning techniques in autonomous driving systems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |