CN112270367A - Semantic information-based method for enhancing robustness of deep learning model - Google Patents

Semantic information-based method for enhancing robustness of deep learning model Download PDF

Info

Publication number
CN112270367A
Authority
CN
China
Prior art keywords
semantic information
deep learning
learning model
samples
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011222045.5A
Other languages
Chinese (zh)
Inventor
陈兴蜀
王丽娜
王伟
岳亚伟
唐瑞
朱毅
曾雪梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University filed Critical Sichuan University
Priority to CN202011222045.5A priority Critical patent/CN112270367A/en
Publication of CN112270367A publication Critical patent/CN112270367A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a semantic information-based method for enhancing the robustness of a deep learning model, and belongs to the field of deep learning security. To improve the ability of a deep learning model to defend against attacks in an adversarial environment and to improve its robustness, the invention designs a semantic information-based method for enhancing the adversarial robustness of a deep learning model. The method fully mines the semantic information that the model has missed near its decision boundary and greatly improves the classification accuracy of the deep learning model on adversarial samples. The method comprises: iteratively extracting general semantic information on a subset of the training data set; using the extracted general semantic information to increase the diversity of the training data through random selection and simple superposition; on the extended training set, separately computing the loss on the clean samples and on the samples with added semantic information, and summing the two losses; and optimizing the summed loss to train the deep learning model until it converges.

Description

Semantic information-based method for enhancing robustness of deep learning model
Technical Field
The invention relates to the technical field of machine learning, in particular to a method for enhancing the robustness of a deep learning model based on semantic information.
Background
In recent years, with the accumulation of massive data and the dramatic increase in computing power, artificial intelligence represented by deep learning has developed rapidly and attracted attention in many application scenarios. Deep learning models have surpassed human performance on many tasks. However, real-world scenarios and practical applications often involve high environmental complexity, strong uncertainty, incomplete information, and adversarial interference, while existing deep learning models rely excessively on massive data or knowledge. As a result, they adapt poorly to environmental change, are easily attacked in adversarial environments, handle only single tasks, and cannot meet the requirements of diverse scenarios. In particular, deep learning models suffer from poor robustness: a model that performs well on a test data set can be deceived by adversarial samples whose perturbations are imperceptible to the human eye, producing seriously wrong recognition results. Such lack of robustness poses a serious hidden danger to applications in many fields.
Current research on improving the robustness of deep learning models falls mainly into two categories. The first discovers an upper bound on model robustness by studying new forms of adversarial attack and heuristically strengthens the model against those specific attack methods; this approach offers no strong guarantee and often depends on large numbers of samples. The second uses formal methods to certify a lower bound on model robustness; this approach is reliable but requires many assumptions, involves complex computation, and is difficult to apply.
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide a method for enhancing the robustness of a deep learning model based on semantic information, which significantly improves the robustness of the model by exploiting semantic information the model has missed, and which can capture semantic information applicable to most samples from only a small number of samples. The technical scheme is as follows:
a method for enhancing the robustness of a deep learning model based on semantic information comprises the following steps:
Step one: iteratively extracting semantic information: for each class of the deep learning model's classification task, randomly extract from the training data a subset X that does not contain samples of that class, iteratively extract on X the semantic information missed near the region of that class, compute a semantic information vector applicable to most samples of X, and constrain the L∞ norm of the vector with an upper bound η;
Step two: sampling to extend the training set: from the difference set between the training set and the samples of each class, randomly draw samples at a specified ratio, add to them the semantic information vector obtained in step one, and keep the remaining samples unchanged, forming a new training set;
Step three: computing the objective function: on the new training set, compute the loss on the samples with added semantic information and the loss on the original samples without it, and sum the two losses;
Step four: retraining the model: retrain the deep learning model with the summed loss obtained in step three until the model converges.
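For orientation, the hyperparameters these four steps introduce can be collected in a single configuration object, as in the Python sketch below; the class name and the default values are illustrative assumptions and are not specified by the invention.

```python
from dataclasses import dataclass

@dataclass
class RobustTrainingConfig:
    """Hyperparameters of the four-step procedure (illustrative defaults only)."""
    eta: float = 8 / 255        # step one: upper bound on the L-infinity norm of the semantic vector
    project_every_k: int = 10   # step one: projection onto the L-infinity ball applied every k updates
    sample_ratio: float = 0.3   # step two: probability of adding the semantic vector to a drawn sample
    epochs: int = 50            # step four: maximum number of retraining epochs
```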
Further, in step one: on the subset X of the difference set between the training set and each class, the missing semantic information vector near the decision boundary of the region corresponding to that class is computed; the computation iterates point by point over X and, under the L∞-norm upper-bound constraint, computes the component of the semantic information vector at each point in turn and aggregates them to obtain the final universal semantic information vector:

r ← P_{∞,η}(r + Δr_i)

where r denotes the semantic information vector and Δr_i denotes the component of the semantic information vector computed at the i-th point of the set X, obtained by solving an optimization problem at that point; P_{∞,η} denotes the projection onto the infinity-norm ball of radius η centred at 0. The size of the semantic information vector is limited by computing Δr_i through the per-point optimization problem and applying the projection every k steps.
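As a concrete note on the projection P_{∞,η} used above: projection onto the infinity-norm ball of radius η centred at 0 reduces to an element-wise clamp. The sketch below assumes the vector is stored as a PyTorch tensor, and the function name is an illustrative choice rather than part of the invention.

```python
import torch

def project_linf(r: torch.Tensor, eta: float) -> torch.Tensor:
    """Project r onto the L-infinity ball of radius eta centred at 0.

    For the infinity norm this projection is an element-wise clamp:
    every component of r is pushed back into the interval [-eta, eta].
    """
    return torch.clamp(r, min=-eta, max=eta)
```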
Further, the objective function in step three is:

θ* = argmin_θ E_{(x,y)~C} [ J(θ, x, y) + J(θ, x_i + r, y) ]

where θ* is the parameter of the model; J(θ, x, y) denotes the loss on an original sample x to which no semantic information is added; J(θ, x_i + r, y) denotes the loss on a sample x_i with the semantic information vector r added; C denotes the original training set; t denotes a class of the model's classification task, t ∈ {1, 2, ..., T}, where T is the total number of classes of the original classification task; f(·) denotes the deep learning model; and θ and δ denote the model parameters and the perturbation added to an original sample, respectively.
The invention has the beneficial effects that:
1) The method extracts from the training data the semantic information missed in the region near the model's decision boundary, and preserves the universal applicability of the semantic information vector to the sample points by iteratively aggregating it point by point over a series of samples.
2) The invention expands the diversity of the sample data by sampling the training data set in proportion and reconstructing the sampled data with the semantic information, obtaining a training set containing richer semantic information.
3) The method retrains the deep learning model on the reconstructed training set with the recomputed objective function until the model converges, obtaining stronger robustness.
Drawings
FIG. 1 is a conceptual diagram of the present invention.
Fig. 2 is a schematic diagram of the semantic information vector of the present invention (taking a ten-class image data set as an example).
Detailed Description
The invention is described in further detail below with reference to the figures and specific embodiments. The fact that a deep learning model is easily attacked by adversarial samples means that the model has not truly learned the real concepts relevant to its decisions. Therefore, if the omitted information related to these real concepts can be extracted, the model can be helped to learn a clearer decision boundary closer to the true one, and its robustness can be enhanced. The semantic information to be extracted should not be derived from a single sample instance; it should apply to most samples and reflect the decision-relevant concept information missed by the model, rather than causing the model to overfit individual samples.
The semantic information-based method for enhancing the robustness of a deep learning model according to the invention is shown in FIG. 1. The method mainly comprises the steps of iteratively extracting semantic information, sampling to extend the training set data, computing the objective function, and retraining the model. The conventional deep learning model A in FIG. 1 is trained on the original training set samples; it can correctly classify clean samples but cannot correctly classify adversarial samples that are very close to the original samples. After the original training set is reconstructed using the semantic information extracted by the method and the model is retrained, the resulting model B with enhanced adversarial robustness can correctly classify not only clean samples but also adversarial samples.
The calculation method is as follows:
1. Iteratively extracting semantic information: for each class of the deep learning model's classification task, randomly extract from the training data a subset X that does not contain samples of that class, iteratively extract on X the semantic information missed near the region of that class, compute a semantic information vector applicable to most samples of X, and constrain the L∞ norm of the vector with an upper bound η.
Semantic information is closely related to the human understanding of features and concepts. Introducing such semantic information into the model training process helps the model learn the true concepts and improves its adversarial robustness. This information is missed during model training and should relate to the features of the samples as a whole rather than the individual features of a particular sample, which means the extracted semantic information must be general across samples. At the same time, the semantic information vector should take the form of a very small perturbation to prevent destructive interference with the otherwise useful learning process.
Based on this, we propose an iterative method for extracting semantic information, computed as:

r ← P_{∞,η}(r + Δr_i)    (1)

where r denotes the semantic information vector and Δr_i denotes the component of the semantic information vector computed at the i-th point of the set X, obtained by solving an optimization problem at that point. P_{∞,η} denotes the projection onto the infinity-norm ball of radius η centred at 0. A semantic information vector with universality is obtained by continuous iterative aggregation: Δr_i is computed by solving the per-point optimization problem, and the projection is applied every k steps to limit the size of the semantic information vector.
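To make the iteration in formula (1) concrete, the following Python sketch aggregates the per-point components into a single vector r and applies the projection every k steps. The patent specifies only that Δr_i solves an optimization problem at the i-th point; the single signed-gradient step used below is a stand-in assumption for that solver, and the function name, step size, and loss choice are illustrative rather than part of the invention.

```python
import torch
import torch.nn.functional as F

def extract_semantic_vector(model, subset, labels, eta, k=10, step_size=0.01):
    """Iteratively aggregate a universal semantic information vector r over the
    points of the subset X, projecting onto the L-infinity ball of radius eta
    every k steps, in the spirit of formula (1)."""
    r = torch.zeros_like(subset[0])
    for i, (x, y) in enumerate(zip(subset, labels)):
        # Stand-in for the per-point optimization problem: one signed gradient
        # step on the loss of the current perturbed point (an assumption).
        x_pert = (x + r).unsqueeze(0).clone().requires_grad_(True)
        loss = F.cross_entropy(model(x_pert), y.unsqueeze(0))
        grad, = torch.autograd.grad(loss, x_pert)
        delta_r_i = step_size * grad.sign().squeeze(0)  # component at the i-th point
        r = r + delta_r_i                               # aggregate onto the running vector
        if (i + 1) % k == 0:                            # project every k steps
            r = torch.clamp(r, min=-eta, max=eta)       # P_{inf, eta}
    return torch.clamp(r, min=-eta, max=eta)
```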
The significance of this step is as follows. Adversarial samples, which existing research considers to be very close to the original samples yet which cannot be correctly recognized, show that models obtained by existing deep learning methods have not learned the true concepts; the adversarial-sample phenomenon reveals the blind spots of the model. Yet the adversarial samples generated by existing research appear disordered and carry no information, so a paradox exists between the two. Extracting the omitted semantic information therefore not only improves the robustness of the model but also drives the model to learn the true concepts and become more accurate. In addition, the iterative extraction preserves generality over most samples, which improves the efficiency of the subsequent steps and ensures that the extracted semantic information is general rather than locally unimportant information that affects only a single sample.
In FIG. 2, the semantic information extracted from the data set is magnified and visualized; two pictures are drawn at random from each class and compared with the semantic information image, and the regions outlined by boxes clearly show the correspondence between the extracted semantic information and the content of the original images.
2. Sampling to extend the training set: from the difference set between the training set and the samples of each class, randomly draw samples at a specified ratio, add to them the semantic information vector obtained in step one, and keep the remaining samples unchanged, forming a new training set.
After the semantic information has been extracted iteratively, the training data set needs to be extended with it. The extension must add semantic information without weakening the original distribution characteristics. Therefore, the semantic information is added, by sampling, to the difference set between the training data and the data of each class.
Let C denote the original training set and C_t the samples belonging to class t in C, where t ∈ {1, 2, ..., T} and T is the total number of classes of the original classification task. From C \ C_t, samples are drawn with probability P, and the semantic information is added to each drawn sample x_i, turning it into x_i + r. The remaining samples are kept unchanged, and all samples are shuffled to form the new training set.
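A minimal sketch of this sampling-based extension, assuming the training data are held as PyTorch tensors; the function name, the default probability value, and the shuffling details are illustrative assumptions.

```python
import torch

def extend_training_set(x_all, y_all, r, class_t, p=0.3, generator=None):
    """Build the extended training set for class t: samples in C \\ C_t are
    selected with probability p and the semantic vector r is added to them;
    all other samples are kept unchanged, then the whole set is shuffled."""
    x_new = x_all.clone()
    not_t = y_all != class_t                                   # difference set C \ C_t
    pick = torch.rand(x_all.shape[0], generator=generator) < p # drawn with probability p
    chosen = not_t & pick
    x_new[chosen] = x_all[chosen] + r                          # x_i -> x_i + r
    perm = torch.randperm(x_all.shape[0], generator=generator)
    return x_new[perm], y_all[perm]                            # shuffled new training set
```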
3. Calculating an objective function: on the new training set, loss functions are calculated and summed respectively for the samples with added semantic information and the original samples without added semantic information.
The traditional training process of a deep learning model obtains the model parameters by solving an optimization problem over the objective function J(θ, x, y), as shown in formula (2):

θ* = argmin_θ E_{(x,y)~C} J(θ, x, y)    (2)

where θ* is the parameter of the model.

General adversarial training instead solves a min-max problem, as shown in formula (3):

θ* = argmin_θ E_{(x,y)~C} max_δ J(θ, x + δ, y)    (3)

Based on this, the method decomposes the objective function into two parts, as shown in formula (4):

θ* = argmin_θ E_{(x,y)~C} [ J(θ, x, y) + J(θ, x_i + r, y) ]    (4)

where θ* is the parameter of the model; J(θ, x, y) denotes the loss on an original sample x to which no semantic information is added; J(θ, x_i + r, y) denotes the loss on a sample x_i with the semantic information vector r added; C denotes the original training set; t is a class of the model's classification task; f(·) denotes the deep learning model; and δ denotes the perturbation added to an original sample.
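A minimal sketch of the summed objective of formula (4), assuming a batch of clean samples and a batch of samples with the semantic vector already added are available; the function name and the use of cross-entropy as the loss J are assumptions, not requirements of the invention.

```python
import torch
import torch.nn.functional as F

def combined_loss(model, x_clean, y_clean, x_sem, y_sem):
    """Sum of the loss on original samples x (no semantic information added)
    and the loss on samples x_i + r (semantic information added), as in formula (4)."""
    loss_clean = F.cross_entropy(model(x_clean), y_clean)  # J(theta, x, y)
    loss_sem = F.cross_entropy(model(x_sem), y_sem)        # J(theta, x_i + r, y)
    return loss_clean + loss_sem
```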
4. Retraining the model: retrain the deep learning model with the summed loss function obtained in step three until the model converges.
Through the above steps, a new training set and a new objective function incorporating semantic information are obtained. On the new data set, the deep learning model is retrained with the newly computed objective function until convergence. The resulting model not only classifies clean samples correctly but also resists adversarial attacks, yielding a model with stronger adversarial robustness.
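Step four can be sketched as a standard retraining loop over the extended training set, stopping when the epoch loss no longer improves. The optimizer, batch size, learning rate, convergence test, and the sem_mask argument (marking which samples carry the semantic vector so the two loss terms of step three can be computed separately and summed) are all illustrative assumptions rather than details fixed by the invention.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

def retrain(model, x_new, y_new, sem_mask, epochs=50, lr=1e-3, tol=1e-4):
    """Retrain the model on the extended training set until the epoch loss
    stops improving. sem_mask marks samples that already carry the semantic
    vector, so the clean-sample and semantic-sample losses are computed
    separately and summed as in step three."""
    loader = DataLoader(TensorDataset(x_new, y_new, sem_mask.float()),
                        batch_size=64, shuffle=True)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    prev = float("inf")
    for _ in range(epochs):
        total = 0.0
        for xb, yb, mb in loader:
            per_sample = F.cross_entropy(model(xb), yb, reduction="none")
            # sum of the loss on clean samples and on samples with semantic information
            loss = (per_sample[mb == 0].sum() + per_sample[mb == 1].sum()) / xb.shape[0]
            opt.zero_grad()
            loss.backward()
            opt.step()
            total += loss.item()
        if abs(prev - total) < tol:   # crude convergence check (assumption)
            break
        prev = total
    return model
```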

Claims (3)

1. A method for enhancing the robustness of a deep learning model based on semantic information, characterized by comprising the following steps:
Step one: iteratively extracting semantic information: for each class of the deep learning model's classification task, randomly extracting from the training data a subset X that does not contain samples of that class, iteratively extracting on X the semantic information missed near the region of that class, computing a semantic information vector applicable to most samples of X, and constraining the L∞ norm of the vector with an upper bound η;
Step two: sampling to extend the training set: from the difference set between the training set and the samples of each class, randomly drawing samples at a specified ratio, adding to them the semantic information vector obtained in step one, and keeping the remaining samples unchanged to form a new training set;
Step three: computing the objective function: on the new training set, computing the loss on the samples with added semantic information and the loss on the original samples without it, and summing the two losses;
Step four: retraining the model: retraining the deep learning model with the summed loss obtained in step three until the model converges.
2. The method for enhancing the robustness of a deep learning model based on semantic information according to claim 1, wherein in step one: on the subset X of the difference set between the training set and each class, the missing semantic information vector near the decision boundary of the region corresponding to that class is computed; the computation iterates point by point over the subset X and, under the L∞-norm upper-bound constraint, the component of the semantic information vector at each point is computed in turn and aggregated to obtain the final universal semantic information vector:

r ← P_{∞,η}(r + Δr_i)

where r denotes the semantic information vector and Δr_i denotes the component of the semantic information vector computed at the i-th point of the set X, obtained by solving an optimization problem at that point; P_{∞,η} denotes the projection onto the infinity-norm ball of radius η centred at 0; the size of the semantic information vector is limited by computing Δr_i through the per-point optimization problem and applying the projection every k steps.
3. The method for enhancing the robustness of a deep learning model based on semantic information according to claim 1, wherein the objective function in step three is:

θ* = argmin_θ E_{(x,y)~C} [ J(θ, x, y) + J(θ, x_i + r, y) ]

where θ* is the parameter of the model; J(θ, x, y) denotes the loss on an original sample x to which no semantic information is added; J(θ, x_i + r, y) denotes the loss on a sample x_i with the semantic information vector r added; C denotes the original training set; t denotes a class of the model's classification task, t ∈ {1, 2, ..., T}, where T is the total number of classes of the original classification task; f(·) denotes the deep learning model; and θ and δ denote the model parameters and the perturbation added to an original sample, respectively.
CN202011222045.5A 2020-11-05 2020-11-05 Semantic information-based method for enhancing robustness of deep learning model Pending CN112270367A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011222045.5A CN112270367A (en) 2020-11-05 2020-11-05 Semantic information-based method for enhancing robustness of deep learning model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011222045.5A CN112270367A (en) 2020-11-05 2020-11-05 Semantic information-based method for enhancing robustness of deep learning model

Publications (1)

Publication Number Publication Date
CN112270367A true CN112270367A (en) 2021-01-26

Family

ID=74346129

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011222045.5A Pending CN112270367A (en) 2020-11-05 2020-11-05 Semantic information-based method for enhancing robustness of deep learning model

Country Status (1)

Country Link
CN (1) CN112270367A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114491039A (en) * 2022-01-27 2022-05-13 四川大学 Meta-learning few-sample text classification method based on gradient improvement
CN115473734A (en) * 2022-09-13 2022-12-13 四川大学 Remote code execution attack detection method based on single classification and federal learning

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110135579A (en) * 2019-04-08 2019-08-16 上海交通大学 Unsupervised field adaptive method, system and medium based on confrontation study

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110135579A (en) * 2019-04-08 2019-08-16 上海交通大学 Unsupervised field adaptive method, system and medium based on confrontation study

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LINA WANG ET AL.: ""Improving adversarial robustness of deep neural networks by using semantic information"", 《ARXIV》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114491039A (en) * 2022-01-27 2022-05-13 四川大学 Meta-learning few-sample text classification method based on gradient improvement
CN114491039B (en) * 2022-01-27 2023-10-03 四川大学 Primitive learning few-sample text classification method based on gradient improvement
CN115473734A (en) * 2022-09-13 2022-12-13 四川大学 Remote code execution attack detection method based on single classification and federal learning
CN115473734B (en) * 2022-09-13 2023-08-11 四川大学 Remote code execution attack detection method based on single classification and federal learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20210126)