CN115331079A - Attack resisting method for multi-mode remote sensing image classification network - Google Patents
- Publication number
- CN115331079A (application number CN202211005572.XA)
- Authority
- CN
- China
- Prior art keywords
- remote sensing
- network
- sensing image
- disturbance
- modal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/776—Validation; Performance evaluation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/17—Terrestrial scenes taken from planes or by drones
Abstract
The invention discloses an adversarial attack method for a multi-modal remote sensing image classification network. The method targets two remote sensing image data sources, a three-band optical remote sensing image (TOP) and a single-band digital surface model (DSM) image, achieves a more pronounced attack effect with higher attack time efficiency, and is used to evaluate and improve the robustness of multi-modal remote sensing image classification networks.
Description
Technical Field
The invention belongs to the technical field of image processing, and in particular relates to an adversarial attack method for a multi-modal remote sensing image classification network.
Background
Remote sensing image classification remains a challenging task in the field of remote sensing. In recent years, the rapid development of aerospace technology has produced large volumes of remote sensing images from different sensors, and effective classification of ground objects by combining multiple sensor images has become a research hotspot. Multi-modal remote sensing images capture ground features from different observational perspectives: for example, a TOP image expresses high-spatial-resolution information of the remote sensing scene, while a digital surface model (DSM) image provides height information of the ground features. Multi-modal classification methods that exploit the complementary advantages of such multi-modal remote sensing data can effectively reduce the uncertainty in the classification task. In recent years, deep neural networks have been widely introduced into the task of multi-modal remote sensing image classification. Compared with single-modal methods, multi-modal classification can effectively exploit the correlation between the two modalities and achieve higher classification accuracy.
The vulnerability of deep neural networks has also attracted considerable attention in recent years. Adding a tiny perturbation, imperceptible to the human eye, to an original clean sample can cause a deep neural network to produce a wrong prediction with high confidence; the perturbed sample is called an adversarial example, and the process of generating it is called an adversarial attack. It is therefore necessary to understand in advance the risks a network faces and to find more aggressive adversarial examples, so as to make the network more robust. Various attack methods have been proposed to generate adversarial examples, such as the Fast Gradient Sign Method (FGSM), the Iterative Fast Gradient Sign Method (I-FGSM), the optimization-based C&W algorithm, and the DeepFool algorithm. However, existing adversarial attack techniques only consider the attack effect in a single modality and lack attack implementations for multi-modal classification networks. When attacking a multi-modal remote sensing image classification network, one must consider not only the attack success rate, the imperceptibility of the perturbation, and the timeliness of the attack, but also the cooperative attack capability across different modalities. Therefore, for multi-modal remote sensing image classification networks, high-quality multi-modal adversarial examples need to be generated, so that the robustness of such networks can be evaluated and improved.
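As an illustration of the gradient-sign attacks cited above (background methods, not the invention's attack), a minimal FGSM sketch on a toy linear loss; all numeric values are assumed:

```python
import numpy as np

# Toy FGSM sketch: for a linear "model" whose loss is L(x) = w . x,
# the gradient with respect to x is simply w, so the FGSM adversarial
# example is x' = x + eps * sign(gradient).  Real attacks
# differentiate the network's cross-entropy loss instead.
def fgsm(x, grad, eps):
    return x + eps * np.sign(grad)

x = np.array([0.2, -0.5, 0.8])   # clean sample (assumed values)
w = np.array([1.0, -2.0, 0.5])   # gradient of the toy loss w.r.t. x
x_adv = fgsm(x, w, eps=0.1)
print(x_adv)                      # every entry moves by exactly +/- 0.1
```

Note that the perturbation magnitude is exactly eps in every component, which is what makes FGSM fast but also easy to detect compared with optimized attacks.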
Disclosure of Invention
The invention aims to provide an adversarial attack method for a multi-modal remote sensing image classification network with a more pronounced attack effect and higher attack time efficiency.
The invention adopts the following technical scheme: an adversarial attack method for a multi-modal remote sensing image classification network, comprising the following steps:
step one, constructing a multi-modal training sample set T and a test sample set S;
step two, inputting the training sample set T and the testing sample set S in the step one into a target model f to be attacked;
step three, constructing a target attack class t according to the target model f to be attacked in the step two;
step four, building a multi-modal adversarial attack network, the multi-modal adversarial attack network consisting of a perturbation generation network and a discriminator network for each modality;
step five, generating multi-modal adversarial examples;
step six, inputting the multi-modal adversarial examples from step five into the corresponding perturbation generation networks, discriminator networks and the target model f, and constructing a multi-modal generation loss function and a multi-modal discrimination loss function;
step seven, alternately training with the multi-modal generation loss function and the multi-modal discrimination loss function from step six, updating each perturbation generation network and each discriminator network, and completing the training of the multi-modal adversarial attack network;
step eight, inputting the test samples into the multi-modal adversarial attack network trained in step seven to generate the corresponding test adversarial examples.
Further, after step eight, a step nine is included, as follows: inputting the training samples into the multi-modal adversarial attack network to obtain adversarial examples of the training samples, adding these adversarial examples to the training sample set, and re-training the target model f from step two; then repeating steps three to eight.
Further, the multiple modalities are optical remote sensing images and digital surface model images.
Further, the specific process of constructing the multi-modal generation loss function and the multi-modal discrimination loss function is as follows:
step 6.1, constructing the multi-modal generation loss function, as shown in formula (C), reconstructed here from the definitions that follow:
L_G = L_adv^T + L_adv^D + α(L_per^T + L_per^D) + β·L_f + γ·L_C (C);
wherein: L_adv^T and L_adv^D are the adversarial losses of the optical remote sensing image modality and the digital surface model image modality, respectively;
the hyper-parameters α, β and γ are the weight coefficients of the perception loss, the spoofing loss and the cooperative attack loss;
L_f is the spoofing loss, which drives samples to be misclassified as the target class t; the spoofing loss L_f is defined in formula (H);
L_C is the cooperative loss, defined in formula (I);
wherein: D_T(x′_T) denotes the output probability obtained by inputting the generated optical remote sensing adversarial example x′_T into the optical remote sensing image discriminator network D_T; D_D(x′_D) denotes the output probability obtained by inputting the generated digital surface model adversarial example x′_D into the digital surface model image discriminator network D_D; L_per^T and L_per^D denote the perception losses of the optical remote sensing image and the digital surface model image, defined in formulas (F) and (G), written here in a hinge form consistent with the stated role of ε:
L_per^T = max(ε, ‖G_T(x_T)‖_p) (F);
L_per^D = max(ε, ‖G_D(x_D)‖_p) (G);
wherein: ε is a hyper-parameter controlling the minimum allowable perturbation intensity, and ‖P‖_p denotes the L_p norm of a perturbation P;
formula (H) is:
L_f = l_f(f(x′_T, x′_D), t) (H);
wherein: f is the target model to be attacked, l_f is the cross-entropy loss, and t is the target attack label obtained in step three;
formula (I) is:
L_C = l_C(T(G_T(x_T)), G_D(x_D)) (I);
wherein: T is a band-adjustment function; l_C is a cosine similarity measure function;
step 6.2, constructing the multi-modal discrimination loss function, as shown in formula (J):
L_D = L_D^T + L_D^D (J);
wherein: L_D^T and L_D^D are the discriminator loss functions of the optical remote sensing image and the digital surface model image, respectively, defined in formulas (K) and (L) in the standard GAN form:
L_D^T = −[log D_T(x_T) + log(1 − D_T(x′_T))] (K);
L_D^D = −[log D_D(x_D) + log(1 − D_D(x′_D))] (L);
wherein: D_T(x_T) denotes the output probability obtained by inputting the original optical remote sensing training sample x_T into the optical remote sensing image discriminator network D_T; D_D(x_D) denotes the output probability obtained by inputting the original digital surface model training sample x_D into the digital surface model image discriminator network D_D; D_T(x′_T) and D_D(x′_D) are as defined above.
Further, in the second step, the target model f is a trained multi-source remote sensing image classification network.
Further, in step three, the predicted probability scores P = [p_1, p_2, ..., p_n]^T of a sample are obtained by a forward pass of the target model f;
One-Hot encoding is performed according to the original label of the sample and the number of classes to obtain an encoded vector; for example, if the original sample label is 1, the encoded vector is h = [1, 0, ..., 0]^T, and the vector length equals the number of classes;
an inverse mask (1 − h) is obtained from the encoded vector; multiplying the inverse mask element-wise by the predicted probability scores P gives the predicted probabilities of all classes except the class of the original label, p′ = [0, p_2, ..., p_n]^T; a difference operation is then applied to p′, subtracting a large value at the position of the original label, i.e. s = p′ − h × 1e10, and the position of the maximum value of s is taken as the target class;
wherein: p_n denotes the probability score output for the n-th class.
Further, in step four, the multi-modal adversarial attack network consists of the optical remote sensing image perturbation generation network G_T, the optical remote sensing image discriminator network D_T, the digital surface model image perturbation generation network G_D and the digital surface model image discriminator network D_D.
Further, in step five, the multi-modal adversarial example generation process is as follows:
the optical remote sensing image training sample x_T is input to the optical remote sensing image perturbation generation network G_T, which generates an optical remote sensing image perturbation; the generated perturbation is added to the input optical remote sensing image training sample to obtain the TOP adversarial example x′_T, as shown in formula (A):
x′_T = x_T + G_T(x_T) (A);
wherein: G_T(x_T) denotes the TOP perturbation generated by the optical remote sensing image perturbation generation network.
The digital surface model image training sample x_D is input to the digital surface model image perturbation generation network G_D, which generates a digital surface model image perturbation; the generated perturbation is added to the input digital surface model image training sample to obtain the digital surface model adversarial example x′_D, as shown in formula (B):
x′_D = x_D + G_D(x_D) (B);
wherein: G_D(x_D) denotes the digital surface model perturbation generated by the digital surface model image perturbation generation network.
Further, in step eight, the specific process of attacking the multi-modal remote sensing image classification network with the test samples is as follows:
each TOP test sample is input to the trained TOP perturbation generation network, which outputs a TOP perturbation; the TOP perturbation is added to the input TOP test sample to obtain the adversarial example of the TOP test sample. Simultaneously, each DSM test sample is input to the trained DSM perturbation generation network, which outputs a DSM perturbation; the DSM perturbation is added to the input DSM test sample to obtain the adversarial example of the DSM test sample, completing the adversarial attack on the multi-modal remote sensing image classification network.
further, the specific process of step nine is as follows:
training sample x of optical remote sensing image T Is inputted intoThe trained good optical remote sensing image disturbance generates network output optical remote sensing image disturbance, the optical remote sensing image disturbance is added with the input optical remote sensing image training sample to obtain a countermeasure sample x 'of the optical remote sensing image training sample' T ;
Training sample x of digital elevation model image D Inputting the image disturbance of the trained digital elevation model into a disturbance generation network, outputting the image disturbance of the digital elevation model, and adding the image disturbance of the digital elevation model and the input image training sample of the digital elevation model to obtain a countermeasure sample x 'of the image training sample of the digital elevation model' D (ii) a X' T And x' D Adding the training samples into the training sample set to obtain a new training sample set T', namely { x T ,x D ,x′ T ,x′ D E, T', and retraining the target model f in the step two.
The invention has the following beneficial effects: 1. The method generates multi-source perturbations jointly across multiple modalities, so the generated adversarial examples are more realistic. 2. A multi-modal generation loss function and a multi-modal discrimination loss function are designed to establish a relationship between the perturbations of different modalities, which effectively reduces the perturbation intensity added to each data source while maintaining a high attack success rate.
Drawings
FIG. 1 is a flow diagram of the adversarial attack method for a multi-source remote sensing image classification network according to the invention;
FIG. 2 shows the datasets used in the experiments of the invention;
2a is the Potsdam dataset; 2b is the Vaihingen dataset;
FIG. 3 is a schematic diagram of the structures of the target model, the perturbation generation network and the discriminator network of the invention;
FIG. 4 shows the classification maps of different classification methods on the Potsdam dataset before and after adversarial training;
FIG. 5 shows the classification maps of different classification methods on the Vaihingen dataset before and after adversarial training.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention discloses an adversarial attack method for a multi-modal remote sensing image classification network which, as shown in FIG. 1, comprises the following steps:
step one, constructing a multi-modal training sample set T and a test sample set S;
the multi-modes are three-band optical remote sensing images (TOP) and one-band digital elevation model images (DSM).
Inputting two multi-modal data TOP and DSM, and a real class diagram corresponding to the two modalities, as shown in FIG. 2; according to each pixel on the two multi-modal data TOP and DSM, taking the pixel as the center, defining a spatial window with the size of 27 multiplied by 27 pixels to extract samples, and respectively extracting the TOP samples and the DSM samples in pairs on the TOP and the DSM to form a sample pair; forming a sample set according to all the sample pairs; selecting part of sample pairs to form a training sample set, and selecting the rest of sample pairs in the sample set to form a test sample set, wherein { x T ,x D }∈T,x T Andrespectively TOP training and TOP test samples, x D AndDSM training samples and DSM test samples, respectively.
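The paired window extraction described above can be sketched as follows; simple index slicing without border handling is assumed, and all names are illustrative:

```python
import numpy as np

# Sketch of paired patch extraction: around each labeled pixel, cut a
# 27x27 window from the 3-band TOP image and the co-registered 1-band
# DSM image so that the two patches cover the same ground area.
def extract_pair(top, dsm, row, col, win=27):
    r = win // 2
    top_patch = top[row - r:row + r + 1, col - r:col + r + 1, :]
    dsm_patch = dsm[row - r:row + r + 1, col - r:col + r + 1]
    return top_patch, dsm_patch

top = np.zeros((100, 100, 3))   # synthetic TOP tile
dsm = np.zeros((100, 100))      # synthetic DSM tile
top_p, dsm_p = extract_pair(top, dsm, 50, 50)
print(top_p.shape, dsm_p.shape)  # (27, 27, 3) (27, 27)
```

Pixels closer than 13 pixels to the tile border would need padding or skipping; that policy is not specified in the text.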
Step two, inputting the training sample set T and the testing sample set S in the step one into a target model f to be attacked;
in the second step, the target model f is a trained multi-source remote sensing image classification network. As shown in FIG. 3(a), its structure is as follows: two data sources are input, the TOP input size being 28 × 28 × 3 and the DSM input size being 28 × 28 × 1; each data source is given a 5-layer network to extract its basic features, the outputs of the two branches are then concatenated, and a 4-layer network extracts the common features of the two data sources and performs the classification. The network parameters of the target model f are listed in Table 1; the target model is trained in advance, and its parameters are kept unchanged during the attack.
TABLE 1 target model f network parameters
Step three, constructing a target attack class t according to the target model f to be attacked in the step two;
in the third step, each sample pair x_T, x_D in the training sample set T is passed through a forward computation of the pre-trained target model f to obtain the output probability score of the sample on each class, and the target attack class is selected from these scores. Suppose a training sample pair x_T, x_D has the output probability scores P = [p_1, p_2, ..., p_n]^T, where p_n denotes the probability score of the n-th class; the target class is then constructed according to the following principle:
the predicted probability scores P = [p_1, p_2, ..., p_n]^T of the sample are obtained by a forward pass of the target model f;
One-Hot encoding is performed according to the original label of the sample and the number of classes to obtain an encoded vector; for example, if the original sample label is 1, the encoded vector is h = [1, 0, ..., 0]^T, and the vector length equals the number of classes;
an inverse mask (1 − h) is obtained from the encoded vector; multiplying the inverse mask element-wise by the predicted probability scores P gives the predicted probabilities of all classes except the class of the original label, p′ = [0, p_2, ..., p_n]^T; to prevent the target class from coinciding with the original class, a difference operation is applied to p′, subtracting a large value at the position of the original label, i.e. s = p′ − h × 1e10, and the position of the maximum value of s is taken as the target class;
wherein: p_n denotes the probability score output for the n-th class.
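The target-class construction above can be sketched as follows (names are illustrative; the 1e10 offset mirrors s = p′ − h × 1e10):

```python
import numpy as np

# Select the attack target: mask out the true class with the inverse
# mask, push its position to a huge negative value, and take the most
# confident remaining class as the target.
def select_target(prob, true_label):
    h = np.zeros_like(prob)
    h[true_label] = 1.0              # one-hot of the original label
    p_masked = (1.0 - h) * prob      # inverse mask * probabilities
    s = p_masked - h * 1e10          # suppress the true class
    return int(np.argmax(s))

P = np.array([0.7, 0.2, 0.1])        # model's predicted scores
print(select_target(P, 0))           # -> 1, the runner-up class
```

Choosing the runner-up class as the target makes the targeted attack easier, since the model already assigns it the second-highest confidence.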
Step four, building a multi-mode anti-attack network; the multi-modal attack-resisting network consists of a disturbance generation network and a discriminator network of each mode;
in the fourth step, the multi-modal attack-resisting network generates a network G by the disturbance of the optical remote sensing image T Optical remote sensing image discriminator network D T Digital elevation model image disturbance generation network G D And a digital elevation model image discriminator network D D And (4) forming.
As shown in fig. 3 (b) (c). The disturbance generating network is a coding-decoding structure in which G T The image input size of the layer 1 convolutional layer of (2) is 28X 3,G D Has an input size of 28 × 28 × 1, and the number of filters in the third layer in the decoding structure depends on the number of channels in the picture, G T Is 3,G D To 1, in addition to this parameter setting, the disturbance generation network G T And G D The other parameter settings are the same, as shown in table 2; TOP discriminator network D T Layer 1 convolutional layer input picture size 28 x3, dsm discriminator network D D The size of the convolution layer 1 input picture is 28 × 28 × 1, and the discriminator D T And D D The same parameter settings are shown in table 3.
Table 2 network parameters of a disturbance generation network
TABLE 3 discriminator network parameters
Step five, generating a multi-modal confrontation sample;
in the fifth step, the multi-modal confrontation sample generation process is as follows:
training sample x of optical remote sensing image T Input to optical remote sensing image disturbance generation network G T In the method, optical remote sensing image disturbance is generated, and the generated optical remote sensing image disturbance is added to an input optical remote sensing image training sample to obtain a TOP confrontation sample x' T As shown in formula (A):
x′ T =x T +G T (x T ) (A);
wherein: g T (x T ) Representing TOP disturbance generated by an optical remote sensing image disturbance generation network.
Training sample x of digital elevation model image D Input to the image disturbance generation network D of the digital elevation model T Generating image disturbance of a digital elevation model, adding the generated disturbance to an input digital elevation model image training sample to obtain a countermeasure sample x 'of the digital elevation model image' D As shown in formula (B):
x′ D =x D +G D (x D ) (B);
wherein: g D (x D ) And representing the image disturbance of the digital elevation model generated by the image disturbance generation network of the digital elevation model.
Step six, inputting the multi-modal confrontation samples in the step five into a corresponding disturbance generation network, a discriminator network and a target model f respectively, and constructing a multi-modal generation loss function and a multi-modal discrimination loss function;
in the sixth step, the specific process of constructing the multi-modal generating loss function and the multi-modal discriminating loss function is as follows:
inputting TOP countercheck samples into a TOP discriminator and a target model f respectively, and simultaneously inputting DSM countercheck samples into a DSM discriminator and a target function f respectively, and constructing a multi-modal generation loss function and a multi-modal discrimination loss function according to TOP disturbance, the TOP countercheck samples, TOP at the output of the discriminator and TOP at the output of the target model, and DSM disturbance, DSM countercheck samples, DSM at the output of the discriminator and DSM at the output of the target model; the concrete structure is as follows:
step 6.1, constructing the multi-modal generation loss function, as shown in formula (C), reconstructed here from the definitions that follow:
L_G = L_adv^T + L_adv^D + α(L_per^T + L_per^D) + β·L_f + γ·L_C (C);
wherein: L_adv^T and L_adv^D are the adversarial losses of the optical remote sensing image modality and the digital surface model image modality, respectively; these terms constrain the generated TOP adversarial examples to stay close to the input TOP training samples and the generated DSM adversarial examples to stay close to the input DSM training samples.
The hyper-parameters α, β and γ are the weight coefficients of the perception loss, the spoofing loss and the cooperative attack loss;
L_f is the spoofing loss, which drives samples to be misclassified as the target class t; the spoofing loss L_f is defined in formula (H);
L_C is the cooperative loss, which realizes the multi-modal cooperative attack; it is defined in formula (I);
wherein: D_T(x′_T) denotes the output probability obtained by inputting the generated optical remote sensing adversarial example x′_T into the optical remote sensing image discriminator network D_T; D_D(x′_D) denotes the output probability obtained by inputting the generated digital surface model adversarial example x′_D into the digital surface model image discriminator network D_D; L_per^T and L_per^D denote the optical remote sensing image perception loss and the digital surface model image perception loss, which constrain the intensities of the TOP perturbation and the DSM perturbation respectively, and are defined in formulas (F) and (G), written here in a hinge form consistent with the stated role of ε:
L_per^T = max(ε, ‖G_T(x_T)‖_p) (F);
L_per^D = max(ε, ‖G_D(x_D)‖_p) (G);
wherein: ε is a hyper-parameter controlling the minimum allowable perturbation intensity, and ‖P‖_p denotes the L_p norm of a perturbation P; in the embodiments the L_∞ norm is used to constrain the perturbation, forcing the generated adversarial examples closer to the real samples.
Formula (H) is:
L_f = l_f(f(x′_T, x′_D), t) (H);
wherein: f is the target model to be attacked, l_f is the cross-entropy loss, and t is the target attack label obtained in step three;
formula (I) is:
L_C = l_C(T(G_T(x_T)), G_D(x_D)) (I);
wherein: T is a band-adjustment function; because the TOP data has three bands while the DSM perturbation has one band, the DSM perturbation must be tiled to three bands before the similarity is computed; l_C is a cosine similarity measure function, used to measure the similarity of the perturbations of the different modalities.
Step 6.2, constructing a multi-modal discrimination loss function, as shown in formula (J):
wherein:the multi-mode discriminator loss function is an optical remote sensing image and a digital elevation model image respectively, and is defined as shown in a formula (K) and a formula (L):
wherein: d T (x T ) Representing training sample x for original multimodality to optical remote sensing image T Input to a multimode optical remote sensing image identification network D T The obtained output probability; d D (x D ) Representing training sample x of original digital elevation model image D Input to the digital elevation model image discrimination network D D The obtained output probability; d T (x′ T ) Representing that the generated optical remote sensing image is confronted with a sample x' T Input to an optical remote sensing image discrimination network D T The obtained output probability; d D (x′ D ) Representing the generated digital elevation model image against the sample x' D Input to a digital elevation model image discriminator network D D The resulting output probability.
Step seven, alternately training with the multi-modal generation loss function and the multi-modal discrimination loss function from step six, updating each perturbation generation network and each discriminator network, and completing the training of the multi-modal adversarial attack network.
The model is optimized by alternating between the multi-modal generator loss and the multi-modal discriminator loss, training the TOP perturbation generation network, the TOP discriminator network, the DSM perturbation generation network and the DSM discriminator network in turn, as follows:
step 7.1, update the TOP discriminator D_T: keeping the parameters of the DSM discriminator D_D, the DSM perturbation generator G_D and the TOP perturbation generator G_T unchanged, train D_T by minimizing formula (M) with gradient descent;
step 7.2, update the TOP perturbation generator G_T: keeping the parameters of the TOP discriminator D_T, the DSM perturbation generator G_D and the DSM discriminator D_D unchanged, update G_T by minimizing formula (N) with gradient descent;
step 7.3, update the DSM discriminator D_D: keeping the parameters of the DSM perturbation generator G_D, the TOP discriminator D_T and the TOP perturbation generator G_T unchanged, update D_D by minimizing formula (O) with gradient descent;
step 7.4, update the DSM perturbation generator G_D: keeping the parameters of the DSM discriminator D_D, the TOP perturbation generator G_T and the TOP discriminator D_T unchanged, update G_D by minimizing formula (P) with gradient descent.
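The alternating schedule of steps 7.1 to 7.4 can be sketched with placeholder update functions (assumptions standing in for the gradient-descent steps on (M) to (P)):

```python
# Exactly one of D_T, G_T, D_D, G_D is updated at a time while the
# other three are held fixed; the cycle repeats every epoch.
def train(epochs):
    trace = []
    order = ("D_T", "G_T", "D_D", "G_D")   # minimize (M), (N), (O), (P)
    for _ in range(epochs):
        for name in order:
            trace.append(name)             # stand-in for one descent step
    return trace

print(train(1))  # ['D_T', 'G_T', 'D_D', 'G_D']
```

Freezing the other three networks during each update is what keeps the generator-discriminator game stable, as in standard GAN training.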
Step eight, inputting the test samples into the multi-modal counter attack network in the step seven, and generating corresponding test counter samples.
In the step eight, the specific process of attacking the multi-modal remote sensing image classification network by using the test sample is as follows:
The TOP test sample is input to the trained TOP perturbation generation network, which outputs a TOP perturbation; adding this perturbation to the input TOP test sample yields the adversarial sample of the TOP test sample. Simultaneously, the DSM test sample is input to the trained DSM perturbation generation network, which outputs a DSM perturbation; adding this perturbation to the input DSM test sample yields the adversarial sample of the DSM test sample. This completes the adversarial attack on the multi-modal remote sensing image classification network. The concrete expression is as follows:
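The per-modality test-time procedure amounts to adding the generator's output to the input sample. A minimal sketch follows (hypothetical helper name; clipping to a valid pixel range is an assumption of this sketch, not stated in the patent):

```python
import numpy as np

def attack_test_sample(x, generator, clip_min=0.0, clip_max=1.0):
    """Adversarial test sample: x' = x + G(x), applied identically with
    the TOP generator G_T and the DSM generator G_D."""
    x_adv = x + generator(x)
    return np.clip(x_adv, clip_min, clip_max)  # keep values in a valid range
```

Calling `attack_test_sample(x_T, G_T)` and `attack_test_sample(x_D, G_D)` then yields the two adversarial test samples fed to the classification network.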
the ninth step is as follows: inputting training samples into a multi-modal counterattack network to obtain countersamples of the training samples, adding the countersamples of the training samples into a training sample set, and re-training the target model f in the second step; and repeating the third step to the eighth step.
The specific process is as follows: the optical remote sensing image training sample x_T is input to the trained optical remote sensing image perturbation generation network, which outputs an optical remote sensing image perturbation; adding this perturbation to the input optical remote sensing image training sample yields the adversarial sample x′_T of the optical remote sensing image training sample;
The digital elevation model image training sample x_D is input to the trained digital elevation model image perturbation generation network, which outputs a digital elevation model image perturbation; adding this perturbation to the input digital elevation model image training sample yields the adversarial sample x′_D of the digital elevation model image training sample. x′_T and x′_D are added to the training sample set to obtain a new training sample set T′ = {x_T, x_D, x′_T, x′_D}, and the target model f from step two is retrained to enhance its ability to withstand attacks.
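This retraining step can be sketched as a set-augmentation routine (hypothetical function name, assuming the trained perturbation generators G_T and G_D):

```python
def build_augmented_training_set(samples_T, samples_D, G_T, G_D):
    """Form T' = {x_T, x_D, x'_T, x'_D}: append each training sample's
    adversarial counterpart x' = x + G(x) to the original set; the
    target model f is then retrained on the enlarged set."""
    adv_T = [x + G_T(x) for x in samples_T]
    adv_D = [x + G_D(x) for x in samples_D]
    return samples_T + adv_T, samples_D + adv_D
```

Doubling the training set with adversarial counterparts in this way is standard adversarial training; the retrained model sees both clean and perturbed versions of every sample.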
The effect of the method of the present invention can be further illustrated by the following simulation experiments:
(1) Simulation conditions:
The simulation hardware is: Windows 10, GPU NVIDIA GeForce RTX 3060; the software platform is: Matlab R2016a, PyCharm;
The images used in the simulation come from the Potsdam and Vaihingen datasets. The Potsdam dataset comprises 28 unmanned-aerial-vehicle images of 6000×6000 pixels at a resolution of 5 cm/pixel, containing image data, terrain data and label data in three different channels; a pair of DSM and TOP data is randomly selected as the input of the multi-source remote sensing image network, as shown in Fig. 2(a). The Vaihingen dataset comprises 33 remote sensing images of different sizes, with 3-band TOP data and single-band DSM data at a resolution of 9 cm/pixel; a pair of DSM and TOP data is selected as the input of the multi-source remote sensing image classification network, as shown in Fig. 2(b). In this invention, 10000 pixels per class are randomly selected from the Potsdam dataset to construct training samples, and 800 pixels per class are randomly selected from the Vaihingen dataset to construct training samples.
(2) Simulation content and results:
Figs. 4(a)-(d) show the classification results on the target model f of the test adversarial samples generated by FGSM, C&W, PGD and the present technique on the Potsdam image dataset, respectively; (i) shows the network's classification of the original test samples. As the figures show, the adversarial samples generated by all four techniques cause the target model to make obvious misclassifications compared with the original test samples: classification maps (a)-(d) differ clearly from map (i). However, the classification map obtained by the present invention differs from map (i) more markedly and contains more misclassified regions, indicating that the adversarial samples generated by the present technique are more aggressive.
Figs. 5(a)-(d) show the classification results on the target model f of the test adversarial samples generated by FGSM, C&W, PGD and the present technique on the Vaihingen image dataset, respectively; (i) shows the network's classification of the original test samples. Fig. 5 shows the same trend as Fig. 4.
The numerical comparison of classification results before and after adversarial training between the method of the present invention and the prior-art FGSM, C&W and PGD is shown in Tables 4 and 5:
Table 4 compares the classification results of the present method and the prior-art FGSM, C&W and PGD before and after adversarial training on the Potsdam dataset
Table 5 compares the classification results of the present method and the prior-art FGSM, C&W and PGD before and after adversarial training on the Vaihingen dataset
The data in Tables 4 and 5 show that the present technique has a shorter test time and higher attack time-efficiency, giving it a clear advantage in handling multi-modal data attacks.
Claims (10)
1. An attack resisting method for a multi-mode remote sensing image classification network is characterized by comprising the following steps:
step one, constructing a multi-modal training sample set T and a test sample set S;
step two, inputting the training sample set T and the testing sample set S in the step one into a target model f to be attacked;
step three, constructing a target attack class t according to the target model f to be attacked in the step two;
step four, building a multi-mode anti-attack network; the multi-modal attack-resisting network consists of a disturbance generation network and a discriminator network of each mode;
step five, generating a multi-modal confrontation sample;
step six, respectively inputting the multi-modal confrontation samples in the step five into a corresponding disturbance generation network, a discriminator network and a target model f to construct a multi-modal generation loss function and a multi-modal discrimination loss function;
seventhly, respectively training the multi-modal generation loss function and the multi-modal identification loss function in the sixth step alternately, updating each disturbance generation network and each identification network, and finishing the training of the multi-modal anti-attack network;
step eight, inputting the test samples into the multi-modal counter attack network in the step seven, and generating corresponding test counter samples.
2. The attack-fighting method facing the multi-modal remote sensing image classification network as claimed in claim 1, wherein after the step eight, the method further comprises a step nine as follows: inputting training samples into a multi-modal counterattack network to obtain countersamples of the training samples, adding the countersamples of the training samples into a training sample set, and re-training the target model f in the second step; and repeating the third step to the eighth step.
3. The attack-fighting method facing multi-modal remote sensing image classification network according to claim 1 or 2, characterized in that the multi-modalities are optical remote sensing images and digital elevation model images.
4. The attack resisting method for the multi-modal remote sensing image classification network as claimed in claim 3, wherein the specific process of constructing the multi-modal generation loss function and the multi-modal identification loss function is as follows:
step 6.1, constructing a multi-modal generating loss function, as shown in formula (C):
wherein: the two terms of formula (C) are the adversarial loss of the optical remote sensing image and the adversarial loss of the digital elevation model image, respectively;
the hyper-parameters α, β and γ are the weight coefficients of the perception loss, the spoofing loss and the cooperative attack loss;
L_f is the spoofing loss, which drives samples to be misclassified into the t-th class; its definition is given in formula (H);
L_C is the cooperative attack loss, defined in formula (I);
wherein: D_T(x′_T) denotes the output probability obtained by inputting the generated optical remote sensing image adversarial sample x′_T into the optical remote sensing image discriminator network D_T; D_D(x′_D) denotes the output probability obtained by inputting the generated digital elevation model image adversarial sample x′_D into the digital elevation model image discriminator network D_D; the perception loss of the optical remote sensing image and the perception loss of the digital elevation model image are defined in formulas (F) and (G):
wherein: ε is a hyper-parameter bounding the allowable perturbation intensity, and ‖·‖_P denotes the L_P norm of the perturbation;
formula (H) is:
L f =l f (f(x′ T ,x′ D ),t) (H);
wherein: f is the target model to be attacked, l_f is the cross-entropy loss, and t is the target attack label obtained in step three;
formula (I) is:
L C =l C (T(G T (x T )),G D (x D )) (I);
wherein: t is a band variation function; l C Is a cosine similarity measure function;
step 6.2, constructing a multi-modal discrimination loss function, as shown in formula (J):
wherein: the two terms are the discriminator loss functions of the optical remote sensing image and the digital elevation model image, respectively, defined in formulas (K) and (L):
wherein: D_T(x_T) denotes the output probability obtained by inputting the original optical remote sensing image training sample x_T into the optical remote sensing image discriminator network D_T; D_D(x_D) denotes the output probability obtained by inputting the original digital elevation model image training sample x_D into the digital elevation model image discriminator network D_D; D_T(x′_T) denotes the output probability obtained by inputting the generated optical remote sensing image adversarial sample x′_T into the optical remote sensing image discriminator network D_T; D_D(x′_D) denotes the output probability obtained by inputting the generated digital elevation model image adversarial sample x′_D into the digital elevation model image discriminator network D_D.
5. The attack-resisting method facing the multi-modal remote sensing image classification network as claimed in claim 4, wherein in the second step, the target model f is a trained multi-source remote sensing image classification network.
6. The attack-resisting method facing the multi-modal remote sensing image classification network as claimed in claim 1 or 2, wherein in the third step, the predicted probability score of a sample, P = [p_1, p_2, ..., p_n]^T, is obtained by forward computation of the target model f;
One-Hot encoding is performed according to the sample's original label and the number of classes to obtain an encoding vector; if the original sample label is 1, the encoding vector is h = [1, 0, ..., 0]^T, whose length is the number of classes of the sample;
A reverse mask is obtained from the encoding vector; multiplying the reverse mask by the predicted probability score P gives the predicted probability values of all classes other than the sample's original label class, P′ = [0, p_2, ..., p_n]^T; P′ is then offset by subtracting a large value at the original label position, i.e. s = P′ − h × 1e10, and the position of the maximum of s is taken as the target class;
wherein: p_n denotes the output probability score of the nth class.
7. The method for resisting attack to the multi-modal remote sensing image classification network as claimed in claim 1 or 2, wherein in the fourth step, the multi-modal anti-attack network generates the network G by the disturbance of the optical remote sensing image T Optical remote sensing image discriminator network D T Digital elevation model image disturbance generation network G D And a digital elevation model image discriminator network D D And (4) forming.
8. The attack resisting method facing the multi-modal remote sensing image classification network as claimed in claim 1 or 2, characterized in that in the fifth step, the multi-modal resisting sample generation process is as follows:
training sample x of optical remote sensing image T Input to optical remote sensing image disturbance generation network G T In the method, optical remote sensing image disturbance is generated, and the generated optical remote sensing image disturbance is added to an input optical remote sensing image training sample to obtain a TOP confrontation sample x' T As shown in formula (A):
x′ T =x T +G T (x T ) (A);
wherein: g T (x T ) Representing TOP disturbance generated by an optical remote sensing image disturbance generation network.
Training sample x_D of the digital elevation model image is input to the digital elevation model image perturbation generation network G_D, generating the digital elevation model image perturbation; the generated perturbation is added to the input digital elevation model image training sample to obtain the adversarial sample x′_D of the digital elevation model image, as shown in formula (B):
x′ D =x D +G D (x D ) (B);
wherein: g D (x D ) And representing the image disturbance of the digital elevation model generated by the image disturbance generation network of the digital elevation model.
9. The attack resisting method facing the multi-modal remote sensing image classification network as claimed in claim 1 or 2, characterized in that in the eighth step, the specific process of attacking the multi-modal remote sensing image classification network by using the test sample is as follows:
The TOP test sample is input to the trained TOP perturbation generation network, which outputs a TOP perturbation; adding this perturbation to the input TOP test sample yields the adversarial sample of the TOP test sample. Simultaneously, the DSM test sample is input to the trained DSM perturbation generation network, which outputs a DSM perturbation; adding this perturbation to the input DSM test sample yields the adversarial sample of the DSM test sample. This completes the adversarial attack on the multi-modal remote sensing image classification network. The concrete expression is as follows:
10. the attack resisting method facing the multi-modal remote sensing image classification network as claimed in claim 2, wherein the specific process of the ninth step is as follows:
Training sample x_T of the optical remote sensing image is input to the trained optical remote sensing image perturbation generation network, which outputs an optical remote sensing image perturbation; this perturbation is added to the input optical remote sensing image training sample to obtain the adversarial sample x′_T of the optical remote sensing image training sample;
Training sample x_D of the digital elevation model image is input to the trained digital elevation model image perturbation generation network, which outputs a digital elevation model image perturbation; this perturbation is added to the input digital elevation model image training sample to obtain the adversarial sample x′_D of the digital elevation model image training sample; x′_T and x′_D are added to the training sample set to obtain a new training sample set T′ = {x_T, x_D, x′_T, x′_D}, and the target model f in step two is retrained.
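The target-class selection procedure of claim 6 can be illustrated numerically. The sketch below follows the claim's reverse-mask steps (hypothetical helper name; NumPy assumed):

```python
import numpy as np

def select_target_class(p, original_label):
    """Pick the attack target class t: one-hot encode the original
    label, invert it to a reverse mask, suppress the original class's
    probability, and take the argmax of the remaining scores."""
    h = np.zeros_like(p)
    h[original_label] = 1.0      # one-hot encoding vector
    reverse_mask = 1.0 - h       # 0 at the original class, 1 elsewhere
    p_masked = reverse_mask * p  # P' = probabilities of the other classes
    s = p_masked - h * 1e10      # s = P' - h * 1e10
    return int(np.argmax(s))
```

For example, for P = [0.6, 0.3, 0.1] with original label 0, the most probable remaining class, class 1, is chosen as the target, i.e. the attack steers the sample toward the class the model already finds second-most plausible.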
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211005572.XA CN115331079A (en) | 2022-08-22 | 2022-08-22 | Attack resisting method for multi-mode remote sensing image classification network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115331079A true CN115331079A (en) | 2022-11-11 |
Family
ID=83926881
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211005572.XA Pending CN115331079A (en) | 2022-08-22 | 2022-08-22 | Attack resisting method for multi-mode remote sensing image classification network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115331079A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116523032A (en) * | 2023-03-13 | 2023-08-01 | 之江实验室 | Image text double-end migration attack method, device and medium |
CN116523032B (en) * | 2023-03-13 | 2023-09-29 | 之江实验室 | Image text double-end migration attack method, device and medium |
CN115984635A (en) * | 2023-03-21 | 2023-04-18 | 自然资源部第一海洋研究所 | Multi-source remote sensing data classification model training method, classification method and electronic equipment |
CN116343050A (en) * | 2023-05-26 | 2023-06-27 | 成都理工大学 | Target detection method for remote sensing image noise annotation based on self-adaptive weight |
CN116343050B (en) * | 2023-05-26 | 2023-08-01 | 成都理工大学 | Target detection method for remote sensing image noise annotation based on self-adaptive weight |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109948663B (en) | Step-length self-adaptive attack resisting method based on model extraction | |
CN108537743B (en) | Face image enhancement method based on generation countermeasure network | |
CN115331079A (en) | Attack resisting method for multi-mode remote sensing image classification network | |
CN105095862B (en) | A kind of human motion recognition method based on depth convolution condition random field | |
CN108520202B (en) | Method for extracting image characteristics with robustness resistance based on variational spherical projection | |
Wang et al. | Defending dnn adversarial attacks with pruning and logits augmentation | |
CN110390308B (en) | Video behavior identification method based on space-time confrontation generation network | |
CN110826056B (en) | Recommended system attack detection method based on attention convolution self-encoder | |
CN112541865A (en) | Underwater image enhancement method based on generation countermeasure network | |
CN114463677B (en) | Safety helmet wearing detection method based on global attention | |
CN115311186B (en) | Cross-scale attention confrontation fusion method and terminal for infrared and visible light images | |
CN113627543A (en) | Anti-attack detection method | |
CN113627504B (en) | Multi-mode multi-scale feature fusion target detection method based on generation of countermeasure network | |
CN113095218B (en) | Hyperspectral image target detection algorithm | |
Deng et al. | Detecting C&W adversarial images based on noise addition-then-denoising | |
CN116824695A (en) | Pedestrian re-identification non-local defense method based on feature denoising | |
CN115375966A (en) | Image countermeasure sample generation method and system based on joint loss function | |
Sun et al. | Instance-level Trojan Attacks on Visual Question Answering via Adversarial Learning in Neuron Activation Space | |
CN111401155A (en) | Image identification method of residual error neural network based on implicit Euler jump connection | |
Wang et al. | A fall detection system based on convolutional neural networks | |
CN113837360B (en) | DNN robust model reinforcement method based on relational graph | |
CN115797711B (en) | Improved classification method for countermeasure sample based on reconstruction model | |
Sinha et al. | CAPTCHA Recognition And Analysis Using Custom Based CNN Model-Capsecure | |
Patel et al. | Image Forgery Detection using CNN | |
CN112836605B (en) | Near-infrared and visible light cross-modal face recognition method based on modal augmentation |
Legal Events

Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||