CN115331079A - Adversarial attack method for a multi-modal remote sensing image classification network - Google Patents

Adversarial attack method for a multi-modal remote sensing image classification network

Info

Publication number
CN115331079A
CN115331079A
Authority
CN
China
Prior art keywords: remote sensing, network, sensing image, disturbance, modal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211005572.XA
Other languages
Chinese (zh)
Inventor
石程
党叶楠
赵明华
苗启广
潘治文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Technology
Original Assignee
Xian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Technology filed Critical Xian University of Technology
Priority to CN202211005572.XA
Publication of CN115331079A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 - using classification, e.g. of video objects
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/776 - Validation; Performance evaluation
    • G06V10/82 - using neural networks
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/10 - Terrestrial scenes
    • G06V20/17 - Terrestrial scenes taken from planes or by drones
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods


Abstract

The invention discloses an adversarial attack method for a multi-modal remote sensing image classification network. For two remote sensing image data sources, a three-band optical remote sensing image (TOP) and a single-band digital elevation model image (DSM), the method provides an adversarial attack technique for the multi-modal remote sensing image classification network that achieves a more pronounced attack effect and higher attack time efficiency, and is used to evaluate and improve the robustness of multi-modal remote sensing image classification networks.

Description

Adversarial attack method for a multi-modal remote sensing image classification network
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to an adversarial attack method for a multi-modal remote sensing image classification network.
Background
Remote sensing image classification remains a challenging task in the remote sensing field. In recent years, with the rapid development of aerospace technology, a large number of remote sensing images from different sensors have been produced, and effective classification of ground objects by combining images from multiple sensors has become a research hotspot. Multi-modal remote sensing images observe ground objects in different ways: for example, a TOP image expresses high-spatial-resolution information of a remote sensing scene, while a digital surface model (DSM) image provides height information of ground objects. Multi-modal remote sensing image classification methods, by exploiting the complementary advantages of multi-modal remote sensing data, can effectively reduce the uncertainty in the classification task. In recent years, deep neural networks have been widely introduced into the multi-modal remote sensing image classification task. Compared with single-modal remote sensing image classification methods, multi-modal methods can effectively exploit the correlation between the two modalities and achieve higher classification accuracy.
The vulnerability of deep neural networks has also attracted wide attention in recent years. Adding a tiny perturbation that is indistinguishable to the human eye to an original clean sample can make a deep neural network produce wrong predictions with high confidence; the perturbed sample is called an adversarial sample, and the process of generating adversarial samples is called an adversarial attack. It is therefore necessary to make networks more robust, to understand in advance the risks a network faces, and to find adversarial samples with stronger attack capability. Various attack methods have been proposed to generate adversarial samples, such as the Fast Gradient Sign Method (FGSM), the Iterative Fast Gradient Sign Method (I-FGSM), the optimization-based C&W algorithm, and the DeepFool algorithm. However, existing adversarial attack techniques only consider the attack effect in a single modality and lack attack implementations for multi-modal classification networks. When attacking a multi-modal remote sensing image classification network, one must consider not only the attack success rate, the concealment of the perturbation, and the timeliness of the attack, but also the cooperative attack capability across the different modalities. Therefore, for multi-modal remote sensing image classification networks, high-quality multi-modal adversarial samples need to be generated in order to evaluate and further improve the robustness of such networks.
Disclosure of Invention
The invention aims to provide an adversarial attack method for a multi-modal remote sensing image classification network that has a more pronounced attack effect and higher attack time efficiency.
The invention adopts the following technical scheme: an adversarial attack method for a multi-modal remote sensing image classification network, comprising the following steps:
Step one, construct a multi-modal training sample set T and a test sample set S;
Step two, input the training sample set T and test sample set S of step one into a target model f to be attacked;
Step three, construct a target attack class t according to the target model f to be attacked of step two;
Step four, build a multi-modal adversarial attack network; the multi-modal adversarial attack network consists of a perturbation generation network and a discriminator network for each modality;
Step five, generate multi-modal adversarial samples;
Step six, input the multi-modal adversarial samples of step five into the corresponding perturbation generation networks, discriminator networks, and target model f, and construct a multi-modal generation loss function and a multi-modal discrimination loss function;
Step seven, alternately train with the multi-modal generation loss function and the multi-modal discrimination loss function of step six, updating each perturbation generation network and each discriminator network, to complete the training of the multi-modal adversarial attack network;
Step eight, input the test samples into the multi-modal adversarial attack network of step seven to generate the corresponding test adversarial samples.
Further, after step eight, a step nine is also included, as follows: input the training samples into the multi-modal adversarial attack network to obtain adversarial samples of the training samples, add these adversarial samples to the training sample set, and retrain the target model f of step two; then repeat steps three to eight.
Further, the multiple modalities are optical remote sensing images and digital elevation model images.
Further, the specific process of constructing the multi-modal generation loss function and the multi-modal discrimination loss function is as follows:
Step 6.1, construct the multi-modal generation loss function, as shown in formula (C):
$L_G = L_{GAN}^{T} + L_{GAN}^{D} + \alpha (L_{per}^{T} + L_{per}^{D}) + \beta L_f + \gamma L_C$ (C)
wherein: $L_{GAN}^{T}$ and $L_{GAN}^{D}$ are the adversarial losses of the optical remote sensing image modality and of the digital elevation model image modality, respectively;
the hyper-parameters $\alpha$, $\beta$ and $\gamma$ are the weight coefficients of the perception loss, the spoofing loss and the cooperative attack loss;
$L_f$ is the spoofing loss, which drives the sample to be misclassified into the t-th class; it is defined in formula (H);
$L_C$ is the cooperative loss, defined in formula (I);
$L_{GAN}^{T}$ and $L_{GAN}^{D}$ are defined in formulas (D) and (E):
$L_{GAN}^{T} = \mathbb{E}_{x_T}[\log(1 - D_T(x'_T))]$ (D)
$L_{GAN}^{D} = \mathbb{E}_{x_D}[\log(1 - D_D(x'_D))]$ (E)
wherein: $D_T(x'_T)$ is the output probability obtained by inputting the generated optical remote sensing image adversarial sample $x'_T$ into the optical remote sensing image discriminator network $D_T$; $D_D(x'_D)$ is the output probability obtained by inputting the generated digital elevation model image adversarial sample $x'_D$ into the digital elevation model image discriminator network $D_D$;
$L_{per}^{T}$ and $L_{per}^{D}$ denote the perception loss of the optical remote sensing image and of the digital elevation model image, respectively, defined in formulas (F) and (G):
$L_{per}^{T} = \mathbb{E}_{x_T}[\max(\varepsilon, \|G_T(x_T)\|_p)]$ (F)
$L_{per}^{D} = \mathbb{E}_{x_D}[\max(\varepsilon, \|G_D(x_D)\|_p)]$ (G)
wherein: $\varepsilon$ is a hyper-parameter controlling the minimum allowable disturbance intensity, and $\|\cdot\|_p$ denotes the $L_p$ norm of the disturbance;
formula (H) is:
$L_f = l_f(f(x'_T, x'_D), t)$ (H)
wherein: f is the target model to be attacked, $l_f$ is the cross-entropy loss, and t is the target attack label obtained in step three;
formula (I) is:
$L_C = l_C(T(G_T(x_T)), G_D(x_D))$ (I)
wherein: T is a band variation function and $l_C$ is a cosine similarity measure function;
Step 6.2, construct the multi-modal discrimination loss function, as shown in formula (J):
$L_D = L_D^{T} + L_D^{D}$ (J)
wherein: $L_D^{T}$ and $L_D^{D}$ are the discriminator loss functions of the optical remote sensing image and of the digital elevation model image, respectively, defined in formulas (K) and (L):
$L_D^{T} = -\mathbb{E}_{x_T}[\log D_T(x_T)] - \mathbb{E}_{x_T}[\log(1 - D_T(x'_T))]$ (K)
$L_D^{D} = -\mathbb{E}_{x_D}[\log D_D(x_D)] - \mathbb{E}_{x_D}[\log(1 - D_D(x'_D))]$ (L)
wherein: $D_T(x_T)$ is the output probability obtained by inputting the original optical remote sensing image training sample $x_T$ into the optical remote sensing image discriminator network $D_T$; $D_D(x_D)$ is the output probability obtained by inputting the original digital elevation model image training sample $x_D$ into the digital elevation model image discriminator network $D_D$; $D_T(x'_T)$ is the output probability obtained by inputting the generated optical remote sensing image adversarial sample $x'_T$ into $D_T$; $D_D(x'_D)$ is the output probability obtained by inputting the generated digital elevation model image adversarial sample $x'_D$ into $D_D$.
Further, in step two, the target model f is a trained multi-source remote sensing image classification network.
Further, in step three, the prediction probability score $P = [p_1, p_2, \ldots, p_n]^T$ of a sample is obtained by forward computation of the target model f;
One-Hot encoding is performed according to the original label and the number of classes of the sample to obtain an encoded vector; if the original sample label is 1, the encoded vector is $h = [1, 0, \ldots, 0]^T$, whose length equals the number of classes;
a reverse mask $\bar{h} = 1 - h$ is obtained from the encoded vector;
the reverse mask is multiplied element-wise with the prediction probability score P to obtain the prediction probability values of all classes other than the original label class, $p' = [0, p_2, \ldots, p_n]^T$; a difference is computed on $p'$ by subtracting a large value at the position of the original label, i.e. $s = p' - h \times 10^{10}$, and the index of the maximum of s is taken as the target class;
wherein: $p_n$ denotes the probability score of the n-th class.
Further, in step four, the multi-modal adversarial attack network is composed of an optical remote sensing image perturbation generation network $G_T$, an optical remote sensing image discriminator network $D_T$, a digital elevation model image perturbation generation network $G_D$, and a digital elevation model image discriminator network $D_D$.
Further, in step five, the multi-modal adversarial samples are generated as follows:
the optical remote sensing image training sample $x_T$ is input into the optical remote sensing image perturbation generation network $G_T$ to generate the optical remote sensing image perturbation, and the generated perturbation is added to the input optical remote sensing image training sample to obtain the TOP adversarial sample $x'_T$, as shown in formula (A):
$x'_T = x_T + G_T(x_T)$ (A)
wherein: $G_T(x_T)$ denotes the TOP perturbation generated by the optical remote sensing image perturbation generation network;
the digital elevation model image training sample $x_D$ is input into the digital elevation model image perturbation generation network $G_D$ to generate the digital elevation model image perturbation, and the generated perturbation is added to the input digital elevation model image training sample to obtain the DSM adversarial sample $x'_D$, as shown in formula (B):
$x'_D = x_D + G_D(x_D)$ (B)
wherein: $G_D(x_D)$ denotes the DSM perturbation generated by the digital elevation model image perturbation generation network.
Further, in step eight, the specific process of attacking the multi-modal remote sensing image classification network with the test samples is as follows:
the TOP test sample $\tilde{x}_T$ is input into the trained TOP perturbation generation network, which outputs the TOP perturbation $G_T(\tilde{x}_T)$; the TOP perturbation is added to the input TOP test sample to obtain the adversarial sample $\tilde{x}'_T$ of the TOP test sample; at the same time, the DSM test sample $\tilde{x}_D$ is input into the trained DSM perturbation generation network, which outputs the DSM perturbation $G_D(\tilde{x}_D)$; the DSM perturbation is added to the input DSM test sample to obtain the adversarial sample $\tilde{x}'_D$ of the DSM test sample, completing the adversarial attack on the multi-modal remote sensing image classification network; concretely:
$\tilde{x}'_T = \tilde{x}_T + G_T(\tilde{x}_T)$
$\tilde{x}'_D = \tilde{x}_D + G_D(\tilde{x}_D)$
further, the specific process of step nine is as follows:
training sample x of optical remote sensing image T Is inputted intoThe trained good optical remote sensing image disturbance generates network output optical remote sensing image disturbance, the optical remote sensing image disturbance is added with the input optical remote sensing image training sample to obtain a countermeasure sample x 'of the optical remote sensing image training sample' T
Training sample x of digital elevation model image D Inputting the image disturbance of the trained digital elevation model into a disturbance generation network, outputting the image disturbance of the digital elevation model, and adding the image disturbance of the digital elevation model and the input image training sample of the digital elevation model to obtain a countermeasure sample x 'of the image training sample of the digital elevation model' D (ii) a X' T And x' D Adding the training samples into the training sample set to obtain a new training sample set T', namely { x T ,x D ,x′ T ,x′ D E, T', and retraining the target model f in the step two.
The invention has the following beneficial effects: 1. The method generates multi-source perturbations jointly across modalities, so the generated adversarial samples are more realistic. 2. A multi-modal generation loss function and a multi-modal discrimination loss function are designed to establish the relation between the perturbations of different modalities, which effectively reduces the perturbation intensity added to each data source while maintaining a high attack success rate.
Drawings
FIG. 1 is a flow diagram of the adversarial attack method for a multi-source remote sensing image classification network according to the invention;
FIG. 2 shows the datasets used in the experiments of the invention: 2a is the Potsdam dataset; 2b is the Vaihingen dataset;
FIG. 3 is a schematic diagram of the structures of the target model, the perturbation generation networks, and the discriminator networks of the invention;
FIG. 4 shows classification maps of different methods on the Potsdam dataset before and after adversarial training;
FIG. 5 shows classification maps of different methods on the Vaihingen dataset before and after adversarial training.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention discloses an adversarial attack method for a multi-modal remote sensing image classification network which, as shown in FIG. 1, comprises the following steps:
Step one, construct a multi-modal training sample set T and a test sample set S.
The two modalities are a three-band optical remote sensing image (TOP) and a single-band digital elevation model image (DSM).
The two multi-modal data sources, TOP and DSM, and the real class map corresponding to the two modalities are input, as shown in FIG. 2. For each pixel of the TOP and DSM data, a spatial window of 27 × 27 pixels centred on that pixel is defined to extract samples; TOP samples and DSM samples are extracted in pairs on TOP and DSM, respectively, to form sample pairs, and all sample pairs form the sample set. Part of the sample pairs are selected to form the training sample set, and the remaining sample pairs form the test sample set, where $\{x_T, x_D\} \in T$ and $\{\tilde{x}_T, \tilde{x}_D\} \in S$; $x_T$ and $\tilde{x}_T$ are the TOP training and test samples, and $x_D$ and $\tilde{x}_D$ are the DSM training and test samples, respectively.
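For illustration only, the paired patch extraction of step one can be sketched in Python as follows. This sketch is not part of the original disclosure: the function and array names and the convention that -1 marks unlabelled pixels are assumptions; only the 27 × 27 window and the paired TOP/DSM sampling come from the text above.

```python
import numpy as np

def extract_pairs(top, dsm, labels, win=27):
    """Extract co-registered TOP/DSM patch pairs centred on each labelled pixel.

    top: (H, W, 3) TOP image; dsm: (H, W) DSM image;
    labels: (H, W) class map where -1 marks unlabelled pixels (assumption).
    """
    r = win // 2
    pairs = []
    h, w = labels.shape
    for i in range(r, h - r):
        for j in range(r, w - r):
            if labels[i, j] < 0:                                  # skip unlabelled pixels
                continue
            top_patch = top[i - r:i + r + 1, j - r:j + r + 1, :]  # 27 x 27 x 3
            dsm_patch = dsm[i - r:i + r + 1, j - r:j + r + 1, None]  # 27 x 27 x 1
            pairs.append((top_patch, dsm_patch, labels[i, j]))
    return pairs
```

The resulting list of pairs can then be split into the training sample set T and the test sample set S.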
Step two, input the training sample set T and test sample set S of step one into the target model f to be attacked.
In step two, the target model f is a trained multi-source remote sensing image classification network. As shown in FIG. 3(a), its structure is as follows: two data sources are input, the TOP input size being 28 × 28 × 3 and the DSM input size 28 × 28 × 1; each data source has its own 5-layer network to extract basic features, the outputs of the two branches are then concatenated, and a further 4-layer network extracts the common features of the two data sources and performs the classification. The network parameters of the target model f are listed in Table 1; the target model is trained in advance, and its parameters are kept unchanged during the attack.
TABLE 1 Target model f network parameters
[Table 1 appears only as an image in the source and is not reproduced here.]
Step three, construct the target attack class t according to the target model f to be attacked of step two.
In step three, each sample pair $x_T, x_D$ in the training sample set T is passed through the pre-trained target model f in a forward computation to obtain the output probability score of the sample on each class, and the target attack class is selected according to these scores. Assume the training sample pair is $x_T, x_D$ and the output probability score is $P = [p_1, p_2, \ldots, p_n]^T$, where $p_n$ denotes the probability score of the n-th class. The target class is constructed according to the following principle:
the prediction probability score $P = [p_1, p_2, \ldots, p_n]^T$ of the sample is obtained by forward computation of the target model f;
One-Hot encoding is performed according to the original label and the number of classes of the sample to obtain an encoded vector; if the original sample label is 1, the encoded vector is $h = [1, 0, \ldots, 0]^T$, whose length equals the number of classes;
a reverse mask $\bar{h} = 1 - h$ is obtained from the encoded vector;
the reverse mask is multiplied element-wise with the prediction probability score P to obtain the prediction probability values of all classes other than the original label class, $p' = [0, p_2, \ldots, p_n]^T$. To avoid the target class label coinciding with the original class label, a difference is computed on $p'$ by subtracting a large value at the position of the original label, i.e. $s = p' - h \times 10^{10}$, and the index of the maximum of s is taken as the target class;
wherein: $p_n$ denotes the probability score of the n-th class.
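A minimal sketch of this target-class selection rule, assuming PyTorch tensors; the function name is illustrative, while the One-Hot mask, the reverse mask, and the subtraction of a large value at the original label position all follow the text above.

```python
import torch

def select_target_class(p: torch.Tensor, label: int) -> int:
    """p: (n,) probability scores of one sample from f; label: original class index."""
    h = torch.zeros_like(p)
    h[label] = 1.0                 # One-Hot encoding of the original label
    p_prime = (1.0 - h) * p        # reverse mask zeroes the original class score
    s = p_prime - h * 1e10         # push the original class far below all others
    return int(torch.argmax(s))    # most-probable class other than the original
```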
Step four, build the multi-modal adversarial attack network; the multi-modal adversarial attack network consists of a perturbation generation network and a discriminator network for each modality.
In step four, the multi-modal adversarial attack network is composed of the optical remote sensing image perturbation generation network $G_T$, the optical remote sensing image discriminator network $D_T$, the digital elevation model image perturbation generation network $G_D$, and the digital elevation model image discriminator network $D_D$, as shown in FIG. 3(b) and (c). The perturbation generation network is an encoder-decoder structure, in which the layer-1 convolution of $G_T$ takes a 28 × 28 × 3 input and that of $G_D$ a 28 × 28 × 1 input; the number of filters in the third layer of the decoder depends on the number of channels of the input image, namely 3 for $G_T$ and 1 for $G_D$. Apart from these settings, $G_T$ and $G_D$ share the same parameter settings, as shown in Table 2. The layer-1 convolution of the TOP discriminator network $D_T$ takes a 28 × 28 × 3 input and that of the DSM discriminator network $D_D$ a 28 × 28 × 1 input; otherwise $D_T$ and $D_D$ share the same parameter settings, as shown in Table 3.
TABLE 2 Perturbation generation network parameters
[Table 2 appears only as an image in the source and is not reproduced here.]
TABLE 3 Discriminator network parameters
[Table 3 appears only as an image in the source and is not reproduced here.]
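Because Tables 2 and 3 are available only as images, the following PyTorch sketch shows one plausible encoder-decoder perturbation generator and discriminator consistent with the stated input sizes (28 × 28 × 3 for TOP, 28 × 28 × 1 for DSM); all layer widths, kernel sizes, and activations are the editor's assumptions, not the patented configuration.

```python
import torch.nn as nn

class PerturbGenerator(nn.Module):
    """Encoder-decoder perturbation generator G_T or G_D (layer widths assumed)."""
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),          # 28 -> 14
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),             # 14 -> 7
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),    # 7 -> 14
            nn.ConvTranspose2d(32, in_ch, 4, stride=2, padding=1), nn.Tanh(), # 14 -> 28
        )
    def forward(self, x):
        return self.net(x)          # raw perturbation with the input's channel count

class Discriminator(nn.Module):
    """Real/adversarial discriminator D_T or D_D (widths assumed)."""
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 1), nn.Sigmoid(),     # output probability of "real"
        )
    def forward(self, x):
        return self.net(x)

G_T, D_T = PerturbGenerator(3), Discriminator(3)   # TOP branch (3 bands)
G_D, D_D = PerturbGenerator(1), Discriminator(1)   # DSM branch (1 band)
```

An adversarial pair is then formed exactly as in formulas (A) and (B): `adv_T = x_T + G_T(x_T)` and `adv_D = x_D + G_D(x_D)`. The final Tanh only bounds the raw perturbation; the perception loss of step six is what actually constrains its intensity.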
Step five, generate the multi-modal adversarial samples.
In step five, the multi-modal adversarial samples are generated as follows:
the optical remote sensing image training sample $x_T$ is input into the optical remote sensing image perturbation generation network $G_T$ to generate the optical remote sensing image perturbation, and the generated perturbation is added to the input optical remote sensing image training sample to obtain the TOP adversarial sample $x'_T$, as shown in formula (A):
$x'_T = x_T + G_T(x_T)$ (A)
wherein: $G_T(x_T)$ denotes the TOP perturbation generated by the optical remote sensing image perturbation generation network;
the digital elevation model image training sample $x_D$ is input into the digital elevation model image perturbation generation network $G_D$ to generate the digital elevation model image perturbation, and the generated perturbation is added to the input digital elevation model image training sample to obtain the DSM adversarial sample $x'_D$, as shown in formula (B):
$x'_D = x_D + G_D(x_D)$ (B)
wherein: $G_D(x_D)$ denotes the DSM perturbation generated by the digital elevation model image perturbation generation network.
Step six, input the multi-modal adversarial samples of step five into the corresponding perturbation generation networks, discriminator networks, and target model f, and construct the multi-modal generation loss function and the multi-modal discrimination loss function.
In step six, the specific process of constructing the multi-modal generation loss function and the multi-modal discrimination loss function is as follows:
The TOP adversarial samples are input into the TOP discriminator and the target model f, and the DSM adversarial samples are input into the DSM discriminator and the target model f. The multi-modal generation loss function and the multi-modal discrimination loss function are then constructed from the TOP perturbation, the TOP adversarial sample, the discriminator output for TOP and the target-model output for TOP, together with the DSM perturbation, the DSM adversarial sample, the discriminator output for DSM and the target-model output for DSM. The concrete construction is as follows:
Step 6.1, construct the multi-modal generation loss function, as shown in formula (C):
$L_G = L_{GAN}^{T} + L_{GAN}^{D} + \alpha (L_{per}^{T} + L_{per}^{D}) + \beta L_f + \gamma L_C$ (C)
wherein: $L_{GAN}^{T}$ and $L_{GAN}^{D}$ are the adversarial losses of the optical remote sensing image modality and of the digital elevation model image modality, respectively; their constraints make the generated TOP adversarial samples closer to the input TOP training samples and the generated DSM adversarial samples closer to the input DSM training samples.
The hyper-parameters $\alpha$, $\beta$ and $\gamma$ are the weight coefficients of the perception loss, the spoofing loss and the cooperative attack loss.
$L_f$ is the spoofing loss, which drives the sample to be misclassified into class t; it is defined in formula (H). $L_C$ is the cooperative loss that realizes the multi-modal cooperative attack; it is defined in formula (I).
$L_{GAN}^{T}$ and $L_{GAN}^{D}$ are defined in formulas (D) and (E):
$L_{GAN}^{T} = \mathbb{E}_{x_T}[\log(1 - D_T(x'_T))]$ (D)
$L_{GAN}^{D} = \mathbb{E}_{x_D}[\log(1 - D_D(x'_D))]$ (E)
wherein: $D_T(x'_T)$ is the output probability obtained by inputting the generated optical remote sensing image adversarial sample $x'_T$ into the optical remote sensing image discriminator network $D_T$; $D_D(x'_D)$ is the output probability obtained by inputting the generated digital elevation model image adversarial sample $x'_D$ into the digital elevation model image discriminator network $D_D$.
$L_{per}^{T}$ and $L_{per}^{D}$ denote the optical remote sensing image perception loss and the digital elevation model image perception loss, which restrict the intensities of the TOP perturbation and the DSM perturbation, respectively; they are defined in formulas (F) and (G):
$L_{per}^{T} = \mathbb{E}_{x_T}[\max(\varepsilon, \|G_T(x_T)\|_p)]$ (F)
$L_{per}^{D} = \mathbb{E}_{x_D}[\max(\varepsilon, \|G_D(x_D)\|_p)]$ (G)
wherein: $\varepsilon$ is a hyper-parameter controlling the minimum allowable disturbance intensity, and $\|\cdot\|_p$ denotes the $L_p$ norm of the disturbance. In the embodiment this norm constrains the perturbation, forcing the generated adversarial samples closer to the real samples.
Formula (H) is:
$L_f = l_f(f(x'_T, x'_D), t)$ (H)
wherein: f is the target model to be attacked, $l_f$ is the cross-entropy loss, and t is the target attack label obtained in step three.
Formula (I) is:
$L_C = l_C(T(G_T(x_T)), G_D(x_D))$ (I)
wherein: T is a band variation function; since the TOP data has three bands while the DSM perturbation has one band, the DSM perturbation needs to be repeatedly expanded to three bands for the similarity calculation. $l_C$ is a cosine similarity measure function used to measure the similarity of the perturbations under the different modalities.
Step 6.2, construct the multi-modal discrimination loss function, as shown in formula (J):
$L_D = L_D^{T} + L_D^{D}$ (J)
wherein: $L_D^{T}$ and $L_D^{D}$ are the discriminator loss functions of the optical remote sensing image and of the digital elevation model image, respectively, defined in formulas (K) and (L):
$L_D^{T} = -\mathbb{E}_{x_T}[\log D_T(x_T)] - \mathbb{E}_{x_T}[\log(1 - D_T(x'_T))]$ (K)
$L_D^{D} = -\mathbb{E}_{x_D}[\log D_D(x_D)] - \mathbb{E}_{x_D}[\log(1 - D_D(x'_D))]$ (L)
wherein: $D_T(x_T)$ is the output probability obtained by inputting the original optical remote sensing image training sample $x_T$ into the optical remote sensing image discriminator network $D_T$; $D_D(x_D)$ is the output probability obtained by inputting the original digital elevation model image training sample $x_D$ into the digital elevation model image discriminator network $D_D$; $D_T(x'_T)$ is the output probability obtained by inputting the generated optical remote sensing image adversarial sample $x'_T$ into $D_T$; $D_D(x'_D)$ is the output probability obtained by inputting the generated digital elevation model image adversarial sample $x'_D$ into $D_D$.
Step seven, alternately train with the multi-modal generation loss function and the multi-modal discrimination loss function of step six, updating each perturbation generation network and each discriminator network, to complete the training of the multi-modal adversarial attack network.
The model is optimized by alternating the multi-modal generator loss and the multi-modal discriminator loss, training the TOP perturbation generation network, the TOP discriminator network, the DSM perturbation generation network, and the DSM discriminator network in turn, as follows:
Step 7.1, update the TOP discriminator $D_T$: keeping the parameters of the DSM discriminator $D_D$, the DSM perturbation generator $G_D$ and the TOP perturbation generator $G_T$ unchanged, train $D_T$ by minimizing formula (M) with gradient descent:
$\min_{D_T} L_D^{T}$ (M)
Step 7.2, update the TOP perturbation generator $G_T$: keeping the parameters of the TOP discriminator $D_T$, the DSM perturbation generator $G_D$ and the DSM discriminator $D_D$ unchanged, update $G_T$ by minimizing formula (N) with gradient descent:
$\min_{G_T} L_{GAN}^{T} + \alpha L_{per}^{T} + \beta L_f + \gamma L_C$ (N)
Step 7.3, update the DSM discriminator $D_D$: keeping the parameters of the DSM perturbation generator $G_D$, the TOP discriminator $D_T$ and the TOP perturbation generator $G_T$ unchanged, update $D_D$ by minimizing formula (O) with gradient descent:
$\min_{D_D} L_D^{D}$ (O)
Step 7.4, update the DSM perturbation generator $G_D$: keeping the parameters of the DSM discriminator $D_D$, the TOP perturbation generator $G_T$ and the TOP discriminator $D_T$ unchanged, update $G_D$ by minimizing formula (P) with gradient descent:
$\min_{G_D} L_{GAN}^{D} + \alpha L_{per}^{D} + \beta L_f + \gamma L_C$ (P)
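The alternating schedule of steps 7.1 through 7.4 maps onto four optimizers, as in the sketch below; the optimizer type, learning rate, and `loader` (which is assumed to yield sample-pair batches together with the target labels t from step three) are the editor's assumptions, and `generator_loss` / `discriminator_loss` are the sketches from step six.

```python
import torch

opt_DT = torch.optim.Adam(D_T.parameters(), lr=2e-4)
opt_GT = torch.optim.Adam(G_T.parameters(), lr=2e-4)
opt_DD = torch.optim.Adam(D_D.parameters(), lr=2e-4)
opt_GD = torch.optim.Adam(G_D.parameters(), lr=2e-4)

for x_T, x_D, t in loader:
    # 7.1: update D_T with all other networks frozen (formula (M))
    opt_DT.zero_grad()
    discriminator_loss(D_T, x_T, x_T + G_T(x_T).detach()).backward()
    opt_DT.step()
    # 7.2: update G_T only (formula (N)); gradients flow through the frozen D_T
    opt_GT.zero_grad()
    generator_loss(x_T, x_D, G_T, G_D, D_T, D_D, f, t).backward()
    opt_GT.step()
    # 7.3: update D_D with all other networks frozen (formula (O))
    opt_DD.zero_grad()
    discriminator_loss(D_D, x_D, x_D + G_D(x_D).detach()).backward()
    opt_DD.step()
    # 7.4: update G_D only (formula (P))
    opt_GD.zero_grad()
    generator_loss(x_T, x_D, G_T, G_D, D_T, D_D, f, t).backward()
    opt_GD.step()
```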
Step eight, input the test samples into the multi-modal adversarial attack network of step seven to generate the corresponding test adversarial samples.
In step eight, the specific process of attacking the multi-modal remote sensing image classification network with the test samples is as follows:
the TOP test sample $\tilde{x}_T$ is input into the trained TOP perturbation generation network, which outputs the TOP perturbation $G_T(\tilde{x}_T)$; the TOP perturbation is added to the input TOP test sample to obtain the adversarial sample $\tilde{x}'_T$ of the TOP test sample. At the same time, the DSM test sample $\tilde{x}_D$ is input into the trained DSM perturbation generation network, which outputs the DSM perturbation $G_D(\tilde{x}_D)$; the DSM perturbation is added to the input DSM test sample to obtain the adversarial sample $\tilde{x}'_D$ of the DSM test sample, completing the adversarial attack on the multi-modal remote sensing image classification network. Concretely:
$\tilde{x}'_T = \tilde{x}_T + G_T(\tilde{x}_T)$
$\tilde{x}'_D = \tilde{x}_D + G_D(\tilde{x}_D)$
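At test time the attack reduces to two forward passes, as in this brief sketch (the `test_loader` name is an assumption):

```python
import torch

G_T.eval(); G_D.eval()
with torch.no_grad():
    for x_T, x_D, _ in test_loader:
        adv_T = x_T + G_T(x_T)                 # TOP test adversarial sample
        adv_D = x_D + G_D(x_D)                 # DSM test adversarial sample
        pred = f(adv_T, adv_D).argmax(dim=1)   # attacked predictions of f
```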
the ninth step is as follows: inputting training samples into a multi-modal counterattack network to obtain countersamples of the training samples, adding the countersamples of the training samples into a training sample set, and re-training the target model f in the second step; and repeating the third step to the eighth step.
The method comprises the following specific steps: optical remote sensing imageTraining sample x T Inputting the optical remote sensing image disturbance into a trained optical remote sensing image disturbance to generate a network output optical remote sensing image disturbance, and adding the optical remote sensing image disturbance and an input optical remote sensing image training sample to obtain a countermeasure sample x 'of the optical remote sensing image training sample' T
Training sample x of digital elevation model image D Inputting the image disturbance of the trained digital elevation model into a disturbance generation network, outputting the image disturbance of the digital elevation model, and adding the image disturbance of the digital elevation model and the input image training sample of the digital elevation model to obtain a countermeasure sample x 'of the image training sample of the digital elevation model' D (ii) a X' T And x' D Adding the new training sample set T 'into the training sample set to obtain a new training sample set T', namely { x T ,x D ,x′ T ,x′ D E, the target model f in the step two is trained again to enhance the capability of the model to cope with attacks.
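A hedged sketch of this adversarial-retraining loop, assuming the training tensors `train_T`, `train_D`, `train_y` and the modules from the previous sketches are available; the dataset wrappers and batch size are assumptions.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

with torch.no_grad():
    adv_T = train_T + G_T(train_T)   # x'_T for every TOP training sample
    adv_D = train_D + G_D(train_D)   # x'_D for every DSM training sample

augmented = ConcatDataset([
    TensorDataset(train_T, train_D, train_y),
    TensorDataset(adv_T, adv_D, train_y),    # adversarial copies keep true labels
])
retrain_loader = DataLoader(augmented, batch_size=64, shuffle=True)
# Retrain the target model f on `retrain_loader`, then repeat steps three to eight.
```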
The effect of the method of the present invention can be further illustrated by the following simulation experiments:
(1) Simulation conditions:
The hardware for the simulation: Windows 10; GPU: NVIDIA GeForce RTX 3060. The software platform: MATLAB R2016a and PyCharm.
The images selected for the simulation come from the Potsdam and Vaihingen datasets. The Potsdam dataset contains 28 drone images of 6000 × 6000 pixels at a resolution of 5 cm/pixel, comprising three-channel image data, terrain data and label data; a pair of DSM and TOP data is randomly selected as the input of the multi-source remote sensing image network, as shown in 2a of FIG. 2. The Vaihingen dataset contains 33 remote sensing images of different sizes, with three-band TOP data and single-band DSM data at a resolution of 9 cm/pixel; a pair of DSM and TOP data is selected as the input of the multi-source remote sensing image classification network, as shown in 2b of FIG. 2. In the invention, 10000 pixels per class are randomly selected from the Potsdam dataset to construct training samples, and 800 pixels per class are randomly selected from the Vaihingen dataset to construct training samples.
(2) Simulation content and results:
Simulation 1: classification simulations were carried out on the two datasets shown in FIG. 2 with the method of the invention and three existing techniques; the results are as follows.
FIGS. 4(a)-(d) show the classification maps produced by the target model f on test adversarial samples generated on the Potsdam image dataset by FGSM, C&W, PGD and the technique of the invention, respectively, and FIG. 4(i) shows the classification map of the network on the original test samples. As the figures show, compared with the classification map of the original test samples, the adversarial samples generated by all four techniques cause the target model to make obvious misclassifications, and maps (a)-(d) differ visibly from map (i). However, the classification map obtained by the invention differs from map (i) the most and contains more misclassified regions, indicating that the adversarial samples generated by the technique of the invention are more aggressive.
FIGS. 5(a)-(d) show the classification maps produced by the target model f on test adversarial samples generated on the Vaihingen image dataset by FGSM, C&W, PGD and the technique of the invention, respectively, and FIG. 5(i) shows the classification map of the network on the original test samples; FIG. 5 exhibits the same result as FIG. 4.
Numerical comparisons of the classification results of the method of the invention and the existing FGSM, C&W and PGD techniques before and after adversarial training on the two datasets are given in Tables 4 and 5:
TABLE 4 Numerical comparison of classification results before and after adversarial training on the Potsdam dataset for the method of the invention and the existing FGSM, C&W and PGD techniques
[Table 4 appears only as an image in the source and is not reproduced here.]
TABLE 5 Numerical comparison of classification results before and after adversarial training on the Vaihingen dataset for the method of the invention and the existing FGSM, C&W and PGD techniques
[Table 5 appears only as an image in the source and is not reproduced here.]
The data in Tables 4 and 5 show that the technique of the invention has a shorter test time and higher attack time efficiency, giving it clear advantages in attacking multi-modal data.

Claims (10)

1. An adversarial attack method for a multi-modal remote sensing image classification network, characterized by comprising the following steps:
step one, construct a multi-modal training sample set T and a test sample set S;
step two, input the training sample set T and test sample set S of step one into a target model f to be attacked;
step three, construct a target attack class t according to the target model f to be attacked of step two;
step four, build a multi-modal adversarial attack network; the multi-modal adversarial attack network consists of a perturbation generation network and a discriminator network for each modality;
step five, generate multi-modal adversarial samples;
step six, input the multi-modal adversarial samples of step five into the corresponding perturbation generation networks, discriminator networks, and target model f to construct a multi-modal generation loss function and a multi-modal discrimination loss function;
step seven, alternately train with the multi-modal generation loss function and the multi-modal discrimination loss function of step six, updating each perturbation generation network and each discriminator network, to complete the training of the multi-modal adversarial attack network;
step eight, input the test samples into the multi-modal adversarial attack network of step seven to generate the corresponding test adversarial samples.
2. The adversarial attack method for a multi-modal remote sensing image classification network according to claim 1, characterized in that after step eight a step nine is further included, as follows: input the training samples into the multi-modal adversarial attack network to obtain adversarial samples of the training samples, add these adversarial samples to the training sample set, and retrain the target model f of step two; then repeat steps three to eight.
3. The adversarial attack method for a multi-modal remote sensing image classification network according to claim 1 or 2, characterized in that the multiple modalities are optical remote sensing images and digital elevation model images.
4. The adversarial attack method for a multi-modal remote sensing image classification network according to claim 3, characterized in that the specific process of constructing the multi-modal generation loss function and the multi-modal discrimination loss function is as follows:
step 6.1, construct the multi-modal generation loss function, as shown in formula (C):
$L_G = L_{GAN}^{T} + L_{GAN}^{D} + \alpha (L_{per}^{T} + L_{per}^{D}) + \beta L_f + \gamma L_C$ (C)
wherein: $L_{GAN}^{T}$ and $L_{GAN}^{D}$ are the adversarial losses of the optical remote sensing image modality and of the digital elevation model image modality, respectively;
the hyper-parameters $\alpha$, $\beta$ and $\gamma$ are the weight coefficients of the perception loss, the spoofing loss and the cooperative attack loss;
$L_f$ is the spoofing loss, which drives the sample to be misclassified into the t-th class; it is defined in formula (H);
$L_C$ is the cooperative loss, defined in formula (I);
$L_{GAN}^{T}$ and $L_{GAN}^{D}$ are defined in formulas (D) and (E):
$L_{GAN}^{T} = \mathbb{E}_{x_T}[\log(1 - D_T(x'_T))]$ (D)
$L_{GAN}^{D} = \mathbb{E}_{x_D}[\log(1 - D_D(x'_D))]$ (E)
wherein: $D_T(x'_T)$ is the output probability obtained by inputting the generated optical remote sensing image adversarial sample $x'_T$ into the optical remote sensing image discriminator network $D_T$; $D_D(x'_D)$ is the output probability obtained by inputting the generated digital elevation model image adversarial sample $x'_D$ into the digital elevation model image discriminator network $D_D$;
$L_{per}^{T}$ and $L_{per}^{D}$ denote the perception loss of the optical remote sensing image and of the digital elevation model image, respectively, defined in formulas (F) and (G):
$L_{per}^{T} = \mathbb{E}_{x_T}[\max(\varepsilon, \|G_T(x_T)\|_p)]$ (F)
$L_{per}^{D} = \mathbb{E}_{x_D}[\max(\varepsilon, \|G_D(x_D)\|_p)]$ (G)
wherein: $\varepsilon$ is a hyper-parameter controlling the minimum allowable disturbance intensity, and $\|\cdot\|_p$ denotes the $L_p$ norm of the disturbance;
formula (H) is:
$L_f = l_f(f(x'_T, x'_D), t)$ (H)
wherein: f is the target model to be attacked, $l_f$ is the cross-entropy loss, and t is the target attack label obtained in step three;
formula (I) is:
$L_C = l_C(T(G_T(x_T)), G_D(x_D))$ (I)
wherein: T is a band variation function and $l_C$ is a cosine similarity measure function;
step 6.2, construct the multi-modal discrimination loss function, as shown in formula (J):
$L_D = L_D^{T} + L_D^{D}$ (J)
wherein: $L_D^{T}$ and $L_D^{D}$ are the discriminator loss functions of the optical remote sensing image and of the digital elevation model image, respectively, defined in formulas (K) and (L):
$L_D^{T} = -\mathbb{E}_{x_T}[\log D_T(x_T)] - \mathbb{E}_{x_T}[\log(1 - D_T(x'_T))]$ (K)
$L_D^{D} = -\mathbb{E}_{x_D}[\log D_D(x_D)] - \mathbb{E}_{x_D}[\log(1 - D_D(x'_D))]$ (L)
wherein: $D_T(x_T)$ is the output probability obtained by inputting the original optical remote sensing image training sample $x_T$ into the optical remote sensing image discriminator network $D_T$; $D_D(x_D)$ is the output probability obtained by inputting the original digital elevation model image training sample $x_D$ into the digital elevation model image discriminator network $D_D$; $D_T(x'_T)$ is the output probability obtained by inputting the generated optical remote sensing image adversarial sample $x'_T$ into $D_T$; $D_D(x'_D)$ is the output probability obtained by inputting the generated digital elevation model image adversarial sample $x'_D$ into $D_D$.
5. The adversarial attack method for a multi-modal remote sensing image classification network according to claim 4, characterized in that in step two the target model f is a trained multi-source remote sensing image classification network.
6. The adversarial attack method for a multi-modal remote sensing image classification network according to claim 1 or 2, characterized in that in step three the prediction probability score $P = [p_1, p_2, \ldots, p_n]^T$ of a sample is obtained by forward computation of the target model f;
One-Hot encoding is performed according to the original label and the number of classes of the sample to obtain an encoded vector; if the original sample label is 1, the encoded vector is $h = [1, 0, \ldots, 0]^T$, whose length equals the number of classes;
a reverse mask $\bar{h} = 1 - h$ is obtained from the encoded vector;
the reverse mask is multiplied element-wise with the prediction probability score P to obtain the prediction probability values of all classes other than the original label class, $p' = [0, p_2, \ldots, p_n]^T$; a difference is computed on $p'$ by subtracting a large value at the position of the original label, i.e. $s = p' - h \times 10^{10}$, and the index of the maximum of s is taken as the target class;
wherein: $p_n$ denotes the probability score of the n-th class.
7. The adversarial attack method for a multi-modal remote sensing image classification network according to claim 1 or 2, characterized in that in step four the multi-modal adversarial attack network is composed of an optical remote sensing image perturbation generation network $G_T$, an optical remote sensing image discriminator network $D_T$, a digital elevation model image perturbation generation network $G_D$, and a digital elevation model image discriminator network $D_D$.
8. The adversarial attack method for a multi-modal remote sensing image classification network according to claim 1 or 2, characterized in that in step five the multi-modal adversarial samples are generated as follows:
the optical remote sensing image training sample $x_T$ is input into the optical remote sensing image perturbation generation network $G_T$ to generate the optical remote sensing image perturbation, and the generated perturbation is added to the input optical remote sensing image training sample to obtain the TOP adversarial sample $x'_T$, as shown in formula (A):
$x'_T = x_T + G_T(x_T)$ (A)
wherein: $G_T(x_T)$ denotes the TOP perturbation generated by the optical remote sensing image perturbation generation network;
the digital elevation model image training sample $x_D$ is input into the digital elevation model image perturbation generation network $G_D$ to generate the digital elevation model image perturbation, and the generated perturbation is added to the input digital elevation model image training sample to obtain the DSM adversarial sample $x'_D$, as shown in formula (B):
$x'_D = x_D + G_D(x_D)$ (B)
wherein: $G_D(x_D)$ denotes the DSM perturbation generated by the digital elevation model image perturbation generation network.
9. The adversarial attack method for a multi-modal remote sensing image classification network according to claim 1 or 2, characterized in that in step eight the specific process of attacking the multi-modal remote sensing image classification network with the test samples is as follows:
the TOP test sample $\tilde{x}_T$ is input into the trained TOP perturbation generation network, which outputs the TOP perturbation $G_T(\tilde{x}_T)$; the TOP perturbation is added to the input TOP test sample to obtain the adversarial sample $\tilde{x}'_T$ of the TOP test sample; at the same time, the DSM test sample $\tilde{x}_D$ is input into the trained DSM perturbation generation network, which outputs the DSM perturbation $G_D(\tilde{x}_D)$; the DSM perturbation is added to the input DSM test sample to obtain the adversarial sample $\tilde{x}'_D$ of the DSM test sample, completing the adversarial attack on the multi-modal remote sensing image classification network; concretely:
$\tilde{x}'_T = \tilde{x}_T + G_T(\tilde{x}_T)$
$\tilde{x}'_D = \tilde{x}_D + G_D(\tilde{x}_D)$
10. The adversarial attack method for a multi-modal remote sensing image classification network according to claim 2, wherein the specific process of step nine is as follows:
the optical remote sensing image training sample $x_T$ is input into the trained optical remote sensing image perturbation generation network, which outputs the optical remote sensing image perturbation; the perturbation is added to the input optical remote sensing image training sample to obtain the adversarial sample $x'_T$ of the optical remote sensing image training sample;
the digital elevation model image training sample $x_D$ is input into the trained digital elevation model image perturbation generation network, which outputs the digital elevation model image perturbation; the perturbation is added to the input digital elevation model image training sample to obtain the adversarial sample $x'_D$ of the digital elevation model image training sample; $x'_T$ and $x'_D$ are added to the training sample set to obtain a new training sample set T', i.e. $\{x_T, x_D, x'_T, x'_D\} \in T'$, and the target model f of step two is retrained.
CN202211005572.XA 2022-08-22 2022-08-22 Adversarial attack method for a multi-modal remote sensing image classification network Pending CN115331079A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211005572.XA CN115331079A (en) 2022-08-22 Adversarial attack method for a multi-modal remote sensing image classification network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211005572.XA CN115331079A (en) 2022-08-22 Adversarial attack method for a multi-modal remote sensing image classification network

Publications (1)

Publication Number Publication Date
CN115331079A true CN115331079A (en) 2022-11-11

Family

ID=83926881

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211005572.XA Pending CN115331079A (en) 2022-08-22 2022-08-22 Attack resisting method for multi-mode remote sensing image classification network

Country Status (1)

Country Link
CN (1) CN115331079A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116523032A (en) * 2023-03-13 2023-08-01 之江实验室 Image text double-end migration attack method, device and medium
CN116523032B (en) * 2023-03-13 2023-09-29 之江实验室 Image text double-end migration attack method, device and medium
CN115984635A (en) * 2023-03-21 2023-04-18 自然资源部第一海洋研究所 Multi-source remote sensing data classification model training method, classification method and electronic equipment
CN116343050A (en) * 2023-05-26 2023-06-27 成都理工大学 Target detection method for remote sensing image noise annotation based on self-adaptive weight
CN116343050B (en) * 2023-05-26 2023-08-01 成都理工大学 Target detection method for remote sensing image noise annotation based on self-adaptive weight


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination