CN116468083A - Transformer-based generative adversarial network method - Google Patents

Transformer-based generative adversarial network method

Info

Publication number
CN116468083A
CN116468083A CN202310469413.3A
Authority
CN
China
Prior art keywords
gan
transformer
xpca
data
inputting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310469413.3A
Other languages
Chinese (zh)
Inventor
郝思媛
翟世杰
夏裕凤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao University of Technology
Original Assignee
Qingdao University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao University of Technology filed Critical Qingdao University of Technology
Priority to CN202310469413.3A priority Critical patent/CN116468083A/en
Publication of CN116468083A publication Critical patent/CN116468083A/en
Pending legal-status Critical Current

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/0475 — Generative networks
    • G06N3/045 — Combinations of networks
    • G06N3/094 — Adversarial learning
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/764 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • Y04S — SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/50 — Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications


Abstract

The invention discloses a Transformer-based generative adversarial network (Generative Adversarial Network, GAN) method for the field of hyperspectral image classification (Hyperspectral Image Classification, HIC). The method introduces a Transformer into the GAN and proposes a generative adversarial network with a residual upscale module (Transformer with Residual Upscale GAN, TRUG) for HIC. The TRUG includes a generator G and a discriminator D. In G, we propose a Residual Upscale (RU) module that can increase the resolution of the generated image. In D, we employ Transformer Blocks of progressively decreasing scale and use a grid self-attention mechanism in the first layer to better extract image features. In addition, GAN is prone to training instability; to solve this problem we improved the normalization algorithm and added relative position encoding. TRUG is the first Transformer-based GAN to be applied to HIC.

Description

Transformer-based generative adversarial network method
Technical Field
The invention relates to a hyperspectral image classification method, in particular to a Transformer-based method for generating adversarial networks (Generative Adversarial Network, GAN), and belongs to the technical field of remote sensing information processing.
Background
With the development of technology, hyperspectral image classification (Hyperspectral Image Classification, HIC) has been widely used in many applications. In recent years, Deep Learning (DL) models have been applied to the HIC field.
As deep learning progresses and model parameters increase, overfitting becomes a significant challenge. To alleviate this problem, Zhang et al. focused on developing a simple network: they proposed a 1D capsule network that is easy to implement and lighter than ordinary 3D convolution. Mou et al., however, argued that one-dimensional convolution may lose pixel information when representing hyperspectral pixels, and therefore proposed a novel recurrent neural network (Recurrent Neural Network, RNN) structure. However, RNNs are inefficient at processing sequence information. When processing sequential data, a Transformer with an attention mechanism handles sequences more efficiently than an RNN. Currently, learning image features by combining a Transformer with a CNN is a relatively common approach. However, the Transformer has a large number of parameters and overfits very easily when trained on small-sample data such as HSI. An important way to alleviate overfitting is to add training data, and many researchers have done so, specifically through data flipping, cropping, translation, and generative models. A generative model alleviates the problem by generating high-quality samples. GAN is a typical generative model consisting mainly of a generator G and a discriminator D; it can fundamentally address the scarcity of data samples and thereby the overfitting problem. Therefore, many researchers have designed GANs to alleviate the problem of insufficient samples. Zhu et al. used a 1D GAN as a spectral classifier and a 3D GAN as a spatial classifier. In addition, many researchers have combined GAN with other techniques. However, GAN still suffers from training data imbalance and mode collapse.
To solve the problem of training data imbalance, Wang et al. adapted D into a single classifier and proposed an adaptive DropBlock regularization method to address mode collapse.
GAN has the disadvantage of instability, and most researchers working on this problem introduce various regularization methods but rarely change the network structure. For CNNs, the convolution operator has a local receptive field, so CNNs cannot capture long-range dependencies, whereas HSI contains rich spectral sequence information. This method therefore uses a Transformer as the basic framework, which is better suited to processing global information and is also good at processing sequence information. Currently, in the HIC field, no one has introduced a Transformer into GAN. Thus, the method combines the ideas of the Transformer and the GAN, and proposes a generative adversarial network with a residual upscale module (Transformer with Residual Upscale GAN, TRUG).
Disclosure of Invention
The present invention introduces a Transformer into the GAN and proposes a Transformer-based generative adversarial network with a residual upscale module (Transformer with Residual Upscale GAN, TRUG) for HIC. The TRUG contains a generator G and a discriminator D. In G, we propose a Residual Upscale (RU) module that can increase the resolution of the generated image. In D, we use Transformer Blocks of progressively decreasing scale and a grid self-attention mechanism in the first layer to better extract image features. In addition, GAN is prone to training instability; to solve this problem we improved the normalization algorithm and added relative position encoding.
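The attention with relative position encoding used in the discriminator can be illustrated with a minimal single-head sketch. This is not the patent's implementation: the function name, the shapes, and the use of a single shared matrix for queries, keys and values are simplifying assumptions made here for illustration.

```python
import numpy as np

def self_attention(x, rel_bias):
    """Single-head scaled dot-product attention with an additive relative
    position bias term (a simplified stand-in for grid attention in D)."""
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d) + rel_bias              # (n, n) attention logits
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)        # softmax over keys
    return attn @ x                                        # weighted sum of value rows

rng = np.random.default_rng(0)
tokens = rng.standard_normal((6, 8))       # 6 tokens of dimension 8
bias = rng.standard_normal((6, 6)) * 0.1   # hypothetical relative position bias
out = self_attention(tokens, bias)
print(out.shape)  # (6, 8)
```

In practice queries, keys and values would come from separate learned projections, and the bias would be indexed from a learned table by relative offset.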
The method comprises the following specific steps:
s1, performing dimension reduction on the original data through PCA to obtain Xpca, and inputting Xpca into discriminator D to learn the features of the real samples;
s2, dividing Xpca into several patches in discriminator D and embedding the patches;
s3, inputting the embedded data into a Transformer Block to learn its features, then downsampling the obtained features to reduce their size, and repeating this step three times to obtain the final discriminative features;
s4, inputting one-dimensional random noise Z ∈ R^{B×L} and class labels C into generator G, reconstructing the noise Z into a feature map X ∈ R^{B×H×W×C} with resolution (H×W) through a Multi-Layer Perceptron (MLP), and inputting the obtained feature map X into a Transformer Block to further extract features;
s5, improving the resolution of the feature map obtained in S4 through a Residual Upscale module (RU), whose specific step is: taking the Kronecker product of the pre-module feature map X and the post-module feature map X_new to generate a high-resolution X_up, formulated as:
X ⊗ X_new = X_up
s6, inputting the feature map X_up obtained in S5 into a Swin Transformer (ST) to further extract features X_st between different windows, and further improving the resolution of X_st through the RU module to obtain features X_stnew;
s7, compressing the channel dimension of X_stnew to be consistent with the channel dimension of Xpca to obtain a false sample Fake_data ∈ R^{B×M×N×C};
S8: false sample Fake to be generated data And (3) inputting the identification features obtained in the step (S3) into a discriminator (D) together with the real sample Xpca, classifying and distinguishing true and false to obtain a final classification result, and simultaneously, returning the Loss of the classification result and the true and false to a generator to enable the generator to continuously learn to generate a sample with higher quality.
Compared with the prior art, the technical scheme of the invention has the following technical effects:
(1) The GAN can generate false images similar to the real data, thereby alleviating the problem of scarce training samples;
(2) The RU in TRUG can improve image resolution, enabling G to generate high-quality samples;
(3) The network based on the Swin Transformer basic module can obtain different feature information through window exchange;
(4) Compared with traditional GAN, TRUG combines a loss function for real/fake discrimination and classification, which can alleviate the mode-collapse problem of traditional GAN training;
(5) TRUG is the first Transformer-based GAN applied to HIC. Compared with common GANs, it achieves higher HSI classification accuracy;
(6) Expressive feature information is enhanced by the attention mechanism.
Drawings
Fig. 1 is a frame diagram of the TRUG of the present invention.
FIG. 2 is a visualization of samples of different sizes generated on the datasets IP and PU.
FIG. 3 shows the OA of samples generated on different datasets at different sizes.
FIG. 4 shows the OA on the IP and PU datasets with and without the samples generated by the RU module.
FIG. 5 is a visual comparison of classification maps obtained by different methods on the IP dataset; (a) false color image, (b) ground truth, (c) SVM, (d) CNN, (e) 3D CNN, (f) HybridSN, (g) DPRN, (h) Transformer, (i) ViT, (j) TRUG.
FIG. 6 is a visual comparison of classification maps obtained by different methods on the UP dataset; (a) false color image, (b) ground truth, (c) SVM, (d) CNN, (e) 3D CNN, (f) HybridSN, (g) DPRN, (h) Transformer, (i) ViT, (j) TRUG.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solution in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort shall fall within the scope of the invention.
Fig. 1 is a frame diagram of the TRUG of the present invention.
We selected two published HSI datasets, Indian Pines (IP) and University of Pavia (UP), to verify the validity of the proposed method.
All datasets are divided into two parts, a training set and a test set. Since GAN is very sensitive to small samples, we sampled each class and chose 10% of the samples of each class for training. The experimental results are reported using three evaluation criteria: Overall Accuracy (OA), Average Accuracy (AA) and the Kappa coefficient (Kappa). In addition, to avoid biased estimates, 10 independent tests were performed using PyTorch on a computer equipped with an Intel Core i5 processor and an RTX 3090 GPU.
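The three evaluation criteria can all be derived from a confusion matrix, as in this minimal sketch (the function name and the toy label arrays are our own, not from the patent):

```python
import numpy as np

def classification_metrics(y_true, y_pred, n_classes):
    """Overall Accuracy (OA), Average Accuracy (AA) and Kappa coefficient."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                                         # confusion matrix
    total = cm.sum()
    oa = np.trace(cm) / total                                 # fraction correct overall
    aa = (np.diag(cm) / cm.sum(axis=1)).mean()                # mean per-class recall
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total**2   # chance agreement
    kappa = (oa - pe) / (1 - pe)                              # agreement beyond chance
    return oa, aa, kappa

oa, aa, kappa = classification_metrics([0, 0, 1, 1], [0, 0, 1, 0], 2)
print(oa, aa, kappa)  # 0.75 0.75 0.5
```

Kappa penalizes agreement attributable to chance, which is why it is reported alongside OA for imbalanced HSI class distributions.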
The specific steps for each test were as follows:
s1, reducing the dimension of the original data through PCA to obtain Xpca, and inputting Xpca into discriminator D to learn the features of the real samples;
s2, dividing Xpca into several patches in discriminator D and embedding the patches;
s3, inputting the embedded data into a Transformer Block to learn its features, then downsampling the obtained features to reduce their size, and repeating this step three times to obtain the final discriminative features;
s4, inputting one-dimensional random noise Z ∈ R^{B×L} and class labels C into generator G, reconstructing the noise Z into a feature map X ∈ R^{B×H×W×C} with resolution (H×W) through a Multi-Layer Perceptron (MLP), and inputting the feature map X into a Transformer Block to further extract features;
s5, improving the resolution of the feature map obtained in S4 through a Residual Upscale module (RU), whose specific step is: taking the Kronecker product of the pre-module feature map X and the post-module feature map X_new to generate a high-resolution X_up, formulated as:
X ⊗ X_new = X_up
s6, inputting the feature map X_up obtained in S5 into a Swin Transformer (ST) to further extract features X_st between different windows, and further improving the resolution of X_st through the RU module to obtain features X_stnew;
s7, compressing the channel dimension of X_stnew to be consistent with that of Xpca to obtain a false sample Fake_data ∈ R^{B×M×N×C};
s8, inputting the generated false sample Fake_data, the real sample Xpca, and the discriminative features obtained in S3 into discriminator D for classification and real/fake discrimination to obtain the final classification result; meanwhile, the real/fake and classification losses are backpropagated to the generator so that it continually learns to generate higher-quality samples.
To test the effectiveness of the present invention, ablation and comparative experiments were performed.
A. Ablation experiment
(1) Visual analysis of generated samples. The number of generated samples and the generated image size are important parameters. In the experiments we generated images of different sizes, 16 and 32, from the datasets. A visualization of the generated image features is shown in FIG. 2. Feature analysis has always been a challenge, particularly visual analysis. For quality assessment of the generated samples, the most intuitive approach is visual comparison with the real image. FIG. 2(a) shows, from top to bottom, a comparison of the real and generated (false) images on dataset PU from early to late training. FIG. 2(b) is a visualization of 32×32 false images generated from the IP dataset, displayed from top to bottom in order of training time. As can be seen from the figure, the real and false images differ significantly early in training, and similar parts appear in the middle and later stages. The learning process of the images can thus be observed during training.
For the size of the generated samples, we performed parameter experiments on the two datasets, as shown in FIG. 3. In general, a larger sample size gives a better classification effect, but hardware limitations prevented us from increasing the size further. As can be seen from the figure, the size of the generated samples has a large influence on the experimental results for different datasets. For dataset IP, a sample size of 64 yielded an OA of 94.56%, which is 14.49% higher than the OA at sample size 16; for dataset PU, the highest classification accuracy of 96.76% was obtained at sample size 16. This shows that different datasets reach their best accuracy at different scales. Thus, in subsequent experiments we used a size of 64 for IP and 16 for PU.
(2) RU analysis. We selected TransGAN, which uses the traditional UpScaling method for improving image resolution, as the comparison. As can be seen from FIG. 4, the classification accuracy of the GAN using the RU module is significantly higher than that using the conventional method. The experimental results show that the RU module is indeed effective.
B. Comparative experiments
We report the classification accuracy obtained by different methods on the IP and UP datasets. The comparison methods include SVM, CNN, 3D CNN, HybridSN, Deep Pyramidal Residual Networks (DPRN), and the recently proposed Transformer and ViT. In the experiments we used 10% of the samples as the training set.
It can be seen from Table 1 that TRUG is superior to all other methods. For dataset IP, the OA of the proposed method is much higher than that of SVM, CNN, 3D CNN and ViT, and about 4.43% higher than the OA obtained by Transformer, HybridSN and DPRN. The OA, AA and Kappa of TRUG are 97.85%, 97.67% and 97.55%, respectively, showing that TRUG classifies HSI better. Furthermore, for the UP dataset the proposed method achieves the best performance (OA = 99.83%, AA = 99.67%, Kappa = 99.77%), about 1% higher than other deep-learning-based methods and 4% to 5% higher than traditional methods. Classification maps for the different datasets are shown in FIG. 5 and FIG. 6; the classification map obtained by our method is clearer than those of the other methods.
TABLE 1
The foregoing is merely a specific embodiment of the present application and is not intended to limit the present application in any way; any simple modification, equivalent variation or adaptation made to the above embodiment according to the technical substance of the present application still falls within the scope of the technical solutions of the present application.

Claims (1)

1. A Transformer-based method for generating adversarial networks (Generative Adversarial Network, GAN), comprising the steps of:
S1: performing dimension reduction on the original data through PCA to obtain Xpca, and inputting Xpca into discriminator D to learn the features of the real samples;
S2: dividing Xpca into several patches in discriminator D and embedding the patches;
S3: inputting the embedded data into a Transformer Block to learn its features, then downsampling the obtained features to reduce their size, and repeating step S3 three times to obtain the final discriminative features;
S4: inputting one-dimensional random noise Z ∈ R^{B×L} and class labels C into generator G, reconstructing the noise Z into a feature map X ∈ R^{B×H×W×C} with resolution (H×W) through a Multi-Layer Perceptron (MLP), and inputting the obtained feature map X into a Transformer Block for further feature extraction;
S5: improving the resolution of the feature map obtained in S4 through a Residual Upscale module (RU), whose specific step is: taking the Kronecker product of the pre-module feature map X and the post-module feature map X_new to generate a high-resolution X_up;
S6: inputting the feature map X_up obtained in S5 into a Swin Transformer (ST) to further extract features X_st between its different windows, and further improving the resolution of X_st through the RU module to obtain features X_stnew;
S7: compressing the channel dimension of X_stnew to be consistent with that of Xpca to obtain a false sample Fake_data ∈ R^{B×M×N×C};
S8: inputting the generated false sample Fake_data, the real sample Xpca, and the discriminative features obtained in S3 into discriminator D, where a softmax performs classification and real/fake discrimination to obtain the final classification result; meanwhile, the real/fake and classification losses are backpropagated to the generator so that it continually learns to generate higher-quality samples.
CN202310469413.3A 2023-04-27 2023-04-27 Transformer-based generative adversarial network method Pending CN116468083A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310469413.3A CN116468083A (en) 2023-04-27 2023-04-27 Transformer-based generative adversarial network method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310469413.3A CN116468083A (en) 2023-04-27 2023-04-27 Transformer-based generative adversarial network method

Publications (1)

Publication Number Publication Date
CN116468083A true CN116468083A (en) 2023-07-21

Family

ID=87180571

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310469413.3A Pending CN116468083A (en) 2023-04-27 2023-04-27 Transformer-based network generation countermeasure method

Country Status (1)

Country Link
CN (1) CN116468083A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117250657A (en) * 2023-11-17 2023-12-19 东北石油大学三亚海洋油气研究院 Seismic data reconstruction denoising integrated method
CN117250657B (en) * 2023-11-17 2024-03-08 东北石油大学三亚海洋油气研究院 Seismic data reconstruction denoising integrated method


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination