CN113436073B - Real image super-resolution robust method and device based on frequency domain - Google Patents

Real image super-resolution robust method and device based on frequency domain

Info

Publication number
CN113436073B
Authority
CN
China
Prior art keywords
image data
frequency domain
total
super
total image
Prior art date
Legal status
Active
Application number
CN202110728827.4A
Other languages
Chinese (zh)
Other versions
CN113436073A (en)
Inventor
李冠彬
岳九涛
魏朋旭
林倞
Current Assignee
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN202110728827.4A priority Critical patent/CN113436073B/en
Publication of CN113436073A publication Critical patent/CN113436073A/en
Application granted granted Critical
Publication of CN113436073B publication Critical patent/CN113436073B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformation in the plane of the image
    • G06T3/40 - Scaling the whole image or part thereof
    • G06T3/4053 - Super resolution, i.e. output image resolution higher than sensor resolution
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/047 - Probabilistic or stochastic networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

An embodiment of the present application provides a frequency-domain-based robust method and device for real image super-resolution. The method comprises the following steps: receiving total image data, judging whether the total image data is image data of an adversarial sample, and sending the total image data to a random frequency-domain masking module when it is; converting the total image data into first image data in the frequency domain, masking out a plurality of high-frequency components in the first image data to obtain second image data, converting the second image data into third image data in the time domain, and sending the third image data to a real super-resolution model; and generating, based on the third image data, an image corresponding to the total image data with clearer details and better fidelity.

Description

Real image super-resolution robust method and device based on frequency domain
Technical Field
The present application relates to the field of image processing, and in particular, to a frequency-domain-based robust method and apparatus for real image super-resolution.
Background
Single image super-resolution is used to recover, from a degraded low-resolution image, a high-resolution image with clearer details and better fidelity. At present, single image super-resolution is commonly performed with deep convolutional neural networks. Such networks require training on a large number of paired samples consisting of low-resolution input images and their high-quality counterparts, and the training of deep neural networks is easily threatened by synthetic adversarial attacks.
Existing methods for resisting adversarial attacks can be broadly classified into three types. First, adversarial training, which makes the model learn the mapping from low-resolution images carrying adversarial noise to high-resolution images. This is a common defense in the white-box setting, but it cannot defend successfully against strong gradient-based multi-step iterative attackers. Second, gradient-masking-style approaches, which obfuscate the back-propagated gradient during adversarial sample generation, or use non-differentiable designs so that adversarial noise cannot be added to the clean image; for example, randomly cropping the image. However, all methods based on gradient obfuscation and masking have been proven to be completely defeated by the BPDA attack mode (i.e., in a white-box setting, back-propagation deliberately bypasses the gradient-obfuscating module, or the gradient of the non-differentiable layer is estimated and back-propagated onto the adversarial sample). Third, denoising-based methods, which can be roughly divided into denoising the model's input at the low-resolution image end, and reducing the influence of adversarial perturbations on the model's output through denoising operations at the feature level. Image-end denoising includes image preprocessing and JPEG preprocessing based on image compression models; feature-level denoising includes Non-Local-based denoising modules and denoising methods based on mean filtering and bilateral filtering. Image-preprocessing-based methods have likewise been proven vulnerable to BPDA attacks, and although feature-denoising modules combined with adversarial training achieve a certain effect in image classification models, they have not shown a good defense effect in the field of real image super-resolution.
In addition, these defense methods all degrade, to varying degrees, the model's super-resolution results on clean samples: the results become blurry, and the sharp details and textures the model could previously generate are visually lost.
Disclosure of Invention
The embodiments of the present application provide a frequency-domain-based robust method and device for real image super-resolution, aiming to solve the problems in the prior art that defenses affect the model's super-resolution results on clean samples, making those results blurry and preventing the sharp details and textures the model could previously generate. The method comprises the following steps:
a real image super-resolution robust method based on frequency domain, the method comprising:
receiving total image data, judging whether the total image data is image data of an adversarial sample, and sending the total image data or image feature data to a random frequency-domain masking module when it is;
converting the total image data or the image feature data into first image data in the frequency domain, masking out a plurality of high-frequency components in the first image data to obtain second image data, converting the second image data into third image data in the time domain, and sending the third image data to a real super-resolution model, the frequency-domain masking module being embedded in the super-resolution model;
generating, based on the third image data, an image corresponding to the total image data with clearer details and better fidelity.
Optionally, the receiving total image data, judging whether the total image data is image data of an adversarial sample, and sending the total image data to a random frequency-domain masking module when it is, includes:
the adversarial sample classifier is used to receive total image data X ∈ R^(H×W×C), where H is the height, W the width, and C the number of channels of the initial image. The image data is converted into the frequency domain to obtain fourth image data X̂ = F(X) ∈ R^(H×W×C). The sub-image data X̂_c ∈ R^(H×W) of each channel undergoes mean pooling and flattening to generate one-dimensional tensor data, which is input into a binary classification network to obtain a classification result in {True, False}. When the classification result is True, the image data is image data of an adversarial sample; when it is False, the image data is image data of a normal sample. When the image data is that of an adversarial sample, the image data and its intermediate feature data are sent to the random frequency-domain masking module.
Optionally, the converting the total image data into first image data in the frequency domain, masking out a plurality of high-frequency components in the first image data to obtain second image data, converting the second image data into third image data in the time domain, and sending the third image data to the embedded real super-resolution model includes:
the random frequency-domain masking module converts the total image data or the intermediate feature data X ∈ R^(H×W×C) into first image data in the frequency domain by the discrete cosine transform, samples a frequency-domain mask map M ∈ R^(H×W) for each channel, and applies the Hadamard product

X_m = F^(-1)(F(X) ⊙ M)

where ⊙ denotes the Hadamard product, to obtain second image data. The second image data is converted into third image data in the time domain, and the third image data is sent to the real super-resolution model.
Optionally, the inputting the one-dimensional tensor data into a binary classification network to obtain a classification result is implemented by the following formula:

G(X) = Softmax(W3·σ(W2·σ(W1·φ(X) + b1) + b2) + b3)

where G(X) is the classification result, φ(X) denotes the channel-averaging, mean-pooling, and flattening functions, σ denotes the LeakyReLU activation function, Softmax is the output activation function, and W1, W2, W3, b1, b2 and b3 are preset learnable parameters.
A real image super-resolution robust apparatus based on a frequency domain, the apparatus comprising:
the adversarial sample classifier module, configured to receive total image data, judge whether the total image data is image data of an adversarial sample, and send the total image data or image feature data to the random frequency-domain masking module when it is;
the random frequency-domain masking module, configured to convert the total image data or the image feature data into first image data in the frequency domain, mask out a plurality of high-frequency components in the first image data to obtain second image data, convert the second image data into third image data in the time domain, and send the third image data to a real super-resolution model, the frequency-domain masking module being embedded in the super-resolution model;
a robust real image super resolution network module for generating an image with clearer details and better fidelity corresponding to the total image data based on the third image data.
Optionally, the adversarial sample classifier module is configured to receive total image data X ∈ R^(H×W×C), where H is the height, W the width, and C the number of channels of the initial image. The image data is converted into the frequency domain to obtain fourth image data X̂ = F(X) ∈ R^(H×W×C). The sub-image data X̂_c ∈ R^(H×W) of each channel undergoes mean pooling and flattening to generate one-dimensional tensor data, which is input into a binary classification network to obtain a classification result in {True, False}. When the classification result is True, the image data is image data of an adversarial sample; when it is False, the image data is image data of a normal sample. When the image data is that of an adversarial sample, the image data and its intermediate feature data are sent to the random frequency-domain masking module.
Optionally, the random frequency-domain masking module is configured to convert the total image data or the intermediate feature data X ∈ R^(H×W×C) into first image data in the frequency domain by the discrete cosine transform, sample a frequency-domain mask map M ∈ R^(H×W) for each channel, and apply the Hadamard product

X_m = F^(-1)(F(X) ⊙ M)

where ⊙ denotes the Hadamard product, to obtain second image data; convert the second image data into third image data in the time domain; and send the third image data to the embedded real super-resolution model.
Optionally, the adversarial sample classifier module is configured to input the one-dimensional tensor data into a binary classification network to obtain a classification result by the following formula:

G(X) = Softmax(W3·σ(W2·σ(W1·φ(X) + b1) + b2) + b3)

where G(X) is the classification result, φ(X) denotes the channel-averaging, mean-pooling, and flattening functions, σ denotes the LeakyReLU activation function, Softmax is the output activation function, and W1, W2, W3, b1, b2 and b3 are preset learnable parameters.
A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor when executing the computer program performs the steps of the method as claimed in any one of the preceding claims.
A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of the preceding claims.
The application has the following advantages:
In the present application, total image data is received and judged as to whether it is image data of an adversarial sample; when it is, the total image data is sent to a random frequency-domain masking module. The total image data is converted into first image data in the frequency domain, a plurality of high-frequency components in the first image data are masked out to obtain second image data, the second image data is converted into third image data in the time domain, and the third image data is sent to a real super-resolution model; based on the third image data, an image corresponding to the total image data with clearer details and better fidelity is generated. The method accurately discriminates adversarial samples from normal samples, and the design of the classifier preserves the model's super-resolution performance on clean samples. Compared with other defense methods, this method obtains near-optimal performance on clean samples, and when facing adversarial samples it defends successfully and does not generate the unreasonable structured textures that other super-resolution models do. The influence of adversarial samples on the image super-resolution model is analyzed from the frequency-domain and feature perspectives. We first found that adversarial noise becomes stronger as the network grows deeper, so we propose removing its adverse influence at the feature level. Second, a randomness strategy is introduced into the masking process, and experiments prove that it greatly improves the robustness of the method.
Drawings
In order to more clearly illustrate the technical solutions of the present application, the drawings needed in the description are briefly introduced below. The drawings in the following description show only some embodiments of the present application; those skilled in the art can derive other drawings from them without inventive effort.
Fig. 1 is a flowchart illustrating steps of a real image super-resolution robust method based on a frequency domain according to an embodiment of the present application;
fig. 2 is a block diagram of a structure of a real image super-resolution robust apparatus based on a frequency domain according to an embodiment of the present application.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description. It should be apparent that the embodiments described are some, but not all embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
Referring to fig. 1, a flowchart illustrating steps of a real image super-resolution robust method based on a frequency domain according to an embodiment of the present application is shown, which may specifically include the following steps:
Step 101, receiving total image data, judging whether the total image data is image data of an adversarial sample, and sending the total image data to an embedded random frequency-domain masking module when it is;
in an embodiment of the present application, the step 101 includes:
the adversarial sample classifier is used to receive total image data X ∈ R^(H×W×C), where H is the height, W the width, and C the number of channels of the initial image. The image data is converted into the frequency domain to obtain fourth image data X̂ = F(X) ∈ R^(H×W×C). The sub-image data X̂_c ∈ R^(H×W) of each channel undergoes mean pooling and flattening to generate one-dimensional tensor data, which is input into a binary classification network to obtain a classification result in {True, False}. When the classification result is True, the image data is image data of an adversarial sample; when it is False, the image data is image data of a normal sample. When the image data is that of an adversarial sample, it is sent to the embedded random frequency-domain masking module.
In a specific implementation, the random frequency-domain masking module improves the robustness of the model and defends successfully against adversarial attackers. However, because the boundary between adversarial noise and the high-frequency detail components of the image is blurry, the masking operation also masks high-frequency details of the image and those generated by super-resolution, which reduces the model's super-resolution performance on clean samples.
Based on the experimental results and the difference in frequency-domain distributions between adversarial samples and normal samples, a robust adversarial sample classifier is designed so that it can accurately distinguish the two. Based on the classifier's result, we decide whether to activate the random frequency-domain masking module that follows it: if the classifier outputs False, the current sample is a normal sample and the subsequent random frequency-domain masking module is skipped; otherwise, it is activated.
The classifier takes an image as input, represented as X ∈ R^(H×W×C), which may be either an adversarial sample or a clean sample. For X, we first convert it to the frequency domain as X̂ = F(X), then average it over the channel dimension to obtain X̄ ∈ R^(H×W). We then perform a mean-pooling and flattening operation to get a one-dimensional tensor and input it into a binary classification network to obtain the result G(·) ∈ {True, False}:

G(X) = Softmax(W3·σ(W2·σ(W1·φ(X) + b1) + b2) + b3)

where φ(X) denotes the channel-averaging, mean-pooling, and flattening operations, σ denotes the LeakyReLU activation function, and Softmax is the output activation function. W_i and b_i denote learnable parameters. The design of the classifier enables the model to generate sharp details and textures on normal samples, retaining its excellent super-resolution performance.
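The classifier forward pass described above can be sketched as follows. This is a minimal NumPy/SciPy sketch under stated assumptions: taking the DCT magnitude before averaging, the 4×4 pooling window, the hidden width, and the random (untrained) parameters are illustrative choices not specified in the text.

```python
import numpy as np
from scipy.fft import dctn

def leaky_relu(x, slope=0.01):
    # sigma: LeakyReLU activation
    return np.where(x >= 0, x, slope * x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def phi(x, pool=4):
    """phi(X): frequency transform, channel averaging, mean pooling, flattening."""
    xf = dctn(x, axes=(0, 1), norm="ortho")   # per-channel DCT (frequency domain)
    xbar = np.abs(xf).mean(axis=2)            # channel average -> (H, W); |.| is an assumption
    h, w = xbar.shape
    # mean pooling over pool x pool windows (assumes H, W divisible by pool)
    pooled = xbar.reshape(h // pool, pool, w // pool, pool).mean(axis=(1, 3))
    return pooled.ravel()                     # one-dimensional tensor

def classify(x, params):
    """G(X) = Softmax(W3 sigma(W2 sigma(W1 phi(X) + b1) + b2) + b3)."""
    W1, b1, W2, b2, W3, b3 = params
    h = leaky_relu(W1 @ phi(x) + b1)
    h = leaky_relu(W2 @ h + b2)
    p = softmax(W3 @ h + b3)                  # two-class probabilities
    return bool(p[1] > p[0])                  # True -> adversarial sample

# Toy usage with random, untrained parameters (shapes are illustrative).
rng = np.random.default_rng(0)
H, W, C, hidden = 32, 32, 3, 16
d = (H // 4) * (W // 4)
params = (rng.normal(size=(hidden, d)), np.zeros(hidden),
          rng.normal(size=(hidden, hidden)), np.zeros(hidden),
          rng.normal(size=(2, hidden)), np.zeros(2))
is_adv = classify(rng.random((H, W, C)), params)
```

In practice the parameters would be trained on labeled pairs of clean and adversarial samples; the sketch only shows the data flow of φ(X) and the two-layer-plus-Softmax head.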
Step 102, converting the total image data or intermediate feature data into first image data in the frequency domain, masking out a plurality of high-frequency components in the first image data to obtain second image data, converting the second image data into third image data in the time domain, and sending the third image data to an embedded real super-resolution model;
in an embodiment of the present application, the step 102 includes:
Step S11, the random frequency-domain masking module converts the total image data X ∈ R^(H×W×C) into first image data in the frequency domain by the discrete cosine transform, samples a frequency-domain mask map M ∈ R^(H×W) for each channel, and applies the Hadamard product

X_m = F^(-1)(F(X) ⊙ M)

where ⊙ denotes the Hadamard product, to obtain second image data. The second image data is converted into third image data in the time domain, and the third image data is sent to the embedded real super-resolution model.
In an embodiment of the present application, the classification result used in step S11 is obtained by the following formula:

G(X) = Softmax(W3·σ(W2·σ(W1·φ(X) + b1) + b2) + b3)

where G(X) is the classification result, φ(X) denotes the channel-averaging, mean-pooling, and flattening functions, σ denotes the LeakyReLU activation function, Softmax is the output activation function, and W1, W2, W3, b1, b2 and b3 are preset learnable parameters.
The frequency-domain masking module takes images and features as input, represented as an H×W×C tensor, where H, W, and C denote the height, width, and number of channels of the feature map. Let X ∈ R^(H×W) be one channel tensor of the input. We transform it to the frequency domain using the DCT (discrete cosine transform). The DCT frequency domain of X can be represented as X̂ = F(X) ∈ R^(H×W), where X̂(u, v) can be calculated according to the formula:

X̂(u, v) = c(u)·c(v) · Σ_{i=0..H-1} Σ_{j=0..W-1} X(i, j) · cos[(2i+1)uπ/(2H)] · cos[(2j+1)vπ/(2W)]

where c(u) and c(v) are compensation factors: when u = 0, c(u) = √(1/H); when u ≠ 0, c(u) = √(2/H) (and analogously for c(v), with W in place of H).
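As a sanity check on the formula above, a direct evaluation of the 2-D DCT-II with the stated compensation factors agrees with SciPy's orthonormal `dctn`, and the transform is exactly invertible by the I-DCT; this sketch is for illustration only:

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct2_naive(X):
    """Direct evaluation of the 2-D DCT-II with compensation factors c(u), c(v)."""
    H, W = X.shape
    out = np.zeros((H, W))
    i = np.arange(H)[:, None]
    j = np.arange(W)[None, :]
    for u in range(H):
        cu = np.sqrt(1.0 / H) if u == 0 else np.sqrt(2.0 / H)
        for v in range(W):
            cv = np.sqrt(1.0 / W) if v == 0 else np.sqrt(2.0 / W)
            basis = np.cos((2 * i + 1) * u * np.pi / (2 * H)) * \
                    np.cos((2 * j + 1) * v * np.pi / (2 * W))
            out[u, v] = cu * cv * (X * basis).sum()
    return out

rng = np.random.default_rng(0)
X = rng.random((8, 8))
# matches the orthonormal DCT-II of scipy.fft
assert np.allclose(dct2_naive(X), dctn(X, norm="ortho"), atol=1e-10)
# the transform is invertible, so the I-DCT recovers the input exactly
assert np.allclose(idctn(dctn(X, norm="ortho"), norm="ortho"), X, atol=1e-10)
```

The orthonormal normalization (`norm="ortho"`) is what makes the round trip F⁻¹(F(X)) = X exact, which the masking pipeline below relies on.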
We average the H×W×C tensors over channel C and visualize them. The visualization results are shown in fig. 1: the upper row represents the feature frequency-domain visualization of a clean sample, and the lower row corresponds to the adversarial sample. The visualization shows that adversarial noise is mostly encoded in the high-frequency components of the frequency domain (the red components at the upper-right, lower-left, and lower-right corners of the frequency-domain map). We believe that adversarial noise encoded as high-frequency components severely affects the super-resolution results, so we propose randomly masking some high-frequency components in the frequency domain to remove its influence. According to the algorithm we propose, we randomly sample a frequency-domain mask map M ∈ R^(H×W) and multiply:

X̂_m = X̂ ⊙ M

Here ⊙ indicates the Hadamard product (element-wise multiplication). Next, the frequency-domain map after multiplication is converted into the time-domain space by the I-DCT. The process of frequency-domain masking can be summarized by the formula:

X_m = F^(-1)(F(X) ⊙ M)

where F(X) denotes the DCT and F^(-1) the I-DCT.
For the sampling method of the frequency-domain mask map M, we first calculate the normalized distance of each component (u, v) from the (0, 0) component:

r_(u,v) = d_(u,v) / r_max, where d_(u,v) = √(u² + v²) and r_max = √(H² + W²)

The larger r_(u,v) is, the higher the confidence that the current component contains adversarial noise. Since the content and texture of the image are mostly stored in its low-frequency components, we randomly sample a protection radius r_t ∈ [r_l, r_u] in order to protect the low-frequency components from being masked: components within the radius are not masked. For components outside the radius, a probability value based on the Bernoulli distribution and r_(u,v) determines whether to perform the masking operation. Bernoulli sampling can be expressed as M(u, v) = Bernoulli(p = r_(u,v)): for (u, v) in the mask map M, it is sampled to 1 with probability p = r_(u,v) and to 0 with probability 1 − r_(u,v). So M can be defined as:

M(u, v) = 1, if r_(u,v) ≤ r_t; Bernoulli(p = r_(u,v)), otherwise.
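Putting the pieces together, the mask sampling and the masking process X_m = F⁻¹(F(X) ⊙ M) can be sketched as below. The bounds r_l = 0.2 and r_u = 0.5 for the protection radius are illustrative assumptions, and the Bernoulli step follows the literal M(u, v) = Bernoulli(p = r_(u,v)) definition stated above.

```python
import numpy as np
from scipy.fft import dctn, idctn

def sample_mask(H, W, r_l=0.2, r_u=0.5, rng=None):
    """Sample the frequency-domain mask map M in R^(H x W)."""
    rng = rng or np.random.default_rng()
    u = np.arange(H)[:, None]
    v = np.arange(W)[None, :]
    # normalized distance r_(u,v) of each component from the (0, 0) component
    r = np.sqrt(u**2 + v**2) / np.sqrt(H**2 + W**2)
    r_t = rng.uniform(r_l, r_u)                  # protection radius r_t in [r_l, r_u]
    M = (rng.random((H, W)) < r).astype(float)   # M(u, v) = Bernoulli(p = r_(u,v))
    M[r <= r_t] = 1.0                            # never mask protected low frequencies
    return M

def freq_mask(X, rng=None):
    """X_m = F^-1(F(X) ⊙ M): DCT, Hadamard product with M, inverse DCT."""
    M = sample_mask(*X.shape, rng=rng)
    return idctn(dctn(X, norm="ortho") * M, norm="ortho")

rng = np.random.default_rng(0)
X = rng.random((16, 16))
X_m = freq_mask(X, rng=rng)
```

Because the (0, 0) component always lies inside the protection radius, the DC coefficient (and hence the overall brightness of the channel) is preserved exactly by the masking.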
The masking module we propose has two advantages. First, it is differentiable, which guarantees that the network remains trainable, and it can be simply embedded anywhere in the network. Second, it has no learnable parameters, making it a lightweight defense module.
Step 103, generating an image with clearer details and better fidelity corresponding to the total image data based on the third image data.
Existing super-resolution methods generally only consider the super-resolution effect on clean samples and do not consider the model's robustness in the face of an attacker. We therefore propose a robust image super-resolution model and further improve its robustness by incorporating adversarial training. Our model uses the current best real super-resolution model, CDC, as its cornerstone. The backbone network of CDC consists of six HourGlass modules. CDC constructs three Component Attentive Blocks (CABs) associated with flat regions, edges, and corners. Each CAB exclusively learns one of the three low-level components through an intermediate supervision strategy, while guiding its HourGlass module to learn the corresponding component mapping function. The outputs of the three CABs are summed to generate the final SR reconstruction. Considering that different image regions have different gradients in different directions, a Gradient-Weighted (GW) loss function is proposed for SR reconstruction: the more complex a region's texture, the greater its corresponding gradient penalty, forcing the network to attend to these complex texture regions.
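The Gradient-Weighted loss is described only qualitatively above; the following toy sketch illustrates the idea under the assumption that per-pixel weights grow with the gradient magnitude of the high-resolution target (the exact weighting used by CDC may differ):

```python
import numpy as np

def gradient_weighted_loss(sr, hr, eps=1e-8):
    """Toy Gradient-Weighted (GW) reconstruction loss: pixels in regions of hr
    with larger gradient magnitude (complex textures) get a larger weight."""
    gy = np.diff(hr, axis=0, append=hr[-1:, :])   # forward-difference gradients
    gx = np.diff(hr, axis=1, append=hr[:, -1:])
    grad_mag = np.sqrt(gx**2 + gy**2)
    w = 1.0 + grad_mag / (grad_mag.max() + eps)   # per-pixel weight in [1, 2]
    return float((w * np.abs(sr - hr)).mean())

rng = np.random.default_rng(0)
hr = rng.random((8, 8))
sr = hr + 0.05 * rng.normal(size=hr.shape)        # a hypothetical SR output
loss = gradient_weighted_loss(sr, hr)
```

Compared with a plain L1 loss, errors in highly textured regions are penalized up to twice as heavily here, which mirrors the stated intent of the GW loss.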
We first place the adversarial sample classifier at the head of the CDC network to identify whether the input sample is clean. Second, we place our random frequency-domain masking module at two places in the network: the first is the head of the entire model, right after the classifier; the second is the head and tail of each HourGlass module. The output of the classifier decides whether to activate the random frequency-domain masking modules that follow.
To further improve the robustness of the model, we incorporate adversarial training in the training process, adopting I-FGSM as the adversarial attacker:

L(X_n, X_0) = ‖f(X_n) − f(X_0)‖_2

X̃_{n+1} = X_n + α · sign(∇_{X_n} L(X_n, X_0))

X_{n+1} = Clip_{(X_0 − α, X_0 + α)}(X̃_{n+1})

where X_0 denotes a clean sample, X_n the adversarial sample at iteration n, L(·) the MSE loss, α the adversarial perturbation strength, ∇_{X_n} the gradient with respect to X_n, Clip_{(a,b)}(X) = min(max(X, a), b), and T the number of iterations. In training, we adopt T = 2 and α = 6/255 for the adversarial sample generator.
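A minimal sketch of the I-FGSM generator follows. The SR network f is replaced by a fixed linear map so the loss gradient is analytic, and the random initialization inside the perturbation budget is an assumption (initialization is not specified above), used here to avoid the zero gradient at X_n = X_0:

```python
import numpy as np

def i_fgsm(x0, grad_loss, alpha=6/255, T=2, rng=None):
    """I-FGSM: T gradient-sign ascent steps on L(X_n, X_0), clipped so the
    perturbation stays within [x0 - alpha, x0 + alpha] and pixels within [0, 1]."""
    rng = rng or np.random.default_rng()
    # random start inside the budget (assumption: the text does not specify this)
    x = np.clip(x0 + rng.uniform(-alpha, alpha, x0.shape), 0.0, 1.0)
    for _ in range(T):
        g = grad_loss(x, x0)                      # gradient of L(x, x0) w.r.t. x
        x = x + alpha * np.sign(g)                # gradient-sign step
        x = np.clip(x, x0 - alpha, x0 + alpha)    # perturbation budget Clip_(X0-a, X0+a)
        x = np.clip(x, 0.0, 1.0)                  # valid pixel range
    return x

# Stand-in "model": f(x) = A x, so for L = ||f(x) - f(x0)||_2^2 the gradient
# is 2 A^T A (x - x0) (analytic, no autograd needed).
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 8))
grad = lambda x, x0: 2.0 * A.T @ (A @ (x - x0))
x0 = rng.random(8)
x_adv = i_fgsm(x0, grad, alpha=6/255, T=2, rng=rng)
```

In real training the gradient would come from back-propagation through the super-resolution network, and the generated X_adv pairs would be mixed into the training batches.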
In conclusion, compared with the prior art, this method analyzes the influence of adversarial samples on the image super-resolution model from the frequency-domain and feature perspectives. We first found that adversarial noise becomes stronger as the network grows deeper, so we propose removing its adverse influence at the feature level. Second, a randomness strategy is introduced into the masking process, and experiments prove that it greatly improves the robustness of the method.
In addition, a robust adversarial sample classifier is designed that can accurately distinguish adversarial samples from normal samples, and its design preserves the model's super-resolution performance on clean samples. Compared with other defense methods, this method obtains near-optimal performance on clean samples, and when facing adversarial samples it defends successfully and does not generate the unreasonable structured textures that other super-resolution models do.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the embodiments are not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the embodiments. Further, those skilled in the art will also appreciate that the embodiments described in the specification are presently preferred and that no particular act is required of the embodiments of the application.
Referring to fig. 2, a structural block diagram of a real image super-resolution robust apparatus based on a frequency domain according to an embodiment of the present application is shown, which may specifically include the following modules:
an adversarial sample classifier module 201, configured to receive total image data, determine whether the total image data is image data of an adversarial sample, and, when it is, send the total image data and its intermediate feature data to the random frequency domain masking module;
a random frequency domain masking module 202, configured to convert the total image data into first image data in a frequency domain, mask out multiple high-frequency components in the first image data, obtain second image data, convert the second image data into third image data in a time domain, and send the third image data to the embedded real super-resolution model;
a robust real image super-resolution network module 203, configured to generate, from the third image data, an image corresponding to the total image data with clearer details and better fidelity.
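The routing among the three modules above can be sketched as follows. All three module bodies here are illustrative stubs: the classifier always answers True, masking is the identity, and nearest-neighbour upscaling stands in for the real super-resolution network:

```python
import numpy as np

def is_adversarial(x):
    """Stub for the adversarial sample classifier module (201)."""
    return True  # always "adversarial" for illustration

def random_freq_mask(x):
    """Stub for the random frequency domain masking module (202)."""
    return x  # identity stand-in; the real module masks DCT coefficients

def super_resolve(x, scale=2):
    """Stub for the embedded real super-resolution model (203)."""
    return np.kron(x, np.ones((scale, scale, 1)))  # nearest-neighbour upscale

def pipeline(x):
    """Route an input through the three modules described above."""
    if is_adversarial(x):          # module 201: detect adversarial samples
        x = random_freq_mask(x)    # module 202: purify in the frequency domain
    return super_resolve(x)        # module 203: robust super-resolution

x = np.random.default_rng(5).random((8, 8, 3))
y = pipeline(x)
```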
In an embodiment of the application, the adversarial sample classifier module is configured to receive total image data X ∈ R^(H×W×C), where H is the height of the initial image, W is the width, and C is the number of channels; convert the image data into the frequency domain to obtain fourth image data X_F ∈ R^(H×W×C); for the sub-image data of each channel X_F^(c) ∈ R^(H×W), perform mean pooling and flattening operations to generate one-dimensional tensor data; and input the one-dimensional tensor data into a two-class network to obtain a classification result. The classification result is either True or False: when it is True, the image data is image data of an adversarial sample; when it is False, the image data is image data of a normal sample. When the image data is image data of an adversarial sample, it is sent to the random frequency domain masking module.
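The classifier front end (channel-wise DCT, mean pooling, flattening to a one-dimensional tensor) can be sketched as below. The 4×4 pooling window is an illustrative assumption; the patent does not fix one:

```python
import numpy as np
from scipy.fft import dctn

def phi(x, pool=4):
    """DCT each channel, mean-pool over pool x pool blocks, flatten to 1-D.

    x: H x W x C image; H and W are assumed divisible by `pool`
    (an illustrative window size).
    """
    h, w, c = x.shape
    feats = []
    for k in range(c):
        f = dctn(x[:, :, k], norm="ortho")  # channel-wise frequency map
        f = f.reshape(h // pool, pool, w // pool, pool).mean(axis=(1, 3))  # mean pooling
        feats.append(f.ravel())             # flatten
    return np.concatenate(feats)            # 1-D tensor fed to the two-class network

x = np.random.default_rng(1).random((16, 16, 3))
v = phi(x)
```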
In an embodiment of the application, the random frequency domain masking module is configured to convert the total image data X ∈ R^(H×W×C) into first image data F in the frequency domain via the discrete cosine transform, sample a frequency domain mask map M ∈ R^(H×W), and compute the Hadamard product F′ = F ⊙ M, where ⊙ denotes the Hadamard (element-wise) product, to obtain second image data; the second image data is then converted into third image data in the time domain and sent to the embedded real super-resolution model.
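A sketch of this random frequency-domain masking step under stated assumptions: the diagonal cutoff defining "high-frequency" and the sampling of a 0/1 mask are illustrative choices, not specified by the patent:

```python
import numpy as np
from scipy.fft import dctn, idctn

def random_freq_mask(x, keep_ratio=0.25, rng=None):
    """Random frequency-domain masking (sketch).

    DCT the image, zero out a random subset of high-frequency
    coefficients via a Hadamard product with a sampled 0/1 mask M,
    then inverse-DCT back to the time (spatial) domain.
    `keep_ratio` and the low-frequency cutoff are illustrative.
    """
    rng = np.random.default_rng(rng)
    h, w, c = x.shape
    u, v = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    high = (u + v) >= int(keep_ratio * (h + w))       # crude "high frequency" region
    mask = np.ones((h, w))
    mask[high] = rng.integers(0, 2, size=high.sum())  # randomly drop high frequencies
    out = np.empty_like(x)
    for k in range(c):
        f = dctn(x[:, :, k], norm="ortho")
        out[:, :, k] = idctn(f * mask, norm="ortho")  # F' = F ⊙ M, back to spatial domain
    return out

x = np.random.default_rng(2).random((16, 16, 3))
y = random_freq_mask(x, rng=3)
```

Low-frequency content (e.g. the DC coefficient) passes through unchanged, which is what preserves the super-resolution quality on clean images.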
In an embodiment of the present application, the random frequency domain masking module is configured to input the one-dimensional tensor data into a two-class network to obtain the classification result through the following formula:

G(X) = δ(W_3 σ(W_2 σ(W_1 φ(X) + b_1) + b_2) + b_3)

where G(X) is the classification result, φ(X) denotes the channel-averaging, mean-pooling, and flattening functions, σ denotes the LeakyReLU activation function, δ is the Softmax activation function, and W_1, W_2, W_3, b_1, b_2 and b_3 are preset learnable parameters.
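Reading the formula above as G(X) = δ(W_3 σ(W_2 σ(W_1 φ(X) + b_1) + b_2) + b_3), the two-class network can be sketched with random (untrained) parameters; the layer widths are illustrative assumptions:

```python
import numpy as np

def leaky_relu(z, a=0.01):
    """sigma: the LeakyReLU activation."""
    return np.where(z > 0, z, a * z)

def softmax(z):
    """delta: the Softmax activation."""
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(feat, params):
    """G(X) = delta(W3 sigma(W2 sigma(W1 phi(X) + b1) + b2) + b3)."""
    W1, b1, W2, b2, W3, b3 = params
    h = leaky_relu(W1 @ feat + b1)
    h = leaky_relu(W2 @ h + b2)
    return softmax(W3 @ h + b3)   # 2-way: [p_adversarial, p_clean]

rng = np.random.default_rng(4)
d, h1, h2 = 48, 16, 8             # illustrative layer widths
params = (rng.normal(size=(h1, d)), np.zeros(h1),
          rng.normal(size=(h2, h1)), np.zeros(h2),
          rng.normal(size=(2, h2)), np.zeros(2))
p = classify(rng.normal(size=d), params)
```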
In summary, compared with the prior art, this method analyzes the influence of adversarial samples on image super-resolution models from both the frequency-domain and the feature perspectives. We first observed that adversarial noise grows stronger as network depth increases, so we propose removing its ill-conditioned influence at the feature level. Second, a randomness strategy is introduced into the masking process, and experiments show that this strategy greatly improves the robustness of the method; the designed frequency domain masking module is embedded into a real super-resolution network, which both removes the influence of adversarial noise and preserves the super-resolution quality of the image.
In addition, a robust adversarial sample classifier is designed that accurately distinguishes adversarial samples from normal samples; this design preserves the super-resolution performance of the model on clean samples. Compared with other defense methods, the proposed method achieves near-optimal performance on clean samples, and when facing adversarial samples it defends successfully without generating the structured, implausible textures produced by other super-resolution models.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor when executing the computer program implements the steps of the above method for super-resolution robustness of real images based on frequency domain.
A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the above method for super-resolution robustness of a real image based on frequency domain.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one of skill in the art, embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "include", "including" or any other variations thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device including a series of elements includes not only those elements but also other elements not explicitly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or terminal apparatus that comprises the element.
The method and device for frequency-domain-based real image super-resolution robustness provided by the application have been introduced in detail above. Specific examples are used herein to explain the principle and implementation of the application, and the description of the above embodiments is only intended to help understand the method and its core idea. Meanwhile, for a person skilled in the art, the specific implementation and the application scope may vary according to the idea of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (8)

1. A real image super-resolution robust method based on a frequency domain is characterized by comprising the following steps:
receiving total image data, judging whether the total image data is image data of an adversarial sample, and sending the total image data to a random frequency domain masking module when it is;

in the random frequency domain masking module, converting the total image data into first image data in a frequency domain, obtaining second image data by masking a plurality of high-frequency components in the first image data, converting the second image data into third image data in a time domain, and sending the third image data to a real super-resolution model, wherein the random frequency domain masking module is embedded into the real super-resolution model; specifically:

converting the total image data X ∈ R^(H×W×C) into first image data F ∈ R^(H×W×C) in the frequency domain based on a discrete cosine transform, and sampling a frequency domain mask map M ∈ R^(H×W), where H is the height of the total image data, W is the width of the total image data, and C is the number of channels of the total image data; based on the Hadamard product algorithm F′ = F ⊙ M, obtaining second image data F′, where ⊙ denotes the Hadamard product; converting the second image data into third image data in the time domain, and sending the third image data to the real super-resolution model;

generating an image corresponding to the total image data with clearer details and better fidelity based on the third image data.
2. The real image super-resolution robust method according to claim 1, wherein the receiving total image data, judging whether the total image data is image data of an adversarial sample, and sending the total image data to a random frequency domain masking module when it is, comprises:

receiving total image data X ∈ R^(H×W×C); converting the total image data into the frequency domain to obtain fourth image data X_F ∈ R^(H×W×C); for the sub-image data of each channel X_F^(c) ∈ R^(H×W), performing mean pooling and flattening operations on the sub-image data to generate one-dimensional tensor data; inputting the one-dimensional tensor data into a two-class network to obtain a classification result, the classification result comprising True and False; when the classification result is True, the total image data is image data of an adversarial sample, and when the classification result is False, the total image data is image data of a normal sample; and when the total image data is image data of an adversarial sample, sending the total image data and its intermediate feature data to the random frequency domain masking module.
3. The real image super-resolution robust method according to claim 2, wherein inputting the one-dimensional tensor data into the two-class network to obtain the classification result is realized by the following formula:

G(X) = δ(W_3 σ(W_2 σ(W_1 φ(X) + b_1) + b_2) + b_3)

where G(X) is the classification result, φ(X) denotes the channel-averaging, mean-pooling, and flattening functions, σ denotes the LeakyReLU activation function, δ is the Softmax activation function, and W_1, W_2, W_3, b_1, b_2 and b_3 are preset learnable parameters.
4. A real image super-resolution robust apparatus based on a frequency domain, the apparatus comprising:

an adversarial sample classifier module, configured to receive total image data, judge whether the total image data is image data of an adversarial sample, and send the total image data to a random frequency domain masking module when it is;

the random frequency domain masking module, configured to convert the total image data into first image data in a frequency domain, obtain second image data by masking a plurality of high-frequency components in the first image data, convert the second image data into third image data in a time domain, and send the third image data to a real super-resolution model, wherein the random frequency domain masking module is embedded into the real super-resolution model; specifically:

the total image data X ∈ R^(H×W×C) is converted into first image data F ∈ R^(H×W×C) in the frequency domain based on a discrete cosine transform, and a frequency domain mask map M ∈ R^(H×W) is sampled, where H is the height of the total image data, W is the width of the total image data, and C is the number of channels of the total image data; based on the Hadamard product algorithm F′ = F ⊙ M, second image data F′ is obtained, where ⊙ denotes the Hadamard product; the second image data is converted into third image data in the time domain and sent to the real super-resolution model; and

a robust real image super-resolution network module, configured to generate an image corresponding to the total image data with clearer details and better fidelity based on the third image data.
5. The apparatus according to claim 4, wherein the adversarial sample classifier module is configured to: receive total image data X ∈ R^(H×W×C); convert the total image data into the frequency domain to obtain fourth image data X_F ∈ R^(H×W×C); for the sub-image data of each channel X_F^(c) ∈ R^(H×W), perform mean pooling and flattening operations on the sub-image data to generate one-dimensional tensor data; input the one-dimensional tensor data into a two-class network to obtain a classification result, the classification result comprising True and False; when the classification result is True, the total image data is image data of an adversarial sample, and when the classification result is False, the total image data is image data of a normal sample; and when the total image data is image data of an adversarial sample, send the total image data and its intermediate feature data to the random frequency domain masking module.
6. The apparatus according to claim 5, wherein the random frequency domain masking module is configured to input the one-dimensional tensor data into a two-class network to obtain the classification result by the following formula:

G(X) = δ(W_3 σ(W_2 σ(W_1 φ(X) + b_1) + b_2) + b_3)

where G(X) is the classification result, φ(X) denotes the channel-averaging, mean-pooling, and flattening functions, σ denotes the LeakyReLU activation function, δ is the Softmax activation function, and W_1, W_2, W_3, b_1, b_2 and b_3 are preset learnable parameters.
7. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 3.
8. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 3.
CN202110728827.4A 2021-06-29 2021-06-29 Real image super-resolution robust method and device based on frequency domain Active CN113436073B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110728827.4A CN113436073B (en) 2021-06-29 2021-06-29 Real image super-resolution robust method and device based on frequency domain

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110728827.4A CN113436073B (en) 2021-06-29 2021-06-29 Real image super-resolution robust method and device based on frequency domain

Publications (2)

Publication Number Publication Date
CN113436073A CN113436073A (en) 2021-09-24
CN113436073B true CN113436073B (en) 2023-04-07

Family

ID=77757686

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110728827.4A Active CN113436073B (en) 2021-06-29 2021-06-29 Real image super-resolution robust method and device based on frequency domain

Country Status (1)

Country Link
CN (1) CN113436073B (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019222401A2 (en) * 2018-05-17 2019-11-21 Magic Leap, Inc. Gradient adversarial training of neural networks
CN110473142B (en) * 2019-05-22 2022-09-27 南京理工大学 Single image super-resolution reconstruction method based on deep learning
CN112633306B (en) * 2019-09-24 2023-09-22 杭州海康威视数字技术股份有限公司 Method and device for generating countermeasure image
CN111915486B (en) * 2020-07-30 2022-04-22 西华大学 Confrontation sample defense method based on image super-resolution reconstruction
CN111950635B (en) * 2020-08-12 2023-08-25 温州大学 Robust feature learning method based on layered feature alignment
CN112464230B (en) * 2020-11-16 2022-05-17 电子科技大学 Black box attack type defense system and method based on neural network intermediate layer regularization
CN112686249B (en) * 2020-12-22 2022-01-25 中国人民解放军战略支援部队信息工程大学 Grad-CAM attack method based on anti-patch

Also Published As

Publication number Publication date
CN113436073A (en) 2021-09-24


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant