CN113436073B - Real image super-resolution robust method and device based on frequency domain - Google Patents

Real image super-resolution robust method and device based on frequency domain

Info

Publication number
CN113436073B
Authority
CN
China
Prior art keywords
image data
frequency domain
total
super
total image
Prior art date
Legal status
Active
Application number
CN202110728827.4A
Other languages
Chinese (zh)
Other versions
CN113436073A (en)
Inventor
李冠彬
岳九涛
魏朋旭
林倞
Current Assignee
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN202110728827.4A priority Critical patent/CN113436073B/en
Publication of CN113436073A publication Critical patent/CN113436073A/en
Application granted granted Critical
Publication of CN113436073B publication Critical patent/CN113436073B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformation in the plane of the image
    • G06T3/40 - Scaling the whole image or part thereof
    • G06T3/4053 - Super resolution, i.e. output image resolution higher than sensor resolution
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/047 - Probabilistic or stochastic networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

An embodiment of the present application provides a frequency-domain-based robust method and device for real image super-resolution. The method comprises the following steps: receiving total image data, judging whether the total image data is image data of an adversarial sample, and sending the total image data to a random frequency-domain masking module when it is; converting the total image data into first image data in the frequency domain, masking out a plurality of high-frequency components in the first image data to obtain second image data, converting the second image data into third image data in the time domain, and sending the third image data to a real super-resolution model; and generating, based on the third image data, an image corresponding to the total image data with clearer details and better fidelity.

Description

Real image super-resolution robust method and device based on frequency domain
Technical Field
The present application relates to the field of image processing, and in particular, to a frequency-domain-based robust method and apparatus for real image super-resolution.
Background
Single image super-resolution is used to recover, from a degraded low-resolution image, a high-resolution image with clearer details and better fidelity. At present, single image super-resolution is commonly performed with deep convolutional neural networks. Such networks require training on a large number of paired samples consisting of low-resolution input images and their high-quality counterparts, and the training of deep neural networks is easily threatened by synthetic adversarial attacks.
Existing methods for resisting adversarial attacks can be broadly classified into three types. First, adversarial training, which makes the model learn the mapping from low-resolution images carrying adversarial noise to high-resolution images. This is a common defense in the white-box setting, but it cannot defend successfully against strong gradient-based multi-step iterative attackers. Second, gradient-masking-style approaches, which obfuscate the back-propagated gradient during adversarial sample generation, or use non-differentiable designs so that adversarial noise cannot be added to the clean image; for example, randomly cropping the image. However, all methods based on gradient obfuscation and masking have been proven to be completely defeated by the BPDA attack mode (i.e., in a white-box setting, back-propagation deliberately bypasses the gradient-obfuscating module, or the gradient of the non-differentiable layer is estimated and back-propagated onto the adversarial sample). Third, denoising-based methods, which can be roughly divided into denoising the model's input at the low-resolution image end, and reducing the influence of adversarial perturbations on the model's output through denoising operations at the feature level. Image-end denoising includes image preprocessing and JPEG preprocessing based on image compression models; feature-level denoising includes Non-Local-based denoising modules and denoising methods based on mean filtering and bilateral filtering. Image-preprocessing-based methods have likewise been proven vulnerable to BPDA attacks, and although feature-denoising modules combined with adversarial training achieve a certain effect in image classification models, they have not shown a good defense effect in the field of real image super-resolution.
In addition, these defense methods all degrade, to varying degrees, the model's super-resolution results on clean samples: the results become blurry, and the sharp details and textures the model could previously generate are visually lost.
Disclosure of Invention
The embodiments of the present application provide a frequency-domain-based robust method and device for real image super-resolution, aiming to solve the problems in the prior art that defenses affect the model's super-resolution results on clean samples, making those results blurry and preventing the sharp details and textures the model could previously generate. The method comprises the following steps:
a real image super-resolution robust method based on frequency domain, the method comprising:
receiving total image data, judging whether the total image data is image data of an adversarial sample, and sending the total image data or image feature data to a random frequency-domain masking module when it is;
converting the total image data or the image feature data into first image data in the frequency domain, masking out a plurality of high-frequency components in the first image data to obtain second image data, converting the second image data into third image data in the time domain, and sending the third image data to a real super-resolution model, the frequency-domain masking module being embedded in the super-resolution model;
generating, based on the third image data, an image corresponding to the total image data with clearer details and better fidelity.
Optionally, the receiving total image data, judging whether the total image data is image data of an adversarial sample, and sending the total image data to a random frequency-domain masking module when it is, includes:
the adversarial sample classifier is used to receive total image data X ∈ R^(H×W×C), where H is the height, W the width, and C the number of channels of the initial image. The image data is converted into the frequency domain to obtain fourth image data X̂ = F(X) ∈ R^(H×W×C). The sub-image data X̂_c ∈ R^(H×W) of each channel undergoes mean pooling and flattening to generate one-dimensional tensor data, which is input into a binary classification network to obtain a classification result in {True, False}. When the classification result is True, the image data is image data of an adversarial sample; when it is False, the image data is image data of a normal sample. When the image data is that of an adversarial sample, the image data and its intermediate feature data are sent to the random frequency-domain masking module.
Optionally, the converting the total image data into first image data in the frequency domain, masking out a plurality of high-frequency components in the first image data to obtain second image data, converting the second image data into third image data in the time domain, and sending the third image data to the embedded real super-resolution model includes:
the random frequency-domain masking module converts the total image data or the intermediate feature data X ∈ R^(H×W×C) into first image data in the frequency domain by the discrete cosine transform, samples a frequency-domain mask map M ∈ R^(H×W) for each channel, and applies the Hadamard product

X_m = F^(-1)(F(X) ⊙ M)

where ⊙ denotes the Hadamard product, to obtain second image data. The second image data is converted into third image data in the time domain, and the third image data is sent to the real super-resolution model.
Optionally, the inputting the one-dimensional tensor data into a binary classification network to obtain a classification result is implemented by the following formula:

G(X) = Softmax(W3·σ(W2·σ(W1·φ(X) + b1) + b2) + b3)

where G(X) is the classification result, φ(X) denotes the channel-averaging, mean-pooling, and flattening functions, σ denotes the LeakyReLU activation function, Softmax is the output activation function, and W1, W2, W3, b1, b2 and b3 are preset learnable parameters.
A real image super-resolution robust apparatus based on a frequency domain, the apparatus comprising:
the adversarial sample classifier module, configured to receive total image data, judge whether the total image data is image data of an adversarial sample, and send the total image data or image feature data to the random frequency-domain masking module when it is;
the random frequency-domain masking module, configured to convert the total image data or the image feature data into first image data in the frequency domain, mask out a plurality of high-frequency components in the first image data to obtain second image data, convert the second image data into third image data in the time domain, and send the third image data to a real super-resolution model, the frequency-domain masking module being embedded in the super-resolution model;
a robust real image super resolution network module for generating an image with clearer details and better fidelity corresponding to the total image data based on the third image data.
Optionally, the adversarial sample classifier module is configured to receive total image data X ∈ R^(H×W×C), where H is the height, W the width, and C the number of channels of the initial image. The image data is converted into the frequency domain to obtain fourth image data X̂ = F(X) ∈ R^(H×W×C). The sub-image data X̂_c ∈ R^(H×W) of each channel undergoes mean pooling and flattening to generate one-dimensional tensor data, which is input into a binary classification network to obtain a classification result in {True, False}. When the classification result is True, the image data is image data of an adversarial sample; when it is False, the image data is image data of a normal sample. When the image data is that of an adversarial sample, the image data and its intermediate feature data are sent to the random frequency-domain masking module.
Optionally, the random frequency-domain masking module is configured to convert the total image data or the intermediate feature data X ∈ R^(H×W×C) into first image data in the frequency domain by the discrete cosine transform, sample a frequency-domain mask map M ∈ R^(H×W) for each channel, and apply the Hadamard product

X_m = F^(-1)(F(X) ⊙ M)

where ⊙ denotes the Hadamard product, to obtain second image data; convert the second image data into third image data in the time domain; and send the third image data to the embedded real super-resolution model.
Optionally, the adversarial sample classifier module is configured to input the one-dimensional tensor data into a binary classification network to obtain a classification result by the following formula:

G(X) = Softmax(W3·σ(W2·σ(W1·φ(X) + b1) + b2) + b3)

where G(X) is the classification result, φ(X) denotes the channel-averaging, mean-pooling, and flattening functions, σ denotes the LeakyReLU activation function, Softmax is the output activation function, and W1, W2, W3, b1, b2 and b3 are preset learnable parameters.
A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor when executing the computer program performs the steps of the method as claimed in any one of the preceding claims.
A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of the preceding claims.
The application has the following advantages:
In the present application, total image data is received and judged as to whether it is image data of an adversarial sample; when it is, the total image data is sent to a random frequency-domain masking module. The total image data is converted into first image data in the frequency domain, a plurality of high-frequency components in the first image data are masked out to obtain second image data, the second image data is converted into third image data in the time domain, and the third image data is sent to a real super-resolution model; based on the third image data, an image corresponding to the total image data with clearer details and better fidelity is generated. The method accurately discriminates adversarial samples from normal samples, and the design of the classifier preserves the model's super-resolution performance on clean samples. Compared with other defense methods, this method obtains near-optimal performance on clean samples, and when facing adversarial samples it defends successfully and does not generate the unreasonable structured textures that other super-resolution models do. The influence of adversarial samples on the image super-resolution model is analyzed from the frequency-domain and feature perspectives. We first found that adversarial noise becomes stronger as the network grows deeper, so we propose removing its adverse influence at the feature level. Second, a randomness strategy is introduced into the masking process, and experiments prove that it greatly improves the robustness of the method.
Drawings
In order to more clearly illustrate the technical solutions of the present application, the drawings needed in the description are briefly introduced below. The drawings in the following description show only some embodiments of the present application; those skilled in the art can derive other drawings from them without inventive effort.
Fig. 1 is a flowchart illustrating steps of a real image super-resolution robust method based on a frequency domain according to an embodiment of the present application;
fig. 2 is a block diagram of a structure of a real image super-resolution robust apparatus based on a frequency domain according to an embodiment of the present application.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description. It should be apparent that the embodiments described are some, but not all embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
Referring to fig. 1, a flowchart illustrating steps of a real image super-resolution robust method based on a frequency domain according to an embodiment of the present application is shown, which may specifically include the following steps:
Step 101, receiving total image data, judging whether the total image data is image data of an adversarial sample, and sending the total image data to an embedded random frequency-domain masking module when it is;
in an embodiment of the present application, the step 101 includes:
the adversarial sample classifier is used to receive total image data X ∈ R^(H×W×C), where H is the height, W the width, and C the number of channels of the initial image. The image data is converted into the frequency domain to obtain fourth image data X̂ = F(X) ∈ R^(H×W×C). The sub-image data X̂_c ∈ R^(H×W) of each channel undergoes mean pooling and flattening to generate one-dimensional tensor data, which is input into a binary classification network to obtain a classification result in {True, False}. When the classification result is True, the image data is image data of an adversarial sample; when it is False, the image data is image data of a normal sample. When the image data is that of an adversarial sample, it is sent to the embedded random frequency-domain masking module.
In a specific implementation, the random frequency-domain masking module improves the robustness of the model and defends successfully against adversarial attackers. However, because the boundary between adversarial noise and the high-frequency detail components of the image is blurry, the masking operation also masks high-frequency details of the image and those generated by super-resolution, which reduces the model's super-resolution performance on clean samples.
Based on the experimental results and the difference in frequency-domain distributions between adversarial samples and normal samples, a robust adversarial sample classifier is designed so that it can accurately distinguish the two. Based on the classifier's result, we decide whether to activate the random frequency-domain masking module that follows it: if the classifier outputs False, the current sample is a normal sample and the subsequent random frequency-domain masking module is skipped; otherwise, it is activated.
The classifier takes an image as input, represented as X ∈ R^(H×W×C), which may be either an adversarial sample or a clean sample. For X, we first convert it to the frequency domain as X̂ = F(X), then average it over the channel dimension to obtain X̄ ∈ R^(H×W). We then perform a mean-pooling and flattening operation to get a one-dimensional tensor and input it into a binary classification network to obtain the result G(·) ∈ {True, False}:

G(X) = Softmax(W3·σ(W2·σ(W1·φ(X) + b1) + b2) + b3)

where φ(X) denotes the channel-averaging, mean-pooling, and flattening operations, σ denotes the LeakyReLU activation function, and Softmax is the output activation function. W_i and b_i denote learnable parameters. The design of the classifier enables the model to generate sharp details and textures on normal samples, retaining its excellent super-resolution performance.
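The classifier forward pass described above can be sketched as follows. This is a minimal NumPy/SciPy sketch under stated assumptions: taking the DCT magnitude before averaging, the 4×4 pooling window, the hidden width, and the random (untrained) parameters are illustrative choices not specified in the text.

```python
import numpy as np
from scipy.fft import dctn

def leaky_relu(x, slope=0.01):
    # sigma: LeakyReLU activation
    return np.where(x >= 0, x, slope * x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def phi(x, pool=4):
    """phi(X): frequency transform, channel averaging, mean pooling, flattening."""
    xf = dctn(x, axes=(0, 1), norm="ortho")   # per-channel DCT (frequency domain)
    xbar = np.abs(xf).mean(axis=2)            # channel average -> (H, W); |.| is an assumption
    h, w = xbar.shape
    # mean pooling over pool x pool windows (assumes H, W divisible by pool)
    pooled = xbar.reshape(h // pool, pool, w // pool, pool).mean(axis=(1, 3))
    return pooled.ravel()                     # one-dimensional tensor

def classify(x, params):
    """G(X) = Softmax(W3 sigma(W2 sigma(W1 phi(X) + b1) + b2) + b3)."""
    W1, b1, W2, b2, W3, b3 = params
    h = leaky_relu(W1 @ phi(x) + b1)
    h = leaky_relu(W2 @ h + b2)
    p = softmax(W3 @ h + b3)                  # two-class probabilities
    return bool(p[1] > p[0])                  # True -> adversarial sample

# Toy usage with random, untrained parameters (shapes are illustrative).
rng = np.random.default_rng(0)
H, W, C, hidden = 32, 32, 3, 16
d = (H // 4) * (W // 4)
params = (rng.normal(size=(hidden, d)), np.zeros(hidden),
          rng.normal(size=(hidden, hidden)), np.zeros(hidden),
          rng.normal(size=(2, hidden)), np.zeros(2))
is_adv = classify(rng.random((H, W, C)), params)
```

In practice the parameters would be trained on labeled pairs of clean and adversarial samples; the sketch only shows the data flow of φ(X) and the two-layer-plus-Softmax head.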
Step 102, converting the total image data or intermediate feature data into first image data in the frequency domain, masking out a plurality of high-frequency components in the first image data to obtain second image data, converting the second image data into third image data in the time domain, and sending the third image data to an embedded real super-resolution model;
in an embodiment of the present application, the step 102 includes:
Step S11, the random frequency-domain masking module converts the total image data X ∈ R^(H×W×C) into first image data in the frequency domain by the discrete cosine transform, samples a frequency-domain mask map M ∈ R^(H×W) for each channel, and applies the Hadamard product

X_m = F^(-1)(F(X) ⊙ M)

where ⊙ denotes the Hadamard product, to obtain second image data. The second image data is converted into third image data in the time domain, and the third image data is sent to the embedded real super-resolution model.
In an embodiment of the present application, the classification result used in step S11 is obtained by the following formula:

G(X) = Softmax(W3·σ(W2·σ(W1·φ(X) + b1) + b2) + b3)

where G(X) is the classification result, φ(X) denotes the channel-averaging, mean-pooling, and flattening functions, σ denotes the LeakyReLU activation function, Softmax is the output activation function, and W1, W2, W3, b1, b2 and b3 are preset learnable parameters.
The frequency-domain masking module takes images and features as input, represented as an H×W×C tensor, where H, W, and C denote the height, width, and number of channels of the feature map. Let X ∈ R^(H×W) be one channel tensor of the input. We transform it to the frequency domain using the DCT (discrete cosine transform). The DCT frequency domain of X can be represented as X̂ = F(X) ∈ R^(H×W), where X̂(u, v) can be calculated according to the formula:

X̂(u, v) = c(u)·c(v) · Σ_{i=0..H-1} Σ_{j=0..W-1} X(i, j) · cos[(2i+1)uπ/(2H)] · cos[(2j+1)vπ/(2W)]

where c(u) and c(v) are compensation factors: when u = 0, c(u) = √(1/H); when u ≠ 0, c(u) = √(2/H) (and analogously for c(v), with W in place of H).
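As a sanity check on the formula above, a direct evaluation of the 2-D DCT-II with the stated compensation factors agrees with SciPy's orthonormal `dctn`, and the transform is exactly invertible by the I-DCT; this sketch is for illustration only:

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct2_naive(X):
    """Direct evaluation of the 2-D DCT-II with compensation factors c(u), c(v)."""
    H, W = X.shape
    out = np.zeros((H, W))
    i = np.arange(H)[:, None]
    j = np.arange(W)[None, :]
    for u in range(H):
        cu = np.sqrt(1.0 / H) if u == 0 else np.sqrt(2.0 / H)
        for v in range(W):
            cv = np.sqrt(1.0 / W) if v == 0 else np.sqrt(2.0 / W)
            basis = np.cos((2 * i + 1) * u * np.pi / (2 * H)) * \
                    np.cos((2 * j + 1) * v * np.pi / (2 * W))
            out[u, v] = cu * cv * (X * basis).sum()
    return out

rng = np.random.default_rng(0)
X = rng.random((8, 8))
# matches the orthonormal DCT-II of scipy.fft
assert np.allclose(dct2_naive(X), dctn(X, norm="ortho"), atol=1e-10)
# the transform is invertible, so the I-DCT recovers the input exactly
assert np.allclose(idctn(dctn(X, norm="ortho"), norm="ortho"), X, atol=1e-10)
```

The orthonormal normalization (`norm="ortho"`) is what makes the round trip F⁻¹(F(X)) = X exact, which the masking pipeline below relies on.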
We average the H×W×C tensors over channel C and visualize them. The visualization results are shown in fig. 1: the upper row represents the feature frequency-domain visualization of a clean sample, and the lower row corresponds to the adversarial sample. The visualization shows that adversarial noise is mostly encoded in the high-frequency components of the frequency domain (the red components at the upper-right, lower-left, and lower-right corners of the frequency-domain map). We believe that adversarial noise encoded as high-frequency components severely affects the super-resolution results, so we propose randomly masking some high-frequency components in the frequency domain to remove its influence. According to the algorithm we propose, we randomly sample a frequency-domain mask map M ∈ R^(H×W) and multiply:

X̂_m = X̂ ⊙ M

Here ⊙ indicates the Hadamard product (element-wise multiplication). Next, the frequency-domain map after multiplication is converted into the time-domain space by the I-DCT. The process of frequency-domain masking can be summarized by the formula:

X_m = F^(-1)(F(X) ⊙ M)

where F(X) denotes the DCT and F^(-1) the I-DCT.
For the sampling method of the frequency-domain mask map M, we first calculate the normalized distance of each component (u, v) from the (0, 0) component:

r_(u,v) = d_(u,v) / r_max, where d_(u,v) = √(u² + v²) and r_max = √(H² + W²)

The larger r_(u,v) is, the higher the confidence that the current component contains adversarial noise. Since the content and texture of the image are mostly stored in its low-frequency components, we randomly sample a protection radius r_t ∈ [r_l, r_u] in order to protect the low-frequency components from being masked: components within the radius are not masked. For components outside the radius, a probability value based on the Bernoulli distribution and r_(u,v) determines whether to perform the masking operation. Bernoulli sampling can be expressed as M(u, v) = Bernoulli(p = r_(u,v)): for (u, v) in the mask map M, it is sampled to 1 with probability p = r_(u,v) and to 0 with probability 1 − r_(u,v). So M can be defined as:

M(u, v) = 1, if r_(u,v) ≤ r_t; Bernoulli(p = r_(u,v)), otherwise.
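Putting the pieces together, the mask sampling and the masking process X_m = F⁻¹(F(X) ⊙ M) can be sketched as below. The bounds r_l = 0.2 and r_u = 0.5 for the protection radius are illustrative assumptions, and the Bernoulli step follows the literal M(u, v) = Bernoulli(p = r_(u,v)) definition stated above.

```python
import numpy as np
from scipy.fft import dctn, idctn

def sample_mask(H, W, r_l=0.2, r_u=0.5, rng=None):
    """Sample the frequency-domain mask map M in R^(H x W)."""
    rng = rng or np.random.default_rng()
    u = np.arange(H)[:, None]
    v = np.arange(W)[None, :]
    # normalized distance r_(u,v) of each component from the (0, 0) component
    r = np.sqrt(u**2 + v**2) / np.sqrt(H**2 + W**2)
    r_t = rng.uniform(r_l, r_u)                  # protection radius r_t in [r_l, r_u]
    M = (rng.random((H, W)) < r).astype(float)   # M(u, v) = Bernoulli(p = r_(u,v))
    M[r <= r_t] = 1.0                            # never mask protected low frequencies
    return M

def freq_mask(X, rng=None):
    """X_m = F^-1(F(X) ⊙ M): DCT, Hadamard product with M, inverse DCT."""
    M = sample_mask(*X.shape, rng=rng)
    return idctn(dctn(X, norm="ortho") * M, norm="ortho")

rng = np.random.default_rng(0)
X = rng.random((16, 16))
X_m = freq_mask(X, rng=rng)
```

Because the (0, 0) component always lies inside the protection radius, the DC coefficient (and hence the overall brightness of the channel) is preserved exactly by the masking.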
The masking module we propose has two advantages. First, it is differentiable, which guarantees that the network remains trainable, and it can be simply embedded anywhere in the network. Second, it has no learnable parameters, making it a lightweight defense module.
Step 103, generating an image with clearer details and better fidelity corresponding to the total image data based on the third image data.
Existing super-resolution methods generally only consider the super-resolution effect on clean samples and do not consider the model's robustness in the face of an attacker. We therefore propose a robust image super-resolution model and further improve its robustness by incorporating adversarial training. Our model uses the current best real super-resolution model, CDC, as its cornerstone. The backbone network of CDC consists of six HourGlass modules. CDC constructs three Component Attentive Blocks (CABs) associated with flat regions, edges, and corners. Each CAB exclusively learns one of the three low-level components through an intermediate supervision strategy, while guiding its HourGlass module to learn the corresponding component mapping function. The outputs of the three CABs are summed to generate the final SR reconstruction. Considering that different image regions have different gradients in different directions, a Gradient-Weighted (GW) loss function is proposed for SR reconstruction: the more complex a region's texture, the greater its corresponding gradient penalty, forcing the network to attend to these complex texture regions.
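The Gradient-Weighted loss is described only qualitatively above; the following toy sketch illustrates the idea under the assumption that per-pixel weights grow with the gradient magnitude of the high-resolution target (the exact weighting used by CDC may differ):

```python
import numpy as np

def gradient_weighted_loss(sr, hr, eps=1e-8):
    """Toy Gradient-Weighted (GW) reconstruction loss: pixels in regions of hr
    with larger gradient magnitude (complex textures) get a larger weight."""
    gy = np.diff(hr, axis=0, append=hr[-1:, :])   # forward-difference gradients
    gx = np.diff(hr, axis=1, append=hr[:, -1:])
    grad_mag = np.sqrt(gx**2 + gy**2)
    w = 1.0 + grad_mag / (grad_mag.max() + eps)   # per-pixel weight in [1, 2]
    return float((w * np.abs(sr - hr)).mean())

rng = np.random.default_rng(0)
hr = rng.random((8, 8))
sr = hr + 0.05 * rng.normal(size=hr.shape)        # a hypothetical SR output
loss = gradient_weighted_loss(sr, hr)
```

Compared with a plain L1 loss, errors in highly textured regions are penalized up to twice as heavily here, which mirrors the stated intent of the GW loss.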
We first place the adversarial sample classifier at the head of the CDC network to identify whether the input sample is clean. Second, we place our random frequency-domain masking module at two places in the network: the first is the head of the entire model, right after the classifier; the second is the head and tail of each HourGlass module. The output of the classifier decides whether to activate the random frequency-domain masking modules that follow.
To further improve the robustness of the model, we incorporate adversarial training in the training process, adopting I-FGSM as the adversarial attacker:

L(X_n, X_0) = ‖f(X_n) − f(X_0)‖_2

X̃_{n+1} = X_n + α · sign(∇_{X_n} L(X_n, X_0))

X_{n+1} = Clip_{(X_0 − α, X_0 + α)}(X̃_{n+1})

where X_0 denotes a clean sample, X_n the adversarial sample at iteration n, L(·) the MSE loss, α the adversarial perturbation strength, ∇_{X_n} the gradient with respect to X_n, Clip_{(a,b)}(X) = min(max(X, a), b), and T the number of iterations. In training, we adopt T = 2 and α = 6/255 for the adversarial sample generator.
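A minimal sketch of the I-FGSM generator follows. The SR network f is replaced by a fixed linear map so the loss gradient is analytic, and the random initialization inside the perturbation budget is an assumption (initialization is not specified above), used here to avoid the zero gradient at X_n = X_0:

```python
import numpy as np

def i_fgsm(x0, grad_loss, alpha=6/255, T=2, rng=None):
    """I-FGSM: T gradient-sign ascent steps on L(X_n, X_0), clipped so the
    perturbation stays within [x0 - alpha, x0 + alpha] and pixels within [0, 1]."""
    rng = rng or np.random.default_rng()
    # random start inside the budget (assumption: the text does not specify this)
    x = np.clip(x0 + rng.uniform(-alpha, alpha, x0.shape), 0.0, 1.0)
    for _ in range(T):
        g = grad_loss(x, x0)                      # gradient of L(x, x0) w.r.t. x
        x = x + alpha * np.sign(g)                # gradient-sign step
        x = np.clip(x, x0 - alpha, x0 + alpha)    # perturbation budget Clip_(X0-a, X0+a)
        x = np.clip(x, 0.0, 1.0)                  # valid pixel range
    return x

# Stand-in "model": f(x) = A x, so for L = ||f(x) - f(x0)||_2^2 the gradient
# is 2 A^T A (x - x0) (analytic, no autograd needed).
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 8))
grad = lambda x, x0: 2.0 * A.T @ (A @ (x - x0))
x0 = rng.random(8)
x_adv = i_fgsm(x0, grad, alpha=6/255, T=2, rng=rng)
```

In real training the gradient would come from back-propagation through the super-resolution network, and the generated X_adv pairs would be mixed into the training batches.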
In conclusion, compared with the prior art, this method analyzes the influence of adversarial samples on the image super-resolution model from the frequency-domain and feature perspectives. We first found that adversarial noise becomes stronger as the network grows deeper, so we propose removing its adverse influence at the feature level. Second, a randomness strategy is introduced into the masking process, and experiments prove that it greatly improves the robustness of the method.
In addition, a robust adversarial sample classifier is designed that can accurately distinguish adversarial samples from normal samples, and its design preserves the model's super-resolution performance on clean samples. Compared with other defense methods, this method obtains near-optimal performance on clean samples, and when facing adversarial samples it defends successfully and does not generate the unreasonable structured textures that other super-resolution models do.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the embodiments are not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the embodiments. Further, those skilled in the art will also appreciate that the embodiments described in the specification are presently preferred and that no particular act is required of the embodiments of the application.
Referring to fig. 2, a structural block diagram of a real image super-resolution robust apparatus based on a frequency domain according to an embodiment of the present application is shown, which may specifically include the following modules:
an adversarial sample classifier module 201, configured to receive total image data, determine whether the total image data is image data of an adversarial sample, and, when it is, send the total image data and its intermediate feature data to the random frequency domain masking module;
a random frequency domain masking module 202, configured to convert the total image data into first image data in a frequency domain, mask out multiple high-frequency components in the first image data, obtain second image data, convert the second image data into third image data in a time domain, and send the third image data to the embedded real super-resolution model;
a robust real image super-resolution network module 203, configured to generate, from the third image data, an image corresponding to the total image data with clearer details and better fidelity.
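The routing among the three modules above can be sketched as follows. All three module bodies here are illustrative stubs: the classifier always answers True, masking is the identity, and nearest-neighbour upscaling stands in for the real super-resolution network:

```python
import numpy as np

def is_adversarial(x):
    """Stub for the adversarial sample classifier module (201)."""
    return True  # always "adversarial" for illustration

def random_freq_mask(x):
    """Stub for the random frequency domain masking module (202)."""
    return x  # identity stand-in; the real module masks DCT coefficients

def super_resolve(x, scale=2):
    """Stub for the embedded real super-resolution model (203)."""
    return np.kron(x, np.ones((scale, scale, 1)))  # nearest-neighbour upscale

def pipeline(x):
    """Route an input through the three modules described above."""
    if is_adversarial(x):          # module 201: detect adversarial samples
        x = random_freq_mask(x)    # module 202: purify in the frequency domain
    return super_resolve(x)        # module 203: robust super-resolution

x = np.random.default_rng(5).random((8, 8, 3))
y = pipeline(x)
```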
In an embodiment of the application, the adversarial sample classifier module is configured to receive total image data X ∈ R^(H×W×C), where H is the height of the initial image, W is the width, and C is the number of channels; convert the image data into the frequency domain to obtain fourth image data X_F ∈ R^(H×W×C); for the sub-image data of each channel X_F^(c) ∈ R^(H×W), perform mean pooling and flattening operations to generate one-dimensional tensor data; and input the one-dimensional tensor data into a two-class network to obtain a classification result. The classification result is either True or False: when it is True, the image data is image data of an adversarial sample; when it is False, the image data is image data of a normal sample. When the image data is image data of an adversarial sample, it is sent to the random frequency domain masking module.
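The classifier front end (channel-wise DCT, mean pooling, flattening to a one-dimensional tensor) can be sketched as below. The 4×4 pooling window is an illustrative assumption; the patent does not fix one:

```python
import numpy as np
from scipy.fft import dctn

def phi(x, pool=4):
    """DCT each channel, mean-pool over pool x pool blocks, flatten to 1-D.

    x: H x W x C image; H and W are assumed divisible by `pool`
    (an illustrative window size).
    """
    h, w, c = x.shape
    feats = []
    for k in range(c):
        f = dctn(x[:, :, k], norm="ortho")  # channel-wise frequency map
        f = f.reshape(h // pool, pool, w // pool, pool).mean(axis=(1, 3))  # mean pooling
        feats.append(f.ravel())             # flatten
    return np.concatenate(feats)            # 1-D tensor fed to the two-class network

x = np.random.default_rng(1).random((16, 16, 3))
v = phi(x)
```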
In an embodiment of the application, the random frequency domain masking module is configured to convert the total image data X ∈ R^(H×W×C) into first image data F in the frequency domain via the discrete cosine transform, sample a frequency domain mask map M ∈ R^(H×W), and compute the Hadamard product F′ = F ⊙ M, where ⊙ denotes the Hadamard (element-wise) product, to obtain second image data; the second image data is then converted into third image data in the time domain and sent to the embedded real super-resolution model.
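A sketch of this random frequency-domain masking step under stated assumptions: the diagonal cutoff defining "high-frequency" and the sampling of a 0/1 mask are illustrative choices, not specified by the patent:

```python
import numpy as np
from scipy.fft import dctn, idctn

def random_freq_mask(x, keep_ratio=0.25, rng=None):
    """Random frequency-domain masking (sketch).

    DCT the image, zero out a random subset of high-frequency
    coefficients via a Hadamard product with a sampled 0/1 mask M,
    then inverse-DCT back to the time (spatial) domain.
    `keep_ratio` and the low-frequency cutoff are illustrative.
    """
    rng = np.random.default_rng(rng)
    h, w, c = x.shape
    u, v = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    high = (u + v) >= int(keep_ratio * (h + w))       # crude "high frequency" region
    mask = np.ones((h, w))
    mask[high] = rng.integers(0, 2, size=high.sum())  # randomly drop high frequencies
    out = np.empty_like(x)
    for k in range(c):
        f = dctn(x[:, :, k], norm="ortho")
        out[:, :, k] = idctn(f * mask, norm="ortho")  # F' = F ⊙ M, back to spatial domain
    return out

x = np.random.default_rng(2).random((16, 16, 3))
y = random_freq_mask(x, rng=3)
```

Low-frequency content (e.g. the DC coefficient) passes through unchanged, which is what preserves the super-resolution quality on clean images.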
In an embodiment of the present application, the random frequency domain masking module is configured to input the one-dimensional tensor data into a two-class network to obtain the classification result through the following formula:

G(X) = δ(W_3 σ(W_2 σ(W_1 φ(X) + b_1) + b_2) + b_3)

where G(X) is the classification result, φ(X) denotes the channel-averaging, mean-pooling, and flattening functions, σ denotes the LeakyReLU activation function, δ is the Softmax activation function, and W_1, W_2, W_3, b_1, b_2 and b_3 are preset learnable parameters.
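Reading the formula above as G(X) = δ(W_3 σ(W_2 σ(W_1 φ(X) + b_1) + b_2) + b_3), the two-class network can be sketched with random (untrained) parameters; the layer widths are illustrative assumptions:

```python
import numpy as np

def leaky_relu(z, a=0.01):
    """sigma: the LeakyReLU activation."""
    return np.where(z > 0, z, a * z)

def softmax(z):
    """delta: the Softmax activation."""
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(feat, params):
    """G(X) = delta(W3 sigma(W2 sigma(W1 phi(X) + b1) + b2) + b3)."""
    W1, b1, W2, b2, W3, b3 = params
    h = leaky_relu(W1 @ feat + b1)
    h = leaky_relu(W2 @ h + b2)
    return softmax(W3 @ h + b3)   # 2-way: [p_adversarial, p_clean]

rng = np.random.default_rng(4)
d, h1, h2 = 48, 16, 8             # illustrative layer widths
params = (rng.normal(size=(h1, d)), np.zeros(h1),
          rng.normal(size=(h2, h1)), np.zeros(h2),
          rng.normal(size=(2, h2)), np.zeros(2))
p = classify(rng.normal(size=d), params)
```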
In summary, compared with the prior art, this method analyzes the influence of adversarial samples on image super-resolution models from both the frequency-domain and the feature perspectives. We first observed that adversarial noise grows stronger as network depth increases, so we propose removing its ill-conditioned influence at the feature level. Second, a randomness strategy is introduced into the masking process, and experiments show that this strategy greatly improves the robustness of the method; the designed frequency domain masking module is embedded into a real super-resolution network, which both removes the influence of adversarial noise and preserves the super-resolution quality of the image.
In addition, a robust adversarial sample classifier is designed that accurately distinguishes adversarial samples from normal samples; this design preserves the super-resolution performance of the model on clean samples. Compared with other defense methods, the proposed method achieves near-optimal performance on clean samples, and when facing adversarial samples it defends successfully without generating the structured, implausible textures produced by other super-resolution models.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor when executing the computer program implements the steps of the above method for super-resolution robustness of real images based on frequency domain.
A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the above method for super-resolution robustness of a real image based on frequency domain.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one of skill in the art, embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "include", "including" or any other variations thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device including a series of elements includes not only those elements but also other elements not explicitly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or terminal apparatus that comprises the element.
The method and device for frequency-domain-based real image super-resolution robustness provided by the application have been introduced in detail above. Specific examples are used herein to explain the principle and implementation of the application, and the description of the above embodiments is only intended to help understand the method and its core idea. Meanwhile, for a person skilled in the art, the specific implementation and the application scope may vary according to the idea of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (8)

1. A real image super-resolution robust method based on a frequency domain is characterized by comprising the following steps:
receiving total image data, judging whether the total image data is image data of an adversarial sample, and sending the total image data to a random frequency domain masking module when it is;

in the random frequency domain masking module, converting the total image data into first image data in a frequency domain, obtaining second image data by masking a plurality of high-frequency components in the first image data, converting the second image data into third image data in a time domain, and sending the third image data to a real super-resolution model, wherein the random frequency domain masking module is embedded into the real super-resolution model; specifically:

converting the total image data X ∈ R^(H×W×C) into first image data F ∈ R^(H×W×C) in the frequency domain based on a discrete cosine transform, and sampling a frequency domain mask map M ∈ R^(H×W), where H is the height of the total image data, W is the width of the total image data, and C is the number of channels of the total image data; based on the Hadamard product algorithm F′ = F ⊙ M, obtaining second image data F′, where ⊙ denotes the Hadamard product; converting the second image data into third image data in the time domain, and sending the third image data to the real super-resolution model;

generating an image corresponding to the total image data with clearer details and better fidelity based on the third image data.
2. The real image super-resolution robust method according to claim 1, wherein the receiving total image data, judging whether the total image data is image data of an adversarial sample, and sending the total image data to a random frequency domain masking module when it is, comprises:

receiving total image data X ∈ R^(H×W×C); converting the total image data into the frequency domain to obtain fourth image data X_F ∈ R^(H×W×C); for the sub-image data of each channel X_F^(c) ∈ R^(H×W), performing mean pooling and flattening operations on the sub-image data to generate one-dimensional tensor data; inputting the one-dimensional tensor data into a two-class network to obtain a classification result, the classification result comprising True and False; when the classification result is True, the total image data is image data of an adversarial sample, and when the classification result is False, the total image data is image data of a normal sample; and when the total image data is image data of an adversarial sample, sending the total image data and its intermediate feature data to the random frequency domain masking module.
3. The real image super-resolution robust method according to claim 2, wherein inputting the one-dimensional tensor data into the two-class network to obtain the classification result is realized by the following formula:

G(X) = δ(W_3 σ(W_2 σ(W_1 φ(X) + b_1) + b_2) + b_3)

where G(X) is the classification result, φ(X) denotes the channel-averaging, mean-pooling, and flattening functions, σ denotes the LeakyReLU activation function, δ is the Softmax activation function, and W_1, W_2, W_3, b_1, b_2 and b_3 are preset learnable parameters.
4. A real image super-resolution robust apparatus based on a frequency domain, the apparatus comprising:

an adversarial sample classifier module, configured to receive total image data, judge whether the total image data is image data of an adversarial sample, and send the total image data to a random frequency domain masking module when it is;

the random frequency domain masking module, configured to convert the total image data into first image data in a frequency domain, obtain second image data by masking a plurality of high-frequency components in the first image data, convert the second image data into third image data in a time domain, and send the third image data to a real super-resolution model, wherein the random frequency domain masking module is embedded into the real super-resolution model; specifically:

the total image data X ∈ R^(H×W×C) is converted into first image data F ∈ R^(H×W×C) in the frequency domain based on a discrete cosine transform, and a frequency domain mask map M ∈ R^(H×W) is sampled, where H is the height of the total image data, W is the width of the total image data, and C is the number of channels of the total image data; based on the Hadamard product algorithm F′ = F ⊙ M, second image data F′ is obtained, where ⊙ denotes the Hadamard product; the second image data is converted into third image data in the time domain and sent to the real super-resolution model; and

a robust real image super-resolution network module, configured to generate an image corresponding to the total image data with clearer details and better fidelity based on the third image data.
5. The apparatus according to claim 4, wherein the adversarial sample classifier module is configured to: receive total image data X ∈ R^(H×W×C); convert the total image data into the frequency domain to obtain fourth image data X_F ∈ R^(H×W×C); for the sub-image data of each channel X_F^(c) ∈ R^(H×W), perform mean pooling and flattening operations on the sub-image data to generate one-dimensional tensor data; input the one-dimensional tensor data into a two-class network to obtain a classification result, the classification result comprising True and False; when the classification result is True, the total image data is image data of an adversarial sample, and when the classification result is False, the total image data is image data of a normal sample; and when the total image data is image data of an adversarial sample, send the total image data and its intermediate feature data to the random frequency domain masking module.
6. The apparatus according to claim 5, wherein the random frequency domain masking module is configured to input the one-dimensional tensor data into a two-class network to obtain the classification result by the following formula:

G(X) = δ(W_3 σ(W_2 σ(W_1 φ(X) + b_1) + b_2) + b_3)

where G(X) is the classification result, φ(X) denotes the channel-averaging, mean-pooling, and flattening functions, σ denotes the LeakyReLU activation function, δ is the Softmax activation function, and W_1, W_2, W_3, b_1, b_2 and b_3 are preset learnable parameters.
7. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 3.
8. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 3.
CN202110728827.4A 2021-06-29 2021-06-29 Real image super-resolution robust method and device based on frequency domain Active CN113436073B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110728827.4A CN113436073B (en) 2021-06-29 2021-06-29 Real image super-resolution robust method and device based on frequency domain

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110728827.4A CN113436073B (en) 2021-06-29 2021-06-29 Real image super-resolution robust method and device based on frequency domain

Publications (2)

Publication Number Publication Date
CN113436073A CN113436073A (en) 2021-09-24
CN113436073B true CN113436073B (en) 2023-04-07

Family

ID=77757686

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110728827.4A Active CN113436073B (en) 2021-06-29 2021-06-29 Real image super-resolution robust method and device based on frequency domain

Country Status (1)

Country Link
CN (1) CN113436073B (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019222401A2 (en) * 2018-05-17 2019-11-21 Magic Leap, Inc. Gradient adversarial training of neural networks
CN110473142B (en) * 2019-05-22 2022-09-27 南京理工大学 Single image super-resolution reconstruction method based on deep learning
CN112633306B (en) * 2019-09-24 2023-09-22 杭州海康威视数字技术股份有限公司 Method and device for generating countermeasure image
CN111915486B (en) * 2020-07-30 2022-04-22 西华大学 Confrontation sample defense method based on image super-resolution reconstruction
CN111950635B (en) * 2020-08-12 2023-08-25 温州大学 Robust feature learning method based on layered feature alignment
CN112464230B (en) * 2020-11-16 2022-05-17 电子科技大学 Black box attack type defense system and method based on neural network intermediate layer regularization
CN112686249B (en) * 2020-12-22 2022-01-25 中国人民解放军战略支援部队信息工程大学 Grad-CAM attack method based on anti-patch

Also Published As

Publication number Publication date
CN113436073A (en) 2021-09-24


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant