CN114495239A - Forged image detection method and system based on frequency domain information and generative adversarial network - Google Patents


Info

Publication number
CN114495239A
CN114495239A (application number CN202210139691.8A)
Authority
CN
China
Prior art keywords
image
encoder
target
layer
spectrogram
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210139691.8A
Other languages
Chinese (zh)
Inventor
江倩
黄珊珊
刘玲
金鑫
董云云
吴楠
姚绍文
何德芬
程子恩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan University YNU
Original Assignee
Yunnan University YNU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan University (YNU)
Priority to CN202210139691.8A
Publication of CN114495239A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/048: Activation functions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a forged-image detection method and system based on frequency-domain information and a generative adversarial network, relating to the technical fields of image processing and deep learning. The method comprises the following steps: acquiring a target image, the target image being a face image or a natural-scene image; converting the target image from image space to frequency-domain space to obtain a target spectrogram; and inputting the target spectrogram into a forged-image detection model to determine a discrimination result for the target image, the discrimination result being either a real image or a forged image. The network structure of the forged-image detection model is a generative adversarial network comprising a generator and a discriminator, and the discriminator adopts a U-Net structure. The method and system can detect both forged face images and forged natural-scene images.

Description

Forged image detection method and system based on frequency domain information and a generative adversarial network
Technical Field
The invention relates to the technical fields of image processing and deep learning, and in particular to a forged-image detection method and system based on frequency-domain information and a generative adversarial network.
Background
With the development of image-processing and deep-learning technology, the quality of forged images keeps rising while the cost of producing them keeps falling, so the threat they pose to social life and productive activity is increasingly significant.
In recent years, with the popularization of multimedia acquisition tools, and particularly of image-editing software such as Photoshop and Meitu, the creation and propagation of digital images have entered an era of explosive growth. Existing methods for detecting forged images can be roughly divided into two categories: one detects a forged image by using specific clues in the image, such as pixel-level inconsistencies; the other captures forgery traces in an image with a deep-learning model, thereby improving detection performance. In particular, as research on deep learning has continued to deepen, deep convolutional neural networks (DCNNs) have gradually been applied to the field of forged-image detection and show strong potential. However, most existing forged-image detection techniques are limited to forged face images, and methods for detecting forged natural images are scarce. In fact, detecting forged natural-scene images is also crucial: in military reconnaissance, for example, if rivers and mountains are "created" on a topographic map that originally contained none, the planning of a marching route is greatly affected, with potentially serious consequences. It is therefore necessary to research detection methods for forged natural images.
Disclosure of Invention
The invention aims to provide a forged-image detection method and system based on frequency-domain information and a generative adversarial network, capable of detecting not only forged face images but also forged natural-scene images.
In order to achieve the purpose, the invention provides the following scheme:
In a first aspect, the invention provides a forged-image detection method based on frequency-domain information and a generative adversarial network, comprising:
acquiring a target image; the target image is a face image or a natural scene image;
converting the target image from an image space to a frequency domain space to obtain a target spectrogram;
inputting the target spectrogram into a forged-image detection model to determine a discrimination result for the target image; the discrimination result is either a real image or a forged image;
wherein the network structure of the forged-image detection model is a generative adversarial network; the generative adversarial network comprises a generator and a discriminator; and the discriminator adopts a U-Net structure.
Optionally, the converting the target image from an image space to a frequency domain space to obtain a target spectrogram specifically includes:
performing discrete Fourier transform on the target image to obtain an initial spectrogram;
and carrying out centering processing on the initial spectrogram to obtain a target spectrogram.
Optionally, the generator includes a first encoder and a first decoder; the discriminator comprises a second encoder and a second decoder;
the output end of the first encoder is connected with the input end of the first decoder; the output end of the first decoder is connected with the input end of the second encoder; the first output end of the second encoder is used for outputting sample image classes in the model training process; a second output terminal of the second encoder is connected to an input terminal of the second decoder.
Optionally, the first encoder is configured to extract a latent code of the target spectrogram; the first decoder is configured to reconstruct the target spectrogram based on the latent code of the target spectrogram, so as to obtain a reconstructed image;
the first encoder comprises five first convolution modules, wherein the first four each comprise a convolution layer, a batch-normalization layer and an LReLU activation layer, and the last comprises a convolution layer, a batch-normalization layer and a ReLU activation layer;
the first decoder comprises five identical first deconvolution modules; each first deconvolution module comprises a deconvolution layer and a ReLU activation layer.
Optionally, the second output of the second encoder is configured to output a latent code of the reconstructed image;
the output of the second decoder is configured to output pixel-wise discrimination values of the image; the pixel-wise discrimination values constitute the discrimination result of the target image;
wherein the second encoder comprises at least five second convolution modules, the first of which comprises a two-dimensional convolution layer and a spectral-normalization layer, and the last four of which each comprise a two-dimensional convolution layer, a spectral-normalization layer and a ReLU activation layer;
the second decoder comprises at least five identical second deconvolution modules; each second deconvolution module comprises a deconvolution layer and a ReLU activation layer.
Optionally, the loss function of the forged image detection model is a composite loss function;
the composite loss function comprises an adversarial-loss sub-function, a first label-loss sub-function, a second label-loss sub-function and a reconstruction-loss sub-function;
the adversarial-loss sub-function is a cross-entropy loss function;
the first label-loss sub-function represents the difference between the latent code of the sample spectrogram and the real label of the sample image; the real label of the sample image is the label corresponding to the sample spectrogram; the sample spectrogram is obtained by converting the sample image from image space to frequency-domain space;
the second label-loss sub-function represents the difference between the latent code of the sample reconstructed image and the real label of the sample image; the sample reconstructed image is an image reconstructed from the latent code of the sample spectrogram;
the reconstruction-loss sub-function represents the pixel-level loss between the sample reconstructed image and the sample spectrogram.
In a second aspect, the invention provides a forged-image detection system based on frequency-domain information and a generative adversarial network, comprising:
the data acquisition module is used for acquiring a target image; the target image is a face image or a natural scene image;
the processing module is used for converting the target image from an image space to a frequency domain space to obtain a target spectrogram;
the category-result determining module is configured to input the target spectrogram into a forged-image detection model to determine a discrimination result for the target image; the discrimination result is either a real image or a forged image;
wherein the network structure of the forged-image detection model is a generative adversarial network; the generative adversarial network comprises a generator and a discriminator; and the discriminator adopts a U-Net structure.
Optionally, the processing module specifically includes:
the transformation unit is used for carrying out discrete Fourier transformation on the target image to obtain an initial spectrogram;
and the centering unit is configured to center the initial spectrogram to obtain the target spectrogram.
Optionally, the generator includes a first encoder and a first decoder; the discriminator comprises a second encoder and a second decoder;
the output end of the first encoder is connected with the input end of the first decoder; the output end of the first decoder is connected with the input end of the second encoder; the first output end of the second encoder is used for outputting sample image classes in the model training process; a second output terminal of the second encoder is connected to an input terminal of the second decoder.
Optionally, the first encoder is configured to extract a latent code of the target spectrogram; the first decoder is configured to reconstruct the target spectrogram based on the latent code of the target spectrogram, so as to obtain a reconstructed image;
the first encoder comprises five first convolution modules, wherein the first four each comprise a convolution layer, a batch-normalization layer and an LReLU activation layer, and the last comprises a convolution layer, a batch-normalization layer and a ReLU activation layer;
the first decoder comprises five identical first deconvolution modules; each first deconvolution module comprises a deconvolution layer and a ReLU activation layer;
the second output of the second encoder is configured to output a latent code of the reconstructed image;
the output of the second decoder is configured to output pixel-wise discrimination values of the image; the pixel-wise discrimination values constitute the discrimination result of the target image;
wherein the second encoder comprises at least five second convolution modules, the first of which comprises a two-dimensional convolution layer and a spectral-normalization layer, and the last four of which each comprise a two-dimensional convolution layer, a spectral-normalization layer and a ReLU activation layer;
the second decoder comprises at least five identical second deconvolution modules; each second deconvolution module comprises a deconvolution layer and a ReLU activation layer.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
the invention provides a novel forged image detection method and system based on frequency domain information and a generated countermeasure network, which utilizes the game idea of the generated countermeasure network to introduce the generated countermeasure network into the field of forged image detection. Firstly, converting a target image from an image space to a frequency domain space, and then inputting the processed target image into a generator for generating a countermeasure network; then, a U-Net structure, namely an encoder-decoder, is introduced into the discriminator, so that the discrimination capability of the discriminator is enhanced; and finally, inputting the output result of the generator into the discriminator. The discrimination capability is enhanced due to the conversion of the target image from image space to frequency domain space and the introduction of the U-Net structure into the discriminator. Obviously, the method and the system for detecting the forged image based on the frequency domain information and the generated countermeasure network can not only detect the forged face image, but also detect the forged image (image which is difficult to distinguish) of a natural scene.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description are only some embodiments of the present invention; other drawings can be obtained from them by those of ordinary skill in the art without creative effort.
FIG. 1 shows spectrograms of real images and of forged images generated by different models in an embodiment of the present invention; FIGS. 1(a), (b), (c), (d), (e), (f) and (g) are different real images in embodiments of the present invention; FIGS. 1(h), (i), (j), (k), (m), (n) and (o) are the spectrograms corresponding to forged images generated by different models in the embodiment of the present invention;
FIG. 2 is a block diagram of a generator in an embodiment of the invention;
FIG. 3 is a structural diagram of the discriminator in an embodiment of the present invention;
FIG. 4 is a flow chart of a spectrum normalization algorithm in an embodiment of the present invention;
FIG. 5 is a schematic flow chart of a forged-image detection method based on frequency-domain information and a generative adversarial network according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a forged-image detection system based on frequency-domain information and a generative adversarial network according to an embodiment of the present invention;
FIG. 7 is an overall architecture diagram of the forged-image detection method based on frequency-domain information and a generative adversarial network according to an embodiment of the present invention;
FIG. 8 is a data set presentation diagram in an embodiment of the present invention;
FIG. 9 is a diagram showing the detection accuracy of the discriminator according to the embodiment of the present invention;
FIG. 10 is a diagram illustrating the detection precision of the U-Net structure of the discriminator according to the embodiment of the present invention;
FIG. 11 is a diagram illustrating the detection precision of the U-Net structure of the discriminator according to the embodiment of the present invention;
FIG. 12 is a diagram showing F1 values detected by the discriminator according to the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Recently, generative adversarial networks (GANs) have shown huge application potential in the fields of computer vision and image processing, including image translation, image restoration and image synthesis, by virtue of their powerful image-generation capability. At the same time, owing to the development of GAN technology, large numbers of near-authentic forged images, whose authenticity is difficult for the human eye to distinguish, have begun to appear in public view. Since large numbers of forged images in various forms may severely harm individual privacy and even social stability, forged-image detection techniques are strongly needed to detect false images generated by convolutional neural networks (CNNs), and especially by GANs.
With the development of deep-learning methods, deep-forgery (Deepfake) technology is becoming increasingly mature. Large numbers of near-real natural images have flooded into people's lives; while satisfying personal entertainment interests, the abuse of Deepfake technology poses potential threats to personal privacy, economic markets and even national security, so detection methods for forged images urgently need to be studied. Most existing forged-image detection techniques suffer from low accuracy and poor generalization. Starting from the image-forgery mechanism of Deepfake technology, the invention converts the image from the image domain to the frequency domain, introduces a U-Net structure into the discriminator, and thereby provides a forged-image detection model based on frequency-domain information and a generative adversarial network. The technical scheme was experimentally verified on 7 independent data sets and one mixed data set. Compared with part of the existing advanced methods, the highest accuracy of the method reaches 100% on the independent data sets, and the lowest accuracy still reaches 94.93%; the detection recall, precision and F1 score average 98.17%, 98.25% and 98.19%, respectively, and the accuracy on the mixed data set reaches 92.96%. The experiments show that the forged-image detection method and system provided by the invention are effective and generalize well.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Example one
In view of the above problems, the embodiment of the invention introduces GANs into the field of forged-image detection by using their game-theoretic idea, and provides a novel method for detecting forged natural-scene images or forged face images. In the training phase, the forged image is first converted from image space to frequency-domain space, and the processed forged image is then input into the generator of the GAN to extract a latent code and reconstruct the image; next, a U-Net structure, i.e., an encoder-decoder, is introduced into the discriminator, where the encoder classifies the image and the decoder discriminates image pixels, enhancing the discriminative capability of the discriminator. In the testing phase, the trained discriminator is used to detect forged images.
The embodiment of the invention also designs a composite loss function to better optimize the model, and analyzes the performance of the model by utilizing various evaluation indexes in order to comprehensively evaluate the effectiveness of the method of the embodiment of the invention, thereby showing that the method of the embodiment of the invention can obtain good detection performance of the forged image.
The embodiment of the invention provides a forged-image detection method combining frequency-domain information and a generative adversarial network, comprising the following steps:
S1: by using the game-theoretic idea of GANs, GANs are introduced into the field of forged-image detection, yielding a novel method for detecting forged natural-scene images.
S2: the counterfeit image is first converted from image space to frequency domain space.
S2-1: FIG. 1 shows the spectrograms corresponding to forged images generated by different generation models, where the first row shows real images and the second row shows forged images generated by different generation models, each column pairing a real image with its corresponding forged image. First, discrete Fourier transform (DFT) is performed on the real images and on the forged images obtained from the different generation models to obtain spectrograms; the spectrograms are then centered, and the centered spectrograms are input into the forged-image detection model.
The key motivation for converting the forged image from image space to frequency-domain space is that forgery traces are difficult to find in the image domain for images generated by deep-learning models; models proposed in recent years, such as StyleGAN and StyleGAN2, can often generate high-quality, high-resolution images. Once a forged image is converted from the image domain to the frequency domain, however, real and forged images differ greatly. As can be seen in FIG. 1, the amount of high-frequency information in a real image is significantly higher than that in a forged image. This is because generative models usually reproduce well the regions of an image where gray levels change slowly, but struggle with regions where gray-level information changes sharply, i.e., edge regions.
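The DFT-plus-centering preprocessing described above can be sketched in a few lines. This is a minimal sketch: the log-magnitude scaling is a common visualization convention added here, not a step specified by the patent.

```python
import numpy as np

def to_centered_spectrogram(image):
    """2-D DFT of a grayscale image, centered so the DC (zero-frequency)
    component sits in the middle, then log-magnitude scaled."""
    spectrum = np.fft.fft2(image)          # discrete Fourier transform
    centered = np.fft.fftshift(spectrum)   # move DC component to the center
    return np.log1p(np.abs(centered))      # log magnitude for dynamic range

# A constant image has all its energy at the DC frequency, which after
# centering appears at the middle of the spectrogram.
img = np.ones((8, 8))
spec = to_centered_spectrogram(img)
```

High-frequency content (the edges the generator struggles with) lands toward the border of such a centered spectrogram, which is what the discriminator exploits.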
S3: the obtained centered spectrogram is input into the encoder of the generator to obtain a latent code of the image; image reconstruction is then performed using the latent code as the input of the decoder in the generator. The structure of the generator in the proposed forged-image detection model is shown in FIG. 2; the generator consists of two parts, an encoder and a decoder.
S3-1: the encoder extracts the latent code of the input image. The encoder in the generator consists of 5 convolution modules: the first four each consist of a convolution layer (Conv) and a batch-normalization layer (BN) followed by an activation layer (LReLU), and the last replaces the LReLU activation layer with a ReLU (Rectified Linear Unit) activation layer.
S3-2: the decoder performs image reconstruction. The decoder structure mirrors the encoder structure and consists of 5 deconvolution modules of identical structure, each comprising a deconvolution layer and a ReLU activation layer.
S4: the reconstructed image and the input image (i.e. the centralized spectrogram) are respectively input into a discriminator to discriminate the authenticity. The discriminator provided by the embodiment of the invention adopts a U-Net structure and consists of an encoder and a decoder. The structure of the discriminator in the proposed model of detection of counterfeit images is shown in fig. 3. Where Conv2D represents two-dimensional convolution, SpectraNorm represents batch normalization, k represents convolution kernel size, s represents step size, d represents number of channels of output signature, LReLU represents LEAKY RELU activation function, RELU represents RELU activation function, and deConv2D represents two-dimensional deconvolution.
S4-1: the encoder classifies the input image, i.e., classifies the input image as a real image or a counterfeit image.
S4-2: the decoder performs a discrimination of authenticity of the image pixels, so that the discriminator has two outputs, corresponding to the image detection result and the image pixel-by-pixel discrimination result.
S4-3: the invention applies Spectrum Normalization (SN) to the discriminator, so that the discriminator D meets the continuity of the Lipschitz in a more elegant mode, and the intensity of function change is limited, thereby enabling the model to be more stable. Before implementing SN, it is first necessary to solve the singular value of the convolution kernel (weight matrix) W in the convolution network to obtain the spectral norm of each layer of the parameter matrix, and in this process, a "power iteration method" is used to approximate the solution, and the iteration process is shown as steps 2.2 and 2.3 in fig. 4. After the spectral norm is found, the parameters on each parameter matrix are divided by it for normalization purposes. The specific flow of the algorithm is shown in fig. 4.
A spectral-normalization layer is added to the discriminator so that the discriminator satisfies Lipschitz continuity. The spectral normalization used in the embodiments of the invention is consistent with existing spectral-normalization techniques.
S5: in addition to optimizing the generator and the discriminator with the adversarial loss, the invention adopts label losses to constrain the consistency between the generated latent codes and the image labels; meanwhile, to ensure similarity between the reconstructed image and the input image, the embodiment of the invention adopts a reconstruction loss that penalizes pixel-level differences during image reconstruction. The generator is trained with the adversarial loss, the label loss and the reconstruction loss; the discriminator is trained with the adversarial loss and the label loss.
S5-1: to combat losses; because the embodiment of the invention provides that the forged image detection model is a detection model based on the GANs, the embodiment of the invention continues to use the cross entropy loss in the original GANs, and the loss function can maximize the discrimination of the discriminator and minimize the difference between the output of the generator and the real data. Against loss LGANThe calculation formula of (c) is as follows.
Figure BDA0003506191280000091
Where x denotes the input spectrogram and D and G denote the discriminator and generator, respectively.
S5-2: loss of labels; in order to constrain the differences between the potential encoding obtained by the encoder encoding in the generator and the image's true label, thereby enabling the reconstructed image to be more consistent with the input image. In addition, cross entropy loss between the output of an encoder in the discriminator and the label is calculated, so that the discriminator is assisted to better discriminate the authenticity of the input image. The calculation formula of the label loss is as follows.
Llabel_G=|l-GE(x)|;
Llabel_D=-[l log(DE(x))+(1-l)log(1-DE(x))];
Wherein l denotes a label of the input image, GEAnd DERepresenting the encoder part of the generator and the encoder part of the discriminator, respectively.
S5-3: in order to enable the generator reconstructed image to highly restore the input image, the embodiment of the invention also provides a reconstruction loss, and the loss function calculates the pixel-level loss between the reconstructed image G (x) and the input image x and uses LreconstructionThe loss expression can be calculated by the following formula:
Figure BDA0003506191280000101
where m, n denotes the size of the image, where m-n-256. The total loss function is a weighting of the above loss functions, expressed as:
Ltotal=LGANlabel(Llabel_G+Llabel_D)+λrecLreconstruction
wherein λlabel、λrecRepresenting the weights of the tag loss and reconstruction loss, respectively.
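As a numeric sanity check of how the weighted terms combine, the sketch below evaluates the composite loss for a single real sample. All scores and weights are toy values chosen for illustration; none come from the patent.

```python
import math

def bce(label, pred):
    # Binary cross-entropy, the form used for both L_GAN and L_label_D.
    return -(label * math.log(pred) + (1 - label) * math.log(1 - pred))

# Toy values: a real sample (label 1) that the discriminator's encoder
# scores 0.9, a generator-encoder latent score of 0.8, and hypothetical
# loss weights and reconstruction loss.
label, d_out, ge_out = 1.0, 0.9, 0.8
l_gan = bce(1.0, d_out)          # adversarial term for this real sample
l_label_g = abs(label - ge_out)  # |l - G_E(x)|
l_label_d = bce(label, d_out)    # cross-entropy between D's encoder and label
l_rec = 0.05                     # placeholder pixel-level loss
lam_label, lam_rec = 0.5, 1.0    # hypothetical weights

l_total = l_gan + lam_label * (l_label_g + l_label_d) + lam_rec * l_rec
```

Raising the discriminator score toward 1 drives both cross-entropy terms toward 0, so in this toy setting the composite loss is dominated by the reconstruction and latent-code terms once the discriminator is confident.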
Example two
As shown in FIG. 5, a forged-image detection method based on frequency-domain information and a generative adversarial network comprises:
step 100: acquiring a target image; the target image is a face image or a natural scene image.
Step 200: and converting the target image from an image space to a frequency domain space to obtain a target spectrogram.
Step 300: inputting the target spectrogram into a forged-image detection model to determine a discrimination result for the target image; the discrimination result is either a real image or a forged image.
The network structure of the forged-image detection model is a generative adversarial network; the generative adversarial network comprises a generator and a discriminator; the discriminator adopts a U-Net structure.
The step 200 specifically includes:
performing discrete Fourier transform on the target image to obtain an initial spectrogram; and carrying out centering processing on the initial spectrogram to obtain a target spectrogram.
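The two-step conversion above (discrete Fourier transform, then centering) can be sketched with NumPy; using the log-magnitude of the centered spectrum as the spectrogram is an assumption, since the text does not say how the complex spectrum is turned into a real-valued image:

```python
import numpy as np

def to_spectrogram(image: np.ndarray) -> np.ndarray:
    """Convert a grayscale image from image space to a centered
    log-magnitude spectrogram."""
    spectrum = np.fft.fft2(image)          # 2-D discrete Fourier transform
    centered = np.fft.fftshift(spectrum)   # centering: move DC term to the middle
    return np.log1p(np.abs(centered))      # log-magnitude, assumed representation

img = np.random.rand(256, 256)
spec = to_spectrogram(img)
print(spec.shape)  # (256, 256)
```

For a 256x256 input the zero-frequency (DC) component ends up at index (128, 128) after `fftshift`, which is what "centering" refers to.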
The generator comprises a first encoder and a first decoder; the discriminator comprises a second encoder and a second decoder; the output end of the first encoder is connected with the input end of the first decoder; the output end of the first decoder is connected with the input end of the second encoder; the first output end of the second encoder is used for outputting sample image classes in the model training process; a second output terminal of the second encoder is connected to an input terminal of the second decoder.
Further, the first encoder is configured to extract a potential encoding of the target spectrogram; the first decoder is configured to reconstruct the target spectrogram based on the potential encoding, so as to obtain a reconstructed image. The first encoder comprises five first convolution modules, wherein each of the first four first convolution modules comprises a convolution layer, a batch normalization layer and an LReLU activation layer, and the last first convolution module comprises a convolution layer, a batch normalization layer and a ReLU activation layer; the first decoder comprises five identical first deconvolution modules; each first deconvolution module includes a deconvolution layer and a ReLU activation layer.
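A minimal PyTorch sketch of this generator follows. The module counts and activation choices come from the text above; the kernel sizes, strides and channel widths are illustrative assumptions, since the patent does not specify them:

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out, last=False):
    # First four modules use LReLU, the last uses ReLU, per the text;
    # kernel 4, stride 2, padding 1 (halving the resolution) is assumed
    act = nn.ReLU() if last else nn.LeakyReLU(0.2)
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
        act,
    )

class Generator(nn.Module):
    """Encoder-decoder generator: 5 conv modules down, 5 deconv modules up."""
    def __init__(self):
        super().__init__()
        chans = [1, 64, 128, 256, 512, 512]  # assumed channel widths
        self.encoder = nn.Sequential(*[
            conv_block(chans[i], chans[i + 1], last=(i == 4))
            for i in range(5)
        ])
        self.decoder = nn.Sequential(*[
            nn.Sequential(
                nn.ConvTranspose2d(chans[5 - i], chans[4 - i], 4, 2, 1),
                nn.ReLU(),  # each deconv module: deconvolution + ReLU
            )
            for i in range(5)
        ])

    def forward(self, x):
        latent = self.encoder(x)           # potential encoding of the spectrogram
        return self.decoder(latent), latent

g = Generator()
recon, z = g(torch.randn(1, 1, 256, 256))
print(recon.shape)  # torch.Size([1, 1, 256, 256])
```

With these assumed strides, a 256x256 input yields an 8x8 latent map after five halvings, and the decoder restores the original resolution.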
A second output of the second encoder is for outputting a potential encoding of the reconstructed image;
the output end of the second decoder is used for outputting image pixel-by-pixel discrimination result values; the image pixel-by-pixel discrimination result value is the discrimination result of the target image. The second encoder comprises at least five second convolution modules, wherein the first second convolution module comprises a two-dimensional convolution layer and a spectral normalization layer, and the last four second convolution modules each comprise a two-dimensional convolution layer, a spectral normalization layer and a ReLU activation layer; the second decoder comprises at least five identical second deconvolution modules; each second deconvolution module includes a deconvolution layer and a ReLU activation layer.
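A sketch of this U-Net style discriminator in PyTorch follows. Spectral normalization (SN) is applied by wrapping the convolution weight; the channel widths, kernel sizes and the linear head that produces the image-level class are illustrative assumptions:

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

def sn_conv(c_in, c_out, first=False):
    # Spectral normalization wraps the conv weight; per the text, the
    # first module has no activation and the last four use ReLU
    layers = [spectral_norm(nn.Conv2d(c_in, c_out, 4, 2, 1))]
    if not first:
        layers.append(nn.ReLU())
    return nn.Sequential(*layers)

class Discriminator(nn.Module):
    """Encoder-decoder discriminator: the encoder yields an image-level
    class, the decoder a pixel-wise discrimination map."""
    def __init__(self):
        super().__init__()
        chans = [1, 64, 128, 256, 512, 512]          # assumed widths
        self.encoder = nn.Sequential(*[
            sn_conv(chans[i], chans[i + 1], first=(i == 0))
            for i in range(5)
        ])
        self.classify = nn.Linear(512 * 8 * 8, 1)    # assumed class head
        self.decoder = nn.Sequential(*[
            nn.Sequential(nn.ConvTranspose2d(chans[5 - i], chans[4 - i], 4, 2, 1),
                          nn.ReLU())
            for i in range(5)
        ])

    def forward(self, x):
        z = self.encoder(x)                          # latent encoding
        cls = torch.sigmoid(self.classify(z.flatten(1)))
        pix = self.decoder(z)                        # per-pixel results
        return cls, pix

d = Discriminator()
cls, pix = d(torch.randn(1, 1, 256, 256))
```

The two outputs mirror the two output ends described above: `cls` plays the role of the first output end (image class), `pix` the pixel-by-pixel discrimination result.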
In one example, the loss function of the forged image detection model is a composite loss function;
the composite loss function includes a countering loss sub-function, a first label loss sub-function, a second label loss sub-function, and a reconstruction sub-function.
The countermeasure loss sub-function is a cross entropy loss function; the first label loss sub-function is used for representing difference values between potential codes of the sample spectrogram and real labels of the sample image; the real label of the sample image is a label corresponding to the sample spectrogram; the sample spectrogram is obtained by converting the sample image from an image space to a frequency domain space; the second label loss sub-function is used for representing a difference value between a potential code of the sample reconstructed image and a real label of the sample image; the sample reconstructed image is a potentially encoded reconstructed image based on the sample spectrogram; the reconstruction loss sub-function is used to represent pixel-level loss values between the sample reconstructed image and the sample spectrogram.
EXAMPLE III
As shown in fig. 6, the present embodiment provides a forged image detection system based on frequency domain information and generation countermeasure network, including:
a data acquisition module 400 for acquiring a target image; the target image is a face image or a natural scene image.
A processing module 500, configured to convert the target image from an image space to a frequency domain space to obtain a target spectrogram.
A category result determining module 600, configured to input the target spectrogram into a counterfeit image detection model to determine a determination result of the target image; the discrimination result includes a real image and a forged image.
Wherein the network structure of the forged image detection model is a generation countermeasure network; the generation countermeasure network includes a generator and a discriminator; the discriminator adopts a U-Net structure.
The processing module 500 specifically includes:
and the transformation unit is used for carrying out discrete Fourier transformation on the target image to obtain an initial spectrogram.
And the centering unit is used for centering the initial spectrogram to obtain the target spectrogram.
The generator comprises a first encoder and a first decoder; the discriminator comprises a second encoder and a second decoder; the output end of the first encoder is connected with the input end of the first decoder; the output end of the first decoder is connected with the input end of the second encoder; the first output end of the second encoder is used for outputting sample image classes in the model training process; a second output terminal of the second encoder is connected to an input terminal of the second decoder.
The first encoder is used for extracting a potential encoding of the target spectrogram; the first decoder is configured to reconstruct the target spectrogram based on the potential encoding, so as to obtain a reconstructed image; the first encoder comprises five first convolution modules, wherein each of the first four first convolution modules comprises a convolution layer, a batch normalization layer and an LReLU activation layer, and the last first convolution module comprises a convolution layer, a batch normalization layer and a ReLU activation layer; the first decoder comprises five identical first deconvolution modules; each first deconvolution module includes a deconvolution layer and a ReLU activation layer.
A second output of the second encoder is for outputting a potential encoding of the reconstructed image; the output end of the second decoder is used for outputting image pixel-by-pixel discrimination result values; the image pixel-by-pixel discrimination result value is the discrimination result of the target image; wherein the second encoder comprises at least five second convolution modules, wherein the first second convolution module comprises a two-dimensional convolution layer and a spectral normalization layer, and the last four second convolution modules each comprise a two-dimensional convolution layer, a spectral normalization layer and a ReLU activation layer; the second decoder comprises at least five identical second deconvolution modules; each second deconvolution module includes a deconvolution layer and a ReLU activation layer.
Example four
Referring to the architecture of fig. 7, the method for detecting a counterfeit image based on frequency domain information and generation countermeasure network provided by this embodiment includes the following steps:
s1: model training and validation was performed using the counterfeit image dataset provided by Wang et al, as shown in fig. 8. The data set contains 11 sets of counterfeit images obtained by different CNNs-based generative models. In this embodiment, only 8 counterfeit image data sets are used for model verification, including StyleGAN, BigGAN, CycleGAN, Star-GAN, CRN, IMLE, StyleGAN2, and GauGAN.
S2: the first 7 kinds of data sets in S1 are used not only for verifying the detection performance of the detection method on the individual data sets, but also for performance verification on the mixed data set. The ratio of the real image to the counterfeit image for each data set is 1: 1. In the selected data set, 80% of the data was used for training and the remaining 20% was used to verify the validity of the proposed model.
S3: The key parameters in the experiment were set as follows: the image size used for training was 256 × 256, the learning rate was 0.0002, the number of training epochs was 100, and the batch size was 1.
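The settings of S2 and S3 can be collected into a small configuration dict for reference; the key names are assumptions for illustration, and no optimizer is named in the text, so none is assumed here:

```python
# Key training settings from this embodiment
train_config = {
    "image_size": (256, 256),   # training image size
    "learning_rate": 2e-4,      # 0.0002
    "epochs": 100,              # number of training epochs
    "batch_size": 1,
    "train_split": 0.8,         # 80% training / 20% validation (S2)
}

print(train_config["learning_rate"])  # 0.0002
```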
EXAMPLE five
In order to fully verify the effectiveness of the proposed method on different counterfeit image data sets, this embodiment performs an ablation experiment, following the settings of the first embodiment, on whether the frequency domain conversion module is included.
S1: a comparison of experimental results with or without frequency domain conversion and SN operation in the proposed method is shown in tables 1 to.
TABLE 1 Comparison of model accuracy (ACC) with and without frequency domain conversion and SN operation
TABLE 2 Comparison of model precision (P) with and without frequency domain conversion and SN operation
TABLE 3 Comparison of model recall (R) with and without frequency domain conversion and SN operation
TABLE 4 Comparison of model F1 values with and without frequency domain conversion and SN operation
S2: After the frequency domain conversion and the SN operation are added, the model detection accuracy (ACC), precision (P), recall (R) and F1 score are generally improved. In particular, for the StyleGAN, BigGAN and StarGAN models, the ACC and P values are extremely low without frequency domain conversion and SN operation, which shows that the detection model is almost completely ineffective on forged images generated by these models.
S3: After the frequency domain conversion and SN operation are added, the detection accuracy ACC and precision P are greatly improved, while the recall R remains at a good level.
S4: This embodiment shows that, in the frequency domain, defects present in forged images produced by generative models are more easily detected.
EXAMPLE six
In order to fully verify the effectiveness of the method provided by the invention, this embodiment performs an ablation experiment, following the settings of the first embodiment, on whether the discriminator in the network structure adopts a U-Net structure.
S1: the experimental results are visually shown in fig. 9-12 in the form of graphs.
S2: The adoption of the U-Net structure improves the model detection performance to a certain extent: the ACC, P, R and F1 values improve by an average of 3.698%, 2.039%, 6.064% and 4.241%, respectively.
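The four indexes used throughout these ablations (ACC, P, R and F1) follow their standard definitions from the binary confusion matrix; a small pure-Python sketch with toy labels (1 = forged, 0 = real) for illustration:

```python
def metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    acc = (tp + tn) / len(y_true)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1

y_true = [1, 0, 1, 1, 0, 1]   # toy ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1]   # toy model predictions
acc, prec, rec, f1 = metrics(y_true, y_pred)
print(f"ACC={acc:.3f} P={prec:.3f} R={rec:.3f} F1={f1:.3f}")
# ACC=0.833 P=1.000 R=0.750 F1=0.857
```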
S3: this example shows that the encoder-decoder structure employed in the proposed model is more suitable for counterfeit image detection than the common discriminator structure.
EXAMPLE seven
In order to fully verify the effectiveness of the method provided by the invention, this embodiment trains the model on a mixed data set and also trains it independently on the different individual data sets for comparison.
S1: The mixed data set contains 4000 images in total, with real and forged images in a 1:1 ratio. The forged images come from five different generative models, including IMLE, StyleGAN2, CRN and GauGAN; each individual data set contains 8000 images, of which real and forged images each account for 50%.
S2: to fully demonstrate the generalization performance of the detection model presented herein, the experimental results were comprehensively analyzed, and the results are shown in table 5 below.
TABLE 5 Results
S3: This embodiment shows that the method achieves good detection performance on the mixed data set, with an average detection accuracy of 92.96%, which further demonstrates the detection generalization of the method provided by the invention.
Example eight
In order to fully verify the effectiveness of the proposed method, this example compares the method of the present invention with the existing partially advanced method.
The following table shows experimental comparative analysis of the process of the invention with some of the currently available advanced processes:
TABLE 6 comparative analysis table
Wherein the bold entries represent the optimal value of each index. The comparison comprises six algorithms, including Inception, ResNet50, Xception, MesoNet-Inception and EfficientNet. The data in the table show that the method achieves excellent detection performance on most data sets; although its performance is slightly lower than that of the Inception and EfficientNet models on the CRN and StyleGAN data sets, it achieves better detection accuracy than the compared forged image detection models overall. In general, the method of the present invention has good detection performance.
The invention has the following beneficial effects:
Firstly, the invention introduces GANs into the field of counterfeit image detection by exploiting the adversarial game idea of GANs, and provides a novel method for detecting forged natural scene images.
Secondly, a U-Net structure, i.e., an encoder-decoder, is introduced into the discriminator, where the encoder is used for image classification and the decoder is used for pixel-wise discrimination, thereby enhancing the discrimination capability of the discriminator.
Thirdly, the invention designs a composite loss function to better optimize the model.
Fourthly, the model performance is analyzed using multiple evaluation indexes, showing that the method of the invention achieves good forged-image detection performance.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (10)

1. A method for detecting a forged image based on frequency domain information and a generation countermeasure network is characterized by comprising the following steps:
acquiring a target image; the target image is a face image or a natural scene image;
converting the target image from an image space to a frequency domain space to obtain a target spectrogram;
inputting the target spectrogram into a forged image detection model to determine a judgment result of the target image; the discrimination result comprises a real image and a forged image;
wherein the network structure of the forged image detection model is a generation countermeasure network; the generation countermeasure network includes a generator and a discriminator; the discriminator adopts a U-Net structure.
2. A method for detecting a counterfeit image based on frequency domain information and generation countermeasure network according to claim 1, wherein the converting the target image from image space to frequency domain space to obtain the target spectrogram specifically comprises:
performing discrete Fourier transform on the target image to obtain an initial spectrogram;
and carrying out centering processing on the initial spectrogram to obtain a target spectrogram.
3. A method for detecting a counterfeit image based on frequency domain information and generation countermeasure network as claimed in claim 1, wherein the generator comprises a first encoder and a first decoder; the discriminator comprises a second encoder and a second decoder;
the output end of the first encoder is connected with the input end of the first decoder; the output end of the first decoder is connected with the input end of the second encoder; the first output end of the second encoder is used for outputting sample image classes in the model training process; a second output terminal of the second encoder is connected to an input terminal of the second decoder.
4. A method for detecting a counterfeit image based on frequency domain information and generation countermeasure network as claimed in claim 3, wherein the first encoder is configured to extract a potential code of the target spectrogram; the first decoder is configured to reconstruct the target spectrogram based on a potential encoding of the target spectrogram, so as to obtain a reconstructed image;
the first encoder comprises five first convolution modules, wherein each of the first four first convolution modules comprises a convolution layer, a batch normalization layer and an LReLU activation layer, and the last first convolution module comprises a convolution layer, a batch normalization layer and a ReLU activation layer;
the first decoder comprises five identical first deconvolution modules; each first deconvolution module includes a deconvolution layer and a ReLU activation layer.
5. A method for detecting a counterfeit image based on frequency domain information and generation countermeasure network according to claim 4,
a second output of the second encoder is for outputting a potential encoding of the reconstructed image;
the output end of the second decoder is used for outputting image pixel-by-pixel judgment result values; the image pixel-by-pixel discrimination result value is the discrimination result of the target image;
wherein the second encoder comprises at least five second convolution modules, wherein a first one of the second convolution modules comprises a two-dimensional convolution layer and a spectral normalization layer, and the last four second convolution modules comprise a two-dimensional convolution layer, a spectral normalization layer and a ReLU activation layer;
the second decoder comprises at least five identical second deconvolution modules; each second deconvolution module includes a deconvolution layer and a ReLU activation layer.
6. The method for detecting the forged image based on the frequency domain information and the generated countermeasure network as claimed in claim 5, wherein the loss function of the forged image detection model is a composite loss function;
the composite loss function comprises a countering loss sub-function, a first label loss sub-function, a second label loss sub-function and a reconstruction sub-function;
the countermeasure loss sub-function is a cross entropy loss function;
the first label loss sub-function is used for representing difference values between potential codes of the sample spectrogram and real labels of the sample image; the real label of the sample image is a label corresponding to the sample spectrogram; the sample spectrogram is obtained by converting the sample image from an image space to a frequency domain space;
the second label loss sub-function is used for representing a difference value between a potential code of the sample reconstructed image and a real label of the sample image; the sample reconstructed image is a potentially encoded reconstructed image based on the sample spectrogram;
the reconstruction loss sub-function is used to represent pixel-level loss values between the sample reconstructed image and the sample spectrogram.
7. A system for detecting counterfeit images based on frequency domain information and generation countermeasure networks, comprising:
the data acquisition module is used for acquiring a target image; the target image is a face image or a natural scene image;
the processing module is used for converting the target image from an image space to a frequency domain space to obtain a target spectrogram;
the category result determining module is used for inputting the target spectrogram into a forged image detection model so as to determine a judgment result of the target image; the discrimination result comprises a real image and a forged image;
wherein the network structure of the forged image detection model is a generation countermeasure network; the generation countermeasure network includes a generator and a discriminator; the discriminator adopts a U-Net structure.
8. A system for detecting a counterfeit image based on frequency domain information and generation countermeasure network according to claim 7, wherein the processing module specifically comprises:
the transformation unit is used for carrying out discrete Fourier transformation on the target image to obtain an initial spectrogram;
and the centering unit, configured to center the initial spectrogram to obtain the target spectrogram.
9. A forged image detection system based on frequency domain information and generation countermeasure network according to claim 7, wherein said generator comprises a first encoder and a first decoder; the discriminator comprises a second encoder and a second decoder;
the output end of the first encoder is connected with the input end of the first decoder; the output end of the first decoder is connected with the input end of the second encoder; the first output end of the second encoder is used for outputting sample image classes in the model training process; a second output terminal of the second encoder is connected to an input terminal of the second decoder.
10. A system for detecting a counterfeit image based on frequency-domain information and generation countermeasure network as claimed in claim 9, wherein the first encoder is configured to extract a potential encoding of the target spectrogram; the first decoder is configured to reconstruct the target spectrogram based on a potential encoding of the target spectrogram, so as to obtain a reconstructed image;
the first encoder comprises five first convolution modules, wherein each of the first four first convolution modules comprises a convolution layer, a batch normalization layer and an LReLU activation layer, and the last first convolution module comprises a convolution layer, a batch normalization layer and a ReLU activation layer;
the first decoder comprises five identical first deconvolution modules; each first deconvolution module comprises a deconvolution layer and a ReLU activation layer;
a second output of the second encoder is for outputting a potential encoding of the reconstructed image;
the output end of the second decoder is used for outputting image pixel-by-pixel judgment result values; the image pixel-by-pixel discrimination result value is the discrimination result of the target image;
wherein the second encoder comprises at least five second convolution modules, wherein a first one of the second convolution modules comprises a two-dimensional convolution layer and a spectral normalization layer, and the last four second convolution modules comprise a two-dimensional convolution layer, a spectral normalization layer and a ReLU activation layer;
the second decoder comprises at least five identical second deconvolution modules; each second deconvolution module includes a deconvolution layer and a ReLU activation layer.
CN202210139691.8A 2022-02-16 2022-02-16 Forged image detection method and system based on frequency domain information and generation countermeasure network Pending CN114495239A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210139691.8A CN114495239A (en) 2022-02-16 2022-02-16 Forged image detection method and system based on frequency domain information and generation countermeasure network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210139691.8A CN114495239A (en) 2022-02-16 2022-02-16 Forged image detection method and system based on frequency domain information and generation countermeasure network

Publications (1)

Publication Number Publication Date
CN114495239A true CN114495239A (en) 2022-05-13

Family

ID=81480051

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210139691.8A Pending CN114495239A (en) 2022-02-16 2022-02-16 Forged image detection method and system based on frequency domain information and generation countermeasure network

Country Status (1)

Country Link
CN (1) CN114495239A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114757342A (en) * 2022-06-14 2022-07-15 南昌大学 Electronic data information evidence-obtaining method based on confrontation training
CN114757342B (en) * 2022-06-14 2022-09-09 南昌大学 Electronic data information evidence-obtaining method based on confrontation training

Similar Documents

Publication Publication Date Title
Zhong et al. An end-to-end dense-inceptionnet for image copy-move forgery detection
Mi et al. GAN-generated image detection with self-attention mechanism against GAN generator defect
CN111968193B (en) Text image generation method based on StackGAN (secure gas network)
Nguyen et al. Learning spatio-temporal features to detect manipulated facial videos created by the deepfake techniques
Jia et al. Inconsistency-aware wavelet dual-branch network for face forgery detection
CN112257741B (en) Method for detecting generative anti-false picture based on complex neural network
CN111797702A (en) Face counterfeit video detection method based on spatial local binary pattern and optical flow gradient
CN111476727B (en) Video motion enhancement method for face-changing video detection
Yu et al. Augmented multi-scale spatiotemporal inconsistency magnifier for generalized DeepFake detection
Wei et al. Universal deep network for steganalysis of color image based on channel representation
Yang et al. Learning to disentangle gan fingerprint for fake image attribution
CN116630183A (en) Text image restoration method based on generated type countermeasure network
CN116958637A (en) Training method, device, equipment and storage medium of image detection model
Yin et al. Dynamic difference learning with spatio-temporal correlation for deepfake video detection
CN114495239A (en) Forged image detection method and system based on frequency domain information and generation countermeasure network
Tan et al. Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection
Zhao et al. TAN-GFD: generalizing face forgery detection based on texture information and adaptive noise mining
CN116434759B (en) Speaker identification method based on SRS-CL network
Han et al. FCD-Net: Learning to detect multiple types of homologous deepfake face images
CN116721176A (en) Text-to-face image generation method and device based on CLIP supervision
Lai et al. Generative focused feedback residual networks for image steganalysis and hidden information reconstruction
CN116895100A (en) Knowledge distillation depth counterfeiting detection method and system based on space-frequency feature fusion
CN113850284B (en) Multi-operation detection method based on multi-scale feature fusion and multi-branch prediction
CN114898137A (en) Face recognition-oriented black box sample attack resisting method, device, equipment and medium
Wu et al. Learning domain-invariant representation for generalizing face forgery detection

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination