CN111723714A - Method, device and medium for identifying authenticity of face image - Google Patents


Info

Publication number
CN111723714A
CN111723714A (application CN202010527530.7A)
Authority
CN
China
Prior art keywords
spectrogram
feature map
input data
spectrograms
frequency domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010527530.7A
Other languages
Chinese (zh)
Other versions
CN111723714B (en)
Inventor
殷国君
邵婧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sensetime Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sensetime Intelligent Technology Co Ltd filed Critical Shanghai Sensetime Intelligent Technology Co Ltd
Priority to CN202010527530.7A priority Critical patent/CN111723714B/en
Publication of CN111723714A publication Critical patent/CN111723714A/en
Priority to PCT/CN2021/086893 priority patent/WO2021249006A1/en
Priority to JP2022524624A priority patent/JP7251000B2/en
Application granted granted Critical
Publication of CN111723714B publication Critical patent/CN111723714B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/14Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/14Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141Discrete Fourier transforms
    • G06F17/142Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Algebra (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Discrete Mathematics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The application discloses a method, an apparatus, and a medium for identifying the authenticity of a face image. The method comprises the following steps: acquiring a first face image; performing a frequency domain transform on the first face image to obtain a first spectrogram; performing multiple filtering processes on the first spectrogram to obtain a plurality of second spectrograms; obtaining input data according to the plurality of second spectrograms; and determining the authenticity of the first face image according to the input data.

Description

Method, device and medium for identifying authenticity of face image
Technical Field
The application relates to the technical field of image recognition, and in particular to a method, an apparatus, and a medium for identifying the authenticity of a face image.
Background
As machine learning and computer vision technologies advance, more and more face forgery technologies are emerging. Face forgery technology can realistically replace faces or modify facial expressions, mouth shapes, and the like. For example, the face of one person in a video may be replaced with the face of another person by face forgery techniques.
However, such face forgery techniques can seriously infringe the portrait rights and reputation rights of others. To recognize face image forgery, frequency domain information of an image is currently widely used to determine whether a face image is forged. For example, an image is subjected to a Discrete Cosine Transform (DCT) to extract its frequency domain information, the edges and textures of the image are analyzed from that information, and the image is determined to be forged when the edges or textures are abnormal. However, for some low-quality images, such as compressed images, detecting edge or texture anomalies does not establish with certainty that the image is forged. Existing methods for identifying whether a face image is forged therefore have low accuracy.
Disclosure of Invention
The embodiments of the application provide a method, an apparatus, and a medium for identifying the authenticity of a face image. The spectrogram is filtered through multiple groups of filters to obtain information from multiple frequency bands, thereby improving the accuracy of identifying the authenticity of the face image.
In a first aspect, an embodiment of the present application provides a method for identifying authenticity of a face image, including:
acquiring a first face image;
performing frequency domain transformation on the first face image to obtain a first spectrogram;
performing multiple filtering processes on the first spectrogram to obtain a plurality of second spectrograms;
obtaining input data according to the plurality of second spectrograms;
and determining the authenticity of the first face image according to the input data.
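The five claimed steps can be sketched end to end as follows. This is a minimal illustration, not the patented implementation: it assumes a single global orthonormal 2-D DCT, a hypothetical radial partition of the spectrum into bands for the filtering step, and leaves the final classifier abstract, since the claim fixes none of these choices.

```python
import numpy as np

def dct2(img):
    """Orthonormal 2-D DCT-II of a square image (the frequency domain transform)."""
    n = img.shape[0]
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c @ img @ c.T

def identify_authenticity(face, n_bands=3, classifier=None):
    """Illustrative end-to-end flow of the five claimed steps."""
    spec = dct2(face)                                   # step 2: first spectrogram
    y, x = np.mgrid[0:face.shape[0], 0:face.shape[1]]
    r = (x + y) / (x + y).max()                         # distance from the DC coefficient
    edges = np.linspace(0.0, 1.0, n_bands + 1)
    bands = [spec * ((r >= lo) & (r < hi))              # step 3: multiple filterings ->
             for lo, hi in zip(edges[:-1], edges[1:])]  #         second spectrograms
    bands[-1] = bands[-1] + spec * (r == 1.0)           # keep the highest-frequency bin
    data = np.stack(bands)                              # step 4: input data
    if classifier is None:                              # step 5: authenticity decision
        return data                                     # (classifier left abstract here)
    return classifier(data)
```

Because the bands partition the spectrum, summing the filtered spectrograms recovers the first spectrogram exactly.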
In some possible embodiments, the frequency domain transform comprises at least one of: global frequency domain transforms and local frequency domain transforms.
In some possible embodiments, in a case where the frequency-domain transform includes the global frequency-domain transform, the obtaining the input data according to the plurality of second spectrograms includes:
performing inverse frequency domain transformation on each second spectrogram to obtain a plurality of second images, wherein the inverse frequency domain transformation is an inverse process of the global frequency domain transformation;
and splicing the plurality of second images to obtain the input data.
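A minimal sketch of this embodiment, assuming the global transform is an orthonormal 2-D DCT (so its inverse is the transposed basis product) and that "splicing" means stacking the second images along a channel axis; the function names are illustrative, not from the patent.

```python
import numpy as np

def idct2(spec):
    """Inverse of the orthonormal 2-D DCT-II (the inverse frequency domain transform)."""
    n = spec.shape[0]
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c.T @ spec @ c  # inverse transform: transpose of the forward basis product

def build_input(second_spectrograms):
    """Transform each filtered (second) spectrogram back to the image domain
    and splice the resulting second images along a channel axis."""
    return np.stack([idct2(s) for s in second_spectrograms], axis=0)
```

For example, an all-DC spectrogram inverts to a constant image, and three filtered spectrograms splice into a three-channel input tensor.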
In some possible embodiments, in a case where the frequency domain transform includes the local frequency domain transform, the number of first spectrograms may be one or more;
the performing multiple filtering processes on the first spectrogram to obtain a plurality of second spectrograms comprises:
performing multiple filtering processes on each first spectrogram to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, in a case where the number of first spectrograms is plural, the obtaining the input data according to the plurality of second spectrograms includes:
determining the energy of each second spectrogram;
taking each first spectrogram as a first target spectrogram, and obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the input data.
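A minimal sketch of this embodiment, assuming the energy of a spectrogram is the sum of its squared coefficients and that "splicing" is vector concatenation; both assumptions and all names are illustrative, not taken from the patent.

```python
import numpy as np

def spectrogram_energy(spec):
    """Energy of one filtered (second) spectrogram: sum of squared coefficients."""
    return float(np.sum(spec ** 2))

def energy_features(filtered_per_target):
    """filtered_per_target: one entry per first target spectrogram, each a list
    of its second spectrograms. Returns the spliced feature vector: one energy
    value per frequency band, per target spectrogram."""
    return np.concatenate([
        np.array([spectrogram_energy(s) for s in bands])
        for bands in filtered_per_target
    ])
```

With 3 target spectrograms and 2 bands each, the spliced vector has 6 entries.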
In some possible embodiments, the determining the authenticity of the first face image according to the input data includes:
performing feature extraction on the input data to obtain a first feature map;
and determining the authenticity of the first face image according to the first feature map.
In some possible embodiments, in a case where the frequency domain transform includes both the global frequency domain transform and the local frequency domain transform, the global frequency domain transform yields one first spectrogram and the local frequency domain transform yields one or more first spectrograms; the performing multiple filtering processes on the first spectrograms to obtain a plurality of second spectrograms includes:
performing multiple filtering processes on the first spectrogram obtained by the global frequency domain transform to obtain a plurality of second spectrograms corresponding to that first spectrogram;
and performing multiple filtering processes on the one or more first spectrograms obtained by the local frequency domain transform to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, the input data includes first input data and second input data, and the obtaining the input data according to the plurality of second spectrograms includes:
performing inverse frequency domain transformation on each of a plurality of second spectrograms corresponding to the global frequency domain transformation to obtain a plurality of second images, wherein the inverse frequency domain transformation is an inverse process of the global frequency domain transformation;
splicing the plurality of second images to obtain first input data;
in a case where the number of first spectrograms obtained by the local frequency domain transform is plural, taking each such first spectrogram as a first target spectrogram, and determining the energy of each second spectrogram in the plurality of second spectrograms corresponding to the first target spectrogram;
obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the second input data.
In some possible embodiments, the determining the authenticity of the first face image according to the input data includes:
performing cross fusion processing on the first input data and the second input data to obtain a second feature map and a third feature map;
and determining the authenticity of the first face image according to the second feature map and the third feature map.
In some possible embodiments, in a case that the number of times of the cross fusion processing is multiple, the cross fusion processing on the first input data and the second input data to obtain a second feature map and a third feature map includes:
performing first cross fusion processing on the first input data and the second input data to obtain a fourth feature map and a fifth feature map;
and taking the fourth feature map and the fifth feature map as input data of the next cross fusion processing, and obtaining the second feature map and the third feature map after carrying out the cross fusion processing for a plurality of times.
In some possible embodiments, the performing a first cross fusion process on the first input data and the second input data to obtain a fourth feature map and a fifth feature map includes:
performing feature extraction on the first input data to obtain a sixth feature map;
performing feature extraction on the second input data to obtain a seventh feature map;
obtaining a first matrix according to the sixth feature map and the seventh feature map, wherein the first matrix is used for representing the correlation between the sixth feature map and the seventh feature map;
obtaining an eighth feature map according to the first matrix and the seventh feature map, and overlapping the eighth feature map and the sixth feature map to obtain a fourth feature map;
and obtaining a ninth feature map according to the first matrix and the sixth feature map, and overlapping the ninth feature map and the seventh feature map to obtain the fifth feature map.
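The five sub-steps above resemble a bidirectional attention block and can be sketched as follows. The sketch assumes flattened feature maps of shape (channels, dim), uses a row-normalised exponential of the correlation matrix as the weighting (the patent does not fix a normalisation), and treats "overlapping" as residual addition; all of these are assumptions.

```python
import numpy as np

def cross_fusion(feat_a, feat_b):
    """One pass of the described cross fusion on two feature maps of equal shape.
    The first matrix captures the correlation between the maps; each map is then
    re-weighted by the other and overlapped (added) back residually."""
    corr = feat_a @ feat_b.T                       # first matrix: (channels, channels)
    corr = corr - corr.max(axis=1, keepdims=True)  # numerical stabilisation
    attn = np.exp(corr)
    attn = attn / attn.sum(axis=1, keepdims=True)  # row-normalised weights
    fused_a = attn @ feat_b + feat_a               # eighth map overlapped on sixth -> fourth
    fused_b = attn.T @ feat_a + feat_b             # ninth map overlapped on seventh -> fifth
    return fused_a, fused_b
```

Repeating this pass, feeding the outputs back in as the next inputs, yields the multi-round variant described below.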
In some possible embodiments, the determining the authenticity of the first face image according to the second feature map and the third feature map includes:
and splicing the second feature map and the third feature map, and determining the authenticity of the first face image according to the spliced feature maps.
In some possible embodiments, the multiple filtering processes include:
performing multiple filtering processes on the first spectrogram through multiple groups of filters, where each group of filters corresponds to one filtering process;
the filtering parameters of each group of filters include a preset parameter and a reference parameter. Each group of filters is used to separate, from a first spectrogram, the frequency band information corresponding to its preset parameter, and the reference parameter is used to compensate that frequency band information. The frequency band information separated by any two groups of filters is different, and the frequency band information separated by all the groups of filters together covers all the frequency band information in the first spectrogram.
In some possible embodiments, in the process of performing multiple filtering processing on the first spectrogram obtained by the global frequency domain transformation and the local frequency domain transformation through multiple sets of filters, the filtering parameters of each set of filters are different.
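A minimal sketch of such a filter bank, assuming the preset parameter of each group is a fixed binary band mask (a hypothetical radial partition by distance from the DC coefficient, so any two groups separate different bands and all groups together cover the whole spectrum) and the reference parameter is an additive, potentially learnable compensation term. The mask shape and the form of the compensation are assumptions; the patent does not specify them.

```python
import numpy as np

def band_masks(size, n_bands):
    """Binary masks that partition the DCT coefficients of a size x size
    spectrogram into n_bands frequency bands by distance from the DC
    coefficient; together the masks cover every coefficient exactly once."""
    y, x = np.mgrid[0:size, 0:size]
    radius = (x + y) / (2 * (size - 1))  # 0 at DC, 1 at the highest frequency
    edges = np.linspace(0.0, 1.0, n_bands + 1)
    masks = [(radius >= lo) & (radius < hi) for lo, hi in zip(edges[:-1], edges[1:])]
    masks[-1] |= radius == 1.0           # include the top edge in the last band
    return masks

def filter_bank(spectrogram, masks, refs):
    """Apply each filter group: fixed band mask (preset parameter) plus an
    additive compensation term (reference parameter)."""
    return [spectrogram * m + r for m, r in zip(masks, refs)]
```

With zero reference parameters, summing the group outputs reproduces the first spectrogram, which mirrors the claim that the groups jointly separate all of its frequency band information.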
In a second aspect, an embodiment of the present application provides an apparatus for identifying authenticity of a face image, including:
an acquisition unit configured to acquire a first face image;
the transformation unit is used for performing frequency domain transformation on the first face image to obtain a first spectrogram;
the filtering unit is used for performing multiple filtering processes on the first spectrogram to obtain a plurality of second spectrograms;
the processing unit is used for obtaining input data according to the plurality of second spectrograms;
the processing unit is further configured to determine authenticity of the first face image according to the input data.
In some possible embodiments, the frequency domain transform comprises at least one of: global frequency domain transforms and local frequency domain transforms.
In some possible embodiments, in a case where the frequency-domain transform includes the global frequency-domain transform, in terms of obtaining input data from the plurality of second spectrograms, the processing unit is specifically configured to:
performing inverse frequency domain transformation on each second spectrogram to obtain a plurality of second images, wherein the inverse frequency domain transformation is an inverse process of the global frequency domain transformation;
and splicing the plurality of second images to obtain the input data.
In some possible embodiments, in a case where the frequency domain transform includes the local frequency domain transform, the number of first spectrograms may be one or more;
in terms of performing multiple filtering processes on the first spectrogram to obtain a plurality of second spectrograms, the filtering unit is specifically configured to:
perform multiple filtering processes on each first spectrogram to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, in a case where the number of first spectrograms is plural, in terms of obtaining the input data according to the plurality of second spectrograms, the processing unit is specifically configured to:
determining the energy of each second spectrogram;
taking each first spectrogram as a first target spectrogram, and obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the input data.
In some possible embodiments, in determining the authenticity of the first face image according to the input data, the processing unit is specifically configured to:
performing feature extraction on the input data to obtain a first feature map;
and determining the authenticity of the first face image according to the first feature map.
In some possible embodiments, in a case where the frequency domain transform includes both the global frequency domain transform and the local frequency domain transform, the global frequency domain transform yields one first spectrogram and the local frequency domain transform yields one or more first spectrograms; in terms of performing multiple filtering processes on the first spectrograms to obtain a plurality of second spectrograms, the filtering unit is specifically configured to:
performing multiple filtering processing on a first spectrogram obtained by global frequency domain transformation to obtain a plurality of second spectrograms corresponding to the first spectrogram;
and performing multiple filtering processing on one or more first spectrograms obtained by local frequency domain transformation to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, the input data includes first input data and second input data, and in terms of obtaining the input data according to the plurality of second spectrograms, the processing unit is specifically configured to:
performing inverse frequency domain transformation on each of a plurality of second spectrograms corresponding to the global frequency domain transformation to obtain a plurality of second images, wherein the inverse frequency domain transformation is an inverse process of the global frequency domain transformation;
splicing the plurality of second images to obtain first input data;
under the condition that the number of the first spectrograms obtained by the frequency domain transformation is multiple, taking each first spectrogram obtained by the frequency domain transformation as a first target spectrogram, and determining the energy of each second spectrogram in a plurality of second spectrograms corresponding to the first target spectrogram;
obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the second input data.
In some possible embodiments, in determining the authenticity of the first face image according to the input data, the processing unit is specifically configured to:
performing cross fusion processing on the first input data and the second input data to obtain a second feature map and a third feature map;
and determining the authenticity of the first face image according to the second feature map and the third feature map.
In some possible embodiments, in the case that the number of times of the cross fusion processing is multiple, in terms of performing the cross fusion processing on the first input data and the second input data to obtain a second feature map and a third feature map, the processing unit is specifically configured to:
performing first cross fusion processing on the first input data and the second input data to obtain a fourth feature map and a fifth feature map;
and taking the fourth feature map and the fifth feature map as input data of the next cross fusion processing, and obtaining the second feature map and the third feature map after carrying out the cross fusion processing for a plurality of times.
In some possible embodiments, in terms of performing a first cross fusion process on the first input data and the second input data to obtain a fourth feature map and a fifth feature map, the processing unit is specifically configured to:
performing feature extraction on the first input data to obtain a sixth feature map;
performing feature extraction on the second input data to obtain a seventh feature map;
obtaining a first matrix according to the sixth feature map and the seventh feature map, wherein the first matrix is used for representing the correlation between the sixth feature map and the seventh feature map;
obtaining an eighth feature map according to the first matrix and the seventh feature map, and overlapping the eighth feature map and the sixth feature map to obtain a fourth feature map;
and obtaining a ninth feature map according to the first matrix and the sixth feature map, and overlapping the ninth feature map and the seventh feature map to obtain the fifth feature map.
In some possible embodiments, in determining the authenticity of the first face image according to the second feature map and the third feature map, the processing unit is specifically configured to:
and splicing the second feature map and the third feature map, and determining the authenticity of the first face image according to the spliced feature maps.
In some possible embodiments, the multiple filtering processes include:
performing multiple filtering processes on the first spectrogram through multiple groups of filters, where each group of filters corresponds to one filtering process;
the filtering parameters of each group of filters include a preset parameter and a reference parameter. Each group of filters is used to separate, from a first spectrogram, the frequency band information corresponding to its preset parameter, and the reference parameter is used to compensate that frequency band information. The frequency band information separated by any two groups of filters is different, and the frequency band information separated by all the groups of filters together covers all the frequency band information in the first spectrogram.
In some possible embodiments, in the process of performing multiple filtering processing on the first spectrogram obtained by the global frequency domain transformation and the local frequency domain transformation through multiple sets of filters, the filtering parameters of each set of filters are different.
In a third aspect, an embodiment of the present application provides an apparatus for identifying authenticity of a face image, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the program includes instructions for executing steps in the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, which stores a computer program, where the computer program makes a computer execute the method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform the method according to the first aspect.
The embodiment of the application has the following beneficial effects:
It can be seen that, in the embodiments of the present application, multiple groups of filters are used to perform multiple filtering processes on the first spectrogram to obtain a plurality of second spectrograms, so the frequency band information of the plurality of second spectrograms differs. Because the input data is obtained from the plurality of second spectrograms, it contains multiple pieces of frequency band information of the first spectrogram. Identifying the authenticity of the first face image from this input data therefore exploits multiple frequency bands, which improves the accuracy of the identification and reduces the false identification rate.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings required for the description of the embodiments are briefly introduced below. The drawings in the following description show some embodiments of the present application; other drawings can be obtained from them by those of ordinary skill in the art without creative effort.
Fig. 1 is a schematic flow chart of a method for identifying authenticity of a face image according to an embodiment of the present application;
fig. 2 is a schematic diagram of a filtering process according to an embodiment of the present disclosure;
fig. 3 is a schematic flowchart illustrating a process of setting a second preset parameter according to an embodiment of the present disclosure;
fig. 4 is a schematic diagram of a cross-fusion process provided in an embodiment of the present application;
fig. 5 is a schematic diagram of another method for identifying authenticity of a face image according to an embodiment of the present application;
fig. 6 is a schematic diagram of a global frequency domain transform branch according to an embodiment of the present application;
fig. 7 is a schematic diagram of a partial frequency domain transform branch according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an apparatus for identifying authenticity of a face image according to an embodiment of the present application;
fig. 9 is a block diagram illustrating functional units of an apparatus for identifying authenticity of a face image according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," "third," and "fourth," etc. in the description and claims of this application and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may include other steps or elements not expressly listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by those skilled in the art that the embodiments described herein can be combined with other embodiments.
Referring to fig. 1, fig. 1 is a schematic flowchart of a method for identifying authenticity of a face image according to an embodiment of the present application. The method is applied to a device for identifying the authenticity of the face image. The method includes, but is not limited to, the steps of:
101: a first face image is acquired.
102: and carrying out frequency domain transformation on the first face image to obtain a first spectrogram.
The frequency domain transform includes, but is not limited to, one of: Discrete Cosine Transform (DCT), Fourier Transform (FT), and Fast Fourier Transform (FFT). In the present application, the DCT is taken as an example of the frequency domain transform for explanation.
Further, the frequency domain transform includes a global frequency domain transform and a local frequency domain transform. The global frequency domain transform performs a frequency domain transform on the whole first face image to obtain one first spectrogram. The local frequency domain transform performs a frequency domain transform on partial regions of the first face image to obtain one or more first spectrograms. The local frequency domain transform is performed by sliding a window over the first face image and transforming the partial region framed by the window at each position. Therefore, the local frequency domain transform may also be referred to as Sliding Window Discrete Cosine Transform (SWDCT).
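As a concrete illustration, the two transforms above can be sketched in a few lines of NumPy. This is a minimal sketch under the assumption of an orthonormal 2-D DCT-II on square, grayscale inputs; the function names, window size, and stride are illustrative and not taken from the patent.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def dct2(img):
    """Global frequency domain transform: one spectrogram for the whole image."""
    c = dct_matrix(img.shape[0])
    return c @ img @ c.T

def sliding_window_dct(img, win, stride):
    """Local (sliding-window) frequency domain transform: one spectrogram
    per window position framed on the image."""
    spectrograms = []
    h, w = img.shape
    for top in range(0, h - win + 1, stride):
        for left in range(0, w - win + 1, stride):
            spectrograms.append(dct2(img[top:top + win, left:left + win]))
    return spectrograms
```

Because the basis matrix is orthonormal, the global transform is exactly invertible, which is what the later inverse frequency domain transform step relies on.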
In addition, in practical applications, the frequency domain transform may be performed on only one region of the first face image, without sliding a window over the image. The region may be a preset region, a region containing rich detail, or a region of particular interest, which is not limited in the present application. Therefore, when the local frequency domain transform is performed on the first face image, the number of first spectrograms obtained may be one or more. In this application, the case where the local frequency domain transform yields a plurality of first spectrograms is taken as an example for explanation.
To distinguish the first spectrograms obtained by the global and the local frequency domain transforms, the first spectrograms obtained by the local frequency domain transform are referred to in this application as first target spectrograms. It should be noted that a first target spectrogram is simply a first spectrogram obtained by the local frequency domain transform; the two are not distinguished further here.
When both the global frequency domain transform and the local frequency domain transform are performed on the first face image, either transform may be performed first, or the two may be performed simultaneously.
103: and respectively carrying out multiple times of filtering processing on the first spectrogram to obtain a plurality of second spectrograms.
For example, in the case where only the global frequency domain transform is performed on the first face image, that is, the first spectrogram is obtained by the global frequency domain transform, the plurality of second spectrograms may be obtained by performing multiple filtering processes on the first spectrogram using multiple sets of filters corresponding to the global frequency domain transform. In this application, the multiple sets of filters corresponding to the global frequency domain transform are referred to as multiple sets of first filters; the process of performing multiple filtering processes on the first spectrogram through the multiple sets of filters will be described in detail later and is not expanded upon here. By filtering the first spectrogram through the multiple sets of filters, second spectrograms of different frequency bands can be obtained, so that the subsequently obtained input data contains information of different frequency bands in the first spectrogram. Because the input data contains rich frequency band information, the accuracy of authenticity identification of the first face image can be improved.
For example, in the case where only the local frequency domain transform is performed on the first face image, that is, the first spectrograms are obtained by the local frequency domain transform, multiple filtering processes may be performed on each first target spectrogram by multiple sets of filters corresponding to the local transform, so as to obtain a plurality of second spectrograms corresponding to each first target spectrogram. In this application, the multiple sets of filters corresponding to the local frequency domain transform are referred to as multiple sets of second filters; the process of performing multiple filtering processes on each first target spectrogram through the multiple sets of second filters will be described in detail later and is not expanded upon here.
For example, in the case where both the global frequency domain transform and the local frequency domain transform are performed on the first face image, that is, the first spectrograms include one first spectrogram obtained by the global frequency domain transform and a plurality of first spectrograms obtained by the local frequency domain transform, multiple filtering processes are performed on the first spectrogram obtained by the global frequency domain transform through the multiple sets of first filters to obtain a plurality of second spectrograms corresponding to that first spectrogram, and multiple filtering processes are performed on each first target spectrogram through the multiple sets of second filters to obtain a plurality of second spectrograms corresponding to each first target spectrogram. Therefore, in this case, the plurality of second spectrograms include both the second spectrograms obtained by filtering the first spectrogram of the global frequency domain transform and the second spectrograms obtained by filtering each first target spectrogram of the local frequency domain transform.
It should be noted that, in the filtering process, the multiple sets of first filters may first perform multiple filtering processes on the first spectrogram obtained by the global frequency domain transform, or the multiple sets of second filters may first perform multiple filtering processes on each first target spectrogram obtained by the local frequency domain transform; of course, the first spectrograms obtained by the global and local frequency domain transforms may also be filtered simultaneously. The order of filtering is not limited in this application.
In the filtering process, each filtering process yields one second spectrogram, and each set of filters corresponds to one filtering process.
104: and obtaining the input data according to the plurality of second spectrograms.
For example, in the case where the plurality of second spectrograms only include second spectrograms corresponding to the global frequency domain transform, each of the plurality of second spectrograms may be subjected to an inverse frequency domain transform to obtain a plurality of second images, wherein the inverse frequency domain transform is the inverse of the global frequency domain transform; then, the plurality of second images are spliced to obtain the input data. In the present application, the input data obtained via the global frequency domain transform is referred to as first input data; the first input data is substantially the same as the input data obtained via the global frequency domain transform.
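This path can be sketched as follows, with the sets of first filters simplified to plain 0/1 masks and channel-wise stacking standing in for the splicing; these simplifications are ours:

```python
import numpy as np
from scipy.fft import dctn, idctn


def first_input_data(face: np.ndarray, masks: list) -> np.ndarray:
    """Filter the global spectrogram with each filter mask, inverse-transform
    each second spectrogram into a second image, and splice the second
    images along the channel axis to form the first input data."""
    spec = dctn(face, norm='ortho')                      # first spectrogram
    seconds = [spec * m for m in masks]                  # second spectrograms
    images = [idctn(s, norm='ortho') for s in seconds]   # second images
    return np.stack(images, axis=0)                      # (N, H, W) input data
```

Because the inverse transform is linear, second images produced by complementary masks sum back to the original face image.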
For example, in the case where the plurality of second spectrograms only include second spectrograms corresponding to the local frequency domain transform, the energy of each second spectrogram may be determined, and the feature vector corresponding to each first target spectrogram obtained from the energies of the second spectrograms corresponding to that first target spectrogram; then, the feature vectors corresponding to the plurality of first target spectrograms are spliced to obtain the input data. In the present application, the input data obtained via the local frequency domain transform is referred to as second input data; the second input data is substantially the same as the input data obtained via the local frequency domain transform.
When the size of the second input data obtained after splicing does not match the size specified by the neural network, channel conversion needs to be performed on the spliced second input data so that its size matches the size specified by the neural network, and the channel-converted data is used as the second input data. The second input data mentioned hereafter has all undergone the corresponding channel conversion to match the size specified by the neural network.
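A minimal sketch of this assembly step, assuming one feature vector per sliding-window position and modeling the channel conversion as a 1 x 1 linear projection (the concrete conversion is not fixed by the text):

```python
import numpy as np


def second_input_data(feature_vectors: np.ndarray, grid: tuple, proj=None) -> np.ndarray:
    """Arrange the per-window feature vectors on the sliding-window grid,
    preserving each window's spatial position, and optionally apply a
    1x1 channel projection so the channel count matches the network."""
    gh, gw = grid
    n, c = feature_vectors.shape
    assert n == gh * gw, "one feature vector per sliding-window position"
    x = feature_vectors.reshape(gh, gw, c)   # (H', W', M), spatial layout kept
    if proj is not None:                     # channel conversion, M -> C
        x = x @ proj
    return np.transpose(x, (2, 0, 1))        # (C, H', W') input for the network
```

For a 2 x 2 window grid with 8 bands per window and a projection to 3 channels, the result has shape (3, 2, 2).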
For example, in a case where the plurality of second spectrograms include a plurality of second spectrograms corresponding to the global frequency-domain transform and a plurality of second spectrograms corresponding to the local frequency-domain transform, then the input data includes the first input data and the second input data, and the manner of obtaining the first input data and the second input data is similar to the above process and will not be described again.
105: and determining the authenticity of the first face image according to the input data.
The determination of the authenticity of the first face image is essentially to determine whether the first face image is an original face image, that is, whether the first face image is replaced, modified, or copied.
It can be seen that, in the embodiment of the present application, multiple sets of filters are used to perform multiple filtering processes on the first spectrogram to obtain a plurality of second spectrograms, so that the frequency band information of the plurality of second spectrograms differs. Because the input data is obtained from the plurality of second spectrograms, the input data contains multiple pieces of frequency band information of the first spectrogram. Identifying the authenticity of the first face image according to the input data therefore uses multiple frequency bands, which improves the accuracy of authenticity identification of the first face image and reduces the false identification rate.
In some possible embodiments, in a case where the input data only includes the first input data or the second input data, the input data may be subjected to feature extraction to obtain a first feature map; and determining the authenticity of the first face image according to the first feature map, namely classifying according to the first feature map and determining the authenticity of the first face image.
Therefore, the authenticity of the first face image is identified through multiple pieces of frequency band information in the first spectrogram rather than a single frequency band, improving the accuracy of identifying the authenticity of the first face image.
In some possible embodiments, when the input data includes first input data and second input data, the first input data and the second input data need to be subjected to cross fusion processing to obtain a second feature map and a third feature map; and determining the authenticity of the first face image according to the second feature map and the third feature map.
For example, the second feature map and the third feature map may be spliced, and the authenticity of the first face image determined according to the spliced feature map: feature extraction is performed on the spliced feature map to obtain a target feature map, classification is performed according to the target feature map, and the authenticity of the first face image is determined.
Alternatively, the second feature map and the third feature map need not be spliced. Illustratively, the second feature map and the third feature map may be pooled simultaneously to obtain a target feature map, which is equivalent to merging the two feature maps during pooling; then, classification is performed according to the target feature map, and the authenticity of the first face image is determined.
Therefore, the global frequency domain information and the local frequency domain information of the first face image are cross-fused, so that the cross-fused second feature map and third feature map contain more frequency band information, which can improve the accuracy of identifying the authenticity of the first face image. Moreover, more detailed frequency band information in the first face image can be extracted through the local frequency domain transform, further improving the identification precision. In addition, the local frequency domain transform uses a sliding window to frame regions of the first face image, so the feature vector of each first target spectrogram also carries spatial position information (the position framed by the sliding window in the first face image); the second input data thus contains this spatial position information and can be input directly to the neural network for feature extraction.
In some possible embodiments, the cross fusion process may be performed multiple times, and the second feature map and the third feature map may be obtained as follows: a first cross fusion process is performed on the first input data and the second input data to obtain a fourth feature map and a fifth feature map; the fourth feature map and the fifth feature map are then used as the input of the next cross fusion process, and the second feature map and the third feature map are obtained after several cross fusion processes.
The specific process of cross fusion is described below by taking the first cross fusion process performed on the first input data and the second input data as an example; the other cross fusion processes are implemented similarly and are not described again.
Feature extraction is performed on the first input data to obtain a sixth feature map, and feature extraction is performed on the second input data to obtain a seventh feature map; the two feature extraction processes may be performed in either order. A first matrix is then obtained from the sixth feature map and the seventh feature map, the first matrix representing the correlation between them: the sixth and seventh feature maps are essentially two matrices, and the first matrix is the cross-correlation coefficient between the two. An eighth feature map is obtained according to the first matrix and the seventh feature map, namely by dot-multiplying the first matrix and the seventh feature map, and the eighth feature map is superimposed on the seventh feature map to obtain the fourth feature map; a ninth feature map is obtained according to the first matrix and the sixth feature map, namely by dot-multiplying the first matrix and the sixth feature map, and the ninth feature map is superimposed on the sixth feature map to obtain the fifth feature map.
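One possible reading of this cross fusion step is sketched below, with each feature map flattened to a (C, H*W) matrix, the "dot multiplication" taken as a matrix product, and each re-weighted map added back to its own branch; these shape and operation choices are our assumptions, not fixed by the text:

```python
import numpy as np


def cross_fuse(f6: np.ndarray, f7: np.ndarray):
    """One round of cross fusion between two (C, H*W) feature maps:
    a cross-correlation matrix re-weights each branch's features, and
    each re-weighted map is superimposed on its source branch."""
    m = f6 @ f7.T          # first matrix: cross-correlation, shape (C, C)
    f8 = m @ f7            # eighth feature map
    f4 = f8 + f7           # fourth feature map
    f9 = m @ f6            # ninth feature map
    f5 = f9 + f6           # fifth feature map
    return f4, f5
```

If one branch is all zeros, the correlation vanishes and the fusion leaves the other branch unchanged, which matches the intuition that the correlation term only adds cross-branch information.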
The process of performing multiple filtering processes on the first spectrogram in the present application is described in detail below.
First, in the present application, whether multiple filtering processes are performed on the first spectrogram obtained by the global frequency domain transform or on a first target spectrogram obtained by the local frequency domain transform, multiple sets of filters are used. Performing multiple filtering processes on the first spectrogram (or the first target spectrogram) therefore means filtering it through multiple sets of filters, where the filter parameters of each set of filters include preset parameters and reference parameters. The reference parameters are network parameters obtained by pre-training a neural network; how they are obtained is described in detail later and is not expanded upon here. In addition, each set of filters is configured to separate, from the first spectrogram (or the first target spectrogram), the frequency band information corresponding to its preset parameters, and its reference parameters compensate that frequency band information. The frequency band information separated by any two sets of filters differs, and the pieces of frequency band information separated by the multiple sets of filters together cover all the frequency band information in the first spectrogram (or the first target spectrogram).
Although multiple sets of filters are used in both cases, in practical applications the filter parameters and the number of filters differ between the two frequency domain transforms. Different filter parameters means that the preset parameters differ, the reference parameters differ, or both differ. In practice, when filtering the spectrograms obtained by the global and the local frequency domain transforms, both the preset parameters and the reference parameters are generally set differently; that is, the first filters and the second filters differ in preset parameters, reference parameters, and number. For convenience of distinction, the preset parameters and reference parameters of the first filters may be referred to as first preset parameters and first reference parameters, and those of the second filters as second preset parameters and second reference parameters. The filtering processes through the multiple sets of first filters and the multiple sets of second filters are described below.
Exemplarily, first frequency band information of the first spectrogram obtained by the global frequency domain transform is extracted through the first preset parameters of each set of first filters, and the first frequency band information is compensated through the first reference parameters to obtain the second spectrogram corresponding to that set of first filters; that is, third frequency band information in the first spectrogram is extracted through the first reference parameters, and the first and third frequency band information are superimposed to obtain the second spectrogram. In addition, the first preset parameters of any two sets of first filters differ, so the first frequency band information extracted by any two sets of first filters differs, and the pieces of first frequency band information extracted by the multiple sets of filters together cover all the frequency band information in the first spectrogram; that is, combining them recovers all the frequency band information of the first spectrogram.
In practical application, the first preset parameter and the first reference parameter of each group of first filters may be superimposed, and the superimposed parameters are used to perform filtering processing on the first spectrogram, so as to directly obtain the second spectrogram corresponding to the group of first filters.
Extracting the first frequency band information through the first preset parameters means performing a product operation between the first preset parameters and the first spectrogram, filtering out part of the frequency band information in the first spectrogram and retaining the rest; the retained part is the first frequency band information.
The first preset parameter is essentially a matrix of the same size as the first spectrogram. Illustratively, a first preset parameter of [0, 1/16] means that the top-left 1/16 of the matrix takes the value 1 and the remainder takes the value 0. As shown in fig. 2, the black portion of the matrix corresponding to the first preset parameter represents the value 0 and the white portion represents the value 1. In addition, the first spectrogram is obtained by performing DCT on the first face image; for any image subjected to DCT, the top-left corner of the resulting spectrogram carries the low-frequency information of the image, the middle carries the intermediate-frequency information, and the bottom-right corner carries the high-frequency information. Therefore, multiple different sets of first filters can be used to perform multiple filtering processes on the same first spectrogram, obtaining multiple different second spectrograms corresponding to the first spectrogram. As shown in fig. 2, assuming the first preset parameter of the first set of first filters is [0, 1/16], dot-multiplying it with the first spectrogram retains the frequency band information in the top-left 1/16 of the first spectrogram, that is, the low-frequency information, and filters out the other frequency band information, yielding the second spectrogram corresponding to the first set of first filters; the white portion of that second spectrogram is the retained low-frequency information. For another example, as shown in fig. 2, the first preset parameter of the last set of first filters is [1/8, 1], so dot-multiplying it with the first spectrogram retains the frequency band information in the bottom-right 7/8 of the first spectrogram, that is, the high-frequency information, and filters out the other frequency band information, yielding the second spectrogram corresponding to that set of first filters; the white portion of that second spectrogram is the retained high-frequency information. Subsequent filtering of spectrograms by filters follows the process shown in fig. 2 and is not described in detail.
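A minimal way to realize such an [a, b] first preset parameter as a 0/1 mask: order the coefficient cells from the low-frequency top-left corner toward the high-frequency bottom-right corner (here by anti-diagonal index, one plausible ordering that this application does not fix) and keep the cells whose cumulative fraction falls in [a, b):

```python
import numpy as np


def band_mask(size: int, a: float, b: float) -> np.ndarray:
    """0/1 matrix of shape (size, size) with 1 on the cells whose rank,
    in low-to-high frequency order, lies in the fraction interval [a, b)."""
    order = sorted(((i + j, i, j) for i in range(size) for j in range(size)))
    n = size * size
    mask = np.zeros((size, size))
    for rank, (_, i, j) in enumerate(order):
        if a * n <= rank < b * n:
            mask[i, j] = 1.0
    return mask
```

With this construction, masks for [0, 1/3], [1/3, 2/3], and [2/3, 1] partition the spectrogram, matching the equal-division example discussed below.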
Therefore, multiple sets of first filters can be designed to filter the first spectrogram, obtaining multiple second spectrograms with different frequency band information. For example, to separate the low-frequency, intermediate-frequency, and high-frequency information of the first spectrogram, three sets of first filters may be designed, with first preset parameters of, respectively, [0, 1/16], [1/16, 1/8], and [1/8, 1]: [0, 1/16] separates the low-frequency information in the first spectrogram, [1/16, 1/8] separates the intermediate-frequency information, and [1/8, 1] separates the high-frequency information. These three sets of first preset parameters are only illustrative; in practical applications, the first spectrogram may also be filtered with equal division, that is, the first preset parameters of the three sets of filters are spaced equally: [0, 1/3], [1/3, 2/3], [2/3, 1].
Therefore, the first preset parameters of each set of first filters can be set in advance according to the frequency band information to be separated. Generally, the first preset parameters are set to [0, 1/16], [1/16, 1/8], and [1/8, 1], which ensures that the energies of the three second spectrograms obtained after filtering are approximately the same. In this way, after the three second spectrograms are spliced into the first input data, the energy difference between the layers of data is not too large, that is, spatial continuity is satisfied, which facilitates subsequent feature extraction from the first input data.
In addition, each set of first filters may include a base filter and an adjustable filter, wherein the filter parameter of the base filter is a first preset parameter of the set of first filters, and the filter parameter of the adjustable filter is a first reference parameter of the set of first filters.
Wherein each set of first filters can be represented by equation (1):

f_i = f_i^base + σ(f_i^w)  (1)

wherein f_i is the i-th set among the multiple sets of first filters; f_i^base is the first preset parameter of the i-th set of first filters, namely the base filter; f_i^w is the first reference parameter of the i-th set of first filters, namely the adjustable filter; and σ is a compression function used to compress the value of the first reference parameter into a preset range, for example σ(x) = (1 - e^x)/(1 + e^x). The value of i is an integer from 1 to N, where N is the number of sets of first filters.
The first reference parameter only compensates the frequency band information; the range of the frequency band information separated by each set of first filters is determined by its base filter, so that after the multiple sets of first filters are divided in advance, all frequency band information of the first spectrogram can be extracted. The compression function σ compresses the value of the first reference parameter to [-1, 1] to prevent it from becoming too large; otherwise, after superposition with the base filter, the filter parameters of each set of first filters would be dominated by the adjustable filter, the range of extracted frequency band information would be determined by the adjustable filter, and it might become impossible to extract all the frequency band information of the first spectrogram.
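Equation (1) can be sketched directly; σ below uses the example compression function given in the text, and the parameter names are ours:

```python
import numpy as np


def sigma(x: np.ndarray) -> np.ndarray:
    """Compression function from the text, mapping values into (-1, 1)."""
    return (1.0 - np.exp(x)) / (1.0 + np.exp(x))


def first_filter(base: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Equation (1): one set of first filters is the preset base filter
    plus the compressed, learned adjustable filter."""
    return base + sigma(w)
```

Because |σ| < 1 everywhere, the compensation can never override the band split fixed by the base filter, which is exactly the point of the compression.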
In conjunction with equation (1), performing multiple filtering processes on the first spectrogram can be represented by equation (2):

s_i = D(x) ⊙ f_i  (2)

wherein s_i is the second spectrogram corresponding to the i-th set among the multiple sets of first filters, x is the first face image, D(·) is the global frequency domain transform, and ⊙ is the dot product operation.
Furthermore, the inverse frequency domain transform is the inverse of the global frequency domain transform; in conjunction with equation (2), the process of obtaining the second image can be expressed by equation (3):

y_i = D^(-1)(D(x) ⊙ f_i)  (3)

wherein x is the first face image, y_i is the second image corresponding to the i-th set of first filters, D(·) is the global frequency domain transform, D^(-1)(·) is the inverse frequency domain transform, and ⊙ is the dot product operation.
For example, similar to the first filters, second frequency band information corresponding to the second preset parameters can be extracted from each first target spectrogram through each set of second filters, and the extracted second frequency band information is compensated through the second reference parameters of that set of second filters, obtaining the second spectrogram corresponding to that set. That is, fourth frequency band information in the first target spectrogram is extracted through the second reference parameters, and the second and fourth frequency band information are superimposed to obtain the second spectrogram. The second reference parameters are also network parameters obtained by pre-training. In addition, the second preset parameters of any two sets of second filters differ, so the second frequency band information extracted by any two sets of second filters differs; moreover, the pieces of second frequency band information extracted by the multiple sets of second filters together cover all the frequency band information in each first target spectrogram, that is, combining them recovers all the frequency band information of each first target spectrogram.
In addition, each set of second filters also includes a base filter and an adjustable filter, wherein the filter parameter of the base filter is a second preset parameter of the set of second filters, and the filter parameter of the adjustable filter is a second reference parameter of the set of second filters.
Wherein each set of second filters can be represented by equation (4):

h_i = h_i^base + σ(h_i^w)  (4)

wherein h_i is the i-th set among the multiple sets of second filters; h_i^base is the second preset parameter of the i-th set of second filters, namely the base filter; h_i^w is the second reference parameter of the i-th set of second filters, namely the adjustable filter; and σ is the compression function.
In combination with equation (4), the process of performing multiple filtering processes on each first target spectrogram can be represented by equation (5):

g_i = D(p) ⊙ h_i  (5)

wherein g_i is the i-th second spectrogram among the plurality of second spectrograms corresponding to each first target spectrogram, p is the image region obtained by the p-th frame selection on the first face image, D(·) is the local frequency domain transform, and ⊙ is the dot product operation.
In some possible embodiments, the second preset parameters of the multiple sets of second filters are set in advance. For example, the second preset parameters may be obtained by equidistant division along the diagonal of the first target spectrogram according to the number of second filters. As shown in fig. 3, the first target spectrogram has size 4 × 4; with 8 sets of second filters, the diagonal is moved equidistantly to determine the second preset parameters of each set, which are respectively: [0, 1/32], [1/32, 1/8], [1/8, 9/32], [9/32, 1/2], [1/2, 23/32], [23/32, 28/32], [28/32, 31/32], [31/32, 1].
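The listed boundaries can be reproduced by sliding the diagonal line u + v = t across the 4 × 4 spectrogram in unit steps and taking the fraction of area swept; this geometric reading is consistent with fig. 3, though the text does not spell out the formula:

```python
def diagonal_fraction(t: float, s: float = 4.0) -> float:
    """Fraction of an s-by-s square lying below the line u + v = t,
    i.e. the area swept when the diagonal is moved a distance t."""
    if t <= s:
        return t * t / (2.0 * s * s)
    return 1.0 - (2.0 * s - t) ** 2 / (2.0 * s * s)
```

Evaluating t = 1, 2, ..., 8 yields 1/32, 1/8, 9/32, 1/2, 23/32, 28/32, 31/32, 1, exactly the interval boundaries of the 8 sets of second filters above.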
Of course, the division need not be equidistant when setting the second preset parameters. For example, the diagonal may be moved according to an arithmetic progression, or in other ways, to obtain the second preset parameters. The present application does not limit the manner of setting the second preset parameters.
Then, the energy of each second spectrogram among the plurality of second spectrograms corresponding to each first target spectrogram is determined, obtaining a plurality of energies; these energies form a feature vector, yielding the feature vector corresponding to each first target spectrogram. Since each second spectrogram corresponds to different frequency band information, the feature vector is essentially composed of the energy of each frequency band in the first target spectrogram.
Wherein the energy of each second spectrogram can be represented by formula (6):

q_i = log10(‖D(p) ⊙ h_i‖_1)  (6)

wherein q_i is the energy of the i-th second spectrogram among the plurality of second spectrograms corresponding to each first target spectrogram; ‖·‖_1 is the L1 norm of a matrix, i.e., the sum of the absolute values of its elements, which yields the energy of the spectrogram; the value of i is an integer from 1 to M, where M is the number of the plurality of second spectrograms, i.e., the number of sets of second filters; and ⊙ is the dot product operation.
The log10 makes the energies of different frequency bands fall on the same order of magnitude, preventing some bands from having far higher energy than others, which would complicate subsequent processing.
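Formulas (5) and (6) combine into a short routine; the small ε added before the log is our guard against an all-zero band and is not part of the text:

```python
import numpy as np
from scipy.fft import dctn


def energy_vector(patch: np.ndarray, masks: list, eps: float = 1e-12) -> np.ndarray:
    """Per formulas (5)-(6): filter the local spectrogram D(p) with each
    second filter and take log10 of the L1 norm as that band's energy,
    yielding the feature vector of one first target spectrogram."""
    spec = dctn(patch, norm='ortho')                                  # D(p)
    return np.array([np.log10(np.abs(spec * m).sum() + eps) for m in masks])
```

The resulting vector has one entry per set of second filters, matching the construction of the feature vector described above.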
It should be noted that the first filters filter the first spectrogram of the global frequency domain transform, that is, they operate on global frequency band information, which only requires a rough division of the first spectrogram's frequency bands, for example into high, intermediate, and low frequencies; the number of first filters can therefore be set relatively small. The second filters filter the locally transformed first target spectrograms, that is, they operate on local frequency band information and extract more detailed frequency domain information from each first target spectrogram; the frequency bands of the first target spectrogram therefore need to be divided finely, and the number of second filters needs to be set relatively large.
In some possible embodiments, the above process of identifying whether the face image is genuine or fake may be implemented by a neural network. The neural network includes a first network and a second network. The training process of the neural network is conventional supervised training and is not described here.
Specifically, a first face image is input into the first network and subjected to frequency domain transformation to obtain a first spectrogram; the first spectrogram is subjected to multiple filtering processes to obtain a plurality of second spectrograms, and input data is obtained according to the plurality of second spectrograms. Then, the input data is input into the second network, and the authenticity of the first face image is determined.
Illustratively, the first network may be an existing network structure, i.e., an existing neural network capable of frequency domain transformation and filtering. For the purposes of this application, it suffices to set the network parameters of that neural network to the first reference parameters and second reference parameters described above; then, during training, the first reference parameters and second reference parameters are adjusted. After training is complete, the first network performs the frequency domain transform on the first face image and the multiple filtering processes on the first spectrogram to obtain the input data.
In practical application, frequency domain transformation is performed on the first face image, and multiple times of filtering processing are performed on the first spectrogram, which can also be realized through a packaged function. That is, after the training of the first network is completed, the filter parameters of the multiple sets of filters (including the multiple sets of first filters and the multiple sets of second filters) of the first network are packaged as a function, and then the function can be used to directly perform frequency domain transformation and filtering processing on the face image without using the first network.
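The idea of packaging the trained filter parameters as a standalone function can be sketched with `functools.partial`. The function `apply_filter_bank` and the filter values below are hypothetical stand-ins for the trained first network, not the patent's actual parameters:

```python
import numpy as np
from functools import partial

def apply_filter_bank(spectrogram, filters):
    """Stand-in for the trained first network's filtering step: apply each
    learned filter to a spectrogram by element-wise product."""
    return [f * spectrogram for f in filters]

# After training, the learned filter parameters are "packaged" into a plain
# callable, so the pipeline can filter spectrograms without the first network.
trained_filters = [np.ones((4, 4)), np.zeros((4, 4))]   # hypothetical learned values
filter_step = partial(apply_filter_bank, filters=trained_filters)

out = filter_step(np.full((4, 4), 2.0))
```

Freezing the parameters this way matches the description above: once the function is built, frequency domain transformation and filtering can be applied directly to face images.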
The following describes in detail the process of determining the authenticity of the first face image through the second network.
Illustratively, the second network may be a convolutional neural network. As shown in fig. 4, the second network uses Xception network as the backbone of the convolutional neural network. The second network includes two branches and a plurality of cross-fusion processing modules. Wherein each branch comprises a plurality of network blocks (blocks), each block comprises a plurality of convolutional layers and pooling layers, and the blocks are of an existing network structure and are not described. And the two branches correspond to the first input data and the second input data, respectively.
In the case where the input data is the first input data, feature extraction can be performed on the first input data through the first branch, that is, feature extraction is performed by the plurality of blocks of the first branch, and the authenticity of the first face image is determined directly according to the extracted feature map;
in the case where the input data is the second input data, feature extraction can be performed on the second input data through the second branch, that is, feature extraction is performed by the plurality of blocks of the second branch, and the authenticity of the first face image is determined directly according to the extracted feature map;
in the case where the input data includes the first input data and the second input data, feature extraction is first performed on the first input data and the second input data through a block of each branch to obtain a sixth feature map and a seventh feature map; then, the sixth feature map and the seventh feature map are subjected to a first cross fusion through the cross fusion processing module to obtain a fourth feature map and a fifth feature map; subsequently, the fourth feature map and the fifth feature map are taken as the input data of the next cross fusion processing, and the cross fusion processing continues until a second feature map and a third feature map corresponding to the two branches are obtained. Finally, the second feature map and the third feature map are spliced, and the authenticity of the first face image is determined according to the spliced feature map.
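One cross fusion step can be sketched in NumPy. The description later in this document specifies only a correlation matrix between the two feature maps followed by additive combination; the softmax normalization used below for re-weighting is an assumption of this sketch, not a detail stated in the patent:

```python
import numpy as np

def cross_fuse(feat_a, feat_b):
    """One cross-fusion step between two flattened feature maps of shape
    (channels, positions). A correlation matrix between the maps re-weights
    each map with information from the other, and the result is added back,
    so both outputs mix the two branches' information."""
    corr = feat_a @ feat_b.T                       # "first matrix": (C, C) correlation
    corr = corr - corr.max(axis=1, keepdims=True)  # stabilize before softmax
    attn = np.exp(corr)
    attn = attn / attn.sum(axis=1, keepdims=True)
    fused_a = feat_a + attn @ feat_b               # fusion result added to feat_a
    fused_b = feat_b + attn.T @ feat_a             # fusion result added to feat_b
    return fused_a, fused_b

rng = np.random.default_rng(0)
a = rng.normal(size=(3, 16))     # feature map from the global branch
b = rng.normal(size=(3, 16))     # feature map from the local branch
fa, fb = cross_fuse(a, b)
```

Iterating this step, feeding its outputs back in as the next inputs, mirrors the repeated cross fusion processing that produces the second and third feature maps.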
Compared with the existing way of splicing frequency domain information, the present application cross-fuses and splices the frequency domain information obtained by the global frequency domain transform and the local frequency domain transform, that is, the frequency domain information obtained by the two frequency domain transforms is fused into each other. As a result, both the resulting second feature map and third feature map contain the global frequency domain information and the local frequency domain information of the first spectrogram, which can improve the accuracy of identifying the authenticity of the first face image.
The following describes in detail the process of identifying the authenticity of a face image according to the present application with reference to fig. 5 to 7.
As shown in fig. 5, the first face image is subjected to frequency domain transformation and filtering processing through two branches, that is, the first face image is subjected to a global frequency domain transform and a local frequency domain transform, each followed by a series of filtering processes, so as to obtain first input data corresponding to the global frequency domain transform branch and second input data corresponding to the local frequency domain transform branch. Then, the first input data and the second input data are respectively input into the convolutional networks of their respective branches for feature extraction, and the extracted features are subjected to cross fusion processing to finally obtain the second feature map and the third feature map of the two branches. Pooling processing is performed on the second feature map and the third feature map to obtain a target feature map; finally, the authenticity of the first face image is predicted according to the target feature map, thereby determining the authenticity of the first face image.
Fig. 6 is a refinement of the global frequency domain transform branch. As shown in fig. 6, first, a DCT transform is performed on the first face image to obtain a first spectrogram; then, multiple filtering processes are performed on the first spectrogram through multiple groups of first filters (only three groups are shown in fig. 6) to obtain a plurality of second spectrograms; finally, an inverse frequency domain transform is performed on each of the plurality of second spectrograms to obtain a plurality of second images, and the plurality of second images are spliced to obtain the first input data.
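The global branch of fig. 6 (DCT, band filtering, inverse DCT, splicing) can be sketched in NumPy. The orthonormal DCT matrix and the single all-pass mask below are illustrative stand-ins for the patent's learned first-filter banks:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix D, so D @ x is the 1-D DCT of x."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    d = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    d[0, :] = np.sqrt(1.0 / n)
    return d

def global_branch(image, band_masks):
    """DCT the whole image, filter per band, inverse-DCT, stack as channels."""
    n = image.shape[0]
    d = dct_matrix(n)
    spec = d @ image @ d.T                    # 2-D DCT: the first spectrogram
    channels = []
    for mask in band_masks:
        band = mask * spec                    # one second spectrogram
        channels.append(d.T @ band @ d)       # inverse DCT -> one second image
    return np.stack(channels, axis=0)         # spliced first input data

img = np.random.default_rng(0).normal(size=(8, 8))
masks = [np.ones((8, 8))]                     # single all-pass mask for the demo
out = global_branch(img, masks)
```

With the all-pass mask, the inverse transform reconstructs the original image exactly, which is a quick sanity check that the forward and inverse DCT pair is consistent.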
Fig. 7 is a refinement of the local frequency domain transform branch. As shown in fig. 7, first, an SWDCT transform is performed on the first face image to obtain a plurality of first target spectrograms; then, multiple filtering processes are performed on each first target spectrogram through multiple groups of second filters to obtain a plurality of second spectrograms corresponding to each first target spectrogram; the energy of each second spectrogram is determined, and the feature vector of each first target spectrogram is determined according to the energies of the plurality of second spectrograms corresponding to it; finally, the feature vectors corresponding to the plurality of first target spectrograms are spliced, and channel conversion is performed on the spliced data (matrix) to obtain second input data of the size specified by the convolutional network of this branch.
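Combining a sliding-window DCT with per-band energies gives a sketch of the local branch of fig. 7. The 4x4 window, the two bands, the non-overlapping stride, and the (u + v) band partition are illustrative assumptions, not the patent's actual configuration:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    d = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    d[0, :] = np.sqrt(1.0 / n)
    return d

def local_branch(image, window=4, num_bands=2):
    """Sliding-window DCT, then a log10 L1-energy per band per window.

    Returns an array of shape (rows, cols, num_bands): one feature vector
    per first target spectrogram, laid out on the window grid before any
    channel conversion.
    """
    d = dct_matrix(window)
    u, v = np.meshgrid(np.arange(window), np.arange(window), indexing="ij")
    edges = np.linspace(0, 2 * (window - 1) + 1, num_bands + 1)
    masks = [((u + v >= edges[i]) & (u + v < edges[i + 1])).astype(float)
             for i in range(num_bands)]
    h, w = image.shape
    rows, cols = h // window, w // window
    feats = np.zeros((rows, cols, num_bands))
    for r in range(rows):
        for c in range(cols):
            patch = image[r*window:(r+1)*window, c*window:(c+1)*window]
            spec = d @ patch @ d.T                        # first target spectrogram
            for b, m in enumerate(masks):                 # second filters per band
                feats[r, c, b] = np.log10(np.abs(m * spec).sum() + 1e-12)
    return feats

img = np.random.default_rng(0).normal(size=(8, 8))
feats = local_branch(img)
```

Reshaping `feats` to put the band dimension first would correspond to the channel conversion step that produces the second input data.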
The application scenarios of the technical solution are introduced below in connection with the method for identifying the authenticity of a face image.
In some possible embodiments, in the case where the first face image is a portrait image of a user, the portrait image may be identified based on the technical solution of the present application. If the portrait image is determined to be a fake image, it is determined that another person has modified the portrait image, which is highly likely to infringe the user's portrait rights, and the act of maliciously modifying the portrait image can be held accountable.
In some possible embodiments, when the first face image is any frame, or a specific frame, of face image in a video to be recognized, the face image may be identified based on the technical solution of the present application. If the face image (that is, the first face image referred to in the present application) is determined to be a fake image, it is determined that another person has modified the video to be recognized, so that the act of tampering with the video work can be traced and held accountable.
Referring to fig. 8, fig. 8 is a schematic structural diagram of an apparatus for identifying authenticity of a face image according to an embodiment of the present application. As shown in fig. 8, an apparatus 800 includes a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs including instructions for performing the steps of:
acquiring a first face image;
performing frequency domain transformation on the first face image to obtain a first spectrogram;
respectively carrying out multiple times of filtering processing on the first spectrogram to obtain a plurality of second spectrograms;
obtaining input data according to the plurality of second spectrograms;
and determining the authenticity of the first face image according to the input data.
In some possible embodiments, the frequency domain transform comprises at least one of: global frequency domain transforms and local frequency domain transforms.
In some possible embodiments, in a case where the frequency domain transform includes the global frequency domain transform, in terms of obtaining the input data according to the plurality of second spectrograms, the program is specifically configured to execute the following instructions:
performing inverse frequency domain transformation on each second spectrogram to obtain a plurality of second images, wherein the inverse frequency domain transformation is an inverse process of the global frequency domain transformation;
and splicing the plurality of second images to obtain the input data.
In some possible embodiments, in a case where the frequency domain transform includes the local frequency domain transform, the number of first spectrograms comprises one or more;
in terms of performing multiple filtering processes on the first spectrogram to obtain multiple second spectrograms, the program is specifically configured to execute the following instructions:
and respectively carrying out multiple times of filtering processing on the first spectrogram to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, in a case where the number of first spectrograms is plural, in terms of obtaining the input data according to the plurality of second spectrograms, the program is specifically configured to execute the following steps:
determining the energy of each second spectrogram;
taking each first spectrogram as a first target spectrogram, and obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the input data.
In some possible embodiments, the program is specifically adapted to perform the following steps in determining the authenticity of the first face image based on the input data:
performing feature extraction on the input data to obtain a first feature map;
and determining the authenticity of the first face image according to the first feature map.
In some possible embodiments, in a case where the frequency domain transform includes the global frequency domain transform and the local frequency domain transform, the global frequency domain transform obtains one first spectrogram and the local frequency domain transform obtains one or more first spectrograms; in terms of performing multiple filtering processes on the first spectrograms to obtain multiple second spectrograms, the program is specifically configured to execute the following steps:
performing multiple filtering processing on a first spectrogram obtained by global frequency domain transformation to obtain a plurality of second spectrograms corresponding to the first spectrogram;
and performing multiple filtering processing on one or more first spectrograms obtained by local frequency domain transformation to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, the input data includes first input data and second input data, and in terms of obtaining the input data according to the plurality of second spectrograms, the program is specifically configured to execute the following steps:
performing inverse frequency domain transformation on each of a plurality of second spectrograms corresponding to the global frequency domain transformation to obtain a plurality of second images, wherein the inverse frequency domain transformation is an inverse process of the global frequency domain transformation;
splicing the plurality of second images to obtain first input data;
under the condition that the number of the first spectrograms obtained by the frequency domain transformation is multiple, taking each first spectrogram obtained by the frequency domain transformation as a first target spectrogram, and determining the energy of each second spectrogram in a plurality of second spectrograms corresponding to the first target spectrogram;
obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the second input data.
In some possible embodiments, the program is specifically adapted to perform the following steps in determining the authenticity of the first face image based on the input data:
performing cross fusion processing on the first input data and the second input data to obtain a second characteristic diagram and a third characteristic diagram;
and determining the authenticity of the first face image according to the second feature map and the third feature map.
In some possible embodiments, when the number of times of the cross fusion processing is multiple, in terms of performing the cross fusion processing on the first input data and the second input data to obtain the second feature map and the third feature map, the program is specifically configured to execute instructions for:
performing first cross fusion processing on the first input data and the second input data to obtain a fourth feature map and a fifth feature map;
and taking the fourth feature map and the fifth feature map as input data of the next cross fusion processing, and obtaining the second feature map and the third feature map after carrying out the cross fusion processing for a plurality of times.
In some possible embodiments, in the aspect of performing the first cross fusion processing on the first input data and the second input data to obtain the fourth feature map and the fifth feature map, the program is specifically configured to execute the following steps:
performing feature extraction on the first input data to obtain a sixth feature map;
performing feature extraction on the second input data to obtain a seventh feature map;
obtaining a first matrix according to the sixth feature map and the seventh feature map, wherein the first matrix is used for representing the correlation between the sixth feature map and the seventh feature map;
obtaining an eighth feature map according to the first matrix and the seventh feature map, and overlapping the eighth feature map and the sixth feature map to obtain a fourth feature map;
and obtaining a ninth feature map according to the first matrix and the sixth feature map, and overlapping the ninth feature map and the seventh feature map to obtain the fifth feature map.
In some possible embodiments, in determining the authenticity of the first face image according to the second feature map and the third feature map, the program is specifically configured to execute the following steps:
and splicing the second feature map and the third feature map, and determining the authenticity of the first face image according to the spliced feature maps.
In some possible embodiments, the program is specifically configured to execute the following steps in the process of executing the filtering processing a plurality of times:
performing multiple filtering processing on the first spectrogram through multiple groups of filters, wherein each group of filters corresponds to one filtering processing;
the filtering parameters of each group of filters comprise preset parameters and reference parameters, each group of filters is used for separating frequency band information corresponding to the preset parameters from a first spectrogram, the reference parameters are used for compensating the frequency band information, the frequency band information separated by the filters is different in any two groups, and the frequency band information separated by the filters comprises all the frequency band information in the first spectrogram.
In some possible embodiments, in the process of performing multiple filtering processing on the first spectrogram obtained by the global frequency domain transformation and the local frequency domain transformation through multiple sets of filters, the filtering parameters of each set of filters are different.
Referring to fig. 9, fig. 9 is a diagram illustrating an apparatus for identifying authenticity of a face image according to an embodiment of the present application. The apparatus 900 comprises: an obtaining unit 910, a transforming unit 920, a filtering unit 930, and a processing unit 940, wherein:
in some possible embodiments, the frequency domain transform comprises at least one of: global frequency domain transforms and local frequency domain transforms.
In some possible embodiments, in a case that the frequency-domain transform includes the global frequency-domain transform, in terms of obtaining input data according to the plurality of second spectrograms, the processing unit 940 is specifically configured to:
performing inverse frequency domain transformation on each second spectrogram to obtain a plurality of second images, wherein the inverse frequency domain transformation is an inverse process of the global frequency domain transformation;
and splicing the plurality of second images to obtain the input data.
In some possible embodiments, in a case where the frequency domain transform includes the local frequency domain transform, the number of first spectrograms comprises one or more;
in terms of performing multiple filtering processes on the first spectrogram to obtain multiple second spectrograms, the filtering unit 930 is specifically configured to:
and respectively carrying out multiple times of filtering processing on the first spectrogram to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, in a case where the number of first spectrograms is plural, in terms of obtaining the input data according to the plurality of second spectrograms, the processing unit 940 is specifically configured to:
determining the energy of each second spectrogram;
taking each first spectrogram as a first target spectrogram, and obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the input data.
In some possible embodiments, in determining the authenticity of the first face image according to the input data, the processing unit 940 is specifically configured to:
performing feature extraction on the input data to obtain a first feature map;
and determining the authenticity of the first face image according to the first feature map.
In some possible embodiments, in a case where the frequency domain transform includes the global frequency domain transform and the local frequency domain transform, the global frequency domain transform obtains one first spectrogram and the local frequency domain transform obtains one or more first spectrograms; in terms of performing multiple filtering processes on the first spectrograms to obtain multiple second spectrograms, the filtering unit 930 is specifically configured to:
performing multiple filtering processing on a first spectrogram obtained by global frequency domain transformation to obtain a plurality of second spectrograms corresponding to the first spectrogram;
and performing multiple filtering processing on one or more first spectrograms obtained by local frequency domain transformation to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, the input data includes first input data and second input data, and in terms of obtaining the input data according to the plurality of second spectrograms, the processing unit 940 is specifically configured to:
performing inverse frequency domain transformation on each of a plurality of second spectrograms corresponding to the global frequency domain transformation to obtain a plurality of second images, wherein the inverse frequency domain transformation is an inverse process of the global frequency domain transformation;
splicing the plurality of second images to obtain first input data;
under the condition that the number of the first spectrograms obtained by the frequency domain transformation is multiple, taking each first spectrogram obtained by the frequency domain transformation as a first target spectrogram, and determining the energy of each second spectrogram in a plurality of second spectrograms corresponding to the first target spectrogram;
obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the second input data.
In some possible embodiments, in determining the authenticity of the first face image according to the input data, the processing unit 940 is specifically configured to:
performing cross fusion processing on the first input data and the second input data to obtain a second characteristic diagram and a third characteristic diagram;
and determining the authenticity of the first face image according to the second feature map and the third feature map.
In some possible embodiments, in a case that the number of times of the cross fusion processing is multiple, in terms of performing the cross fusion processing on the first input data and the second input data to obtain a second feature map and a third feature map, the processing unit 940 is specifically configured to:
performing first cross fusion processing on the first input data and the second input data to obtain a fourth feature map and a fifth feature map;
and taking the fourth feature map and the fifth feature map as input data of the next cross fusion processing, and obtaining the second feature map and the third feature map after carrying out the cross fusion processing for a plurality of times.
In some possible embodiments, in terms of performing the first cross fusion processing on the first input data and the second input data to obtain a fourth feature map and a fifth feature map, the processing unit 940 is specifically configured to:
performing feature extraction on the first input data to obtain a sixth feature map;
performing feature extraction on the second input data to obtain a seventh feature map;
obtaining a first matrix according to the sixth feature map and the seventh feature map, wherein the first matrix is used for representing the correlation between the sixth feature map and the seventh feature map;
obtaining an eighth feature map according to the first matrix and the seventh feature map, and overlapping the eighth feature map and the sixth feature map to obtain a fourth feature map;
and obtaining a ninth feature map according to the first matrix and the sixth feature map, and overlapping the ninth feature map and the seventh feature map to obtain the fifth feature map.
In some possible embodiments, in determining the authenticity of the first face image according to the second feature map and the third feature map, the processing unit 940 is specifically configured to:
and splicing the second feature map and the third feature map, and determining the authenticity of the first face image according to the spliced feature maps.
In some possible embodiments, in terms of performing multiple filtering processes, the filtering unit 930 is specifically configured to:
performing multiple filtering processing on the first spectrogram through multiple groups of filters, wherein each group of filters corresponds to one filtering processing;
the filtering parameters of each group of filters comprise preset parameters and reference parameters, each group of filters is used for separating frequency band information corresponding to the preset parameters from a first spectrogram, the reference parameters are used for compensating the frequency band information, the frequency band information separated by the filters is different in any two groups, and the frequency band information separated by the filters comprises all the frequency band information in the first spectrogram.
In some possible embodiments, in the process of performing multiple filtering processing on the first spectrogram obtained by the global frequency domain transformation and the local frequency domain transformation through multiple sets of filters, the filtering parameters of each set of filters are different.
The present application further provides a computer storage medium, which stores a computer program, where the computer program is executed by a processor to implement part or all of the steps of any one of the methods for identifying authenticity of a face image as described in the above method embodiments.
Embodiments of the present application also provide a computer program product, which includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to make a computer execute part or all of the steps of any method for identifying authenticity of a face image as described in the above method embodiments.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of the units is only a division by logical function, and there may be other divisions in actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software program module.
The integrated unit, if implemented in the form of a software program module and sold or used as a stand-alone product, may be stored in a computer-readable memory. Based on such understanding, the technical solution of the present application, or the part thereof that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory, which includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable memory, which may include: flash Memory disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
The foregoing detailed description of the embodiments of the present application has been presented to illustrate the principles and implementations of the present application, and the above description of the embodiments is only provided to help understand the method and the core concept of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (17)

1. A method for identifying the authenticity of a face image is characterized by comprising the following steps:
acquiring a first face image;
performing frequency domain transformation on the first face image to obtain a first spectrogram;
performing a plurality of filtering processes on the first spectrogram respectively to obtain a plurality of second spectrograms;
obtaining input data according to the plurality of second spectrograms;
and determining the authenticity of the first face image according to the input data.
2. The method of claim 1, wherein the frequency domain transform comprises at least one of: global frequency domain transforms and local frequency domain transforms.
3. The method of claim 2, wherein, in the case where the frequency domain transform comprises the global frequency domain transform, the obtaining the input data according to the plurality of second spectrograms comprises:
performing inverse frequency domain transformation on each second spectrogram to obtain a plurality of second images, wherein the inverse frequency domain transformation is an inverse process of the global frequency domain transformation;
and splicing the plurality of second images to obtain the input data.
4. The method according to claim 2, wherein, in the case where the frequency domain transform comprises the local frequency domain transform, the number of first spectrograms is one or more;
the performing the plurality of filtering processes on the first spectrogram to obtain the plurality of second spectrograms comprises:
performing a plurality of filtering processes on each first spectrogram respectively to obtain a plurality of second spectrograms corresponding to each first spectrogram.
5. The method of claim 4, wherein, in the case where the number of first spectrograms is plural, the obtaining the input data according to the plurality of second spectrograms comprises:
determining the energy of each second spectrogram;
taking each first spectrogram as a first target spectrogram, and obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the input data.
6. The method according to any one of claims 1-5, wherein said determining the authenticity of the first face image based on the input data comprises:
performing feature extraction on the input data to obtain a first feature map;
and determining the authenticity of the first face image according to the first feature map.
7. The method of claim 2, wherein, in the case where the frequency domain transform comprises both the global frequency domain transform and the local frequency domain transform, the global frequency domain transform yields one first spectrogram and the local frequency domain transform yields one or more first spectrograms, and the performing the plurality of filtering processes on the first spectrograms to obtain the plurality of second spectrograms comprises:
performing multiple filtering processing on a first spectrogram obtained by global frequency domain transformation to obtain a plurality of second spectrograms corresponding to the first spectrogram;
and performing multiple filtering processing on one or more first spectrograms obtained by local frequency domain transformation to obtain a plurality of second spectrograms corresponding to each first spectrogram.
8. The method of claim 7, wherein the input data comprises first input data and second input data, and wherein obtaining the input data from the plurality of second spectrograms comprises:
performing inverse frequency domain transformation on each of a plurality of second spectrograms corresponding to the global frequency domain transformation to obtain a plurality of second images, wherein the inverse frequency domain transformation is an inverse process of the global frequency domain transformation;
splicing the plurality of second images to obtain first input data;
in the case where the number of first spectrograms obtained by the local frequency domain transformation is plural, taking each first spectrogram obtained by the local frequency domain transformation as a first target spectrogram, and determining the energy of each second spectrogram in the plurality of second spectrograms corresponding to the first target spectrogram;
obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the second input data.
9. The method of claim 8, wherein determining the authenticity of the first face image based on the input data comprises:
performing cross fusion processing on the first input data and the second input data to obtain a second feature map and a third feature map;
and determining the authenticity of the first face image according to the second feature map and the third feature map.
10. The method according to claim 9, wherein, in the case where the cross fusion processing is performed a plurality of times, the performing the cross fusion processing on the first input data and the second input data to obtain the second feature map and the third feature map comprises:
performing first cross fusion processing on the first input data and the second input data to obtain a fourth feature map and a fifth feature map;
and taking the fourth feature map and the fifth feature map as input data of the next cross fusion processing, and obtaining the second feature map and the third feature map after carrying out the cross fusion processing for a plurality of times.
11. The method according to claim 10, wherein the performing a first cross-fusion process on the first input data and the second input data to obtain a fourth feature map and a fifth feature map comprises:
performing feature extraction on the first input data to obtain a sixth feature map;
performing feature extraction on the second input data to obtain a seventh feature map;
obtaining a first matrix according to the sixth feature map and the seventh feature map, wherein the first matrix is used for representing the correlation between the sixth feature map and the seventh feature map;
obtaining an eighth feature map according to the first matrix and the seventh feature map, and superimposing the eighth feature map on the sixth feature map to obtain the fourth feature map;
and obtaining a ninth feature map according to the first matrix and the sixth feature map, and superimposing the ninth feature map on the seventh feature map to obtain the fifth feature map.
12. The method according to any one of claims 9-11, wherein the determining the authenticity of the first face image based on the second feature map and the third feature map comprises:
and splicing the second feature map and the third feature map, and determining the authenticity of the first face image according to the spliced feature maps.
13. The method according to any one of claims 2-12, wherein the plurality of filtering processes comprise:
performing a plurality of filtering processes on the first spectrogram through a plurality of groups of filters, wherein each group of filters corresponds to one filtering process;
the filtering parameters of each group of filters comprise a preset parameter and a reference parameter, each group of filters is used for separating, from the first spectrogram, frequency band information corresponding to its preset parameter, the reference parameter is used for compensating the frequency band information, the frequency band information separated by any two groups of filters is different, and the frequency band information separated by all groups of filters together comprises all of the frequency band information in the first spectrogram.
14. The method of claim 13, wherein, in the process of performing the plurality of filtering processes through the plurality of groups of filters on the first spectrograms obtained by the global frequency domain transformation and the local frequency domain transformation, the filtering parameters of each group of filters are different.
15. An apparatus for identifying authenticity of a face image, comprising:
an acquisition unit configured to acquire a first face image;
a transformation unit, configured to perform frequency domain transformation on the first face image to obtain a first spectrogram;
a filtering unit, configured to perform a plurality of filtering processes on the first spectrogram respectively to obtain a plurality of second spectrograms;
a processing unit, configured to obtain input data according to the plurality of second spectrograms;
the processing unit is further configured to determine authenticity of the first face image according to the input data.
16. An apparatus for identifying the authenticity of a face image, comprising a processor, a memory, a communication interface and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the one or more programs comprising instructions for carrying out the steps of the method of any of claims 1-14.
17. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which is executed by a processor to implement the method according to any one of claims 1-14.
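The overall pipeline of claim 1 (frequency domain transform, repeated filtering into second spectrograms) can be sketched with a 2-D DFT and a bank of band masks. This is a minimal illustration only, not the patented implementation: the concentric-ring band layout, the number of bands, and the use of `numpy.fft` are all assumptions.

```python
import numpy as np

def frequency_domain_transform(image):
    """Global frequency domain transform: 2-D DFT with the
    zero-frequency component shifted to the centre."""
    return np.fft.fftshift(np.fft.fft2(image))

def band_pass_masks(shape, n_bands):
    """Split the spectrum into n_bands concentric rings, one
    boolean mask per band (hypothetical band layout)."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.hypot(yy - h / 2.0, xx - w / 2.0)
    edges = np.linspace(0.0, radius.max() + 1e-9, n_bands + 1)
    return [(radius >= lo) & (radius < hi)
            for lo, hi in zip(edges[:-1], edges[1:])]

def filter_spectrogram(first_spectrogram, n_bands=3):
    """Apply each band mask to the first spectrogram, yielding
    one second spectrogram per filtering process."""
    masks = band_pass_masks(first_spectrogram.shape, n_bands)
    return [first_spectrogram * m for m in masks]

# Toy 8x8 "face image"; the masks partition the spectrum, so the
# second spectrograms sum back to the first spectrogram.
image = np.arange(64, dtype=float).reshape(8, 8)
first = frequency_domain_transform(image)
seconds = filter_spectrogram(first, n_bands=3)
assert len(seconds) == 3
assert np.allclose(sum(seconds), first)
```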
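Claim 3's route from second spectrograms to input data (inverse transform, then splicing) can be sketched as below. Stacking the second images along a new channel axis is one plausible reading of "splicing"; the claim does not fix the exact layout.

```python
import numpy as np

def frequency_domain_transform(image):
    """Global frequency domain transform, as in claim 1."""
    return np.fft.fftshift(np.fft.fft2(image))

def inverse_frequency_domain_transform(second_spectrogram):
    """Inverse process of the global transform: undo the shift,
    apply the inverse 2-D DFT, keep the real part."""
    return np.real(np.fft.ifft2(np.fft.ifftshift(second_spectrogram)))

def splice_second_images(second_spectrograms):
    """Splice the second images into one input array by stacking
    them along a channel axis: shape (n_bands, H, W)."""
    images = [inverse_frequency_domain_transform(s)
              for s in second_spectrograms]
    return np.stack(images, axis=0)

# Toy usage: a low-pass mask and its complement partition the
# spectrum, so the channel-wise sum of the second images
# reconstructs the original image.
image = np.random.default_rng(0).random((8, 8))
spec = frequency_domain_transform(image)
h, w = spec.shape
yy, xx = np.mgrid[0:h, 0:w]
low = (np.hypot(yy - h / 2.0, xx - w / 2.0) < 3).astype(float)
input_data = splice_second_images([spec * low, spec * (1.0 - low)])
assert input_data.shape == (2, 8, 8)
assert np.allclose(input_data.sum(axis=0), image)
```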
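The energy-based feature vectors of claim 5 can be sketched as follows. Taking "energy" to mean the sum of squared magnitudes of the frequency components is an assumed definition; the claim itself does not specify one.

```python
import numpy as np

def spectrogram_energy(spectrogram):
    """Energy of a spectrogram: sum of squared magnitudes
    (an assumed reading of 'energy' in claim 5)."""
    return float(np.sum(np.abs(spectrogram) ** 2))

def energy_feature_vector(second_spectrograms):
    """Feature vector for one first target spectrogram: the
    energies of its corresponding second spectrograms."""
    return np.array([spectrogram_energy(s) for s in second_spectrograms])

def splice_feature_vectors(seconds_per_target):
    """Concatenate the per-target feature vectors into the
    input data."""
    return np.concatenate([energy_feature_vector(s)
                           for s in seconds_per_target])

# Two first target spectrograms, each filtered into two bands.
target_a = [np.ones((4, 4)), 2.0 * np.ones((4, 4))]
target_b = [np.zeros((4, 4)), 3.0 * np.ones((4, 4))]
input_data = splice_feature_vectors([target_a, target_b])
assert input_data.shape == (4,)
assert np.allclose(input_data, [16.0, 64.0, 0.0, 144.0])
```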
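One cross fusion pass as structured by claim 11 (a correlation matrix between two feature maps, each map re-weighted by the other, then superimposed residually) can be sketched as below. The row-wise softmax normalisation of the first matrix is an assumption; the claim only requires a matrix representing the correlation.

```python
import numpy as np

def softmax_rows(m):
    """Row-wise softmax used here to normalise the correlation
    matrix (an assumption, not required by the claim)."""
    e = np.exp(m - m.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_fusion_step(sixth, seventh):
    """One cross fusion pass over two feature maps of shape
    (channels, positions), modelled on claim 11:
    - first matrix: channel-to-channel correlation,
    - eighth/ninth maps: each map re-weighted by the other,
    - fourth/fifth maps: residual superposition."""
    first_matrix = softmax_rows(sixth @ seventh.T)   # (c, c)
    eighth = first_matrix @ seventh                  # from matrix + 7th map
    ninth = first_matrix.T @ sixth                   # from matrix + 6th map
    fourth = eighth + sixth                          # superimpose onto 6th
    fifth = ninth + seventh                          # superimpose onto 7th
    return fourth, fifth

rng = np.random.default_rng(1)
f6 = rng.random((8, 16))   # sixth feature map
f7 = rng.random((8, 16))   # seventh feature map
f4, f5 = cross_fusion_step(f6, f7)
assert f4.shape == f6.shape and f5.shape == f7.shape
```

Iterating this step, feeding the fourth and fifth maps back in as inputs, matches the repeated cross fusion of claim 10.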
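The filter bank of claims 13-14 can be sketched as follows: each group's filter combines a fixed band-pass mask (the preset parameter) with an additive map (the reference parameter) that can compensate the band. The concentric-ring band layout, the zero initialisation of the reference parameters, and the additive combination are all assumptions made for illustration.

```python
import numpy as np

def make_filter_bank(shape, n_groups):
    """Filter bank sketch for claims 13-14: effective filter =
    preset band mask + reference compensation map, clipped so
    it remains a non-negative gain over the spectrum."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.hypot(yy - h / 2.0, xx - w / 2.0)
    edges = np.linspace(0.0, radius.max() + 1e-9, n_groups + 1)
    presets = [((radius >= lo) & (radius < hi)).astype(float)
               for lo, hi in zip(edges[:-1], edges[1:])]
    references = [np.zeros(shape) for _ in presets]  # assumed init
    return [np.clip(p + r, 0.0, None)
            for p, r in zip(presets, references)]

filters = make_filter_bank((8, 8), n_groups=3)
# Any two groups separate different bands (disjoint supports), and
# together the groups cover every frequency in the first spectrogram,
# matching the coverage condition of claim 13.
assert np.allclose(sum(filters), 1.0)
assert all(np.count_nonzero(filters[i] * filters[j]) == 0
           for i in range(3) for j in range(3) if i != j)
```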
CN202010527530.7A 2020-06-10 2020-06-10 Method, device and medium for identifying authenticity of face image Active CN111723714B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202010527530.7A CN111723714B (en) 2020-06-10 2020-06-10 Method, device and medium for identifying authenticity of face image
PCT/CN2021/086893 WO2021249006A1 (en) 2020-06-10 2021-04-13 Method and apparatus for identifying authenticity of facial image, and medium and program product
JP2022524624A JP7251000B2 (en) 2020-06-10 2021-04-13 Method, apparatus, device, medium, and computer program for identifying authenticity of face image


Publications (2)

Publication Number Publication Date
CN111723714A true CN111723714A (en) 2020-09-29
CN111723714B CN111723714B (en) 2023-11-03

Family

ID=72567953

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010527530.7A Active CN111723714B (en) 2020-06-10 2020-06-10 Method, device and medium for identifying authenticity of face image

Country Status (3)

Country Link
JP (1) JP7251000B2 (en)
CN (1) CN111723714B (en)
WO (1) WO2021249006A1 (en)



Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5294300B2 (en) 2008-03-05 2013-09-18 国立大学法人 東京大学 Sound signal separation method
JP5672144B2 (en) 2011-05-20 2015-02-18 富士通株式会社 Heart rate / respiration rate detection apparatus, method and program
JP6048025B2 (en) 2012-09-18 2016-12-21 富士ゼロックス株式会社 Classification device and program
US9875393B2 (en) 2014-02-12 2018-01-23 Nec Corporation Information processing apparatus, information processing method, and program
CN106485192B (en) 2015-09-02 2019-12-06 富士通株式会社 Training method and device of neural network for image recognition
WO2019083130A1 (en) 2017-10-25 2019-05-02 삼성전자주식회사 Electronic device and control method therefor
CN107911576A (en) * 2017-11-01 2018-04-13 北京小米移动软件有限公司 Image processing method, device and storage medium
JP7269705B2 (en) 2018-07-12 2023-05-09 日産自動車株式会社 Personal verification method and personal verification device
WO2020258121A1 (en) * 2019-06-27 2020-12-30 深圳市汇顶科技股份有限公司 Face recognition method and apparatus, and electronic device
CN111723714B (en) * 2020-06-10 2023-11-03 上海商汤智能科技有限公司 Method, device and medium for identifying authenticity of face image

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09147115A (en) * 1995-11-20 1997-06-06 Hamamatsu Photonics Kk Personal collating device
KR20070105528A (en) * 2006-04-26 2007-10-31 한국전자통신연구원 Method and apparatus for user authentication using face image
US20080107311A1 (en) * 2006-11-08 2008-05-08 Samsung Electronics Co., Ltd. Method and apparatus for face recognition using extended gabor wavelet features
KR20130035849A (en) * 2011-09-30 2013-04-09 아이포콤 주식회사 Single image-based fake face detection
WO2014180095A1 (en) * 2013-05-09 2014-11-13 Tencent Technology (Shenzhen) Company Limited Systems and methods for real human face recognition
US20160224853A1 (en) * 2013-05-09 2016-08-04 Tencent Technology (Shenzhen) Company Limited Systems and methods for real human face recognition
CN106372648A (en) * 2016-10-20 2017-02-01 中国海洋大学 Multi-feature-fusion-convolutional-neural-network-based plankton image classification method
CN107292275A (en) * 2017-06-28 2017-10-24 北京飞搜科技有限公司 Face characteristic recognition methods and system that a kind of frequency domain is divided
CN110428402A (en) * 2019-07-18 2019-11-08 数字广东网络建设有限公司 Distorted image recognition methods, device, computer equipment and storage medium
CN110826444A (en) * 2019-10-28 2020-02-21 北京影谱科技股份有限公司 Facial expression recognition method and system based on Gabor filter
CN111178137A (en) * 2019-12-04 2020-05-19 百度在线网络技术(北京)有限公司 Method, device, electronic equipment and computer readable storage medium for detecting real human face

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ROHIT PATNAIK, DAVID CASASENT: "Illumination invariant face recognition and impostor rejection using different MINACE filter algorithms", PROCEEDINGS OF SPIE, vol. 5816 *
LI, Li: "Face liveness detection based on Gabor wavelets and dynamic LBP", Electronics World (电子世界), no. 1, pages 105 - 107 *
DUAN, Caiyan; LI, Yimin; PAN, Xiaolu; WEI, Zhiqiang: "Blur type identification of degraded images based on spectrum analysis", Journal of Yunnan University (Natural Sciences Edition) (云南大学学报(自然科学版)), no. 2 *
CHEN, Peng; LIANG, Tao; LIU, Jin; DAI, Jiao; HAN, Jizhong: "Forged face video detection method fusing global temporal and local spatial features", Journal of Cyber Security (信息安全学报), no. 02 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021249006A1 (en) * 2020-06-10 2021-12-16 上海商汤智能科技有限公司 Method and apparatus for identifying authenticity of facial image, and medium and program product
CN113537173A (en) * 2021-09-16 2021-10-22 中国人民解放军国防科技大学 Face image authenticity identification method based on face patch mapping
CN113537173B (en) * 2021-09-16 2022-03-18 中国人民解放军国防科技大学 Face image authenticity identification method based on face patch mapping
CN113935365A (en) * 2021-09-27 2022-01-14 华南农业大学 Depth counterfeit video identification method and system based on spatial domain and frequency domain dual characteristics
CN113935365B (en) * 2021-09-27 2024-05-14 华南农业大学 Depth fake video identification method and system based on spatial domain and frequency domain dual characteristics
CN115005782A (en) * 2022-06-06 2022-09-06 杭州新瀚光电科技有限公司 Human health assessment method, system, terminal device and storage medium

Also Published As

Publication number Publication date
JP2022553768A (en) 2022-12-26
JP7251000B2 (en) 2023-04-03
WO2021249006A1 (en) 2021-12-16
CN111723714B (en) 2023-11-03

Similar Documents

Publication Publication Date Title
CN111723714B (en) Method, device and medium for identifying authenticity of face image
JP7490141B2 (en) IMAGE DETECTION METHOD, MODEL TRAINING METHOD, IMAGE DETECTION APPARATUS, TRAINING APPARATUS, DEVICE, AND PROGRAM
CN106803067B (en) Method and device for evaluating quality of face image
CN109416727B (en) Method and device for removing glasses in face image
CN104517265B (en) Intelligent grinding skin method and apparatus
CN111192201B (en) Method and device for generating face image and training model thereof, and electronic equipment
CN109816612A (en) Image enchancing method and device, computer readable storage medium
CN109670491A (en) Identify method, apparatus, equipment and the storage medium of facial image
JP2021531571A (en) Certificate image extraction method and terminal equipment
CN111667556A (en) Form correction method and device
CN105528616A (en) Face recognition method and device
Qiao et al. Csc-net: Cross-color spatial co-occurrence matrix network for detecting synthesized fake images
CN107665488B (en) Stereo image visual saliency extraction method
CN110264544B (en) Picture processing method and device, storage medium and electronic device
CN111814738A (en) Human face recognition method, human face recognition device, computer equipment and medium based on artificial intelligence
CN110428402A (en) Distorted image recognition methods, device, computer equipment and storage medium
CN110570376A (en) image rain removing method, device, equipment and computer readable storage medium
CN114449362A (en) Video cover selecting method, device, equipment and storage medium
CN112907488A (en) Image restoration method, device, equipment and storage medium
JP7082129B2 (en) Methods and equipment for color similarity evaluation
CN110738112A (en) Face image simulation method and device, computer equipment and storage medium
CN106934335B (en) Image recognition method and device
CN110866431B (en) Training method of face recognition model, and face recognition method and device
CN106778772A (en) A kind of notable extracting method of stereo-picture vision
CN116579917B (en) Face normalization method and system based on cascade dual generators

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant