CN111723714B - Method, device and medium for identifying authenticity of face image - Google Patents

Method, device and medium for identifying authenticity of face image

Info

Publication number
CN111723714B
CN111723714B CN202010527530.7A
Authority
CN
China
Prior art keywords
frequency domain
spectrogram
input data
spectrograms
feature map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010527530.7A
Other languages
Chinese (zh)
Other versions
CN111723714A (en)
Inventor
殷国君
邵婧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sensetime Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sensetime Intelligent Technology Co Ltd filed Critical Shanghai Sensetime Intelligent Technology Co Ltd
Priority to CN202010527530.7A priority Critical patent/CN111723714B/en
Publication of CN111723714A publication Critical patent/CN111723714A/en
Priority to PCT/CN2021/086893 priority patent/WO2021249006A1/en
Priority to JP2022524624A priority patent/JP7251000B2/en
Application granted granted Critical
Publication of CN111723714B publication Critical patent/CN111723714B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/14Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/14Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141Discrete Fourier transforms
    • G06F17/142Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering

Abstract

The application discloses a method, a device and a medium for identifying authenticity of a face image. The method comprises the following steps: acquiring a first face image; performing frequency domain transformation on the first face image to obtain a first spectrogram; respectively carrying out multiple times of filtering processing on the first spectrogram to obtain a plurality of second spectrograms; obtaining input data according to the plurality of second spectrograms; and determining the authenticity of the first face image according to the input data.

Description

Method, device and medium for identifying authenticity of face image
Technical Field
The application relates to the technical field of image recognition, in particular to a method, a device and a medium for recognizing authenticity of a face image.
Background
With advances in machine learning and computer vision technologies, more and more face forgery techniques are emerging. Face forgery techniques can realistically swap a face, or modify facial expressions, mouth shapes and the like. For example, the face of person A in a video may be replaced with the face of person B by a face forgery technique.
However, such face forgery techniques can seriously infringe the portrait rights and reputation rights of others. To detect face image forgery, frequency domain information of an image is currently widely used to identify whether a face image has been forged. For example, the image is subjected to a discrete cosine transform (Discrete Cosine Transform, DCT) to extract its frequency domain information, the edges and textures of the image are analyzed using the frequency domain information, and the image is determined to be forged when an edge or texture anomaly is found. However, for some low-quality images, for example compressed images, an edge or texture anomaly does not necessarily mean that the image is forged. Therefore, the accuracy of existing methods for recognizing whether a face image is forged is low.
Disclosure of Invention
The embodiment of the application provides a method, a device and a medium for identifying the authenticity of a face image. A spectrogram is filtered through a plurality of sets of filters to obtain a plurality of pieces of frequency band information, thereby improving the accuracy of identifying the authenticity of the face image.
In a first aspect, an embodiment of the present application provides a method for identifying authenticity of a face image, including:
acquiring a first face image;
performing frequency domain transformation on the first face image to obtain a first spectrogram;
respectively carrying out multiple times of filtering processing on the first spectrogram to obtain a plurality of second spectrograms;
obtaining input data according to the plurality of second spectrograms;
and determining the authenticity of the first face image according to the input data.
In some possible implementations, the frequency domain transform includes at least one of: global frequency domain transforms and local frequency domain transforms.
In some possible embodiments, where the frequency domain transform includes the global frequency domain transform, the obtaining input data according to the plurality of second spectrograms includes:
performing frequency domain inverse transformation on each second spectrogram to obtain a plurality of second images, wherein the frequency domain inverse transformation is an inverse process of the global frequency domain transformation;
And splicing the plurality of second images to obtain the input data.
In some possible implementations, where the frequency domain transform comprises the local frequency domain transform, the number of first spectrograms comprises one or more;
the filtering processing is performed on the first spectrogram for multiple times to obtain multiple second spectrograms, including:
and respectively carrying out multiple times of filtering processing on the first spectrograms to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, in a case that the number of the first spectrograms includes a plurality, the obtaining input data according to the plurality of the second spectrograms includes:
determining the energy of each second spectrogram;
taking each first spectrogram as a first target spectrogram, and obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the input data.
In some possible embodiments, the determining, according to the input data, authenticity of the first face image includes:
Extracting features of the input data to obtain a first feature map;
and determining the authenticity of the first face image according to the first feature map.
In some possible embodiments, in a case that the frequency domain transform includes the global frequency domain transform and the local frequency domain transform, the global frequency domain transform obtains one first spectrogram, the local frequency domain transform obtains one or more first spectrograms, and the filtering processing is performed on the first spectrogram for multiple times to obtain multiple second spectrograms, including:
performing multiple filtering processing on the first spectrogram obtained by the global frequency domain transformation to obtain a plurality of second spectrograms corresponding to the first spectrogram;
and performing multiple filtering processing on one or more first spectrograms obtained by the local frequency domain transformation to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, the input data includes first input data and second input data, and the obtaining input data according to the plurality of second spectrograms includes:
performing frequency domain inverse transformation on each of a plurality of second spectrograms corresponding to the global frequency domain transformation to obtain a plurality of second images, wherein the frequency domain inverse transformation is an inverse process of the global frequency domain transformation;
Splicing the plurality of second images to obtain the first input data;
under the condition that the number of the first spectrograms obtained by the local frequency domain transformation is a plurality of, taking each first spectrogram obtained by the local frequency domain transformation as a first target spectrogram, and determining the energy of each second spectrogram in a plurality of second spectrograms corresponding to the first target spectrogram;
obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the second input data.
In some possible embodiments, the determining, according to the input data, authenticity of the first face image includes:
performing cross fusion processing on the first input data and the second input data to obtain a second characteristic diagram and a third characteristic diagram;
and determining the authenticity of the first face image according to the second characteristic diagram and the third characteristic diagram.
In some possible implementations, in a case that the number of times of the cross-fusion processing is multiple, the cross-fusion processing is performed on the first input data and the second input data to obtain a second feature map and a third feature map, including:
Performing first cross fusion processing on the first input data and the second input data to obtain a fourth characteristic diagram and a fifth characteristic diagram;
and taking the fourth characteristic diagram and the fifth characteristic diagram as input data of the next cross fusion processing, and obtaining the second characteristic diagram and the third characteristic diagram after the cross fusion processing is carried out for a plurality of times.
In some possible implementations, the performing a first cross fusion process on the first input data and the second input data to obtain a fourth feature map and a fifth feature map includes:
extracting features of the first input data to obtain a sixth feature map;
extracting features of the second input data to obtain a seventh feature map;
obtaining a first matrix according to the sixth feature map and the seventh feature map, wherein the first matrix is used for representing the correlation between the sixth feature map and the seventh feature map;
according to the first matrix and the seventh feature map, an eighth feature map is obtained, and the eighth feature map and the sixth feature map are overlapped to obtain the fourth feature map;
and according to the first matrix and the sixth feature map, obtaining a ninth feature map, and superposing the ninth feature map and the seventh feature map to obtain the fifth feature map.
In some possible implementations, the determining the authenticity of the first face image according to the second feature map and the third feature map includes:
and splicing the second characteristic diagram and the third characteristic diagram, and determining the authenticity of the first face image according to the spliced characteristic diagram.
In some possible embodiments, the multiple filtering process includes:
performing multiple times of filtering processing on the first spectrogram through multiple groups of filters, wherein each group of filters corresponds to one time of filtering processing;
the filtering parameters of each group of filters comprise preset parameters and reference parameters, each group of filters is used for separating frequency band information corresponding to the preset parameters from a first spectrogram, the reference parameters are used for compensating the frequency band information, the frequency band information separated by any two groups of filters is different, and the plurality of frequency band information separated by the plurality of groups of filters comprises all the frequency band information in the first spectrogram.
In some possible embodiments, in the process of performing multiple filtering processing on the first spectrogram obtained by the global frequency domain transform and the local frequency domain transform through multiple sets of filters, filtering parameters of each set of filters are different.
In a second aspect, an embodiment of the present application provides an apparatus for identifying authenticity of a face image, including:
the acquisition unit is used for acquiring a first face image;
the transformation unit is used for carrying out frequency domain transformation on the first face image to obtain a first spectrogram;
the filtering unit is used for respectively carrying out multiple times of filtering processing on the first spectrograms to obtain a plurality of second spectrograms;
the processing unit is used for obtaining input data according to the plurality of second spectrograms;
the processing unit is further configured to determine, according to the input data, authenticity of the first face image.
In some possible implementations, the frequency domain transform includes at least one of: global frequency domain transforms and local frequency domain transforms.
In some possible implementations, in the case that the frequency domain transform includes the global frequency domain transform, the processing unit is specifically configured to, in obtaining input data from the plurality of second spectrograms:
performing frequency domain inverse transformation on each second spectrogram to obtain a plurality of second images, wherein the frequency domain inverse transformation is an inverse process of the global frequency domain transformation;
and splicing the plurality of second images to obtain the input data.
In some possible implementations, where the frequency domain transform comprises the local frequency domain transform, the number of first spectrograms comprises one or more;
in the aspect of performing multiple filtering processing on the first spectrogram to obtain multiple second spectrograms, the filtering unit is specifically configured to:
and respectively carrying out multiple times of filtering processing on the first spectrograms to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, in a case that the number of the first spectrograms includes a plurality, the processing unit is specifically configured to, according to the plurality of second spectrograms, obtain input data:
determining the energy of each second spectrogram;
taking each first spectrogram as a first target spectrogram, and obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the input data.
In some possible embodiments, the processing unit is specifically configured to, in determining the authenticity of the first face image according to the input data:
Extracting features of the input data to obtain a first feature map;
and determining the authenticity of the first face image according to the first feature map.
In some possible embodiments, in a case that the frequency domain transform includes the global frequency domain transform and the local frequency domain transform, the global frequency domain transform obtains one first spectrogram, the local frequency domain transform obtains one or more first spectrograms, and the filtering unit is specifically configured to:
performing multiple filtering processing on the first spectrogram obtained by the global frequency domain transformation to obtain a plurality of second spectrograms corresponding to the first spectrogram;
and performing multiple filtering processing on one or more first spectrograms obtained by the local frequency domain transformation to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, the input data includes first input data and second input data, and the processing unit is specifically configured to:
performing frequency domain inverse transformation on each of a plurality of second spectrograms corresponding to the global frequency domain transformation to obtain a plurality of second images, wherein the frequency domain inverse transformation is an inverse process of the global frequency domain transformation;
Splicing the plurality of second images to obtain the first input data;
under the condition that the number of the first spectrograms obtained by the local frequency domain transformation is a plurality of, taking each first spectrogram obtained by the local frequency domain transformation as a first target spectrogram, and determining the energy of each second spectrogram in a plurality of second spectrograms corresponding to the first target spectrogram;
obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the second input data.
In some possible embodiments, in determining the authenticity of the first face image according to the input data, the processing unit is specifically configured to:
performing cross fusion processing on the first input data and the second input data to obtain a second characteristic diagram and a third characteristic diagram;
and determining the authenticity of the first face image according to the second characteristic diagram and the third characteristic diagram.
In some possible implementations, when the number of times of the cross-fusion processing is multiple, in performing the cross-fusion processing on the first input data and the second input data to obtain a second feature map and a third feature map, the processing unit is specifically configured to:
Performing first cross fusion processing on the first input data and the second input data to obtain a fourth characteristic diagram and a fifth characteristic diagram;
and taking the fourth characteristic diagram and the fifth characteristic diagram as input data of the next cross fusion processing, and obtaining the second characteristic diagram and the third characteristic diagram after the cross fusion processing is carried out for a plurality of times.
In some possible implementations, in performing a first cross-fusion process on the first input data and the second input data to obtain a fourth feature map and a fifth feature map, the processing unit is specifically configured to:
extracting features of the first input data to obtain a sixth feature map;
extracting features of the second input data to obtain a seventh feature map;
obtaining a first matrix according to the sixth feature map and the seventh feature map, wherein the first matrix is used for representing the correlation between the sixth feature map and the seventh feature map;
according to the first matrix and the seventh feature map, an eighth feature map is obtained, and the eighth feature map and the sixth feature map are overlapped to obtain the fourth feature map;
And according to the first matrix and the sixth feature map, obtaining a ninth feature map, and superposing the ninth feature map and the seventh feature map to obtain the fifth feature map.
In some possible implementations, in determining authenticity of the first face image according to the second feature map and the third feature map, the processing unit is specifically configured to:
and splicing the second characteristic diagram and the third characteristic diagram, and determining the authenticity of the first face image according to the spliced characteristic diagram.
In some possible embodiments, the multiple filtering process includes:
performing multiple times of filtering processing on the first spectrogram through multiple groups of filters, wherein each group of filters corresponds to one time of filtering processing;
the filtering parameters of each group of filters comprise preset parameters and reference parameters, each group of filters is used for separating frequency band information corresponding to the preset parameters from a first spectrogram, the reference parameters are used for compensating the frequency band information, the frequency band information separated by any two groups of filters is different, and the plurality of frequency band information separated by the plurality of groups of filters comprises all the frequency band information in the first spectrogram.
In some possible embodiments, in the process of performing multiple filtering processing on the first spectrogram obtained by the global frequency domain transform and the local frequency domain transform through multiple sets of filters, filtering parameters of each set of filters are different.
In a third aspect, an embodiment of the present application provides an apparatus for identifying authenticity of a face image, including a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs including instructions for performing the steps in the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program that causes a computer to perform the method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform the method according to the first aspect.
The embodiment of the application has the following beneficial effects:
it can be seen that, in the embodiment of the present application, the first spectrogram is subjected to multiple filtering processing through multiple sets of filters, so as to obtain multiple second spectrograms. Therefore, the frequency band information of the plurality of second spectrograms is different; the input data is obtained according to the plurality of second spectrograms, so that the input data comprises a plurality of frequency band information of the first spectrogram, and the authenticity of the first face image is identified according to the input data, namely the authenticity of the first face image is identified by utilizing the plurality of frequency band information, thereby improving the accuracy of identifying the authenticity of the first face image and reducing the false identification rate.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a method for identifying authenticity of a face image according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a filtering process according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating a second preset parameter setting process according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a cross-fusion process according to an embodiment of the present application;
fig. 5 is a schematic diagram of another method for identifying authenticity of a face image according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a global frequency domain transform branch according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a local frequency domain transform branch according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a device for identifying authenticity of a face image according to an embodiment of the present application;
fig. 9 is a functional unit composition block diagram of a device for identifying authenticity of a face image according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terms "first," "second," "third," and "fourth" and the like in the description and in the claims and drawings are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
Referring to fig. 1, fig. 1 is a flowchart of a method for identifying authenticity of a face image according to an embodiment of the present application. The method is applied to a device for identifying the authenticity of the face image. The method includes, but is not limited to, the steps of:
101: a first face image is acquired.
102: and carrying out frequency domain transformation on the first face image to obtain a first spectrogram.
The frequency domain transform includes, but is not limited to, one of: the DCT, the Fourier transform (Fourier Transformation), and the fast Fourier transform (Fast Fourier Transform, FFT). In the present application, the DCT is taken as an example of the frequency domain transform.
Further, the frequency domain transform includes a global frequency domain transform and a local frequency domain transform. The global frequency domain transformation is to perform frequency domain transformation on the whole first face image to obtain a first spectrogram. The local frequency domain transformation is to perform frequency domain transformation on a partial region in the first face image to obtain one or more first spectrograms. The local frequency domain transformation essentially slides over the first face image using a sliding window, and frequency domain transformation is performed on the partial region framed by each sliding of the sliding window. Thus, the local frequency domain transform may also be referred to as a sliding window discrete cosine transform (Slide Window Discrete Cosine Transform, SWDCT).
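As a concrete illustration (not part of the patent text), the following Python sketch shows how the global frequency domain transformation and the sliding-window local transformation described above could be computed with a 2-D DCT; the scipy routine, the window size and the stride are illustrative assumptions.

```python
import numpy as np
from scipy.fft import dctn  # 2-D type-II DCT; any equivalent DCT routine could be substituted

def global_dct(face_gray: np.ndarray) -> np.ndarray:
    """Global frequency domain transformation: one first spectrogram for the whole image."""
    return dctn(face_gray.astype(np.float64), norm="ortho")

def sliding_window_dct(face_gray: np.ndarray, win: int = 8, stride: int = 8):
    """Local frequency domain transformation (SWDCT): one first target spectrogram per window."""
    h, w = face_gray.shape
    spectrograms, positions = [], []
    for top in range(0, h - win + 1, stride):
        for left in range(0, w - win + 1, stride):
            patch = face_gray[top:top + win, left:left + win].astype(np.float64)
            spectrograms.append(dctn(patch, norm="ortho"))
            positions.append((top, left))  # spatial position of the sliding window
    return spectrograms, positions
```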
In addition, in practical application, only one region of the first face image may be subjected to frequency domain transformation, that is, sliding a window over the first face image is not required. The region may be a preset region, a region containing more detailed information, or a region of particular interest, which is not limited in the present application. Thus, in the case of performing the local frequency domain transformation on the first face image, the number of the obtained first spectrograms may be one or more. In the present application, the case where the local frequency domain transformation of the first face image yields a plurality of first spectrograms is taken as an example.
In order to distinguish the first spectrograms obtained by the global frequency domain transformation and the local frequency domain transformation, in the present application, a first spectrogram obtained by the local frequency domain transformation is referred to as a first target spectrogram. It should be noted that a first target spectrogram is simply a first spectrogram obtained by the local frequency domain transformation under another name; the two are not distinguished further.
In the case of performing global frequency domain transformation and local frequency domain transformation on the first face image, the global frequency domain transformation may be performed first, the local frequency domain transformation may be performed first, or the global frequency domain transformation and the local frequency domain transformation may be performed simultaneously.
103: and respectively carrying out multiple times of filtering processing on the first spectrograms to obtain a plurality of second spectrograms.
For example, in the case where only global frequency domain transformation is performed on the first face image, that is, one first spectrogram is obtained by the global frequency domain transformation, the plurality of second spectrograms may be obtained by performing multiple filtering processes on the first spectrogram through a plurality of sets of filters corresponding to the global frequency domain transformation. In the present application, the plurality of sets of filters corresponding to the global frequency domain transformation are referred to as a plurality of sets of first filters; the process of performing multiple filtering processes on the first spectrogram through the plurality of sets of filters will be described in detail hereinafter and is not expanded here. Filtering the first spectrogram through the plurality of sets of filters yields second spectrograms of different frequency bands, so that the subsequently obtained input data contains information of different frequency bands in the first spectrogram, that is, rich frequency band information, which can improve the accuracy of identifying the authenticity of the first face image.
For example, in the case where only local frequency domain transformation is performed on the first face image, that is, the first spectrograms are obtained by the local frequency domain transformation, multiple filtering processes may be performed on each first target spectrogram through a plurality of sets of filters corresponding to the local frequency domain transformation, so as to obtain a plurality of second spectrograms corresponding to each first target spectrogram. In the present application, the plurality of sets of filters corresponding to the local frequency domain transformation are referred to as a plurality of sets of second filters; the process of performing multiple filtering processes on each first target spectrogram through the plurality of sets of second filters will be described in detail hereinafter and is not expanded here.
For example, in the case of performing global frequency domain transformation and local frequency domain transformation on the first face image, that is, the first spectrogram includes one first spectrogram obtained by global frequency domain transformation and multiple first spectrograms obtained by local frequency domain transformation, multiple filtering processes are required to be performed on the first spectrogram obtained by global frequency domain transformation through the multiple sets of first filters, so as to obtain multiple second spectrograms corresponding to the first spectrogram; and carrying out multiple times of filtering processing on each first target spectrogram through the multiple groups of second filters to obtain multiple second spectrograms corresponding to each first target spectrogram. In this case, therefore, the plurality of second spectrograms includes a plurality of second spectrograms obtained by performing a plurality of filtering processes on the first spectrogram obtained by the global frequency domain transformation, and a plurality of second spectrograms obtained by performing a plurality of filtering processes on each of the first target spectrograms obtained by the local frequency domain transformation.
In the process of performing multiple filtering processing on the first spectrogram, multiple filtering processing may be performed on the first spectrogram obtained by global frequency domain transformation through multiple groups of first filters, or multiple filtering processing may be performed on each first target spectrogram obtained by local frequency domain transformation through multiple groups of second filters; of course, the filtering process may be performed on the first spectrogram obtained by the global frequency domain transform process and the local frequency domain transform process at the same time. The application is not limited to the order of filtering.
In the filtering process, a second spectrogram can be obtained in each filtering process, and each group of filters corresponds to one filtering process.
104: and obtaining input data according to the plurality of second spectrograms.
For example, in the case that the plurality of second spectrograms only includes a plurality of second spectrograms corresponding to the global frequency domain transformation, performing frequency domain inverse transformation on each of the plurality of second spectrograms to obtain a plurality of second images, where the frequency domain inverse transformation is an inverse process of the global frequency domain transformation; then, the plurality of second images are spliced to obtain the input data. In the present application, the input data obtained by the global frequency domain transform is referred to as first input data, and the first input data is substantially identical to the input data obtained by the global frequency domain transform.
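As a hedged illustration of the global branch just described, the sketch below filters the global spectrogram with a list of band masks, applies the frequency domain inverse transformation to each filtered spectrogram, and splices the resulting second images channel-wise; the mask format and the channel-wise stacking are assumptions, and `masks` is a hypothetical input whose construction is sketched later in connection with Fig. 2.

```python
import numpy as np
from scipy.fft import dctn, idctn

def first_input_data(face_gray: np.ndarray, masks: list) -> np.ndarray:
    """Global branch: filter the first spectrogram with each band mask, invert, and stack."""
    spectrum = dctn(face_gray.astype(np.float64), norm="ortho")         # first spectrogram
    second_images = [idctn(spectrum * m, norm="ortho") for m in masks]  # one second image per filter set
    return np.stack(second_images, axis=0)                              # spliced: (num_filters, H, W)
```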
For example, in the case that the plurality of second spectrograms only includes the plurality of second spectrograms corresponding to the local frequency domain transformation, energy of each second spectrogram may be determined, and a feature vector corresponding to each first target spectrogram is obtained according to the energy of the plurality of second spectrograms corresponding to each first target spectrogram; and then, splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the input data. In the present application, the input data obtained by the local frequency domain transformation is referred to as second input data, and the second input data is substantially identical to the input data obtained by the local frequency domain transformation.
When the size of the second input data obtained after the concatenation is not matched with the size specified by the neural network, it is necessary to perform channel conversion on the second input data obtained after the concatenation so that the size of the second input data matches with the size specified by the neural network, and use the data after the channel conversion as the second input data. The second input data mentioned later are all input data matched with the size specified by the neural network through corresponding channel conversion.
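For the local branch, a corresponding sketch is given below: each window's spectrogram is filtered by every band mask, the energy of each second spectrogram becomes one entry of the window's feature vector, and the vectors are spliced on the window grid so that the spatial position of each window is preserved. The log-energy definition and the grid layout are assumptions, and a 1x1 convolution (not shown) could perform the channel conversion mentioned above.

```python
import numpy as np
from scipy.fft import dctn

def second_input_data(face_gray: np.ndarray, band_masks: list, win: int = 8, stride: int = 8) -> np.ndarray:
    """Local branch: per-window band energies arranged as a (num_bands, rows, cols) tensor."""
    h, w = face_gray.shape
    rows = (h - win) // stride + 1
    cols = (w - win) // stride + 1
    feat = np.zeros((len(band_masks), rows, cols), dtype=np.float64)
    for r in range(rows):
        for c in range(cols):
            patch = face_gray[r * stride:r * stride + win, c * stride:c * stride + win]
            spec = dctn(patch.astype(np.float64), norm="ortho")      # first target spectrogram
            for k, mask in enumerate(band_masks):
                band = spec * mask                                   # second spectrogram for band k
                feat[k, r, c] = np.log(np.sum(band ** 2) + 1e-12)    # band energy (log-scaled here)
    return feat
```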
In the case that the plurality of second spectrograms includes a plurality of second spectrograms corresponding to the global frequency domain transform and a plurality of second spectrograms corresponding to the local frequency domain transform, the input data includes the first input data and the second input data, and the manner of obtaining the first input data and the second input data is similar to the above process, which will not be described.
105: and determining the authenticity of the first face image according to the input data.
Determining whether the first face image is authentic is essentially determining whether the first face image is an original face image, i.e., whether the first face image has been replaced, modified, flipped, etc.
It can be seen that, in the embodiment of the present application, the first spectrogram is subjected to multiple filtering processing through multiple sets of filters, so as to obtain multiple second spectrograms. Therefore, the frequency band information of the plurality of second spectrograms is different; the input data is obtained according to the plurality of second spectrograms, so that the input data comprises a plurality of frequency band information of the first spectrogram, and the authenticity of the first face image is identified according to the input data, namely the authenticity of the first face image is identified by utilizing the plurality of frequency band information, thereby improving the accuracy of identifying the authenticity of the first face image and reducing the false identification rate.
In some possible embodiments, in the case that the input data includes only the first input data or the second input data, feature extraction may be performed on the input data to obtain a first feature map; and determining the authenticity of the first face image according to the first feature map, namely classifying according to the first feature map, and determining the authenticity of the first face image.
It can be seen that the authenticity of the first face image is identified through the plurality of frequency band information in the first spectrogram, instead of the single frequency band information, so that the accuracy of identifying the authenticity of the first face image is improved.
In some possible embodiments, in the case that the input data includes first input data and second input data, a cross fusion process is required to be performed on the first input data and the second input data, so as to obtain a second feature map and a third feature map; and determining the authenticity of the first face image according to the second feature map and the third feature map.
For example, the second feature map and the third feature map may be spliced, and the authenticity of the first face image may be determined according to the spliced feature map. That is, features of the spliced feature map are extracted to obtain a target feature map, classification is performed according to the target feature map, and the authenticity of the first face image is determined.
In addition, the second feature map and the third feature map may also not be spliced. For example, the second feature map and the third feature map may be pooled at the same time to obtain a target feature map, which is equivalent to stitching the second feature map and the third feature map during the pooling process; classification is then performed according to the target feature map to determine the authenticity of the first face image.
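A minimal sketch of this classification step is given below, assuming the two fused feature maps share a spatial size and are spliced along the channel dimension before global average pooling; the layer choices are assumptions, not the patent's network.

```python
import torch
import torch.nn as nn

class AuthenticityHead(nn.Module):
    """Splice the second and third feature maps, pool, and output real/forged logits."""
    def __init__(self, channels_a: int, channels_b: int, num_classes: int = 2):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(channels_a + channels_b, num_classes)

    def forward(self, feat_second: torch.Tensor, feat_third: torch.Tensor) -> torch.Tensor:
        x = torch.cat([feat_second, feat_third], dim=1)  # splicing along the channel dimension
        x = self.pool(x).flatten(1)                      # pooling the spliced (target) feature map
        return self.fc(x)                                # classification: authentic vs. forged
```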
It can be seen that the global frequency domain information and the local frequency domain information of the first face image are subjected to cross fusion processing, so that the second feature map and the third feature map after cross fusion contain more frequency band information, and further the accuracy of identifying the authenticity of the first face image can be improved. In addition, the local frequency domain transformation can extract more detailed frequency band information from the first face image, so that the recognition accuracy is further improved. Moreover, the local frequency domain transformation frames regions of the first face image using a sliding window. Therefore, the feature vector of each first target spectrogram also carries spatial position information (the position of the sliding window in the first face image), that is, the second input data includes spatial position information, so that the second input data can be directly input into the neural network for feature extraction.
In some possible embodiments, the cross fusion processing may be performed multiple times, and the implementation process for obtaining the second feature map and the third feature map may be: performing a first cross fusion processing on the first input data and the second input data to obtain a fourth feature map and a fifth feature map; and taking the fourth feature map and the fifth feature map as the input data of the next cross fusion processing, the second feature map and the third feature map being obtained after the cross fusion processing has been performed multiple times.
The specific process of the cross fusion processing is described below by taking the first cross fusion processing of the first input data and the second input data as an example; the implementation of the other cross fusion processings is similar to that of the first one and will not be repeated.
Features of the first input data are extracted to obtain a sixth feature map, and features of the second input data are extracted to obtain a seventh feature map. It should be noted that there is no particular order between the feature extraction of the first input data and that of the second input data. A first matrix is obtained according to the sixth feature map and the seventh feature map, and the first matrix is used for representing the correlation between the sixth feature map and the seventh feature map; the sixth feature map and the seventh feature map are essentially two matrices, and the first matrix is the cross-correlation coefficient between these two matrices. An eighth feature map is obtained according to the first matrix and the seventh feature map, that is, the first matrix and the seventh feature map are dot-multiplied to obtain the eighth feature map, and the eighth feature map and the sixth feature map are superimposed to obtain the fourth feature map. A ninth feature map is obtained according to the first matrix and the sixth feature map, that is, the first matrix and the sixth feature map are dot-multiplied to obtain the ninth feature map; then, the ninth feature map and the seventh feature map are superimposed to obtain the fifth feature map.
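To make the data flow concrete, the following sketch implements one cross fusion round under assumptions stated in the comments: both branches are assumed to have been brought to the same channel count and spatial size, the first matrix is taken as a channel-by-channel correlation, and a softmax normalisation (not mentioned in the patent) is added for numerical stability.

```python
import torch
import torch.nn as nn

class CrossFusionBlock(nn.Module):
    """One cross fusion round between the global branch and the local branch (a sketch)."""
    def __init__(self, c_global: int, c_local: int, c_mid: int = 64):
        super().__init__()
        self.conv_g = nn.Conv2d(c_global, c_mid, 3, padding=1)  # feature extraction -> sixth feature map
        self.conv_l = nn.Conv2d(c_local, c_mid, 3, padding=1)   # feature extraction -> seventh feature map

    def forward(self, x_global: torch.Tensor, x_local: torch.Tensor):
        # assumption: x_global and x_local have been resized to the same spatial resolution
        f6 = self.conv_g(x_global)                    # sixth feature map, (B, C, H, W)
        f7 = self.conv_l(x_local)                     # seventh feature map, (B, C, H, W)
        b, c, h, w = f6.shape
        a = f6.flatten(2)                             # (B, C, H*W)
        p = f7.flatten(2)                             # (B, C, H*W)
        # first matrix: correlation between the two feature maps (softmax scaling is an assumption)
        corr = torch.softmax(torch.bmm(a, p.transpose(1, 2)) / (a.shape[-1] ** 0.5), dim=-1)  # (B, C, C)
        f8 = torch.bmm(corr, p).view(b, c, h, w)                   # eighth feature map, from the seventh
        f9 = torch.bmm(corr.transpose(1, 2), a).view(b, c, h, w)   # ninth feature map, from the sixth
        f4 = f6 + f8                                  # fourth feature map = sixth + eighth
        f5 = f7 + f9                                  # fifth feature map = seventh + ninth
        return f4, f5
```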
The process of performing the multiple filtering processing on the first spectrogram in the present application is described in detail below.
First, in the present application, whether multiple filtering processes are performed on the first spectrogram obtained by the global frequency domain transformation or on a first target spectrogram obtained by the local frequency domain transformation, the filtering is performed through a plurality of sets of filters. Thus, performing multiple filtering processes on the first spectrogram (or the first target spectrogram) includes: performing multiple filtering processes on the first spectrogram (or the first target spectrogram) through a plurality of sets of filters, wherein the filtering parameters of each set of filters include a preset parameter and a reference parameter. The reference parameter is a network parameter obtained by training the neural network in advance, and is not detailed here. In addition, each set of filters is used for separating, from the first spectrogram (or the first target spectrogram), the frequency band information corresponding to the preset parameter, and the reference parameter is used for compensating that frequency band information; the frequency band information separated by any two sets of filters is different, and the plurality of pieces of frequency band information separated by the plurality of sets of filters include all the frequency band information in the first spectrogram (or the first target spectrogram).
Although multiple sets of filters are used both for the multiple filtering processes performed on the first spectrogram obtained by the global frequency domain transformation and for those performed on the first target spectrograms obtained by the local frequency domain transformation, in practical applications the filtering parameters of the filters used, and the number of filters used, differ between the two frequency domain transforms. Different filtering parameters means that the preset parameters of the filters differ, or the reference parameters differ, or both the preset parameters and the reference parameters differ. In practice, when filtering the spectrograms obtained by the global frequency domain transformation and the local frequency domain transformation, both the preset parameters and the reference parameters of the corresponding filters are generally set to be different. That is, the preset parameter and the reference parameter differ between the first filters and the second filters, and the number of first filters differs from the number of second filters. Therefore, for convenience of distinction, the preset parameter and the reference parameter of the first filters may be referred to as the first preset parameter and the first reference parameter, and the preset parameter and the reference parameter of the second filters may be referred to as the second preset parameter and the second reference parameter. The filtering processes performed by the plurality of sets of first filters and the plurality of sets of second filters are described below.
The first frequency band information of the first spectrogram obtained by the global frequency domain transformation is extracted through the first preset parameter of each set of first filters, and the first frequency band information is compensated through the first reference parameter, so as to obtain the second spectrogram corresponding to that set of first filters. That is, third frequency band information in the first spectrogram is extracted through the first reference parameter, and the first frequency band information and the third frequency band information are superimposed to obtain the second spectrogram. In addition, the first preset parameters of any two sets of first filters are different, that is, the first frequency band information extracted by any two sets of first filters is different, and the plurality of pieces of first frequency band information extracted by the plurality of sets of first filters include all the frequency band information in the first spectrogram; in other words, combining the plurality of pieces of first frequency band information yields all the frequency band information in the first spectrogram.
In practical application, the first preset parameter and the first reference parameter of each set of first filters may be superimposed, and the superimposed parameters are used to filter the first spectrogram, so as to directly obtain the second spectrogram corresponding to that set of first filters.
The first preset parameter is multiplied element-wise with the first spectrogram, filtering out part of the frequency band information in the first spectrogram and retaining the rest; the retained frequency band information is the first frequency band information.
The first preset parameter is essentially a matrix of the same size as the first spectrogram. For example, if the first preset parameter is [0,1/16], then [0,1/16] means that the 1/16 portion at the upper left corner of the matrix has a value of 1 and the other portions have a value of 0. As shown in Fig. 2, the black part in the matrix corresponding to the first preset parameter represents a value of 0, and the white part represents a value of 1. In addition, the first spectrogram is obtained by performing DCT on the first face image; when DCT is performed on any image, the upper left corner of the resulting spectrogram contains the low frequency information of the image, the middle part contains the medium frequency information, and the lower right corner contains the high frequency information. Therefore, performing multiple filtering processes on the same first spectrogram through multiple different sets of first filters yields multiple different second spectrograms corresponding to the first spectrogram. As shown in Fig. 2, assuming that the first preset parameter of the first set of first filters is [0,1/16], the first preset parameter of this set is dot-multiplied with the first spectrogram, the frequency band information of the 1/16 region at the upper left corner of the first spectrogram, i.e. the low frequency information, is retained, and the other frequency band information in the first spectrogram is filtered out, so as to obtain the second spectrogram corresponding to the first set of first filters, in which the white part is the retained low frequency information. As also shown in Fig. 2, the first preset parameter of the last set of first filters is [1/8,1]; dot-multiplying the first preset parameter of this set with the first spectrogram retains the frequency band information of the 7/8 region at the lower right corner of the first spectrogram, i.e. the high frequency information, and filters out the other frequency band information, so as to obtain the second spectrogram corresponding to this set of first filters, in which the white part is the retained high frequency information. For the subsequent filtering of spectrograms with filters, reference may be made to the filtering process shown in Fig. 2; it will not be described again.
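A possible way to build such masks in code is sketched below; ordering the DCT coefficients by their distance from the top-left corner (i + j) and keeping a chosen area fraction is an assumption consistent with the Fig. 2 description, not an exact reproduction of the patent's filters.

```python
import numpy as np

def band_mask(size: int, lo: float, hi: float) -> np.ndarray:
    """0/1 mask keeping the [lo, hi] area fraction of a size x size DCT spectrum,
    counted from the top-left (low frequency) toward the bottom-right (high frequency)."""
    i, j = np.meshgrid(np.arange(size), np.arange(size), indexing="ij")
    order = np.argsort((i + j).ravel(), kind="stable")   # low-frequency coefficients first
    n = size * size
    mask = np.zeros(n)
    mask[order[int(round(lo * n)):int(round(hi * n))]] = 1.0
    return mask.reshape(size, size)

low_pass  = band_mask(64, 0.0, 1.0 / 16)   # keeps the upper-left 1/16 of the spectrum, as in Fig. 2
high_pass = band_mask(64, 1.0 / 8, 1.0)    # keeps the lower-right 7/8 of the spectrum
```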
Therefore, a plurality of sets of first filters can be designed to filter the first spectrogram, so as to obtain a plurality of second spectrograms carrying different frequency band information. For example, in the case of separating the low frequency information, the medium frequency information and the high frequency information of the first spectrogram, three sets of first filters may be designed to perform the filtering processing: the first preset parameter of one set is used to separate the low frequency information in the first spectrogram, that of another set is used to separate the medium frequency information, and that of the third set is used to separate the high frequency information. These three sets of first preset parameters are only an example; in practical application, the first spectrogram may also be divided into equal bands, i.e. the first preset parameters of the three sets of filters are set with identical spacing, namely [0,1/3], [1/3,2/3] and [2/3,1].
Therefore, the first preset parameter of each set of first filters can be set in advance according to the frequency band information to be separated. Generally, the first preset parameters are set such that the energies of the three second spectrograms obtained after the filtering processing are approximately equal, so that the energy difference between the layers of data in the first input data obtained by splicing the three second spectrograms is not excessively large; that is, spatial continuity is satisfied, which facilitates the subsequent feature extraction on the first input data.
In addition, each set of first filters may include a basic filter and an adjustable filter, where a filtering parameter of the basic filter is a first preset parameter of the set of first filters, and a filtering parameter of the adjustable filter is a first reference parameter of the set of first filters.
Each set of first filters can be represented by formula (1):
f_i = f_i^base + σ(f_i^w)  (1)
where f_i is the i-th set of first filters among the plurality of sets of first filters, f_i^base is the first preset parameter of the i-th set of first filters, i.e. the basic filter, f_i^w is the first reference parameter of the i-th set of first filters, i.e. the adjustable filter, and σ is a compression function used to compress the value of the first reference parameter to a preset range, for example σ(x) = (1 - e^x)/(1 + e^x); i is an integer from 1 to N, and N is the number of sets of first filters.
Since the first reference parameter only compensates the frequency band information, the range of the frequency band information separated by each set of first filters is determined by the basic filter of that set. Therefore, once the plurality of sets of first filters have been divided in advance, all the frequency band information of the first spectrogram can be extracted. The compression function σ is used to compress the value of the first reference parameter to [-1,1]; this prevents the value of the first reference parameter from being too large, in which case, after superposition with the basic filter, the filtering parameter of each set of first filters would be dominated by the first reference parameter, the range of the extracted frequency band information would be determined by the adjustable filter, and the complete frequency band information of the first spectrogram might then not be extracted.
Combining formula (1), the multiple filtering processing of the first spectrogram can be expressed by formula (2):
s_i = D(x) ⊙ f_i  (2)
where s_i is the second spectrogram corresponding to the i-th set of first filters among the plurality of sets of first filters, x is the first face image, D(x) is the global frequency domain transformation, and ⊙ is the element-wise (dot) multiplication operation.
Furthermore, if the inverse transform of the frequency domain is the inverse of the global transform, the process of obtaining the second image can be represented by the formula (3) in combination with the formula (2):
wherein x is the first face image, y_i is the second image corresponding to the i-th group of first filters, D(·) is the global frequency domain transform, D^(-1)(·) is the frequency domain inverse transform, and ⊙ is the element-wise multiplication operation.
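As a concrete illustration of formulas (1) to (3), the following sketch builds the groups of first filters from base band masks plus a compensation term compressed by σ, filters the DCT spectrum of a face image, and inverts each band back to the image domain. It is a minimal sketch only: the band boundaries, the use of an orthonormal DCT, and the helper names band_mask, sigma and global_branch are illustrative assumptions, not the exact filters or parameters of the present application.

import numpy as np
from scipy.fft import dctn, idctn

def band_mask(size, lo, hi):
    # Base filter f_i^base: pass spectral positions whose normalized anti-diagonal
    # coordinate (0 at DC, 1 at the highest frequency) lies in [lo, hi).
    u, v = np.meshgrid(np.arange(size), np.arange(size), indexing="ij")
    d = (u + v) / (2 * (size - 1))
    return ((d >= lo) & (d < hi)).astype(np.float32)

def sigma(x):
    # Compression function from the text: maps any real value into (-1, 1).
    return (1.0 - np.exp(x)) / (1.0 + np.exp(x))

def global_branch(face, bands, f_w):
    # face: HxW grayscale crop; bands: list of (lo, hi); f_w: learned first reference
    # parameters, one array per band (trained elsewhere; zeros here in the sketch).
    spectrum = dctn(face, norm="ortho")                        # D(x)
    second_images = []
    for (lo, hi), w in zip(bands, f_w):
        f_i = band_mask(face.shape[0], lo, hi) + sigma(w)      # formula (1)
        s_i = spectrum * f_i                                   # formula (2)
        second_images.append(idctn(s_i, norm="ortho"))         # formula (3)
    return np.stack(second_images, axis=0)                     # first input data

# Example with three equally spaced bands and untrained (zero) compensation.
face = np.random.rand(256, 256).astype(np.float32)             # stand-in face crop
bands = [(0.0, 1/3), (1/3, 2/3), (2/3, 1.0 + 1e-9)]
f_w = [np.zeros_like(face) for _ in bands]
first_input = global_branch(face, bands, f_w)
print(first_input.shape)                                        # (3, 256, 256)

With zero compensation terms, f_i reduces to the base filters; in practice the first reference parameters would be learned during training, as described above.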
For example, similar to the first filters, each group of second filters extracts, from each first target spectrogram, the second frequency band information corresponding to the second preset parameter of that group, and the second reference parameter of that group is used to compensate the extracted frequency band information, so as to obtain a second spectrogram corresponding to that group of second filters. Specifically, fourth frequency band information in the first target spectrogram is extracted through the second reference parameter, and the second frequency band information and the fourth frequency band information are superimposed to obtain the second spectrogram. The second reference parameter is also a network parameter obtained by pre-training. In addition, if the second preset parameters of any two groups of second filters are different, the second frequency band information extracted by those two groups is different; and the plurality of pieces of second frequency band information extracted by the plurality of groups of second filters together contain all the frequency band information of each first target spectrogram, i.e. combining the plurality of pieces of second frequency band information yields the complete frequency band information of each first target spectrogram.
In addition, each group of second filters also comprises a basic filter and an adjustable filter, wherein the filtering parameters of the basic filter are second preset parameters of the group of second filters, and the filtering parameters of the adjustable filter are second reference parameters of the group of second filters.
Wherein each set of second filters can be represented by formula (4):

h_i = h_i^base + σ(h_i^w)    (4)
wherein h_i is the i-th group of second filters among the plurality of groups of second filters, h_i^base is the second preset parameter of the i-th group of second filters, i.e. the basic filter, h_i^w is the second reference parameter of the i-th group of second filters, i.e. the adjustable filter, and σ is the compression function.
In combination with formula (4), the process of performing the filtering process multiple times on each first target spectrogram can be expressed by formula (5):

g_i = D(p) ⊙ h_i    (5)
wherein g_i is the i-th second spectrogram among the plurality of second spectrograms corresponding to each first target spectrogram, p is the image region obtained by the p-th window selection on the first face image, D(p) is the local frequency domain transform of that region, and ⊙ is the element-wise multiplication operation.
In some possible embodiments, the second preset parameters of the plurality of groups of second filters are set in advance. For example, the second preset parameters may be obtained by dividing the first target spectrogram equally along its diagonal according to the number of groups of second filters that are set. As shown in fig. 3, the scale of the first target spectrogram is 4×4; when 8 groups of second filters are set, the diagonal can be moved equidistantly to determine the second preset parameters of each group, and the second preset parameters of the 8 groups of second filters are respectively: [0,1/32], [1/32,1/8], [1/8,9/32], [9/32,1/2], [1/2,23/32], [23/32,28/32], [28/32,31/32], [31/32,1].
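The eight intervals listed above can be reproduced numerically by treating each second preset parameter as the fraction of the spectrogram lying below an anti-diagonal cut that is moved in equidistant steps. The sketch below is only one way of arriving at those values under that assumption; diagonal_band_edges is a hypothetical helper, not a function defined by the present application.

def diagonal_band_edges(m):
    # Move the anti-diagonal cut u + v = t across the unit square in m equidistant
    # steps (t goes from 0 to 2) and report the fraction of the square below each cut.
    edges = []
    for k in range(m + 1):
        t = 2.0 * k / m
        area = t * t / 2.0 if t <= 1.0 else 1.0 - (2.0 - t) ** 2 / 2.0
        edges.append(area)
    return [(edges[k], edges[k + 1]) for k in range(m)]

print(diagonal_band_edges(8))
# [(0.0, 0.03125), (0.03125, 0.125), (0.125, 0.28125), (0.28125, 0.5),
#  (0.5, 0.71875), (0.71875, 0.875), (0.875, 0.96875), (0.96875, 1.0)]

These fractions are exactly the intervals [0,1/32], [1/32,1/8], [1/8,9/32], [9/32,1/2], [1/2,23/32], [23/32,28/32], [28/32,31/32], [31/32,1] of fig. 3.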
Of course, the second preset parameters need not be obtained by equal division. For example, the diagonal may be moved in an arithmetic-progression manner, or in other manners, to obtain the second preset parameters. The present application does not limit the manner in which the second preset parameters are set.
Then, the energy of each second spectrogram among the plurality of second spectrograms corresponding to each first target spectrogram is determined, so as to obtain a plurality of energies; the plurality of energies are assembled into a feature vector, so as to obtain the feature vector corresponding to each first target spectrogram. Since each second spectrogram corresponds to different frequency band information, the feature vector essentially consists of the energies of the individual frequency bands of the first target spectrogram.
Wherein the energy of each second spectrogram can be represented by formula (6):

q_i = log10(||D(p) ⊙ h_i||_1)    (6)
wherein q_i is the energy of the i-th second spectrogram among the plurality of second spectrograms corresponding to each first target spectrogram, || · ||_1 denotes the sum of the absolute values of all elements of the matrix, i.e. the energy of the spectrogram, i is an integer from 1 to M, M is the number of second spectrograms, i.e. the number of groups of second filters, and ⊙ is the element-wise multiplication operation.
The log10 is used to bring the energies of the different frequency bands onto the same order of magnitude, which prevents the energy of some bands from being far too large and that of others from being far too small, a situation that would be inconvenient for subsequent processing.
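Formulas (4) to (6) can be illustrated by the following sketch of the local branch: the face image is cut into sliding windows, each window is DCT-transformed, filtered by the groups of second filters, and reduced to one log-energy per band. The window size, stride, equidistant band cuts and the reuse of the hypothetical helpers band_mask and sigma from the earlier sketch are assumptions for illustration only.

def local_branch(face, window=8, stride=8, m=8, h_w=None):
    # Sliding-window DCT: each window yields one first target spectrogram, which the
    # m groups of second filters reduce to m band energies (the feature vector).
    cuts = [(k / m, (k + 1) / m + (1e-9 if k == m - 1 else 0.0)) for k in range(m)]
    if h_w is None:                       # untrained second reference parameters
        h_w = [np.zeros((window, window), np.float32) for _ in range(m)]
    vectors = []
    for r in range(0, face.shape[0] - window + 1, stride):
        for c in range(0, face.shape[1] - window + 1, stride):
            spec = dctn(face[r:r + window, c:c + window], norm="ortho")   # D(p)
            feats = []
            for (lo, hi), w in zip(cuts, h_w):
                h_i = band_mask(window, lo, hi) + sigma(w)                # formula (4)
                g_i = spec * h_i                                          # formula (5)
                feats.append(np.log10(np.abs(g_i).sum() + 1e-12))         # formula (6)
            vectors.append(feats)
    return np.asarray(vectors, dtype=np.float32)     # (#windows, m) feature vectors

second_features = local_branch(face)
print(second_features.shape)    # (1024, 8): 32x32 windows of a 256x256 face, 8 bands

The cuts here are equidistant in anti-diagonal position; their corresponding area fractions are the second preset parameters discussed above.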
The first filters perform filtering on the first spectrogram obtained by the global frequency domain transform, i.e. they operate on global frequency band information and only need to divide the frequency bands of the first spectrogram coarsely, for example into high frequency, intermediate frequency and low frequency; the number of first filters can therefore be set relatively small. The second filters perform filtering on the first target spectrograms obtained by the local transform, i.e. they operate on local frequency band information and need to extract more detailed frequency domain information from each first target spectrogram; the frequency bands of the first target spectrogram therefore need to be divided finely, and the number of second filters needs to be set relatively large.
In some possible embodiments, the above process of identifying the authenticity of a face image may be implemented through a neural network. The neural network includes a first network and a second network. The training of the neural network is conventional supervised training and is not described here.
Specifically, the first face image is input into the first network to perform the frequency domain transform to obtain the first spectrogram, the first spectrogram is subjected to multiple filtering processes to obtain a plurality of second spectrograms, and the input data is obtained according to the plurality of second spectrograms; then, the input data is input into the second network to determine the authenticity of the first face image.
The first network may be, for example, an existing network structure, i.e. an existing neural network capable of performing the frequency domain transform and the filtering. For the present application, it is only necessary to set the network parameters of this neural network as the first reference parameters and the second reference parameters described above; the first reference parameters and the second reference parameters are then adjusted during training, and after training the first network can be used to perform the frequency domain transform on the first face image and the multiple filtering processes on the first spectrogram to obtain the input data.
In practical application, the frequency domain transform of the first face image and the multiple filtering processes on the first spectrogram can also be implemented by an encapsulated function. That is, after the first network has been trained, the filtering parameters of its multiple groups of filters (including the multiple groups of first filters and the multiple groups of second filters) are encapsulated into a function, and this function can then be used to perform the frequency domain transform and the filtering on a face image directly, without using the first network.
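A minimal sketch of this "encapsulated function" idea, assuming the trained reference parameters are simply frozen and passed to the hypothetical global_branch and local_branch helpers above; the names and the closure structure are illustrative, not part of the present application.

def make_preprocessor(first_w, second_w, bands):
    # Freeze the trained filter parameters into a plain callable so a face image can
    # be turned into (first_input, second_input) without running the first network.
    def preprocess(face):
        first_input = global_branch(face, bands, first_w)     # global branch
        second_input = local_branch(face, h_w=second_w)       # local branch
        return first_input, second_input
    return preprocess

# Hypothetical usage with the untrained (zero) parameters from the sketches above.
prep = make_preprocessor(f_w, [np.zeros((8, 8), np.float32)] * 8, bands)
first_input, second_input = prep(face)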
The process of determining the authenticity of the first face image through the second network is described in detail below.
The second network may be, for example, a convolutional neural network. As shown in fig. 4, the second network uses an Xception network as its backbone. The second network includes two branches and a plurality of cross-fusion processing modules. Each branch comprises a plurality of network blocks (blocks), each block comprising several convolution layers and a pooling layer; the blocks are existing network structures and are not described further. The two branches correspond to the first input data and the second input data, respectively.
In the case that the input data is first input data, feature extraction can be performed on the first input data through a first branch, that is, feature extraction is performed through a plurality of blocks of the first branch, and authenticity of the first face image is directly determined according to the extracted features;
if the input data is the second input data, feature extraction can be performed on the second input data through the second branch, that is, feature extraction is performed through a plurality of blocks of the second branch, and the authenticity of the first face image is directly determined according to the extracted feature map;
Under the condition that the input data comprises first input data and second input data, firstly, carrying out feature extraction on the first input data and the second input data through blocks of each branch to obtain a sixth feature map and a seventh feature map; then, the sixth feature map and the seventh feature map are subjected to first cross fusion through the cross fusion processing module, so that a fourth feature map and a fifth feature map are obtained; and then, taking the fourth characteristic diagram and the fifth characteristic diagram as input data of the next cross fusion processing, and continuing the cross fusion processing until a second characteristic diagram and a third characteristic diagram corresponding to the two branches are obtained. And finally, splicing the second feature map and the third feature map, and determining the authenticity of the first face image according to the spliced feature map.
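The cross-fusion step can be sketched as a correlation-based exchange between the two branches' feature maps: a first matrix captures the correlation between the sixth and seventh feature maps, each map is projected through it, and the projections are superimposed back onto the original maps. The flattening to (channels, positions), the softmax normalization and the function name cross_fusion are assumptions for illustration; the exact module of fig. 4 may differ.

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_fusion(feat_a, feat_b):
    # feat_a, feat_b: (C, H, W) feature maps from the two branches, e.g. the sixth
    # and seventh feature maps of the first cross-fusion processing.
    c, h, w = feat_a.shape
    a = feat_a.reshape(c, h * w)
    b = feat_b.reshape(c, h * w)
    corr = softmax(a @ b.T)                   # first matrix: correlation of the maps
    eighth = (corr @ b).reshape(c, h, w)      # seventh map seen through the correlation
    ninth = (corr.T @ a).reshape(c, h, w)     # sixth map seen through the correlation
    fourth = eighth + feat_a                  # superimpose onto the sixth feature map
    fifth = ninth + feat_b                    # superimpose onto the seventh feature map
    return fourth, fifth

fa = np.random.rand(64, 16, 16).astype(np.float32)
fb = np.random.rand(64, 16, 16).astype(np.float32)
f4, f5 = cross_fusion(fa, fb)
print(f4.shape, f5.shape)     # (64, 16, 16) (64, 16, 16)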
Compared with the existing way of simply splicing frequency domain information, the present application performs cross fusion on the frequency domain information obtained by the global frequency domain transform and the frequency domain information obtained by the local frequency domain transform, i.e. the frequency domain information obtained by the two transforms is fused with each other, so that both the resulting second feature map and third feature map contain the global frequency domain information and the local frequency domain information in the first spectrogram, which can improve the accuracy of identifying the authenticity of the first face image.
The following describes the process of identifying the authenticity of a face image according to the present application in detail with reference to fig. 5 to 7.
As shown in fig. 5, the first face image is processed by two branches of frequency domain transform and filtering, i.e. the global frequency domain transform and the local frequency domain transform are performed on the first face image together with the corresponding series of filtering processes, so as to obtain the first input data corresponding to the global frequency domain transform branch and the second input data corresponding to the local frequency domain transform branch. The first input data and the second input data are then input into the convolutional networks of the respective branches for feature extraction, the extracted features are cross-fused, and finally the second feature map and the third feature map of the two branches are obtained. The second feature map and the third feature map are pooled together to obtain a target feature map, and the authenticity of the face image is finally predicted according to the target feature map, so as to determine the authenticity of the first face image.
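Putting the pieces together, the fig. 5 flow can be skeletonized as follows; extract_a, extract_b and classify stand in for the per-branch Xception-style blocks and the classifier head, the number of cross-fusion stages is arbitrary, and the whole sketch is an assumption about how the stages connect rather than the exact network.

def second_network_forward(first_input, second_input, extract_a, extract_b, classify):
    # Per-branch feature extraction, repeated cross fusion, joint pooling, prediction.
    feat_a, feat_b = extract_a(first_input), extract_b(second_input)
    for _ in range(3):                                   # several cross-fusion stages
        feat_a, feat_b = cross_fusion(feat_a, feat_b)
    target = np.concatenate([feat_a.mean(axis=(1, 2)), feat_b.mean(axis=(1, 2))])
    return classify(target)                              # e.g. probability of forgery

# Hypothetical stand-ins so the skeleton runs end to end on the earlier sketches.
to_map = lambda x: np.resize(x.astype(np.float32), (64, 16, 16))
logistic = lambda v: 1.0 / (1.0 + np.exp(-v.mean()))
print(second_network_forward(first_input, second_input, to_map, to_map, logistic))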
Fig. 6 is a refinement of the global frequency domain transform branch. As shown in fig. 6, DCT transform is first performed on the first face image to obtain the first spectrogram; the first spectrogram is then filtered multiple times by multiple groups of first filters (only three groups are shown in fig. 6) to obtain a plurality of second spectrograms; finally, each of the second spectrograms is inverse-transformed back to the image domain to obtain a plurality of second images, and the plurality of second images are spliced to obtain the first input data.
Fig. 7 is a refinement of the local frequency domain transform branch. As shown in fig. 7, SWDCT is first performed on the first face image to obtain a plurality of first target spectrograms; each first target spectrogram is then filtered multiple times by multiple groups of second filters to obtain a plurality of second spectrograms corresponding to that first target spectrogram; the energy of each second spectrogram is determined, and the feature vector of each first target spectrogram is determined according to the energies of its plurality of second spectrograms; finally, the feature vectors corresponding to the plurality of first target spectrograms are spliced, and the spliced data (matrix) is channel-converted to obtain the second input data of the size specified by the convolutional network of this branch.
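One plausible reading of this final splicing and channel-conversion step is to arrange the per-window feature vectors on the window grid, with one channel per frequency band; the grid size and the helper name assemble_second_input are assumptions, and the layout actually expected by the branch network of fig. 7 is not fixed here.

def assemble_second_input(feature_vectors, grid_h, grid_w):
    # feature_vectors: (#windows, m) band energies, windows enumerated row by row.
    # Returns an (m, grid_h, grid_w) tensor: one channel per frequency band.
    m = feature_vectors.shape[1]
    return feature_vectors.reshape(grid_h, grid_w, m).transpose(2, 0, 1)

second_input = assemble_second_input(second_features, 32, 32)
print(second_input.shape)     # (8, 32, 32)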
The application scenarios of the above technical solution are introduced below in combination with the method for identifying the authenticity of a face image.
In some possible embodiments, in the case that the first face image is a portrait image of a user, the portrait image may be examined based on the technical solution of the present application; if the portrait image is determined to be a pseudo image, it can be concluded that someone else has modified the portrait image, which is highly likely to infringe the user's portrait rights, and the malicious act of modifying the portrait image can then be pursued.
In some possible embodiments, when the first face image is any frame or a specified frame of a video to be examined, the face image may be examined based on the technical solution of the present application; when the face image is determined to be a pseudo image, i.e. when the first face image referred to in the present application is determined to be a pseudo image, it can be concluded that someone else has modified the video to be examined, and the act of tampering with the video work can be pursued.
Referring to fig. 8, fig. 8 is a schematic structural diagram of a device for identifying authenticity of a face image according to an embodiment of the present application. As shown in fig. 8, an apparatus 800 includes a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs including instructions for:
acquiring a first face image;
performing frequency domain transformation on the first face image to obtain a first spectrogram;
respectively carrying out multiple times of filtering processing on the first spectrogram to obtain a plurality of second spectrograms;
obtaining input data according to the plurality of second spectrograms;
and determining the authenticity of the first face image according to the input data.
In some possible implementations, the frequency domain transform includes at least one of: global frequency domain transforms and local frequency domain transforms.
In some possible embodiments, in a case that the frequency domain transformation includes the global frequency domain transformation, the obtaining input data according to the plurality of second spectrograms, the program is specifically configured to execute instructions for:
performing frequency domain inverse transformation on each second spectrogram to obtain a plurality of second images, wherein the frequency domain inverse transformation is an inverse process of the global frequency domain transformation;
and splicing the plurality of second images to obtain the input data.
In some possible implementations, where the frequency domain transform comprises the local frequency domain transform, the number of first spectrograms comprises one or more;
in the aspect of performing multiple filtering processing on the first spectrogram to obtain multiple second spectrograms, the program is specifically configured to execute instructions of the following steps:
and respectively carrying out multiple times of filtering processing on the first spectrograms to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, in the case that the number of the first spectrograms includes a plurality, in obtaining input data according to the plurality of the second spectrograms, the above program is specifically configured to execute instructions for:
Determining the energy of each second spectrogram;
taking each first spectrogram as a first target spectrogram, and obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the input data.
In some possible embodiments, in determining the authenticity of the first face image according to the input data, the above program is specifically configured to execute instructions for:
extracting features of the input data to obtain a first feature map;
and determining the authenticity of the first face image according to the first feature map.
In some possible embodiments, in the case that the frequency domain transform includes the global frequency domain transform and the local frequency domain transform, the global frequency domain transform obtains one first spectrogram and the local frequency domain transform obtains one or more first spectrograms; in the aspect of performing multiple filtering processing on the first spectrograms to obtain multiple second spectrograms, the program is specifically configured to execute instructions for:
Performing multiple filtering processing on the first spectrogram obtained by the global frequency domain transformation to obtain a plurality of second spectrograms corresponding to the first spectrogram;
and performing multiple filtering processing on one or more first spectrograms obtained by the local frequency domain transformation to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, the above program is specifically configured to execute the following instructions, where the input data includes first input data and second input data, and the input data is obtained according to the plurality of second spectrograms:
performing frequency domain inverse transformation on each of a plurality of second spectrograms corresponding to the global frequency domain transformation to obtain a plurality of second images, wherein the frequency domain inverse transformation is an inverse process of the global frequency domain transformation;
splicing the plurality of second images to obtain the first input data;
under the condition that the number of the first spectrograms obtained by the frequency domain transformation is a plurality of, taking each first spectrogram obtained by the frequency domain transformation as a first target spectrogram, and determining the energy of each second spectrogram in a plurality of second spectrograms corresponding to the first target spectrogram;
Obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the second input data.
In some possible embodiments, in determining the authenticity of the first face image according to the input data, the above program is specifically configured to execute instructions for:
performing cross fusion processing on the first input data and the second input data to obtain a second characteristic diagram and a third characteristic diagram;
and determining the authenticity of the first face image according to the second characteristic diagram and the third characteristic diagram.
In some possible embodiments, in the case that the number of times of the cross-fusion processing is multiple, in performing the cross-fusion processing on the first input data and the second input data to obtain a second feature map and a third feature map, the above program is specifically configured to execute instructions for:
performing first cross fusion processing on the first input data and the second input data to obtain a fourth characteristic diagram and a fifth characteristic diagram;
And taking the fourth characteristic diagram and the fifth characteristic diagram as input data of the next cross fusion processing, and obtaining the second characteristic diagram and the third characteristic diagram after the cross fusion processing is carried out for a plurality of times.
In some possible embodiments, in the aspect of performing a first cross fusion process on the first input data and the second input data to obtain a fourth feature map and a fifth feature map, the program is specifically configured to execute instructions for:
extracting features of the first input data to obtain a sixth feature map;
extracting features of the second input data to obtain a seventh feature map;
obtaining a first matrix according to the sixth feature map and the seventh feature map, wherein the first matrix is used for representing the correlation between the sixth feature map and the seventh feature map;
according to the first matrix and the seventh feature map, an eighth feature map is obtained, and the eighth feature map and the sixth feature map are overlapped to obtain the fourth feature map;
and according to the first matrix and the sixth feature map, obtaining a ninth feature map, and superposing the ninth feature map and the seventh feature map to obtain the fifth feature map.
In some possible embodiments, in determining the authenticity of the first face image according to the second feature map and the third feature map, the above program is specifically configured to execute instructions for:
and splicing the second characteristic diagram and the third characteristic diagram, and determining the authenticity of the first face image according to the spliced characteristic diagram.
In some possible embodiments, the above-mentioned program is specifically configured to execute, during the execution of the filtering process, instructions for:
performing multiple times of filtering processing on the first spectrogram through multiple groups of filters, wherein each group of filters corresponds to one time of filtering processing;
the filtering parameters of each group of filters comprise preset parameters and reference parameters, each group of filters is used for separating frequency band information corresponding to the preset parameters from a first spectrogram, the reference parameters are used for compensating the frequency band information, the frequency band information separated by any two groups of filters is different, and the plurality of frequency band information separated by the plurality of groups of filters comprises all the frequency band information in the first spectrogram.
In some possible embodiments, in the process of performing multiple filtering processing on the first spectrogram obtained by the global frequency domain transform and the local frequency domain transform through multiple sets of filters, filtering parameters of each set of filters are different.
Referring to fig. 9, fig. 9 shows a device for identifying authenticity of a face image according to an embodiment of the present application. The apparatus 900 includes: an acquisition unit 910, a transformation unit 920, a filtering unit 930, and a processing unit 940, wherein:
in some possible implementations, the frequency domain transform includes at least one of: global frequency domain transforms and local frequency domain transforms.
In some possible embodiments, where the frequency domain transform includes the global frequency domain transform, the processing unit 940 is specifically configured to, in obtaining input data according to the plurality of second spectrograms:
performing frequency domain inverse transformation on each second spectrogram to obtain a plurality of second images, wherein the frequency domain inverse transformation is an inverse process of the global frequency domain transformation;
and splicing the plurality of second images to obtain the input data.
In some possible implementations, where the frequency domain transform comprises the local frequency domain transform, the number of first spectrograms comprises one or more;
in terms of performing multiple filtering processing on the first spectrogram to obtain multiple second spectrograms, the filtering unit 930 is specifically configured to:
and respectively carrying out multiple times of filtering processing on the first spectrograms to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, in the case that the number of the first spectrograms includes a plurality of second spectrograms, the processing unit 940 is specifically configured to, according to the plurality of second spectrograms, obtain input data:
determining the energy of each second spectrogram;
taking each first spectrogram as a first target spectrogram, and obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the input data.
In some possible embodiments, the processing unit 940 is specifically configured to determine, according to the input data, whether the first face image is true or false:
extracting features of the input data to obtain a first feature map;
and determining the authenticity of the first face image according to the first feature map.
In some possible embodiments, in the case that the frequency domain transform includes the global frequency domain transform and the local frequency domain transform, the global frequency domain transform obtains one first spectrogram, the local frequency domain transform obtains one or more first spectrograms, and the filtering unit 930 is specifically configured to:
Performing multiple filtering processing on the first spectrogram obtained by the global frequency domain transformation to obtain a plurality of second spectrograms corresponding to the first spectrogram;
and performing multiple filtering processing on one or more first spectrograms obtained by the local frequency domain transformation to obtain a plurality of second spectrograms corresponding to each first spectrogram.
In some possible embodiments, the input data includes first input data and second input data, and the processing unit 940 is specifically configured to, in obtaining the input data according to the plurality of second spectrograms:
performing frequency domain inverse transformation on each of a plurality of second spectrograms corresponding to the global frequency domain transformation to obtain a plurality of second images, wherein the frequency domain inverse transformation is an inverse process of the global frequency domain transformation;
splicing the plurality of second images to obtain the first input data;
under the condition that the number of the first spectrograms obtained by the frequency domain transformation is a plurality of, taking each first spectrogram obtained by the frequency domain transformation as a first target spectrogram, and determining the energy of each second spectrogram in a plurality of second spectrograms corresponding to the first target spectrogram;
Obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the second input data.
In some possible embodiments, the processing unit 940 is specifically configured to determine, according to the input data, authenticity of the first face image:
performing cross fusion processing on the first input data and the second input data to obtain a second characteristic diagram and a third characteristic diagram;
and determining the authenticity of the first face image according to the second characteristic diagram and the third characteristic diagram.
In some possible embodiments, in the case that the number of times of the cross-fusion processing is multiple, in performing the cross-fusion processing on the first input data and the second input data to obtain a second feature map and a third feature map, the processing unit 940 is specifically configured to:
performing first cross fusion processing on the first input data and the second input data to obtain a fourth characteristic diagram and a fifth characteristic diagram;
and taking the fourth characteristic diagram and the fifth characteristic diagram as input data of the next cross fusion processing, and obtaining the second characteristic diagram and the third characteristic diagram after the cross fusion processing is carried out for a plurality of times.
In some possible embodiments, in performing a first cross fusion process on the first input data and the second input data to obtain a fourth feature map and a fifth feature map, the processing unit 940 is specifically configured to:
extracting features of the first input data to obtain a sixth feature map;
extracting features of the second input data to obtain a seventh feature map;
obtaining a first matrix according to the sixth feature map and the seventh feature map, wherein the first matrix is used for representing the correlation between the sixth feature map and the seventh feature map;
according to the first matrix and the seventh feature map, an eighth feature map is obtained, and the eighth feature map and the sixth feature map are overlapped to obtain the fourth feature map;
and according to the first matrix and the sixth feature map, obtaining a ninth feature map, and superposing the ninth feature map and the seventh feature map to obtain the fifth feature map.
In some possible embodiments, the processing unit 940 is specifically configured to determine the authenticity of the first face image according to the second feature map and the third feature map:
And splicing the second characteristic diagram and the third characteristic diagram, and determining the authenticity of the first face image according to the spliced characteristic diagram.
In some possible embodiments, in performing the filtering process multiple times, the filtering unit 930 is specifically configured to:
performing multiple times of filtering processing on the first spectrogram through multiple groups of filters, wherein each group of filters corresponds to one time of filtering processing;
the filtering parameters of each group of filters comprise preset parameters and reference parameters, each group of filters is used for separating frequency band information corresponding to the preset parameters from a first spectrogram, the reference parameters are used for compensating the frequency band information, the frequency band information separated by any two groups of filters is different, and the plurality of frequency band information separated by the plurality of groups of filters comprises all the frequency band information in the first spectrogram.
In some possible embodiments, in the process of performing multiple filtering processing on the first spectrogram obtained by the global frequency domain transform and the local frequency domain transform through multiple sets of filters, filtering parameters of each set of filters are different.
The embodiment of the application also provides a computer storage medium, and the computer storage medium stores a computer program, and the computer program is executed by a processor to implement part or all of the steps of any method for identifying authenticity of a face image as described in the embodiment of the method.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program operable to cause a computer to perform part or all of the steps of any one of the methods of identifying authenticity of a face image as described in the method embodiments above.
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present application is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are alternative embodiments, and that the acts and modules referred to are not necessarily required for the present application.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division of the units is merely a logical function division, and in actual implementation there may be other manners of division, for example multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units described above may be implemented either in hardware or in software program modules.
The integrated units, if implemented in the form of software program modules and sold or used as independent products, may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application, in essence or in part, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method described in the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Those of ordinary skill in the art will appreciate that all or a portion of the steps in the various methods of the above embodiments may be implemented by a program that instructs associated hardware, and the program may be stored in a computer readable memory, which may include: flash disk, read-Only Memory (ROM), random access Memory (Random Access Memory, RAM), magnetic disk or optical disk.
The embodiments of the present application have been described in detail above, and specific examples have been used herein to explain the principles and implementations of the present application; the above description of the embodiments is only intended to help understand the method and core idea of the present application. Meanwhile, those skilled in the art may make changes to the specific implementations and the application scope according to the idea of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (16)

1. A method for identifying authenticity of a face image, comprising:
acquiring a first face image;
performing frequency domain transformation on the first face image to obtain a first spectrogram; the frequency domain transform includes at least one of: global frequency domain transformation and local frequency domain transformation; the global frequency domain transform is different from the local frequency domain transform in terms of filter parameters, and the local frequency domain transform uses a greater number of filters than the global frequency domain transform;
Respectively carrying out multiple times of filtering processing on the first spectrogram to obtain a plurality of second spectrograms;
obtaining input data according to the plurality of second spectrograms;
and determining the authenticity of the first face image according to the input data.
2. The method of claim 1, wherein, in the case where the frequency domain transform comprises the global frequency domain transform, the deriving input data from the plurality of second spectrograms comprises:
performing frequency domain inverse transformation on each second spectrogram to obtain a plurality of second images, wherein the frequency domain inverse transformation is an inverse process of the global frequency domain transformation;
and splicing the plurality of second images to obtain the input data.
3. The method of claim 1, wherein the number of first spectrograms comprises one or more if the frequency domain transform comprises the local frequency domain transform;
the filtering processing is performed on the first spectrogram for multiple times to obtain multiple second spectrograms, including:
and respectively carrying out multiple times of filtering processing on the first spectrograms to obtain a plurality of second spectrograms corresponding to each first spectrogram.
4. A method according to claim 3, wherein, in case the number of the first spectrograms comprises a plurality, the obtaining input data from the plurality of second spectrograms comprises:
Determining the energy of each second spectrogram;
taking each first spectrogram as a first target spectrogram, and obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the input data.
5. The method of claim 1, wherein determining the authenticity of the first face image based on the input data comprises:
extracting features of the input data to obtain a first feature map;
and determining the authenticity of the first face image according to the first feature map.
6. The method of claim 1, wherein, in the case that the frequency domain transform includes the global frequency domain transform and the local frequency domain transform, the global frequency domain transform obtains one first spectrogram, the local frequency domain transform obtains one or more first spectrograms, and the filtering processing is performed on the first spectrograms for multiple times to obtain a plurality of second spectrograms, including:
performing multiple filtering processing on the first spectrogram obtained by the global frequency domain transformation to obtain a plurality of second spectrograms corresponding to the first spectrogram;
And performing multiple filtering processing on one or more first spectrograms obtained by the local frequency domain transformation to obtain a plurality of second spectrograms corresponding to each first spectrogram.
7. The method of claim 6, wherein the input data comprises first input data and second input data, the deriving input data from the plurality of second spectrograms comprising:
performing frequency domain inverse transformation on each of a plurality of second spectrograms corresponding to the global frequency domain transformation to obtain a plurality of second images, wherein the frequency domain inverse transformation is an inverse process of the global frequency domain transformation;
splicing the plurality of second images to obtain the first input data;
under the condition that the number of the first spectrograms obtained by the frequency domain transformation is a plurality of, taking each first spectrogram obtained by the frequency domain transformation as a first target spectrogram, and determining the energy of each second spectrogram in a plurality of second spectrograms corresponding to the first target spectrogram;
obtaining a feature vector corresponding to the first target spectrogram according to the energy of a plurality of second spectrograms corresponding to the first target spectrogram;
and splicing the feature vectors corresponding to the plurality of first target spectrograms to obtain the second input data.
8. The method of claim 7, wherein determining the authenticity of the first face image based on the input data comprises:
performing cross fusion processing on the first input data and the second input data to obtain a second characteristic diagram and a third characteristic diagram;
and determining the authenticity of the first face image according to the second characteristic diagram and the third characteristic diagram.
9. The method according to claim 8, wherein, in the case where the number of times of the cross-fusion processing is a plurality of times, the cross-fusion processing is performed on the first input data and the second input data to obtain a second feature map and a third feature map, including:
performing first cross fusion processing on the first input data and the second input data to obtain a fourth characteristic diagram and a fifth characteristic diagram;
and taking the fourth characteristic diagram and the fifth characteristic diagram as input data of the next cross fusion processing, and obtaining the second characteristic diagram and the third characteristic diagram after the cross fusion processing is carried out for a plurality of times.
10. The method of claim 9, wherein performing a first cross-fusion process on the first input data and the second input data to obtain a fourth feature map and a fifth feature map comprises:
Extracting features of the first input data to obtain a sixth feature map;
extracting features of the second input data to obtain a seventh feature map;
obtaining a first matrix according to the sixth feature map and the seventh feature map, wherein the first matrix is used for representing the correlation between the sixth feature map and the seventh feature map;
according to the first matrix and the seventh feature map, an eighth feature map is obtained, and the eighth feature map and the sixth feature map are overlapped to obtain the fourth feature map;
and according to the first matrix and the sixth feature map, obtaining a ninth feature map, and superposing the ninth feature map and the seventh feature map to obtain the fifth feature map.
11. The method of claim 8, wherein determining authenticity of the first face image based on the second feature map and the third feature map comprises:
and splicing the second characteristic diagram and the third characteristic diagram, and determining the authenticity of the first face image according to the spliced characteristic diagram.
12. The method according to any one of claims 1 to 11, wherein the multiple filtering process comprises:
Performing multiple times of filtering processing on the first spectrogram through multiple groups of filters, wherein each group of filters corresponds to one time of filtering processing;
the filtering parameters of each group of filters comprise preset parameters and reference parameters, each group of filters is used for separating frequency band information corresponding to the preset parameters from a first spectrogram, the reference parameters are used for compensating the frequency band information, the frequency band information separated by any two groups of filters is different, and the plurality of frequency band information separated by the plurality of groups of filters comprises all the frequency band information in the first spectrogram.
13. The method of claim 12, wherein, in the process of performing multiple filtering processing on the first spectrograms obtained by the global frequency domain transform and the local frequency domain transform through multiple groups of filters, the filtering parameters of each group of filters are different.
14. A device for identifying authenticity of a face image, comprising:
the acquisition unit is used for acquiring a first face image;
the transformation unit is used for carrying out frequency domain transformation on the first face image to obtain a first spectrogram; the frequency domain transform includes at least one of: global frequency domain transformation and local frequency domain transformation; the global frequency domain transform is different from the local frequency domain transform in terms of filter parameters, and the local frequency domain transform uses a greater number of filters than the global frequency domain transform;
The filtering unit is used for respectively carrying out multiple times of filtering processing on the first spectrograms to obtain a plurality of second spectrograms;
the processing unit is used for obtaining input data according to the plurality of second spectrograms;
the processing unit is further configured to determine, according to the input data, authenticity of the first face image.
15. An apparatus for identifying authenticity of a face image, comprising a processor, a memory, a communication interface and one or more programs, wherein the one or more programs are stored in the memory and configured for execution by the processor, the one or more programs comprising instructions for performing the steps of the method of any of claims 1 to 13.
16. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program, which is executed by a processor to implement the method of any one of claims 1 to 13.
CN202010527530.7A 2020-06-10 2020-06-10 Method, device and medium for identifying authenticity of face image Active CN111723714B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202010527530.7A CN111723714B (en) 2020-06-10 2020-06-10 Method, device and medium for identifying authenticity of face image
PCT/CN2021/086893 WO2021249006A1 (en) 2020-06-10 2021-04-13 Method and apparatus for identifying authenticity of facial image, and medium and program product
JP2022524624A JP7251000B2 (en) 2020-06-10 2021-04-13 Method, apparatus, device, medium, and computer program for identifying authenticity of face image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010527530.7A CN111723714B (en) 2020-06-10 2020-06-10 Method, device and medium for identifying authenticity of face image

Publications (2)

Publication Number Publication Date
CN111723714A CN111723714A (en) 2020-09-29
CN111723714B true CN111723714B (en) 2023-11-03

Family

ID=72567953

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010527530.7A Active CN111723714B (en) 2020-06-10 2020-06-10 Method, device and medium for identifying authenticity of face image

Country Status (3)

Country Link
JP (1) JP7251000B2 (en)
CN (1) CN111723714B (en)
WO (1) WO2021249006A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111723714B (en) * 2020-06-10 2023-11-03 上海商汤智能科技有限公司 Method, device and medium for identifying authenticity of face image
CN113537173B (en) * 2021-09-16 2022-03-18 中国人民解放军国防科技大学 Face image authenticity identification method based on face patch mapping

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09147115A (en) * 1995-11-20 1997-06-06 Hamamatsu Photonics Kk Personal collating device
KR20070105528A (en) * 2006-04-26 2007-10-31 한국전자통신연구원 Method and apparatus for user authentication using face image
KR20130035849A (en) * 2011-09-30 2013-04-09 아이포콤 주식회사 Single image-based fake face detection
WO2014180095A1 (en) * 2013-05-09 2014-11-13 Tencent Technology (Shenzhen) Company Limited Systems and methods for real human face recognition
CN106372648A (en) * 2016-10-20 2017-02-01 中国海洋大学 Multi-feature-fusion-convolutional-neural-network-based plankton image classification method
CN107292275A (en) * 2017-06-28 2017-10-24 北京飞搜科技有限公司 Face characteristic recognition methods and system that a kind of frequency domain is divided
CN110428402A (en) * 2019-07-18 2019-11-08 数字广东网络建设有限公司 Distorted image recognition methods, device, computer equipment and storage medium
CN110826444A (en) * 2019-10-28 2020-02-21 北京影谱科技股份有限公司 Facial expression recognition method and system based on Gabor filter
CN111178137A (en) * 2019-12-04 2020-05-19 百度在线网络技术(北京)有限公司 Method, device, electronic equipment and computer readable storage medium for detecting real human face

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100846500B1 (en) * 2006-11-08 2008-07-17 삼성전자주식회사 Method and apparatus for recognizing face using extended Gabor wavelet features
JP5294300B2 (en) 2008-03-05 2013-09-18 国立大学法人 東京大学 Sound signal separation method
JP5672144B2 (en) 2011-05-20 2015-02-18 富士通株式会社 Heart rate / respiration rate detection apparatus, method and program
JP6048025B2 (en) 2012-09-18 2016-12-21 富士ゼロックス株式会社 Classification device and program
US9875393B2 (en) 2014-02-12 2018-01-23 Nec Corporation Information processing apparatus, information processing method, and program
CN106485192B (en) 2015-09-02 2019-12-06 富士通株式会社 Training method and device of neural network for image recognition
EP3664084B1 (en) 2017-10-25 2024-04-17 Samsung Electronics Co., Ltd. Electronic device and control method therefor
CN107911576A (en) * 2017-11-01 2018-04-13 北京小米移动软件有限公司 Image processing method, device and storage medium
JP7269705B2 (en) 2018-07-12 2023-05-09 日産自動車株式会社 Personal verification method and personal verification device
WO2020258121A1 (en) * 2019-06-27 2020-12-30 深圳市汇顶科技股份有限公司 Face recognition method and apparatus, and electronic device
CN111723714B (en) * 2020-06-10 2023-11-03 上海商汤智能科技有限公司 Method, device and medium for identifying authenticity of face image

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09147115A (en) * 1995-11-20 1997-06-06 Hamamatsu Photonics Kk Personal collating device
KR20070105528A (en) * 2006-04-26 2007-10-31 한국전자통신연구원 Method and apparatus for user authentication using face image
KR20130035849A (en) * 2011-09-30 2013-04-09 아이포콤 주식회사 Single image-based fake face detection
WO2014180095A1 (en) * 2013-05-09 2014-11-13 Tencent Technology (Shenzhen) Company Limited Systems and methods for real human face recognition
CN106372648A (en) * 2016-10-20 2017-02-01 中国海洋大学 Multi-feature-fusion-convolutional-neural-network-based plankton image classification method
CN107292275A (en) * 2017-06-28 2017-10-24 北京飞搜科技有限公司 Face characteristic recognition methods and system that a kind of frequency domain is divided
CN110428402A (en) * 2019-07-18 2019-11-08 数字广东网络建设有限公司 Distorted image recognition methods, device, computer equipment and storage medium
CN110826444A (en) * 2019-10-28 2020-02-21 北京影谱科技股份有限公司 Facial expression recognition method and system based on Gabor filter
CN111178137A (en) * 2019-12-04 2020-05-19 百度在线网络技术(北京)有限公司 Method, device, electronic equipment and computer readable storage medium for detecting real human face

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Illumination invariant face recognition and impostor rejection using different MINACE filter algorithms; Rohit Patnaik, David Casasent; PROCEEDINGS OF SPIE; Vol. 5816; full text *
Face liveness detection based on Gabor wavelets and dynamic LBP; Li Li; Electronics World (Issue 1); 105-107 *
Blur type identification of degraded images based on spectrum analysis; Duan Caiyan, Li Yimin, Pan Xiaolu, Wei Zhiqiang; Journal of Yunnan University (Natural Sciences Edition) (S2); full text *
Detection method for forged face videos fusing global temporal and local spatial features; Chen Peng, Liang Tao, Liu Jin, Dai Jiao, Han Jizhong; Journal of Cyber Security (02); full text *

Also Published As

Publication number Publication date
JP2022553768A (en) 2022-12-26
JP7251000B2 (en) 2023-04-03
CN111723714A (en) 2020-09-29
WO2021249006A1 (en) 2021-12-16

Similar Documents

Publication Publication Date Title
CN111179177B (en) Image reconstruction model training method, image reconstruction method, device and medium
CN109754391B (en) Image quality evaluation method and device and electronic equipment
CN104517265B (en) Intelligent grinding skin method and apparatus
JP2023545565A (en) Image detection method, model training method, image detection device, training device, equipment and program
CN105813548A (en) Process for evaluation of at least one facial clinical sign
CN111723714B (en) Method, device and medium for identifying authenticity of face image
KR102284096B1 (en) System and method for estimating subject image quality using visual saliency and a recording medium having computer readable program for executing the method
CN112991345B (en) Image authenticity detection method and device, computer equipment and storage medium
CN109255358B (en) 3D image quality evaluation method based on visual saliency and depth map
CN103841410B (en) Based on half reference video QoE objective evaluation method of image feature information
CN110084309A (en) Characteristic pattern amplification method, device and equipment and computer readable storage medium
CN110135259A (en) Silent formula living body image identification method, device, computer equipment and storage medium
CN116485741A (en) No-reference image quality evaluation method, system, electronic equipment and storage medium
CN104239883A (en) Textural feature extraction method and device
CN105528616A (en) Face recognition method and device
CN107665488B (en) Stereo image visual saliency extraction method
CN105989000B (en) Audio-video copy detection method and device
CN115880203A (en) Image authenticity detection method and image authenticity detection model training method
CN110570376B (en) Image rain removing method, device, equipment and computer readable storage medium
CN111325252B (en) Image processing method, apparatus, device, and medium
CN111814738A (en) Human face recognition method, human face recognition device, computer equipment and medium based on artificial intelligence
CN109447095B (en) Visual attribute identification method, device and storage medium
CN110428402A (en) Distorted image recognition methods, device, computer equipment and storage medium
CN108470176B (en) Stereo image visual saliency extraction method based on frequency domain sparse representation
CN112907488A (en) Image restoration method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant