CN113011332A - Face counterfeiting detection method based on multi-region attention mechanism - Google Patents
- Publication number
- CN113011332A (publication number); application CN202110295565.7A
- Authority
- CN
- China
- Prior art keywords
- attention
- texture
- map
- region
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/40—Analysis of texture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
- G06V10/464—Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/40—Spoof detection, e.g. liveness detection
Abstract
The invention discloses a face forgery detection method based on a multi-region attention mechanism, which comprises the following steps: inputting a face image to be detected into a convolutional neural network to obtain shallow, middle and deep feature maps; performing a texture enhancement operation on the shallow feature map to obtain a texture feature map; generating multi-region attention maps from the middle feature map through a multi-attention mechanism; performing attention pooling on the texture feature map with the multi-region attention maps to obtain local texture features, and performing attention pooling on the deep feature map with the sum of the attention maps to obtain global features; and fusing the global features with the local texture features before classifying to obtain the face forgery detection result. The method has multiple attention regions, each of which extracts mutually independent features, so that the network attends more closely to local texture information and the accuracy of the detection result is improved.
Description
Technical Field
The invention relates to the technical field of face forgery detection, in particular to a face forgery detection method based on a multi-region attention mechanism.
Background
Face forgery refers to tampering with face regions in media such as images or videos using computer technology, including identity replacement and expression editing; face forgery technology can be applied to post-production in film and television. With the rapid development of deep learning in the field of image generation, generative adversarial networks and auto-encoders have been applied to face forgery, producing forged face pictures or videos that are difficult for the human eye to distinguish, such as DeepFakes, FSGAN and FaceShifter. Many face forgery programs are available on the Internet, so that anyone can synthesize a fake video on a personal computer after simple learning, and such forged videos are now widespread online. If this technology is used for illegal activities such as spreading rumors or fabricating evidence, it can cause serious social harm. Therefore, research on forged face video detection has gained wide attention in academia. The mainstream face forgery detection pipeline currently uses a face detector based on a deep neural network followed by a forged-face classifier.
A neural network model can capture features that effectively distinguish real faces from forged faces by training on a large-scale forged-face dataset. However, a generic image classification network has limitations in the forged-face detection task, especially when detecting compressed video or forgery methods that do not appear in the training set. In computer vision, the attention mechanism is a broad concept; position-based soft attention multiplies each position in a feature map by a weight. However, such a scheme: 1) uses only deep features and ignores texture information; 2) has only one attention region and ignores local features. Therefore, the accuracy of the detection result still needs to be improved.
Disclosure of Invention
The invention aims to provide a face forgery detection method based on a multi-region attention mechanism, which has a plurality of attention regions, wherein each region can extract independent features to enable a network to pay more attention to local texture information, so that the accuracy of a detection result is improved.
The purpose of the invention is realized by the following technical scheme:
a face counterfeiting detection method based on a multi-region attention mechanism comprises the following steps:
inputting a human face image to be detected into a convolutional neural network to obtain a shallow layer characteristic image, a middle layer characteristic image and a deep layer characteristic image;
carrying out texture enhancement operation on the shallow feature map to obtain a texture feature map; generating a multi-region attention map for the intermediate layer characteristic map through a multi-attention mechanism; performing attention pooling on the texture feature map by using the multi-region attention maps to obtain local texture features, and performing attention pooling on the deep feature map after adding the multi-region attention maps to obtain global features; and after the global features and the local texture features are fused, classifying to obtain a face forgery detection result.
According to the technical scheme provided by the invention, on one hand, the method can be used together with the traditional convolutional neural network backbone to improve the accuracy rate in the human face forgery detection task. On the other hand, the input face images are classified by using the local texture features of different regions and the global deep features, so that the accuracy of the classification result can be improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.
Fig. 1 is a network overall structure diagram of a face forgery detection method based on a multi-region attention mechanism according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an attention generating module and an attention pooling module according to an embodiment of the present invention;
fig. 3 is a visualization example of each discriminant region obtained through weak supervised learning according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
Compared with the traditional method, the method provided by the embodiment of the invention has a plurality of attention areas, and each area can extract independent features so that the network pays more attention to local texture information. As shown in fig. 1, the method mainly includes: inputting a human face image to be detected into a convolutional neural network to obtain a shallow layer characteristic image, a middle layer characteristic image and a deep layer characteristic image; carrying out texture enhancement operation on the shallow feature map to obtain a texture feature map; generating a multi-region attention map for the intermediate layer characteristic map through a multi-attention mechanism; performing attention pooling on the texture feature map by using the multi-region attention maps to obtain local texture features, and performing attention pooling on the deep feature map after adding the multi-region attention maps to obtain global features; and after the global features and the local texture features are fused, classifying to obtain a face forgery detection result.
The scheme provided by the embodiment of the invention can be used together with a conventional convolutional neural network backbone to improve the accuracy of the face forgery detection task, and classifies the input face images using the local texture features of different regions together with the global deep features. Compared with the prior art, the method achieves higher accuracy and transferability on various public Deepfake detection datasets.
For ease of understanding, the following detailed description is provided for the above-described aspects of embodiments of the present invention.
I. The overall network and its components.
Fig. 1 shows the entire network structure, mainly including: a backbone of the convolutional neural network, and a texture enhancement module, an attention generation module, an attention pooling module, and a fully-connected classifier. The main introduction is as follows:
1. a convolutional neural network.
The convolutional neural network can be a convolutional neural network used for a face forgery detection task in a traditional scheme, and a main part mainly refers to a feature extraction part. Shallow layer, middle layer and deep layer characteristic maps extracted from the main part of the convolutional neural network are correspondingly input into a texture enhancing module, an attention generating module and an attention pooling module.
It will be appreciated by those skilled in the art that convolutional neural networks are generally constructed as stacks of similar blocks, and "shallow", "middle" and "deep" are relative concepts, since the method of the present invention is not limited to a particular convolutional neural network backbone. As the feature extraction portion is multi-layered, the shallow, middle and deep layers increase gradually in depth: for example, the shallow layer may be the first or second block, the deep layer may be the last or second-to-last block, and the middle layer may be any block between them. In the specific example below, with EfficientNet as the backbone, the shallow layer refers to the second block, the middle layer to the fifth block, and the deep layer to the seventh (i.e., last) block.
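As a minimal illustration of tapping a staged backbone at several depths, the multi-level feature extraction can be sketched as follows. All names here are hypothetical, and simple pooling blocks stand in for real convolutional blocks such as those of EfficientNet:

```python
import numpy as np

def make_block(stride):
    """A stand-in 'block': 2x2 average pooling when stride == 2, identity
    otherwise. A real backbone uses learned convolutional blocks here."""
    def block(x):
        if stride == 2:
            c, h, w = x.shape
            return x[:, : h - h % 2, : w - w % 2].reshape(
                c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))
        return x
    return block

def backbone_with_taps(x, blocks, tap_indices):
    """Run x through the blocks sequentially, recording the output feature
    maps at the requested block indices (shallow / middle / deep taps)."""
    taps = {}
    for i, block in enumerate(blocks):
        x = block(x)
        if i in tap_indices:
            taps[i] = x
    return taps

# Seven blocks, mirroring the second / fifth / seventh EfficientNet blocks
# used in the experiments as the shallow / middle / deep taps.
blocks = [make_block(2), make_block(1), make_block(2), make_block(1),
          make_block(2), make_block(1), make_block(2)]
x = np.random.rand(3, 64, 64)
taps = backbone_with_taps(x, blocks, {1, 4, 6})
```

The shallow tap keeps the highest spatial resolution, which is why the texture enhancement module consumes it rather than a deeper map.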
2. Texture enhancement module.
The texture enhancement module is mainly used for enhancing the sensitivity of the feature map to the texture features of face forgery by extracting the residual error of the convolutional neural network feature map, and can extract and enhance texture information by using densely connected convolutional layers.
It was observed that slight artifacts caused by the counterfeiting method tend to remain in the texture information of the shallow features. Here, the texture information represents high frequency components of the shallow features, similar to residual information of RGB images. Therefore, shallow features should be of interest and enhanced. As shown in fig. 1, the texture enhancing module performs a texture enhancing operation on the shallow feature map, and the main steps include: carrying out local average pooling on the shallow feature map to obtain a non-texture feature map D; and inputting the residual error of the shallow characteristic diagram and the non-texture characteristic diagram D into the densely connected convolution layer to obtain a texture characteristic diagram.
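The two steps above (local average pooling to form the non-texture map D, then processing the residual) can be sketched as follows. This is a NumPy sketch in which an identity matrix stands in for the learned densely connected convolution layers, so `texture_enhance` and its internals are illustrative names, not the patented implementation:

```python
import numpy as np

def local_avg_pool(f, k=2):
    """Average-pool then upsample back to the input size: keeps only the
    low-frequency (non-texture) content of the feature map."""
    c, h, w = f.shape
    pooled = f.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))
    return pooled.repeat(k, axis=1).repeat(k, axis=2)

def texture_enhance(f_shallow):
    """Residual = shallow features minus their local average (the
    non-texture map D); a densely connected conv stack would then refine
    this residual. Here an identity 1x1 'conv' stands in for it."""
    d = local_avg_pool(f_shallow)          # non-texture feature map D
    residual = f_shallow - d               # high-frequency texture residual
    c = f_shallow.shape[0]
    w = np.eye(c)                          # stand-in for learned weights
    return np.einsum('oc,chw->ohw', w, residual)
```

On a locally constant input the residual vanishes, which matches the intuition that the module isolates high-frequency texture.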
Those skilled in the art will appreciate that "dense connection" is a term of art referring to the structure proposed by the DenseNet network, as distinguished from conventional convolutional networks: a conventional convolutional neural network processes its layers serially, whereas in a densely connected structure the input of each layer contains the outputs of all previous layers.
3. An attention generation module.
The difference between real and fake faces usually appears as different features in different facial regions and is not easily captured by a single attention structure. Therefore, the discrimination of the face forgery detection network can be improved by using multi-region attention instead of global average pooling.
As shown in fig. 2, multi-region attention maps are generated from the middle-layer feature map by the attention generation module through a multi-attention mechanism; the attention generation module comprises a convolution layer (specifically, a 1 × 1 convolution layer), a batch normalization layer and a nonlinear activation layer (ReLU) arranged in sequence. The middle-layer feature map passes through the attention generation module to obtain M region attention maps A of size H_t × W_t.
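A minimal sketch of this attention generation module follows, assuming random weights; the per-map normalization here is a simplified stand-in for a trained batch normalization layer, and the function name is illustrative:

```python
import numpy as np

def generate_attention_maps(f_mid, w, gamma, beta):
    """1x1 conv (a per-pixel linear map from C channels to M maps),
    followed by a normalization with learnable affine (gamma, beta)
    and a ReLU, producing M non-negative region attention maps."""
    a = np.einsum('mc,chw->mhw', w, f_mid)            # 1x1 convolution
    mean = a.mean(axis=(1, 2), keepdims=True)
    std = a.std(axis=(1, 2), keepdims=True) + 1e-5
    a = gamma[:, None, None] * (a - mean) / std + beta[:, None, None]
    return np.maximum(a, 0.0)                         # ReLU
```

The ReLU guarantees non-negative maps, which is what lets each map act as a soft spatial mask in the pooling step that follows.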
4. Attention pooling module.
In the embodiment of the invention, the attention pooling module uses a multi-region attention map to pool the texture feature map to obtain local texture features, and uses the multi-region attention map to add and then pool the deep feature map to obtain the global features.
In embodiments of the invention, Bilinear Attention Pooling (BAP) is used instead of global average pooling; BAP is applied to the shallow and deep feature maps to collect texture features from the shallow layer and to retain deeper semantic features. Specifically: if the resolution of the multi-region attention maps does not match that of the texture feature map, the attention maps are first mapped to the same resolution as the texture feature map; then the texture feature map is multiplied by each region attention map respectively to obtain a plurality of partial texture feature maps; finally, global average pooling is performed on all partial texture feature maps, followed by L2 normalization, to obtain the local texture features.
In the embodiment of the invention, a plurality of attention maps are provided, and generally, each attention map has high intensity in a specific area and low intensity in other areas; therefore, after multiplying each region attention map by the texture feature map, the obtained partial texture feature map is the feature map with only the texture information of the corresponding attention region reserved.
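The bilinear attention pooling described above (mask, global-average-pool, L2-normalize) and the global branch can be sketched as follows; both function names are illustrative:

```python
import numpy as np

def bilinear_attention_pool(texture, attn):
    """Bilinear attention pooling: mask the texture feature map with each
    region attention map, global-average-pool each masked map, then
    L2-normalize each resulting feature vector.
    texture: (C, H, W); attn: (M, H, W) -> returns (M, C)."""
    parts = np.einsum('mhw,chw->mchw', attn, texture)  # M partial maps
    local = parts.mean(axis=(2, 3))                    # GAP -> (M, C)
    norms = np.linalg.norm(local, axis=1, keepdims=True) + 1e-12
    return local / norms                               # L2-normalized

def global_feature(deep, attn_sum):
    """Global branch: weight the deep feature map by the summed attention
    maps (already resized to the deep map's resolution) and pool."""
    return (deep * attn_sum[None]).mean(axis=(1, 2))
```

Each row of the returned local-feature matrix corresponds to one attention region, so region features remain separable downstream.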
5. Fully connected classifier.
In the embodiment of the invention, a multilayer full-connection network is used for fusing the global characteristics and the local texture characteristics, classification is carried out based on the fused characteristics, and the output result shows that the human face image to be detected is a real image or a forged image.
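A minimal sketch of the fusion-and-classify step, assuming a single linear layer where the patent uses a multilayer fully connected network; weights and names are hypothetical:

```python
import numpy as np

def classify(global_feat, local_feats, w, b):
    """Fuse the global feature with the flattened per-region local texture
    features by concatenation, then apply a linear layer + softmax to get
    [p_real, p_fake]."""
    fused = np.concatenate([global_feat, local_feats.ravel()])
    logits = w @ fused + b
    e = np.exp(logits - logits.max())   # stable softmax
    return e / e.sum()
```

The argmax of the two probabilities then gives the real/forged decision for the input face image.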
II. Loss function.
In the embodiment of the present invention, the loss function of the network model shown in fig. 1 during training includes two parts: cross entropy loss and region independence loss.
The cross entropy loss is a conventional loss, and can be realized by referring to a conventional technology, so that the detailed description is omitted. The following description will be made mainly in detail with respect to the region independence loss function.
The region independence loss is an auxiliary loss for training the attention generation module. Without label guidance, the attention generation module is prone to a degradation of network performance in which different attention maps tend to focus on the same area, which is detrimental to capturing rich local texture information. In addition, for different input pictures, it is desirable that each attention map locate a fixed semantic area, so as to reduce the randomness of the information captured by each attention map. To achieve the above goals, a region independence loss is proposed, which helps reduce the overlap between attention maps and maintain their consistency across different inputs. It is expressed as:

L_RIL = Σ_{i=1}^{B} Σ_{j=1}^{M} max(||V_i^j − c_t^j|| − m_in(y_i), 0) + Σ_{j≠k} max(m_out − ||c_t^j − c_t^k||, 0)

The first part of the region independence loss L_RIL is an intra-class loss, which pulls each feature V toward its feature center c; the second part is an inter-class loss, which repels the scattered feature centers from one another. Here V_i^j is the local feature of the j-th region of sample i; c_t^j and c_t^k respectively represent the feature centers of the j-th and k-th regions at the t-th update; m_in(y_i) represents the margin between a feature and its corresponding feature center, where y_i is the label (real or forged) of sample i and m_in(y_i) is taken as a whole, different labels having different margins; m_out is the margin between the feature centers; B is the batch size, and M is the number of multi-region attention maps, each region attention map focusing on one region.
the characteristic center c is the sliding average value of the characteristic V, and the updating formula is as follows:
wherein, ct-1、ctRespectively representing the feature centers of the t-1 st and t-th updates, ViFor the feature of sample i, α is the update rate of the feature center, decaying α after each training period.
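Under the definitions above (local features V, feature centers c, margins m_in and m_out), a direct NumPy sketch of the region independence loss and of a batch-wise center update might look like this; the function names are illustrative, and the center update is applied here over the batch mean as one reasonable reading of the moving-average formula:

```python
import numpy as np

def region_independence_loss(v, centers, y, m_in, m_out):
    """v: (B, M, C) local features; centers: (M, C) feature centers;
    y: (B,) labels (0 = real, 1 = forged); m_in: per-label margin dict;
    m_out: margin between feature centers."""
    b, m, _ = v.shape
    intra = 0.0                       # pull features toward their centers
    for i in range(b):
        for j in range(m):
            d = np.linalg.norm(v[i, j] - centers[j])
            intra += max(d - m_in[y[i]], 0.0)
    inter = 0.0                       # push distinct centers apart
    for j in range(m):
        for k in range(m):
            if j != k:
                d = np.linalg.norm(centers[j] - centers[k])
                inter += max(m_out - d, 0.0)
    return intra + inter

def update_centers(centers, v, alpha):
    """Moving-average update: c_t = c_{t-1} + alpha * (mean(V) - c_{t-1})."""
    return centers + alpha * (v.mean(axis=0) - centers)
```

In training, alpha would be decayed after each epoch so the centers stabilize as the attention regions settle.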
For classification, the local texture features and the deep features are concatenated and passed through the fully connected layer to obtain the classification result. In the embodiment of the invention, the overall loss function is optimized using a gradient descent algorithm during training:

L = L_CE + λ · L_RIL

where L_CE is the cross entropy loss of the classifier and λ is a weight balancing the two loss terms.
III. Attention-guided data enhancement scheme.
In order to further separate the different attention maps, attention-guided data enhancement is introduced: during training, Gaussian blur is applied to a selected attention region of the input face image I to generate a data-enhanced face image I′, expressed as:

I′ = I ⊙ (1 − A_k) + G(I) ⊙ A_k

where A_k is the selected region attention map, resized to the input resolution and normalized to [0, 1], G(·) denotes Gaussian blur, and ⊙ is element-wise multiplication.
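A sketch of this attention-guided augmentation follows, with a simple box blur standing in for the Gaussian blur; both function names are illustrative:

```python
import numpy as np

def box_blur(img, k=5):
    """Simple per-channel box blur as a stand-in for Gaussian blur.
    img: (C, H, W)."""
    pad = k // 2
    padded = np.pad(img, ((0, 0), (pad, pad), (pad, pad)), mode='edge')
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out += padded[:, dy:dy + img.shape[1], dx:dx + img.shape[2]]
    return out / (k * k)

def attention_guided_augment(img, attn_k):
    """Blend the blurred image into the original under the selected
    attention map, so only the attended region is degraded:
        I' = I * (1 - A_k) + blur(I) * A_k."""
    a = attn_k / (attn_k.max() + 1e-12)       # normalize mask to [0, 1]
    return img * (1 - a[None]) + box_blur(img) * a[None]
```

Blurring the region a map currently attends to forces the other attention maps to find their own discriminative regions, which is the separation effect the scheme targets.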
In order to demonstrate the effects of the above-described embodiments of the present invention, the following description is made with reference to the experimental results.
In the experiments, EfficientNet-B4 is selected as the backbone network; the second-block feature map (shallow feature map) is the input of the texture enhancement module, the fifth-block feature map (middle feature map) is used to generate the attention maps, and the last feature map before global pooling serves as the deep feature map. The number of attention regions is set to 4.
In the experiments, several public face forgery detection datasets are used, and the present example of the method is trained and tested following the standard protocol. Table 1 shows the accuracy of the method on the high-quality (HQ) and low-quality (LQ) video tests of the FF++ dataset; as the results in Table 1 show, the method achieves higher detection accuracy on high-quality video than other existing methods.
TABLE 1 Accuracy of the method on the FF++ dataset compared with other methods
Table 2 shows the accuracy of the cross-dataset (transfer) test on Celeb-DF v2 after training the method on the FF++ dataset.
Method | FF++ | Celeb-DF
---|---|---
Two-stream | 70.1 | 53.8
Meso4 | 84.7 | 54.8
MesoInception4 | 83 | 53.6
FWA | 80.1 | 56.9
Xception-raw | 99.7 | 48.2
Xception-c23 | 99.7 | 65.3
Xception-c40 | 95.5 | 65.5
Multi-task | 76.3 | 54.3
Capsule | 96.6 | 57.5
DSP-FWA | 93 | 64.6
Two Branch | 93.18 | 73.41
F3-Net | 98.1 | 65.17
EfficientNet-B4 | 99.7 | 64.29
This method | 99.8 | 67.44
TABLE 2 Transferability of this method on the Celeb-DF v2 dataset compared with other methods
Table 3 shows the test performance (Logloss; smaller is better) of the method on the private test set of the DFDC competition, which is better than that of the winning methods of the competition.
Method | Logloss
---|---
Selim Seferbekov | 0.1983
WM | 0.1787
NTechLab | 0.1703
Eighteen Years Old | 0.1882
The Medics | 0.2157
This method | 0.1679
TABLE 3 Logloss of this method on the DFDC test set compared with the DFDC competition-winning methods
The comparison results show that, compared with the prior art, the method achieves higher accuracy and transferability on various public Deepfake detection datasets.
Fig. 3 shows a visualization example of the discriminative regions obtained through weakly supervised learning. Unlike the prior method (Dang et al.), which requires additional information to train the attention, the method provided by the present invention requires no additional information, so the attention can be regarded as obtained through weakly supervised learning; each discriminative region in fig. 3 corresponds to a different attention region.
Through the above description of the embodiments, it is clear to those skilled in the art that the above embodiments can be implemented by software, and can also be implemented by software plus a necessary general hardware platform. With this understanding, the technical solutions of the embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions for enabling a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods according to the embodiments of the present invention.
It will be clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be performed by different functional modules according to needs, that is, the internal structure of the system is divided into different functional modules to perform all or part of the above described functions.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (6)
1. A face forgery detection method based on a multi-region attention mechanism is characterized by comprising the following steps:
inputting a human face image to be detected into a convolutional neural network to obtain a shallow layer characteristic image, a middle layer characteristic image and a deep layer characteristic image;
carrying out texture enhancement operation on the shallow feature map to obtain a texture feature map; generating a multi-region attention map for the intermediate layer characteristic map through a multi-attention mechanism; performing attention pooling on the texture feature map by using the multi-region attention maps to obtain local texture features, and performing attention pooling on the deep feature map after adding the multi-region attention maps to obtain global features; and after the global features and the local texture features are fused, classifying to obtain a face forgery detection result.
2. The method for detecting face forgery based on multi-region attention mechanism as claimed in claim 1, wherein the texture enhancement module performs texture enhancement operation on the shallow feature map to obtain the texture feature map, and the implementation steps include:
performing local average pooling on the shallow feature map to obtain a non-texture feature map;
and inputting the residual error of the shallow characteristic diagram and the non-texture characteristic diagram into the densely connected convolution layer to obtain the texture characteristic diagram.
3. The method for detecting face forgery based on multi-region attention mechanism as claimed in claim 1, wherein the multi-region attention diagram is generated by the attention generation module to the middle layer feature diagram through the multi-attention mechanism; the attention generation module comprises a convolution layer, a batch normalization layer and a nonlinear activation layer which are arranged in sequence.
4. The method for detecting face forgery based on multi-region attention mechanism as claimed in claim 1, wherein the attention pooling for texture feature map is performed using multi-region attention maps to obtain local texture features, and the attention pooling for deep layer feature maps after adding the attention maps to obtain global features is realized by an attention pooling module; wherein:
bilinear attention pooling is adopted when performing attention pooling on the texture feature map using the multi-region attention maps, comprising: if the resolution of the multi-region attention maps does not match the resolution of the texture feature map, mapping the multi-region attention maps to the same resolution as the texture feature map; then multiplying the texture feature map by each region attention map respectively to obtain a plurality of partial texture feature maps; and performing global average pooling on all partial texture feature maps followed by L2 normalization to obtain the local texture features.
5. The method for detecting face forgery based on a multi-region attention mechanism as claimed in claim 2, wherein the loss function of the network model corresponding to the method during training includes two parts: a cross entropy loss and a region independence loss; wherein the region independence loss L_RIL, an auxiliary loss for training the attention generation module, is expressed as:

L_RIL = Σ_{i=1}^{B} Σ_{j=1}^{M} max(||V_i^j − c_t^j|| − m_in(y_i), 0) + Σ_{j≠k} max(m_out − ||c_t^j − c_t^k||, 0)

wherein V_i^j is the local feature of the j-th region of sample i; c_t^j and c_t^k respectively represent the feature centers of the j-th and k-th regions at the t-th update; m_in(y_i) represents the margin between a feature and its corresponding feature center, y_i is the label of sample i, m_in(y_i) is taken as a whole, and different labels have different margins; m_out is the margin between the feature centers; B is the batch size, and M is the number of multi-region attention maps, each region attention map focusing on one region;
the feature center c is the sliding average of the features V, and its update formula is:

c_t = (1 − α)·c_{t−1} + (α/B)·Σ_{i=1}^{B} V_i

wherein c_{t−1} and c_t respectively denote the feature centers of the (t−1)-th and t-th updates, V_i is the feature of sample i, and α is the update rate of the feature center, which is decayed after each training epoch.
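A minimal NumPy sketch of the region independence loss and the sliding-average center update described in claim 5. Function names, the `(B, M, D)` feature layout, and passing `m_in` as a per-label mapping are assumptions for illustration; the loss itself follows the margin terms defined above (pull each regional feature within `m_in(y_i)` of its center, push distinct centers at least `m_out` apart).

```python
import numpy as np

def region_independence_loss(V, centers, labels, m_in, m_out):
    """Sketch of the region independence loss (names assumed).

    V:       (B, M, D) local features, one D-dim vector per region per sample
    centers: (M, D) current feature centers c_t
    labels:  (B,) sample labels y_i; m_in maps a label to its margin,
             so different labels get different margins
    m_out:   margin required between any two feature centers
    """
    B, M, D = V.shape
    # Intra term: pull each regional feature towards its center, up to the margin
    intra = 0.0
    for i in range(B):
        for j in range(M):
            d = np.linalg.norm(V[i, j] - centers[j])
            intra += max(d - m_in[labels[i]], 0.0)
    # Inter term: push centers of different regions apart by at least m_out
    inter = 0.0
    for j in range(M):
        for k in range(M):
            if k != j:
                inter += max(m_out - np.linalg.norm(centers[j] - centers[k]), 0.0)
    return intra + inter

def update_centers(centers, V, alpha):
    """Sliding-average update of the feature centers c (assumed EMA form)."""
    batch_mean = V.mean(axis=0)                       # (M, D) mean over the batch
    return (1 - alpha) * centers + alpha * batch_mean
```

The inter-center term is what forces different attention maps to attend to different regions, since overlapping regions would produce nearby centers and incur a penalty.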
6. The method of claim 1, wherein the method further comprises: introducing attention-guided data augmentation, namely, during training, applying Gaussian blur to a selected attention region of the input face image I to generate an augmented face image I′, expressed as:

I′ = I ⊙ (1 − A_k) + G_σ(I) ⊙ A_k

wherein A_k is the selected attention region, G_σ(·) denotes Gaussian blurring, and ⊙ denotes element-wise multiplication.
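The attention-guided blur of claim 6 can be sketched as a mask-weighted blend between the image and a Gaussian-blurred copy. This is an illustrative NumPy version under assumptions: grayscale input, a soft region mask in [0, 1], a small separable-style kernel built inline, and edge padding; the patent does not fix these details.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 2-D Gaussian kernel."""
    ax = np.arange(size) - size // 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    k = np.outer(g, g)
    return k / k.sum()

def attention_guided_blur(image, region_mask, size=5, sigma=1.0):
    """Blend I' = I*(1 - A) + blur(I)*A for a selected attention region A.

    image:       (H, W) grayscale image
    region_mask: (H, W) attention region mask with values in [0, 1]
    """
    k = gaussian_kernel(size, sigma)
    H, W = image.shape
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    blurred = np.zeros_like(image, dtype=float)
    for y in range(H):
        for x in range(W):
            # Naive convolution: weighted sum of the local window
            blurred[y, x] = (padded[y:y + size, x:x + size] * k).sum()
    # Blur only where the selected attention region is active
    return image * (1 - region_mask) + blurred * region_mask
```

Blurring the region the network currently attends to forces it to find forgery evidence in other regions, which is the stated point of the augmentation.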
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110295565.7A CN113011332A (en) | 2021-03-19 | 2021-03-19 | Face counterfeiting detection method based on multi-region attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113011332A true CN113011332A (en) | 2021-06-22 |
Family
ID=76403104
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110295565.7A Pending CN113011332A (en) | 2021-03-19 | 2021-03-19 | Face counterfeiting detection method based on multi-region attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113011332A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170124432A1 (en) * | 2015-11-03 | 2017-05-04 | Baidu Usa Llc | Systems and methods for attention-based configurable convolutional neural networks (abc-cnn) for visual question answering |
CN106599883A (en) * | 2017-03-08 | 2017-04-26 | Wang Huafeng | Face recognition method capable of extracting multi-level image semantics based on CNN (convolutional neural network)
CN110414414A (en) * | 2019-07-25 | 2019-11-05 | Hefei University of Technology | SAR image ship target discrimination method based on multi-layer deep feature fusion
CN111768415A (en) * | 2020-06-15 | 2020-10-13 | Harbin Engineering University | Image instance segmentation method without quantization pooling
CN111967427A (en) * | 2020-08-28 | 2020-11-20 | Guangdong University of Technology | Fake face video identification method, system and readable storage medium
Non-Patent Citations (1)
Title |
---|
HANQING ZHAO ET AL.: "Multi-attentional Deepfake Detection", arXiv:2103.02406v1 [cs.CV] *
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113485261A (en) * | 2021-06-29 | 2021-10-08 | Northwest Normal University | CAEs-ACNN-based soft measurement modeling method |
CN113485261B (en) * | 2021-06-29 | 2022-06-28 | Northwest Normal University | CAEs-ACNN-based soft measurement modeling method |
CN113537027A (en) * | 2021-07-09 | 2021-10-22 | Institute of Computing Technology, Chinese Academy of Sciences | Face deep forgery detection method and system based on facial segmentation |
CN113537027B (en) * | 2021-07-09 | 2023-09-01 | Institute of Computing Technology, Chinese Academy of Sciences | Face deep forgery detection method and system based on facial segmentation |
CN114842524A (en) * | 2022-03-16 | 2022-08-02 | University of Electronic Science and Technology of China | Face forgery discrimination method based on irregular salient pixel clusters |
CN114842524B (en) * | 2022-03-16 | 2023-03-10 | University of Electronic Science and Technology of China | Face forgery discrimination method based on irregular salient pixel clusters |
CN115471736A (en) * | 2022-11-02 | 2022-12-13 | Zhejiang Juntong Intelligent Technology Co., Ltd. | Forged image detection method and device based on attention mechanism and knowledge distillation |
CN116453199A (en) * | 2023-05-19 | 2023-07-18 | Shandong Artificial Intelligence Institute | GAN-generated face detection method based on forgery traces in complex texture regions |
CN116453199B (en) * | 2023-05-19 | 2024-01-26 | Shandong Artificial Intelligence Institute | GAN-generated face detection method based on forgery traces in complex texture regions |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Qian et al. | Thinking in frequency: Face forgery detection by mining frequency-aware clues | |
Wu et al. | Mantra-net: Manipulation tracing network for detection and localization of image forgeries with anomalous features | |
Chen et al. | A robust GAN-generated face detection method based on dual-color spaces and an improved Xception | |
Ross et al. | Security in smart cities: A brief review of digital forensic schemes for biometric data | |
CN113011332A (en) | Face counterfeiting detection method based on multi-region attention mechanism | |
Zhang et al. | A dense u-net with cross-layer intersection for detection and localization of image forgery | |
Fei et al. | Exposing AI-generated videos with motion magnification | |
Zhang et al. | Face anti-spoofing detection based on DWT-LBP-DCT features | |
Cao et al. | Metric learning for anti-compression facial forgery detection | |
Miao et al. | Learning forgery region-aware and ID-independent features for face manipulation detection | |
Yu et al. | Manipulation classification for jpeg images using multi-domain features | |
Agarwal et al. | Privacy preservation through facial de-identification with simultaneous emotion preservation | |
Liu et al. | Image deblocking detection based on a convolutional neural network | |
Zobaed et al. | Deepfakes: Detecting forged and synthetic media content using machine learning | |
Arora et al. | A review of techniques to detect the GAN-generated fake images | |
Zeng et al. | Occlusion‐invariant face recognition using simultaneous segmentation | |
Ke et al. | DF-UDetector: An effective method towards robust deepfake detection via feature restoration | |
Xu et al. | Facial depth forgery detection based on image gradient | |
Lal et al. | A study on deep fake identification techniques using deep learning | |
Ibsen et al. | Impact of facial tattoos and paintings on face recognition systems | |
Zhao et al. | TAN-GFD: generalizing face forgery detection based on texture information and adaptive noise mining | |
Meena et al. | Image splicing forgery detection techniques: A review | |
Peng et al. | Face morphing attack detection and attacker identification based on a watchlist | |
Zhang et al. | A pyramid attention network with edge information injection for remote sensing object detection | |
Annadani et al. | Augment and adapt: A simple approach to image tampering detection |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20210622 |