CN111144314B - Method for detecting tampered face video - Google Patents

Method for detecting tampered face video

Info

Publication number
CN111144314B
Authority
CN
China
Prior art keywords
frames
feature
face
tampered
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911376257.6A
Other languages
Chinese (zh)
Other versions
CN111144314A (en)
Inventor
张勇东
尚志华
谢洪涛
邓旭冉
李岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhongke Research Institute
University of Science and Technology of China USTC
Original Assignee
Beijing Zhongke Research Institute
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhongke Research Institute and University of Science and Technology of China USTC
Priority to CN201911376257.6A
Publication of CN111144314A
Application granted
Publication of CN111144314B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161: Detection; Localisation; Normalisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; Face representation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172: Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for detecting tampered face video, which comprises the following steps: decoding the face video data into a group of consecutive frame images, cropping the face region of each frame image, and saving each face region as a face picture indexed by its frame number; extracting features from each face picture with a feature extractor to obtain the corresponding feature map; and feeding the feature maps of two consecutive frames into an inter-frame correlation classifier, which fuses the two feature maps together with an attention mechanism and classifies them, the classification result being the probability that the two input frames are tampered. The method exploits both the content of each frame picture and the inter-frame relation between adjacent frames, which benefits detection. Moreover, detection is completed automatically, so the method is suitable for large-scale video platforms and social platforms.

Description

Method for detecting tampered face video
Technical Field
The invention relates to the technical field of cyberspace security, and in particular to a method for detecting tampered face video.
Background
Deep-neural-network-based "face swapping" technology has become very popular: it can quickly replace the face in a video with another person's face, and a growing number of bad actors use it to tamper with videos of politicians, stars and celebrities in order to spread false information. In response to this phenomenon, methods for detecting whether a video has been tampered with have emerged, such as detecting blink frequency or checking noise consistency.
However, the existing methods have poor detection performance and cannot guarantee accurate results; in particular, with the rapid development of forgery techniques, they can no longer meet the requirements of practical applications.
Disclosure of Invention
The invention aims to provide a method for detecting tampered face video with higher detection accuracy.
The object of the invention is achieved by the following technical solution:
a method for detecting a tampered face video comprises the following steps:
decoding the face video data into a group of continuous frame images, intercepting the face area of each frame image, and correspondingly storing the face area as a face picture according to the frame number;
extracting the characteristics of each face picture through a characteristic extractor to obtain a corresponding characteristic graph;
and simultaneously inputting the feature maps of two continuous frames into an interframe correlation classifier, fusing the feature maps of the two frames by adopting an attention mechanism, and classifying, wherein the classification result is the probability that the two input frames are tampered.
It can be seen from the technical solution provided by the invention that the method, based on a deep neural network, exploits both the content of each frame picture and the inter-frame relation between adjacent frames, and therefore achieves a good effect. Moreover, detection is completed automatically, so the method is suitable for large-scale video platforms and social platforms.
Drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for detecting a tampered face video according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an attention module according to an embodiment of the invention;
fig. 3 is a schematic diagram of a classifier according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the invention are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are only a part of the embodiments of the invention, not all of them. All other embodiments obtained by those skilled in the art from the embodiments of the invention without creative effort fall within the protection scope of the invention.
The embodiment of the invention provides a method for detecting tampered face video, which mainly comprises the following steps:
1. and decoding the face video data into a group of continuous frame images, intercepting the face area of each frame image, and correspondingly storing the face area as a face image according to the frame number.
In the embodiment of the invention, the face video data can be decoded into a group of consecutive frame images with the widely used opencv or ffmpeg toolkits, and the face region of each frame image can be cropped with the open-source Dlib library in Python; the face regions in different frame images may have the same or different sizes. A minimal sketch of this step appears below.
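For illustration only, a minimal Python sketch of this step is given below. It assumes opencv's cv2 module and dlib's default frontal face detector; the output directory layout and file names are the sketch's own choices, not prescribed by the invention.

import cv2
import dlib

detector = dlib.get_frontal_face_detector()

def extract_face_pictures(video_path, out_dir):
    cap = cv2.VideoCapture(video_path)       # decode the video into frames
    frame_no = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # dlib expects an RGB image; OpenCV decodes frames as BGR
        faces = detector(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if len(faces) > 0:
            r = faces[0]                     # take the first detected face
            crop = frame[max(r.top(), 0):r.bottom(), max(r.left(), 0):r.right()]
            # save the face picture indexed by its frame number
            cv2.imwrite("%s/%06d.png" % (out_dir, frame_no), crop)
        frame_no += 1
    cap.release()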
2. Extract features from each face picture with a feature extractor to obtain the corresponding feature map.
In the embodiment of the invention, the feature extractor is implemented with an Xception network, which extracts a feature map from each face picture.
The feature extractor can take pictures of any size as input, but the inter-frame correlation classifier requires input of a fixed size, so an adaptive pooling layer is appended to the end of the feature extractor; it divides a feature map of arbitrary size into regions on a uniform grid and averages within each region, thereby producing a feature map of uniform scale.
The scale of the feature map is set to N × N × M, where N × N is the spatial size of the feature and M is the feature-vector dimension at each point of the feature space.
For example, N may be 10, and M may be 2048.
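A sketch of such an extractor follows, with assumptions the text does not fix: the Xception backbone is taken from the third-party timm package, and PyTorch's nn.AdaptiveAvgPool2d plays the role of the adaptive pooling layer that averages within uniform regions.

import torch
import torch.nn as nn
import timm

class FeatureExtractor(nn.Module):
    def __init__(self, n=10):
        super().__init__()
        # num_classes=0 and global_pool="" keep the raw convolutional features
        self.backbone = timm.create_model("xception", pretrained=True,
                                          num_classes=0, global_pool="")
        self.pool = nn.AdaptiveAvgPool2d((n, n))   # uniform-grid averaging

    def forward(self, x):              # x: (batch, 3, H, W), any H and W
        f = self.backbone(x)           # (batch, 2048, h, w)
        return self.pool(f)            # (batch, 2048, n, n): the N x N x M map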
3. Feed the feature maps of two consecutive frames into the inter-frame correlation classifier, which fuses the two feature maps with an attention mechanism and classifies them; the classification result is the probability that the two input frames are tampered.
The preferred embodiment of this step is as follows:
Firstly, the correlation matrix Cor between the two feature maps (denoted feature map A and feature map B) is computed from the similarity between every pair of feature vectors in the two maps: Cor = A × B^T.
The correlation matrices corresponding to feature maps A and B are then obtained by deforming Cor as follows: R_A = reshape(Cor, N × N × N^2), R_B = reshape(Cor^T, N × N × N^2), where reshape(X, SHAPE) denotes converting X to size SHAPE; here X is Cor or Cor^T, X has size N^2 × N^2, and SHAPE = N × N × N^2, with N × N the spatial size of the feature.
The principle of the above step is as follows. Suppose N = 10 and M = 2048, so that each feature map is a three-dimensional 10 × 10 × 2048 matrix, where 10 × 10 is the spatial size and 2048 is the feature-vector dimension. Treating the spatial size (10 × 10) as a single dimension, a feature map can be viewed as a (10 × 10) × 2048 = 100 × 2048 matrix, and the correlation matrix between the two feature maps then has shape (10 × 10) × (10 × 10) = 100 × 100, i.e. it is a two-dimensional matrix. For the subsequent computation, Cor must be deformed into a three-dimensional matrix: the 100 entries of its first dimension are regarded as two dimensions of 10 × 10, i.e. every 10 consecutive indices among the 100 correspond to one row of the spatial grid. After this deformation, the contents of the correlation matrices R_A and R_B are unchanged and keep their positions; only the dimensions are transposed and regrouped.
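A sketch of this computation for one pair of feature maps is given below; A and B are assumed to be flattened to (N^2, M) matrices whose rows are per-location feature vectors, and the similarity is taken to be the dot product implied by Cor = A × B^T.

import torch

def correlation_features(A, B, n):
    # A, B: (n*n, M) matrices of per-location feature vectors
    cor = A @ B.t()                    # Cor = A x B^T, shape (n*n, n*n)
    # deform the two-dimensional Cor into three dimensions: the first
    # dimension n*n is split into the two spatial dimensions n x n
    r_a = cor.reshape(n, n, n * n)     # R_A = reshape(Cor, N x N x N^2)
    r_b = cor.t().reshape(n, n, n * n) # R_B = reshape(Cor^T, N x N x N^2)
    return r_a, r_b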
Secondly, to obtain more discriminative features, R_A and R_B are fed into the attention module to generate the corresponding attention masks M_A and M_B, and then A_T = (M_A + 1) × A and B_T = (M_B + 1) × B are computed. A_T and B_T are then spliced together along the feature dimension into a weighted feature F, which is input to the final classifier; for example, F is a 10 × 10 × 4096 feature map.
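The weighting and splicing can be sketched as below, under two assumptions the text leaves open: the masks M_A and M_B share the shape of A and B, and the × in A_T = (M_A + 1) × A denotes element-wise multiplication.

def fuse(A, B, mask_a, mask_b):
    # the +1 keeps the original features and adds the attended part on top
    a_t = (mask_a + 1) * A             # A_T = (M_A + 1) x A
    b_t = (mask_b + 1) * B             # B_T = (M_B + 1) x B
    # splice along the feature dimension: two 10x10x2048 maps -> 10x10x4096
    return torch.cat([a_t, b_t], dim=-1)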
As shown in fig. 2, the attention module mainly comprises three convolutional layers connected in sequence, each using padding of 1 with a fill value of 0; each convolutional layer is followed by a batch normalization layer, and each batch normalization layer except the one after the last convolutional layer is followed by a ReLU activation layer. The output of the last convolutional layer, after passing through its batch normalization layer, is added to the input correlation matrix R, and the corresponding mask M is then obtained through a ReLU activation layer.
Illustratively, the convolution kernel sizes of the three convolutional layers are set to 1 × 1, 3 × 3 and 1 × 1 in this order. The first 1 × 1 convolutional layer has an input dimension of 2048 and an output dimension of 512; the middle 3 × 3 convolutional layer has input and output dimensions of 512; the final 1 × 1 convolutional layer has an input dimension of 512 and an output dimension of 2048. Its output, after the batch normalization layer, is added to the module input along the feature dimension (i.e. over the 2048 channels), and the attention mask is then obtained through a ReLU activation layer.
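A sketch of this module follows, with two assumptions: padding is chosen to preserve the spatial size (0 for the 1 × 1 convolutions, 1 for the 3 × 3 one) so that the residual addition is well defined, and "batch regularization" is read as batch normalization. The channel sizes are the example values above.

import torch
import torch.nn as nn

class AttentionModule(nn.Module):
    def __init__(self, channels=2048, hidden=512):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1),           # 1x1 conv
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1),  # 3x3 conv
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, kernel_size=1),           # 1x1 conv
            nn.BatchNorm2d(channels),  # no ReLU here: it follows the addition
        )

    def forward(self, r):              # r: (batch, channels, N, N)
        # add the module input back (residual), then ReLU yields the mask M
        return torch.relu(self.body(r) + r)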
As shown in fig. 3, the classifier comprises three convolutional layers connected in sequence and a fully connected layer at the end. After the feature map fusion result of the two frames is input, it passes through the three convolutional layers and then into the fully connected layer, whose output dimension is 1; the probability that the two input frames are tampered is then obtained through a sigmoid function.
Illustratively, the convolution kernel sizes of the three convolutional layers are set to 1 × 1, 3 × 3 and 3 × 3 in this order. The first 1 × 1 convolutional layer has an input dimension of 4096 and an output dimension of 512; the subsequent 3 × 3 convolutional layers have input and output dimensions of 512. Finally, the fully connected layer has an input dimension of 512 and an output dimension of 1.
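A sketch with these example dimensions is given below; because the text gives the fully connected layer an input dimension of 512 while the last convolution outputs a 512-channel map, a global average pool is assumed in between, which the text does not state.

import torch
import torch.nn as nn

class Classifier(nn.Module):
    def __init__(self, in_channels=4096, hidden=512):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_channels, hidden, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.fc = nn.Linear(hidden, 1)  # input dimension 512, output 1

    def forward(self, f):               # f: (batch, 4096, N, N) fused feature
        h = self.convs(f).mean(dim=(2, 3))   # assumed global average pooling
        return torch.sigmoid(self.fc(h)).squeeze(1)  # tampered probability s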
In the embodiment of the invention, the feature extractor and the inter-frame correlation classifier together form a deep neural network; once the network is trained, it can automatically detect whether the face in a video has been tampered with. During training, the loss function is set as:
(The loss function formula is reproduced only as an image in the original document.)
where s is the probability that two frames of the input are tampered with.
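The formula itself survives only as an image; a standard binary cross-entropy over s, with label y = 1 for a tampered pair, is one plausible reading and is sketched here purely as an assumption.

import torch
import torch.nn.functional as F

def pair_loss(s, y):
    # s: predicted tampered probability in (0, 1); y: 1.0 if tampered, else 0.0
    return F.binary_cross_entropy(s, y)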
The invention provides two training modes (distinguished by whether the mean or the maximum is used); either one may be adopted:
the first method comprises the following steps: in the training process, two continuous frames are respectively used as input to calculate loss and are reversely transmitted; after training is finished, for a test video, after every two continuous frames are input, calculating the probability of tampering, finally obtaining K-1 probabilities of tampering, judging whether the test video comes from the tampered video according to the average value of the K-1 probabilities of tampering, and considering that the test video comes from the tampered video when the average value is larger than 50%, wherein K represents the number of frames of the test video.
The second mode: during training, pairs of consecutive frames are used as input and their tampered probabilities are computed; within a batch of training samples (the batch size can be set freely), the loss is computed on the maximum tampered probability and then back-propagated. After training, for a test video, the tampered probability is computed each time a pair of consecutive frames is input, finally yielding K-1 tampered probabilities; whether the test video comes from a tampered video is judged from the maximum of the tampered probabilities, the video being considered tampered when the maximum exceeds 50%.
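Both decision rules reduce to aggregating the K-1 pairwise probabilities and comparing with 50%; a sketch, where mode selects the mean-based or maximum-based variant described above:

def video_is_tampered(pair_probs, mode="mean"):
    # pair_probs: the K-1 tampered probabilities of consecutive frame pairs
    if mode == "mean":
        score = sum(pair_probs) / len(pair_probs)   # first training mode
    else:
        score = max(pair_probs)                     # second training mode
    return score > 0.5                # tampered when the score exceeds 50%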
The technical solution of the embodiment of the invention is based on a deep neural network and exploits both the content of each frame picture and the inter-frame relation between adjacent frames, thereby achieving a better effect. Moreover, detection is completed automatically, so the method is suitable for large-scale video platforms and social platforms.
Through the above description of the embodiments, it is clear to those skilled in the art that the above embodiments can be implemented by software, or by software plus a necessary general hardware platform. With this understanding, the technical solutions of the embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, a USB disk, or a removable hard disk) and includes several instructions for enabling a computer device (such as a personal computer, a server, or a network device) to execute the methods according to the embodiments of the invention.
The above description is only a preferred embodiment of the invention, but the protection scope of the invention is not limited thereto. Any change or substitution that can easily be conceived by those skilled in the art within the technical scope disclosed by the invention shall be covered by the protection scope of the invention. Therefore, the protection scope of the invention shall be subject to the protection scope of the claims.

Claims (6)

1. A method for detecting tampered face video, characterized by comprising:
decoding the face video data into a group of consecutive frame images, cropping the face region of each frame image, and saving each face region as a face picture indexed by its frame number;
extracting features from each face picture with a feature extractor to obtain the corresponding feature map;
feeding the feature maps of two consecutive frames into an inter-frame correlation classifier, which fuses the two feature maps together with an attention mechanism and classifies them, the classification result being the probability that the two input frames are tampered;
wherein fusing the feature maps of the two frames together by the inter-frame correlation classifier with an attention mechanism comprises:
denoting the feature maps of the two consecutive frames as A and B, and computing the correlation matrix Cor between the two feature maps from the similarity between every pair of feature vectors in the two maps: Cor = A × B^T;
obtaining the correlation matrices corresponding to feature maps A and B by deforming Cor as follows: R_A = reshape(Cor, N × N × N^2), R_B = reshape(Cor^T, N × N × N^2), where reshape(X, SHAPE) denotes converting X to size SHAPE; here X is Cor or Cor^T, X has size N^2 × N^2, and SHAPE = N × N × N^2, with N × N the spatial size of the feature;
feeding R_A and R_B into the attention module to generate the corresponding attention masks M_A and M_B, computing A_T = (M_A + 1) × A and B_T = (M_B + 1) × B, and then splicing A_T and B_T together along the feature dimension.
2. The method for detecting tampered face video according to claim 1, wherein the face video data is decoded into a group of consecutive frame images with the widely used opencv or ffmpeg toolkits, and the face region of each frame image is cropped with the open-source Dlib library in Python; the face regions in different frame images may have the same or different sizes.
3. The method for detecting tampered face video according to claim 1, wherein extracting features from each face picture with the feature extractor to obtain the corresponding feature map comprises:
implementing the feature extractor with an Xception network;
appending an adaptive pooling layer to the end of the feature extractor, which divides a feature map of arbitrary size into regions on a uniform grid and averages within each region, thereby producing a feature map of uniform scale;
setting the scale of the feature map to N × N × M, where N × N is the spatial size of the feature and M is the feature-vector dimension at each point of the feature space.
4. The method for detecting tampered face video according to claim 1, wherein the attention module comprises three convolutional layers connected in sequence, each using padding of 1 with a fill value of 0; each convolutional layer is followed by a batch normalization layer, and each batch normalization layer except the one after the last convolutional layer is followed by a ReLU activation layer;
the output of the last convolutional layer, after passing through its batch normalization layer, is added to the input correlation matrix, and the corresponding mask is then obtained through a ReLU activation layer.
5. The method for detecting tampered face video according to claim 1, wherein the feature map fusion result of the two frames is classified by a classifier within the inter-frame correlation classifier; the classifier comprises three convolutional layers connected in sequence and a fully connected layer at the end; after the feature map fusion result of the two frames is input, it passes through the three convolutional layers and then into the fully connected layer, whose output dimension is 1, and the probability that the two input frames are tampered is then obtained through a sigmoid function.
6. The method for detecting tampered face video according to claim 1, wherein the feature extractor and the inter-frame correlation classifier form a deep neural network, and during training the loss function is:
(The loss function formula is reproduced only as an image in the original document.)
wherein s is the probability that the two input frames are tampered;
using any one of the following training modes:
the first mode: during training, each pair of consecutive frames is used as input, and the loss is computed and back-propagated; after training, the tampered probability is computed each time a pair of consecutive frames of the test video is input, finally yielding K-1 tampered probabilities, and whether the test video comes from a tampered video is judged from the mean of the K-1 tampered probabilities, where K denotes the number of frames of the test video;
the second mode: during training, pairs of consecutive frames are used as input and their tampered probabilities are computed; the loss is computed on the maximum tampered probability within a batch of training samples and then back-propagated; after training, for the test video, the tampered probability is computed each time a pair of consecutive frames is input, finally yielding K-1 tampered probabilities, and whether the test video comes from a tampered video is judged from the maximum of the tampered probabilities.
CN201911376257.6A 2019-12-27 2019-12-27 Method for detecting tampered face video Active CN111144314B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911376257.6A CN111144314B (en) 2019-12-27 2019-12-27 Method for detecting tampered face video


Publications (2)

Publication Number Publication Date
CN111144314A (en) 2020-05-12
CN111144314B (en) 2020-09-18

Family

ID=70520954

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911376257.6A Active CN111144314B (en) 2019-12-27 2019-12-27 Method for detecting tampered face video

Country Status (1)

Country Link
CN (1) CN111144314B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113674195A (en) * 2020-05-13 2021-11-19 中国移动通信集团有限公司 Image detection method, device, equipment and storage medium
CN111783608B (en) * 2020-06-24 2024-03-19 南京烽火星空通信发展有限公司 Face-changing video detection method
CN111860414B (en) * 2020-07-29 2023-10-24 中国科学院深圳先进技术研究院 Method for detecting deep video based on multi-feature fusion
CN111986180B (en) * 2020-08-21 2021-07-06 中国科学技术大学 Face forged video detection method based on multi-correlation frame attention mechanism
CN112036356B (en) * 2020-09-09 2024-06-25 北京达佳互联信息技术有限公司 Video detection method, device, equipment and storage medium
CN112749686B (en) * 2021-01-29 2021-10-29 腾讯科技(深圳)有限公司 Image detection method, image detection device, computer equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102567731A (en) * 2011-12-06 2012-07-11 北京航空航天大学 Extraction method for region of interest
CN103034993A (en) * 2012-10-30 2013-04-10 天津大学 Digital video transcode detection method
CN108765405A (en) * 2018-05-31 2018-11-06 北京瑞源智通科技有限公司 A kind of image authenticating method and system
CN109726733A (en) * 2018-11-19 2019-05-07 西安理工大学 A kind of video tamper detection method based on frame-to-frame correlation
CN109934116A (en) * 2019-02-19 2019-06-25 华南理工大学 A kind of standard faces generation method based on generation confrontation mechanism and attention mechanism
CN110457996A (en) * 2019-06-26 2019-11-15 广东外语外贸大学南国商学院 Moving Objects in Video Sequences based on VGG-11 convolutional neural networks distorts evidence collecting method
CN110503076A (en) * 2019-08-29 2019-11-26 腾讯科技(深圳)有限公司 Video classification methods, device, equipment and medium based on artificial intelligence

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104954807B (en) * 2015-06-25 2018-02-23 西安理工大学 The video dubbing altering detecting method of resist geometric attackses
CN107527337B (en) * 2017-08-07 2019-07-09 杭州电子科技大学 A kind of the video object removal altering detecting method based on deep learning
US20190304102A1 (en) * 2018-03-30 2019-10-03 Qualcomm Incorporated Memory efficient blob based object classification in video analytics
US11580203B2 (en) * 2018-04-30 2023-02-14 Arizona Board Of Regents On Behalf Of Arizona State University Method and apparatus for authenticating a user of a computing device
CN110414350A (en) * 2019-06-26 2019-11-05 浙江大学 The face false-proof detection method of two-way convolutional neural networks based on attention model
CN110418129B (en) * 2019-07-19 2021-03-02 长沙理工大学 Digital video interframe tampering detection method and system
CN110414437A (en) * 2019-07-30 2019-11-05 上海交通大学 Face datection analysis method and system are distorted based on convolutional neural networks Model Fusion

Also Published As

Publication number Publication date
CN111144314A (en) 2020-05-12

Similar Documents

Publication Publication Date Title
CN111144314B (en) Method for detecting tampered face video
US11830230B2 (en) Living body detection method based on facial recognition, and electronic device and storage medium
US20200364478A1 (en) Method and apparatus for liveness detection, device, and storage medium
CN107784288B (en) Iterative positioning type face detection method based on deep neural network
CN107273458B (en) Depth model training method and device, and image retrieval method and device
CN112150450B (en) Image tampering detection method and device based on dual-channel U-Net model
Abdulreda et al. A landscape view of deepfake techniques and detection methods
CN111986180B (en) Face forged video detection method based on multi-correlation frame attention mechanism
Yang et al. Spatiotemporal trident networks: detection and localization of object removal tampering in video passive forensics
CN115631112B (en) Building contour correction method and device based on deep learning
CN111325237A (en) Image identification method based on attention interaction mechanism
An Pedestrian Re‐Recognition Algorithm Based on Optimization Deep Learning‐Sequence Memory Model
US9081800B2 (en) Object detection via visual search
CN113392791A (en) Skin prediction processing method, device, equipment and storage medium
CN114724218A (en) Video detection method, device, equipment and medium
Parde et al. Deep convolutional neural network features and the original image
Mahpod et al. Facial landmarks localization using cascaded neural networks
Chen et al. Salbinet360: Saliency prediction on 360 images with local-global bifurcated deep network
Jiang et al. Application of a fast RCNN based on upper and lower layers in face recognition
CN111814846A (en) Training method and recognition method of attribute recognition model and related equipment
CN115294326A (en) Method for extracting features based on target detection grouping residual error structure
Tang et al. An automatic fine-grained violence detection system for animation based on modified faster R-CNN
CN117315752A (en) Training method, device, equipment and medium for face emotion recognition network model
CN111191549A (en) Two-stage face anti-counterfeiting detection method
CN115937596A (en) Target detection method, training method and device of model thereof, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant