CN111241958A - Video image identification method based on residual error-capsule network - Google Patents
- Publication number
- CN111241958A (application CN202010008315.6A)
- Authority
- CN
- China
- Prior art keywords
- capsule
- residual
- video image
- network
- features
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Abstract
The invention discloses a video image identification method based on a residual error-capsule network and belongs to image classification technology in the field of computer vision and image processing. The method constructs a residual error-capsule neural network from three parts: a residual neural network that extracts the potential features of the image, a capsule network that encodes the correspondence between local parts and the whole object, and a decoder that reconstructs the image. It mainly addresses the overfitting and vanishing-gradient problems of convolutional neural networks. The original input is reconstructed from the output vectors of the capsule network, and the model discriminates and classifies according to the degree of match between the reconstruction and the original image, thereby further improving detection performance on forged face images and videos.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a video image identification method based on a residual error-capsule network.
Background
Residual neural networks (ResNets) are easy to optimize, converge quickly, and can gain accuracy from considerably increased depth. Their residual blocks use skip connections, which alleviate the vanishing-gradient problem caused by increasing depth in deep neural networks.
Convolutional neural networks (CNNs) are a class of feedforward neural networks that contain convolution computations and have a deep structure, and they are one of the representative algorithms of deep learning. Their success in object recognition and classification tasks has made them popular in the computer vision community. A CNN is composed of many stacked layers of neurons. Computing convolutions is expensive, so pooling is often used to reduce the size of the network layers. Convolution can learn many complex features of the data through simple calculations. Each artificial neuron responds to the units within a limited surrounding region, which gives CNNs excellent performance on large-scale image processing. Application fields include computer vision and natural language processing.
The traditional convolutional neural network performs well at extracting important features, but it has difficulty attending to the relative relationships (such as position, proportion, orientation, size and skew) between a local part and the whole object, so some important positional information is lost. Classifying correctly while preserving the correspondence between part and whole is a key problem in image classification.
The capsule neural network (CapsNet) is a new deep learning architecture that overcomes these shortcomings of CNNs and is a promising network structure. A capsule represents the various properties of a specific entity in the image, such as position, size, orientation, velocity, hue and texture, as a single logical unit. A routing-by-agreement algorithm is then used: a capsule passes its learned predictions to the capsules of the higher level, and if the predictions agree, the higher-level capsule becomes active. This process is called dynamic routing. As the routing mechanism iterates, the capsules are trained into logical units that learn different concepts. For example, when a neural network is trained to recognize faces, different parts of the face are routed to the capsules that understand eyes, nose, mouth and ears respectively. Compared with the traditional neural network, the capsule network has the following characteristics.
With the development of deep learning, systems face the risk of being attacked with forged faces of legitimate users. A number of techniques for generating high-quality fake images and videos have appeared, of which DeepFake, Face2Face, and GANs and their variants have recently been the most prominent. Abuse of these techniques creates security risks in the financial industry, so identifying forged images and videos is a key link in financial anti-fraud.
Digital media forensics methods are mainly based on texture, motion information and multispectral features. Common forgery detection methods include: analyzing the differences between GAN-generated images and real images; detecting GAN-generated images using co-occurrence matrices; exploiting the color differences between GAN-generated and real images in non-RGB color spaces; detecting DeepFake videos from biological signals, based on detecting blinking in the video; detecting forged videos through the characteristic artifacts left by the mismatch in resolution between the warped face region and its surroundings; detecting forgeries from inconsistencies in head pose; judging the authenticity of an image through foreground-background correlation analysis based on optical flow; judging authenticity from the differences in spectral reflectance between skin and other materials; and extracting motion information of the face region from the video to distinguish real from fake faces based on local patterns of diffusion speed. These methods show, to some extent, the importance of texture features, motion information and multispectral features, but each has disadvantages. Texture features are susceptible to illumination, image resolution and similar factors. Motion information is widely used, but is easily defeated by a low-cost attack in which the attacker cuts out the mouth and eyes of a face photograph and performs the requested actions behind it. Multispectral features place strict requirements on lighting, and multispectral imaging gives a poor user experience and costs more than a visible-light system.
With the development of deep neural networks, forgery detection can instead be based on learned features. Examples include: a two-stream face tampering detection method in which a GoogLeNet and a patch-based triplet network are trained separately and the two streams are used to capture local noise-residual and camera features in order to detect forged faces; a hybrid method that combines a convolutional neural network with a capsule network; a newly proposed CNN-based network for detecting face tampering in videos; a learning-based method for general manipulation detection that does not depend on preselected features or on any preprocessing and automatically learns how to detect various image manipulations; and XceptionNet, a conventional CNN trained on ImageNet that is based on separable convolutions with residual connections. Compared with methods based on texture features, motion information and multispectral features, detection based on learned features is affected by fewer external factors, achieves a higher detection rate, and performs better on forged faces. However, most of these methods authenticate only specific forgery scenarios. The invention therefore provides a neural network structure that can be applied to a variety of forgery scenarios, addresses the vanishing-gradient and overfitting problems in classification, and improves the generalization ability of the identification network.
Disclosure of Invention
To address the problems of overfitting and vanishing gradients in the convolutional neural networks adopted by existing classification methods, the invention provides a video image identification method based on a residual error-capsule network.
To achieve this purpose, the invention adopts the following technical scheme:
A video image identification method based on a residual error-capsule network comprises the following steps:
S1, inputting a video image containing a human face and preprocessing the video image;
S2, extracting potential network features from the face image by adopting a pre-trained residual neural network model to obtain a face potential feature map;
S3, extracting features from the face potential feature map obtained in step S2 by adopting a capsule network model;
S4, reconstructing a feature map from the features extracted in step S3 by adopting a decoder, and identifying and classifying the video image according to the degree of match between the reconstructed visual feature map and the face potential feature map obtained in step S2.
Further, preprocessing the video image containing the human face in step S1 specifically includes: using the Dlib face detection library to locate the face in the video image, and cropping the detected face to a set size.
Further, the residual neural network model in step S2 includes a two-dimensional convolution unit, a pooling unit, a first residual layer and a second residual layer, which are connected in sequence, where the first residual layer includes three residual blocks, the second residual layer includes four residual blocks, and each residual block includes two convolution blocks.
Further, in the step S2, the features in the face image are extracted by using a two-dimensional convolution unit in the residual neural network model, and then the potential features in the face image are extracted sequentially through the first residual layer and the second residual layer.
Further, the capsule network model in step S3 includes extraction capsules and output capsules; the extraction capsules use three parallel feature extraction modules to extract features, stack the extracted features, apply a compression (squash) operation, and finally send the feature information to the output capsules through a dynamic routing algorithm; the output capsules consist of a real capsule and a fake capsule, which serve as classification capsules for distinguishing real from fake.
Further, the feature extraction module comprises a two-dimensional convolution unit, a statistical pool unit and a one-dimensional convolution unit, wherein a two-dimensional normalization unit and a ReLU activation function are arranged behind the two-dimensional convolution unit, and a one-dimensional normalization unit and an output unit are arranged behind the one-dimensional convolution unit.
Further, in the step S4, the decoder performs feature map reconstruction on the extracted features by using a two-layer feedforward neural network, and constructs a residual error-capsule network structure in a fully-connected decoder manner together with the capsule network model.
Further, the feature extraction module comprises two two-dimensional convolution units, wherein a two-dimensional normalization unit and a ReLU activation function are arranged behind the first two-dimensional convolution unit, and a two-dimensional normalization unit and an output unit are arranged behind the second two-dimensional convolution unit.
Further, in the step S4, the decoder performs feature map reconstruction on the extracted features by using three two-dimensional deconvolution units, and constructs a residual error-capsule network structure in a deconvolution decoder mode together with the capsule network model.
Further, the total loss function used to identify and classify the video image is composed of an edge (margin) loss function and a reconstruction loss function, and is expressed as:

$$L_{total} = L_{margin} + \lambda L_{recon}$$

where $L_{margin}$ denotes the edge loss function, $L_{recon}$ the reconstruction loss function, and $\lambda$ the weight of the reconstruction loss. $L_{margin}$ is computed from $T_k$, which denotes class $k$, from $v_k$, the output capsule of class $k$, and from $m^+$ and $m^-$, the scaling coefficients of the positive and negative examples. $L_{recon}$ is computed from $x_{recon}$, the reconstructed input features, and $x$, the input features, where $N$ is the number of input features.
The invention has the following beneficial effects:
(1) a residual neural network, which is more stable than the traditional convolutional neural network, performs the preliminary feature extraction on the preprocessed image, so that the extracted potential features contain more feature points;
(2) by fusing a capsule network, two different reconstruction implementations are constructed, namely a network structure with a fully-connected decoder and a network structure with a deconvolution decoder, which improves forgery detection performance as far as possible;
(3) random Gaussian noise and a compression (squash) operation are added to the routing-by-agreement dynamic routing algorithm of the capsule network, which addresses the overfitting and vanishing-gradient problems;
(4) throughout the image identification process, the feature maps before and after reconstruction are visualized, which makes the network structure easier to understand and adjust during training.
Drawings
FIG. 1 is a schematic flow chart of a video image identification method based on a residual error-capsule network according to the present invention;
FIG. 2 is a schematic diagram of a residual neural network according to an embodiment of the present invention;
FIG. 3 is a diagram of different layer activation states in an embodiment of the present invention;
FIG. 4 is a schematic diagram of a residual error-capsule network structure of a fully-connected decoder approach in an embodiment of the present invention;
FIG. 5 is a schematic diagram of a residual-capsule network structure of a deconvolution decoder approach in an embodiment of the present invention;
FIG. 6 is a flow chart of a dynamic routing algorithm in an embodiment of the present invention;
FIG. 7 is a comparison graph of features before and after reconstruction in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in FIG. 1, an embodiment of the present invention discloses a video image identification method based on a residual error-capsule network, comprising the following steps S1 to S4:
S1, inputting a video image containing a human face and preprocessing the video image;
In this embodiment, after the video image containing the face is input, it is preprocessed as follows:
The Dlib face detection library is used to locate the face in the video image, and the detected face is cropped to a set size; here the face image is scaled to 128 x 128 by a resize operation.
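The crop-and-resize part of step S1 can be sketched as follows. This is a minimal illustration only: face detection itself is omitted, the bounding box is a hypothetical value standing in for the detector output, and the nearest-neighbour resize and the function name `crop_and_resize` are assumptions, not part of the disclosure.

```python
import numpy as np

def crop_and_resize(frame, box, size=128):
    """Crop box=(top, left, bottom, right) from an HxWxC frame and
    resize the crop to size x size with nearest-neighbour sampling."""
    top, left, bottom, right = box
    h, w = frame.shape[:2]
    # Clamp the box to the frame so partially visible faces do not fail.
    top, left = max(0, top), max(0, left)
    bottom, right = min(h, bottom), min(w, right)
    face = frame[top:bottom, left:right]
    # Source row/column index for every output pixel.
    rows = np.arange(size) * face.shape[0] // size
    cols = np.arange(size) * face.shape[1] // size
    return face[rows][:, cols]

# Synthetic frame and a hypothetical face box (stand-in for a detector result).
frame = np.random.default_rng(0).integers(0, 256, size=(360, 640, 3), dtype=np.uint8)
face = crop_and_resize(frame, box=(40, 200, 240, 400))
```

The result has the 128 x 128 spatial size that the embodiment feeds to the residual network.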
S2, extracting potential network features in the face image by adopting a pre-trained residual neural network model to obtain a face potential feature map;
In this embodiment, the first or second residual layer of the pre-trained residual neural network model is used to extract the potential network features, which serve as the input to the capsule network model, as shown in FIG. 2.
The residual neural network model comprises a two-dimensional convolution unit, a pooling unit, a first residual layer and a second residual layer which are sequentially connected, wherein the first residual layer comprises three residual blocks, the second residual layer comprises four residual blocks, and each residual block comprises two convolution blocks.
Features of the face image are first extracted by the two-dimensional convolution unit of the residual neural network model, and the potential features are then extracted by passing sequentially through the first residual layer and the second residual layer.
As shown in FIG. 2, features of the face image are first extracted by a 7x7 two-dimensional convolution; the result then passes sequentially through the first and second residual layers of the residual network, and the extracted potential network features are output as 128 feature maps of size 16x16. In the figure, "3x3 Conv, 64" denotes a Conv2d() function with a 3x3 convolution kernel and 64 output channels, and each residual block consists of two 3x3 two-dimensional convolutions.
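The 16x16 output size can be checked with simple arithmetic. The sketch below assumes the strides and paddings of a standard ResNet34 (stride-2 7x7 convolution, stride-2 3x3 max pooling, stride 1 in the first residual layer, stride 2 at the start of the second); the text does not state these values, so they are assumptions.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 128                           # preprocessed face is 128 x 128
size = conv_out(size, 7, 2, 3)       # 7x7 convolution, stride 2 (assumed)
size = conv_out(size, 3, 2, 1)       # 3x3 max pooling, stride 2 (assumed)
layer1 = conv_out(size, 3, 1, 1)     # first residual layer keeps the resolution
layer2 = conv_out(layer1, 3, 2, 1)   # second residual layer halves it
print(layer1, layer2)                # 32 16
```

Under these assumptions the second residual layer yields 16x16 maps, which agrees with the 128 feature maps of size 16x16 stated above.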
The influence of features extracted from different layers of the residual network on the final detection performance was tested on the DeepFake data set, as shown in Table 1.
TABLE 1 DeepFake data set test results
Layer | Resnet18_Acc | Resnet34_Acc |
Layer1 | 90% | 93% |
Layer2 | 91% | 95% |
Layer3 | 83% | 88% |
Experimental comparison shows that features extracted from the second layer of the residual network give the best detection performance when used as input, and the training results are more stable.
FIG. 3 shows, from left to right, the feature activation maps of Layer1, Layer2 and Layer3 of the ResNet34 neural network. The comparison shows that the first and second layers have more activated features than the third layer; taking the experimental results into account as well, the second layer is selected as the feature extraction layer.
Extracting the potential network features of the face image with the residual neural network model has the following advantages:
(1) Transfer learning reduces the time needed to train the model. The pre-trained model carries knowledge learned by a well-trained network on a large data set, and applying it improves the performance of the detector on a smaller data set, giving higher initial accuracy, faster convergence and higher final accuracy.
(2) Replacing the traditional convolutional neural network with a residual network, on the one hand, avoids losing more positional information in the convolutional network; on the other hand, the residual network can combine features of different resolutions: shallow features tend to have high resolution but low-level semantics, while deep features have high-level semantics but low resolution.
(3) It acts as a regularizer that guides training and reduces overfitting.
S3, extracting the features of the potential face feature map obtained in the step S2 by adopting a capsule network model;
In the present embodiment, the capsule network model includes extraction capsules (Extract Capsule) and output capsules (Output Capsule).
The extraction capsules use three parallel feature extraction modules to extract features. The extracted features are stacked (stack operation), compressed (squash operation), and finally sent to the output capsules through a dynamic routing algorithm.
The output capsules consist of a real capsule (Real Capsule) and a fake capsule (Fake Capsule), which serve as classification capsules for distinguishing real from fake.
The extraction capsules can be implemented either as a fully-connected-type capsule network layer or as a deconvolution-type capsule network layer.
Referring to fig. 4, each feature extraction module in the fully-connected capsule network layer includes a two-dimensional convolution unit, a statistic pool unit and a one-dimensional convolution unit, a two-dimensional normalization unit and a ReLU activation function are arranged behind the two-dimensional convolution unit, and a one-dimensional normalization unit and an output unit are arranged behind the one-dimensional convolution unit.
Each feature extraction module in the extraction capsule first extracts features with a two-dimensional convolution with a 3x3 kernel followed by two-dimensional normalization, then pools the data through the statistical pool (stats pooling), then extracts features with a one-dimensional convolution with a kernel of size 5 followed by one-dimensional normalization, and finally converts the result into a one-dimensional vector output.
The statistical pool layer computes the mean and variance of each filter. It helps make the network independent of the size of the input image, which means that one network structure can be applied to problems with different input sizes without being redesigned.
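The statistical pool described above can be sketched in a few lines. This is an illustrative NumPy version only; the function name `stats_pool` and the array layout (channels, height, width) are assumptions.

```python
import numpy as np

def stats_pool(feature_maps):
    """Map (channels, H, W) to (channels, 2): per-filter mean and variance."""
    flat = feature_maps.reshape(feature_maps.shape[0], -1)
    return np.stack([flat.mean(axis=1), flat.var(axis=1)], axis=1)

# Two inputs with different spatial sizes give outputs of the same shape.
small = stats_pool(np.ones((8, 16, 16)))
large = stats_pool(np.ones((8, 40, 24)))
```

Because only two statistics per filter are kept, the output shape depends on the number of filters and not on the spatial size, which is why the following layers need no redesign when the input size changes.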
Referring to fig. 5, each feature extraction module in the deconvolution type capsule network layer includes two-dimensional convolution units, wherein a two-dimensional normalization unit and a ReLU activation function are arranged behind the first two-dimensional convolution unit, and a two-dimensional normalization unit and an output unit are arranged behind the second two-dimensional convolution unit. By removing the statistics pool, more convolution information is retained.
In the deconvolution-type layer, each feature extraction module in the extraction capsule extracts features with a two-dimensional convolution with a 3x3 kernel followed by two-dimensional normalization; then with a two-dimensional convolution with a 5x5 kernel followed by two-dimensional normalization; then with a two-dimensional convolution with a 3x3 kernel followed by two-dimensional normalization; the final result is converted into a one-dimensional vector output.
In this method, potential features are extracted by the second layer of the residual neural network, giving an output of size 128 x 16 x 16 (128 feature maps of 16 x 16), which is input to the capsule network. Three identical feature extraction modules in the capsule network layer extract features in parallel, the extracted information is stacked along the last dimension, and the compression (squash) operation is applied after stacking. The compression operation scales the length of each vector to between 0 and 1 while keeping its direction. The compression function squash(), in its standard form, is expressed as:

$$v_j = \frac{\lVert s_j \rVert^2}{1 + \lVert s_j \rVert^2}\,\frac{s_j}{\lVert s_j \rVert}$$

where $v_j$ is the vector output of capsule $j$ and $s_j$ is its input.
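As an illustration, the compression function can be written as below. This is a sketch of the standard squash function used in capsule networks; the small constant `eps`, added to avoid division by zero, is an assumption and not part of the disclosure.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Shrink each vector to a length in [0, 1) while keeping its direction."""
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

v = squash(np.array([3.0, 4.0]))   # input length 5, output length 25/26
```

A long input vector keeps its direction and ends up with a length just below 1, while a short input vector is shrunk toward zero.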
In the capsule network of the invention, the connection between the Extract Capsule and the Output Capsule is computed dynamically at run time by a dynamic routing algorithm, and the result is output to the appropriate output capsule. Experiments show that two routing iterations work best. The dynamic routing mechanism determines where information is mainly sent, and the agreement between capsules computed by the dynamic routing algorithm describes the hierarchical pose relationships between object parts well, thereby improving accuracy on visual tasks.
Let $u^{(i)}$ denote the output vector of extraction capsule $i$ (Extract Capsule); among the output capsules (Output Capsule), the Real Capsule is called the true capsule $v^{(1)}$ and the Fake Capsule the pseudo capsule $v^{(2)}$; $W^{(i,j)}$ is the matrix that routes $u^{(i)}$ to $v^{(j)}$, and $r$ is the number of iterations.
The output of the capsule network layer is passed to the next capsule layer through the dynamic routing algorithm. FIG. 6 shows the dynamic routing process that sends information from the Extract Capsule to the Output Capsule, including the parameter updates and iterations of the routing in detail.
In the dynamic routing process, the nonlinear activation function squash() is first applied to the extraction-capsule output $u^{(i)}$, and the result is mapped by $W^{(i,j)}$ to a prediction for output capsule $j$, expressed as $\hat{u}_{j|i} \leftarrow W^{(i,j)}\,\mathrm{squash}(u^{(i)})$ (Eq. 2);
A dropout operation is then applied to $\hat{u}_{j|i}$, expressed as $\hat{u}_{j|i} \leftarrow \mathrm{dropout}(\hat{u}_{j|i})$ (Eq. 4). Applying squash() to the output capsules $u^{(i)}$ and dropout to $\hat{u}_{j|i}$ makes the training process more stable. The compression function scales the length of each vector to at most one.
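The parameters updated during routing are as follows. $W^{(i,j)}$ is updated by the back-propagation algorithm. $b_{ij}$ and $c_{ij}$ are updated according to the routing-by-agreement principle: $b_{ij}$ is initialized to 0 and updated as $b_{ij} \leftarrow b_{ij} + \hat{u}_{j|i} \cdot v^{(j)}$ (Eq. 6), where $\hat{u}_{j|i}$ is the prediction that extraction capsule $i$ makes for output capsule $j$; $c_{ij}$ is updated as $c_{ij} \leftarrow \exp(b_{ij}) / \sum_{k} \exp(b_{ik})$ (Eq. 5), that is, a softmax over the output capsules;
The weighted sum $s_j = \sum_i c_{ij}\,\hat{u}_{j|i}$ is passed through the nonlinear activation function squash() to give the output capsule vector $v^{(j)}$, expressed as $v^{(j)} \leftarrow \mathrm{squash}(s_j)$.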
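The routing procedure described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the update rules follow the standard routing-by-agreement scheme, the tensor shapes, the noise level `noise_std` and the dropout rate `drop_p` are hypothetical values, and noise and dropout would normally be active only during training.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def route(u, W, iterations=2, noise_std=0.01, drop_p=0.05, rng=None):
    """u: (n_in, d_in) extraction-capsule outputs.
    W: (n_in, n_out, d_out, d_in) routing matrices.
    Returns v: (n_out, d_out) output-capsule vectors."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Random Gaussian noise on the routing matrices (assumed scale).
    W_noisy = W + noise_std * rng.standard_normal(W.shape)
    # Predictions u_hat[i, j] = W[i, j] @ squash(u[i]).
    u_hat = np.einsum("ijkl,il->ijk", W_noisy, squash(u))
    # Inverted dropout on the predictions.
    keep = rng.random(u_hat.shape) >= drop_p
    u_hat = u_hat * keep / (1.0 - drop_p)
    b = np.zeros(u_hat.shape[:2])                   # logits b_ij start at 0
    for _ in range(iterations):
        c = softmax(b, axis=1)                      # c_ij over output capsules j
        s = np.einsum("ij,ijk->jk", c, u_hat)       # s_j = sum_i c_ij * u_hat[i, j]
        v = squash(s)                               # v_j = squash(s_j)
        b = b + np.einsum("ijk,jk->ij", u_hat, v)   # agreement update of b_ij
    return v

rng = np.random.default_rng(1)
u = rng.standard_normal((3, 8))          # three extraction capsules (assumed size 8)
W = rng.standard_normal((3, 2, 4, 8))    # routes to the real and fake capsules
v = route(u, W)                          # two iterations, as in the embodiment
```

The default of two iterations mirrors the experimental finding reported in the text; every other numeric value is a placeholder.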
After the routing algorithm, the predicted output $\hat{y}$ is obtained by applying the softmax function to each dimension of the output capsule vectors, rather than simply using the lengths of the output capsules; this gives stronger polarization and adjusts the capsules in a maximizing manner. The final result is the average of all the softmax outputs, computed as:

$$\hat{y} = \frac{1}{m}\sum_{i=1}^{m} \mathrm{softmax}\!\left(\begin{bmatrix} v^{(1)}_i \\ v^{(2)}_i \end{bmatrix}\right)$$

where $m$ is the number of dimensions and $v^{(j)}_i$ is the $i$-th component of output capsule $v^{(j)}$.
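The prediction step can be illustrated with a short NumPy sketch. The function name `predict` and the example capsule values are assumptions chosen only to show the computation.

```python
import numpy as np

def predict(v):
    """v: (2, m) array holding the real and fake capsule vectors.
    Returns the two class scores averaged over the m dimensions."""
    e = np.exp(v - v.max(axis=0, keepdims=True))
    per_dim = e / e.sum(axis=0, keepdims=True)   # softmax over the two classes, per dimension
    return per_dim.mean(axis=1)                  # average over the m dimensions

probs = predict(np.array([[0.9, 0.7, 0.8, 0.6],    # real capsule (example values)
                          [0.1, 0.2, 0.1, 0.3]]))  # fake capsule (example values)
```

The two scores sum to one, and the larger one indicates the predicted class.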
S4: a decoder is used to reconstruct a feature map from the features extracted in step S3, and the video image is identified and classified according to the degree of matching between the reconstructed visual feature map and the potential feature map of the human face obtained in step S2.
In this embodiment, the present invention uses a decoder to reconstruct the input from the final capsules. This forces the network to retain as much information as possible from the network input, acts as an effective regularizer, and reduces the risk of over-fitting.
The decoder part can be implemented in two ways:
(1) Fully-connected decoder: a two-layer feedforward neural network reconstructs the feature map from the extracted features and, together with the capsule network model, forms a residual-capsule network structure with a fully-connected decoder.
The fully-connected decoder applies two linear mappings and finally a sigmoid function.
(2) Deconvolution decoder: three two-dimensional deconvolution units reconstruct the feature map from the extracted features and, together with the capsule network model, form a residual-capsule network structure with a deconvolution decoder.
The deconvolution decoder sequentially applies a 2D deconvolution kernel of size 3x3, a 2D deconvolution kernel of size 6x6, and a 2D deconvolution kernel of size 3x3.
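To see how three deconvolution units with these kernel sizes change the spatial size of a feature map, the following sketch applies the usual output-size relation for a 2D transposed convolution with dilation 1: `out = (in - 1) * stride - 2 * padding + kernel + output_padding`. Only the kernel sizes 3, 6 and 3 come from the text; the strides, paddings and the 8x8 input size are hypothetical values chosen for illustration.

```python
def deconv_output_size(size_in, kernel, stride=1, padding=0, output_padding=0):
    """Spatial output size of a 2D transposed convolution (dilation 1)."""
    return (size_in - 1) * stride - 2 * padding + kernel + output_padding


# Kernel sizes follow the text; stride and padding values are assumptions.
DECODER_LAYERS = [
    {"kernel": 3, "stride": 1, "padding": 1},
    {"kernel": 6, "stride": 2, "padding": 2},
    {"kernel": 3, "stride": 1, "padding": 1},
]


def decoder_output_size(size_in, layers=DECODER_LAYERS):
    """Chain the size relation through every deconvolution unit in order."""
    size = size_in
    for layer in layers:
        size = deconv_output_size(size, **layer)
    return size


if __name__ == "__main__":
    # With the assumed settings an 8x8 map becomes 8x8, then 16x16, then 16x16.
    print(decoder_output_size(8))
```

Under these assumed settings the 3x3 units preserve resolution while the 6x6 unit doubles it, which is one common way to arrange such a decoder.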
Three loss functions are used throughout: the margin loss function (Margin Loss), the reconstruction loss function (Reconstruction Loss), and the total loss (Total Loss), which is the sum of the margin loss and the weighted reconstruction loss, so that the reconstruction loss does not unduly influence the margin loss.
The margin loss function (Margin Loss) is expressed as:
$$L_k = T_k \max(0,\, m^+ - \|v_k\|)^2 + \lambda\,(1 - T_k)\,\max(0,\, \|v_k\| - m^-)^2$$
where $T_k = 1$ indicates that class k is present, and $m^+$ and $m^-$ are the scaling coefficients for positive and negative examples respectively, with $m^+ = 0.9$ and $m^- = 0.1$. $\lambda$ down-weights the loss for classes that do not appear in the picture. The present invention uses $\lambda = 0.5$. The total margin loss is the sum of the losses of all output capsules.
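As a worked illustration of the margin loss formula, the sketch below evaluates $L_k$ for each output capsule from its length $\|v_k\|$ and sums the results. The constants 0.9, 0.1 and 0.5 are the values stated in the text; the function name and the example capsule lengths are hypothetical.

```python
def margin_loss(lengths, targets, m_pos=0.9, m_neg=0.1, lam=0.5):
    """Sum of per-class margin losses.

    lengths[k] is the capsule length ||v_k||, targets[k] is T_k (1 or 0).
    """
    total = 0.0
    for norm, t in zip(lengths, targets):
        present = max(0.0, m_pos - norm) ** 2  # penalises a short capsule for a present class
        absent = max(0.0, norm - m_neg) ** 2   # penalises a long capsule for an absent class
        total += t * present + lam * (1 - t) * absent
    return total


if __name__ == "__main__":
    # Confident, correct prediction: both terms are clipped to zero.
    print(margin_loss([0.95, 0.05], [1, 0]))
    # Undecided prediction: 0.4**2 + 0.5 * 0.4**2, roughly 0.24.
    print(margin_loss([0.5, 0.5], [1, 0]))
```

The second call shows the effect of $\lambda$: the absent class contributes only half as much as the present class for the same distance from its margin.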
The reconstruction loss function (Reconstruction Loss) is expressed as:
The total loss function is thus expressed as:
where $L_{margin}$ is the margin loss function, $L_{recon}$ is the reconstruction loss function, $\lambda$ is the weight of the reconstruction loss function, $T_k$ denotes class k, $v_k$ is the output capsule of class k, $x_{recon}$ is the reconstructed input feature, x is the input feature, N is the number of input features, and $m^+$ and $m^-$ are the scaling coefficients of the positive and negative examples respectively.
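The two formulas are not legible in this text. One reading that is consistent with the symbol definitions is a mean squared error, $L_{recon} = \frac{1}{N}\sum (x_{recon} - x)^2$, combined as $L_{total} = L_{margin} + \lambda L_{recon}$. The sketch below implements that reading only as an assumption; the exact form used by the invention, and the value of the reconstruction weight, are not given here, so `recon_weight` is a required, hypothetical parameter.

```python
def reconstruction_loss(x, x_recon):
    """Assumed form: mean squared error between input and reconstructed features."""
    n = len(x)
    return sum((r - a) ** 2 for a, r in zip(x, x_recon)) / n


def total_loss(l_margin, x, x_recon, recon_weight):
    """Assumed form: margin loss plus weighted reconstruction loss."""
    return l_margin + recon_weight * reconstruction_loss(x, x_recon)


if __name__ == "__main__":
    x = [0.0, 1.0]
    x_recon = [0.5, 0.5]
    # MSE is (0.25 + 0.25) / 2 = 0.25; with weight 0.5 it adds 0.125.
    print(total_loss(0.24, x, x_recon, recon_weight=0.5))
```

A small weight keeps the reconstruction term from dominating the classification objective, which matches the stated aim of limiting its influence on the margin loss.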
The invention visualizes both the reconstructed image output by the capsule network and the input of the capsule network. Comparing the visualized feature maps helps in understanding how the network operates on image features and in adjusting the parameters of the whole network, thereby improving the precision of image classification.
Comparing the feature maps before and after reconstruction shows that the feature points of the reconstructed image are more distinct. Using the reconstruction loss as an effective regularizer forces the network to retain as much information as possible from the network input, which effectively reduces the risk of over-fitting. The visualized feature map of the reconstructed image is shown in fig. 7.
To further illustrate the effectiveness of the method, image classification and original image reconstruction experiments were performed on the Deepfake data set and the Deepfake_Detection data set.
In the experiments, the Deepfake data set is a data set from FaceForensics++ that comprises 977 videos downloaded from YouTube and 1000 extracted original sequences, each containing a face that can be tracked easily. Detailed training data are shown in Table 1.
The deep forgery detection data set Deepfake_Detection, provided by Google and Jigsaw, is currently hosted and contains over 3000 processed videos from 28 different scenes, as detailed in Table 2.
TABLE 1 Deepfake data set
Split | Real (YouTube) | Fake (Deepfake) |
---|---|---|
Train | 4000 | 4000 |
Val | 500 | 500 |
Test | 500 | 500 |
TABLE 2 Deepfake_Detection data set
The model was trained on a PC with a GTX1060 TI graphics card. During training, Adam was used as the optimizer with a learning rate of 0.0005. During testing, the original capsule network was evaluated on the same data sets for comparison. The results show that the network structures with reconstruction outperform the original forgery detection network on both data sets. The test results are shown in Table 3.
TABLE 3 Comparison of test results of different models on different data sets
Model | Deepfake_Acc | Deepfake_Detection_Acc |
---|---|---|
Original | 95.70% | 85.34% |
Fully-connected decoder | 96.30% | 87.53% |
Deconvolution decoder | 98.60% | 88.16% |
Comparing the feature maps of the data sets before and after reconstruction shows that the feature points of the reconstructed images are more distinct. Using the reconstruction loss as an effective regularizer forces the network to retain as much information as possible from the network input, which effectively reduces the risk of over-fitting.
The experimental results show that the complexity of the data set directly affects identification performance. The scenes in the Deepfake data set are simpler than those in the Deepfake_Detection data set provided by Google: the Deepfake data set consists of frontal images of people, so identification performance on it is better. The reconstruction approach outperforms the cross-entropy loss approach to a certain extent; because the model identifies images according to the degree of matching between the reconstruction and the original image, the accuracy of identification and classification is effectively improved.
The invention improves performance on the authenticity identification classification task, solves the over-fitting and vanishing-gradient problems in neural networks, and improves authenticity identification performance by integrating a reconstruction network structure with the capsule network.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.
Claims (10)
1. A video image identification method based on a residual error-capsule network is characterized by comprising the following steps:
S1, inputting a video image containing a human face and preprocessing the video image;
S2, extracting potential features from the face image by using a pre-trained residual neural network model to obtain a face potential feature map;
S3, extracting features from the face potential feature map obtained in step S2 by using a capsule network model;
S4, reconstructing a feature map from the features extracted in step S3 by using a decoder, and identifying and classifying the video image according to the degree of matching between the reconstructed visual feature map and the face potential feature map obtained in step S2.
2. The method for video image discrimination based on the residual error-capsule network of claim 1, wherein the preprocessing of the video image containing the human face in step S1 is specifically as follows: the Dlib face recognition and localization library is used to locate the face in the video image containing the face, and the detected face is cropped to a set size.
3. The method for video image discrimination based on the residual-capsule network according to claim 1, wherein the residual neural network model in the step S2 includes a two-dimensional convolution unit, a pooling unit, a first residual layer and a second residual layer connected in sequence, wherein the first residual layer includes three residual blocks, and the second residual layer includes four residual blocks, each of which includes two convolution blocks.
4. The method for video image discrimination based on residual error-capsule network as claimed in claim 3, wherein said step S2 utilizes a two-dimensional convolution unit in the residual error neural network model to extract features in the face image, and then extracts potential features in the face image sequentially through the first residual layer and the second residual layer.
5. The residual-capsule network-based video image discrimination method according to claim 1, wherein the capsule network model in step S3 includes an extraction capsule and an output capsule; the extraction capsule uses three parallel feature extraction modules to extract features, superimposes the extracted features, performs a compression operation, and finally sends the feature information to the output capsule through a dynamic routing algorithm; the output capsule uses a real capsule and a fake capsule as classification capsules for authenticity identification.
6. The method of claim 5, wherein the feature extraction module comprises a two-dimensional convolution unit followed by a two-dimensional normalization unit and a ReLU activation function, a statistics pool unit, and a one-dimensional convolution unit followed by a one-dimensional normalization unit and an output unit.
7. The method for video image identification based on residual error-capsule network as claimed in claim 6, wherein the decoder performs feature map reconstruction on the extracted features by using two layers of feedforward neural networks in step S4, and constructs a residual error-capsule network structure in a fully connected decoder manner together with the capsule network model.
8. The method of claim 5, wherein the feature extraction module comprises two-dimensional convolution units, wherein a first two-dimensional convolution unit is followed by a two-dimensional normalization unit and a ReLU activation function, and a second two-dimensional convolution unit is followed by a two-dimensional normalization unit and an output unit.
9. The method for video image discrimination based on residual error-capsule network as claimed in claim 8, wherein the decoder performs feature map reconstruction on the extracted features by using three two-dimensional deconvolution units in step S4, and the feature map reconstruction and the capsule network model together construct a residual error-capsule network structure in a deconvolution decoder mode.
10. The method for video image identification based on the residual error-capsule network according to claim 7 or 9, wherein the loss function for identifying and classifying the video image combines the margin loss function and the reconstruction loss function into a total loss function, which is expressed as:
where $L_{margin}$ is the margin loss function, $L_{recon}$ is the reconstruction loss function, $\lambda$ is the weight of the reconstruction loss function, $T_k$ denotes class k, $v_k$ is the output capsule of class k, $x_{recon}$ is the reconstructed input feature, x is the input feature, N is the number of input features, and $m^+$ and $m^-$ are the scaling coefficients of the positive and negative examples respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010008315.6A CN111241958B (en) | 2020-01-06 | 2020-01-06 | Video image identification method based on residual error-capsule network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111241958A (en) | 2020-06-05 |
CN111241958B (en) | 2022-07-22 |
Family
ID=70876020
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010008315.6A Expired - Fee Related CN111241958B (en) | 2020-01-06 | 2020-01-06 | Video image identification method based on residual error-capsule network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111241958B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN111241958B (en) | 2022-07-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20220722 |