CN111339897B - Living body identification method, living body identification device, computer device, and storage medium - Google Patents

Living body identification method, living body identification device, computer device, and storage medium

Info

Publication number
CN111339897B
CN111339897B (application CN202010107870.4A)
Authority
CN
China
Prior art keywords
image
processed
sample
training
living body
Prior art date
Legal status
Active
Application number
CN202010107870.4A
Other languages
Chinese (zh)
Other versions
CN111339897A
Inventor
姚太平
吴双
孟嘉
丁守鸿
李季檩
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010107870.4A
Publication of CN111339897A
Application granted
Publication of CN111339897B
Legal status: Active
Anticipated expiration


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/40 - Spoof detection, e.g. liveness detection
    • G06V 40/45 - Detection of the body part being alive
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 20/00 - Payment architectures, schemes or protocols
    • G06Q 20/38 - Payment protocols; Details thereof
    • G06Q 20/40 - Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q 20/401 - Transaction verification
    • G06Q 20/4014 - Identity check for transactions
    • G06Q 20/40145 - Biometric identity checks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features
    • G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features
    • G06V 10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/467 - Encoded features or binary features, e.g. local binary patterns [LBP]

Abstract

The application relates to a living body identification method, a living body identification device, a computer device and a storage medium. The method comprises the following steps: acquiring an image to be processed, and converting the image to be processed into a first image through a conversion layer of an identification model, wherein the image to be processed and the first image correspond to different attributes, and the attributes comprise a fake image and a non-fake image; extracting features of the image to be processed and the first image through an identification layer of the identification model to obtain a feature map of the image to be processed and a feature map of the first image; determining a residual map between the image to be processed and the first image according to the feature map of the image to be processed and the feature map of the first image; and performing living body identification on the image to be processed based on the residual map to obtain the category of the image to be processed, wherein the category is living body or non-living body. By adopting the method, whether the image to be processed is a living body can be accurately identified.

Description

Living body identification method, living body identification device, computer device, and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a living body identification method, apparatus, computer device, and storage medium.
Background
With the development of computer technology, living body recognition technology has emerged. Living body detection requires a user to perform a corresponding action according to a system instruction, for example blinking, shaking the head, or saying a string of numbers, in order to prevent users from cheating the system with photographs to complete verification in important scenarios. After the user performs the action according to the system instruction, the system performs operations such as face detection, facial landmark localization and action detection to judge whether the user passes living body detection.
However, a malicious user may trick the living body detection system with a video that combines the required actions, resulting in inaccurate living body identification.
Disclosure of Invention
In view of the above, it is necessary to provide a living body recognition method, apparatus, computer device, and storage medium that address the technical problem of inaccurate living body recognition.
In one embodiment, a method of in-vivo identification is provided, the method comprising:
acquiring an image to be processed, and converting the image to be processed into a first image through a conversion layer of an identification model, wherein the image to be processed and the first image correspond to different attributes, and the attributes comprise a fake image and a non-fake image;
extracting features of the image to be processed and the first image through an identification layer of the identification model to obtain a feature map of the image to be processed and a feature map of the first image;
determining a residual error diagram between the image to be processed and the first image according to the feature diagram of the image to be processed and the feature diagram of the first image;
and performing living body identification on the image to be processed based on the residual image to obtain the category of the image to be processed, wherein the category is living body or non-living body.
In one embodiment, there is provided a living body recognition apparatus including:
the conversion module is used for acquiring an image to be processed, converting the image to be processed into a first image through a conversion layer of the identification model, wherein the image to be processed and the first image correspond to different attributes, and the attributes comprise a fake image and a non-fake image;
the extraction module is used for extracting the characteristics of the image to be processed and the first image through the identification layer of the identification model to obtain a characteristic image of the image to be processed and a characteristic image of the first image;
the determining module is used for determining a residual error diagram between the image to be processed and the first image according to the characteristic diagram of the image to be processed and the characteristic diagram of the first image;
the identification module is used for carrying out living body identification on the image to be processed based on the residual image to obtain the category of the image to be processed, wherein the category is living body or non-living body.
In one embodiment, a computer device is provided comprising a memory storing a computer program and a processor implementing the following steps when executing the computer program:
acquiring an image to be processed, and converting the image to be processed into a first image through a conversion layer of an identification model, wherein the image to be processed and the first image correspond to different attributes, and the attributes comprise a fake image and a non-fake image;
extracting features of the image to be processed and the first image through an identification layer of the identification model to obtain a feature map of the image to be processed and a feature map of the first image;
determining a residual error diagram between the image to be processed and the first image according to the feature diagram of the image to be processed and the feature diagram of the first image;
and performing living body identification on the image to be processed based on the residual image to obtain the category of the image to be processed, wherein the category is living body or non-living body.
In one embodiment, a computer readable storage medium is provided having stored thereon a computer program which when executed by a processor performs the steps of:
acquiring an image to be processed, and converting the image to be processed into a first image through a conversion layer of an identification model, wherein the image to be processed and the first image correspond to different attributes, and the attributes comprise a fake image and a non-fake image;
extracting features of the image to be processed and the first image through an identification layer of the identification model to obtain a feature map of the image to be processed and a feature map of the first image;
determining a residual error diagram between the image to be processed and the first image according to the feature diagram of the image to be processed and the feature diagram of the first image;
and performing living body identification on the image to be processed based on the residual image to obtain the category of the image to be processed, wherein the category is living body or non-living body.
According to the living body identification method, the living body identification device, the computer device and the storage medium, the image to be processed is acquired and converted into the first image through the conversion layer of the identification model, where the image to be processed and the first image correspond to different attributes and the attributes comprise a fake image and a non-fake image. Feature extraction is performed on the image to be processed and the first image through the identification layer of the identification model to obtain a feature map of the image to be processed and a feature map of the first image, and a residual map between the two images is determined from these feature maps, so that the difference between the image to be processed and the first image can be determined. Living body identification is then performed on the image to be processed based on the residual map to obtain the category of the image to be processed, the category being living body or non-living body. The user does not need to cooperate by making any facial action, and living body identification can be carried out from a single image, which reduces the detection cost and improves the accuracy of living body identification.
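For readers who want to see the overall flow end to end, the following Python sketch ties the steps above together. It is only an illustration of the described pipeline, not the patented implementation: conversion_layer, extract_feature_map and classify_with_residual are assumed placeholder callables standing in for the conversion layer, the recognition-layer feature extraction and the residual-weighted classification, and the 0.5 threshold is an arbitrary choice.

```python
import numpy as np

def identify_living_body(image, conversion_layer, extract_feature_map,
                         classify_with_residual, threshold=0.5):
    """Sketch of the described flow: convert the image, extract feature maps,
    build a residual map, then classify as living or non-living."""
    # 1. Convert the image to be processed into a first image with the
    #    opposite attribute (fake <-> non-fake) via the conversion layer.
    first_image = conversion_layer(image)

    # 2. Extract feature maps of both images via the recognition layer
    #    (assumed to return numpy arrays of equal shape).
    feat_image = extract_feature_map(image).astype(np.int32)
    feat_first = extract_feature_map(first_image).astype(np.int32)

    # 3. Residual map: per-pixel feature difference between the two maps.
    residual_map = feat_first - feat_image

    # 4. Living-body recognition weighted by the residual map.
    class_probability = classify_with_residual(image, residual_map)
    return "living" if class_probability > threshold else "non-living"
```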
In one embodiment, there is provided a recognition model training method including:
acquiring a training image sample and a category label corresponding to the training image sample, wherein the category label comprises a living body and a non-living body;
converting the training image sample into a first image sample through a conversion layer of the recognition model, wherein the training image sample and the first image sample correspond to different attributes; the attributes include counterfeit images and non-counterfeit images;
extracting features of the training image sample and the first image sample through an identification layer of the identification model to obtain a feature map of the training image sample and a feature map of the first image sample;
determining a residual error map between the training image sample and the first image sample according to the feature map of the training image sample and the feature map of the first image sample;
performing living body recognition on the training image sample based on the residual image to obtain a recognition result of the training image sample;
and adjusting parameters of the recognition model according to the difference between the recognition result of the training image sample and the corresponding class label, and continuing training until a preset condition is met, at which point training stops and a trained recognition model is obtained.
In one embodiment, there is provided an identification model training apparatus, the apparatus comprising:
the system comprises an acquisition module, a classification module and a classification module, wherein the acquisition module is used for acquiring a training image sample and a class label corresponding to the training image sample, and the class label comprises a living body and a non-living body;
the sample conversion module is used for converting the training image sample into a first image sample through a conversion layer of the recognition model, and the training image sample and the first image sample correspond to different attributes; the attributes include counterfeit images and non-counterfeit images;
the feature extraction module is used for extracting features of the training image sample and the first image sample through the recognition layer of the recognition model to obtain a feature map of the training image sample and a feature map of the first image sample;
a residual map module, configured to determine a residual map between the training image sample and the first image sample according to the feature map of the training image sample and the feature map of the first image sample;
the living body identification module is used for carrying out living body identification on the training image sample based on the residual error map to obtain an identification result of the training image sample;
and the adjusting module is used for adjusting the parameters of the recognition model according to the difference between the recognition result of the training image sample and the corresponding class label, and continuing training until a preset condition is met, at which point training stops and a trained recognition model is obtained.
In one embodiment, a computer device is provided comprising a memory storing a computer program and a processor implementing the following steps when executing the computer program:
acquiring a training image sample and a category label corresponding to the training image sample, wherein the category label comprises a living body and a non-living body;
converting the training image sample into a first image sample through a conversion layer of the recognition model, wherein the training image sample and the first image sample correspond to different attributes; the attributes include counterfeit images and non-counterfeit images;
extracting features of the training image sample and the first image sample through an identification layer of the identification model to obtain a feature map of the training image sample and a feature map of the first image sample;
determining a residual error map between the training image sample and the first image sample according to the feature map of the training image sample and the feature map of the first image sample;
performing living body recognition on the training image sample based on the residual image to obtain a recognition result of the training image sample;
and adjusting parameters of the recognition model according to the difference between the recognition result of the training image sample and the corresponding class label, and continuing training until a preset condition is met, at which point training stops and a trained recognition model is obtained.
In one embodiment, a computer readable storage medium is provided having stored thereon a computer program which when executed by a processor performs the steps of:
acquiring a training image sample and a category label corresponding to the training image sample, wherein the category label comprises a living body and a non-living body;
converting the training image sample into a first image sample through a conversion layer of the recognition model, wherein the training image sample and the first image sample correspond to different attributes; the attributes include counterfeit images and non-counterfeit images;
extracting features of the training image sample and the first image sample through an identification layer of the identification model to obtain a feature map of the training image sample and a feature map of the first image sample;
determining a residual error map between the training image sample and the first image sample according to the feature map of the training image sample and the feature map of the first image sample;
performing living body recognition on the training image sample based on the residual image to obtain a recognition result of the training image sample;
and adjusting parameters of the recognition model according to the difference between the recognition result of the training image sample and the corresponding class label, and continuing training until a preset condition is met, at which point training stops and a trained recognition model is obtained.
According to the recognition model training method, the apparatus, the computer device and the storage medium, a training image sample and its corresponding class label are acquired, the class label being living body or non-living body. The training image sample is converted into a first image sample through the conversion layer of the recognition model, the training image sample and the first image sample corresponding to different attributes. Feature extraction is performed on the training image sample and the first image sample through the recognition layer of the recognition model to obtain their feature maps, a residual map between the two samples is determined from these feature maps, and living body recognition is performed on the training image sample based on the residual map to obtain a recognition result. The parameters of the recognition model are adjusted according to the difference between the recognition result and the corresponding class label, and training continues until a preset condition is met, yielding a trained recognition model. With the trained model, living body recognition can be performed from a single image without requiring any facial action from the user, which reduces the detection cost and improves the accuracy of living body recognition.
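Purely as an illustration of the training steps summarized above, a simplified PyTorch-style training loop might look like the sketch below. The loss function, optimizer, learning rate and stopping condition are assumptions made for the example, not the configuration claimed in this application; recognition_model is assumed to take a batch of images and return living-body probabilities.

```python
import torch
import torch.nn as nn

def train_recognition_model(recognition_model, data_loader,
                            max_epochs=10, target_loss=0.05):
    """Sketch: adjust model parameters from the difference between the
    recognition result and the class label until a preset condition is met."""
    criterion = nn.BCELoss()   # measures the result/label difference
    optimizer = torch.optim.Adam(recognition_model.parameters(), lr=1e-4)

    for epoch in range(max_epochs):
        epoch_loss = 0.0
        for images, labels in data_loader:          # labels: 1 = living, 0 = non-living
            probs = recognition_model(images)       # living-body probability per sample
            loss = criterion(probs.view(-1), labels.float().view(-1))
            optimizer.zero_grad()
            loss.backward()                          # propagate the difference
            optimizer.step()                         # adjust model parameters
            epoch_loss += loss.item()
        # preset condition (example): stop once the average loss is small enough
        if epoch_loss / max(len(data_loader), 1) < target_loss:
            break
    return recognition_model
```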
Drawings
FIG. 1 is a diagram of an application environment for a method of living body identification in one embodiment;
FIG. 2 is a flow chart of a method of identifying a living body according to one embodiment;
FIG. 3 is a flowchart illustrating steps performed by an identification layer of an identification model to extract features of an image to be processed and a first image in one embodiment;
FIG. 4 is a flowchart showing steps for determining feature variations between a feature map of an image to be processed and a feature map of a first image in another embodiment;
FIG. 5 is a schematic diagram of generating a residual map between an image to be processed and a first image in one embodiment;
FIG. 6 is a flowchart illustrating steps for performing in-vivo recognition on an image to be processed based on weight values corresponding to each pixel point in a residual map in one embodiment;
FIG. 7 is a schematic diagram of in-vivo detection of an image to be processed in one embodiment;
FIG. 8 is a flow diagram of a method of training a recognition model in one embodiment;
FIG. 9 is a flow diagram of training steps for a generator in a translation layer of a recognition model in one embodiment;
FIG. 10 is a flow diagram of the training steps for the discriminator in the translation layer of the recognition model in one embodiment;
FIG. 11 is a block diagram of a translation layer of an identification model in one embodiment;
FIG. 12 is a diagram of an architecture of an identification model in one embodiment;
FIG. 13 is a block diagram showing the structure of a living body recognition apparatus in one embodiment;
FIG. 14 is a block diagram of an apparatus for training an identification model in one embodiment;
FIG. 15 is an internal structural view of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
The living body identification method provided by the application can be applied to the application environment shown in fig. 1, in which the terminal 102 communicates with the server 104 via a network. The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smartphones, tablet computers and portable wearable devices, and the server 104 may be implemented by a stand-alone server or by a server cluster composed of a plurality of servers.
In this embodiment, the terminal 102 acquires an image to be processed and sends it to the server 104. The server 104 receives the image to be processed, inputs it into the recognition model, and converts it into a first image through the conversion layer of the recognition model, where the image to be processed and the first image correspond to different attributes and the attributes comprise a fake image and a non-fake image. The conversion layer then passes the first image to the recognition layer of the recognition model. Features of the image to be processed and the first image are extracted through the recognition layer to obtain a feature map of the image to be processed and a feature map of the first image. A residual map between the image to be processed and the first image is determined according to the two feature maps. Living body identification is then performed on the image to be processed based on the residual map to obtain the category of the image to be processed, the category being living body or non-living body. The server 104 then returns the category of the image to be processed to the terminal 102. Through this interaction between the terminal 102 and the server 104, the living body detection of the image to be processed is performed on the server side, which saves storage space on the terminal while still allowing the image to be accurately identified as a living body or a non-living body.
It will be appreciated that, in actual use, living face recognition is often used in conjunction with other techniques, such as face-based verification of user identity. Face-based identity verification is already used in a number of services, for example remote bank identity verification, face payment, remote authentication of ride-hailing (Didi) drivers, and community access control systems.
In one embodiment, the application of the living body identification method in the face payment scenario is as follows:
when a user initiates a payment instruction, the terminal acquires face images of the user through the camera and inputs the face images into the recognition model. The conversion layer of the recognition model recognizes whether the face image is an attack image or a real image. In this embodiment, if the image collected by the terminal is a real image, the conversion layer converts the face image into an attack image.
Then, the recognition layer of the recognition model divides the face image and the attack image to obtain each region corresponding to the face image and each region corresponding to the attack image. And calculating the characteristic value corresponding to each region, and obtaining the characteristic map of the face image according to the characteristic value corresponding to each region of the face image. And obtaining a feature map of the attack image according to the feature values corresponding to the areas of the attack image.
Then, the recognition layer of the recognition model determines mutually matched pixel points in the feature map of the face image and the feature map of the attack image, and calculates feature difference values between the mutually matched pixel points. And carrying out normalization processing on each characteristic difference value to obtain each weight value. And generating a residual diagram between the face image and the attack image based on the weight values, wherein each pixel point in the residual diagram corresponds to one weight value.
Next, the face image and the residual map are input into a classification network in the recognition layer. The first convolution layer of the classification network performs convolution processing on the features of each pixel point in the face image to obtain a first feature value corresponding to each pixel point. Corresponding pixel points in the residual map and the face image are then determined, and the weight value corresponding to each pixel point in the residual map is multiplied by the first feature value of the corresponding pixel point in the face image to obtain the second feature value of that pixel point in the face image. The first convolution layer passes the second feature values of the pixel points to the second convolution layer for further convolution processing, and so on, until the output layer produces the class probability of the face image. The class probability is compared with a probability threshold: when the class probability is larger than the probability threshold, the face image is a living face image; when the class probability is less than or equal to the probability threshold, the face image is a non-living face image.
And when the face image is identified as the living face image, the terminal executes payment operation to finish the face payment of the user.
By applying the living face recognition method to the face payment scenario, illegal attacks in attempted transactions can be accurately identified through high-precision living body detection, thereby ensuring transaction security and protecting the interests of both companies and individuals.
In one embodiment, the living body identification method can be applied to the scenario of verifying the identity of a user during bank account opening. The living body recognition method does not necessarily exist in the form of a model and can be stored directly as a living face recognition algorithm. In remote bank account opening, living face detection is required to confirm the true identity of the person opening the account. The general flow is as follows: first, the user captures an image containing a human face through a camera at the front end of the application. The front end transmits the face image to the back end, which invokes the living face recognition algorithm. The living face recognition algorithm performs living face detection and returns the recognition result to the front end. If the face is judged to be a living body, verification passes; otherwise, verification fails. Applying the living body identification method to identity verification during bank account opening prevents malicious users from illegally using other people's identities to handle banking services and effectively protects users' personal information and property.
In one embodiment, the living body identification method in this application can be applied to an access control system. To improve the identity verification efficiency of the access control system, after the front end directly acquires a face image, the face image is sent to a packaged recognition model, which directly judges the face image and returns whether it is a living face. Applying the living face recognition method to the access control system allows the identity information of the user to be recognized quickly and accurately.
It is to be understood that the living body identification method provided in the present application can be applied to any scene where living body identification is required, and is not limited to the above examples.
In one embodiment, as shown in fig. 2, a living body identification method is provided, and the method is applied to the terminal in fig. 1 as an example, and includes the following steps:
Step 202, obtaining an image to be processed, and converting the image to be processed into a first image through a conversion layer of the identification model, wherein the image to be processed and the first image correspond to different attributes, and the attributes comprise a fake image and a non-fake image.
The image to be processed is an image containing a face area, and also can be an image containing the face and parts of the body of the user, such as a face image, an upper body image, a whole body image and the like. The image to be processed may be an RGB (Red, green, blue) image. The recognition model is used to recognize whether the image to be processed is a model of a living body image. The identification model can be applied to a terminal and a server.
A non-fake image refers to a directly captured source image, i.e. a real image. A fake image refers to an image obtained by re-shooting the source image, swapping the face, or compositing it with other images, which changes all or key features of the source image; it is also called an attack image. For example, an image directly collected by a user through a camera is called a source image, while an image obtained by processing the collected image with functions such as face matting, face beautifying or special effects is called a fake image.
In this embodiment, when living face recognition is required for a user's face, after the terminal obtains a user image including a face region, it may detect and frame the region where the user's face is located, expand the framed region by a preset multiple around its center so that more background content of the user image is included, and crop the expanded region to obtain the image to be processed.
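As a simple illustration of this preprocessing step (assuming a face bounding box from any face detector; the expansion factor of 1.5 is an arbitrary example, not a value prescribed by this application):

```python
import numpy as np

def crop_face_with_margin(user_image, face_box, expand=1.5):
    """Expand the detected face box by a preset multiple around its center and
    crop it, so that the image to be processed keeps some background content."""
    x, y, w, h = face_box                      # face box from any face detector
    cx, cy = x + w / 2.0, y + h / 2.0          # center of the framed region
    new_w, new_h = w * expand, h * expand      # expanded size
    top = max(int(cy - new_h / 2), 0)
    left = max(int(cx - new_w / 2), 0)
    bottom = min(int(cy + new_h / 2), user_image.shape[0])
    right = min(int(cx + new_w / 2), user_image.shape[1])
    return user_image[top:bottom, left:right]  # image to be processed
```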
Specifically, the terminal inputs the image to be processed into the recognition model. The conversion layer of the recognition model is able to generate a new image of a different nature than the image to be processed, i.e. the first image.
In this embodiment, when the image to be processed is a source image, the source image is converted into an attack image by the conversion layer of the recognition model. When the image to be processed is an attack image, the attack image is converted into a source image through the conversion layer of the recognition model.
Step 204, extracting features of the image to be processed and the first image through an identification layer of the identification model to obtain a feature map of the image to be processed and a feature map of the first image.
Specifically, the terminal may input the image to be processed into the recognition layer of the recognition model, and perform feature extraction on the image to be processed through the recognition layer to obtain a feature map of the image to be processed. The terminal can input a first image output by the conversion layer of the identification model into the identification layer, and the identification layer is used for extracting the characteristics of the first image to obtain a characteristic diagram of the first image.
In this embodiment, the terminal may perform LBP feature extraction on the image to be processed and the first image through the recognition layer of the recognition model, so as to obtain an LBP feature map corresponding to the image to be processed and an LBP feature map corresponding to the first image. LBP (Local Binary Pattern) is an operator used to describe local texture features of an image.
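For illustration, a plain 3×3 LBP operator of the kind referred to here can be sketched as follows; this is a generic textbook implementation, not necessarily the exact LBP variant used by the recognition layer:

```python
import numpy as np

def lbp_map(gray):
    """Basic 3x3 LBP: compare each pixel's 8 neighbours with the center pixel
    and pack the comparison results into an 8-bit code (the LBP value)."""
    height, width = gray.shape
    codes = np.zeros((height, width), dtype=np.uint8)
    # offsets of the 8 neighbours, walked clockwise from the top-left corner
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for i in range(1, height - 1):
        for j in range(1, width - 1):
            center = gray[i, j]
            code = 0
            for bit, (di, dj) in enumerate(offsets):
                if gray[i + di, j + dj] > center:   # neighbour larger -> mark 1
                    code |= 1 << bit
            codes[i, j] = code                      # decimal LBP code
    return codes
```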
Step 206, determining a residual map between the image to be processed and the first image according to the feature map of the image to be processed and the feature map of the first image.
The residual map is a map for representing a feature difference existing between two images. In this embodiment, the residual map is used to represent the feature difference between the image to be processed and the first image.
Specifically, the terminal may calculate a feature difference between the feature map of the image to be processed and the feature map of the first image, to obtain a residual map.
In this embodiment, the terminal may calculate the feature difference between the LBP feature map corresponding to the image to be processed and the LBP feature map corresponding to the first image to obtain the residual map. Specifically, the feature values of the LBP feature map of the image to be processed are subtracted from the corresponding feature values of the LBP feature map of the first image to obtain the residual map.
Step 208, performing living body recognition on the image to be processed based on the residual map to obtain the category of the image to be processed, wherein the category is living body or non-living body.
The living body detection is to collect and identify an image of a user to detect whether the user is a living body.
Specifically, the terminal uses the residual map and the image to be processed as the input images for the living body recognition process. The terminal performs convolution processing on the residual map and the image to be processed through the identification layer of the identification model, and the positions with feature differences in the image to be processed can be determined through the residual map, so that the identification layer outputs the class probability corresponding to the image to be processed. The category corresponding to the image to be processed is then determined according to the class probability and output.
In this embodiment, when the probability of the category corresponding to the image to be processed is greater than the probability threshold, the category corresponding to the image to be processed is a living body. When the probability of the category corresponding to the image to be processed is smaller than or equal to the probability threshold value, the category corresponding to the image to be processed is a non-living body.
In the living body identification method, the image to be processed is obtained and converted into the first image through the conversion layer of the identification model, where the image to be processed and the first image correspond to different attributes and the attributes comprise a fake image and a non-fake image. Feature extraction is performed on the image to be processed and the first image through the identification layer of the identification model to obtain their feature maps, and the residual map between the two images is determined from these feature maps, so that the difference between the image to be processed and the first image can be determined. Living body identification is then performed on the image to be processed based on the residual map to obtain the category of the image to be processed, the category being living body or non-living body. The user does not need to cooperate by making any facial action, and living body identification can be carried out from a single image, which reduces the detection cost and improves the accuracy of living body identification.
In one embodiment, as shown in fig. 3, the feature extraction is performed on the image to be processed and the first image by the recognition layer of the recognition model to obtain a feature map of the image to be processed and a feature map of the first image, including:
step 302, dividing the image to be processed and the first image by an identification layer of the identification model to obtain each region of the image to be processed and each region of the first image.
Specifically, the terminal inputs the image to be processed and the first image into the recognition layer of the recognition model. Dividing the image to be processed into a plurality of areas through the identification layer to obtain each area corresponding to the image to be processed. And dividing the first image into a plurality of areas through the identification layer to obtain each area corresponding to the first image.
In the present embodiment, the image to be processed and the first image may be divided into the same number of areas in the same manner. Dividing in the same manner means that the areas obtained from the image to be processed are in one-to-one correspondence with the areas of the first image. For example, the image to be processed is divided into a nine-square (3×3) grid, and the first image is divided into the same nine-square grid.
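One straightforward way to divide both images into the same number of corresponding regions is a fixed grid split, sketched below for a nine-square (3×3) grid; the grid size is an example only and is not prescribed by this application.

```python
import numpy as np

def split_into_regions(image, rows=3, cols=3):
    """Split an image into rows x cols regions so that, when applied to both
    the image to be processed and the first image, the regions correspond
    one-to-one."""
    height, width = image.shape[:2]
    regions = []
    for r in range(rows):
        for c in range(cols):
            top, bottom = r * height // rows, (r + 1) * height // rows
            left, right = c * width // cols, (c + 1) * width // cols
            regions.append(image[top:bottom, left:right])
    return regions
```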
Step 304, determining the characteristic values corresponding to the areas in the image to be processed, and determining the characteristic values corresponding to the areas in the first image.
Specifically, the terminal may acquire an area corresponding to the image to be processed, and determine a pixel value of each pixel point in the area. Then, a central pixel point in the region is determined, and the pixel value of the central pixel point is compared with the pixel values of the pixel points around the central pixel point to obtain a comparison result. And determining the characteristic value of the region according to the comparison result. In the same manner, the feature values respectively corresponding to the respective areas in the image to be processed can be determined. In this same manner, the feature values corresponding to the respective regions in the first image can be determined.
In this embodiment, the terminal may perform LBP feature extraction on the image to be processed, and determine the LBP values corresponding to each region in the image to be processed. The terminal can extract LBP characteristics of the first image and determine LBP values corresponding to the areas in the first image.
Step 306, determining a feature map of the image to be processed according to the feature values corresponding to the areas in the image to be processed.
Step 308, determining a feature map of the first image according to the feature values corresponding to the regions in the first image.
Specifically, the feature values corresponding to the regions in the image to be processed represent key feature information of the regions, and the terminal can generate a feature map of the image to be processed according to the key feature information corresponding to the regions in the image to be processed. Similarly, the feature values corresponding to the regions in the first image represent key feature information of the regions, and the terminal can generate a feature map of the first image according to the key feature information corresponding to the regions in the first image.
In this embodiment, the recognition layer of the recognition model is used to divide the image to be processed and the first image to obtain each region of the image to be processed and each region of the first image, determine the feature values corresponding to each region in the image to be processed, and determine the feature values corresponding to each region in the first image, so as to obtain the key feature information of each region in the image. Determining a feature map of the image to be processed according to feature values corresponding to the areas in the image to be processed, determining the feature map of the first image according to the feature values corresponding to the areas in the first image, and generating the feature map according to the key feature information, so that the feature map contains all the key feature information in the image to visually display feature differences between the image to be processed and the first image.
In one embodiment, the determining a residual map between the image to be processed and the first image according to the feature map of the image to be processed and the feature map of the first image includes: determining a feature variation between a feature map of the image to be processed and a feature map of the first image; and generating a residual image between the image to be processed and the first image according to the characteristic variation.
The feature variation refers to feature differences between the same pixel points or matched feature points in the two images. The feature variation includes, but is not limited to, a feature difference between two identical pixel points or a feature difference between matched feature points.
Specifically, the terminal may determine feature point pairs that match each other in the feature map of the image to be processed and the feature map of the first image. Then, the terminal may calculate a feature variation amount between two feature points in each pair of feature points, to obtain a feature variation amount corresponding to each pair of feature points. And generating a residual image according to the feature variation corresponding to each pair of feature points. The residual map represents a characteristic difference between the image to be processed and the first image.
In this embodiment, the terminal obtains a preset number of feature point pairs between the feature map of the image to be processed and the feature map of the first image, where the feature point pairs are feature points matched with each other in the feature map of the image to be processed and the feature map of the first image. For each pair of feature points, feature variation between two feature points is calculated, so that a preset number of feature variation is obtained. Then, a residual map may be generated according to a preset number of feature variations.
In this embodiment, the terminal may determine a pair of pixels that are matched with each other in the feature map of the image to be processed and the feature map of the first image. Then, the terminal can calculate the characteristic variation between two pixel points in each pixel point pair to obtain the characteristic variation corresponding to each pair of pixel points. And generating a residual image according to the characteristic variation corresponding to each pair of pixel points. The residual map represents a characteristic difference between the image to be processed and the first image.
In the present embodiment, by determining the feature variation amount between the feature map of the image to be processed and the feature map of the first image; and generating a residual image between the image to be processed and the first image according to the characteristic variation, so that the characteristic difference between the image to be processed and the first image can be accurately represented by the residual image, and whether the image to be processed is a living body or not can be accurately identified based on the characteristic difference.
In one embodiment, as shown in fig. 4, the determining the feature variation between the feature map of the image to be processed and the feature map of the first image includes:
step 402, determining pixel point pairs between a feature map of an image to be processed and a feature map of a first image.
The pixel point pair refers to pixel points matched with each other in the two images. In this embodiment, the pixel point pair refers to a pixel point that is matched with each other in the feature map of the image to be processed and the feature map of the first image.
Specifically, the terminal may determine the pixel points in the feature map of the image to be processed, and select, in the feature map of the first image, the pixel points that are mutually matched with each pixel point in the feature map of the image to be processed, so as to obtain a pixel point pair.
In this embodiment, the terminal may select a preset number of pixels in the feature map of the image to be processed, and select pixels matching the preset number of pixels in the feature map of the first image, so as to obtain a preset number of pixel pairs.
Step 404, determining a characteristic difference value between two pixel points in the pixel point pairs, and obtaining a characteristic difference value corresponding to each pixel point pair.
The characteristic difference value refers to a difference value between characteristic values corresponding to two pixel points in the pixel point pair. The feature difference value represents a feature difference between two pixels in a pair of pixels.
Specifically, for a pixel point pair, the terminal acquires the feature value corresponding to each pixel point in the pair and calculates the difference between the feature values of the two pixel points, thereby obtaining the feature difference value corresponding to that pixel point pair. In the same manner, the feature difference value corresponding to each pixel point pair can be obtained.
Generating the residual map between the image to be processed and the first image according to the feature variation includes the following step:
Step 406, generating a residual map between the image to be processed and the first image according to the feature difference value corresponding to each pixel point pair.
Specifically, the terminal generates the residual map according to the feature difference value corresponding to each pixel point pair. The residual map represents the feature differences between the mutually matched pixel points in the feature map of the image to be processed and the feature map of the first image, and thereby represents the feature difference between the image to be processed and the first image.
In this embodiment, by determining the pixel point pairs between the feature map of the image to be processed and the feature map of the first image, and determining the feature difference between the two pixel points in each pair, the feature differences between the mutually matched pixel points in the two feature maps are calculated. A residual map is then generated from these feature difference values, so that the feature difference between the image to be processed and the first image can be visually displayed through the residual map.
In one embodiment, the generating a residual map between the to-be-processed image and the first image according to the corresponding feature difference value of each pixel point pair includes: normalizing the characteristic difference value corresponding to each pixel point pair to obtain a weight value corresponding to each pixel point pair; and generating a residual error diagram between the image to be processed and the first image according to the weight value corresponding to each pixel point pair.
Specifically, after obtaining the feature difference value corresponding to each pixel point pair, the terminal normalizes the feature difference values to convert them into the range [0, 255]. Each feature difference value yields a new value after normalization, and this new value is taken as a weight value. The terminal then generates the residual map according to the weight value corresponding to each pixel point pair, with each pixel point in the residual map corresponding to one weight value.
In this embodiment, the feature difference value corresponding to each pixel point pair is normalized to obtain a weight value corresponding to each pixel point pair, and the residual map between the image to be processed and the first image is generated according to these weight values. The residual map can therefore be used as a weight map that marks the positions with feature differences in the image to be processed during recognition, so that the model focuses more on those positions, which improves the accuracy of identifying whether the image to be processed is a living body.
As shown in fig. 5, which is a schematic diagram of generating a residual map between the image to be processed and the first image in one embodiment, attack sample B is the image to be processed, and the terminal converts attack sample B into real sample B_A; attack sample B is a fake image, and real sample B_A is the source image obtained by restoring attack sample B. The terminal then applies an LBP feature extraction algorithm to attack sample B and real sample B_A to obtain the LBP map of B and the LBP map of B_A. The terminal may divide the LBP map of B into a plurality of regions through a 3×3 window (the window size is adjustable and is not limited to 3×3), each region containing at least 9 pixels. Taking the pixel value at the center of the window as a threshold, the pixel values of the 8 adjacent pixels are compared with the central pixel value. When an adjacent pixel value is larger than the central pixel value, the position of that adjacent pixel is marked as 1; otherwise, it is marked as 0. The 8 points within the 3×3 window thus produce an 8-bit binary number, and converting this 8-bit binary number into a decimal number gives an LBP code. The LBP code is the LBP value of the pixel at the center of the window and reflects the texture information of the region. Next, the LBP values of the 8 pixels adjacent to the center pixel may be set to 0. In the same manner, the LBP value of each pixel in the plurality of regions corresponding to the LBP map of B is obtained.

Real sample B_A is processed in the same manner as attack sample B to obtain the LBP value of each pixel in the plurality of regions corresponding to the LBP map of B_A. The terminal may then calculate LBP(B_A) - LBP(B), i.e., determine the mutually matched pixel points in attack sample B and real sample B_A and calculate the difference between their corresponding LBP values. Some differences may be 0, some positive, and some negative. The terminal normalizes these differences so that they fall between 0 and 255, and generates the residual map from the normalized values.
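The procedure of fig. 5 can be summarized in the short sketch below, reusing the lbp_map helper sketched earlier. Min-max scaling is used here as one possible way to normalize the differences to the 0-255 range; the application does not prescribe a particular normalization formula.

```python
import numpy as np

def residual_weight_map(attack_sample_b, real_sample_b_a):
    """Difference of the two LBP maps, normalized to [0, 255] so that it can
    serve as a per-pixel weight map (illustrative only)."""
    lbp_b = lbp_map(attack_sample_b).astype(np.int32)
    lbp_b_a = lbp_map(real_sample_b_a).astype(np.int32)
    diff = lbp_b_a - lbp_b                  # LBP(B_A) - LBP(B); may be negative
    span = diff.max() - diff.min()
    if span == 0:                           # no feature difference at all
        return np.zeros_like(diff, dtype=np.uint8)
    normalized = (diff - diff.min()) * 255.0 / span
    return normalized.astype(np.uint8)      # residual map used as weight map
```

Positions where the two LBP maps agree end up with small weights, while positions where the attack sample differs from its restored source receive large weights.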
In one embodiment, the performing in-vivo recognition on the image to be processed based on the residual map to obtain a category of the image to be processed includes: acquiring weight values corresponding to all pixel points in the residual error map; and performing living body identification on the image to be processed based on the weight value corresponding to each pixel point in the residual error map to obtain the category of the image to be processed.
Specifically, the terminal may input the residual map and the image to be processed into the classification network of the recognition layer. The classification network performs convolution processing on the image to be processed and obtains the weight value corresponding to each pixel point in the residual map. The weight values of the pixel points in the residual map participate in the convolution processing of the image to be processed, emphasizing the positions with feature differences so that these positions become increasingly prominent as the convolution proceeds. Through the successive convolution operations, the recognition model refines the feature differences present in the image to be processed and obtains the class probability from these feature differences. The class probability is then compared with a probability threshold: when the class probability is larger than the probability threshold, the category corresponding to the image to be processed is a living image; when the class probability is smaller than or equal to the probability threshold, the category corresponding to the image to be processed is a non-living image.
In this embodiment, the weight value corresponding to each pixel point in the residual map is obtained, and the living body identification is performed on the image to be processed based on the weight value corresponding to each pixel point in the residual map, so that the residual map is used as a weight map to mark the position with the characteristic difference in the image to be processed in the identification process, so that the identification model is more focused on the position with the characteristic difference, the identification accuracy is improved, and the category corresponding to the image to be processed is accurately obtained.
In one embodiment, as shown in fig. 6, the performing in-vivo recognition on the to-be-processed image based on the weight value corresponding to each pixel point in the residual map to obtain the category of the to-be-processed image includes:
Step 602, performing convolution processing on the image to be processed through the identification layer of the identification model to obtain a first feature value corresponding to each pixel point in the image to be processed.
Specifically, the terminal inputs the image to be processed and the first image into the recognition layer of the recognition model. The convolution layers in the recognition layer perform convolution processing on the image to be processed through the convolution kernel corresponding to each layer, so as to obtain the first feature value corresponding to each pixel point in the image to be processed.
In this embodiment, the terminal inputs the image to be processed and the first image into the first convolution layer of the recognition layer of the recognition model. The convolution kernel in the first convolution layer performs convolution processing on the image to be processed to obtain the first feature value corresponding to each pixel point.
Step 604, determining a second feature value corresponding to each pixel point in the image to be processed according to the weight value corresponding to each pixel point in the residual image and the first feature value corresponding to each pixel point in the image to be processed.
Specifically, the convolution layer in the recognition layer performs convolution processing on the image to be processed through its convolution kernel; after the first feature value corresponding to each pixel point in the image to be processed is obtained, the weight value corresponding to each pixel point in the residual image is acquired. The second feature value corresponding to each pixel point in the image to be processed is then determined from the weight value corresponding to each pixel point in the residual map and the first feature value of the corresponding pixel point in the image to be processed. The second feature value is the feature value output by the convolution layer.
In this embodiment, the weight value corresponding to the pixel point in the residual map is multiplied by the first feature value of the corresponding pixel point in the image to be processed, and the product is the second feature value. Multiplying the weight value corresponding to each pixel point in the residual image by the first characteristic value of the corresponding pixel point in the image to be processed to obtain the second characteristic value corresponding to each pixel point in the image to be processed.
In this embodiment, the convolution kernel in the first convolution layer of the recognition layer performs convolution processing on the image to be processed to obtain the first feature value corresponding to each pixel point. The weight value corresponding to each pixel point in the residual image is then multiplied by the first feature value of the corresponding pixel point in the image to be processed, so as to obtain the second feature value, corresponding to each pixel point in the image to be processed, output by the first convolution layer.
For example, if the weight value corresponding to some pixels in the residual map is 0, after the weight value is multiplied by the first feature value of the corresponding pixels in the image to be processed, the second feature value is 0, which indicates that the pixels with the second feature value of 0 have no feature difference or small difference with the corresponding pixels in the source image. If the weight value corresponding to the pixel point in the residual image is a positive number, the larger the value is, the larger the second characteristic value obtained by multiplying the first characteristic value is. The larger the second characteristic value is, the more obvious characteristic difference exists between the corresponding pixel point and the corresponding pixel point in the source image, so that the pixel point with obvious characteristic difference in the image to be processed can be screened out.
Step 606, determining the category of the image to be processed according to the second feature value corresponding to each pixel point in the image to be processed.
Specifically, the second feature value output by the convolution layer is used as the input of the next convolution layer, and convolution processing is performed on the second feature value corresponding to each pixel point in the image to be processed by that next convolution layer. The features output by each convolution layer serve as the input of the following convolution layer, so that the class probability output by the last convolution layer is obtained. The category of the image to be processed is then determined according to the class probability.
In this embodiment, when the probability of the category corresponding to the image to be processed is greater than the probability threshold, the category corresponding to the image to be processed is a living body. When the probability of the category corresponding to the image to be processed is smaller than or equal to the probability threshold value, the category corresponding to the image to be processed is a non-living body.
In this embodiment, the recognition layer of the recognition model performs convolution processing on the image to be processed to obtain a first feature value corresponding to each pixel point in the image to be processed, and determines a second feature value corresponding to each pixel point in the image to be processed according to the weight value corresponding to each pixel point in the residual image and the first feature value corresponding to each pixel point in the image to be processed, so that the weight value in the residual image is applied to the convolution processing of the image to be processed, and therefore, pixel points with obvious feature differences in the image to be processed can be screened out. And determining the category of the image to be processed according to the second characteristic values corresponding to the pixel points in the image to be processed, so that the residual error map can be applied to living body detection, and whether the image to be processed belongs to a living body can be more accurately identified.
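The weighting scheme of steps 602 to 606 can be illustrated with a short PyTorch sketch. It assumes the residual map has been rescaled to [0, 1] and resized to the spatial size of the first feature map; the module name and layer sizes are illustrative, not taken from the patent.

```python
# Minimal sketch of steps 602-606: convolve the image, weight the resulting
# feature values by the residual map, and classify. Shapes/widths are assumptions.
import torch
import torch.nn as nn

class ResidualWeightedClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.backbone = nn.Sequential(
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(128, 1)

    def forward(self, image, residual_weights):
        first_features = self.conv1(image)                    # step 602: first feature values
        second_features = first_features * residual_weights   # step 604: weight each pixel by the residual map
        x = self.backbone(second_features).flatten(1)          # step 606: further convolutions
        return torch.sigmoid(self.fc(x))                       # class probability (living vs. non-living)
```

The returned probability would then be compared with the probability threshold as described above; pixels whose residual weight is 0 contribute nothing after the multiplication, while strongly weighted pixels dominate the later convolutions.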
In one embodiment, the performing in-vivo recognition on the image to be processed based on the residual map to obtain a category of the image to be processed includes: extracting features of the residual image and the image to be processed to obtain features corresponding to the residual image and features corresponding to the image to be processed; and performing living body identification on the image to be processed based on the characteristics of the residual image and the characteristics of the image to be processed, and obtaining the category of the image to be processed.
Specifically, after the terminal obtains a residual image through an identification layer in the identification model, the residual image and the image to be processed can be subjected to feature extraction to obtain features corresponding to the residual image and features corresponding to the image to be processed. And then, inputting the features corresponding to the residual image and the features corresponding to the image to be processed into a classification network in an identification layer, and carrying out convolution processing on the features corresponding to the residual image and the features corresponding to the image to be processed by the classification network to obtain the class probability of the image to be processed. The class of the image to be processed can be obtained through comparison of the class probability and the probability threshold value, and the recognition layer of the recognition model outputs the class corresponding to the image to be processed.
In this embodiment, the first convolution layer in the classification network performs convolution processing on the features corresponding to the residual image and the features corresponding to the image to be processed to obtain an output feature map. The feature map output by the first convolution layer is then used as the input of the second convolution layer, and in general the feature map output by the previous convolution layer is used as the input of the next convolution layer, until the last convolution layer outputs the class probability of the image to be processed.
In this embodiment, feature extraction is performed on the residual image and the image to be processed to obtain features corresponding to the residual image and features corresponding to the image to be processed, so as to obtain key feature information in the residual image and key feature information in the image to be processed. The living body identification is carried out on the image to be processed based on the characteristics of the residual image and the characteristics of the image to be processed, the category of the image to be processed can be identified based on key characteristic information, the calculated amount is reduced, and the living body detection efficiency is improved.
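A possible arrangement of this embodiment is sketched below, assuming the features of the residual image and of the image to be processed are concatenated along the channel axis before entering the classification network; the channel counts are illustrative assumptions.

```python
# Sketch: fuse residual-map features and image features, then classify.
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    def __init__(self, image_channels=64, residual_channels=64):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Conv2d(image_channels + residual_channels, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, 1), nn.Sigmoid(),   # class probability
        )

    def forward(self, image_features, residual_features):
        # Concatenate the two feature maps on the channel axis and classify
        fused = torch.cat([image_features, residual_features], dim=1)
        return self.classifier(fused)
```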
As shown in fig. 7, a schematic diagram of in-vivo detection of an image to be processed in one embodiment is shown. The attack sample B is the image to be processed. The terminal inputs the attack sample B and the residual image into the first convolution layer for convolution processing; the output feature of the first convolution layer is used as the input of the second convolution layer, and the feature map output by each preceding convolution layer is used as the input of the next convolution layer until the feature map output by the last convolution layer is obtained. The output layer then makes a prediction on the feature map output by the last convolution layer to determine whether the attack sample B belongs to the living body category.
In one embodiment, as shown in fig. 8, there is provided a recognition model training method, including:
Step 802, acquiring a training image sample and category labels corresponding to the training image sample, wherein the category labels comprise living bodies and non-living bodies.
The training image sample is an image containing a face area, and may also be an image containing a face and parts of a user's body, such as a face image, an upper body image, a whole body image, and so on. The training image samples may be RGB (Red, Green, Blue) images. The training image samples include positive sample images and negative sample images. A positive sample image refers to an acquired source image, i.e., a real image. A negative sample image is a fake image, that is, an image obtained by flipping, changing the face of, or combining with other images, which changes all or key features of the source image; it is also called an attack image.
Specifically, the terminal can collect the training image sample by directly shooting the user, can also obtain the training image sample from the local or network, and can determine the label corresponding to the training image sample by manual labeling. The positive and negative sample images in the training image sample do not need to be matched one to one, i.e. one positive and one negative sample image are not images of the same user. The number of positive and negative sample images need not be the same. It will be appreciated that the positive and negative sample images may also be matched one to one. The number of positive and negative sample images may also be the same.
Step 804, converting the training image sample into a first image sample through a conversion layer of the recognition model, wherein the training image sample and the first image sample correspond to different attributes; the attributes include counterfeit images and non-counterfeit images.
Specifically, the terminal inputs training image samples into the constructed recognition model. The conversion layer of the constructed recognition model can generate a new image with different attributes from the training image sample, namely a first image sample.
In this implementation, when the training image sample is a source image, the source image is converted into an attack image by a conversion layer of the recognition model. When the training image sample is an attack image, the attack image is converted into a source image through a conversion layer of the recognition model.
Step 806, extracting features of the training image sample and the first image sample through the recognition layer of the recognition model, so as to obtain a feature map of the training image sample and a feature map of the first image sample.
Specifically, the terminal may input the training image sample into the recognition layer of the constructed recognition model, and perform feature extraction on the training image sample through the recognition layer to obtain a feature map of the training image sample. The terminal can input a first image sample output by the conversion layer of the identification model into the identification layer, and the identification layer is used for extracting the characteristics of the first image sample to obtain a characteristic diagram of the first image sample.
In this embodiment, the terminal may perform LBP feature extraction on the training image sample and the first image sample through the recognition layer of the recognition model, so as to obtain an LBP feature map corresponding to the training image sample and an LBP feature map corresponding to the first image sample.
Step 808, determining a residual map between the training image sample and the first image sample according to the feature map of the training image sample and the feature map of the first image sample.
Specifically, the terminal may calculate a feature difference between the feature map of the training image sample and the feature map of the first image sample, to obtain a residual map.
In this embodiment, the terminal may calculate a feature difference between the LBP feature map corresponding to the training image sample and the LBP feature map corresponding to the first image sample, to obtain a residual map.
And step 810, performing living body recognition on the training image sample based on the residual map to obtain a recognition result of the training image sample.
Specifically, the terminal uses the residual map and the training image sample as input images for the living body recognition process. And the terminal carries out convolution processing on the residual image and the training image sample through an identification layer in the identification model so as to obtain class probability corresponding to the training image sample output by the identification layer. And then, determining the category corresponding to the training image sample according to the category probability, and outputting the recognition result corresponding to the training image sample.
And step 812, adjusting parameters of the recognition model and continuing training according to the difference between the recognition result of the training image sample and the corresponding class label until the preset condition is met, and stopping training to obtain the trained recognition model.
Specifically, the terminal compares the recognition result of the training image sample output by the recognition model with the corresponding class label, and determines the difference between the recognition result and the class label. And adjusting parameters of the recognition model according to the difference between the parameters and the parameters, and continuing training until the preset conditions are met, stopping training, and obtaining the trained recognition model.
In this embodiment, the preset condition is that the difference between the recognition result of the training image sample and the corresponding category label is smaller than a preset difference, or that the loss value output by the recognition model is smaller than a loss threshold. Training stops when either condition is met, and the trained recognition model is obtained.
In this embodiment, a training image sample and a class label corresponding to the training image sample are obtained, the class label includes a living body and a non-living body, the training image sample is converted into a first image sample through a conversion layer of an identification model, and the training image sample and the first image sample correspond to different attributes; the method comprises the steps of performing feature extraction on a training image sample and a first image sample through an identification layer of an identification model to obtain a feature map of the training image sample and a feature map of the first image sample, determining a residual map between the training image sample and the first image sample according to the feature map of the training image sample and the feature map of the first image sample, performing living body identification on the training image sample based on the residual map to obtain an identification result of the training image sample, adjusting parameters of the identification model and continuing training until a preset condition is met according to the identification result of the training image sample and the difference between corresponding class labels, thereby obtaining a trained identification model, enabling living body identification to be performed from a single image through the trained identification model, and avoiding any facial action by a user, thereby reducing the detection cost and improving the accuracy of living body identification.
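The training procedure of steps 802 to 812 may be organized roughly as follows. The sketch assumes a frozen conversion layer, a recognition model that internally builds the residual map from a sample and its converted counterpart, a binary cross-entropy loss, and a loss-threshold stopping rule; all of these are assumptions made for illustration.

```python
# Hypothetical training loop for the recognition layer (steps 802-812).
import torch
import torch.nn as nn

def train_recognition_layer(recognition_model, conversion_layer, loader,
                            epochs=10, loss_threshold=0.05, lr=1e-4):
    criterion = nn.BCELoss()
    optimizer = torch.optim.Adam(recognition_model.parameters(), lr=lr)
    for _ in range(epochs):
        for samples, labels in loader:                      # labels: 1 = living, 0 = non-living
            with torch.no_grad():
                converted = conversion_layer(samples)       # first image samples, opposite attribute
            probs = recognition_model(samples, converted)   # residual map + classification inside the model
            loss = criterion(probs.squeeze(1), labels.float())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            if loss.item() < loss_threshold:                # preset condition: loss below the threshold
                return recognition_model
    return recognition_model
```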
In one embodiment, as shown in FIG. 9, the conversion layer of the recognition model includes a first generator and a second generator; the first generator converts the counterfeit image into a non-counterfeit image, and the second generator converts the non-counterfeit image into a counterfeit image;
the training mode of the generator in the conversion layer of the recognition model comprises the following steps:
Step 902, acquiring a negative sample image from the training image samples, wherein the attribute of the negative sample image is a counterfeit image.
Step 904, converting, by the first generator, the negative sample image into a second image, the negative sample image and the second image having different attributes.
Specifically, the terminal may acquire a negative sample image from the training image sample. The negative sample image is input to a first generator in a conversion layer of the recognition model. The first generator converts the fake image into a non-fake image, and the attributes of the images output by the first generator are all the non-fake images. The negative-sample image is converted into a second image by the first generator, the second image having the property of being a non-counterfeit image.
Step 906, converting the second image into a third image by the second generator, the third image and the negative sample image having the same attribute.
Specifically, the second image output from the first generator is input to the second generator, the second generator converts the non-counterfeit image into a counterfeit image, and the properties of the images output from the second generator are all counterfeit images. The second image is converted into a third image by a second generator. The third image and the negative sample image are fake images, and the attributes of the third image and the negative sample image are the same.
Step 908, determining the similarity of the negative sample image and the third image.
Specifically, after the negative image is converted into the second image, the second image is converted into the third image, and then similar features exist between the negative image and the third image. The terminal calculates the similarity between the negative sample image and the third image. Further, the terminal may determine the feature vector corresponding to the negative sample image and the feature vector corresponding to the third image through the LBP algorithm. And calculating the similarity between the negative sample image and the third image according to the feature vector corresponding to the negative sample image and the feature vector corresponding to the third image.
And step 910, when the similarity of the negative sample image and the third image is smaller than the similarity threshold, adjusting parameters of the first generator and the second generator and continuing training until the training stopping condition is met, and obtaining the trained first generator and second generator.
Specifically, the terminal acquires a similarity threshold, and compares the similarity of the negative sample image and the third image with the similarity threshold. When the similarity between the negative sample image and the third image is smaller than the similarity threshold, the feature similarity between the negative sample image and the third image is not satisfied, namely the performance of the second generator is not good. It also indicates that the performance of the first generator generating the second image is not satisfactory. The terminal adjusts the parameters of the first generator and the second generator and continues training until the first generator and the second generator meet the training stop condition, and the trained first generator and second generator are obtained.
In this embodiment, the training stop condition is that the similarity of the negative sample image and the third image is greater than or equal to the similarity threshold. When the similarity between the negative sample image and the third image is greater than or equal to the similarity threshold, it indicates that the feature similarity between the negative sample image and the third image meets the requirement, i.e. the performance of the second generator meets the requirement. Since the third image is converted from the second image, and the second image is itself produced by the first generator, the second generator meeting the requirement also indicates that the performance of the first generator meets the requirement.
In this embodiment, a negative sample image is obtained from the training image samples, the negative sample image is converted into a second image by the first generator, the negative sample image and the second image having different attributes, and the second image is converted into a third image by the second generator. The third image and the negative sample image have the same attribute, and by determining the similarity between the negative sample image and the third image, it can be determined whether the performance of the second generator meets the requirement, and from that whether the performance of the first generator meets the requirement. When the similarity of the negative sample image and the third image is smaller than the similarity threshold, the parameters of the first generator and the second generator are adjusted and training continues until the training stop condition is met, and the trained first generator and second generator are obtained. Through training, the first generator can convert a fake image into a non-fake image and the second generator can convert a non-fake image into a fake image, so that one image can be converted into multiple images with different attributes; the training image samples can thus be expanded, the data set is enhanced, and the cost of data acquisition is reduced.
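One way to realize the generator training of steps 902 to 910 is a cycle-consistency update, sketched below with an L1 reconstruction loss standing in for the similarity measure; the generator interfaces and the optimizer setup are assumptions, not details given in the patent.

```python
# Sketch of one generator training step (steps 902-910), using an L1
# reconstruction loss as a stand-in for the similarity between the negative
# sample image and the reconstructed third image.
import torch
import torch.nn as nn

def train_generators_step(gen_b_to_a, gen_a_to_b, negative_batch, optimizer):
    second = gen_b_to_a(negative_batch)    # forged -> non-forged (first generator)
    third = gen_a_to_b(second)             # non-forged -> forged (second generator)

    # The closer the reconstructed third image is to the negative sample image,
    # the higher the similarity; the L1 distance here plays that role.
    cycle_loss = nn.functional.l1_loss(third, negative_batch)

    optimizer.zero_grad()
    cycle_loss.backward()
    optimizer.step()
    return cycle_loss.item()
```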
In one embodiment, the conversion layer of the recognition model includes a first generator and a second generator; the first generator converts the counterfeit image into a non-counterfeit image, and the second generator converts the non-counterfeit image into a counterfeit image;
the training mode of the generator in the conversion layer of the recognition model comprises the following steps:
acquiring a positive sample image from a training image sample, wherein the attribute of the positive sample image is a non-fake image;
converting, by the second generator, the positive sample image into a fourth image, the positive sample image and the fourth image differing in attribute;
converting the fourth image into a fifth image by the first generator, wherein the attribute of the fifth image is the same as that of the positive sample image;
determining the similarity of the positive sample image and the fifth image;
and when the similarity of the positive sample image and the fifth image is smaller than a similarity threshold, adjusting parameters of the first generator and the second generator, and continuing training until the training stopping condition is met, and obtaining the trained first generator and second generator.
It will be appreciated that the generator in the conversion layer of the recognition model may also be trained using positive sample images in this way. The principle of training the generator with positive sample images may refer to the process from step 902 to step 910 and is not described again here.
In one embodiment, as shown in FIG. 10, the conversion layer of the recognition model also includes a discriminator; the training mode of the discriminator in the conversion layer of the recognition model comprises the following steps:
step 1002, a positive sample image is obtained from a training image sample, where the positive sample image has a non-counterfeit attribute.
In step 1004, the identifier identifies the second image and the positive sample image, and determines the attribute identification result corresponding to the second image and the positive sample image.
Wherein the discriminator is for discriminating an image as a counterfeit image or a non-counterfeit image. I.e. the discriminator is used to discriminate the properties of the image.
In particular, the terminal may acquire a positive sample image from the training image, the positive sample image being a non-counterfeit image. Next, the terminal inputs the positive sample image and the second image into a discriminator in the conversion layer of the recognition model. The discriminator discriminates the positive sample image and the second image, and outputs an attribute identification result of the positive sample image and an attribute identification result of the second image. The second image is a non-counterfeit image converted from the negative-sample image, and the untrained discriminator may recognize the second image as a counterfeit image.
And step 1006, when the attribute recognition results corresponding to the second image and the positive sample image are different, adjusting parameters of the discriminator and continuing training until the training stopping condition is met, and obtaining the trained discriminator.
Specifically, when the attribute recognition result of the second image output by the discriminator is different from the attribute recognition result corresponding to the positive sample image, it is indicated that the discrimination capability of the discriminator does not meet the requirement. For example, the attribute recognition result of the second image output by the discriminator is a counterfeit image, and the attribute recognition result corresponding to the output positive sample image is a non-counterfeit image. The terminal adjusts the parameters of the discriminator and continues training until the training stop condition is satisfied, thereby obtaining a trained discriminator.
In the present embodiment, the discriminator training stop condition is that the attribute recognition results of the second image and the positive sample image are both non-counterfeit images. And stopping training when the attribute identification result of the second image and the attribute identification result of the positive sample image output by the discriminator are both non-fake images, so as to obtain the trained discriminator.
In this embodiment, a positive sample image is obtained from the training image samples, where the attribute of the positive sample image is a non-counterfeit image. The discriminator identifies the second image and the positive sample image, which share the same attribute, and the attribute recognition results corresponding to the second image and the positive sample image are determined, so as to judge whether the discrimination performance of the discriminator meets the requirement. When the attribute recognition results corresponding to the second image and the positive sample image are different, the parameters of the discriminator are adjusted and training continues until the training stop condition is met, and the trained discriminator is obtained. During application of the recognition model, the attribute of the input image to be processed can be identified through the trained discriminator; the attribute to convert to is then determined according to the attribute of the image to be processed, so that the generator used to process the image to be processed can be selected and a converted image whose attribute differs from that of the image to be processed can be obtained accurately.
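A sketch of the discriminator update of steps 1002 to 1006 follows, implementing the stop condition described here (both the converted second image and the positive sample should end up recognized as non-counterfeit). The binary cross-entropy formulation and the assumed (N, 1) probability output of the discriminator are illustrative choices.

```python
# Sketch of one discriminator training step (steps 1002-1006); the labels and
# loss formulation are assumptions made for illustration.
import torch
import torch.nn as nn

def train_discriminator_step(discriminator, second_image, positive_image, optimizer):
    non_forged = torch.ones(positive_image.size(0), 1)     # attribute label: non-forged image

    pred_second = discriminator(second_image.detach())     # converted image from the first generator
    pred_positive = discriminator(positive_image)          # real (source) image

    # Push both predictions toward the "non-forged" attribute, per the stop condition above
    loss = nn.functional.binary_cross_entropy(pred_second, non_forged) + \
           nn.functional.binary_cross_entropy(pred_positive, non_forged)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```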
In one embodiment, the conversion layer of the recognition model further includes a discriminator; the training mode of the discriminator in the conversion layer of the recognition model comprises the following steps:
acquiring a negative sample image from a training image sample, wherein the attribute of the negative sample image is a fake image;
identifying the fourth image and the negative sample image through a discriminator, and determining attribute identification results corresponding to the fourth image and the negative sample image;
and when the attribute identification results corresponding to the fourth image and the negative sample image are different, adjusting parameters of the discriminator and continuing training until the training stopping condition is met, and obtaining the trained discriminator.
It will be appreciated that the discriminator in the conversion layer of the recognition model may also be trained using negative sample images. The principle of training the discriminator with negative sample images may refer to the process from step 1002 to step 1006 and is not described in detail here.
As shown in FIG. 11, a diagram of the conversion layer architecture in the recognition model in one embodiment is shown. The conversion layer of the recognition model includes 4 generators and two discriminators, the 4 generators being identical two by two. The generator BtoA is the first generator, and the generator AtoB is the second generator. The discriminator A is used for discriminating between the attack sample B and the attack sample A_B, and the discriminator B is used for discriminating between the real sample A and the real sample B_A; the attributes are attack sample and real sample. The attack sample is a counterfeit image, and the real sample is a non-counterfeit image. The attack sample B and the real sample A may correspond to the same user, or may correspond to different users.
The terminal inputs the attack sample B into the generator BtoA to be trained, converts the attack sample B into the real sample B_A through the generator BtoA to be trained, and then converts the real sample B_A into the attack sample B' through the generator AtoB to be trained. Next, the terminal calculates the similarity between the attack sample B and the attack sample B'. When the similarity is smaller than the similarity threshold, the feature difference between the attack sample B and the reconstructed attack sample B' is obvious and the similarity is not high, which indicates that the reconstruction performance of the generator AtoB is poor; from this it can be inferred that the reconstruction performance of the generator BtoA is also poor. The terminal then adjusts the parameters of the generator BtoA and the generator AtoB and trains repeatedly. When the similarity between the attack sample B and the attack sample B' is greater than or equal to the similarity threshold, the feature difference between the attack sample B and the reconstructed attack sample B' is small and the similarity is high, which indicates that the reconstruction performance of the generator AtoB meets the requirement; from this it can be inferred that the reconstruction performance of the generator BtoA also meets the requirement. After the generator AtoB and the generator BtoA are trained, the trained generator AtoB and generator BtoA are obtained.
Then, the terminal acquires the real sample B_A output by the trained generator BtoA and acquires the real sample A. The real sample B_A and the real sample A may correspond to the same user or to different users. The terminal inputs the real sample B_A and the real sample A into the discriminator B to be trained. The discriminator B to be trained discriminates which of the real sample B_A and the real sample A is the image converted by the generator and which is not, and outputs the attribute recognition results of the real sample B_A and the real sample A. When the discriminator B outputs that the attribute recognition result of the real sample B_A is an image converted by the generator, it means the discriminator judges the real sample B_A to be an attack image rather than a real image. When the discriminator B outputs that the attribute recognition result of the real sample A is a real image, it means the discriminator judges the real sample A to be a real image rather than an image converted by the generator. However, the recognition model requires that the recognition results of the discriminator B for both the real sample B_A and the real sample A be real images rather than images converted by the generator, so as to ensure that the images converted by the generator conform to actual conditions. The terminal therefore adjusts the parameters of the discriminator B, and stops training when the recognition results of the discriminator B for both the real sample B_A and the real sample A are real images, obtaining the trained discriminator B.
In this embodiment, the terminal may input the real sample A into the generator AtoB to be trained, convert the real sample A into the attack sample A_B through the generator AtoB to be trained, and convert the attack sample A_B into the real sample A' through the generator BtoA to be trained. Then, the terminal calculates the similarity between the real sample A and the real sample A'. When the similarity is smaller than the similarity threshold, the feature difference between the real sample A and the reconstructed real sample A' is obvious and the similarity is not high, which indicates that the reconstruction performance of the generator BtoA is poor; from this it can be inferred that the reconstruction performance of the generator AtoB is also poor. The terminal then adjusts the parameters of the generator BtoA and the generator AtoB and trains repeatedly. When the similarity between the real sample A and the real sample A' is greater than or equal to the similarity threshold, the feature difference between the real sample A and the reconstructed real sample A' is small and the similarity is high, which indicates that the reconstruction performance of the generator BtoA meets the requirement; from this it can be inferred that the reconstruction performance of the generator AtoB also meets the requirement. After the generator AtoB and the generator BtoA are trained, the trained generator AtoB and generator BtoA are obtained.
Then, the terminal acquires the attack sample A_B output by the trained generator AtoB and acquires the attack sample B. The attack sample A_B and the attack sample B may correspond to the same user or to different users. The terminal inputs the attack sample A_B and the attack sample B into the discriminator A to be trained. The discriminator A to be trained discriminates which of the attack sample A_B and the attack sample B is the image converted by the generator and which is not, and outputs the attribute recognition results of the attack sample A_B and the attack sample B. When the discriminator A outputs that the attribute recognition result of the attack sample A_B is an image converted by the generator while that of the attack sample B is an image not converted by the generator, the terminal adjusts the parameters of the discriminator A, and stops training when the recognition results of the discriminator A for both the attack sample A_B and the attack sample B are images not converted by the generator, obtaining the trained discriminator A.
As shown in FIG. 12, a diagram of the architecture of the recognition model in one embodiment is shown. The generation process comprises two parts, encoding and decoding. The encoding process may adopt a deep convolutional network, generally comprising 5 convolution blocks, where the number of convolution blocks can be set as required; each convolution block contains three layers: Conv (convolution), BN (Batch Normalization) and ReLU (Rectified Linear Unit). The decoding process may adopt a deep deconvolution network, generally similar in structure to the encoding process; each deconvolution block contains three layers: TranConv (transposed convolution), BN and ReLU. Through the generation process, the input image can be converted into an image with an attribute different from that of the input image, i.e., a reconstructed image. The recognition process is a deep convolutional network: it extracts features from the input image and the reconstructed image, determines a residual map between the input image and the reconstructed image, performs convolution processing on the residual map and the input image, and finally a convolution layer with a one-dimensional output determines whether the extracted features belong to a specific category, thereby obtaining the category of the input image.
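The encoder-decoder generator described above might be written as follows. The channel widths, strides and the final Tanh are illustrative assumptions, and the input side length is assumed to be divisible by 32.

```python
# Sketch of a 5-block Conv-BN-ReLU encoder mirrored by a TranConv-BN-ReLU decoder.
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
                         nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

def deconv_block(in_ch, out_ch):
    return nn.Sequential(nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1),
                         nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        widths = [3, 32, 64, 128, 256, 512]
        # Encoding: five convolution blocks, each halving the spatial resolution
        self.encoder = nn.Sequential(*[conv_block(widths[i], widths[i + 1]) for i in range(5)])
        # Decoding: transposed-convolution blocks restoring the original resolution
        self.decoder = nn.Sequential(*[deconv_block(widths[5 - i], widths[4 - i]) for i in range(4)],
                                     nn.ConvTranspose2d(widths[1], 3, 4, stride=2, padding=1),
                                     nn.Tanh())

    def forward(self, x):
        return self.decoder(self.encoder(x))   # reconstructed image with the opposite attribute
```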
In one embodiment, there is provided a living body recognition method including:
The terminal acquires a training image sample and category labels corresponding to the training image sample, wherein the category labels comprise living bodies and non-living bodies.
Then, the terminal converts the training image sample into a first image sample through a conversion layer of the recognition model, and the training image sample and the first image sample correspond to different attributes; the attributes include counterfeit images and non-counterfeit images.
And then, the terminal performs feature extraction on the training image sample and the first image sample through the recognition layer of the recognition model to obtain a feature map of the training image sample and a feature map of the first image sample.
Further, the terminal determines a residual map between the training image sample and the first image sample according to the feature map of the training image sample and the feature map of the first image sample.
And then, the terminal carries out living body recognition on the training image sample based on the residual image to obtain a recognition result of the training image sample.
Further, the terminal adjusts parameters of the recognition model according to the recognition result of the training image sample and the difference between the corresponding class labels, and continues training until the preset condition is met, and the trained recognition model is obtained.
Then, the terminal acquires an image to be processed, the image to be processed is converted into a first image through a conversion layer of the identification model, and the image to be processed and the first image correspond to different attributes, wherein the attributes comprise a fake image and a non-fake image.
Then, the terminal divides the image to be processed and the first image through an identification layer of the identification model to obtain each region of the image to be processed and each region of the first image;
further, the terminal determines the characteristic values corresponding to the areas in the image to be processed respectively, and determines the characteristic values corresponding to the areas in the first image respectively;
further, the terminal determines a feature map of the image to be processed according to feature values corresponding to the areas in the image to be processed; and determining a feature map of the first image according to the feature values respectively corresponding to the areas in the first image.
Then, the terminal determines pixel point pairs between the feature map of the image to be processed and the feature map of the first image; and determining the characteristic difference value between two pixel points in the pixel point pairs to obtain the characteristic difference value corresponding to each pixel point pair.
And then, the terminal performs normalization processing on the characteristic difference value corresponding to each pixel point pair to obtain a weight value corresponding to each pixel point pair.
Further, the terminal generates a residual image between the image to be processed and the first image according to the weight value corresponding to each pixel point pair.
And then, the terminal acquires weight values corresponding to all the pixel points in the residual image, and carries out convolution processing on the image to be processed through an identification layer of the identification model to obtain first characteristic values corresponding to all the pixel points in the image to be processed.
And then, the terminal determines a second characteristic value corresponding to each pixel point in the image to be processed according to the weight value corresponding to each pixel point in the residual image and the first characteristic value corresponding to each pixel point in the image to be processed.
Further, the terminal determines the class probability of the image to be processed according to the second characteristic value corresponding to each pixel point in the image to be processed, and compares the class probability with a probability threshold; when the class probability is greater than the probability threshold, the image to be processed is a living image. When the class probability is less than or equal to the probability threshold, the image to be processed is a non-living image.
In this embodiment, the conversion layer of the trained recognition model converts the image to be processed into the first image with different attribute from the image to be processed, and the feature images of the image to be processed and the first image are extracted to obtain the key information of the image to be processed and the first image. And calculating the characteristic difference value between the pixel points matched with each other in the characteristic image of the image to be processed and the characteristic image of the first image to generate a residual image, wherein the residual image represents the characteristic difference between the image to be processed and the first image. The weight value in the residual image is applied to convolution processing of the image to be processed, so that pixel points with obvious feature differences in the image to be processed can be screened out. And determining the category of the image to be processed based on the pixel points with obvious characteristic differences, and more accurately identifying whether the image to be processed belongs to a living body. In addition, in the embodiment, the user does not need to coordinate to make any face action, and the living body judgment can be performed only from a single image, so that the detection cost is reduced and the accuracy of living body identification is improved.
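Putting the pieces together, an end-to-end inference sketch for this embodiment could look like the following. It reuses the hypothetical lbp_residual_map helper and the weighted classifier from the earlier sketches and assumes a grayscale copy of the image is available for the LBP step; all names are illustrative.

```python
# End-to-end inference sketch: convert, build the LBP residual weight map,
# then classify the image to be processed with residual-weighted convolution.
import torch

def is_living(image_rgb, image_gray, generator, classifier, threshold=0.5):
    with torch.no_grad():
        first_image = generator(image_rgb.unsqueeze(0))             # opposite-attribute image
    first_gray = first_image.squeeze(0).mean(dim=0).numpy()         # crude grayscale for the LBP step

    residual = lbp_residual_map(first_gray, image_gray)             # 0..255 residual map (earlier sketch)
    weights = torch.from_numpy(residual / 255.0).float()            # normalized weight map
    weights = weights.unsqueeze(0).unsqueeze(0)                     # shape (1, 1, H, W)

    with torch.no_grad():
        prob = classifier(image_rgb.unsqueeze(0), weights).item()   # class probability
    return prob > threshold                                         # True = living, False = non-living
```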
In one embodiment, there is provided a living body recognition method for recognizing whether a face image of a user is a living body face, including:
the terminal acquires a training image sample and class labels corresponding to the training image sample, wherein the class labels comprise living faces and non-living faces, and the training image sample is a face image.
Then, the terminal converts the training image sample into a first image sample through a conversion layer of the recognition model, and the training image sample and the first image sample correspond to different attributes; the attributes include counterfeit images and non-counterfeit images.
And then, the terminal performs feature extraction on the training image sample and the first image sample through the recognition layer of the recognition model to obtain a feature map of the training image sample and a feature map of the first image sample.
Further, the terminal determines a residual map between the training image sample and the first image sample according to the feature map of the training image sample and the feature map of the first image sample.
And then, the terminal carries out living body recognition on the training image sample based on the residual image to obtain a recognition result of the training image sample.
Further, the terminal adjusts parameters of the recognition model according to the recognition result of the training image sample and the difference between the corresponding class labels, and continues training until the preset condition is met, and the trained recognition model is obtained.
Then, the terminal acquires an image to be processed, wherein the image to be processed is a face image, the image to be processed is converted into a first image through a conversion layer of the recognition model, and the image to be processed and the first image correspond to different attributes, and the attributes comprise a fake image and a non-fake image.
Then, the terminal divides the image to be processed and the first image through an identification layer of the identification model to obtain each region of the image to be processed and each region of the first image;
further, the terminal determines the characteristic values corresponding to the areas in the image to be processed respectively, and determines the characteristic values corresponding to the areas in the first image respectively;
further, the terminal determines a feature map of the image to be processed according to feature values corresponding to the areas in the image to be processed; and determining a feature map of the first image according to the feature values respectively corresponding to the areas in the first image.
Then, the terminal determines pixel point pairs between the feature map of the image to be processed and the feature map of the first image; and determining the characteristic difference value between two pixel points in the pixel point pairs to obtain the characteristic difference value corresponding to each pixel point pair.
And then, the terminal performs normalization processing on the characteristic difference value corresponding to each pixel point pair to obtain a weight value corresponding to each pixel point pair.
Further, the terminal generates a residual image between the image to be processed and the first image according to the weight value corresponding to each pixel point pair.
And then, the terminal acquires weight values corresponding to all the pixel points in the residual image, and carries out convolution processing on the image to be processed through an identification layer of the identification model to obtain first characteristic values corresponding to all the pixel points in the image to be processed.
And then, the terminal determines a second characteristic value corresponding to each pixel point in the image to be processed according to the weight value corresponding to each pixel point in the residual image and the first characteristic value corresponding to each pixel point in the image to be processed.
Further, the terminal determines the class probability of the image to be processed according to the second characteristic value corresponding to each pixel point in the image to be processed, and compares the class probability with a probability threshold; when the class probability is greater than the probability threshold, the image to be processed is a living body face image. When the class probability is smaller than or equal to the probability threshold, the image to be processed is a non-living face image.
In this embodiment, the conversion layer of the trained recognition model converts the image to be processed into the first image with different attribute from the image to be processed, and the feature images of the image to be processed and the first image are extracted to obtain the key information of the image to be processed and the first image. And calculating the characteristic difference value between the pixel points matched with each other in the characteristic image of the image to be processed and the characteristic image of the first image to generate a residual image, wherein the residual image represents the characteristic difference between the image to be processed and the first image. The weight value in the residual image is applied to convolution processing of the image to be processed, so that pixel points with obvious feature differences in the image to be processed can be screened out. And determining the category of the face image based on the pixel points with obvious characteristic differences, and more accurately identifying whether the face image belongs to a living face. In addition, in the embodiment, the user does not need to cooperate to make any face action, the living face can be judged from a single image, the detection cost is reduced, and the accuracy of the living face recognition is improved.
It should be understood that, although the steps in the flowcharts of fig. 2-12 are shown in order as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited to that order, and the steps may be performed in other orders. Moreover, at least a portion of the steps in fig. 2-12 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments; these sub-steps or stages are not necessarily performed sequentially either, but may be performed in turn or alternately with at least a portion of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 13, there is provided a living body recognition apparatus, which may employ a software module or a hardware module, or a combination of both, as a part of a computer device, the apparatus specifically including: a conversion module 1302, an extraction module 1304, a determination module 1306, and an identification module 1308, wherein:
the conversion module 1302 is configured to obtain an image to be processed, convert the image to be processed into a first image by using a conversion layer of the recognition model, where the image to be processed and the first image correspond to different attributes, and the attributes include a fake image and a non-fake image.
The extracting module 1304 is configured to perform feature extraction on the image to be processed and the first image through an identification layer of the identification model, so as to obtain a feature map of the image to be processed and a feature map of the first image.
A determining module 1306 is configured to determine a residual map between the image to be processed and the first image according to the feature map of the image to be processed and the feature map of the first image.
The identifying module 1308 is configured to perform living body identification on the image to be processed based on the residual map, so as to obtain a category of the image to be processed, where the category is a living body or a non-living body.
In the living body identification device, the image to be processed is obtained, the image to be processed is converted into the first image through the conversion layer of the identification model, the image to be processed and the first image correspond to different attributes, the attributes comprise fake images and non-fake images, the characteristic extraction is carried out on the image to be processed and the first image through the identification layer of the identification model, the characteristic diagram of the image to be processed and the characteristic diagram of the first image are obtained, the residual diagram between the image to be processed and the first image is determined according to the characteristic diagram of the image to be processed and the characteristic diagram of the first image, and therefore the difference between the image to be processed and the first image can be determined. The image to be processed is subjected to living body recognition based on the residual image, so that the category of the image to be processed is obtained, the category is living body or non-living body, the user does not need to cooperate to make any face action, living body recognition can be carried out only from a single image, the detection cost is reduced, and the accuracy of living body recognition is improved.
In one embodiment, the extraction module 1304 is further configured to: dividing the image to be processed and the first image through an identification layer of the identification model to obtain each region of the image to be processed and each region of the first image; determining characteristic values corresponding to all areas in the image to be processed respectively, and determining the characteristic values corresponding to all areas in the first image respectively; determining a feature map of the image to be processed according to the feature values respectively corresponding to the areas in the image to be processed; and determining a feature map of the first image according to the feature values respectively corresponding to the areas in the first image.
In this embodiment, the recognition layer of the recognition model is used to divide the image to be processed and the first image to obtain each region of the image to be processed and each region of the first image, determine the feature values corresponding to each region in the image to be processed, and determine the feature values corresponding to each region in the first image, so as to obtain the key feature information of each region in the image. Determining a feature map of the image to be processed according to feature values corresponding to the areas in the image to be processed, determining the feature map of the first image according to the feature values corresponding to the areas in the first image, and generating the feature map according to the key feature information, so that the feature map contains all the key feature information in the image to visually display feature differences between the image to be processed and the first image.
In one embodiment, the determining module 1306 is further configured to: determining a feature variation between a feature map of the image to be processed and a feature map of the first image; and generating a residual image between the image to be processed and the first image according to the characteristic variation.
In the present embodiment, by determining the feature variation amount between the feature map of the image to be processed and the feature map of the first image; and generating a residual image between the image to be processed and the first image according to the characteristic variation, so that the characteristic difference between the image to be processed and the first image can be accurately represented by the residual image, and whether the image to be processed is a living body or not can be accurately identified based on the characteristic difference.
In one embodiment, the determining module 1306 is further configured to: determining pixel point pairs between the feature map of the image to be processed and the feature map of the first image; determining the characteristic difference value between two pixel points in the pixel point pair to obtain the corresponding characteristic difference value of each pixel point pair; and generating a residual image between the image to be processed and the first image according to the characteristic difference value corresponding to each pixel point pair.
In this embodiment, pixel point pairs between the feature map of the image to be processed and the feature map of the first image are determined, and the feature difference between the two pixel points of each pair is computed, so that the feature difference between matching pixel points in the two feature maps is obtained. A residual map generated from these feature difference values then displays the feature difference between the image to be processed and the first image intuitively.
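A straightforward reading of this step is an element-wise difference between the two feature maps, with pixel points at the same spatial position forming a pair; the channel-wise summation in the sketch below is an assumption, used only to give each position a single difference value.

```python
def residual_map(feat_a, feat_b):
    """feat_a, feat_b: (1, C, H, W) feature maps of the image to be processed and of the
    first image. Each spatial position (h, w) forms a pixel point pair; the returned map
    stores the feature difference value of every pair."""
    return (feat_a - feat_b).abs().sum(dim=1, keepdim=True)  # (1, 1, H, W)
```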
In one embodiment, the determining module 1306 is further configured to: normalize the feature difference value corresponding to each pixel point pair to obtain a weight value corresponding to each pixel point pair; and generate a residual map between the image to be processed and the first image according to the weight value corresponding to each pixel point pair.
In this embodiment, the feature difference value of each pixel point pair is normalized to obtain a weight value for that pair, and the residual map between the image to be processed and the first image is generated from these weight values. The residual map can therefore be used as a weight map that marks, during identification, the positions in the image to be processed where feature differences exist, which improves the accuracy of identifying whether the image to be processed is a living body.
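The patent does not fix a particular normalization; a simple min-max normalization into [0, 1], as sketched below, is one reasonable way to turn the feature difference values into weight values.

```python
def normalize_to_weights(residual, eps=1e-8):
    """Min-max normalize the per-pair feature differences so the residual map can be
    used directly as a weight map with values in [0, 1]."""
    r_min, r_max = residual.min(), residual.max()
    return (residual - r_min) / (r_max - r_min + eps)
```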
In one embodiment, the identification module 1308 is further configured to: acquire the weight value corresponding to each pixel point in the residual map; and perform living body identification on the image to be processed based on the weight values corresponding to the pixel points in the residual map to obtain the category of the image to be processed.
In this embodiment, the weight value corresponding to each pixel point in the residual map is obtained, and living body identification of the image to be processed is performed on the basis of these weight values. The residual map thus acts as a weight map that marks the positions with feature differences in the image to be processed during identification, so the identification model concentrates on those positions; this improves identification accuracy and yields the correct category for the image to be processed.
In one embodiment, the identification module 1308 is further configured to: perform convolution processing on the image to be processed through the identification layer of the identification model to obtain a first feature value corresponding to each pixel point in the image to be processed; determine a second feature value corresponding to each pixel point according to the weight value of that pixel point in the residual map and its first feature value; and determine the category of the image to be processed according to the second feature values corresponding to the pixel points in the image to be processed.
In this embodiment, the identification layer of the identification model performs convolution on the image to be processed to obtain a first feature value for each pixel point, and a second feature value for each pixel point is determined from the weight value of that pixel point in the residual map together with its first feature value. In this way the weights of the residual map are applied during the convolution processing, and pixel points with pronounced feature differences are emphasised. The category of the image to be processed is then determined from the second feature values, so the residual map is put to use in living body detection and whether the image to be processed belongs to a living body is identified more accurately.
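The following sketch shows one way the first and second feature values could be combined: a convolution produces the first feature values, the weight map derived from the residual map re-weights them into the second feature values, and a small head maps those to the two categories. The layer sizes are illustrative, not taken from the patent.

```python
import torch.nn as nn

class WeightedRecognitionHead(nn.Module):
    """Illustrative recognition head: convolution -> re-weighting by the residual map -> classification."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 2))

    def forward(self, image, weight_map):
        first = self.conv(image)        # first feature value for each pixel point
        second = first * weight_map     # second feature value: differing positions are emphasised
        return self.classifier(second)  # logits for living body / non-living body
```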
In one embodiment, the identification module 1308 is further configured to: perform feature extraction on the residual map and the image to be processed to obtain the features corresponding to the residual map and the features corresponding to the image to be processed; and perform living body identification on the image to be processed based on the features of the residual map and the features of the image to be processed to obtain the category of the image to be processed.
In this embodiment, feature extraction is performed on the residual map and on the image to be processed to obtain the features of each, that is, the key feature information of the residual map and of the image to be processed. Living body identification is performed on the image to be processed based on these two sets of features, so the category of the image to be processed can be identified from the key feature information alone, which reduces the amount of computation and improves the efficiency of living body detection.
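This variant can be pictured as a two-stream classifier: one branch looks at the image to be processed, the other at the residual map, and their features are concatenated for the final decision. The sketch below uses arbitrary channel counts purely for illustration.

```python
import torch
import torch.nn as nn

class TwoStreamClassifier(nn.Module):
    """Illustrative two-stream classifier over the image to be processed and the residual map."""
    def __init__(self):
        super().__init__()
        self.image_branch = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.residual_branch = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.fc = nn.Linear(32, 2)

    def forward(self, image, residual):
        f_img = self.image_branch(image).flatten(1)        # features of the image to be processed
        f_res = self.residual_branch(residual).flatten(1)  # features of the residual map
        return self.fc(torch.cat([f_img, f_res], dim=1))   # category: living body / non-living body
```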
In one embodiment, as shown in fig. 14, an apparatus for training a recognition model is provided; the apparatus may be implemented wholly or partly in software, in hardware, or in a combination of the two as part of a computer device, and specifically includes: an acquisition module 1402, a sample conversion module 1404, a feature extraction module 1406, a residual map module 1408, a living body identification module 1410, and an adjustment module 1412. Wherein:
An obtaining module 1402, configured to obtain a training image sample and category labels corresponding to the training image sample, where the category labels include living bodies and non-living bodies;
a sample conversion module 1404, configured to convert, by a conversion layer of the recognition model, a training image sample into a first image sample, where the training image sample and the first image sample correspond to different attributes; the attributes include counterfeit images and non-counterfeit images;
the feature extraction module 1406 is configured to perform feature extraction on the training image sample and the first image sample through the recognition layer of the recognition model to obtain a feature map of the training image sample and a feature map of the first image sample;
a residual map module 1408 for determining a residual map between the training image sample and the first image sample according to the feature map of the training image sample and the feature map of the first image sample;
the living body recognition module 1410 is configured to perform living body recognition on the training image sample based on the residual map, so as to obtain a recognition result of the training image sample;
and an adjusting module 1412, configured to adjust the parameters of the recognition model according to the difference between the recognition result of the training image sample and the corresponding class label and to continue training until a preset condition is met, at which point training stops and the trained recognition model is obtained.
In this embodiment, a training image sample and its class label are obtained, the class label being living body or non-living body. The training image sample is converted into a first image sample through the conversion layer of the recognition model, the two samples corresponding to different attributes, the attributes being a fake image and a non-fake image. Feature extraction is performed on the training image sample and the first image sample through the recognition layer of the recognition model to obtain a feature map of each, a residual map between the two samples is determined from these feature maps, and living body recognition is performed on the training image sample based on the residual map to obtain a recognition result. The parameters of the recognition model are adjusted according to the difference between the recognition result and the class label, and training continues until a preset condition is met, yielding a trained recognition model. With the trained model, living body recognition can be performed from a single image and the user does not need to make any facial action, which reduces detection cost and improves the accuracy of living body recognition.
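The training procedure described above amounts to a standard supervised loop. In the sketch below, model is assumed to wrap the whole pipeline (conversion layer, recognition layer, residual map and classifier) so that it maps a training image sample directly to logits; the optimizer, loss, and accuracy-based stopping rule are assumptions, since the patent only requires that training stop once a preset condition is met.

```python
import torch
import torch.nn as nn

def train_recognition_model(model, loader, epochs=10, lr=1e-4, target_acc=0.99):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        correct, total = 0, 0
        for samples, labels in loader:        # training image samples and class labels (1 = living, 0 = non-living)
            logits = model(samples)           # recognition result of the training image sample
            loss = criterion(logits, labels)  # difference between the result and the class label
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                  # adjust parameters of the recognition model
            correct += (logits.argmax(dim=1) == labels).sum().item()
            total += labels.numel()
        if correct / total >= target_acc:     # preset condition met: stop training
            break
    return model
```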
In one embodiment, the conversion layer of the recognition model includes a first generator and a second generator; the first generator converts the counterfeit image into a non-counterfeit image, and the second generator converts the non-counterfeit image into a counterfeit image;
the sample conversion module 1404 is further configured to: acquire a negative sample image from the training image samples, where the attribute of the negative sample image is a fake image; convert the negative sample image into a second image through the first generator, the negative sample image and the second image having different attributes; convert the second image into a third image through the second generator, the third image and the negative sample image having the same attribute; determine the similarity between the negative sample image and the third image; and, when the similarity between the negative sample image and the third image is smaller than a similarity threshold, adjust the parameters of the first generator and the second generator and continue training until the training stopping condition is met, obtaining the trained first generator and second generator.
In this embodiment, a negative sample image is obtained from the training image samples and converted into a second image by the first generator, the two images having different attributes; the second image is then converted into a third image by the second generator, the third image having the same attribute as the negative sample image. By measuring the similarity between the negative sample image and the third image, it can be judged whether the performance of the first generator and the second generator meets the requirement. When the similarity is smaller than a similarity threshold, the parameters of the two generators are adjusted and training continues until the training stopping condition is met, giving the trained first generator and second generator. After training, the first generator can convert a fake image into a non-fake image and the second generator can convert a non-fake image into a fake image, so that one image can be turned into several images with different attributes; this expands the training image samples, enhances the data set, and reduces the cost of data acquisition.
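One training step for the two generators, following the description above, might look like the sketch below; the L1-based similarity measure and the way it is mapped into (0, 1] are assumptions, since the patent does not specify how similarity is computed.

```python
import torch.nn.functional as F

def generator_training_step(first_generator, second_generator, optimizer,
                            negative_sample, similarity_threshold=0.9):
    second_image = first_generator(negative_sample)  # fake image -> non-fake second image
    third_image = second_generator(second_image)     # non-fake -> fake third image (same attribute as input)
    cycle_loss = F.l1_loss(third_image, negative_sample)
    similarity = 1.0 / (1.0 + cycle_loss.item())      # assumed distance-based similarity in (0, 1]
    if similarity < similarity_threshold:             # third image not yet similar enough to the negative sample
        optimizer.zero_grad()
        cycle_loss.backward()                         # adjust parameters of both generators
        optimizer.step()
    return similarity
```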
In one embodiment, the conversion layer of the recognition model further includes a discriminator; the sample conversion module 1404 is further configured to: acquire a positive sample image from the training image samples, where the attribute of the positive sample image is a non-fake image; identify the second image and the positive sample image through the discriminator and determine the attribute identification results corresponding to the second image and the positive sample image; and, when the attribute identification results corresponding to the second image and the positive sample image are different, adjust the parameters of the discriminator and continue training until the training stopping condition is met, obtaining the trained discriminator.
In this embodiment, a positive sample image, whose attribute is a non-fake image, is obtained from the training image samples. The discriminator identifies the second image and the positive sample image, which share the same attribute, and the attribute identification results for the two images are compared in order to judge whether the discrimination performance of the discriminator meets the requirement. When the results differ, the parameters of the discriminator are adjusted and training continues until the training stopping condition is met, giving the trained discriminator. During application of the recognition model, the trained discriminator identifies the attribute of the image to be processed; the attribute of the converted image is then determined from that of the image to be processed, which selects the generator used for the conversion and yields a converted image whose attribute differs from that of the image to be processed.
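A matching sketch for the discriminator step is given below. The patent only states that the discriminator's parameters are adjusted while its attribute decisions for the generated second image and the real positive sample differ; the binary cross-entropy objective against the non-fake label used here is one assumed way of expressing that.

```python
import torch
import torch.nn.functional as F

def discriminator_training_step(discriminator, optimizer, second_image, positive_sample):
    pred_generated = discriminator(second_image.detach())  # attribute decision for the generated image
    pred_real = discriminator(positive_sample)              # attribute decision for the real non-fake image
    # both images should be identified with the non-fake attribute (label 1)
    loss = (F.binary_cross_entropy_with_logits(pred_generated, torch.ones_like(pred_generated)) +
            F.binary_cross_entropy_with_logits(pred_real, torch.ones_like(pred_real)))
    optimizer.zero_grad()
    loss.backward()                                          # adjust parameters of the discriminator
    optimizer.step()
    return loss.item()
```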
For the specific definition of the living body recognition apparatus, reference may be made to the definition of the living body recognition method above, which is not repeated here. Each module of the living body recognition apparatus may be implemented in whole or in part by software, by hardware, or by a combination of the two. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
For the specific definition of the recognition model training apparatus, reference may be made to the definition of the recognition model training method above, which is not repeated here. Each module of the recognition model training apparatus may likewise be implemented in whole or in part by software, by hardware, or by a combination of the two, and may be embedded in or independent of a processor in the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in fig. 15. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The communication interface of the computer device is used for wired or wireless communication with an external terminal, and wireless communication can be implemented through WIFI, an operator network, NFC (near field communication), or other technologies. The computer program, when executed by the processor, implements the living body recognition method or the recognition model training method. The display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen, and the input device may be a touch layer covering the display screen, keys, a trackball, or a touch pad arranged on the housing of the computer device, or an external keyboard, touch pad, or mouse.
It will be appreciated by those skilled in the art that the structure shown in fig. 15 is merely a block diagram of a portion of the structure associated with the present application and is not limiting of the computer device to which the present application is applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
Those skilled in the art will appreciate that all or part of the processes of the methods described above may be implemented by a computer program instructing relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or other media used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, and the like. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of these technical features contains no contradiction, it should be considered to be within the scope of this specification.
The above embodiments express only several implementations of the present application; their descriptions are relatively specific and detailed, but they are not to be construed as limiting the scope of the invention. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (15)

1. A living body identification method, comprising:
acquiring an image to be processed, and converting the image to be processed into a first image through a conversion layer of an identification model, wherein the image to be processed and the first image correspond to different attributes, and the attributes comprise a fake image and a non-fake image;
extracting features of the image to be processed and the first image through an identification layer of the identification model to obtain a feature map of the image to be processed and a feature map of the first image;
determining a residual map between the image to be processed and the first image according to the feature map of the image to be processed and the feature map of the first image;
and performing living body identification on the image to be processed based on the residual map to obtain the category of the image to be processed, wherein the category is living body or non-living body.
2. The method according to claim 1, wherein the feature extraction of the image to be processed and the first image by the recognition layer of the recognition model, to obtain a feature map of the image to be processed and a feature map of the first image, includes:
dividing the image to be processed and the first image through an identification layer of the identification model to obtain each region of the image to be processed and each region of the first image;
determining characteristic values corresponding to all areas in the image to be processed respectively, and determining the characteristic values corresponding to all areas in the first image respectively;
determining a feature map of the image to be processed according to feature values respectively corresponding to all areas in the image to be processed;
and determining a feature map of the first image according to the feature values respectively corresponding to the areas in the first image.
3. The method according to claim 1, wherein the determining a residual map between the image to be processed and the first image from the feature map of the image to be processed and the feature map of the first image comprises:
determining a feature variation between a feature map of the image to be processed and a feature map of the first image;
and generating a residual image between the image to be processed and the first image according to the characteristic variation.
4. A method according to claim 3, wherein said determining the feature variation between the feature map of the image to be processed and the feature map of the first image comprises:
determining pixel point pairs between the feature map of the image to be processed and the feature map of the first image;
determining the characteristic difference value between two pixel points in the pixel point pairs to obtain the corresponding characteristic difference value of each pixel point pair;
the generating a residual image between the image to be processed and the first image according to the characteristic variation comprises the following steps:
and generating a residual image between the image to be processed and the first image according to the characteristic difference value corresponding to each pixel point pair.
5. The method according to claim 4, wherein generating a residual map between the image to be processed and the first image according to the feature difference value corresponding to each pixel point pair comprises:
normalizing the characteristic difference value corresponding to each pixel point pair to obtain a weight value corresponding to each pixel point pair;
and generating a residual image between the image to be processed and the first image according to the weight value corresponding to each pixel point pair.
6. The method according to any one of claims 1 to 5, wherein the performing in-vivo recognition on the image to be processed based on the residual map, to obtain a category of the image to be processed, includes:
acquiring a weight value corresponding to each pixel point in the residual map;
and performing living body identification on the image to be processed based on the weight values corresponding to the pixel points in the residual map to obtain the category of the image to be processed.
7. The method according to claim 6, wherein the performing the living body recognition on the image to be processed based on the weight value corresponding to each pixel point in the residual map to obtain the category of the image to be processed includes:
carrying out convolution processing on the image to be processed through an identification layer of the identification model to obtain a first characteristic value corresponding to each pixel point in the image to be processed;
determining a second characteristic value corresponding to each pixel point in the image to be processed according to the weight value corresponding to each pixel point in the residual image and the first characteristic value corresponding to each pixel point in the image to be processed;
and determining the category of the image to be processed according to the second characteristic value corresponding to each pixel point in the image to be processed.
8. The method according to claim 1, wherein the performing the living body recognition on the image to be processed based on the residual map, to obtain the category of the image to be processed, includes:
extracting features of the residual image and the image to be processed to obtain features corresponding to the residual image and features corresponding to the image to be processed;
and performing living body identification on the image to be processed based on the characteristics of the residual image and the characteristics of the image to be processed, so as to obtain the category of the image to be processed.
9. A method of training an identification model, comprising:
acquiring a training image sample and a category label corresponding to the training image sample, wherein the category label comprises a living body and a non-living body;
Converting the training image sample into a first image sample through a conversion layer of the recognition model, wherein the training image sample and the first image sample correspond to different attributes; the attributes include counterfeit images and non-counterfeit images;
extracting features of the training image sample and the first image sample through an identification layer of the identification model to obtain a feature map of the training image sample and a feature map of the first image sample;
determining a residual error map between the training image sample and the first image sample according to the feature map of the training image sample and the feature map of the first image sample;
performing living body recognition on the training image sample based on the residual image to obtain a recognition result of the training image sample;
and adjusting parameters of the recognition model according to the difference between the recognition result of the training image sample and the corresponding class label, and continuing training until the preset condition is met, stopping training, and obtaining the trained recognition model.
10. The method of claim 9, wherein the conversion layer of the recognition model includes a first generator and a second generator; the first generator converts the counterfeit image into a non-counterfeit image, and the second generator converts the non-counterfeit image into a counterfeit image;
The training mode of the generator in the conversion layer of the identification model comprises the following steps:
acquiring a negative sample image from the training image sample, wherein the attribute of the negative sample image is a fake image;
converting the negative sample image into a second image by the first generator, wherein the negative sample image and the second image have different attributes;
converting, by the second generator, the second image into a third image, the third image and the negative sample image having the same attribute;
determining a similarity of the negative sample image and the third image;
and when the similarity of the negative sample image and the third image is smaller than a similarity threshold, adjusting parameters of the first generator and the second generator and continuing training until a training stopping condition is met, and obtaining the trained first generator and second generator.
11. The method of claim 10, wherein the conversion layer of the recognition model further comprises a discriminator;
the training mode of the discriminator in the conversion layer of the identification model comprises the following steps:
acquiring a positive sample image from the training image sample, wherein the attribute of the positive sample image is a non-fake image;
Identifying the second image and the positive sample image through the discriminator, and determining attribute identification results corresponding to the second image and the positive sample image;
and when the attribute identification results corresponding to the second image and the positive sample image are different, adjusting parameters of the discriminator and continuing training until the training stopping condition is met, and obtaining the trained discriminator.
12. A living body identification device, characterized in that the device comprises:
the conversion module is used for acquiring an image to be processed, converting the image to be processed into a first image through a conversion layer of the identification model, wherein the image to be processed and the first image correspond to different attributes, and the attributes comprise a fake image and a non-fake image;
the extraction module is used for extracting the characteristics of the image to be processed and the first image through the identification layer of the identification model to obtain a characteristic image of the image to be processed and a characteristic image of the first image;
the determining module is used for determining a residual error diagram between the image to be processed and the first image according to the characteristic diagram of the image to be processed and the characteristic diagram of the first image;
The identification module is used for carrying out living body identification on the image to be processed based on the residual image to obtain the category of the image to be processed, wherein the category is living body or non-living body.
13. An identification model training apparatus, the apparatus comprising:
the system comprises an acquisition module, a classification module and a classification module, wherein the acquisition module is used for acquiring a training image sample and a class label corresponding to the training image sample, and the class label comprises a living body and a non-living body;
the sample conversion module is used for converting the training image sample into a first image sample through a conversion layer of the recognition model, and the training image sample and the first image sample correspond to different attributes; the attributes include counterfeit images and non-counterfeit images;
the feature extraction module is used for extracting features of the training image sample and the first image sample through the recognition layer of the recognition model to obtain a feature map of the training image sample and a feature map of the first image sample;
a residual map module, configured to determine a residual map between the training image sample and the first image sample according to the feature map of the training image sample and the feature map of the first image sample;
The living body identification module is used for carrying out living body identification on the training image sample based on the residual error map to obtain an identification result of the training image sample;
and the adjusting module is used for adjusting the parameters of the recognition model according to the recognition result of the training image sample and the difference between the corresponding class labels, and continuing training until the preset condition is met, and stopping training to obtain a trained recognition model.
14. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 11 when the computer program is executed.
15. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method of any one of claims 1 to 11.
CN202010107870.4A 2020-02-21 2020-02-21 Living body identification method, living body identification device, computer device, and storage medium Active CN111339897B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010107870.4A CN111339897B (en) 2020-02-21 2020-02-21 Living body identification method, living body identification device, computer device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010107870.4A CN111339897B (en) 2020-02-21 2020-02-21 Living body identification method, living body identification device, computer device, and storage medium

Publications (2)

Publication Number Publication Date
CN111339897A CN111339897A (en) 2020-06-26
CN111339897B true CN111339897B (en) 2023-07-21

Family

ID=71185452

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010107870.4A Active CN111339897B (en) 2020-02-21 2020-02-21 Living body identification method, living body identification device, computer device, and storage medium

Country Status (1)

Country Link
CN (1) CN111339897B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111882525A (en) * 2020-07-01 2020-11-03 上海品览数据科技有限公司 Image reproduction detection method based on LBP watermark characteristics and fine-grained identification
CN111680672B (en) * 2020-08-14 2020-11-13 腾讯科技(深圳)有限公司 Face living body detection method, system, device, computer equipment and storage medium
CN112115912B (en) * 2020-09-28 2023-11-28 腾讯科技(深圳)有限公司 Image recognition method, device, computer equipment and storage medium
CN112836625A (en) * 2021-01-29 2021-05-25 汉王科技股份有限公司 Face living body detection method and device and electronic equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108805828A (en) * 2018-05-22 2018-11-13 腾讯科技(深圳)有限公司 Image processing method, device, computer equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102419136B1 (en) * 2017-06-15 2022-07-08 삼성전자주식회사 Image processing apparatus and method using multiple-channel feature map

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108805828A (en) * 2018-05-22 2018-11-13 腾讯科技(深圳)有限公司 Image processing method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN111339897A (en) 2020-06-26

Similar Documents

Publication Publication Date Title
CN111339897B (en) Living body identification method, living body identification device, computer device, and storage medium
US10248954B2 (en) Method and system for verifying user identity using card features
WO2020207189A1 (en) Method and device for identity authentication, storage medium, and computer device
Carvalho et al. Illuminant-based transformed spaces for image forensics
Peng et al. Face presentation attack detection using guided scale texture
Rattani et al. A survey of mobile face biometrics
Deb et al. Look locally infer globally: A generalizable face anti-spoofing approach
WO2022033220A1 (en) Face liveness detection method, system and apparatus, computer device, and storage medium
CN111160313B (en) Face representation attack detection method based on LBP-VAE anomaly detection model
US11244152B1 (en) Systems and methods for passive-subject liveness verification in digital media
Das et al. Lip biometric template security framework using spatial steganography
US11126827B2 (en) Method and system for image identification
CN111611873A (en) Face replacement detection method and device, electronic equipment and computer storage medium
CN111275685A (en) Method, device, equipment and medium for identifying copied image of identity document
Zhang et al. Face anti-spoofing detection based on DWT-LBP-DCT features
Smith-Creasey et al. Continuous face authentication scheme for mobile devices with tracking and liveness detection
CN109886223B (en) Face recognition method, bottom library input method and device and electronic equipment
EP4085369A1 (en) Forgery detection of face image
US11373449B1 (en) Systems and methods for passive-subject liveness verification in digital media
CN111582155B (en) Living body detection method, living body detection device, computer equipment and storage medium
CN113642639B (en) Living body detection method, living body detection device, living body detection equipment and storage medium
Tapia et al. Selfie periocular verification using an efficient super-resolution approach
Deng et al. Attention-aware dual-stream network for multimodal face anti-spoofing
Shahriar et al. An iris-based authentication framework to prevent presentation attacks
Jagadeesh et al. DBC based Face Recognition using DWT

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40025241

Country of ref document: HK

SE01 Entry into force of request for substantive examination
GR01 Patent grant