CN114445917A - Network training method and system for face living body recognition and electronic equipment - Google Patents


Info

Publication number
CN114445917A
CN114445917A
Authority
CN
China
Prior art keywords
image
loss function
living body
training method
recognition network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210070713.XA
Other languages
Chinese (zh)
Inventor
胡胤
王涛
汪云
邱真
陈阳
刘健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Vclusters Information Technology Co ltd
Original Assignee
Shenzhen Vclusters Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Vclusters Information Technology Co ltd filed Critical Shenzhen Vclusters Information Technology Co ltd
Priority to CN202210070713.XA
Publication of CN114445917A
Legal status: Pending (current)


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a network training method, a system, and an electronic device for face living body recognition, comprising the following steps: acquiring a preprocessed image; extracting features from the image, detecting the face features in the image, and obtaining a classification prediction result; calculating the corresponding loss function value with an Am-softmax Loss function based on the classification prediction result; and, based on the loss function value, computing the gradient by back-propagation with a gradient descent algorithm and updating the model parameters of the recognition network. The trained recognition network can thereby establish an angular boundary between different samples, forming a large-angle-margin supervision function that pulls the samples further apart and enlarges the inter-class distance, so the trained network achieves higher recognition precision against various attack faces, accurately distinguishes live faces, and improves recognition accuracy.

Description

Network training method and system for face living body recognition and electronic equipment
Technical Field
The invention relates to the technical field of face living body recognition networks, and in particular to a face living body recognition network training method, a face living body recognition network training system, and an electronic device.
Background
Face anti-spoofing applies computer image-processing techniques to counter forged-face attacks, aiming to capture the real face and prevent the damage caused by face-based intrusion. Current face attacks fall roughly into several kinds: paper attacks, screen attacks, and 3D mask attacks, which can be subdivided further (e.g., A4 paper, posters, PCs, tablets, etc.). Live and non-live faces differ subtly in the image, for example in color texture and non-rigid motion deformation; early methods hand-designed features based on such image characteristics and made the final decision with a machine-learning classifier.
With the development of the internet and big data, face recognition has been applied across industries, particularly high-security ones such as access control and financial payment. These settings not only demand a higher recognition rate but also face more complex attacks, which the traditional liveness-detection methods based on hand-designed image features cannot meet. With the rapid development of deep learning in recent years, more and more such feature engineering has gradually been replaced by deep convolutional networks.
Face living body detection is a binary classification problem. The traditional convolutional-neural-network approach to binary classification generally uses softmax Loss as the supervision function, and the traditional softmax Loss is:

$$L_{S} = -\frac{1}{n}\sum_{i=1}^{n}\log\frac{e^{W_{y_i}^{T} f_i}}{\sum_{j=1}^{c} e^{W_{j}^{T} f_i}}$$

where $f_i$ is the feature of the $i$-th sample at the last fully-connected layer, $W_j$ is the weight vector of class $j$, and $y_i$ is the ground-truth label of sample $i$. Supervised only by this softmax Loss, the network leaves too small a separation between different classes of samples, and living body recognition accuracy is low.
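For concreteness (an illustrative sketch, not part of the patent; names such as `softmax_loss`, `F`, and `W` are invented here), the conventional softmax cross-entropy supervision above can be written in NumPy as:

```python
import numpy as np

def softmax_loss(F, W, labels):
    """Conventional softmax cross-entropy loss.

    F: (n, d) features f_i from the last fully-connected layer.
    W: (d, c) class weight vectors W_j as columns.
    labels: (n,) ground-truth class indices y_i.
    """
    logits = F @ W                                  # W_j^T f_i for every class j
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    exp = np.exp(logits)
    probs = exp / exp.sum(axis=1, keepdims=True)
    n = F.shape[0]
    return -np.mean(np.log(probs[np.arange(n), labels]))
```

With all-zero weights every class is equally likely, so the loss equals log(c), which is a handy sanity check when wiring up such a supervision function.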
Disclosure of Invention
In order to overcome the problem of low accuracy of the existing face living body detection, the invention provides a face living body recognition network training method, a face living body recognition network training system and electronic equipment.
The invention provides a human face living body recognition network training method for solving the technical problems, which comprises the following steps:
acquiring a preprocessed image;
extracting the features of the image, detecting the face features in the image and obtaining a classification prediction result;
based on the classification prediction result, adopting an Am-softmax Loss function to calculate a corresponding Loss function value;
and, based on the loss function value, computing the gradient by back-propagation with a gradient descent algorithm, and updating the model parameters of the recognition network.
Preferably, the Am-softmax Loss function is specifically:
$$L_{AMS} = -\frac{1}{n}\sum_{i=1}^{n}\log\frac{e^{s(\cos\theta_{y_i}-m)}}{e^{s(\cos\theta_{y_i}-m)}+\sum_{j=1,\;j\neq y_i}^{c} e^{s\cos\theta_j}}$$
preferably, in the formula, the scaling factor s is set to 1 and m is set to 0.5.
Preferably, the pre-processing of the image comprises image enhancement and/or image scaling processing.
The invention also provides a face living body recognition network system, which comprises:
the image preprocessing unit is used for acquiring a preprocessed image;
the prediction unit is used for extracting the features of the image, detecting the face features in the image and obtaining a classification prediction result;
the Loss supervision unit is used for calculating a corresponding Loss function value by adopting an Am-softmax Loss function based on the classification prediction result;
and the model updating unit is used for reversely solving the gradient by adopting a gradient descent algorithm based on the loss function value and updating the model parameters of the identification network.
The invention also provides an electronic device comprising a memory and a processor. The memory stores a computer program configured, when run, to execute any of the face living body recognition network training methods described above; the processor is configured to execute, through the computer program, any of the face living body recognition network training methods described above.
Compared with the prior art, the human face living body recognition network training method, the human face living body recognition network training system and the electronic equipment provided by the invention have the following advantages:
the Am-Softmax Loss is introduced into the neural network for face recognition to be used as a Loss function to supervise gradient reduction, an angle boundary can be established among different samples, a large-angle interval supervision function is formed among the different samples, the difference among the samples can be pulled open more, the inter-class distance is enhanced, the trained recognition network has higher recognition precision when facing various attack faces, the living bodies of the faces are distinguished accurately, and the recognition accuracy is improved.
Drawings
Fig. 1 is an overall flowchart of a human face living body recognition network training method according to a first embodiment of the present invention.
Fig. 2 is a block diagram of a living human face recognition network training system according to a second embodiment of the present invention.
Fig. 3 is a block diagram of an electronic device according to a third embodiment of the invention.
Description of reference numerals:
1. an image preprocessing unit; 2. a prediction unit; 3. a loss supervision unit; 4. a model updating unit; 10. a memory; 20. a processor.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, a first embodiment of the present invention provides a network training method for face living body recognition, which includes the following steps:
step S1: acquiring a preprocessed image;
step S2: extracting the features of the image, detecting the face features in the image and obtaining a classification prediction result;
step S3: based on the classification prediction result, adopting an Am-softmax Loss function to calculate a corresponding Loss function value;
step S4: and based on the loss function value, reversely solving the gradient by adopting a gradient descent algorithm, and updating and identifying the model parameters of the network.
It can be understood that the face living body recognition network preset in step S1 acquires the preprocessed image, where the preprocessing includes conventional image enhancement and image scaling (resizing), so that the image is easier for the network to recognize and accuracy is improved.
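A minimal sketch of such preprocessing, assuming simple min-max contrast stretching for the enhancement and nearest-neighbour resampling for the scaling (both choices, the function names, and the 112-pixel input size are illustrative assumptions, not specified by the patent):

```python
import numpy as np

def enhance_contrast(img):
    """Min-max stretch to [0, 255] as a simple image-enhancement stand-in."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img)   # flat image: nothing to stretch
    return (img - lo) / (hi - lo) * 255.0

def scale_nearest(img, out_h, out_w):
    """Nearest-neighbour image scaling to a fixed network input size."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

def preprocess(img, size=(112, 112)):
    """Step S1: enhancement followed by scaling."""
    return scale_nearest(enhance_contrast(img), *size)
```

In practice a real pipeline would use a library resizer with interpolation, but the point is only that the network always sees inputs of a fixed size and normalized dynamic range.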
It is to be understood that in step S2 the recognition network performs feature extraction and recognition on the face in the image. In some specific embodiments, the recognition network may also predict the deflection of the face angle; that is, the prediction is split into recognition of the face features and prediction of the face-angle deflection. When the live face image is captured, users shoot from different angles, so the face cannot always be perfectly frontal, and the face-angle deflection is therefore also detected. Of course, recognition can likewise be extended to other situations, which is not described again here.
It is understood that, in step S3, the recognition network performs feature extraction and recognition, outputs via the Softmax layer the confidence that the input image is a living body (live) or a non-living body (spoof), and thereby obtains the recognition result; after outputting the result it is compared against the ground truth, and the loss function value between the predicted and real results is calculated. In this embodiment, Am-Softmax Loss (Additive Margin Softmax) is used as the loss function: a margin function $\psi(\theta)$ is added to the softmax Loss, and before input the features and weights are first normalized, giving

$$\cos\theta_j = \frac{W_j^{T} f_i}{\lVert W_j\rVert\,\lVert f_i\rVert},\qquad \lVert W_j\rVert = \lVert f_i\rVert = 1,$$

so that $W_j^{T} f_i = \cos\theta_j$ and the target-class term becomes $\psi(\theta_{y_i}) = \cos\theta_{y_i} - m$.
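The normalization step can be checked numerically: once the feature and the weight columns are unit-normalized, the inner product is exactly the cosine, and the additive margin is a plain subtraction (all values here are invented for illustration):

```python
import numpy as np

f = np.array([3.0, 4.0])            # a raw feature vector f_i
W = np.array([[1.0, 0.0],
              [1.0, 1.0]])          # columns are class weight vectors W_j

f_hat = f / np.linalg.norm(f)                 # ||f_hat|| = 1
W_hat = W / np.linalg.norm(W, axis=0)         # each column has norm 1
cos_theta = W_hat.T @ f_hat                   # equals cos(theta_j) after normalization

m = 0.5
psi = cos_theta - m                           # psi(theta) = cos(theta) - m
# d(psi)/d(cos) = 1, so the margin adds no extra cost to back-propagation
```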
Only ψ (x) ═ x-m needs to be calculated in the inference process, so the reciprocal obtained for ψ (x) is ψ' (x) ═ 1, specifically, in the present embodiment, the complete Am-softmax Loss is as follows:
$$L_{AMS} = -\frac{1}{n}\sum_{i=1}^{n}\log\frac{e^{s(\cos\theta_{y_i}-m)}}{e^{s(\cos\theta_{y_i}-m)}+\sum_{j=1,\;j\neq y_i}^{c} e^{s\cos\theta_j}}$$
In the above equation, s is a scaling factor (a hyper-parameter) used to scale the cosine value. Repeated verification tests showed that learning s through back-propagation changes its value little while slowing convergence, so s is directly set to 1 and m is set to 0.5, which accelerates convergence of the model.
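A minimal NumPy sketch of the complete Am-softmax Loss with this embodiment's settings s = 1 and m = 0.5 (an illustrative implementation, not the patent's code; the function name and tensor shapes are assumptions):

```python
import numpy as np

def am_softmax_loss(F, W, labels, s=1.0, m=0.5):
    """Additive-margin softmax loss.

    F: (n, d) raw features; W: (d, c) class weight columns; labels: (n,) targets.
    Features and weight columns are L2-normalized so logits become cosines.
    """
    F = F / np.linalg.norm(F, axis=1, keepdims=True)
    W = W / np.linalg.norm(W, axis=0, keepdims=True)
    cos = F @ W                                     # cos(theta_j) per sample/class
    n = F.shape[0]
    margin_cos = cos.copy()
    margin_cos[np.arange(n), labels] -= m           # psi(theta) = cos(theta) - m
    logits = s * margin_cos
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    return -np.mean(np.log(p[np.arange(n), labels]))
```

Because m is subtracted only from the target-class cosine, the target logit must exceed every other logit by m (in cosine units) before the loss falls to the plain-softmax level, which is what enforces the angular boundary between the live and spoof classes.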
It can be understood that, in step S4, after the Am-softmax-Loss-supervised loss function value is obtained, the gradient is computed by back-propagation via the gradient descent algorithm to update the model parameters of the face living body recognition network. The parameter update depends on the types and the number of samples used for training with the image; that is, the face living body recognition network training method provided by the present application may perform repeated rounds of training, according to the user's selection, until the required recognition accuracy is reached.
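Step S4 can be illustrated with plain gradient descent on the Am-softmax loss, here using central-difference numerical gradients so the sketch stays short; the loss is re-declared locally so the example stands alone (a toy check with invented data, not the patent's optimizer):

```python
import numpy as np

def loss_fn(F, W, labels, s=1.0, m=0.5):
    """Am-softmax loss on normalized features/weights (as in this embodiment)."""
    Fn = F / np.linalg.norm(F, axis=1, keepdims=True)
    Wn = W / np.linalg.norm(W, axis=0, keepdims=True)
    cos = Fn @ Wn
    n = F.shape[0]
    cos[np.arange(n), labels] -= m
    logits = s * cos
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits); p /= p.sum(axis=1, keepdims=True)
    return -np.mean(np.log(p[np.arange(n), labels]))

def numerical_grad(F, W, labels, eps=1e-5):
    """Central-difference gradient of the loss w.r.t. the weights."""
    g = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        Wp, Wm = W.copy(), W.copy()
        Wp[idx] += eps; Wm[idx] -= eps
        g[idx] = (loss_fn(F, Wp, labels) - loss_fn(F, Wm, labels)) / (2 * eps)
    return g

rng = np.random.default_rng(1)
F = rng.normal(size=(16, 4))            # stand-in features from "the network"
labels = rng.integers(0, 2, size=16)    # live / spoof targets
W = rng.normal(size=(4, 2))
history = []
for _ in range(40):                     # step S4: repeated gradient-descent updates
    history.append(loss_fn(F, W, labels))
    W -= 0.3 * numerical_grad(F, W, labels)
```

A real training loop would back-propagate analytic gradients through the whole CNN; the numerical gradient here only demonstrates that descending this loss does drive the supervised loss value down.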
It can be understood that this embodiment uses a common Convolutional Neural Network (CNN) for recognition. A CNN mainly comprises convolutional layers, pooling layers, fully-connected layers, and a loss layer; the CNN of this embodiment supervises gradient descent with Am-softmax Loss, whereas a common face-recognition CNN is configured only with softmax Loss as its gradient-descent supervision.
It can be understood that, in this embodiment, the introduced Am-Softmax Loss, used as the loss function, can establish an angular boundary between different samples and form a large-angle-margin supervision function between them, pulling the samples further apart and enlarging the inter-class distance, so the trained recognition network detects liveness more accurately and is less disturbed by attack faces.
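To make the "angular boundary" concrete: with s = 1, the target ("live") logit beats a competing logit only once the target cosine exceeds the competitor's by the margin m, so even a correctly classified sample keeps feeling loss pressure until the angular gap is large. A toy check (all numbers invented for illustration):

```python
# Cosines of a sample's feature to the live / spoof weight vectors.
cos_live, cos_spoof = 0.9, 0.5
m = 0.5

# Plain softmax compares raw cosines: 0.9 > 0.5, confidently "live".
plain_margin = cos_live - cos_spoof

# Am-softmax subtracts m from the target cosine during training, so the
# "live" logit wins only once cos_live - m > cos_spoof; the decision
# boundary is pushed m (in cosine units) into the target class.
am_logit_live = cos_live - m
am_logit_spoof = cos_spoof
boundary_satisfied = am_logit_live > am_logit_spoof   # 0.4 > 0.5, not yet
```

Here the sample is already correct under plain softmax, yet the margined logit still loses, so training keeps pulling live and spoof features further apart, which is the enlarged inter-class distance the embodiment describes.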
Referring to fig. 2, a second embodiment of the invention further provides a face living body recognition network training system. For executing the living face recognition network training method in the first embodiment, the living face recognition network training system may include:
an image preprocessing unit 1 for implementing the above step S1, and acquiring a preprocessed image;
a prediction unit 2, configured to implement step S2, configured to perform feature extraction on the image, detect a face feature in the image, and obtain a classification prediction result;
a Loss supervision unit 3, configured to implement step S3 described above, and configured to calculate, based on the classification prediction result, a corresponding Loss function value by using an Am-softmax Loss function;
and a model updating unit 4, configured to implement step S4 described above, and configured to use a gradient descent algorithm to reversely obtain a gradient based on the loss function value, and update the model parameters of the identification network.
Referring to fig. 3, a third embodiment of the present invention provides an electronic device for implementing the face living body recognition network training method. The electronic device includes a memory 10 and a processor 20; the memory 10 stores a computer program configured, when executed, to perform the steps of any of the above embodiments of the face living body recognition network training method, and the processor 20 is configured to perform those steps through the computer program.
Optionally, in this embodiment, the electronic device may be located in at least one of a plurality of network devices of a computer network.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart.
The program, when executed by a processor, performs the above-described functions defined in the method of the present application. It should be noted that the computer-readable medium described herein may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes an image preprocessing unit, a prediction unit, a loss supervision unit, and a model update unit. The names of these units do not in some cases constitute a limitation on the unit itself, and for example, the image preprocessing unit may also be described as a "unit for acquiring a preprocessed image".
As another aspect, the present application also provides a computer memory, which may be included in the apparatus described in the above embodiments, or may be present separately and not assembled into the device. The computer memory carries one or more programs that, when executed by the apparatus, cause the apparatus to: acquire a preprocessed image; extract features from the image, detect the face features in the image, and obtain a classification prediction result; calculate the corresponding loss function value with an Am-softmax Loss function based on the classification prediction result; and, based on the loss function value, compute the gradient by back-propagation with a gradient descent algorithm and update the model parameters of the recognition network.
The present invention is not limited to the above preferred embodiments, and any modifications, equivalent alterations and improvements made within the spirit of the present invention should be included in the scope of the present invention.

Claims (6)

1. A face living body recognition network training method, characterized in that the method comprises the following steps:
acquiring a preprocessed image;
extracting the features of the image, detecting the face features in the image and obtaining a classification prediction result;
based on the classification prediction result, adopting an Am-softmax Loss function to calculate a corresponding Loss function value;
and, based on the loss function value, computing the gradient by back-propagation with a gradient descent algorithm, and updating the model parameters of the recognition network.
2. The human face living body recognition network training method as claimed in claim 1, wherein: the Am-softmax Loss function is specifically as follows:
$$L_{AMS} = -\frac{1}{n}\sum_{i=1}^{n}\log\frac{e^{s(\cos\theta_{y_i}-m)}}{e^{s(\cos\theta_{y_i}-m)}+\sum_{j=1,\;j\neq y_i}^{c} e^{s\cos\theta_j}}$$
3. the human face living body recognition network training method as claimed in claim 2, characterized in that: in the formula, the scaling factor s is set to 1, and m is set to 0.5.
4. The human face living body recognition network training method as claimed in claim 1, characterized in that: the pre-processing of the image includes image enhancement and/or image scaling processing.
5. A living human face recognition network system, comprising:
the image preprocessing unit is used for acquiring a preprocessed image;
the prediction unit is used for extracting the features of the image, detecting the face features in the image and obtaining a classification prediction result;
the Loss supervision unit is used for calculating a corresponding Loss function value by adopting an Am-softmax Loss function based on the classification prediction result;
and the model updating unit is used for reversely solving the gradient by adopting a gradient descent algorithm based on the loss function value and updating the model parameters of the identification network.
6. An electronic device comprising a memory and a processor, characterized in that: the memory stores a computer program configured to execute the live human face recognition network training method of any one of claims 1 to 4 when running;
the processor is arranged to execute the live human face recognition network training method of any one of claims 1 to 4 by the computer program.
CN202210070713.XA 2022-01-20 2022-01-20 Network training method and system for face living body recognition and electronic equipment Pending CN114445917A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210070713.XA CN114445917A (en) 2022-01-20 2022-01-20 Network training method and system for face living body recognition and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210070713.XA CN114445917A (en) 2022-01-20 2022-01-20 Network training method and system for face living body recognition and electronic equipment

Publications (1)

Publication Number Publication Date
CN114445917A true CN114445917A (en) 2022-05-06

Family

ID=81367810

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210070713.XA Pending CN114445917A (en) 2022-01-20 2022-01-20 Network training method and system for face living body recognition and electronic equipment

Country Status (1)

Country Link
CN (1) CN114445917A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115116111A (en) * 2022-06-24 2022-09-27 北京百度网讯科技有限公司 Anti-disturbance human face living body detection model training method and device and electronic equipment
CN115116111B (en) * 2022-06-24 2023-03-24 北京百度网讯科技有限公司 Anti-disturbance human face living body detection model training method and device and electronic equipment
CN115082994A (en) * 2022-06-27 2022-09-20 平安银行股份有限公司 Face living body detection method, and training method and device of living body detection network model

Similar Documents

Publication Publication Date Title
CN111178183B (en) Face detection method and related device
WO2021139324A1 (en) Image recognition method and apparatus, computer-readable storage medium and electronic device
JP2022141931A (en) Method and device for training living body detection model, method and apparatus for living body detection, electronic apparatus, storage medium, and computer program
CN114445917A (en) Network training method and system for face living body recognition and electronic equipment
CN116579616B (en) Risk identification method based on deep learning
CN113869449A (en) Model training method, image processing method, device, equipment and storage medium
CN116910752B (en) Malicious code detection method based on big data
CN108615006B (en) Method and apparatus for outputting information
CN113986674A (en) Method and device for detecting abnormity of time sequence data and electronic equipment
CN111444807A (en) Target detection method, device, electronic equipment and computer readable medium
CN112989334A (en) Data detection method for machine learning and related equipment
CN114120138A (en) Method, device, equipment and medium for detecting and identifying remote sensing image target
CN115294332A (en) Image processing method, device, equipment and storage medium
CN114241370A (en) Intrusion identification method and device based on digital twin transformer substation and computer equipment
CN109635823A (en) The method and apparatus and engineering machinery of elevator disorder cable for identification
CN117690164B (en) Airport bird identification and driving method and system based on edge calculation
CN116305119A (en) APT malicious software classification method and device based on predictive guidance prototype
CN116311214A (en) License plate recognition method and device
KR101467307B1 (en) Method and apparatus for counting pedestrians using artificial neural network model
CN114140879A (en) Behavior identification method and device based on multi-head cascade attention network and time convolution network
CN113963211A (en) Unsupervised domain adaptation training method and unsupervised domain adaptation training system for gesture recognition
CN112614116B (en) Digital image tampering detection method and system
CN116311106B (en) Training method, device, equipment and medium for occlusion image recognition model
CN111291624B (en) Excavator target identification method and system
US20240203095A1 (en) Method, device, and computer program product for verifying classification result

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination