CN109034013B - Face image recognition method, device and storage medium


Info

Publication number
CN109034013B
CN109034013B (application CN201810750438.XA)
Authority
CN
China
Prior art keywords
face
frame
face image
images
score
Prior art date
Legal status
Active
Application number
CN201810750438.XA
Other languages
Chinese (zh)
Other versions
CN109034013A (en
Inventor
陈志博
石楷弘
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201810750438.XA priority Critical patent/CN109034013B/en
Publication of CN109034013A publication Critical patent/CN109034013A/en
Application granted granted Critical
Publication of CN109034013B publication Critical patent/CN109034013B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/166Detection; Localisation; Normalisation using acquisition arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G07CHECKING-DEVICES
    • G07CTIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
    • G07C9/00Individual registration on entry or exit
    • G07C9/30Individual registration on entry or exit not involving the use of a pass
    • G07C9/32Individual registration on entry or exit not involving the use of a pass in combination with an identity check
    • G07C9/37Individual registration on entry or exit not involving the use of a pass in combination with an identity check using biometric data, e.g. fingerprints, iris scans or voice recognition
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiment of the invention discloses a face image recognition method, device and storage medium. Multiple frames of face images are collected, and a face region to be recognized is determined from them; pose detection is performed on the face in each frame according to the face region to obtain pose parameters; the moving speed of the face image between every two adjacent frames is acquired, and the sharpness corresponding to each frame is determined according to that speed; a face image meeting a preset condition is then screened from the multiple frames according to the pose parameters and the sharpness to obtain a target face image; and face recognition is performed on the target face image. With this scheme, face images that meet the quality requirements can be screened from the collected frames for recognition, avoiding erroneous results caused by recognizing poor-quality images and improving the accuracy of face image recognition.

Description

Face image recognition method, device and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method and apparatus for face image recognition, and a storage medium.
Background
With the growing demands of applications such as information security, access control and video surveillance, face recognition technology has attracted increasing attention and is widely used in daily life. For example, it can be applied in face-based access control scenarios to provide automated security checks for homeowners, security personnel and the like.
In the prior art, during face image recognition, a face image of the user is generally collected through an access-control camera and the face in it is recognized directly; the recognition result determines whether the user is authorized to open the access control: if so, it is opened, and if not, it stays closed.
In researching and practicing the prior art, the inventor of the invention found that recognizing the collected face images directly is inaccurate: poor-quality images can cause false recognition and similar problems, leading to security risks.
Disclosure of Invention
The embodiments of the invention provide a face image recognition method, device and storage medium, aiming to improve the accuracy of face image recognition.
In order to solve the technical problems, the embodiment of the invention provides the following technical scheme:
a face image recognition method, comprising:
collecting multiple frames of face images, and determining a face region to be recognized according to the multiple frames of face images;
performing pose detection on the face in each frame of face image according to the face region to obtain pose parameters;
acquiring the moving speed of the face image between every two adjacent frames of the multiple frames of face images, and determining the sharpness corresponding to each frame of face image according to the moving speed;
screening, according to the pose parameters and the sharpness, a face image meeting a preset condition from the multiple frames of face images to obtain a target face image;
and performing face recognition on the target face image.
A face image recognition apparatus comprising:
a determining unit, configured to collect multiple frames of face images and determine a face region to be recognized according to the multiple frames of face images;
a detection unit, configured to perform pose detection on the face in each frame of face image according to the face region to obtain pose parameters;
an acquisition unit, configured to acquire the moving speed of the face image between every two adjacent frames of the multiple frames of face images and determine the sharpness corresponding to each frame of face image according to the moving speed;
a screening unit, configured to screen, according to the pose parameters and the sharpness, a face image meeting a preset condition from the multiple frames of face images to obtain a target face image;
and a recognition unit, configured to perform face recognition on the target face image.
Optionally, the determining unit includes:
an overlap ratio obtaining subunit, configured to obtain the overlap ratio of the face region between every two adjacent frames of the multiple frames of face images, yielding multiple overlap ratios;
and a first screening subunit, configured to screen, according to the multiple overlap ratios, the face region corresponding to the highest overlap ratio from the multiple frames of face images to obtain the face region to be recognized.
Optionally, the overlap ratio obtaining subunit is specifically configured to:
acquire the intersection area of the face regions between every two adjacent frames of the multiple frames of face images;
acquire the union area of the face regions between every two adjacent frames of the multiple frames of face images;
and calculate the overlap ratio of the face regions between every two adjacent frames according to the intersection area and the union area, yielding multiple overlap ratios.
Optionally, the detection unit includes:
a score obtaining subunit, configured to obtain a face area score of the face region in each frame of face image;
a second screening subunit, configured to screen, from the multiple frames of face images, the face images whose face area score is greater than a preset value to obtain screened face images;
and a detection subunit, configured to perform pose detection on the faces in the screened face images to obtain pose parameters.
Optionally, the score obtaining subunit is specifically configured to:
acquire the area of the face region in each frame of face image;
acquire a first mapping relationship between the area and the area score;
and determine the face area score of the face region in each frame of face image according to the first mapping relationship.
Optionally, the detection subunit includes:
a parameter acquisition module, configured to acquire the deflection parameter of the face in the face region of each frame of the screened face images;
a mapping relationship acquisition module, configured to acquire a second mapping relationship between the deflection parameter and the pose score;
and a determining module, configured to determine the pose score of the face region in each frame of face image according to the second mapping relationship and set the pose score as the pose parameter.
Optionally, the parameter obtaining module is specifically configured to:
acquire a first projection parameter of the face in the face region of each frame of the screened face images projected onto a two-dimensional plane;
acquire a second projection parameter of a preset face model projected onto the two-dimensional plane;
and acquire the deflection parameter of the face in the face region according to the first projection parameter and the second projection parameter.
Optionally, the acquiring unit is specifically configured to:
acquire the distance and the time interval between every two adjacent frames of the multiple frames of face images;
calculate the moving speed of each frame of face image according to the distance and the time interval;
acquire a third mapping relationship between the moving speed and the sharpness score;
and determine the sharpness score of each frame of face image according to the third mapping relationship, and determine the sharpness corresponding to each frame of face image according to the sharpness score.
Optionally, the screening unit includes:
a calculating subunit, configured to calculate the face quality score of each frame of face image according to the pose parameters and the sharpness;
and a third screening subunit, configured to screen, from the multiple frames of face images, the face image whose face quality score is greater than a preset threshold to obtain the target face image.
Optionally, the computing subunit is specifically configured to:
determine the face area score, pose score and sharpness score of each frame of face image according to the pose parameters and the sharpness;
set corresponding weight values for the face area score, the pose score and the sharpness score respectively;
and calculate the face quality score of each frame of face image according to the face area score, the pose score, the sharpness score and their corresponding weight values.
Optionally, the screening unit further comprises:
and a fourth screening subunit, configured to screen, when none of the face quality scores is greater than the preset threshold, the face image corresponding to the largest face quality score from the multiple frames of face images to obtain the target face image.
Optionally, the third screening subunit is specifically configured to:
screen, from the face quality scores, those greater than a preset threshold to obtain candidate face quality scores;
when there are multiple candidate face quality scores, select one of them according to a preset algorithm to obtain a target face quality score;
and determine, from the multiple frames of face images, the face image corresponding to the target face quality score to obtain the target face image.
Optionally, the third screening subunit is specifically configured to:
take the first frame of the multiple frames of face images as the current face image;
compare the face quality score of the current face image with a preset threshold;
and if the face quality score of the current face image is smaller than the preset threshold, take the second frame of the multiple frames of face images as the current face image and return to the step of comparing the face quality score of the current face image with the preset threshold, until the face quality score of the current face image is greater than the preset threshold, to obtain the target face image.
A storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps of any of the face image recognition methods provided by the embodiments of the present invention.
In the embodiment of the invention, multiple frames of face images can be collected and a face region to be recognized determined from them; pose detection is then performed on the face in each frame according to the face region to obtain pose parameters; the moving speed of the face image between every two adjacent frames is acquired and the sharpness corresponding to each frame determined according to that speed. A face image meeting a preset condition can then be screened from the multiple frames according to the pose parameters and the sharpness to obtain a target face image, i.e., an image of comparatively good quality, on which face recognition is performed. With this scheme, the best-quality face image that meets the conditions can be screened from the collected frames for recognition, avoiding the erroneous results caused by recognizing poor-quality images and improving the accuracy of face image recognition.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic view of a face image recognition system according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of a face image recognition method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a first mapping relationship between the area and the area score provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a second mapping relationship between the deflection parameter and the pose score provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of a third mapping relationship between the moving speed and the sharpness score provided by an embodiment of the present invention;
fig. 6 is another flow chart of a face image recognition method according to an embodiment of the present invention;
fig. 7 is a schematic diagram of determining a face area to be identified according to an embodiment of the present invention;
fig. 8 is a schematic diagram of determining the face area score of a face image according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of determining pose scores of face images according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of determining a sharpness score of a face image according to an embodiment of the present invention;
fig. 11 is a schematic structural diagram of a face image recognition device according to an embodiment of the present invention;
fig. 12 is another schematic structural diagram of a face image recognition device according to an embodiment of the present invention;
fig. 13 is another schematic structural diagram of a face image recognition device according to an embodiment of the present invention;
fig. 14 is another schematic structural diagram of a face image recognition device according to an embodiment of the present invention;
fig. 15 is a schematic structural diagram of a network device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
The embodiment of the invention provides a face image recognition method, a face image recognition device and a storage medium.
Referring to fig. 1, fig. 1 is a schematic view of a face image recognition system provided by an embodiment of the present invention. The face image recognition system may include a face image recognition device, which may be integrated in a network device such as a terminal or a server. The network device may receive multiple frames of face images sent by an access control or another terminal. The access control may be installed at a company, airport, market, school, residential district, house door or similar place; its form may be set flexibly according to actual needs, and it may collect face images in real time or at preset intervals through a preset image collector (for example, a camera) when a user enters and/or exits, sending the collected images to the network device. The network device may determine the face region to be recognized according to the multiple frames of face images; for example, it may detect one or more face regions contained in the images and determine the face region corresponding to the highest overlap ratio as the face region to be recognized, of which there may be only one. Then, pose detection is performed on the face in each frame according to the face region to obtain pose parameters. The pose parameters may include the face area score and the pose score of the face region in each frame: the face area score is related to the area of the face region, and the pose score is related to the deflection parameter of the face in the face region. For example, the face area score of the face region in each frame may be obtained, the face images whose face area score is greater than a preset value screened out, and pose detection performed on the faces in the screened images. The moving speed of the face image between every two adjacent frames is acquired, and the sharpness corresponding to each frame is determined according to that speed; the sharpness is related to the sharpness score, which in turn is related to the moving speed of the face region (that is, of the face image). At this point, a face image meeting a preset condition can be screened from the multiple frames according to the pose parameters and the sharpness to obtain a target face image of relatively high quality; for example, the face image whose face quality score is greater than a preset threshold may be selected. Finally, face recognition can be performed on the target face image; when the recognized face matches a pre-stored face, the user has the right to open the access control, and the network device may send a control instruction to the access control to open it and let the user pass.
It should be noted that the schematic view of the face image recognition system shown in fig. 1 is only an example; the system and scene described in the embodiments of the present invention are intended to explain the technical solution more clearly and do not limit it. As a person of ordinary skill in the art will appreciate, with the evolution of face image recognition systems and the appearance of new service scenarios, the technical solution provided by the embodiments of the present invention is equally applicable to similar technical problems.
The following will describe in detail.
In the present embodiment, description will be made from the viewpoint of a face image recognition apparatus which may be integrated in a network device such as a server or a terminal.
A face image recognition method, comprising: collecting multiple frames of face images, and determining a face region to be recognized according to the multiple frames of face images; performing pose detection on the face in each frame of face image according to the face region to obtain pose parameters; acquiring the moving speed of the face image between every two adjacent frames of the multiple frames of face images, and determining the sharpness corresponding to each frame according to the moving speed; screening, according to the pose parameters and the sharpness, a face image meeting a preset condition from the multiple frames to obtain a target face image; and performing face recognition on the target face image.
Referring to fig. 2, fig. 2 is a flowchart of a face image recognition method according to an embodiment of the invention. The face image recognition method may include:
in step S101, a plurality of frames of face images are collected, and a face region to be recognized is determined according to the plurality of frames of face images.
The face image recognition device may continuously collect multiple frames of face images of a user through a preset camera, video camera, still camera or the like, or may receive multiple frames of face images sent by a terminal, a server or the like.
The multi-frame face image may be an image continuously acquired at intervals of a preset time within a preset time period, for example, 30 frames of images continuously acquired at intervals of 2 seconds within 1 minute. The face image may include one or more faces, and may further include other objects, and the specific content is not limited herein.
In order to identify a required face image, the face image identifying device may track faces in a plurality of frames of face images so as to determine a face area to be identified.
In some embodiments, the step of determining the face region to be recognized from the multiple frames of face images may include:
acquiring the overlap ratio of the face region between every two adjacent frames of the multiple frames of face images, yielding multiple overlap ratios; and screening, according to the multiple overlap ratios, the face region corresponding to the highest overlap ratio from the multiple frames to obtain the face region to be recognized.
The face image recognition device can run a face tracking algorithm to detect the face in each frame and calculate the overlap ratio of the face regions between each pair of adjacent frames, obtaining one overlap ratio per pair; together these form the multiple overlap ratios. The overlap ratio measures the overlapping area between the face regions of the same person in two adjacent frames. After the multiple overlap ratios are obtained, the pair of adjacent frames corresponding to the highest overlap ratio can be selected, and the face region in that pair is the face region to be recognized, which may also be called the tracked face region. The shape of the face region can be set flexibly according to actual needs; for example, it may be a rectangular, square or circular region.
For example, when the face images contain the faces of a first user and a second user, the overlap ratios of the first user's face region and of the second user's face region may be calculated separately; if the overlap ratio of the first user's face region is greater, the first user's face region is determined as the face region to be recognized.
In some embodiments, the step of obtaining the overlap ratio of the face region between every two adjacent frames of the multiple frames of face images to obtain multiple overlap ratios may include:
acquiring the intersection area of the face regions between every two adjacent frames of the multiple frames of face images;
acquiring the union area of the face regions between every two adjacent frames of the multiple frames of face images;
and calculating the overlap ratio of the face regions between every two adjacent frames according to the intersection area and the union area, yielding multiple overlap ratios.
The overlap ratio of the face regions between two adjacent frames of face images can be calculated as in formula (1):

IOU(A, B) = |A ∩ B| / |A ∪ B|    (1)

In formula (1), IOU(A, B) denotes the overlap ratio of the face regions between adjacent frames A and B; |A ∩ B| denotes the intersection area of the face regions between frames A and B, i.e., the overlapping area of the same person's face regions in the two frames; and |A ∪ B| denotes the union area of the face regions between frames A and B, i.e., the combined area of the same person's face regions in the two frames.

For each pair of adjacent frames among the multiple frames of face images, the overlap ratio of the corresponding face regions can be calculated with formula (1). When a face image contains multiple faces, the overlap ratio of each person's face regions can be calculated separately for each pair of adjacent frames.
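Formula (1) can be implemented in a few lines. The following Python sketch (not part of the patent; the (x, y, w, h) rectangle format and the function name are assumptions) computes the overlap ratio of two axis-aligned face rectangles:

```python
def iou(box_a, box_b):
    """Overlap ratio (IoU) of two face rectangles, each given as
    (x, y, w, h) with (x, y) the top-left corner -- formula (1)."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    # Intersection rectangle (zero-sized if the boxes do not overlap).
    ix = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    iy = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = ix * iy
    union = aw * ah + bw * bh - inter  # |A ∪ B| = |A| + |B| - |A ∩ B|
    return inter / union if union > 0 else 0.0

# Face region of the same person in two adjacent frames:
print(iou((100, 100, 80, 80), (110, 105, 80, 80)))  # ≈ 0.70
```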
In step S102, pose detection is performed on the face in each frame of face image according to the face region to obtain pose parameters.
In some embodiments, the step of performing pose detection on the face in each frame according to the face region to obtain the pose parameters may include:
acquiring the face area score of the face region in each frame of face image; screening, from the multiple frames, the face images whose face area score is greater than a preset value to obtain screened face images; and performing pose detection on the faces in the screened face images to obtain the pose parameters.
To improve the accuracy of pose detection, the collected frames can first be screened; for example, the face images whose face-region area meets the condition can be kept. Specifically, the face area score of the face region in each frame may be obtained; this score is related to, and increases with, the area (i.e., the size) of the face region. The face images whose face area score is greater than a preset value, which can be set flexibly according to actual needs, are then screened out, keeping the frames with larger face regions. The screened face images may contain one or more faces. Pose detection can then be performed on the faces in the screened images to obtain the pose parameters, which may include a pose score related to the deflection angle of the face relative to a frontal face.
In some embodiments, the step of acquiring the face area score of the face region in each frame of face image may include:
acquiring the area of the face region in each frame of face image;
acquiring a first mapping relationship between the area and the area score;
and determining the face area score of the face region in each frame of face image according to the first mapping relationship.
The face image recognition apparatus may preset the first mapping relationship between the area of the face region and the area score, which may be stored locally (for example, as a list or text) or on a server. The first mapping relationship may be evaluated with a sigmoid function, as in formula (2):

f(x) = 1 / (1 + e^(−x))    (2)

where x represents the area of the face region and f(x) represents the face area score.
For example, as shown in fig. 3, the larger the area of the face region, the larger the area score; conversely, the smaller the area, the smaller the score. Once the area of the face region is large enough, the score no longer increases with the area. The value range of the area score f(x) can be set flexibly according to actual needs and is not limited here; for example, it may be set to the range 0 to 1.
After determining the face region to be recognized based on the face tracking result, the face image recognition device may calculate the area (i.e., the size) of the face region in each frame. For example, when the face region is a rectangle, its area can be calculated from its length and width; when it is a circle, from its radius.
The face image recognition device may then obtain the first mapping relationship between the area and the area score from local storage, a server or the like, and look up the score corresponding to the area of the face region in each frame, obtaining the face area score of the face region in each frame. For example, if the area of the face region in frame A is A1, its face area score is a1; if the area of the face region in frame B is B1, its face area score is b1.
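As an illustration of formula (2), here is a minimal Python sketch of the first mapping relationship. Raw pixel areas are large numbers, so the sketch assumes the area is first normalized by a reference face size; the normalization constant is an assumption, not something the patent specifies:

```python
import math

def face_area_score(area_px, ref_area_px=96 * 96):
    """First mapping relationship, formula (2): sigmoid of the face-region
    area. ref_area_px is an assumed normalization so that a face of the
    reference size maps to a mid-range score."""
    x = area_px / ref_area_px - 1.0   # assumed normalization, centred on ref
    return 1.0 / (1.0 + math.exp(-x))  # f(x) in (0, 1)

print(face_area_score(40 * 40))    # small face  -> low score (~0.30)
print(face_area_score(200 * 200))  # large face  -> saturates near 1
```

The saturation of the sigmoid reproduces the behaviour described for fig. 3: beyond a certain size, a larger face no longer raises the score.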
In some embodiments, the step of performing pose detection on the faces in the screened face images to obtain the pose parameters may include:
acquiring the deflection parameter of the face in the face region of each frame of the screened face images;
acquiring a second mapping relationship between the deflection parameter and the pose score;
and determining the pose score of the face region in each frame according to the second mapping relationship, and setting the pose score as the pose parameter.
The face image recognition apparatus may preset the second mapping relationship between the deflection parameter and the pose score, which may be stored locally (for example, as a list or text) or on a server. The second mapping relationship may be evaluated with the standard normal distribution, as in formula (3):

g(x) = (1/√(2π)) · e^(−x²/2)    (3)

where x represents the deflection parameter, i.e., the deflection angle of the face relative to a frontal face (the deflection angle of the frontal face may be set to 0), and g(x) represents the pose score.

For example, as shown in fig. 4, in the first interval [a, b], the larger the deflection parameter (the face deflects further to the right), the smaller the pose score g(x); in the second interval [b, c], the larger the deflection parameter (the face deflects further to the left), the smaller the pose score g(x). When g(x) reaches its maximum, the face is frontal, i.e., the user directly faces the camera collecting the image. The value range of g(x) can be set flexibly according to actual needs and is not limited here; for example, it may be set to the range 0 to 1.

After determining the face region to be recognized based on the face tracking result and screening out the face images whose face area score is greater than the preset value, the face image recognition device can acquire the deflection parameter of the face in the face region of each screened frame. The deflection parameter may be the deflection angles of the face about the x, y and z axes, or may comprise a pitch angle, a yaw angle and a roll angle. The device then obtains the second mapping relationship from local storage, a server or the like and looks up the pose score corresponding to the deflection parameter of the face region in each frame. For example, if the deflection parameter of the face region in frame A is θ1, its pose score is a2; if the deflection parameter of the face region in frame B is θ2, its pose score is b2.
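A minimal Python sketch of the second mapping relationship of formula (3), scoring a single deflection angle; the scaling of the angle in degrees to the function's input is an assumed calibration, not given in the patent:

```python
import math

def pose_score(yaw_deg, scale_deg=30.0):
    """Second mapping relationship, formula (3): standard normal density of
    the deflection parameter. A frontal face (yaw 0) scores highest; the
    scale of 30 degrees per unit is an assumed calibration."""
    x = yaw_deg / scale_deg
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

print(pose_score(0))   # frontal face, maximum score ≈ 0.399
print(pose_score(45))  # strongly turned face, much lower score ≈ 0.130
```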
In some embodiments, the step of acquiring the deflection parameter of the face in the face region of each frame of the screened face images may include:
acquiring a first projection parameter of the face in the face region of each frame of the screened face images projected onto a two-dimensional plane;
acquiring a second projection parameter of a preset face model projected onto the two-dimensional plane;
and acquiring the deflection parameter of the face in the face region according to the first projection parameter and the second projection parameter.
Specifically, the face image recognition device may obtain the first projection parameter of the face in the face region of each frame projected onto a two-dimensional plane (i.e., the coordinate points of the face in the plane) and the second projection parameter of a preset face model projected onto the same plane (i.e., the coordinate points of the model in the plane). The preset face model may be a preset three-dimensional average face model; three-dimensional face models of users with different expressions and identities can be generated by changing the positions of the three-dimensional coordinate points of the average model. That is, every three-dimensional face model can be represented as the three-dimensional average face model plus offsets to its coordinate points, as in formula (4):

M = M̄ + A_id · P_id + A_exp · P_exp    (4)

where M represents the three-dimensional face model of the face; M̄ represents the preset face model (the three-dimensional average face model), which contains N three-dimensional coordinate points; A_id · P_id is the identity offset term, with A_id an identity basis of dimension m_id and P_id the identity parameters; and A_exp · P_exp is the expression offset term, with A_exp an expression basis of dimension m_exp and P_exp the expression parameters.
After obtaining the first projection parameter of the face and the second projection parameter of the preset face model in the two-dimensional plane, the face image recognition device may fit the three-dimensional coordinate points of the face by minimizing the reprojection error, as in formulas (5) and (6):

arg min ||X_3d − X_2d||    (5)

X_3d = p(M_keypoint)    (6)

where M_keypoint represents the three-dimensional keypoints of the preset face model, p is the projection function, X_3d = p(M_keypoint) is the projection of those keypoints in the two-dimensional plane (the second projection parameter), and X_2d is the first projection parameter of the face in the two-dimensional plane.
The projection may be an orthographic projection, a perspective projection, or the like. Taking orthographic projection for three-dimensional modeling as an example, the projection can be performed as in formula (7):

X_3d = S × R × M_keypoint + T    (7)

where S is a scaling coefficient, R is a rotation coefficient, and T is a two-dimensional translation component. Formula (5) can thus be solved iteratively through formulas (4), (6) and (7) to obtain the identity parameters P_id, the expression parameters P_exp, the scaling coefficient S, the rotation coefficient R and the translation component T; from the parameters [P_id, P_exp, S, R, T], the three-dimensional coordinate points are obtained and the three-dimensional face model is generated. In other words, if the projection of the three-dimensional face model at some angle onto the two-dimensional plane matches the face in the two-dimensional image, that model is the three-dimensional face model to be acquired.

After the three-dimensional model is obtained, the deflection parameter of the face in the face region of the face image can be obtained from it: the deflection parameter is the rotation coefficient R in formula (7).
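As an illustration of formulas (5) to (7), the following sketch fits the scaled-orthographic parameters S, R and T of formula (7) for a fixed set of model keypoints (the identity and expression offsets of formula (4) are omitted for brevity). It relies on scipy.optimize.least_squares and parameterizes R by three Euler angles; both choices are assumptions, not the patent's prescribed solver:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def project(params, model_pts):
    """Formula (7): X_3d = S * R * M_keypoint + T (orthographic: keep x, y)."""
    s, rx, ry, rz, tx, ty = params
    R = Rotation.from_euler("xyz", [rx, ry, rz]).as_matrix()
    return s * (model_pts @ R.T)[:, :2] + np.array([tx, ty])

def fit_pose(model_pts, image_pts):
    """Formula (5): minimize ||X_3d - X_2d|| over [S, R, T].
    model_pts: (N, 3) keypoints of the average face model.
    image_pts: (N, 2) detected face keypoints in the image."""
    def residual(params):
        return (project(params, model_pts) - image_pts).ravel()
    x0 = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # start: no rotation
    sol = least_squares(residual, x0)
    s, rx, ry, rz, tx, ty = sol.x
    return np.degrees([rx, ry, rz])  # deflection parameters (rotation R)

# Synthetic check: a model rotated 20 degrees about y (yaw) should be
# recovered (up to local-minimum and Euler-convention caveats).
model = np.random.default_rng(0).normal(size=(68, 3))
R_true = Rotation.from_euler("xyz", [0, 20, 0], degrees=True).as_matrix()
obs = (model @ R_true.T)[:, :2]
print(fit_pose(model, obs))  # ≈ [0, 20, 0]
```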
Alternatively, the face image recognition device may calculate the deflection parameter of the face in each frame's face region with a face pose estimation algorithm. For example, it may acquire feature points of the facial organs (eyes, nose, mouth, etc.) in the face region of each frame, calculate the proportional relationships between the organs from those feature points, and determine the deflection parameter from the proportions. Or it may rotate the three-dimensional average face model by some angle until the two-dimensional projection of the three-dimensional feature points of the facial organs (eyes, eyebrows, nose, mouth, etc.) on the model coincides, or coincides as closely as possible, with the two-dimensional feature points of the organs in the face image; the rotation angle of the model is then the deflection parameter of the face.
For example, the positions of feature points such as the left outer eye corner 1, the right outer eye corner 2, the nose tip 3, the left mouth corner 4 and the right mouth corner 5 in the face region can be obtained. Take the midpoint 6 of the two outer eye corners and the midpoint 8 of the two mouth corners, and connect them to obtain segment 68 (the line between points 6 and 8; segments 67, 78 and 37 below are named the same way). Let point 7 be the foot of the perpendicular from the nose tip 3 to segment 68. The pitch angle is determined from the ratio of segment 67 to segment 78, the yaw angle from the ratio of segment 37 to segment 68, and the roll angle from the angle between segment 68 and the vertical direction; together these are the deflection parameters, as the sketch below illustrates.
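A minimal sketch of the landmark-ratio estimate just described. The patent specifies which segment ratios determine which angle but not how the ratios are converted into angles, so the sketch returns the raw pitch and yaw ratios and converts only the roll to degrees; the landmark coordinates are assumed 2D pixel positions:

```python
import math
import numpy as np

def pose_from_landmarks(eye_l, eye_r, nose, mouth_l, mouth_r):
    """Points 1-5 are the outer eye corners, nose tip and mouth corners;
    6 and 8 are midpoints, 7 is the foot of the perpendicular from the
    nose tip to segment 68."""
    p6 = (np.asarray(eye_l) + np.asarray(eye_r)) / 2.0      # midpoint 6
    p8 = (np.asarray(mouth_l) + np.asarray(mouth_r)) / 2.0  # midpoint 8
    v68 = p8 - p6
    t = np.dot(np.asarray(nose) - p6, v68) / np.dot(v68, v68)
    p7 = p6 + t * v68                                       # foot of perpendicular 7
    pitch_ratio = np.linalg.norm(p7 - p6) / np.linalg.norm(p8 - p7)      # |67|/|78|
    yaw_ratio = np.linalg.norm(p7 - np.asarray(nose)) / np.linalg.norm(v68)  # |37|/|68|
    roll_deg = math.degrees(math.atan2(v68[0], v68[1]))     # angle of 68 to vertical
    return pitch_ratio, yaw_ratio, roll_deg

# Roughly frontal, upright face: pitch ratio ≈ 1, yaw ratio ≈ 0, roll ≈ 0.
print(pose_from_landmarks((30, 40), (70, 40), (50, 60), (35, 80), (65, 80)))
```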
It should be noted that, in addition to the above-mentioned manner of obtaining the deflection parameter, the deflection parameter may be obtained by another manner, and the specific obtaining manner is not limited herein.
After the deflection parameter is obtained, the preset second mapping relationship between the deflection parameter and the pose score can be fetched from local storage, a server or the like, and the pose score corresponding to the deflection parameter looked up, giving the pose score of the face region in each frame of face image; the pose score is set as the pose parameter.
In step S103, the moving speed of the face image between every two adjacent frames of the multiple frames of face images is acquired, and the sharpness corresponding to each frame is determined according to the moving speed.
For example, when the multiple frames comprise a first, a second and a third frame, the first frame may be taken as the origin; the moving speed of the second frame is determined from the distance and time interval between the first and second frames, and the moving speed of the third frame from the distance and time interval between the second and third frames (or between the first and third frames). The moving speed may be that of the face image or of the face region; the greater the moving speed, the lower the sharpness, and conversely, the smaller the moving speed, the higher the sharpness.
In some embodiments, the step of acquiring the moving speed of the face image between every two adjacent frames and determining the sharpness corresponding to each frame according to the moving speed may include:
acquiring the distance and the time interval between every two adjacent frames of the multiple frames of face images;
calculating the moving speed of each frame of face image according to the distance and the time interval;
acquiring a third mapping relationship between the moving speed and the sharpness score;
and determining the sharpness score of each frame according to the third mapping relationship, and determining the sharpness corresponding to each frame according to the sharpness score.
The face image recognition apparatus may preset the third mapping relationship between the moving speed and the sharpness score, which may be stored locally (for example, as a list or text) or on a server. The third mapping relationship may be evaluated with an inverse proportional function, as in formula (8):

y = k / t    (8)

where y represents the sharpness score of the face region, k is a constant whose value can be set flexibly according to actual needs, and t represents the moving speed of the face image, i.e., of the face region.

For example, as shown in fig. 5, the greater the moving speed of the face region (i.e., of the face image), the smaller the sharpness score of the face region; conversely, the smaller the moving speed, the greater the sharpness score. The value range of the sharpness score can be set flexibly according to actual needs and is not limited here. For example, when the moving speed is B, the sharpness score is C.
When the time interval between every two adjacent frames is constant, a mapping relationship between the distance between two adjacent frames of face images and the sharpness score can be established directly: the greater the distance, the smaller the sharpness score; conversely, the smaller the distance, the greater the sharpness score.
After the face region to be recognized is determined based on the face tracking result, the face image recognition device can acquire the distance and time interval between every two adjacent frames, calculate the moving speed of each frame from them, obtain the third mapping relationship between the moving speed and the sharpness score from local storage, a server or the like, and look up the sharpness score corresponding to each frame's moving speed, giving the sharpness score of the face region in each frame. For example, if the moving speed of the face region in frame A is v1, its sharpness score is a3; if the moving speed of the face region in frame B is v2, its sharpness score is b3. The sharpness corresponding to each frame can then be determined from the sharpness score: the greater the score, the higher the sharpness, and conversely, the smaller the score, the lower the sharpness. For example, the sharpness corresponding to each frame's score can be determined according to the correspondence between sharpness score and sharpness.
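A minimal sketch of formula (8); the use of face-region centers to measure the distance and the value of the constant k are assumptions:

```python
import math

def sharpness_score(center_prev, center_curr, dt, k=50.0):
    """Third mapping relationship, formula (8): y = k / t, where t is the
    moving speed of the face region between two adjacent frames.
    center_prev / center_curr: face-region centers in pixels; dt: seconds.
    k is an assumed constant; the score is capped at 1.0 for a still face."""
    dist = math.dist(center_prev, center_curr)
    speed = dist / dt  # pixels per second
    if speed == 0:
        return 1.0
    return min(1.0, k / speed)

print(sharpness_score((100, 100), (104, 103), dt=0.1))  # slow -> high score
print(sharpness_score((100, 100), (160, 180), dt=0.1))  # fast -> low score
```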
In step S104, a face image meeting a preset condition is screened from the multiple frames of face images according to the pose parameters and the sharpness to obtain a target face image.
In step S105, face recognition is performed on the target face image.
The preset conditions can be flexibly set according to actual needs so as to screen out face images with higher quality.
In some embodiments, the step of screening, according to the pose parameters and the sharpness, a face image meeting a preset condition from the multiple frames to obtain the target face image may include:
calculating the face quality score of each frame of face image according to the pose parameters and the sharpness; and screening, from the multiple frames, the face image whose face quality score is greater than a preset threshold to obtain the target face image.
In some embodiments, the step of calculating the face quality score of each frame according to the pose parameters and the sharpness may include:
determining the face area score, the pose score and the sharpness score of each frame according to the pose parameters and the sharpness;
setting corresponding weight values for the face area score, the pose score and the sharpness score respectively;
and calculating the face quality score of each frame from the face area score, the pose score, the sharpness score and their corresponding weight values.
Since the pose parameters are related to the face area score and the pose score, and the sharpness is related to the sharpness score, the face image recognition device can obtain the face area score, pose score and sharpness score of the face region in each frame (i.e., of each frame) from the pose parameters and the sharpness. It can then compute a weighted sum of the three scores, giving the face quality score of each frame, i.e., of the face region in each frame.
For example, the face quality score may be calculated as in formula (9):

S = f × a + g × b + y × c    (9)

where S represents the face quality score, f the face area score and a its weight, g the pose score and b its weight, and y the sharpness score and c its weight. The weight values can be set flexibly according to actual needs, and formula (9) can be evaluated for each frame to obtain its face quality score.
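A minimal sketch of formula (9); the weight values a, b and c are assumed numbers to be tuned per deployment:

```python
def face_quality_score(area_score, pose_score, sharpness_score,
                       a=0.3, b=0.4, c=0.3):
    """Formula (9): S = f*a + g*b + y*c. The weights a, b, c are assumed
    values; they sum to 1 here so that the quality score stays in the
    same 0-1 range as its inputs."""
    return area_score * a + pose_score * b + sharpness_score * c

# Large, frontal, slow-moving face -> high quality score.
print(face_quality_score(0.9, 0.8, 0.95))  # 0.875
```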
It should be noted that the face area score, pose score, sharpness score and face quality score may also be calculated with other functions, or even with a neural network, so that a high-quality face image can be screened out for subsequent recognition.
To screen out a face image of higher quality, the preset condition may be that the face quality score is greater than a preset threshold. For example, the face image recognition device may screen, from the multiple frames, the face images whose face quality score is greater than the preset threshold, obtaining one or more target face images; when one target face image is obtained, face recognition is performed on it, and when multiple are obtained, face recognition is performed on one of them.
In some embodiments, the step of screening the face image whose face quality score is greater than the preset threshold from the multiple frames to obtain the target face image may include:
screening, from the face quality scores, those greater than a preset threshold to obtain candidate face quality scores;
when there are multiple candidate face quality scores, selecting one of them according to a preset algorithm to obtain a target face quality score;
and determining, from the multiple frames of face images, the face image corresponding to the target face quality score to obtain the target face image.
The face image recognition device may first collect multiple frames of face images continuously at preset intervals within a preset time period, then calculate the face quality score of each frame to obtain a set of face quality scores, compare each score in the set with the preset threshold, and screen out those greater than the threshold as candidate face quality scores. The preset threshold can be set flexibly according to actual needs, and there may be one or more candidates.
When there is a single candidate face quality score, face recognition can be performed directly on the corresponding face image (the target face image). When there are multiple candidates, one of them can be selected according to a preset algorithm to obtain the target face quality score; the algorithm can be set flexibly according to actual needs, for example selecting one candidate at random, selecting the candidate with the highest score, or selecting the candidate corresponding to the earliest frame. The face image corresponding to the target face quality score is then determined from the multiple frames, giving the target face image, on which face recognition is performed.
In some embodiments, the step of screening face images corresponding to face quality scores greater than a preset threshold from the multi-frame face images to obtain the target face image may include:
taking a first frame of face image in the multi-frame face images as a current face image;
comparing the face quality score of the current face image with a preset threshold value;
if the face quality score of the current face image is smaller than the preset threshold, taking a second frame of face image in the multi-frame face image as the current face image, and returning to the step of comparing the face quality score of the current face image with the preset threshold until the face quality score of the current face image is larger than the preset threshold, so as to obtain the target face image.
The face image recognition device may calculate face quality scores while the face images are still being collected within the preset time period; for example, the face quality scores of the first frames may be calculated as soon as at least two frames of face images have been collected. The first frame of face image whose face quality score has been calculated is taken as the current face image, its face quality score is compared with the preset threshold, and it is determined whether the score is greater than the threshold. If the face quality score of the current face image is greater than the preset threshold, face recognition is performed on the current face image. If it is not greater than the preset threshold, it is determined whether the detection time has reached the preset time period (which can be set flexibly according to actual needs); if it has, the face image recognition process ends, so as to avoid detection of unbounded duration. If the preset time period has not been reached, the second frame of face image is taken as the current face image, and the process returns to the step of comparing the face quality score of the current face image with the preset threshold, until the face quality score of the current face image is greater than the preset threshold; the target face image is thereby obtained, and face recognition is performed on it.
In some embodiments, after the step of calculating the face quality score of each frame of face image according to the pose parameter and the sharpness, the face image recognition method may further include:
when no face quality score is greater than the preset threshold, screening out the face image corresponding to the largest face quality score from the multiple frames of face images to obtain the target face image.
When the face image recognition device has continuously collected multiple frames of face images at preset intervals within the preset time period and no frame's face quality score is greater than the preset threshold, then, so that the collected face images can still be recognized, the face image corresponding to the largest face quality score can be screened out of the multiple frames to obtain the target face image, and face recognition is performed on it.
Alternatively, while the face image recognition device is collecting images and calculating face quality scores within the preset time period, if the detection time reaches the end of the period without any face image whose face quality score is greater than the preset threshold, then, so that the collected face images can still be recognized, the face image corresponding to the largest face quality score is selected from the multiple frames as the target face image, and face recognition is performed on it. In this way the face image of relatively better quality is always the one recognized, which ensures the stability and reliability of the face recognition result.
As can be seen from the foregoing, the embodiment of the present invention may collect multiple frames of face images and determine the face region to be identified from them; pose detection is then performed on the face in each frame according to the face region to obtain pose parameters (for example, the pose parameters are related to a face integral value and a pose score, where the face integral value is related to the area of the face region and the pose score to the deflection parameters of the face); the moving speed of the face image between every two adjacent frames is obtained, and the sharpness of each frame is determined from it, the sharpness being related to a sharpness score. The face image satisfying the preset condition can then be screened out of the multiple frames according to the pose parameters and the sharpness, obtaining the target face image (i.e., an image of better face image quality), and face recognition is performed on the target face image. This scheme screens out, from the collected multiple frames of face images, the face image of the best quality that satisfies the condition and performs face recognition on it, thereby avoiding the incorrect recognition results caused by performing face recognition on poor-quality face images (blurred, small face region, poor face pose, and so on) and improving the accuracy of face image recognition.
The method described in the above embodiments is described in further detail below by way of example.
In this embodiment, the face image recognition device is taken to be a network device applied in scenes such as a company, an airport, a shopping mall, a school, a residential community, or a resident's door, where it mainly performs the face recognition task of an access control system, providing automatic security detection services for householders, security guards, and the like. For example, the network device can control a camera to collect multiple frames of face images in front of the access control and, by means of face tracking, score evaluation, and the like, rapidly select from the frames collected in the face access control scene a high-quality face image (i.e., a face image whose face quality score is greater than a preset threshold) for recognition, so that the recognized face image has a large size, a good pose, high sharpness, and so on, improving the face recognition effect.
Referring to fig. 6, fig. 6 is another flow chart of the face image recognition method according to the embodiment of the invention. The method flow may include:
S201, the network equipment acquires multiple frames of face images, acquires the contact ratio of the face regions between every two adjacent frames of face images in the multiple frames, and determines the face region corresponding to the highest contact ratio as the face region to be identified.
The network device can control a camera, video camera, or the like preset at the access control to continuously acquire multiple frames of face images of a user; the multiple frames may be images continuously acquired at preset intervals within a preset time period, for example 20 frames continuously acquired at 3-second intervals within 1 minute. A face image may contain one or more faces, and may also contain other objects; the specific content is not limited here.
Then the network device may track the faces in the multiple frames of face images so as to determine the face region to be recognized. For example, the network device may obtain the intersection area and union area of the face regions between every two adjacent frames of face images, calculate the contact ratio of the face regions between every two adjacent frames according to the intersection area and the union area so as to obtain a plurality of contact ratios, and then screen out, from the multiple frames of face images, the face region corresponding to the highest contact ratio, obtaining the face region to be recognized.
For example, suppose user A is a resident of community A. When entering the community, user A must pass face recognition at the access control system, so user A walks to the gate and stands in front of the access control camera, and the network device controls the camera to continuously collect multiple frames of face images of user A. Meanwhile, passerby B happens to walk past, so some frames also capture part of user B's face. As shown in fig. 7, n frames of face images are acquired, including an a-th frame, a b-th frame, a c-th frame, and so on, where n can be set flexibly according to actual needs. The a-th and b-th frames both contain the faces of user A (the left face in fig. 7) and user B (the right face in fig. 7), while subsequent frames (e.g., the c-th frame) contain only the face of user A. The network device then performs tracking detection on the faces in the n frames: it calculates the contact ratio of user A's face regions and of user B's face regions between the adjacent a-th and b-th frames, between the b-th frame and the next frame, between the c-th frame and its previous frame, and so on. After the contact ratios between every two adjacent frames have been calculated, the face region corresponding to the highest contact ratio, namely the face region of user A, can be selected from the multiple frames of face images as the face region to be identified; subsequent calculations of the face integral value, pose score, sharpness score, face quality score, and so on are then performed for this region, and the face regions of other users are excluded.
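For concreteness, the contact-ratio computation of S201 can be sketched as follows. This is a minimal illustration assuming axis-aligned rectangular face regions given as (x, y, width, height); the function names and the per-frame dictionary format are assumptions of the example, and the patent's own formula (1) defines the actual contact ratio.

```python
def face_region_iou(box_a, box_b):
    """Contact ratio (intersection over union) of two face regions
    given as (x, y, width, height)."""
    ax1, ay1, ax2, ay2 = box_a[0], box_a[1], box_a[0] + box_a[2], box_a[1] + box_a[3]
    bx1, by1, bx2, by2 = box_b[0], box_b[1], box_b[0] + box_b[2], box_b[1] + box_b[3]
    # Intersection rectangle (zero if the regions do not overlap).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - intersection
    return intersection / union if union > 0 else 0.0

def track_face_region(frames_boxes):
    """Pick the face whose regions coincide most across adjacent frames.

    frames_boxes: list of per-frame dicts mapping a face id to its box.
    Returns the face id with the highest contact ratio between any two
    adjacent frames, i.e. the face region to be identified.
    """
    best_id, best_iou = None, -1.0
    for prev, curr in zip(frames_boxes, frames_boxes[1:]):
        for face_id in prev.keys() & curr.keys():
            iou = face_region_iou(prev[face_id], curr[face_id])
            if iou > best_iou:
                best_id, best_iou = face_id, iou
    return best_id
```

Using a ratio of intersection to union rather than the raw intersection area makes the measure independent of face size, which is why the stationary face of user A outscores the passing face of user B.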
S202, the network equipment acquires the area of the face region in each frame of face image, and determines the face integral value of the face region in each frame according to a first mapping relation between the area and the face integral value.
The network device may preset a first mapping relation between the area of the face region and the face integral value, and may store it locally, on a server, or the like, in the form of a list, text, or the like. Under this first mapping relation, the larger the area of the face region, the larger the face integral value, and conversely the smaller the area, the smaller the face integral value; once the area of the face region grows beyond a certain extent, the face integral value no longer increases with it. The value range of the face integral value can be set flexibly according to actual needs and is not limited here.
After determining the face region to be identified based on the face tracking result, the network device may calculate the area of the face region in each frame of face image; for example, when the face region is a rectangular region, its area may be calculated from the rectangle's length and width. The first mapping relation between area and face integral value can then be obtained from the local storage, a server, or the like, and the corresponding face integral value queried from it according to the area of the face region in each frame, thereby obtaining the face integral value of the face region in each frame of face image.
For example, as shown in fig. 8, if user A's face is not entirely within the camera's field of view, only part of it is captured, whereas if the face is entirely within the field of view, all of it is captured; the face region contained in each of the n collected frames may therefore be partial or complete. The network device can calculate the area of user A's face region in each frame, for example in the d-th, e-th, and f-th frames, and then determine the face integral value of the face region in each frame according to the first mapping relation between area and face integral value: the larger the area of user A's face region, the higher the corresponding face integral value, so the frames containing user A's complete face region have the highest face integral values. For instance, the face integral value of the e-th frame will be greater than those of the d-th and f-th frames.
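A minimal sketch of such an area-to-score mapping follows, using a sigmoid shape consistent with the saturating behavior described above (the device embodiment below mentions a sigmoid as formula (2)); the midpoint and steepness constants are illustrative tuning values, not values from the patent.

```python
import math

def face_area_score(area, midpoint=8000.0, steepness=0.001):
    """Map a face-region area (in pixels) to a face integral value in (0, 1).

    The score grows with area and saturates for large faces, matching
    the first mapping relation. midpoint and steepness are illustrative
    assumptions.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (area - midpoint)))

# Example: a rectangular face region of 120 x 150 pixels.
score = face_area_score(120 * 150)
```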
S203, the network equipment acquires deflection parameters of the face in the face area of each frame of face image, and determines the pose score of the face area in each frame of face image according to a second mapping relation between the deflection parameters and the pose score.
The network device may preset a second mapping relation between the deflection parameters and the pose score, and may store it locally, on a server, or the like, in the form of a list, text, or the like. Under this second mapping relation, in a first interval the larger the rightward deflection of the face, the smaller the pose score, and the smaller the rightward deflection (i.e., the more directly the face faces the camera), the larger the pose score; likewise, in a second interval the larger the leftward deflection, the smaller the pose score, and the smaller the leftward deflection, the larger the pose score. When the pose score reaches its maximum, the face is frontal, i.e., the user directly faces the camera collecting the face image. The value range of the pose score can be set flexibly according to actual needs and is not limited here.
After determining the face region to be identified based on the face tracking result, the network device may acquire the deflection parameters of the face in the face region of each frame of face image. The deflection parameters may be deflection angles of the face about the x-axis, y-axis, and z-axis, or may include a pitch angle, a yaw angle, and a roll angle; for example, they may be calculated with a face pose estimation algorithm. The second mapping relation between deflection parameters and pose score can then be obtained from the local storage or a server, and the corresponding pose score queried from it according to the deflection parameters of the face region in each frame, obtaining the pose score of the face region in each frame of face image.
For example, as shown in fig. 9, user A may face the camera sideways or frontally, and may also shake, so the n frames of face images may contain user A's face in profile or frontal view. The network device can calculate the deflection parameters of the face in user A's face region in each frame, for example in the h-th, i-th, and j-th frames, and then determine the pose score of the face region in each frame according to the second mapping relation between deflection parameters and pose score: the more directly user A faces the camera, the larger the pose score, and a face image with zero deflection has the highest pose score. For instance, the pose score of the i-th frame will be greater than those of the h-th and j-th frames.
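The second mapping relation can be sketched with a bell-shaped curve that peaks at zero deflection, consistent with the standard normal distribution mentioned in the device embodiment (formula (3)); the spread parameter sigma below is an illustrative assumption.

```python
import math

def pose_score(yaw, pitch=0.0, roll=0.0, sigma=30.0):
    """Map deflection angles (degrees) to a pose score in (0, 1].

    Bell-shaped around zero, as in the second mapping relation: a
    frontal face (all deflections zero) scores highest, and the score
    falls off as the face turns away. sigma is an illustrative spread
    parameter, not a value from the patent.
    """
    deflection = math.sqrt(yaw ** 2 + pitch ** 2 + roll ** 2)
    return math.exp(-0.5 * (deflection / sigma) ** 2)
```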
S204, the network equipment acquires the moving speed of each frame of face image, and determines the sharpness score of the face region in each frame of face image according to a third mapping relation between the moving speed and the sharpness score.
The network device may preset a third mapping relation between the moving speed and the sharpness score, and may store it locally, on a server, or the like, in the form of a list, text, or the like. Under this third mapping relation, the greater the moving speed of the face region, the smaller its sharpness score, and conversely the smaller the moving speed, the greater the sharpness score. The value range of the sharpness score can be set flexibly according to actual needs and is not limited here.
After the face region to be identified is determined based on the face tracking result, the network device can acquire the distance and time interval between every two adjacent frames of face images in the multiple frames, calculate the moving speed of each frame of face image from the distance and time interval, obtain the third mapping relation between moving speed and sharpness score from the local storage or a server, and query the corresponding sharpness score from it according to the moving speed of each frame, thereby obtaining the sharpness score of the face region in each frame of face image.
For example, as shown in fig. 10, user A may move while standing in front of the camera, so some of the n collected frames are clearer and some are blurred, and the faster the movement, the more blurred the collected image. The network device can calculate the moving speed of user A's face region in each frame (i.e., the moving speed of each frame of face image), for example in the k-th, r-th, and s-th frames, and then determine the sharpness score of the face region in each frame according to the third mapping relation between moving speed and sharpness score: the less user A's face moves, the greater the sharpness score, and a face image with zero moving speed has the highest sharpness score. For instance, the sharpness score of the r-th frame will be greater than those of the k-th and s-th frames.
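A minimal sketch of S204 follows: the moving speed is computed from the displacement of the face-region center between adjacent frames, and mapped to a sharpness score with an inverse-proportion shape consistent with formula (8) in the device embodiment; the constant k and the function names are illustrative assumptions.

```python
def movement_speed(center_prev, center_curr, time_interval):
    """Speed of the face region between two adjacent frames (pixels/s),
    from the distance between region centers and the time interval."""
    dx = center_curr[0] - center_prev[0]
    dy = center_curr[1] - center_prev[1]
    return (dx ** 2 + dy ** 2) ** 0.5 / time_interval

def sharpness_score(speed, k=50.0):
    """Map movement speed to a sharpness score in (0, 1].

    Inverse-proportional shape, as in the third mapping relation: a
    static face (speed zero) scores highest and faster movement scores
    lower. k is an illustrative constant, not a value from the patent.
    """
    return k / (k + speed)
```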
S205, the network equipment calculates the face quality score of each frame of face image according to the face integral value, the pose score, and the sharpness score.
After obtaining the face integral value, pose score, and sharpness score of the face region in each frame of face image, the network device may set a first weight for the face integral value, a second weight for the pose score, and a third weight for the sharpness score. For each frame, the face integral value is multiplied by the first weight, the pose score by the second weight, and the sharpness score by the third weight, and the three products are added to obtain the face quality score of that frame.
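A minimal sketch of this weighted combination (formula (9) in the device embodiment); the weight values are illustrative assumptions, since the patent only requires that each component be given a corresponding weight.

```python
def face_quality_score(area_score, pose_score, sharpness_score,
                       w_area=0.4, w_pose=0.3, w_sharp=0.3):
    """Face quality score of one frame: each per-frame score is
    multiplied by its weight and the products are summed, as in S205.
    The weights are illustrative, not values from the patent."""
    return (w_area * area_score
            + w_pose * pose_score
            + w_sharp * sharpness_score)
```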
S206, the network equipment takes a first frame of face image in the multi-frame face images as a current face image.
S207, the network equipment judges whether the face quality score of the current face image is greater than a preset threshold; if yes, step S208 is executed; if not, step S209 is executed.
During detection, the network device may, after calculating the face quality score of the first frame of face image, take the first frame as the current face image, compare its face quality score with the preset threshold, and determine whether the score is greater than the threshold.
S208, the network equipment performs face recognition on the current face image.
If the face quality score of the current face image is greater than the preset threshold, the network device can perform face recognition on the current face image.
S209, the network equipment takes a second frame of face image in the multi-frame face images as a current face image.
If the face quality score of the current face image is not greater than the preset threshold, the network device can judge whether the detection time has reached the preset time period, which can be set flexibly according to actual needs. If the preset time period has been reached, the face image recognition process can end; if not, the second frame of face image in the multiple frames is taken as the current face image, and the process returns to the step of comparing the face quality score of the current face image with the preset threshold, until the face quality score of the current face image is greater than the preset threshold. The required face image is thereby obtained and face recognition is performed on it.
Alternatively, if the face quality score of the current face image is not greater than the preset threshold, it may be judged whether all of the collected multiple frames of face images have been compared. If not, the second frame of face image is taken as the current face image, and the process returns to the step of comparing the face quality score of the current face image with the preset threshold, until the face quality score of the current face image is greater than the preset threshold; the required face image is thereby obtained and face recognition is performed on it.
When no face quality score is greater than the preset threshold, the network device can screen out the face image corresponding to the largest face quality score from the multiple frames of face images and perform face recognition on that face image.
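The control flow of S206-S209, including the detection deadline and the largest-score fallback, can be sketched as follows; quality_fn, clock, and the other parameter names are assumptions of this example.

```python
def select_target_face(frames, quality_fn, threshold, deadline, clock):
    """Streaming selection of the target face image (S206-S209).

    Returns the first frame whose face quality score exceeds the
    threshold; if the detection window ends (clock() >= deadline) or
    the frames run out first, falls back to the frame with the largest
    score seen so far.
    """
    best_frame, best_score = None, float("-inf")
    for frame in frames:
        score = quality_fn(frame)
        if score > threshold:
            return frame              # good enough: recognize this frame
        if score > best_score:
            best_frame, best_score = frame, score
        if clock() >= deadline:       # preset time period exhausted
            break
    return best_frame                 # largest-score fallback
```

For example, it might be called as select_target_face(camera_frames, score_fn, 0.7, time.time() + 60, time.time), where camera_frames and score_fn stand in for the access control camera stream and the weighted quality score sketched earlier.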
The network device may store in advance the faces of users authorized to open the access control (referred to as authority faces), for example in a database. When recognizing a face image, the face region is extracted from it and compared one by one with the authority faces in the database. If the similarity between some authority face and the extracted face region is greater than a preset similarity threshold (which can be set flexibly according to actual needs), the authority face and the face region match successfully, and the network device can control the access control to open and let the user pass. Otherwise, if no authority face in the database has a similarity with the extracted face region greater than the preset similarity threshold, the match fails, and the network device can keep the access control closed so as to refuse the user.
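As an illustration of this matching step, the sketch below compares an extracted face feature vector with enrolled authority faces. The use of feature vectors and cosine similarity is an assumption of the example; the patent only requires some similarity measure compared against a preset similarity threshold.

```python
import numpy as np

def match_authority_face(face_feature, authority_features, sim_threshold=0.8):
    """Compare an extracted face feature with enrolled authority faces.

    face_feature: feature vector of the extracted face region;
    authority_features: dict mapping a person id to an enrolled vector.
    Cosine similarity and the threshold value are illustrative
    assumptions. Returns the id of the first match, or None.
    """
    q = face_feature / np.linalg.norm(face_feature)
    for person_id, feat in authority_features.items():
        sim = float(q @ (feat / np.linalg.norm(feat)))
        if sim > sim_threshold:
            return person_id          # match: open the access control
    return None                       # no match: keep it closed
```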
The embodiment of the invention can thus collect multiple frames of face images; calculate the face integral value, pose score, sharpness score, and the like of the face region to be identified in each frame; and calculate the face quality score of each frame from the face integral value, pose score, and sharpness score, where a higher face quality score indicates a better-quality collected face image. The face images of better quality can then be screened out of the multiple frames according to the face quality scores, and the faces in them can be recognized accurately. This avoids the misrecognition caused by performing face recognition on poor-quality face images, improves the accuracy of face image recognition, improves the security of the access control, and also improves the user experience.
In order to facilitate better implementation of the face image recognition method provided by the embodiment of the invention, the embodiment of the invention also provides an apparatus based on the face image recognition method. The terms below have the same meanings as in the face image recognition method above, and implementation details can be found in the description of the method embodiments.
Referring to fig. 11, fig. 11 is a schematic structural diagram of a face image recognition device according to an embodiment of the present invention. The face image recognition device may include a determining unit 301, a detection unit 302, an obtaining unit 303, a screening unit 304, a recognition unit 305, and the like.
The determining unit 301 is configured to collect multiple frames of face images, and determine a face area to be identified according to the multiple frames of face images.
The determining unit 301 may continuously acquire multiple frames of face images of a user through a preset video camera, still camera, or the like, or may receive multiple frames of face images transmitted by a terminal, a server, or the like.
The multi-frame face image may be an image continuously acquired at intervals of a preset time within a preset time period, for example, 30 frames of images continuously acquired at intervals of 2 seconds within 1 minute. The face image may include one or more faces, and may further include other objects, and the specific content is not limited herein.
In order to be able to recognize a desired face image, the determining unit 301 may track faces in a plurality of frames of face images so as to determine a face region to be recognized.
In some embodiments, as shown in fig. 12, the determining unit 301 may include a contact ratio obtaining subunit 3011, a first screening subunit 3012, and the like, and may specifically be as follows:
the contact ratio obtaining subunit 3011 is configured to obtain contact ratios of face areas between every two adjacent frames of face images in the multi-frame face images, so as to obtain multiple contact ratios;
The first screening subunit 3012 is configured to screen out, according to the multiple contact ratios, the face region corresponding to the highest contact ratio from the multiple frames of face images, to obtain the face region to be identified.
The contact ratio obtaining subunit 3011 may perform face detection on each frame of face image using a face tracking algorithm and calculate the contact ratio of the face regions between every two temporally adjacent frames, obtaining a contact ratio for each pair of adjacent frames; the contact ratio measures how much the face regions of the same person in two adjacent frames overlap. After the multiple contact ratios are obtained, the first screening subunit 3012 may screen out, based on them, the pair of adjacent frames with the highest contact ratio; the face regions in that pair are the face region to be identified, which may also be called the tracked face region. The shape of the face region can be set flexibly according to actual needs, for example as a rectangular, square, or circular region.
In some embodiments, the contact ratio obtaining subunit 3011 may specifically be configured to: acquire the intersection area of the face regions between every two adjacent frames of face images in the multiple frames; acquire the union area of the face regions between every two adjacent frames; and calculate the contact ratio of the face regions between every two adjacent frames from the intersection area and the union area, obtaining multiple contact ratios. That is, the contact ratio obtaining subunit 3011 may calculate the contact ratio of the face regions between every two adjacent frames of face images using formula (1) above.
And the detection unit 302 is configured to perform pose detection on the face in each frame of face image according to the face region, so as to obtain pose parameters.
In some embodiments, as shown in fig. 13, the detection unit 302 may include a score obtaining subunit 3021, a second screening subunit 3022, a detection subunit 3023, and the like, which may specifically be as follows:
a score obtaining subunit 3021, configured to obtain the face integral value of the face region in each frame of face image;
a second screening subunit 3022, configured to screen out, from the multiple frames of face images, the face images whose face integral values are greater than a preset value, to obtain screened face images;
and a detection subunit 3023, configured to perform pose detection on the faces in the screened face images to obtain pose parameters.
In order to improve the accuracy of pose detection on the faces in the face images, the obtained multiple frames of face images can be screened preliminarily, for example by keeping only the face images whose face-region areas satisfy a condition. Specifically, the score obtaining subunit 3021 may obtain the face integral value of the face region in each frame, the face integral value being related to, and increasing with, the area of the face region. The second screening subunit 3022 may then screen out, from the multiple frames, the face images whose face integral values are greater than a preset value (which can be set flexibly according to actual needs), thereby keeping the face images with large face regions and obtaining the screened face images, of which there may be one or more. The detection subunit 3023 may then perform pose detection on the faces in the screened face images to obtain pose parameters; the pose parameters may include a pose score, which is related to the deflection angle of the face relative to the frontal pose.
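A minimal sketch of this preliminary screening; the function and parameter names are assumptions of the example.

```python
def prescreen_by_area(frames, area_score_fn, preset_value):
    """Keep only the frames whose face integral value exceeds the
    preset value, so that pose detection runs only on frames with a
    sufficiently large face region.

    frames: list of (frame, face_region) pairs; area_score_fn maps a
    face region to its face integral value.
    """
    return [(frame, region) for frame, region in frames
            if area_score_fn(region) > preset_value]
```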
In some embodiments, the score obtaining subunit 3021 may be specifically configured to:
acquiring the area of a face area in each frame of face image;
and acquiring a first mapping relation between the area and the face integral value, and determining the face integral value of the face area in each frame of face image according to the first mapping relation.
The score obtaining subunit 3021 may preset the first mapping relation between the area of the face region and the face integral value, may store it locally, on a server, or the like, in the form of a list or text, and may evaluate the first mapping relation using a sigmoid function, as shown in formula (2) above. The larger the area of the face region, the larger the face integral value, and conversely the smaller the area, the smaller the face integral value; once the area grows beyond a certain extent, the face integral value no longer increases with it. The value range of the face integral value can be set flexibly according to actual needs and is not limited here.
After the face region to be identified is determined based on the face tracking result, the score obtaining subunit 3021 may calculate the area of the face region in each frame of face image: for example, when the face region is rectangular, its area may be calculated from the rectangle's length and width; when it is circular, from the circle's radius. The score obtaining subunit 3021 may then obtain the first mapping relation between area and face integral value from the local storage, a server, or the like, and query the corresponding face integral value from it according to the area of the face region in each frame, obtaining the face integral value of the face region in each frame of face image.
In some embodiments, the detection subunit 3023 may include a parameter acquisition module, a mapping relation acquisition module, and a determining module, which may specifically be as follows:
the parameter acquisition module is used for acquiring the deflection parameters of the face in the face region of each frame of face image in the screened face images;
the mapping relation acquisition module is used for acquiring the second mapping relation between the deflection parameters and the pose score;
and the determining module is used for determining the pose score of the face region in each frame of face image according to the second mapping relation, the pose score serving as a pose parameter.
The detection subunit 3023 may preset the second mapping relation between the deflection parameters and the pose score, may store it locally, on a server, or the like, in the form of a list or text, and may evaluate the second mapping relation using a standard normal distribution, as shown in formula (3) above: in the first interval, the larger the rightward deflection of the face, the smaller the pose score; in the second interval, the larger the leftward deflection, the smaller the pose score. When the pose score reaches its maximum, the face is frontal, i.e., the user directly faces the camera collecting the face image. The value range of the pose score can be set flexibly according to actual needs and is not limited here.
After the face region to be identified is determined based on the face tracking result and the face images whose face integral values are greater than the preset value are screened out, the parameter acquisition module can acquire the deflection parameters of the face in the face region of each screened frame; the deflection parameters may be deflection angles of the face about the x-axis, y-axis, and z-axis, or may include a pitch angle, a yaw angle, and a roll angle. The mapping relation acquisition module may then obtain the second mapping relation between deflection parameters and pose score from the local storage or a server, and query the corresponding pose score from it according to the deflection parameters of the face region in each frame, obtaining the pose score of the face region in each frame of face image.
In some embodiments, the parameter acquisition module may be specifically configured to: acquire the first projection parameters of the face in the face region of each screened frame of face image projected onto a two-dimensional plane; acquire the second projection parameters of a preset face model projected onto the two-dimensional plane; and obtain the deflection parameters of the face in the face region according to the first projection parameters and the second projection parameters.
The parameter acquisition module may acquire the first projection parameters of the face projected onto a two-dimensional plane (e.g., the coordinate points of the face in the plane) and the second projection parameters of a preset face model projected onto the same plane (e.g., the coordinate points of the model in the plane). The preset face model may be a preset three-dimensional average face model; three-dimensional face models of users with different expressions and identities can be generated by changing the positions of the three-dimensional coordinate points of the average model. That is, every three-dimensional face model can be represented as the three-dimensional average face model plus offsets to its coordinate points, and the three-dimensional model of the user's face can be represented by formula (4) above.
After obtaining the first projection parameters of the face and the second projection parameters of the preset face model, the face image recognition device can obtain the three-dimensional coordinate points of the face from the two sets of projection parameters using formulas (5) and (6) above, performing orthographic projection with formula (7); formula (5) can be solved by iterating formulas (4), (6), and (7), yielding three-dimensional coordinate points from which a three-dimensional face model can be generated. In other words, if the projection of the three-dimensional face model at some angle onto the two-dimensional plane matches the face in the two-dimensional image, that model is the three-dimensional face model to be acquired. Once the three-dimensional model is obtained, the deflection parameters of the face in the face region can be derived from it; the deflection parameters correspond to the rotation coefficient R in formula (7).
Alternatively, the parameter acquisition module may calculate the deflection parameters using a face pose estimation algorithm: for example, it may obtain feature points of the facial organs (eyes, nose, mouth, and so on) in the face region of each frame, calculate the proportional relationships between the organs from the feature points, and determine the deflection parameters from those proportions. Or the three-dimensional average face model may be rotated until the two-dimensional projections of the three-dimensional feature points of its facial organs (eyes, eyebrows, nose, mouth, and so on) coincide, or coincide as closely as possible, with the two-dimensional feature points of the organs in the face image; the rotation angle of the model is then the deflection parameter of the face.
It should be noted that the deflection parameters may also be obtained in ways other than those described above; the specific manner of acquisition is not limited here.
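As one concrete way to obtain the deflection parameters, the sketch below solves the 2D-3D correspondence with OpenCV's solvePnP and decomposes the resulting rotation into Euler angles. This is a common stand-in for the iterative model fitting of formulas (4)-(7), not the patent's exact procedure; the 3D landmark coordinates and the pinhole camera approximation are illustrative assumptions.

```python
import cv2
import numpy as np

# Rough 3D coordinates of a few landmarks on an average face (mm);
# illustrative values, not the patent's three-dimensional average face model.
MODEL_POINTS = np.array([
    (0.0,    0.0,    0.0),    # nose tip
    (0.0,  -63.6,  -12.5),    # chin
    (-43.3,  32.7, -26.0),    # left eye outer corner
    (43.3,   32.7, -26.0),    # right eye outer corner
    (-28.9, -28.9, -24.1),    # left mouth corner
    (28.9,  -28.9, -24.1),    # right mouth corner
], dtype=np.float64)

def face_deflection(landmarks_2d, frame_width, frame_height):
    """Estimate pitch, yaw, and roll (degrees) from 2D landmarks.

    landmarks_2d must list the same points, in the same order, as
    MODEL_POINTS. A simple pinhole camera with focal length equal to
    the frame width is assumed.
    """
    camera = np.array([[frame_width, 0, frame_width / 2],
                       [0, frame_width, frame_height / 2],
                       [0, 0, 1]], dtype=np.float64)
    pts = np.asarray(landmarks_2d, dtype=np.float64)
    ok, rvec, _ = cv2.solvePnP(MODEL_POINTS, pts, camera, None)
    rot, _ = cv2.Rodrigues(rvec)
    # Decompose the rotation matrix into Euler angles.
    sy = np.hypot(rot[0, 0], rot[1, 0])
    pitch = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))
    yaw = np.degrees(np.arctan2(-rot[2, 0], sy))
    roll = np.degrees(np.arctan2(rot[1, 0], rot[0, 0]))
    return pitch, yaw, roll
```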
After the deflection parameters are obtained, the mapping relation acquisition module can obtain the preset second mapping relation between deflection parameters and pose score from the local storage, a server, or the like, and the determining module queries the corresponding pose score from it according to the deflection parameters, obtaining the pose score of the face region in each frame of face image; this pose score serves as the pose parameter.
The obtaining unit 303 is configured to obtain the moving speed of the face image between every two adjacent frames of face images in the multiple frames of face images, and determine the sharpness of each frame of face image according to the moving speed.
For example, when the multiple frames include a first, second, and third frame of face image, the first frame may be taken as the origin: the moving speed of the second frame is determined from the distance and time interval between the first and second frames, and the moving speed of the third frame from the distance and time interval between the second and third frames (or between the first and third frames). The moving speed may be that of the face image or of the face region; the greater the moving speed, the lower the sharpness, and the smaller the moving speed, the higher the sharpness.
In some embodiments, the obtaining unit 303 may specifically be configured to:
acquiring the distance and time interval between every two adjacent frames of face images in the multi-frame face images;
Calculating the moving speed of each frame of face image according to the distance and the time interval;
acquiring a third mapping relation between the moving speed and the sharpness score;
and determining the sharpness score of each frame of face image according to the third mapping relation, and determining the sharpness of each frame of face image according to the sharpness score.
The obtaining unit 303 may preset the third mapping relation between the moving speed and the sharpness score, may store it locally, on a server, or the like, in the form of a list or text, and may evaluate the third mapping relation using an inverse proportion function, as shown in formula (8) above. The greater the moving speed of the face region, the smaller its sharpness score, and conversely the smaller the moving speed, the greater the sharpness score; the value range of the sharpness score can be set flexibly according to actual needs and is not limited here.
After the face region to be identified is determined based on the face tracking result, the obtaining unit 303 may obtain the distance and time interval between every two adjacent frames in the multiple frames and calculate the moving speed of each frame from them. The obtaining unit 303 may then obtain the third mapping relation between moving speed and sharpness score from the local storage or a server, and query the corresponding sharpness score from it according to each frame's moving speed, obtaining the sharpness score of the face region in each frame. The sharpness of each frame can then be determined from its sharpness score: the larger the sharpness score, the higher the sharpness, and the smaller the score, the lower the sharpness; for example, the sharpness corresponding to each frame's sharpness score can be determined from a correspondence between sharpness scores and sharpness.
And the screening unit 304 is configured to screen out, from the multiple frames of face images, the face image satisfying a preset condition according to the pose parameters and the sharpness, so as to obtain a target face image.
And the recognition unit 305 is used for recognizing the face of the target face image.
The preset conditions can be flexibly set according to actual needs so as to screen out face images with higher quality.
In some embodiments, as shown in fig. 14, the screening unit 304 may include a calculating subunit 3041, a third screening subunit 3042, and the like, which may specifically be as follows:
a calculating subunit 3041, configured to calculate the face quality score of each frame of face image according to the pose parameters and the sharpness;
and a third screening subunit 3042, configured to screen out a face image corresponding to a face quality score greater than a preset threshold from multiple frames of face images, so as to obtain a target face image.
In some embodiments, the calculating subunit 3041 may be specifically configured to: determine the face integral value, pose score, and sharpness score of each frame of face image according to the pose parameters and the sharpness; set corresponding weights for the face integral value, the pose score, and the sharpness score; and calculate the face quality score of each frame from the face integral value, pose score, sharpness score, and their corresponding weights. That is, the face quality score of each frame of face image can be calculated using formula (9) above.
It should be noted that the face integral value, pose score, sharpness score, face quality score, and the like may also be calculated with other functions, or even with a neural network, so that a high-quality face image can be screened out for subsequent face recognition.
In order to screen out a face image of higher quality, the preset condition may be that the face quality score is greater than a preset threshold. For example, the third screening subunit 3042 may screen out, from the multiple frames of face images, the face images whose face quality scores are greater than the preset threshold, so as to obtain one or more frames of target face images. When one frame of target face image is obtained, face recognition may be performed on that frame; when multiple frames of target face images are obtained, face recognition may be performed on one of them.
In some embodiments, the third screening subunit 3042 may be specifically configured to: screen out, from the face quality scores, the scores greater than the preset threshold to obtain candidate face quality scores; when there are a plurality of candidate face quality scores, select one of them according to a preset algorithm to obtain a target face quality score; and determine the face image corresponding to the target face quality score from the multiple frames of face images to obtain the target face image.
The third screening subunit 3042 may calculate the face quality score of each frame after the multiple frames of face images have been continuously acquired at preset intervals within the preset time period, obtaining a set of face quality scores; each score in the set is then compared with the preset threshold, and the scores greater than the threshold are screened out as candidate face quality scores. The preset threshold can be set flexibly according to actual needs, and there may be one or more candidate face quality scores.
When there is one candidate face quality score, the third screening subunit 3042 may perform face recognition on the corresponding face image. When there are a plurality of candidate face quality scores, one of them can be selected according to a preset algorithm to obtain the target face quality score; the preset algorithm may be set flexibly according to actual needs: for example, one candidate may be selected at random, the candidate with the highest score may be selected, or the candidate corresponding to the earliest face image frame may be selected, and so on. The face image corresponding to the target face quality score is then determined from the multiple frames of face images to obtain the target face image, and face recognition is performed on it.
In some embodiments, the third screening subunit 3042 may be specifically configured to: take the first frame of face image in the multiple frames as the current face image; compare the face quality score of the current face image with the preset threshold; and, if the face quality score of the current face image is not greater than the preset threshold, take the second frame of face image as the current face image and return to the step of comparing the face quality score of the current face image with the preset threshold, until the face quality score of the current face image is greater than the preset threshold, so as to obtain the target face image.
The third screening subunit 3042 may calculate face quality scores while the face images are still being collected within the preset time period; for example, the face quality scores of the first frames may be calculated as soon as at least two frames have been collected. The third screening subunit 3042 may take the first frame whose face quality score has been calculated as the current face image, compare its face quality score with the preset threshold, and determine whether the score is greater than the threshold. If the face quality score of the current face image is greater than the preset threshold, face recognition is performed on the current face image. If it is not, it is determined whether the detection time has reached the preset time period (which can be set flexibly according to actual needs); if it has, the face image recognition process ends, so as to avoid detection of unbounded duration. If the preset time period has not been reached, the second frame of face image is taken as the current face image, and the process returns to the step of comparing the face quality score of the current face image with the preset threshold, until the face quality score of the current face image is greater than the preset threshold; the target face image is thereby obtained and face recognition is performed on it.
In some embodiments, the face image recognition apparatus further includes:
and the fourth screening subunit is used for screening the face image corresponding to the largest face quality score from the multiple frames of face images to obtain the target face image when no face quality score is greater than the preset threshold.
When none of the face quality scores of the multiple frames of face images continuously acquired at preset intervals within the preset time period is greater than the preset threshold, then, so that the collected face images can still be recognized, the fourth screening subunit can screen out the face image corresponding to the largest face quality score from the multiple frames to obtain the target face image, and face recognition is performed on it. Alternatively, while images are being collected and face quality scores calculated within the preset time period, if the detection time reaches the end of the period without any face image whose face quality score is greater than the preset threshold, the fourth screening subunit can likewise screen out the face image with the largest face quality score as the target face image and perform face recognition on it. In this way the face image of relatively better quality is always the one recognized, which ensures the stability and reliability of the face recognition result.
As can be seen from the foregoing, in the embodiment of the present invention the determining unit 301 may collect multiple frames of face images and determine the face region to be identified from them; the detection unit 302 then performs pose detection on the face in each frame according to the face region to obtain pose parameters (for example, the pose parameters are related to a face integral value and a pose score, where the face integral value is related to the area of the face region and the pose score to the deflection parameters of the face); the obtaining unit 303 obtains the moving speed of the face image between every two adjacent frames and determines the sharpness of each frame from it, the sharpness being related to the sharpness score; the screening unit 304 then screens out, from the multiple frames, the face image satisfying the preset condition according to the pose parameters and the sharpness, obtaining the target face image (i.e., an image of better face image quality); and the recognition unit 305 performs face recognition on the target face image. This scheme screens out, from the collected multiple frames of face images, the face image of the best quality that satisfies the condition and performs face recognition on it, thereby avoiding the incorrect recognition results caused by performing face recognition on poor-quality face images (blurred, small face region, poor face pose, and so on) and improving the accuracy of face image recognition.
The embodiment of the invention also provides a network device, which may be a server, a terminal, or another device. As shown in fig. 15, which is a schematic structural diagram of a network device according to an embodiment of the present invention, specifically:
the network device may include a processor 401 with one or more processing cores, a memory 402 with one or more computer-readable storage media, a power supply 403, an input unit 404, and other components. Those skilled in the art will appreciate that the network device structure shown in fig. 15 does not limit the network device; it may include more or fewer components than shown, combine certain components, or arrange the components differently. Wherein:
the processor 401 is a control center of the network device, connects various parts of the entire network device using various interfaces and lines, and performs various functions of the network device and processes data by running or executing software programs and/or modules stored in the memory 402, and calling data stored in the memory 402, thereby performing overall monitoring of the network device. Optionally, processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor and a modem processor, wherein the application processor mainly processes an operating system, a user interface, an application program, etc., and the modem processor mainly processes wireless communication. It will be appreciated that the modem processor described above may not be integrated into the processor 401.
The memory 402 may be used to store software programs and modules; the processor 401 executes various functional applications and performs data processing by running them. The memory 402 may mainly include a program storage area, which may store the operating system and the application programs required for at least one function (such as a sound playing function or an image playing function), and a data storage area, which may store data created during use of the network device. In addition, the memory 402 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 with access to the memory 402.
The network device further includes a power supply 403 that supplies power to the various components. Preferably, the power supply 403 is logically connected to the processor 401 through a power management system, so that charging, discharging, and power-consumption management are handled by that system. The power supply 403 may also include one or more direct-current or alternating-current power sources, a recharging system, a power-failure detection circuit, a power converter or inverter, a power status indicator, and other components.
The network device may also include an input unit 404, which may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
Although not shown, the network device may further include a display unit and the like, which are not described here. Specifically, in this embodiment, the processor 401 in the network device loads the executable files corresponding to the processes of one or more application programs into the memory 402 according to the following instructions, and runs the application programs stored in the memory 402, thereby implementing the face image recognition method provided by the embodiment of the present invention, as follows:
collecting multiple frames of face images and determining a face region to be recognized from the multiple frames; performing pose detection on the face in each frame of face image according to the face region to obtain pose parameters; obtaining the moving speed of the face image between every two adjacent frames of the multiple frames and determining the sharpness corresponding to each frame according to the moving speed; screening out, from the multiple frames and according to the pose parameters and the sharpness, a face image satisfying a preset condition to obtain a target face image; and performing face recognition on the target face image.
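As an illustration only (not the patented implementation), the following Python sketch follows this flow: each frame is scored as a weighted combination of a face area score, a pose score, and a sharpness score derived from the inter-frame moving speed; the first frame to beat a preset threshold is returned, with a fallback to the best frame collected. The Frame record, weights, threshold, and both score mappings are illustrative assumptions:

    import math
    from dataclasses import dataclass

    @dataclass
    class Frame:
        timestamp: float      # capture time in seconds
        box: tuple            # face region as (x, y, w, h) in pixels
        pose_score: float     # 0..1, produced by the pose-detection step

    def area_score(box, frame_area=640 * 480):
        x, y, w, h = box
        # Assumed mapping: score saturates once the face fills 10% of the frame.
        return min(1.0, (w * h) / (0.1 * frame_area))

    def sharpness_score(speed, max_speed=200.0):
        # Assumed mapping: faster motion means more blur, hence lower sharpness.
        return max(0.0, 1.0 - speed / max_speed)

    def select_target_frame(frames, threshold=0.8, weights=(0.3, 0.4, 0.3)):
        best, best_score = None, -1.0
        prev = None
        for f in frames:
            speed = 0.0
            if prev is not None:
                (x0, y0, w0, h0), (x1, y1, w1, h1) = prev.box, f.box
                dist = math.dist((x0 + w0 / 2, y0 + h0 / 2),
                                 (x1 + w1 / 2, y1 + h1 / 2))
                speed = dist / (f.timestamp - prev.timestamp)
            prev = f
            wa, wp, ws = weights
            score = (wa * area_score(f.box) + wp * f.pose_score
                     + ws * sharpness_score(speed))
            if score > threshold:
                return f          # quality is good enough: recognize this frame
            if score > best_score:
                best, best_score = f, score
        return best                # fallback: best frame seen in the time period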
Optionally, the step of determining the face region to be recognized from the multiple frames of face images may include: obtaining the degree of overlap between the face regions of every two adjacent frames in the multiple frames to obtain a plurality of overlap degrees; and screening out, from the multiple frames and according to the plurality of overlap degrees, the face region with the highest degree of overlap to obtain the face region to be recognized.
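As claim 3 below spells out, the degree of overlap is computed from the intersection and union areas of the two face regions, i.e. the familiar intersection-over-union measure. A minimal sketch, assuming axis-aligned (x, y, w, h) boxes:

    def overlap_degree(box_a, box_b):
        """Intersection-over-union of two axis-aligned (x, y, w, h) boxes."""
        ax, ay, aw, ah = box_a
        bx, by, bw, bh = box_b
        iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))  # intersection width
        ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))  # intersection height
        inter = iw * ih
        union = aw * ah + bw * bh - inter
        return inter / union if union > 0 else 0.0

For example, overlap_degree((10, 10, 100, 100), (20, 20, 100, 100)) returns about 0.68; the face region that overlaps its neighbours most strongly is the one tracked across frames.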
Optionally, the step of performing pose detection on the face in each frame of face image according to the face region to obtain the pose parameters may include: obtaining the face area score of the face region in each frame of face image; screening out, from the multiple frames, the face images whose face area score is greater than a preset value to obtain screened face images; and performing pose detection on the faces in the screened face images to obtain the pose parameters.
Optionally, the step of screening out, from the multiple frames and according to the pose parameters and the sharpness, the face image satisfying the preset condition to obtain the target face image may include: calculating the face quality score of each frame of face image according to the pose parameters and the sharpness; and screening out, from the multiple frames, a face image whose face quality score is greater than a preset threshold to obtain the target face image.
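As a worked example of this weighted scoring (with illustrative values; the patent does not fix the weights or the threshold): with a face area score of 0.9, a pose score of 0.7, a sharpness score of 0.8, and respective weights of 0.3, 0.4, and 0.3, the face quality score is 0.3 × 0.9 + 0.4 × 0.7 + 0.3 × 0.8 = 0.27 + 0.28 + 0.24 = 0.79, which would pass a preset threshold of 0.75 but fail one of 0.8.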
In the foregoing embodiments, each embodiment is described with its own emphasis. For the portions of an embodiment that are not described in detail, reference may be made to the detailed description of the face image recognition method above, which is not repeated here.
As can be seen from the above, the embodiment of the present invention can collect multiple frames of face images, determine the face region to be recognized from them, perform pose detection on the face in each frame according to the face region to obtain pose parameters, obtain the moving speed of the face image between every two adjacent frames, and determine the sharpness of each frame from that speed. The face image satisfying the preset condition can then be screened out of the multiple frames according to the pose parameters and the sharpness, yielding a target face image of comparatively high quality, on which face recognition is performed. By selecting the best-quality qualifying face image from the collected frames for recognition, the scheme avoids the incorrect results that come from recognizing poor-quality images, and thereby improves the accuracy of face image recognition.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods in the above embodiments may be completed by instructions, or by instructions controlling the associated hardware; the instructions may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present invention provides a storage medium storing a plurality of instructions that can be loaded by a processor to perform the steps of any of the face image recognition methods provided by the embodiments of the present invention. For example, the instructions may perform the following steps:
collecting multiple frames of face images and determining a face region to be recognized from the multiple frames; performing pose detection on the face in each frame of face image according to the face region to obtain pose parameters; obtaining the moving speed of the face image between every two adjacent frames of the multiple frames and determining the sharpness corresponding to each frame according to the moving speed; screening out, from the multiple frames and according to the pose parameters and the sharpness, a face image satisfying a preset condition to obtain a target face image; and performing face recognition on the target face image.
Optionally, the step of determining the face region to be recognized from the multiple frames of face images may include: obtaining the degree of overlap between the face regions of every two adjacent frames in the multiple frames to obtain a plurality of overlap degrees; and screening out, from the multiple frames and according to the plurality of overlap degrees, the face region with the highest degree of overlap to obtain the face region to be recognized.
Optionally, the step of performing pose detection on the face in each frame of face image according to the face region to obtain the pose parameters may include: obtaining the face area score of the face region in each frame of face image; screening out, from the multiple frames, the face images whose face area score is greater than a preset value to obtain screened face images; and performing pose detection on the faces in the screened face images to obtain the pose parameters.
Optionally, the step of screening out, from the multiple frames and according to the pose parameters and the sharpness, the face image satisfying the preset condition to obtain the target face image may include: calculating the face quality score of each frame of face image according to the pose parameters and the sharpness; and screening out, from the multiple frames, a face image whose face quality score is greater than a preset threshold to obtain the target face image.
The specific implementation of each of the above operations may be found in the previous embodiments and is not repeated here.
Wherein the storage medium may include: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
Because the instructions stored in the storage medium can perform the steps of any face image recognition method provided by the embodiments of the present invention, they can achieve the beneficial effects achievable by any of those methods; for details, refer to the previous embodiments, which are not repeated here.
The face image recognition method, apparatus, and storage medium provided by the embodiments of the present invention have been described in detail above. Specific examples have been used herein to illustrate the principles and implementations of the present invention, and the above description of the embodiments is intended only to help in understanding the method and its core ideas. Meanwhile, those skilled in the art may make changes to the specific implementations and the scope of application in accordance with the ideas of the present invention. In summary, the contents of this specification should not be construed as limiting the present invention.

Claims (9)

1. A face image recognition method, comprising:
collecting a plurality of frames of face images, and determining a face region to be recognized according to the plurality of frames of face images;
performing pose detection on the face in each frame of face image according to the face region to obtain pose parameters;
obtaining the moving speed of the face image between every two adjacent frames of the plurality of frames of face images, and determining the sharpness corresponding to each frame of face image according to the moving speed, comprising: obtaining the distance and the time interval between every two adjacent frames of face images in the plurality of frames of face images; calculating the moving speed of each frame of face image according to the distance and the time interval; obtaining a third mapping relation between the moving speed and a sharpness score; and determining the sharpness score of each frame of face image according to the third mapping relation, and determining the sharpness corresponding to each frame of face image according to the sharpness score;
screening out, from the plurality of frames of face images and according to the pose parameters and the sharpness, a face image satisfying a preset condition to obtain a target face image, comprising: determining a face area score, a pose score, and a sharpness score of each frame of face image according to the pose parameters and the sharpness; setting corresponding weight values for the face area score, the pose score, and the sharpness score respectively; calculating the face quality score of each frame of face image according to the face area score, the pose score, the sharpness score, and their corresponding weight values; and screening out, from the plurality of frames of face images, a face image whose face quality score is greater than a preset threshold to obtain the target face image;
performing face recognition on the target face image;
wherein the step of performing pose detection on the face in each frame of face image according to the face region to obtain the pose parameters comprises:
screening out, from the plurality of frames of face images, the face images whose face area score is greater than a preset value to obtain screened face images;
obtaining a first projection parameter of the face in the face region of each frame of the screened face images projected onto a two-dimensional plane, the first projection parameter being the coordinate points of the face in the two-dimensional plane;
obtaining a second projection parameter of a preset face model projected onto the two-dimensional plane, the second projection parameter being the coordinate points of the preset face model in the two-dimensional plane;
iteratively adjusting the positions of the three-dimensional coordinate points of the preset face model based on the first projection parameter and the second projection parameter until the projection of the adjusted preset face model onto the two-dimensional plane matches the face region in the face image, and taking the adjusted preset face model as the three-dimensional face model corresponding to the face region in the face image;
determining the deflection parameters of the face in the face region according to the three-dimensional face model corresponding to the face region in the face image;
and determining the pose score of the face region in each frame of face image according to a second mapping relation between the deflection parameters and the pose score and according to the deflection parameters of the face in the face region, and taking the pose score as the pose parameter.
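As an illustration of the final step of this clause (the claim does not fix the form of the second mapping relation), the sketch below assumes deflection parameters given as yaw, pitch, and roll angles in degrees and a mapping that is linear in each angle:

    def pose_score(yaw, pitch, roll, limits=(60.0, 45.0, 45.0)):
        # Assumed second mapping relation: a frontal face (all angles 0)
        # scores 1.0, and the score falls off linearly toward 0.0 as each
        # deflection angle approaches its assumed limit.
        score = 1.0
        for angle, limit in zip((yaw, pitch, roll), limits):
            score *= max(0.0, 1.0 - abs(angle) / limit)
        return score

    print(pose_score(0, 0, 0))    # 1.0: frontal face
    print(pose_score(30, 10, 5))  # about 0.35: noticeably turned face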
2. The face image recognition method according to claim 1, wherein the step of determining the face region to be recognized according to the plurality of frames of face images comprises:
obtaining the degree of overlap between the face regions of every two adjacent frames of face images in the plurality of frames of face images to obtain a plurality of overlap degrees;
and screening out, from the plurality of frames of face images and according to the plurality of overlap degrees, the face region with the highest degree of overlap to obtain the face region to be recognized.
3. The face image recognition method according to claim 2, wherein the step of obtaining the degree of overlap between the face regions of every two adjacent frames of face images in the plurality of frames of face images to obtain the plurality of overlap degrees comprises:
obtaining the intersection area of the face regions between every two adjacent frames of face images in the plurality of frames of face images;
obtaining the union area of the face regions between every two adjacent frames of face images in the plurality of frames of face images;
and calculating the degree of overlap between the face regions of every two adjacent frames of face images according to the intersection area and the union area to obtain the plurality of overlap degrees.
4. The face image recognition method according to claim 1, wherein before the step of screening out, from the plurality of frames of face images, the face images whose face area score is greater than the preset value to obtain the screened face images, the method further comprises:
obtaining the area of the face region in each frame of face image;
obtaining a first mapping relation between the area and the face area score;
and determining the face area score of the face region in each frame of face image according to the first mapping relation.
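A minimal sketch of such a first mapping relation, assuming a saturating linear map from the face region's pixel area to a score between 0 and 1 (the claim leaves the exact form of the mapping open):

    def face_area_score(region_area, reference_area=128 * 128):
        # Assumed first mapping relation: the score grows linearly with the
        # region's area and saturates at 1.0 at an assumed reference area.
        return min(1.0, region_area / reference_area)

    print(face_area_score(64 * 64))    # 0.25: small face region, low score
    print(face_area_score(200 * 200))  # 1.0: large face region, saturated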
5. The face image recognition method according to claim 1, wherein after the step of calculating the face quality score of each frame of face image according to the pose parameters and the sharpness, the method further comprises:
when no face quality score is greater than the preset threshold, screening out, from the plurality of frames of face images, the face image corresponding to the highest face quality score to obtain the target face image.
6. The face image recognition method according to claim 1, wherein the step of screening out, from the plurality of frames of face images, the face image whose face quality score is greater than the preset threshold to obtain the target face image comprises:
screening out, from the face quality scores, those greater than the preset threshold to obtain candidate face quality scores;
when there are a plurality of candidate face quality scores, selecting one of them according to a preset algorithm to obtain a target face quality score;
and determining, from the plurality of frames of face images, the face image corresponding to the target face quality score to obtain the target face image.
7. The face image recognition method according to claim 1, wherein the step of screening out, from the plurality of frames of face images, the face image whose face quality score is greater than the preset threshold to obtain the target face image comprises:
taking the first frame of face image among the plurality of frames of face images as the current face image;
comparing the face quality score of the current face image with the preset threshold;
and if the face quality score of the current face image is smaller than the preset threshold, taking the next frame of face image as the current face image and returning to the step of comparing the face quality score of the current face image with the preset threshold, until the face quality score of the current face image is greater than the preset threshold, thereby obtaining the target face image.
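A minimal sketch of this sequential scan, assuming the frames arrive as (image, quality_score) pairs already scored as in claim 1; all names are illustrative:

    def first_frame_above_threshold(scored_frames, threshold=0.8):
        # Walk the frames in order and stop at the first one whose face
        # quality score exceeds the threshold (the early exit of claim 7).
        for image, quality_score in scored_frames:
            if quality_score > threshold:
                return image
        return None  # no frame qualified; claim 5's fallback may apply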
8. A face image recognition apparatus, comprising:
the determining unit is used for collecting a plurality of frames of face images and determining a face region to be recognized according to the plurality of frames of face images;
the detecting unit is used for performing pose detection on the face in each frame of face image according to the face region to obtain pose parameters;
the obtaining unit is used for obtaining the moving speed of the face image between every two adjacent frames of the plurality of frames of face images and determining the sharpness corresponding to each frame of face image according to the moving speed, and is specifically used for: obtaining the distance and the time interval between every two adjacent frames of face images in the plurality of frames of face images; calculating the moving speed of each frame of face image according to the distance and the time interval; obtaining a third mapping relation between the moving speed and a sharpness score; and determining the sharpness score of each frame of face image according to the third mapping relation, and determining the sharpness corresponding to each frame of face image according to the sharpness score;
the screening unit is used for screening out, from the plurality of frames of face images and according to the pose parameters and the sharpness, a face image satisfying a preset condition to obtain a target face image, and is specifically used for: determining a face area score, a pose score, and a sharpness score of each frame of face image according to the pose parameters and the sharpness; setting corresponding weight values for the face area score, the pose score, and the sharpness score respectively; calculating the face quality score of each frame of face image according to the face area score, the pose score, the sharpness score, and their corresponding weight values; and screening out, from the plurality of frames of face images, a face image whose face quality score is greater than a preset threshold to obtain the target face image;
the recognition unit is used for performing face recognition on the target face image;
wherein the performing pose detection on the face in each frame of face image according to the face region to obtain the pose parameters comprises:
screening out, from the plurality of frames of face images, the face images whose face area score is greater than a preset value to obtain screened face images;
obtaining a first projection parameter of the face in the face region of each frame of the screened face images projected onto a two-dimensional plane, the first projection parameter being the coordinate points of the face in the two-dimensional plane;
obtaining a second projection parameter of a preset face model projected onto the two-dimensional plane, the second projection parameter being the coordinate points of the preset face model in the two-dimensional plane;
iteratively adjusting the positions of the three-dimensional coordinate points of the preset face model based on the first projection parameter and the second projection parameter until the projection of the adjusted preset face model onto the two-dimensional plane matches the face region in the face image, and taking the adjusted preset face model as the three-dimensional face model corresponding to the face region in the face image;
determining the deflection parameters of the face in the face region according to the three-dimensional face model corresponding to the face region in the face image;
and determining the pose score of the face region in each frame of face image according to a second mapping relation between the deflection parameters and the pose score and according to the deflection parameters of the face in the face region, and taking the pose score as the pose parameter.
9. A storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps of the face image recognition method of any one of claims 1 to 7.
CN201810750438.XA 2018-07-10 2018-07-10 Face image recognition method, device and storage medium Active CN109034013B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810750438.XA CN109034013B (en) 2018-07-10 2018-07-10 Face image recognition method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810750438.XA CN109034013B (en) 2018-07-10 2018-07-10 Face image recognition method, device and storage medium

Publications (2)

Publication Number Publication Date
CN109034013A CN109034013A (en) 2018-12-18
CN109034013B true CN109034013B (en) 2023-06-13

Family

ID=64641054

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810750438.XA Active CN109034013B (en) 2018-07-10 2018-07-10 Face image recognition method, device and storage medium

Country Status (1)

Country Link
CN (1) CN109034013B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871243B (en) * 2019-02-22 2021-12-21 山东诺蓝信息科技有限公司 Intelligent terminal multi-application software control method and system
CN110008673B (en) * 2019-03-06 2022-02-18 创新先进技术有限公司 Identity authentication method and device based on face recognition
CN110232323A (en) * 2019-05-13 2019-09-13 特斯联(北京)科技有限公司 A kind of parallel method for quickly identifying of plurality of human faces for crowd and its device
CN110263680B (en) * 2019-06-03 2022-01-28 北京旷视科技有限公司 Image processing method, device and system and storage medium
CN110363126A (en) * 2019-07-04 2019-10-22 杭州视洞科技有限公司 A kind of plurality of human faces real-time tracking and out of kilter method
CN112307817B (en) * 2019-07-29 2024-03-19 中国移动通信集团浙江有限公司 Face living body detection method, device, computing equipment and computer storage medium
CN110472567A (en) * 2019-08-14 2019-11-19 旭辉卓越健康信息科技有限公司 A kind of face identification method and system suitable under non-cooperation scene
CN110532957B (en) * 2019-08-30 2021-05-07 北京市商汤科技开发有限公司 Face recognition method and device, electronic equipment and storage medium
CN110740256B (en) * 2019-09-27 2021-07-20 深圳市海雀科技有限公司 Doorbell camera cooperation method and related product
CN110796108B (en) * 2019-11-04 2022-05-17 北京锐安科技有限公司 Method, device and equipment for detecting face quality and storage medium
CN110740315B (en) * 2019-11-07 2021-07-16 杭州宇泛智能科技有限公司 Camera correction method and device, electronic equipment and storage medium
CN113592874A (en) * 2020-04-30 2021-11-02 杭州海康威视数字技术股份有限公司 Image display method and device and computer equipment
CN111814613A (en) * 2020-06-24 2020-10-23 浙江大华技术股份有限公司 Face recognition method, face recognition equipment and computer readable storage medium
CN112070739A (en) * 2020-09-03 2020-12-11 Oppo广东移动通信有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN112329638A (en) * 2020-11-06 2021-02-05 上海优扬新媒信息技术有限公司 Image scoring method, device and system
CN112560775A (en) * 2020-12-25 2021-03-26 深圳市商汤科技有限公司 Switch control method and device, computer equipment and storage medium
CN113283305B (en) * 2021-04-29 2024-03-26 百度在线网络技术(北京)有限公司 Face recognition method, device, electronic equipment and computer readable storage medium
CN113297423A (en) * 2021-05-24 2021-08-24 深圳市优必选科技股份有限公司 Pushing method, pushing device and electronic equipment
CN113313009A (en) * 2021-05-26 2021-08-27 Oppo广东移动通信有限公司 Method, device and terminal for continuously shooting output image and readable storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105550637A (en) * 2015-12-04 2016-05-04 小米科技有限责任公司 Contour point positioning method and contour point positioning device
CN108062791A (en) * 2018-01-12 2018-05-22 北京奇虎科技有限公司 A kind of method and apparatus for rebuilding human face three-dimensional model

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102799877A (en) * 2012-09-11 2012-11-28 上海中原电子技术工程有限公司 Method and system for screening face images
CN103942525A (en) * 2013-12-27 2014-07-23 高新兴科技集团股份有限公司 Real-time face optimal selection method based on video sequence
CN106682619B (en) * 2016-12-28 2020-08-11 上海木木聚枞机器人科技有限公司 Object tracking method and device
CN108256459B (en) * 2018-01-10 2021-08-24 北京博睿视科技有限责任公司 Security check door face recognition and face automatic library building algorithm based on multi-camera fusion

Also Published As

Publication number Publication date
CN109034013A (en) 2018-12-18

Similar Documents

Publication Publication Date Title
CN109034013B (en) Face image recognition method, device and storage medium
US9224037B2 (en) Apparatus and method for controlling presentation of information toward human object
WO2019179441A1 (en) Focus tracking method and device of smart apparatus, smart apparatus, and storage medium
CN109389185B (en) Video smoke identification method using three-dimensional convolutional neural network
CN110705357A (en) Face recognition method and face recognition device
CN108596157B (en) Crowd disturbance scene detection method and system based on motion detection
US10915739B2 (en) Face recognition device, face recognition method, and computer readable storage medium
CN111091057A (en) Information processing method and device and computer readable storage medium
JP2010002976A (en) Image monitoring device
CN112308879A (en) Image processing apparatus, method of tracking target object, and storage medium
Li et al. Attention-based addressee selection for service and social robots to interact with multiple persons
CN113192164A (en) Avatar follow-up control method and device, electronic equipment and readable storage medium
KR20190009006A (en) Real time multi-object tracking device and method by using global motion
CN108875488B (en) Object tracking method, object tracking apparatus, and computer-readable storage medium
CN112700568B (en) Identity authentication method, equipment and computer readable storage medium
CN104751144B (en) A kind of front face fast appraisement method of facing video monitoring
Luo et al. Dynamic face recognition system in recognizing facial expressions for service robotics
Lorenzo-Navarro et al. An study on re-identification in RGB-D imagery
KR20140134549A (en) Apparatus and Method for extracting peak image in continuously photographed image
CN111368608B (en) Face recognition method, device and system
CN114359646A (en) Video analysis method, device, system, electronic equipment and medium
JP2022019339A (en) Information processing apparatus, information processing method, and program
CN111399627B (en) Energy-saving method and system for 3D display device
Jain et al. System to Detect the Relative Distance between User and Screen
Tran et al. A robust method for head orientation estimation using histogram of oriented gradients

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant