WO2022042135A1 - Face image selection method, apparatus, device and storage medium - Google Patents

Face image selection method, apparatus, device and storage medium

Info

Publication number
WO2022042135A1
WO2022042135A1 (PCT/CN2021/107182)
Authority
WO
WIPO (PCT)
Prior art keywords
face image
score
quality
face
overall quality
Prior art date
Application number
PCT/CN2021/107182
Other languages
English (en)
French (fr)
Inventor
陈星宇
张睿欣
汪韬
李绍欣
黄渊
程盼
李广源
阳斯拯
李季檩
吴永坚
黄飞跃
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2022042135A1
Priority to US17/964,730 (published as US20230030267A1)

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/98Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
    • G06V10/993Evaluation of the quality of the acquired pattern
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/235Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on user input or interaction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761Proximity, similarity or dissimilarity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/94Hardware or software architectures specially adapted for image or video understanding
    • G06V10/95Hardware or software architectures specially adapted for image or video understanding structured as a network, e.g. client-server architectures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/98Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
    • G06V10/987Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns with the intervention of an operator
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular, to a method, apparatus, device and storage medium for selecting a face image.
  • Face recognition is a kind of biometric recognition technology based on human facial feature information, and it is an important part of artificial intelligence technology. Before face recognition detection, it is often necessary to go through the process of face selection. Usually, the device will cache face images of a fixed number of frames, and select a better quality image as the object of face recognition.
  • the traditional face selection method is time-consuming and inflexible.
  • the embodiments of the present application provide a method, apparatus, device and storage medium for selecting a face image, which can effectively reduce the time required for the face selection process and improve the flexibility of the face selection process.
  • a method for selecting a face image comprising:
  • in response to detecting a first face image that meets the preliminary quality screening conditions, determining an overall quality score of the first face image, where the overall quality score is used to characterize the overall quality of the face image;
  • in response to the overall quality score of the first face image being greater than a first-level threshold, sending the first face image to the face recognition process.
  • a device for selecting a face image comprising:
  • a preliminary quality inspection module, configured to detect, after a frame of face image is acquired, whether the face image meets the preliminary quality screening conditions;
  • the overall score determination module is configured to, in response to detecting the first face image that meets the preliminary quality screening conditions, determine the overall quality score of the first face image, where the overall quality score is used to characterize the overall quality of the face image;
  • An image determination module configured to send the first face image to a face recognition process in response to the overall quality score of the first face image being greater than a first-level threshold.
  • a computer device, which includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set or an instruction set, and the at least one instruction, the at least one program, the code set or the instruction set is loaded and executed by the processor to implement the above method for selecting a face image.
  • a computer-readable storage medium, where at least one instruction, at least one segment of program, a code set or an instruction set is stored in the storage medium, and the at least one instruction, the at least one segment of program, the code set or the instruction set is loaded and executed by a processor to implement the above method for selecting a face image.
  • a computer program product or computer program where the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the above method for selecting a face image.
  • In the technical solutions provided by the embodiments of the present application, the face images are preliminarily screened by frame-by-frame detection, and the overall quality score is determined only when the preliminary screening is passed, instead of rigidly filtering out the first few frames of face images, which improves the flexibility of the face selection process.
  • Through the preliminary quality screening, it is accurately determined whether the automatic exposure adjustment has ended; only then is the quality of the face image judged. Compared with the related art, which mechanically waits for a fixed number of frames of face images before starting to judge face image quality, this can cut the time consumed by more than half.
  • When the overall quality is qualified, the face image can be sent to the face recognition process, which effectively reduces the time required for face selection, thereby helping to shorten the duration of the entire face recognition process and improve the user experience.
  • FIG. 1 is a schematic diagram of an application running environment provided by an embodiment of the present application.
  • FIG. 2 is a flowchart of a method for selecting a face image provided by an embodiment of the present application
  • FIG. 3 exemplarily shows a schematic diagram of an interface for sending a first face image into a face recognition process
  • FIG. 4 exemplarily shows a schematic interface diagram of displaying prompt information when the face screening process is stopped
  • FIG. 5 is a flowchart of a method for selecting a face image provided by another embodiment of the present application.
  • FIG. 6 exemplarily shows a schematic diagram of a preliminary screening process of a face image
  • FIG. 7 exemplarily shows a schematic diagram of the process of determining the overall quality score by the first scoring model
  • FIG. 8 exemplarily shows a schematic diagram of an interface for displaying adjustment information according to quality attribution points
  • FIG. 9 exemplarily shows a schematic diagram of a basic capability of face quality assessment
  • FIG. 10 exemplarily shows a schematic diagram of a selection scheme of a face image
  • FIG. 11 exemplarily shows a schematic diagram of a comparison of selection schemes of face images
  • FIG. 12 is a flowchart of a training method for a first scoring model provided by an embodiment of the present application.
  • FIG. 13 is a flowchart of a training method for a second scoring model provided by an embodiment of the present application.
  • FIG. 14 exemplarily shows a schematic diagram of training a first scoring model and a second scoring model
  • FIG. 15 exemplarily shows a schematic diagram of performing correction processing on label information of conflicting samples
  • FIG. 16 is a block diagram of an apparatus for selecting a face image provided by an embodiment of the present application.
  • FIG. 17 is a block diagram of an apparatus for selecting a face image provided by another embodiment of the present application.
  • FIG. 18 is a structural block diagram of a computer device provided by an embodiment of the present application.
  • FIG. 1 shows a schematic diagram of a solution implementation environment provided by an embodiment of the present application.
  • the implementation environment of this solution can be realized as a face recognition system.
  • the solution implementation environment may include: a terminal 10 and a server 20 .
  • the terminal 10 may be an electronic device such as a mobile phone, a tablet computer, a multimedia playback device, a wearable device, a PC (Personal Computer), a face payment terminal, a face check-in terminal, a smart camera, and the like.
  • the terminal 10 may be configured or connected to a camera, and face video data can be collected through the camera.
  • a client running an application program may be installed in the terminal 10, and the application program may include a face recognition function.
  • The type of the application is not limited; for example, it may be a social application, a payment application, a monitoring application, an instant messaging application, a video application, a news or information application, a music application, a shopping application, and the like.
  • the server 20 may be an independent physical server, a server cluster or a distributed system composed of multiple physical servers, or a cloud server providing cloud computing services.
  • the server 20 may be a background server of the above application program, and is used to provide background services for the application program.
  • Communication between the terminal 10 and the server 20 may be performed through a network, which is not limited in this application.
  • The execution subject of each step may be the server 20, the terminal 10 (such as the client of the application program running in the terminal 10), or the terminal 10 and the server 20 in interactive cooperation.
  • a computer device is used as the execution subject of each step for introduction and description, but this does not constitute a limitation.
  • the application scenarios of face-swiping payment include but are not limited to self-service terminal payment scenarios, mobile terminal payment scenarios, and unmanned retail store scenarios.
  • In the self-service terminal payment scenario, the above method can be applied to the cash register equipment installed in large commercial complexes, supermarkets, gas stations, hospitals, self-service vending machines, campuses, and other places.
  • In the mobile terminal payment scenario, the above method can be applied to mobile terminals such as smart phones and wearable devices.
  • In the unmanned retail store scenario, the above method can be applied to the terminal of an unmanned retail store.
  • Terminals in the above-mentioned face recognition environment such as face-swiping payment terminals, can be connected to the cloud platform through the network.
  • the terminal is also provided with a face selection module trained based on AI (Artificial Intelligence, artificial intelligence) technology, which can execute the face image selection method provided by the present application to achieve the effect of quickly selecting face images.
  • FIG. 2 shows a flowchart of a method for selecting a face image provided by an embodiment of the present application.
  • the method may include the following steps (201-203).
  • Step 201 After acquiring a frame of face image, check whether the face image meets the preliminary quality screening conditions.
  • Each time a frame of face image is acquired, it is detected whether the face image meets the preliminary quality screening conditions; that is, preliminary quality screening is performed on each acquired frame of face image, thereby realizing frame-by-frame detection of the preliminary quality.
  • the above-mentioned face image refers to an image to be detected that includes a human face.
  • the above-mentioned face image may be obtained from a face video stream, and one image frame in the above-mentioned face video stream corresponds to one face image.
  • the face image is an image frame in the face video stream, or the face image is a part of the image area in the image frame that includes the face.
  • the face video stream can be acquired by the computer device itself.
  • the above-mentioned preliminary quality screening conditions are conditions used to preliminarily judge the quality of face images, and serve as the basis for preliminary screening of face images.
  • Face image collection devices, such as standalone cameras or the cameras in terminals, often need to undergo an automatic exposure (Automatic Exposure, AE) adjustment process so that the face image has a good brightness effect.
  • The above automatic exposure means that the camera automatically adjusts the exposure according to the intensity of the light to prevent overexposure or underexposure.
  • By adjusting the lens aperture, sensor exposure time, sensor analog gain, and sensor/ISP (Image Signal Processing) digital gain, automatic exposure achieves the desired, or so-called target, brightness level in different lighting conditions and scenes, so that the captured video or image is neither too dark nor too bright.
  • The face images collected during the automatic exposure adjustment process are often of poor quality due to brightness problems; therefore, they are usually not selected as images for face recognition, so as to avoid affecting the accuracy of face recognition.
  • In this way, the face images collected during the automatic exposure adjustment process can be filtered out, and the face images collected after the automatic exposure adjustment process ends are then screened, which reduces the calculation amount of the subsequent face image selection steps.
  • Step 202 when the first face image that meets the preliminary quality screening conditions is detected, determine the overall quality score of the first face image.
  • The above-mentioned first face image is the face image corresponding to the first image frame in the above-mentioned face video stream that meets the preliminary quality screening conditions, for example, the first frame of face image collected after the automatic exposure adjustment process ends.
  • the above-mentioned overall quality score is used to characterize the overall quality of the face image.
  • the overall quality score is positively correlated with the overall quality of the face image, and the higher the overall quality score is, the better the overall quality of the face image is.
  • Step 203 If the overall quality score of the first face image is greater than the first-level threshold, the first face image is sent to the face recognition process.
  • The above-mentioned first-level threshold is a preset value used as the basis for determining whether to send the first face image into the face recognition process. If the overall quality score of the first face image is greater than the first-level threshold, it means that the overall quality of the first face image is good and meets the quality requirements for face recognition, so the first face image can be sent to the face recognition process.
  • the above-mentioned setting of the first-level threshold may be set according to the actual situation and according to experience or experimental data. The embodiment of the present application does not limit the value of the first-level threshold and the basis for the value.
  • the above-mentioned face recognition is a kind of biometric recognition technology based on human facial feature information for identification.
  • The method for selecting a face image can be applied to a variety of scenarios involving face quality assessment, including but not limited to face-swiping payment, camera imaging quality review, ID photo quality review, and many other application scenarios; the above content is exemplified by taking face-swiping payment as an example.
  • The face-swiping payment scenarios can be roughly divided into three types according to the user's degree of cooperation: cooperative scenarios, semi-cooperative scenarios, and non-cooperative scenarios.
  • the above cooperation scenario means that most users are in a normal cooperation state when paying, so the face image collected by the payment device is of good quality and can be used as an image for face recognition.
  • The above-mentioned semi-cooperative scenario refers to a scene in which, due to environmental or force majeure factors, the overall quality of the face image collected when the user pays is poor.
  • the above-mentioned non-cooperative scenarios refer to scenarios in which the user performs face-swiping payment in a non-cooperative state such as wearing sunglasses or turning his head too far.
  • By setting the first-level threshold as the basis for judging whether the face-swiping payment scene is a cooperative scene, only one judgment is needed, namely whether the overall quality score of the face image is greater than the first-level threshold, to determine the type of face-swiping payment scene.
  • If the overall quality score of the face image is greater than the first-level threshold, it can be determined that the user is paying by face in a cooperative scene, and the collected face image can be sent to the face recognition process for face recognition detection, ensuring that for most users the face image passes in one attempt, which shortens the time spent in the face image selection process.
  • In some embodiments, when the overall quality score of the first face image is equal to the first-level threshold, the processing can follow either the case where the score is greater than the threshold or the case where it is less than the threshold. That is, in one implementation of the present application, the case where the overall quality score equals the first-level threshold is handled the same way as the case where it is greater than the first-level threshold; in another implementation, it is handled the same way as the case where it is smaller than the first-level threshold. This application does not limit it.
  • Referring to FIG. 3, it exemplarily shows a schematic diagram of an interface for sending the first face image into the face recognition process.
  • prompt information 31 for face recognition processing and a first face image 32 are displayed.
  • In some embodiments, a dynamic face video stream (not shown in the figure) is displayed in the face image circular display area 33; while the first face image is being sent into the face recognition process, the first face image 32 is still displayed in the face image circular display area 33.
  • In some embodiments, after the above-mentioned step 203, the following steps are further included.
  • Step 204 if the overall quality score of the first face image is less than the second-level threshold, stop the face screening process and display a prompt message.
  • the above-mentioned secondary threshold is a preset value and is used as a basis for determining whether to stop the face screening process. Wherein, the secondary threshold is smaller than the primary threshold. If the overall quality score of the first face image is lower than the second-level threshold, it means that the overall quality of the first face image is poor and cannot meet the quality requirements for face recognition, and the face screening process can be stopped.
  • the setting of the above-mentioned secondary threshold may be set according to the actual situation and according to experience or experimental data. The embodiment of the present application does not limit the value of the secondary threshold and the basis for the value. Similarly, the above content is exemplified here by taking face-swiping payment as an example.
  • By setting a secondary threshold as the basis for determining whether the face-swiping payment scene is a non-cooperative scene, low-quality face images can be effectively intercepted.
  • In practical face-swiping payment applications, the secondary threshold is often set low, mainly to counter illegal malicious network attacks.
  • By setting the secondary threshold, the pictures carried by these malicious network attacks, as well as low-quality face images collected while users are in an uncooperative state, can be effectively intercepted.
  • the secondary threshold may also be equal to the primary threshold.
  • The above prompt information is used to prompt the user that the computer device needs to acquire the face image again, and to inform the user that the face screening process has been stopped at this time.
  • Referring to FIG. 4, it exemplarily shows a schematic diagram of an interface for displaying prompt information when the face screening process is stopped.
  • A prompt information box 41 is displayed in the display interface 40, and the information content 42 prompting that the face screening process has stopped is displayed in the prompt information box 41.
  • the prompt information box 41 also includes an exit control 43 and a re-detection control 44.
  • To sum up, the technical solutions provided by the embodiments of the present application perform preliminary screening of face images by means of frame-by-frame detection, and determine the overall quality score only when the preliminary screening is passed, rather than rigidly filtering out the first several frames of face images, which improves the flexibility of the face selection process.
  • Through the preliminary quality screening, it is accurately determined whether the automatic exposure adjustment has ended; after it ends, the quality of the face image can be judged.
  • When the overall quality of the face image is qualified, the face image can be sent to the face recognition process, effectively reducing the time required for face selection, thereby helping to shorten the duration of the entire face recognition process and improve the user experience.
  • the face screening process is stopped to effectively intercept the pictures carried by these malicious network attacks or the low-quality face images collected by the user in a non-cooperative state.
  • FIG. 5 shows a flowchart of a method for selecting a face image provided by another embodiment of the present application.
  • the method may include the following steps (501-517).
  • Step 501 After each frame of face image is acquired, acquire the light score of the face image.
  • The above-mentioned light score is used to characterize the brightness and darkness of the face image.
  • The light score is the basis for judging whether the automatic exposure adjustment process mentioned in the previous embodiment has ended.
  • Step 502 According to the light score of the face image, detect whether the face image meets the preliminary quality screening conditions.
  • whether the face image meets the preliminary quality screening conditions is detected by means of adaptive judgment.
  • In some embodiments, whether the face image meets the preliminary quality screening conditions is detected by comparing the light score of the face image with a light score threshold: if the light score of the face image is greater than or equal to the light score threshold, the face image meets the preliminary quality screening conditions; if the light score of the face image is lower than the light score threshold, the face image does not meet the preliminary quality screening conditions.
  • The above-mentioned light score threshold is a preset value, which may be determined according to at least one of the parameters of automatic exposure, the parameters of the image acquisition device, or environmental parameters, which is not limited in this embodiment of the present application.
  • the above steps 501-502 are a preliminary screening process of face images.
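  • As a minimal sketch of steps 501-502 (assuming the light score is the normalized mean grayscale brightness, which the patent does not fix, and using a hypothetical threshold value):

```python
import numpy as np

# Hypothetical light-score threshold; the patent leaves the value to be set
# from the automatic exposure parameters, device parameters, or environment.
LIGHT_SCORE_THRESHOLD = 0.4

def light_score(face_image: np.ndarray) -> float:
    """Characterize the brightness of a face image as a value in [0, 1].

    Assumes the light score is the normalized mean grayscale intensity;
    the patent does not specify a concrete formula.
    """
    gray = face_image.mean(axis=2) if face_image.ndim == 3 else face_image
    return float(gray.mean() / 255.0)

def passes_preliminary_screening(face_image: np.ndarray) -> bool:
    # Frame-by-frame preliminary screening: a frame passes once auto-exposure
    # has settled and its brightness reaches the light-score threshold.
    return light_score(face_image) >= LIGHT_SCORE_THRESHOLD
```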
  • Referring to FIG. 6, it exemplarily shows a schematic diagram of a preliminary screening process of a face image.
  • a face video stream 61 is shown in FIG. 6 .
  • the sixth frame of face image 62 is the first frame of face image that meets the preliminary quality screening conditions.
  • the quality of the sixth frame of image 62 can be detected.
  • If the overall quality score of the sixth frame image 62 is higher than the first-level threshold, it can be determined that the face quality is qualified, and the face optimization process ends early.
  • In contrast, in the conventional scheme, the device waits for n frames 63 and takes the last of them as the start of the face optimization process; this method does not judge the automatic exposure adjustment state of the face image acquisition device, which increases the time consumed.
  • Step 503 when the first face image that meets the preliminary quality screening conditions is detected, the first scoring model is called.
  • the above-mentioned first scoring model is a neural network model for determining the overall quality score.
  • The above-mentioned first scoring model is a neural network model based on a residual network (ResNet), combined with structures such as Squeeze-and-Excitation Networks (SENet), group convolution, and asymmetric convolution networks (ACNet).
  • the above-mentioned convolutional neural network based on residual network is characterized by being easy to optimize and can improve the accuracy by adding considerable depth.
  • the internal residual block uses skip connections to alleviate the gradient disappearance problem caused by increasing depth in deep neural networks.
  • The above group convolution groups the feature map input to the convolutional neural network by channel, and then convolves each group separately.
  • In this way, the number of parameters in the neural network model can be effectively reduced, and a better model application effect can be obtained.
  • The above-mentioned asymmetric convolutional network is a convolutional neural network constructed by replacing standard square convolutions, such as 3×3 convolution blocks, with Asymmetric Convolution Blocks (ACB).
  • For example, an ACB can be constructed that contains three parallel branches d×d, 1×d, and d×1, and the outputs of the three branches are added to enrich the feature space.
  • Asymmetric convolutional networks can improve the accuracy and expressiveness of models without introducing additional parameters and increasing computational time.
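  • As an illustration of the asymmetric convolution structure described above, the following is a minimal PyTorch sketch of an ACB with three parallel d×d, 1×d, and d×1 branches; the layer choices (BatchNorm, padding) are assumptions, not taken from the patent:

```python
import torch
import torch.nn as nn

class AsymmetricConvBlock(nn.Module):
    """Minimal ACB sketch: three parallel branches (d x d, 1 x d, d x 1)
    whose outputs are summed to enrich the feature space."""

    def __init__(self, in_ch: int, out_ch: int, d: int = 3):
        super().__init__()
        p = d // 2
        self.square = nn.Conv2d(in_ch, out_ch, (d, d), padding=(p, p))
        self.hor = nn.Conv2d(in_ch, out_ch, (1, d), padding=(0, p))
        self.ver = nn.Conv2d(in_ch, out_ch, (d, 1), padding=(p, 0))
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sum the three branches; at inference time the three kernels can be
        # fused into a single d x d kernel, adding no extra parameters.
        return self.bn(self.square(x) + self.hor(x) + self.ver(x))
```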
  • In some embodiments, a gradient image corresponding to the first face image is obtained, and the first scoring model is called so that both the first face image and its corresponding gradient image are input into the first scoring model.
  • the above-mentioned gradient image refers to an image including gradient information of the first face image.
  • The image can be regarded as a two-dimensional discrete function, and the image gradient is essentially the derivative of this two-dimensional discrete function.
  • In some embodiments, the first face image is processed by a Sobel operator to obtain a gradient image corresponding to the first face image.
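  • A minimal sketch of this gradient-image computation with the Sobel operator (the kernel size and the way the two directional gradients are combined are assumptions):

```python
import cv2
import numpy as np

def gradient_image(face_image: np.ndarray) -> np.ndarray:
    """Apply the Sobel operator in both directions on a BGR face image and
    combine the magnitudes into a single-channel gradient image."""
    gray = cv2.cvtColor(face_image, cv2.COLOR_BGR2GRAY)
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    mag = np.sqrt(gx ** 2 + gy ** 2)
    return (255.0 * mag / (mag.max() + 1e-8)).astype(np.uint8)
```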
  • Step 504 Determine the overall quality score of the first face image through the first scoring model.
  • the first face image is input into the first scoring model, and the overall quality score of the first face image is output through the first scoring model.
  • the first scoring model obtains channel information of the first face image and a feature map corresponding to the first face image based on the first face image.
  • the first scoring model performs convolution processing based on channel information of the first face image and a feature map corresponding to the first face image.
  • the activation function in the first scoring model processes the input content, such as a Rectified Linear Unit (ReLU).
  • the input data is pooled through the first scoring model.
  • an overall quality score of the first face image is output.
  • the first face image and the gradient image corresponding to the first face image are input into the first scoring model, and the overall quality score of the first face image is output through the first scoring model.
  • Referring to FIG. 7, it exemplarily shows a schematic diagram of the process of determining the overall quality score by the first scoring model.
  • FIG. 7 shows a process 72 of gradient-map priors for a face image 71, as well as a network structure 73 incorporating Squeeze-and-Excitation networks, and a network structure 74 of asymmetric convolutional networks.
  • Step 505 determine whether the overall quality score of the first face image is greater than the first-level threshold; if so, execute the following step 506 ; if not, execute the following step 507 .
  • Step 506 sending the first face image into the face recognition process.
  • Step 507 Determine whether the overall quality score of the first face image is less than the second-level threshold, if so, end the face screening process; if not, execute the following step 508.
  • Step 508 Obtain the overall quality score of the next frame of face image.
  • the initial next frame of face image refers to the next frame of face image of the first face image.
  • the next frame of face image is a face image corresponding to the next image frame of the image frame corresponding to the current face image in the face video stream.
  • the first face image is stored in the buffer area, and the overall quality score of the next frame of the face image is acquired.
  • the above-mentioned buffer area refers to the memory for temporarily placing output or input data.
  • Step 509 Determine whether the overall quality score of the next frame of face image is greater than the first-level threshold; if so, execute the following step 510; if not, execute the following step 511.
  • Step 510 sending the next frame of face image into the face recognition process.
  • Step 511 determine whether the overall quality score of the next frame of the face image is less than the second-level threshold, if so, end the face screening process; if not, execute the following step 512 .
  • Otherwise, the next frame of face image is stored in the buffer area, and execution begins again from the step of obtaining the overall quality score of the next frame of face image.
  • In some embodiments, after the above step 511, the method further includes the following steps.
  • Step 512 Determine whether the overall quality scores of n consecutive frames of face images are all less than the first-level threshold and higher than the second-level threshold. If so, execute the following step 513; if not, return to step 508.
  • Step 513 select the second face image with the highest overall quality score from the consecutive n frames of face images.
  • n is a positive integer greater than 1.
  • the value of n is a preset value, and the value of n may be set according to the actual situation, which is not limited in this embodiment of the present application.
  • n is 5.
  • In some embodiments, the above steps 512-513 can also be implemented in the following manner: if the overall quality scores of the n frames of face images in the buffer area are all less than the first-level threshold, the second face image with the highest overall quality score is selected from the n frames of face images in the buffer area.
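  • Putting steps 505-513 together, the frame-by-frame selection logic can be sketched as follows (names are illustrative; score_fn stands for the first scoring model):

```python
def select_face_image(frames, score_fn, t1, t2, n=5):
    """Frame-by-frame selection over frames that passed preliminary screening.

    t1, t2 -- first-level and second-level thresholds (t2 < t1)
    n      -- buffer size, e.g. 5
    Returns (image, action).
    """
    buffer = []
    for frame in frames:
        score = score_fn(frame)
        if score > t1:
            return frame, "send_to_recognition"    # overall quality qualified
        if score < t2:
            return None, "stop_screening"          # non-cooperative scene / attack
        buffer.append((score, frame))
        if len(buffer) == n:
            # n consecutive frames between the two thresholds: pick the best
            # and hand it to the quality attribution step (step 514).
            _, best_frame = max(buffer, key=lambda sf: sf[0])
            return best_frame, "attribute_quality"
    return None, "stream_ended"
```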
  • Step 514 Determine the quality attribution score of the second face image.
  • For the first-level threshold and the second-level threshold, please refer to the introduction in the previous embodiment, which is not repeated here.
  • the secondary threshold is smaller than the primary threshold.
  • The above quality attribution score includes quality scores on multiple quality reference dimensions, reflecting the quality of the face image across those dimensions; through the quality attribution score, the quality level of the face image on a given quality reference dimension can be seen intuitively.
  • the above-mentioned quality reference dimension is a reference component for measuring the quality of the face image, and is used to evaluate the quality of the face image in more detail.
  • the quality reference dimension includes at least one of an angle dimension, a blur dimension, an occlusion dimension, and a light dimension.
  • the process of determining the quality attribution score of the second face image in the above step 514 may be implemented by the following steps.
  • Step 514a calling a second scoring model, which is a machine learning model for determining a quality attribution score.
  • the second scoring model is a neural network model for determining quality attribution scores.
  • the structure of the foregoing second scoring model is similar to the structure of the foregoing first scoring model.
  • For the structure of the second scoring model reference may be made to the content of the foregoing first scoring model, which will not be repeated here.
  • Step 514b determining the quality attribution score of the second face image through the second scoring model.
  • the above-mentioned quality attribution score includes at least one of angle score, blur score, occlusion score, and light score.
  • The angle score is used to characterize the face angle of the face image, the blur score the degree of blur of the face image, the occlusion score the occlusion of the face image, and the light score the brightness of the face image.
  • The above-mentioned angle score, blur score, occlusion score, and light score are correlated with image quality; whether the correlation is positive or negative can be set according to the actual situation, and this embodiment of the present application does not limit it.
  • Step 515 Determine whether the quality attribution score of the second face image meets the conditions, if yes, execute the following step 516; if not, execute the following step 517.
  • Step 516 sending the second face image into the face recognition process.
  • That the quality attribution score of the second face image meets the conditions means that each item of the quality attribution score meets its corresponding condition.
  • Take the case where the quality attribution score includes the angle score, blur score, occlusion score, and light score as an example.
  • That the quality attribution score of the second face image meets the conditions means that the angle score, blur score, occlusion score, and light score all meet their corresponding threshold conditions; for example, the angle score meets the angle score threshold condition, the blur score meets the blur score threshold condition, the occlusion score meets the occlusion score threshold condition, and the light score meets the light score threshold condition.
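  • A minimal sketch of this per-dimension check (the threshold values, and the assumption that a higher attribution score is better, are illustrative; the patent leaves the correlation direction open):

```python
# Hypothetical per-dimension thresholds; the patent does not fix their values.
ATTRIBUTION_THRESHOLDS = {"angle": 0.5, "blur": 0.5, "occlusion": 0.5, "light": 0.5}

def attribution_meets_conditions(scores: dict) -> bool:
    """Step 515: every quality attribution score must satisfy its own
    threshold condition before the image is sent to recognition."""
    return all(scores[dim] >= thr for dim, thr in ATTRIBUTION_THRESHOLDS.items())

def failing_dimensions(scores: dict) -> list:
    # Step 517: the failing dimensions drive the adjustment information shown
    # to the user (e.g. "please do not cover your face").
    return [dim for dim, thr in ATTRIBUTION_THRESHOLDS.items() if scores[dim] < thr]
```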
  • Step 517 Display adjustment information according to the quality attribution score.
  • That the above-mentioned quality attribution score of the second face image does not meet the conditions means that at least one item of the quality attribution score does not meet its condition.
  • When the quality attribution score includes the angle score, blur score, occlusion score, and light score, as long as any one of them does not meet its corresponding threshold condition, it can be determined that the quality attribution score of the second face image does not meet the conditions.
  • the above adjustment information refers to information that prompts the user to make adjustments to improve the quality of the face image.
  • Referring to FIG. 8, it exemplarily shows a schematic diagram of an interface for displaying adjustment information according to the quality attribution score.
  • FIG. 8 shows three interfaces 81, 82, and 83 for displaying adjustment information.
  • The content of the adjustment information 84 displayed in interface 81 is "please do not cover your face"; the content of the adjustment information 85 displayed in interface 82 is "please take off your glasses or hat"; the content of the adjustment information 86 displayed in interface 83 is "please keep your face straight".
  • Referring to FIG. 9, it exemplarily shows a schematic diagram of the basic capability of face quality assessment.
  • From a large face angle to a frontal face, the angle score increases accordingly; from a blurred face to a clear face, the blur score also increases gradually; and from severe occlusion to no occlusion, the occlusion score of the face image increases.
  • Referring to FIG. 10, it exemplarily shows a schematic diagram of a selection scheme of a face image.
  • The process of performing face optimization in the cooperative scene is embodied in the part circled by the dotted-line box 1010 in FIG. 10, and the process of performing face optimization in the semi-cooperative scene is embodied in the part circled by the dotted-line box 1020 in FIG. 10.
  • In the semi-cooperative scene, it is necessary to determine the quality attribution score of the face image and then judge the light score, blur score, angle score, and occlusion score of the face image in turn.
  • The process of face optimization in the non-cooperative scene is mainly reflected in the part circled by the dotted box 1030 in FIG. 10.
  • In the non-cooperative scene, the device prompts the user to make corresponding adjustments according to the quality attribution score, with prompt information such as light too strong, light too low, face blurred, face occluded, or face angle too large.
  • In some embodiments, the reason for the low quality of the face image can be determined by comparing the angle score with the occlusion score: if the angle score is greater than the occlusion score, it can be determined that the face twist angle is too large; if the angle score is less than the occlusion score, it can be determined that the face is occluded.
  • In some embodiments, the complete face recognition process usually includes three stages, namely the video stream acquisition stage, the face optimization stage, and the face recognition stage.
  • The method adopted by the conventional technical solution in the video stream acquisition stage is to discard a fixed number of frames from the collected face video stream, so that the poor-quality face images captured while the image capture device is still adjusting automatic exposure are filtered out, and then to enter the face optimization stage to judge the quality of the face images.
  • the first 20 frames of face images in the face video stream are fixedly filtered, and the face selection process starts from the 21st frame of face images.
  • However, most actual face-swiping payment scenarios are cooperative scenarios.
  • In such scenarios, the automatic exposure adjustment of the image acquisition device is very short; since the conventional technical solution cannot judge by itself that the automatic exposure adjustment has ended, some useful face image frames are wasted, increasing the time consumed.
  • In contrast, the method adopted by the technical solution of the present application in the video stream acquisition stage is to adaptively determine the automatic exposure adjustment state of the image acquisition device according to image brightness: as soon as there is a face image whose brightness meets the conditions, its quality can be judged. For example, if the automatic exposure adjustment process ends at the 8th frame, the technical solution of the present application can determine through adaptive judgment that the brightness of the 8th frame meets the conditions and then judge its quality, rather than waiting until the 21st frame, effectively reducing the time consumed in the video stream acquisition stage by more than half.
  • The method adopted by the conventional technical solution in the face optimization stage is to cache a fixed number of frames of face images from the face video stream for detection and select a face image of good quality from them; if none qualifies, it caches another fixed number of frames from the face video stream and repeats the above steps, finally sending the selected image to the face recognition process. For example, the face images of frames 21-25 are cached from the face video stream, the quality of the five frames is detected respectively, and then a good-quality face image is selected or the next five frames are cached.
  • The method adopted by the technical solution of the present application in the face optimization stage is to calculate the overall quality score of the face images frame by frame. If the overall quality score of a face image is higher than the threshold, it can be sent to the face recognition process. If the overall quality scores of n consecutive frames of face images are all lower than the threshold, the quality attribution score of the face image with the highest overall quality score is calculated across multiple dimensions to analyze why the quality is low, and the user is prompted to make corresponding adjustments, improving the user experience and cultivating correct usage habits. For example, when the brightness of the face image in the 8th frame meets the conditions, the overall quality score of the 8th frame is calculated; if it is higher than the threshold, the 8th frame can be sent into the face recognition process.
  • Assume that the face optimization process of the conventional technical solution and of the technical solution of the present application starts at the same position in the face video stream, the 21st frame. The conventional technical solution uses frames 21-25 to judge the face quality of five frames, while the technical solution of the present application detects frame by frame and calculates the overall quality score immediately from the 21st frame. If the 21st frame is a face image of good quality, the present application can immediately send it into the face recognition process, whereas the conventional technique needs to evaluate five frames before selecting the 21st frame and sending it on. In this case, the technical solution of the present application is five times faster than the conventional technical solution, and even in the worst case the number of detections is similar to that of the conventional scheme, so it can effectively improve the speed of face selection and ultimately shorten the time required for the complete face recognition process.
  • Referring to FIG. 11, it exemplarily shows a schematic diagram of a comparison of selection schemes of face images.
  • In the conventional scheme 1102, the solution for the automatic exposure adjustment process is to fixedly filter out 20 frames; the solution for judging face image quality is to cache 5 frames and select the best 1 frame, and its interception effect is poor. In the technical solution 1101 provided by this embodiment, the solution for the automatic exposure adjustment process is to adaptively determine when AE ends; the solution for judging face image quality is to judge image quality frame by frame and send any image whose score is higher than the threshold for recognition; and the quality attribution score covers the angle, blur, occlusion, and light dimensions, so the effect is significantly better than the conventional scheme.
  • To sum up, the technical solutions provided by the embodiments of the present application take qualified image brightness as the condition for passing the preliminary screening, and then output the overall quality score of the face image through the first scoring model; when the overall quality scores of consecutive frames of face images are all less than the first-level threshold, the quality attribution score of the face image is output through the second scoring model, so the quality of the face image can be judged from multiple dimensions.
  • When the overall quality is qualified, the face image is sent to the face recognition process, which effectively reduces the time required for face selection.
  • the reason why the quality of the face image does not meet the standard can also be analyzed according to the quality attribution score, and the user is prompted to make corresponding adjustments.
  • the training method of the first scoring model includes the following steps (1201-1204).
  • Step 1201 acquiring training samples.
  • the training samples include sample face images and standard face images corresponding to the sample face images.
  • the above-mentioned sample face image is an image containing a sample face.
  • the above-mentioned standard face image corresponding to the sample face image refers to a high-quality image corresponding to the sample face used as a reference.
  • the sample face image is a life photo containing the sample face.
  • the standard face image is a certificate photo corresponding to the sample face.
  • Step 1202 Obtain the similarity between the sample face image and the standard face image.
  • the similarity may reflect the similarity between the sample face image and the standard face image, and is generally determined by calculating the distance between the corresponding feature vectors of the sample face image and the standard face image.
  • In some embodiments, the above step 1202 includes the following sub-steps.
  • Step 1202a perform feature identification processing on the sample face image to obtain feature information of the sample face image.
  • the above feature identification process refers to the process of identifying the feature information of the sample face in the sample face image, and the feature information of the sample face image reflects the information richness of the sample face.
  • the feature of the sample face image is obtained by performing feature recognition processing on the sample face image by using a face feature recognition model.
  • the above face feature recognition model is a mathematical model for identifying face feature information.
  • Step 1202b perform feature identification processing on the standard face image to obtain feature information of the standard face image.
  • the feature of the standard face image is obtained by performing feature identification processing on the standard face image through the face feature identification model.
  • Step 1202c based on the feature information of the sample face image and the feature information of the standard face image, obtain the similarity between the sample face image and the standard face image.
  • the feature information of the sample face image and the feature information of the standard face image are compared and processed, and the similarity between the sample face image and the standard face image is calculated.
  • the above comparison process refers to a process of comparing the similarity in the feature information of the sample face image and the feature information of the standard face image.
  • the similarity between the sample face image and the standard face image is reflected by calculating the distance between the feature of the sample face image and the feature of the standard face image.
  • The distance between the feature vector of the above-mentioned sample face image and the feature vector of the standard face image may be the Euclidean distance, Manhattan distance, Minkowski distance, cosine distance/similarity, or any other distance that reflects the similarity between two feature vectors, which is not limited in this embodiment of the present application.
  • the similarity between the sample face image and the standard face image is measured by Pearson Correlation Coefficient.
  • The Pearson correlation coefficient, also known as the Pearson Product-Moment Correlation Coefficient (PPMCC or PCCs), is used to measure the degree of (linear) correlation between two variables; its value lies between -1 and 1.
  • the Pearson correlation coefficient between two variables is defined as the quotient of the covariance and standard deviation between the two variables.
  • the above similarity is used to determine the first label information of the sample face image.
  • the first label information refers to the label information of the overall quality score.
  • the similarity is taken as the overall quality score of the sample face image, and recorded as the first label information of the sample face image, reflecting the overall quality of the sample face image. The higher the similarity of the sample face images, the higher the overall quality score of the sample face images, and the better the overall quality of the sample face images.
  • Denote the feature of the sample face image as $f(I_k)$, the feature of the standard face image as $f(I_0)$, the similarity between the sample face image and the standard face image as $S_k$, and the overall quality score in the label information of the sample face image as $Q_k$. The similarity and the label score can then be obtained, for example, as the cosine similarity between the two features:

$$S_k = \frac{f(I_k) \cdot f(I_0)}{\lVert f(I_k) \rVert \, \lVert f(I_0) \rVert}, \qquad Q_k = S_k$$
  • In this way, the overall quality label of the sample face image can be generated automatically and directly from the feature recognition process, eliminating the labeling cost of the sample face images used to train the first scoring model; ultimately, the overall quality score of an image can be obtained without any standard face image as reference.
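  • A minimal sketch of this automatic label generation, assuming cosine similarity between the two feature vectors (the patent also permits other distance measures):

```python
import numpy as np

def overall_quality_label(sample_feat: np.ndarray, standard_feat: np.ndarray) -> float:
    """The overall-quality label Q_k of a sample face image is its feature
    similarity S_k to the standard (ID-photo) image of the same person."""
    s_k = float(np.dot(sample_feat, standard_feat) /
                (np.linalg.norm(sample_feat) * np.linalg.norm(standard_feat) + 1e-8))
    return s_k  # Q_k = S_k: higher similarity, better overall quality
```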
  • Step 1203 Determine the first label information of the sample face image.
  • That is, the similarity, which reflects the overall quality of the sample face image, is used as the first label information of the sample face image.
  • Step 1204 Train the first scoring model based on the first label information of the sample face image.
  • the sample face image marked with the first label information is input into the first scoring model, and the overall prediction quality score of the sample face image is output through the first scoring model.
  • the above-mentioned overall prediction quality score refers to the overall quality score for predicting the sample face image output by the first scoring model.
  • Based on the difference between the predicted overall quality score and the first label information, the first scoring model is constrained to improve its accuracy.
  • In some embodiments, the mean square error (MSE) is combined with the Pearson correlation coefficient to construct the loss function of the first scoring model, which fits the predicted overall quality score based on linear regression over the recognized sample face image features and an interval order-preserving constraint.
  • MSE mean square error
  • the loss function can be represented by the following formula:
  • X is the predicted quality overall score
  • Y is the label value
  • ⁇ X and ⁇ Y are the mean
  • ⁇ X and ⁇ Y are the variances.
  • the overall quality score is constrained by the MSE loss term, and in order to ensure the consistency of the overall ordering, the Pearson correlation coefficient is added to constrain the overall order preservation across the samples.
  • the lower the value of the loss function, the more accurate the corresponding first scoring model, that is, the closer the predicted overall quality score is to the overall quality score in the label information of the sample face image.
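A sketch of such a combined loss, written here in PyTorch; the weighting factor lam between the two terms is an assumption, as the original formula does not specify one:

```python
import torch

def first_model_loss(pred: torch.Tensor, label: torch.Tensor, lam: float = 1.0) -> torch.Tensor:
    # MSE term: pulls each predicted overall quality score toward its label.
    mse = torch.mean((pred - label) ** 2)
    # Pearson term: pushes the batch to preserve the ordering of the labels.
    # The weight lam is an assumption; the original formulation does not give it.
    px, py = pred - pred.mean(), label - label.mean()
    rho = (px * py).sum() / (px.norm() * py.norm() + 1e-12)
    return mse + lam * (1.0 - rho)
```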
  • the training method of the second scoring model includes the following steps (1301-1302).
  • Step 1301 acquiring training samples.
  • the training samples include sample face images and second label information of the sample face images.
  • the second label information includes quality level information on multiple quality reference dimensions.
  • the above-mentioned quality level information is used to reflect the quality level of the sample face image in a certain quality reference dimension.
  • the quality level corresponding to each quality reference dimension is divided into five levels, that is, the sample face images are divided into five levels under each quality reference dimension.
  • only the quality level of the sample face image is marked, as weak supervision information (the second label information), so that the second scoring model can learn by itself the order relationship distribution within the quality levels under each quality reference dimension, thereby obtaining a score for each quality reference dimension and solving the difficult problem of labeling training samples under continuous-variable conditions.
  • the label value of the second label information reflects the probability that the sample face image is distributed under a certain quality reference dimension; for example, when there are five quality levels, the label value of the angle score in the second label information can take the values 0, 0.25, 0.5, 0.75, or 1.
  • specifically, the second label information includes label values for the angle score, the blur score, the light score, and the occlusion score; for example, the angle score is 0, the blur score is 1, the light score is 0.25, and the occlusion score is 0.5.
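For illustration, the mapping from quality levels to label values described above might look like the following sketch; the dictionary keys are assumptions:

```python
# With five quality levels per dimension, level i (0..4) maps to the label
# value i / 4, giving the values 0, 0.25, 0.5, 0.75, 1 mentioned above.
def level_to_label(level: int, num_levels: int = 5) -> float:
    return level / (num_levels - 1)

second_label = {
    "angle": level_to_label(0),      # 0.0
    "blur": level_to_label(4),       # 1.0
    "light": level_to_label(1),      # 0.25
    "occlusion": level_to_label(2),  # 0.5
}
```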
  • Step 1302 Train a second scoring model based on the second label information of the sample face image.
  • the sample face image carrying the second label information is input into the second scoring model, and the quality attribution score of the sample face image is output through the second scoring model.
  • by setting a loss function corresponding to the second scoring model, the second scoring model is constrained so as to improve its accuracy.
  • based on the Gaussian Mixture Model (GMM), a weakly supervised training loss function, the Gaussian mixture loss (GMM Loss), is designed.
  • the above-mentioned Gaussian mixture model quantifies things precisely using Gaussian probability density functions (normal distribution curves); it is a model that decomposes a distribution into several components, each based on a Gaussian probability density function (normal distribution curve).
  • the Gaussian mixture model uses K Gaussian components to characterize the quality of the sample face image under each quality reference dimension.
  • the loss function corresponding to the second scoring model takes the following Gaussian-mixture form, consistent with the symbol definitions below:

    L_GMM = -(1/N) Σ_i log [ p(z_i) N(x_i; μ_{z_i}, Σ_{z_i}) / Σ_{k=1}^{K} p(k) N(x_i; μ_k, Σ_k) ]

  • x_i is the input image, z_i is its labeled class, μ_{z_i} is the mean of the z_i-th class, Σ_{z_i} is the variance of the z_i-th class, p(z_i) is the probability of the z_i-th class, k indexes the classes, K is the number of classes, and p(k) is the probability of the k-th class.
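A PyTorch sketch of a loss of this form, assuming diagonal Gaussians and treating the labeled quality level as the mixture component; constant terms of the Gaussian log-density are dropped, and all parameter shapes are assumptions:

```python
import torch
import torch.nn.functional as F

def gmm_loss(feats: torch.Tensor, levels: torch.Tensor,
             means: torch.Tensor, log_vars: torch.Tensor,
             log_priors: torch.Tensor) -> torch.Tensor:
    # Negative log posterior of the labeled level z_i under a K-component
    # diagonal Gaussian mixture. Shapes (assumed): feats (N, D), levels (N,),
    # means (K, D), log_vars (K, D), log_priors (K,).
    diff = feats.unsqueeze(1) - means.unsqueeze(0)                    # (N, K, D)
    # log N(x; mu_k, sigma_k^2) up to an additive constant.
    log_gauss = -0.5 * ((diff ** 2) / log_vars.exp().unsqueeze(0)
                        + log_vars.unsqueeze(0)).sum(dim=-1)          # (N, K)
    log_joint = log_gauss + log_priors.unsqueeze(0)                   # log p(k) N(x; ...)
    log_post = log_joint - torch.logsumexp(log_joint, dim=1, keepdim=True)
    return F.nll_loss(log_post, levels)                               # -log p(z_i | x_i)
```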
  • the loss function of the second scoring model may be selected according to the difference between the label value of the training sample and the predicted value output by the second scoring model: if the difference is greater than a preset threshold, a loss function constructed based on the mean square error is selected to constrain the second scoring model; if the difference is less than or equal to the preset threshold, a loss function constructed based on the Gaussian mixture model and cross entropy (Cross Entropy) is selected to constrain the second scoring model. The above cross entropy is used to measure the difference between two probability distributions.
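A sketch of this selection rule, reusing the gmm_loss function from the previous sketch (which already takes the cross entropy of the mixture posterior); the threshold value is an assumption:

```python
import torch

def second_model_loss(pred: torch.Tensor, label: torch.Tensor,
                      feats: torch.Tensor, levels: torch.Tensor,
                      gmm_params: tuple, threshold: float = 0.25) -> torch.Tensor:
    # Large label/prediction gaps fall back to MSE; small gaps use the
    # GMM-based cross-entropy loss, following the rule described above.
    if torch.mean(torch.abs(pred - label)).item() > threshold:
        return torch.mean((pred - label) ** 2)
    return gmm_loss(feats, levels, *gmm_params)
```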
  • FIG. 14 exemplarily shows a schematic diagram of training the first scoring model and the second scoring model.
  • by taking the similarity 1403 between the request photo 1401 and the certificate photo 1402 as the first label value of the sample and feeding the request photo 1401 carrying the first label value into the first scoring model 1404, the overall quality score corresponding to each request photo 1401 is obtained.
  • the training samples are first divided into four categories according to the four dimensions of angle, blur, occlusion, and light; the face images in each dimension are then divided into five levels to obtain the training samples 1405 of the second scoring model 1406; the training samples 1405 carrying the level information are fed into the second scoring model 1406 to obtain the quality attribution score of each picture in the training samples 1405.
  • the above-mentioned training method of the first scoring model or the above-mentioned training method of the second scoring model further includes the following steps.
  • Step 1 Obtain conflicting samples in the training samples.
  • Conflicting samples refer to training samples with conflicting overall quality scores and quality attribution scores. For example, a sample face image whose overall quality score is greater than the first-level threshold, but the quality attribution score does not meet the conditions; or a sample face image whose quality attribution score meets the conditions but whose overall quality score is less than the first-level threshold.
  • Step 2 Correct the label information of the conflicting samples.
  • the label information of the conflicting samples is corrected by the gradient boosting decision tree (Gradient Boosting Decision Tree, GBDT) algorithm, and the first label information and the second label information of the sample face images in the conflicting samples are re-marked, so that the predicted overall quality score and quality attribution score of the conflicting samples no longer conflict.
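A sketch of the conflict-detection step described above; the sample fields and the attribution_ok predicate are assumptions:

```python
def find_conflicting_samples(samples, first_level_threshold, attribution_ok):
    # A sample conflicts when exactly one of the two checks passes: its
    # overall quality score clears the first-level threshold, or its quality
    # attribution scores meet their conditions (attribution_ok predicate).
    conflicts = []
    for s in samples:
        passes_overall = s["overall_score"] > first_level_threshold
        passes_attribution = attribution_ok(s["attribution_scores"])
        if passes_overall != passes_attribution:
            conflicts.append(s)
    return conflicts
```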
  • FIG. 15 exemplarily shows a schematic diagram of the correction process for the label information of conflicting samples.
  • the training samples are first pre-labeled with a total score (i.e., the overall quality score) and an attribution (the quality attribution score), and the samples are then fed into the overall quality score model (the first scoring model) and the quality attribution model (the second scoring model), respectively.
  • the conflicting samples are obtained manually; based on the conflicting samples, a total-score correction strategy function G(z) and an attribution correction strategy function H(g) are formulated, and second-generation total-score labels and second-generation attribution labels are obtained.
  • Step 3 Obtain the revised training samples.
  • the above-mentioned revised training samples are used to retrain the above-mentioned first scoring model and second scoring model, so as to obtain a first scoring model and a second scoring model with more accurate predicted scores.
  • the technical solutions provided by the embodiments of the present application greatly reduce the sample labeling cost by using the similarity between the sample image and the standard image as the label value of the first scoring model; the mean square error is combined with the Pearson correlation coefficient to construct the loss function corresponding to the first scoring model, yielding a more accurate first scoring model and improving the accuracy of overall face quality prediction.
  • the sample images are divided into four categories according to the four dimensions of angle, blur, occlusion, and light; the face images in each dimension are then divided into different levels, and the level information is used as weak supervision information of the samples to train the second scoring model, so that the second scoring model outputs continuous quality attribution scores, solving the difficult problem of labeling training samples under continuous-variable conditions; by designing a weakly supervised training loss function based on the Gaussian mixture model, the second scoring model is made more accurate.
  • moreover, by finding conflicting samples and correcting them, the above-mentioned first scoring model and second scoring model are retrained, which further improves the accuracy of the models in predicting face image quality.
  • FIG. 16 shows a block diagram of an apparatus for selecting a face image provided by an embodiment of the present application.
  • the apparatus has the function of implementing the above method.
  • the apparatus 1600 may include: a quality preliminary inspection module 1601 , an overall score determination module 1602 and an image determination module 1603 .
  • the quality preliminary inspection module 1601 is configured to detect whether the face image meets the preliminary quality screening conditions after acquiring a frame of face image.
  • an overall score determination module 1602, configured to determine, in response to detecting a first face image that meets the preliminary quality screening conditions, the overall quality score of the first face image, where the overall quality score is used to characterize the overall quality of a face image.
  • the image determination module 1603 is configured to send the first face image to the face recognition process in response to the overall quality score of the first face image being greater than the first-level threshold.
  • the quality preliminary inspection module 1601 is used to: acquire the light score of the face image, where the light score is used to characterize the brightness of the face image; and detect, according to the light score of the face image, whether the face image meets the preliminary quality screening conditions.
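A sketch of such light-score-based preliminary screening; the brightness proxy and the threshold value are assumptions, as the text does not fix how the light score is computed:

```python
import numpy as np

def light_score(face_image: np.ndarray) -> float:
    # A simple brightness proxy: mean pixel intensity scaled to [0, 1].
    return float(face_image.mean() / 255.0)

def passes_preliminary_screening(face_image: np.ndarray, threshold: float = 0.3) -> bool:
    # The frame passes preliminary quality screening once its light score
    # reaches the threshold, i.e. once automatic exposure has settled.
    return light_score(face_image) >= threshold
```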
  • the overall score determination module 1602 is used to: invoke a first scoring model, where the first scoring model is a neural network model for determining the overall quality score; and determine the overall quality score of the first face image through the first scoring model.
  • the training process of the first scoring model is as follows: acquire a training sample, the training sample including a sample face image and a standard face image corresponding to the sample face image; acquire the similarity between the sample face image and the standard face image, where the similarity is used to determine the first label information of the sample face image and the first label information refers to the label information of the overall quality score; and train the first scoring model based on the first label information of the sample face image.
  • the apparatus 1600 further includes a frame-by-frame detection module 1604, configured to: if the overall quality score of the first face image is less than the first-level threshold, acquire the overall quality score of the next frame of face image, where the initial next frame of face image refers to the frame of face image following the first face image; if the overall quality score of the next frame of face image is greater than the first-level threshold, send the next frame of face image to the face recognition process; and if the overall quality score of the next frame of face image is less than the first-level threshold, execute again from the step of acquiring the overall quality score of the next frame of face image.
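A sketch of the frame-by-frame control flow implemented by this module; overall_score_fn stands in for the first scoring model:

```python
def frame_by_frame_selection(frames, overall_score_fn, first_level_threshold):
    # Score each incoming frame in turn and send the first frame whose
    # overall quality score exceeds the first-level threshold.
    for frame in frames:
        if overall_score_fn(frame) > first_level_threshold:
            return frame  # sent to the face recognition process
    return None  # no frame qualified
```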
  • the apparatus 1600 further includes: an image selection module 1605 and an attribution score determination module 1606 .
  • the image selection module 1605 is configured to select a second face image with the highest overall quality score from the consecutive n frames of face images if the overall quality score of the consecutive n frames of face images is less than the first-level threshold.
  • the attribution score determination module 1606 is configured to determine the quality attribution score of the second face image if the overall quality score of the second face image is greater than the second-level threshold, where the quality attribution score includes quality scores on a plurality of quality reference dimensions, and the second-level threshold is smaller than the first-level threshold.
  • the image determination module 1603 is configured to send the second face image to the face recognition process if the quality attribution score of the second face image meets the conditions.
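A sketch of the fallback path through modules 1605, 1606, and 1603; overall_score_fn, attribution_fn, and attribution_ok are stand-ins for the scoring models and the attribution conditions:

```python
def fallback_selection(buffered_frames, overall_score_fn, attribution_fn,
                       second_level_threshold, attribution_ok):
    # When n consecutive frames all score below the first-level threshold,
    # pick the frame with the highest overall score; keep it only if that
    # score clears the second-level threshold and its quality attribution
    # scores meet the conditions.
    best = max(buffered_frames, key=overall_score_fn)
    if (overall_score_fn(best) > second_level_threshold
            and attribution_ok(attribution_fn(best))):
        return best
    return None  # prompt the user to adjust instead
```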
  • the attribution score determination module 1606 is configured to: invoke a second scoring model, the second scoring model being a neural network model for determining the quality attribution score;
  • determine the quality attribution score of the second face image through the second scoring model, where the quality attribution score includes at least one of an angle score, a blur score, an occlusion score, and a light score;
  • the angle score is used to characterize the face angle of the face image, the blur score is used to characterize the degree of blur of the face image, the occlusion score is used to characterize the occlusion of the face image, and the light score is used to characterize the brightness of the face image.
  • the training process of the second scoring model is as follows: acquire a training sample, the training sample including a sample face image and second label information of the sample face image, where the second label information includes quality level information on the multiple quality reference dimensions; and train the second scoring model based on the second label information of the sample face image.
  • the attribution score determination module 1606 is further configured to display adjustment information according to the quality attribution score in response to the quality attribution score of the second face image not meeting the conditions, where the adjustment information refers to information prompting the user to make adjustments to improve the quality of the face image.
  • the training process of the first scoring model and the second scoring model further includes: acquiring the conflicting samples among the training samples, where the conflicting samples refer to training samples in which the overall quality score and the quality attribution score conflict; and correcting the label information of the conflicting samples.
  • the apparatus 1600 further includes a screening stop module 1607, configured to: if the overall quality score of the first face image is less than the second-level threshold, stop the face screening process and display prompt information, where the prompt information is used to prompt the user that the computer device needs to reacquire the face image, and the second-level threshold is smaller than the first-level threshold.
  • the technical solutions provided by the embodiments of the present application perform preliminary screening of face images by means of frame-by-frame detection, which improves the flexibility of the face selection process; the overall quality score of a face image that passes the preliminary screening is then determined to reflect its overall quality.
  • when the overall quality of the face image is acceptable, the face image can be sent to the face recognition process, which effectively reduces the time required for face selection, thereby helping to shorten the time consumed by the entire face recognition process and improving the user experience.
  • FIG. 18 shows a structural block diagram of a computer device 1800 provided by an embodiment of the present application.
  • the computer device 1800 may be an electronic device such as a mobile phone, a tablet computer, a multimedia playback device, a wearable device, a PC (Personal Computer), a face payment terminal, a face check-in terminal, and a smart camera.
  • the terminal is used to implement the method provided in the above embodiment.
  • the computer device may be the terminal 10 or the server 20 in the application running environment shown in FIG. 1 .
  • computer device 1800 includes: processor 1801 and memory 1802 .
  • the processor 1801 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and the like.
  • the processor 1801 can be implemented by at least one hardware form among DSP (Digital Signal Processing, digital signal processing), FPGA (Field Programmable Gate Array, field programmable gate array), and PLA (Programmable Logic Array, programmable logic array).
  • the processor 1801 may also include a main processor and a co-processor; the main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit), and the co-processor is a low-power processor for processing data in a standby state.
  • the processor 1801 may be integrated with a GPU (Graphics Processing Unit), and the GPU is responsible for rendering and drawing the content that needs to be displayed on the display screen.
  • the processor 1801 may further include an AI (Artificial Intelligence) processor, where the AI processor is used to process computing operations related to machine learning.
  • Memory 1802 may include one or more computer-readable storage media, which may be non-transitory; memory 1802 may also include high-speed random access memory as well as non-volatile memory, such as one or more disk storage devices or flash storage devices. In some embodiments, the non-transitory computer-readable storage medium in memory 1802 is used to store at least one instruction, at least one program, a code set, or an instruction set, which is configured to be executed by one or more processors to implement the above method for selecting a face image.
  • the computer device 1800 may also optionally include: a peripheral device interface 1803 and at least one peripheral device.
  • the processor 1801, the memory 1802 and the peripheral device interface 1803 can be connected through a bus or a signal line.
  • Each peripheral device can be connected to the peripheral device interface 1803 through a bus, a signal line or a circuit board.
  • the peripheral device includes: at least one of a radio frequency circuit 1804 , a display screen 1805 , a camera assembly 1806 , an audio circuit 1807 , a positioning assembly 1808 and a power supply 1809 .
  • the structure shown in FIG. 18 does not constitute a limitation on the computer device 1800, which may include more or fewer components than shown, combine some components, or adopt a different component arrangement.
  • a computer-readable storage medium is also provided, storing at least one instruction, at least one program, a code set, or an instruction set; when the at least one instruction, the at least one program, the code set, or the instruction set is executed by a processor, the above method for selecting a face image is implemented.
  • the computer-readable storage medium may include: a ROM (Read-Only Memory), a RAM (Random Access Memory), an SSD (Solid State Drive), an optical disc, or the like.
  • the random access memory may include a ReRAM (Resistive Random Access Memory) and a DRAM (Dynamic Random Access Memory).
  • a computer program product or computer program is also provided, comprising computer instructions stored in a computer-readable storage medium; the processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the above method for selecting a face image.
  • references herein to "a plurality" mean two or more.
  • "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean that A exists alone, A and B exist at the same time, or B exists alone.
  • the character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
  • the numbering of the steps described in this document only exemplarily shows one possible execution sequence between the steps; in some other embodiments, the steps may also be executed out of the numbered order, for example two differently numbered steps may be performed at the same time, or two differently numbered steps may be performed in an order opposite to that shown in the figure, which is not limited in this embodiment of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)
  • Collating Specific Patterns (AREA)

Abstract

A method, apparatus, device, and storage medium for selecting a face image, belonging to the field of artificial intelligence. The method includes: after a frame of face image is acquired, detecting whether the face image meets preliminary quality screening conditions (201); when a first face image meeting the preliminary quality screening conditions is detected, determining an overall quality score of the first face image (202); and if the overall quality score of the first face image is greater than a first-level threshold, sending the first face image to a face recognition process (203).


Claims (16)

  1. A method for selecting a face image, executed by a computer device, the method comprising:
    after a frame of face image is acquired, detecting whether the face image meets preliminary quality screening conditions;
    in response to detecting a first face image that meets the preliminary quality screening conditions, determining an overall quality score of the first face image, the overall quality score being used to characterize the overall quality of a face image;
    in response to the overall quality score of the first face image being greater than a first-level threshold, sending the first face image to a face recognition process.
  2. The method according to claim 1, wherein the detecting whether the face image meets preliminary quality screening conditions comprises:
    acquiring a light score of the face image, the light score being used to characterize the brightness of the face image;
    detecting, according to the light score of the face image, whether the face image meets the preliminary quality screening conditions.
  3. The method according to claim 1 or 2, wherein the determining an overall quality score of the first face image comprises:
    invoking a first scoring model, the first scoring model being a neural network model for determining the overall quality score;
    determining the overall quality score of the first face image through the first scoring model.
  4. The method according to claim 3, wherein the method further comprises training the first scoring model according to the following training process:
    acquiring a training sample, the training sample comprising a sample face image and a standard face image corresponding to the sample face image;
    acquiring a similarity between the sample face image and the standard face image, the similarity being used to determine first label information of the sample face image, the first label information referring to label information of the overall quality score;
    training the first scoring model based on the first label information of the sample face image.
  5. The method according to any one of claims 1 to 4, wherein after the determining an overall quality score of the first face image, the method further comprises:
    in response to the overall quality score of the first face image being less than the first-level threshold, acquiring an overall quality score of a next frame of face image, wherein the next frame of face image refers to the frame of face image following the first face image;
    in response to the overall quality score of the next frame of face image being greater than the first-level threshold, sending the next frame of face image to the face recognition process;
    in response to the overall quality score of the next frame of face image being less than the first-level threshold, executing again from the step of acquiring an overall quality score of a next frame of face image.
  6. The method according to any one of claims 1 to 5, wherein the method further comprises:
    in response to the overall quality scores of n consecutive frames of face images being less than the first-level threshold, determining whether an overall quality score and a quality attribution score of a second face image meet conditions, the second face image being the face image with the highest overall quality score among the n consecutive frames of face images, the quality attribution score comprising quality scores on a plurality of quality reference dimensions, n being a positive integer greater than 1;
    in response to the overall quality score and the quality attribution score of the second face image meeting the conditions, sending the second face image to the face recognition process.
  7. The method according to claim 6, wherein the determining whether an overall quality score and a quality attribution score of a second face image meet conditions comprises:
    determining whether the overall quality score of the second face image is less than a second-level threshold, wherein the second-level threshold is smaller than the first-level threshold;
    in response to the overall quality score of the second face image being greater than the second-level threshold, determining the quality attribution score of the second face image, the quality attribution score comprising quality scores on a plurality of quality reference dimensions.
  8. The method according to claim 7, wherein the determining the quality attribution score of the second face image comprises:
    invoking a second scoring model, the second scoring model being a neural network model for determining the quality attribution score;
    determining the quality attribution score of the second face image through the second scoring model, the quality attribution score comprising at least one of an angle score, a blur score, an occlusion score, and a light score;
    wherein the angle score is used to characterize the face angle of the face image, the blur score is used to characterize the degree of blur of the face image, the occlusion score is used to characterize the occlusion of the face image, and the light score is used to characterize the brightness of the face image.
  9. The method according to claim 8, wherein the method further comprises training the second scoring model according to the following training process:
    acquiring a training sample, the training sample comprising a sample face image and second label information of the sample face image, the second label information comprising quality level information on the plurality of quality reference dimensions;
    training the second scoring model based on the second label information of the sample face image.
  10. The method according to any one of claims 7 to 9, wherein the method further comprises:
    in response to the quality attribution score of the second face image not meeting the conditions, displaying adjustment information according to the quality attribution score, the adjustment information referring to information prompting a user to make adjustments to improve the quality of the face image.
  11. The method according to claim 4 or 9, wherein the method further comprises:
    acquiring conflicting samples among the training samples, the conflicting samples referring to training samples in which the overall quality score and the quality attribution score conflict;
    correcting label information of the conflicting samples.
  12. The method according to any one of claims 1 to 11, wherein when the overall quality score of the first face image is determined, the method further comprises:
    stopping detecting whether face images acquired after the first face image meet the preliminary quality screening conditions.
  13. The method according to any one of claims 1 to 12, wherein after the determining an overall quality score of the first face image, the method further comprises:
    in response to the overall quality score of the first face image being less than a second-level threshold, stopping the face screening process and displaying prompt information, the prompt information being used to prompt a user that the computer device needs to reacquire the face image;
    wherein the second-level threshold is smaller than the first-level threshold.
  14. An apparatus for selecting a face image, the apparatus comprising:
    a preliminary quality inspection module, configured to detect, after a frame of face image is acquired, whether the face image meets preliminary quality screening conditions;
    an overall score determination module, configured to determine, in response to detecting a first face image that meets the preliminary quality screening conditions, an overall quality score of the first face image, the overall quality score being used to characterize the overall quality of a face image;
    an image determination module, configured to send the first face image to a face recognition process in response to the overall quality score of the first face image being greater than a first-level threshold.
  15. A computer device, comprising a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, the at least one instruction, the at least one program, the code set, or the instruction set being loaded and executed by the processor to implement the method according to any one of claims 1 to 13.
  16. A computer-readable storage medium, storing at least one instruction, at least one program, a code set, or an instruction set, the at least one instruction, the at least one program, the code set, or the instruction set being loaded and executed by a processor to implement the method according to any one of claims 1 to 13.
PCT/CN2021/107182 2020-08-25 2021-07-19 人脸图像的选择方法、装置、设备及存储介质 WO2022042135A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/964,730 US20230030267A1 (en) 2020-08-25 2022-10-12 Method and apparatus for selecting face image, device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010863256.0A CN111738243B (zh) 2020-08-25 2020-08-25 人脸图像的选择方法、装置、设备及存储介质
CN202010863256.0 2020-08-25

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/964,730 Continuation US20230030267A1 (en) 2020-08-25 2022-10-12 Method and apparatus for selecting face image, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2022042135A1 true WO2022042135A1 (zh) 2022-03-03

Family

ID=72658776

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/107182 WO2022042135A1 (zh) 2020-08-25 2021-07-19 人脸图像的选择方法、装置、设备及存储介质

Country Status (3)

Country Link
US (1) US20230030267A1 (zh)
CN (1) CN111738243B (zh)
WO (1) WO2022042135A1 (zh)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738243B (zh) * 2020-08-25 2020-11-20 腾讯科技(深圳)有限公司 人脸图像的选择方法、装置、设备及存储介质
US11810398B2 (en) * 2020-11-16 2023-11-07 Nec Corporation Face clustering with image uncertainty
CN112669289A (zh) * 2020-12-30 2021-04-16 科大讯飞股份有限公司 影像质量评估方法、装置、电子设备及存储介质
CN113076969B (zh) * 2021-02-24 2022-10-25 上海辛玮智能科技有限公司 基于混合高斯损失函数的图像目标检测方法
CN113591704B (zh) * 2021-07-30 2023-08-08 四川大学 体重指数估计模型训练方法、装置和终端设备
CN113706502B (zh) * 2021-08-26 2023-09-05 重庆紫光华山智安科技有限公司 一种人脸图像质量评估方法及装置
CN113792682B (zh) * 2021-09-17 2024-05-10 平安科技(深圳)有限公司 基于人脸图像的人脸质量评估方法、装置、设备及介质
CN114254716B (zh) * 2022-03-02 2022-05-27 浙江鹏信信息科技股份有限公司 一种基于用户行为分析的高危操作识别方法及系统
CN115953819B (zh) * 2022-12-28 2023-08-15 中国科学院自动化研究所 人脸识别模型的训练方法、装置、设备和存储介质
WO2024166250A1 (ja) * 2023-02-08 2024-08-15 日本電気株式会社 情報処理装置、情報処理方法、プログラム

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108171256A (zh) * 2017-11-27 2018-06-15 深圳市深网视界科技有限公司 人脸图像质评模型构建、筛选、识别方法及设备和介质
CN108509969A (zh) * 2017-09-06 2018-09-07 腾讯科技(深圳)有限公司 数据标注方法及终端
CN109215010A (zh) * 2017-06-29 2019-01-15 沈阳新松机器人自动化股份有限公司 一种图像质量判断的方法及机器人人脸识别系统
US20190065906A1 (en) * 2017-08-25 2019-02-28 Baidu Online Network Technology (Beijing) Co., Ltd . Method and apparatus for building human face recognition model, device and computer storage medium
CN110298310A (zh) * 2019-06-28 2019-10-01 深圳市商汤科技有限公司 图像处理方法及装置、电子设备和存储介质
CN110826519A (zh) * 2019-11-14 2020-02-21 深圳市华付信息技术有限公司 人脸遮挡检测方法、装置、计算机设备及存储介质
CN111738243A (zh) * 2020-08-25 2020-10-02 腾讯科技(深圳)有限公司 人脸图像的选择方法、装置、设备及存储介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9613058B2 (en) * 2014-12-17 2017-04-04 Adobe Systems Incorporated Neural network image curation control
CN106921829A (zh) * 2015-12-25 2017-07-04 北京奇虎科技有限公司 一种拍照方法和装置及拍照设备
CN109711311B (zh) * 2018-12-20 2020-11-20 北京以萨技术股份有限公司 一种基于动态人脸最优帧选取方法
CN111241927A (zh) * 2019-12-30 2020-06-05 新大陆数字技术股份有限公司 级联式人脸图像优选方法、系统、设备及可读存储介质

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109215010A (zh) * 2017-06-29 2019-01-15 沈阳新松机器人自动化股份有限公司 一种图像质量判断的方法及机器人人脸识别系统
US20190065906A1 (en) * 2017-08-25 2019-02-28 Baidu Online Network Technology (Beijing) Co., Ltd . Method and apparatus for building human face recognition model, device and computer storage medium
CN108509969A (zh) * 2017-09-06 2018-09-07 腾讯科技(深圳)有限公司 数据标注方法及终端
CN108171256A (zh) * 2017-11-27 2018-06-15 深圳市深网视界科技有限公司 人脸图像质评模型构建、筛选、识别方法及设备和介质
CN110298310A (zh) * 2019-06-28 2019-10-01 深圳市商汤科技有限公司 图像处理方法及装置、电子设备和存储介质
CN110826519A (zh) * 2019-11-14 2020-02-21 深圳市华付信息技术有限公司 人脸遮挡检测方法、装置、计算机设备及存储介质
CN111738243A (zh) * 2020-08-25 2020-10-02 腾讯科技(深圳)有限公司 人脸图像的选择方法、装置、设备及存储介质

Also Published As

Publication number Publication date
CN111738243A (zh) 2020-10-02
US20230030267A1 (en) 2023-02-02
CN111738243B (zh) 2020-11-20

Similar Documents

Publication Publication Date Title
WO2022042135A1 (zh) 人脸图像的选择方法、装置、设备及存储介质
US20200334830A1 (en) Method, apparatus, and storage medium for processing video image
US10133921B2 (en) Methods and apparatus for capturing, processing, training, and detecting patterns using pattern recognition classifiers
CN108197618B (zh) 用于生成人脸检测模型的方法和装置
WO2021179851A1 (zh) 图像处理方法、装置、终端及存储介质
TW202004637A (zh) 一種風險預測方法、存儲介質和伺服器
CN111918130A (zh) 视频封面确定方法、装置、电子设备及存储介质
WO2020253508A1 (zh) 异常细胞检测方法、装置及计算机可读存储介质
CN108810406B (zh) 人像光效处理方法、装置、终端及计算机可读存储介质
CN114463586A (zh) 图像识别模型的训练与图像识别方法、装置、设备和介质
CN104182721A (zh) 提升人脸识别率的图像处理系统及图像处理方法
US20230169554A1 (en) System and method for automated electronic catalogue management and electronic image quality assessment
CN111080746B (zh) 图像处理方法、装置、电子设备和存储介质
CN111225162B (zh) 图像曝光的控制方法、系统、可读存储介质及摄像设备
CN112308802A (zh) 一种基于大数据的图像分析方法及系统
CN116012841A (zh) 一种基于深度学习的开集图像场景匹配方法及装置
CN112102175A (zh) 图像对比度的增强方法、装置、存储介质及电子设备
CN115471439A (zh) 显示面板缺陷的识别方法、装置、电子设备及存储介质
US20230066331A1 (en) Method and system for automatically capturing and processing an image of a user
CN111899239A (zh) 图像处理方法和装置
CN111046232A (zh) 一种视频分类方法、装置及系统
CN115830362A (zh) 图像处理方法、装置、设备、介质及产品
CN103971111B (zh) 图像处理方法以及终端设备
Ramkissoon et al. Scene and Texture Based Feature Set for DeepFake Video Detection
CN118506407B (zh) 基于随机颜色丢弃和注意力的轻量行人重识别方法及系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21859975

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 10.07.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21859975

Country of ref document: EP

Kind code of ref document: A1