WO2021259033A1 - Facial recognition method, electronic device, and storage medium - Google Patents

Facial recognition method, electronic device, and storage medium

Info

Publication number
WO2021259033A1
Authority
WO
WIPO (PCT)
Prior art keywords
face
feature
image
images
multiple frames
Prior art date
Application number
PCT/CN2021/098156
Other languages
French (fr)
Chinese (zh)
Inventor
丁肇臻
侯春华
申光
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Priority to BR112022026549A2
Publication of WO2021259033A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/248 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Definitions

  • This application relates to the field of image processing technology, and in particular to a face recognition method, an electronic device, and a storage medium.
  • At present, face recognition is widely used in application scenarios such as security monitoring, criminal apprehension, and crowd statistics analysis.
  • However, in practical applications face recognition is susceptible to interference from many kinds of external noise, for example: face deflection; large side-face (profile) angles; motion blur and out-of-focus blur; occlusions such as masks and sunglasses; low illumination intensity and contrast; and blocking artifacts introduced by the encoding and decoding of transmitted video. This noise greatly reduces recognition accuracy, which limits the application and development of face recognition technology.
  • The following is an overview of the subject matter described in detail herein; this overview is not intended to limit the scope of protection of the claims.
  • The embodiments of the present application provide a face recognition method, an electronic device, and a storage medium that reduce the influence of noise interference on recognition accuracy and thereby improve the success rate of face recognition.
  • In one aspect, an embodiment of the present application provides a face recognition method that includes: extracting multiple frames of face images containing a target face from a video stream; performing face feature extraction on each of the frames to obtain first face features; performing feature enhancement on the first face features and fusing the enhanced first face features to obtain a second face feature; and comparing the second face feature with a pre-stored third face feature to determine a face recognition result.
  • In another aspect, an embodiment of the present application provides an electronic device that includes a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the program, the steps of the face recognition method described above are implemented.
  • In a further aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the face recognition method described above.
  • Other features and advantages of the present application will be set forth in the following description, and in part will become apparent from the description or be understood by practicing the present application. The purpose and other advantages of the application can be realized and obtained through the structures particularly pointed out in the description, the claims, and the accompanying drawings.
  • The accompanying drawings provide a further understanding of the technical solution of the present application and constitute a part of the specification; together with the embodiments of the present application, they explain the technical solution and do not limit it.
  • FIG. 1 is a flowchart of a face recognition method provided by an embodiment of the present application;
  • FIG. 2 is a sub-flowchart of step S100 in FIG. 1;
  • FIG. 3 is a sub-flowchart of step S110 in FIG. 2;
  • FIG. 4 is a sub-flowchart of step S130 in FIG. 2;
  • FIG. 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • To make the purpose, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the application, not to limit it.
  • In the description of the embodiments of the present application, "multiple" means two or more; "greater than", "less than", "exceeding", and the like are understood to exclude the stated number, while "above", "below", "within", and the like are understood to include it. Terms such as "first" and "second" are used only to distinguish technical features and cannot be understood as indicating or implying relative importance, or as implicitly indicating the number or precedence of the indicated technical features.
  • FIG. 1 shows a flowchart of a face recognition method provided by an embodiment of the present application. As shown in FIG. 1, the method includes but is not limited to the following steps S100 to S400.
  • Step S100: Extract multiple frames of face images containing the target face from the video stream.
  • In a specific implementation, video can be captured by a front-end camera, and the video stream output by the camera is then processed to obtain multiple frames of face images containing the target face.
  • In step S100 of the embodiment of the present application, extracting multiple frames of face images containing the target face from the video stream can be implemented through steps S110 to S130 shown in FIG. 2.
  • Step S110: Extract multiple frames of first face images containing the target face from the video stream.
  • In some examples, step S110 may be implemented by steps S111 and S112 shown in FIG. 3.
  • Step S111: Perform face detection on the video stream, and obtain the face position information of the target face in the current frame of the video stream.
  • In some examples, a face detection network such as the multi-task cascaded neural network (MTCNN) or RetinaFace may be used to obtain the position information of the target face in the current video frame.
  • The position information may include, for example, the positions of face key points and face boundary information.
  • Step S112: Perform face trajectory tracking according to the face position information, and extract multiple frames of first face images containing the target face from the video stream.
  • In some examples, the position of the target face in the current video frame, obtained during face detection, can be used to predict its position in the next frame, thereby achieving face trajectory tracking.
  • By tracking the target face trajectory, images of the target face can be cropped from multiple video frames of the video stream, yielding a series of face trajectory images containing the target face; this series is used as the multiple frames of first face images. A sketch of this detection-and-tracking step follows.
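By way of illustration, the following is a minimal sketch of how steps S111 and S112 might be combined. The detect_faces() helper is a hypothetical wrapper around an MTCNN- or RetinaFace-style detector, and greedy IoU association stands in for the position-prediction tracking described above; neither is specified by the patent.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter + 1e-9)

def track_target_face(frames, detect_faces, init_box, iou_thresh=0.3):
    """Collect crops of the target face across frames (the first face images)."""
    track_images, prev_box = [], init_box
    for frame in frames:
        detections = detect_faces(frame)  # assumed: list of (box, landmarks)
        if not detections:
            continue
        # Associate the detection that overlaps most with the previous position.
        box, landmarks = max(detections, key=lambda d: iou(d[0], prev_box))
        if iou(box, prev_box) < iou_thresh:
            continue  # target not found in this frame
        x1, y1, x2, y2 = [int(v) for v in box]
        track_images.append((frame[y1:y2, x1:x2], landmarks))
        prev_box = box
    return track_images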
  • In some embodiments, the face key point position information may include the positions of multiple contour points.
  • The contour point positions may include the left eye position, right eye position, nose position, left mouth corner position, and right mouth corner position.
  • Correspondingly, as shown in FIG. 3, step S110 may further include step S113: calibrating the angle of the target face in the first face images according to the positions of the multiple contour points.
  • In some examples, because the target may be moving in the captured video stream, the target face may appear at an oblique angle in some of the first face images obtained through face trajectory tracking.
  • The contour point positions described above can therefore be used to calibrate the target face in the first face images.
  • Specifically, the positions of the multiple contour points may be input into a face calibration algorithm, which performs tilt correction on the target face in the first face image, as sketched below.
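As an illustration of step S113, the sketch below levels the eye line using OpenCV. The five-point landmark order (left eye, right eye, nose, left mouth corner, right mouth corner) is assumed from the contour points listed above; the patent does not fix a particular calibration algorithm.

```python
import cv2
import numpy as np

def align_face(image, landmarks):
    """Rotate the face crop so the eyes lie on a horizontal line."""
    left_eye = np.asarray(landmarks[0], dtype=np.float64)
    right_eye = np.asarray(landmarks[1], dtype=np.float64)
    dx, dy = right_eye - left_eye
    angle = np.degrees(np.arctan2(dy, dx))            # tilt of the eye line
    center = tuple((left_eye + right_eye) / 2.0)      # rotate about eye midpoint
    rot = cv2.getRotationMatrix2D(center, angle, 1.0)
    h, w = image.shape[:2]
    return cv2.warpAffine(image, rot, (w, h), flags=cv2.INTER_LINEAR)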
  • Step S120: Perform face quality analysis on each of the multiple frames of first face images to obtain the face prior information of each frame.
  • In this embodiment of the application, a lightweight face quality evaluation algorithm analyzes the quality of each frame of the first face images and obtains the face prior information corresponding to each frame.
  • Specifically, face quality may be evaluated in multiple dimensions for each frame, so that the resulting face prior information includes several different types of index parameters.
  • In some examples, the index parameters may include three types: a blur degree parameter, a deflection angle parameter, and a resolution parameter.
  • In a specific implementation, a lightweight face feature extraction model can be used to obtain the feature norm of the first face image, and the blur degree parameter is determined from the obtained feature norm; in general, the larger the feature norm, the lower the degree of blur.
  • Local binary pattern (LBP) features can be used to binarize the first face image and output a face symmetry index, from which the deflection angle parameter is determined; for example, a symmetry index of 1 indicates a frontal face with a deflection angle of 0.
  • The interpupillary distance can be determined from the left eye position and right eye position obtained during face detection, and the resolution parameter is determined from the interpupillary distance; in general, the larger the interpupillary distance, the higher the resolution, and the smaller the interpupillary distance, the lower the resolution.
  • It should be understood that the embodiments of the present application are not limited to these three index parameters; other types of index parameters may also be included, or may replace any one or more of the three. A sketch of the three indicators follows.
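The following sketch illustrates one plausible reading of the three index parameters. The embedding model behind the feature norm and the flip-based symmetry measure (standing in for the LBP-based index) are assumptions, not the patent's exact formulas.

```python
import cv2
import numpy as np

def blur_degree(face_img, embed):
    """Larger feature norm means a sharper image, so invert it into a blur score."""
    norm = np.linalg.norm(embed(face_img))  # embed(): assumed lightweight extractor
    return 1.0 / (1.0 + norm)

def deflection_angle(face_img):
    """Symmetry index in [0, 1]; 1 approximates a frontal face (deflection 0)."""
    gray = cv2.cvtColor(face_img, cv2.COLOR_BGR2GRAY)
    flipped = cv2.flip(gray, 1)  # horizontal mirror
    diff = np.abs(gray.astype(np.float32) - flipped.astype(np.float32))
    return 1.0 - diff.mean() / 255.0

def resolution(left_eye, right_eye):
    """Interpupillary distance in pixels; larger implies higher face resolution."""
    return float(np.linalg.norm(np.asarray(right_eye) - np.asarray(left_eye)))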
  • The embodiment of the present application performs a multi-dimensional quality evaluation of each frame of the first face image using multiple different types of index parameters, which reflects the strength of the face detail features of the first face image in different dimensions.
  • In the foregoing example, a lightweight face quality scoring method examines face quality in the three dimensions of blur degree, face deflection angle, and resolution. The resulting face prior information serves two purposes: it is used to select higher-quality images for subsequent face feature extraction, ensuring that the extracted features are rich and diverse; and it is used in the subsequent feature enhancement stage to improve the generalization of the face features.
  • Step S130: Select multiple frames of second face images from the multiple frames of first face images according to the face prior information of each frame.
  • Specifically, step S130 may include steps S131 to S134 shown in FIG. 4.
  • Step S131: Linearly weight the multiple index parameters to obtain a global quality score, and obtain a first preset number of primary selection images from the multiple frames of first face images according to the global quality score.
  • In some examples, the global quality score may be obtained by linearly weighting the multiple index parameters contained in the face prior information from step S120.
  • The global quality score enables a comprehensive quality evaluation of each frame of the first face image; the frames are ranked by their global quality scores, and the highest-ranked first face images are selected as the primary selection images.
  • The number of primary selection images to acquire can be determined by setting the first preset number in advance.
  • As an example, the first preset number may be a percentage value, for example 30%.
  • When 100 frames of first face images have been extracted from the video stream in step S110, and the face prior information of each frame has been obtained by the method of step S120, the multiple index parameters contained in the face prior information are linearly weighted to obtain the global quality score of each frame. The 100 first face images are then ranked from the highest score to the lowest, and the top 30 are taken as the primary selection images; a sketch follows.
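A minimal sketch of step S131 under the example above. The weighting coefficients are placeholders; the patent obtains them by a regression method.

```python
import numpy as np

def select_primary_images(priors, weights=(0.4, 0.3, 0.3), keep_ratio=0.3):
    """priors: array of shape (n_frames, 3) holding the three index parameters."""
    priors = np.asarray(priors, dtype=np.float64)
    scores = priors @ np.asarray(weights)          # global quality score per frame
    order = np.argsort(scores)[::-1]               # rank from high to low
    n_keep = max(1, int(len(order) * keep_ratio))  # e.g. top 30% -> 30 of 100
    return order[:n_keep], scores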
  • Step S132: Arrange and combine the first preset number of primary selection images to obtain multiple primary selection image combinations, where each combination contains a second preset number of primary selection images.
  • Continuing the example above, the second preset number can be set according to the number of second face images to be finally obtained; for example, the second preset number is set to 3.
  • In this way, the 30 primary selection images can be combined to obtain C(30, 3) = 4060 primary selection image combinations, each containing 3 frames of primary selection images (see the sketch below).
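The enumeration in step S132 is a direct application of itertools; the sketch below assumes the 30-image, 3-per-combination example.

```python
from itertools import combinations
from math import comb

primary_ids = list(range(30))                  # indices of the primary images
triples = list(combinations(primary_ids, 3))   # each item is one combination
assert len(triples) == comb(30, 3) == 4060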
  • Step S133: Obtain the image discrimination parameter of each primary selection image combination according to the multiple index parameters, and select the final selection image combination from the multiple primary selection image combinations according to the image discrimination parameter.
  • The image discrimination parameter characterizes the differences in face detail features among the multiple frames of primary selection images contained in a combination.
  • In general, provided that face quality is ensured, the greater the differentiation of the face detail features, the more usable information the image combination contains; the face features extracted in this way exhibit strong generalization and are better suited to face recognition systems in open scenes.
  • The image discrimination parameter of a primary selection image combination can be determined by calculating the cumulative pairwise distances, in multiple dimensions, between the images in the combination.
  • For example, suppose the current primary selection image combination is T1, containing three images numbered P1, P2, and P3. The distances between P1 and P2, P1 and P3, and P2 and P3 are calculated in each dimension: the distances between P1 and P2 in the three dimensions of blur degree, face deflection angle, and resolution are S1(P1P2), S2(P1P2), and S3(P1P2); the distances between P1 and P3 in the same three dimensions are S1(P1P3), S2(P1P3), and S3(P1P3); and the distances between P2 and P3 are S1(P2P3), S2(P2P3), and S3(P2P3). The image discrimination parameter of combination T1 is then:
  • S(T1) = S1(P1P2) + S2(P1P2) + S3(P1P2) + S1(P1P3) + S2(P1P3) + S3(P1P3) + S1(P2P3) + S2(P2P3) + S3(P2P3).
  • According to the image discrimination parameters calculated for all the primary selection image combinations, the combination with the largest parameter is selected as the final selection image combination; a sketch follows.
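A sketch of step S133 under the same assumptions: the discrimination parameter accumulates pairwise distances over the three quality dimensions, and the combination maximizing it is kept.

```python
from itertools import combinations
import numpy as np

def discrimination(combo, priors):
    """combo: frame indices; priors: (n_frames, 3) index-parameter array."""
    priors = np.asarray(priors, dtype=np.float64)
    return sum(np.abs(priors[i] - priors[j]).sum()   # summed over the 3 dimensions
               for i, j in combinations(combo, 2))

def select_final_combination(combos, priors):
    """Return the combination with the largest image discrimination parameter."""
    return max(combos, key=lambda c: discrimination(c, priors))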
  • Step S134: Use the images contained in the final selection image combination as the second face images.
  • Once the final selection image combination has been selected, the images it contains are used as the second face images. For example, if combination T1 is selected as the final selection image combination, the images P1, P2, and P3 contained in T1 are used as the second face images.
  • Step S200: Perform face feature extraction on each of the multiple frames of face images to obtain first face features.
  • In some examples, the face images in step S200 may be the multiple frames of second face images obtained through step S134.
  • In some examples, a neural network may be used to perform face feature extraction on each of the multiple frames of face images to obtain the first face features.
  • The extracted first face features include multi-dimensional face vectors.
  • The neural network can use a face feature extraction algorithm such as Resnet152, outputting a set of 256-dimensional deep face features. These features encode the original face image information before any feature enhancement; a sketch of such an extractor follows.
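One way to realize the extractor of step S200 is sketched below, assuming a torchvision ResNet-152 backbone with its classifier replaced by a 256-dimensional projection. A real system would load weights trained on a face data set rather than the randomly initialized head used here.

```python
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet152(weights=None)
backbone.fc = nn.Linear(backbone.fc.in_features, 256)  # 2048 -> 256-d features
backbone.eval()

@torch.no_grad()
def extract_first_features(batch):
    """batch: float tensor (n_frames, 3, 224, 224) of aligned face crops."""
    return backbone(batch)  # (n_frames, 256) first face features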
  • Step S300: Perform feature enhancement on the first face features, and fuse the enhanced first face features to obtain a second face feature.
  • In some examples, a deep convolutional neural network is used to perform a dot-multiplication operation between the first face features and the face prior information, obtaining the enhanced first face features.
  • Here, the face prior information is that obtained by performing face quality analysis on the face images in step S120 above.
  • Following the example above, the 256-dimensional deep face features extracted from each second face image by the Resnet152 algorithm, together with the face prior information corresponding to that second face image (the blur degree parameter, face deflection angle parameter, and resolution parameter), are input into the deep convolutional neural network, which dot-multiplies the deep face features with the face prior information; in this way, the face prior information output by the face quality evaluation algorithm is used to enhance the face features.
  • Unlike traditional image-level enhancement methods, such as image deblurring and super-resolution, the embodiment of the present application adopts a feature-level enhancement method.
  • Compared with image-level enhancement, the advantage of feature-level enhancement is that the processing object is a set of multi-dimensional face vectors, so the amount of computation is small, which can greatly improve processing efficiency.
  • In some examples, the deep convolutional neural network used for feature enhancement may be two fully connected layers connected in series, trained on a face feature extraction data set to obtain a feature enhancement module that compensates the original features.
  • The three quality indicators output by the face quality scoring module reflect the strength of the face image in the three dimensions of blur degree, deflection angle, and resolution; through the dot-multiplication operation, these indicators control how the feature enhancement module enhances the original features.
  • After the enhancement of the first face features is completed, the enhanced first face features are fused to obtain the second face feature.
  • Specifically, the enhanced first face features can be fused through an average pooling operation to obtain the second face feature; a sketch of the enhancement and fusion follows.
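The sketch below shows one plausible shape for the enhancement-and-fusion module of step S300: two fully connected layers map the three quality indicators to a gate that is dot-multiplied with each first face feature, and average pooling then fuses the frames. The exact layer sizes and the sigmoid squashing are assumptions.

```python
import torch
import torch.nn as nn

class FeatureEnhancer(nn.Module):
    def __init__(self, feat_dim=256, prior_dim=3):
        super().__init__()
        self.gate = nn.Sequential(           # two FC layers in series
            nn.Linear(prior_dim, feat_dim),
            nn.ReLU(),
            nn.Linear(feat_dim, feat_dim),
            nn.Sigmoid(),                    # assumed squashing of the gate
        )

    def forward(self, features, priors):
        """features: (n, 256) first face features; priors: (n, 3) indicators."""
        enhanced = features * self.gate(priors)   # element-wise (dot) product
        return enhanced.mean(dim=0)               # average pooling -> (256,) fused feature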
  • Step S400: Compare the second face feature with the pre-stored third face feature to determine the face recognition result.
  • Specifically, the Euclidean distance can be used to compare the second face feature with the pre-stored third face feature and determine the face recognition result, as sketched below.
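A minimal sketch of the comparison in step S400; the distance threshold is an assumption to be tuned on validation data.

```python
import numpy as np

def match(second_feature, gallery, threshold=1.0):
    """gallery: dict mapping person ID -> stored 256-d (third) feature vector."""
    ids = list(gallery)
    dists = np.array([np.linalg.norm(second_feature - gallery[i]) for i in ids])
    best = int(np.argmin(dists))
    if dists[best] < threshold:
        return ids[best], float(dists[best])   # recognized identity
    return None, float(dists[best])            # no match in the base database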
  • The face recognition method provided by the embodiments of the present application is further illustrated below with reference to specific application scenarios.
  • Scenario 1: Smart city nighttime face monitoring
  • In national smart city construction, intelligent security monitoring systems play an important role. Traditional face recognition monitoring systems perform well on sunny days with good lighting conditions, but at night recognition accuracy often drops sharply because of complex night scenes, low brightness, aging fill-light equipment, poor camera angle configuration, temperature, rain, snow, and many other factors. Urban nighttime monitoring is nevertheless of great significance for monitoring wanted fugitives and social idlers. Against this background, this example illustrates a face recognition monitoring system for an urban nighttime surveillance scenario. When the face recognition method provided by the embodiments of the present application is applied to this system, the following method steps may be included:
  • Step S501: Collect sets of face images of fugitives, social idlers, and persons under key surveillance. These face images are usually frontal, high-definition pictures, so no additional image processing is required. A face feature extraction algorithm encodes the images, and the encodings are stored to form a base (gallery) database.
  • Step S502: Obtain the surveillance videos captured at night by the monitoring devices in the surveillance area.
  • The surveillance area may be a residential community, a street, or another fixed area.
  • The surveillance video may be transmitted as an online video stream or saved locally offline. The video stream information is transmitted to a back-end data processing module in preparation for video image analysis.
  • Step S503: Perform face detection and trajectory tracking on the video collected by each monitoring device to obtain a group of face trajectory images containing the target face.
  • Step S504: Use the lightweight face quality evaluation algorithm to score each face image in the trajectory in the three dimensions of blur degree, deflection angle, and resolution, and also output the global quality score that combines the three indicators.
  • The global score is a linear weighting of the three indicators, with weighting coefficients obtained by a regression method.
  • The quality of each face image is given by the global quality score, while the three indicators reflect the strength of the face detail features in different dimensions.
  • Step S505: According to the global quality score and the three-dimensional indicators, select from the face trajectory images multiple images of relatively high quality and highly discriminative face detail features as the face candidate set.
  • Step S506: Perform face feature extraction on the images in the face candidate set, enhance the extracted face features with the feature-level enhancement method, apply an average pooling operation to the enhanced face features to fuse the face features of the whole candidate set, and output the face features to be used for subsequent comparison and matching.
  • Step S507: Compare the face features output in step S506 with the face features stored in the base database by calculating the Euclidean distance between them. When the distance is below a certain threshold, the captured face is considered to match the stored identity of a fugitive or social idler; a signal is then sent to the terminal device and the recognition result is shown on the display device.
  • Scenario 2: Monitoring and analyzing personnel activity trajectories in crowded environments
  • Using personnel activity trajectory information to count dwell time, personnel density, and crowd flow has high economic value and social significance.
  • For example, by counting personnel trajectories and analyzing the flow of people, evacuation channels can be deployed rationally and transfer efficiency can be improved.
  • Analyzing dwell time and people flow also has important reference value for rationally locating exhibition areas and arranging merchandise sales areas.
  • This example illustrates a system for monitoring and analyzing people's activity trajectories in crowded scenes.
  • When the face recognition method provided in the embodiments of this application is applied to the system, the following method steps may be included:
  • Step S601: Obtain the surveillance videos of the monitoring devices in a public place over a period of time.
  • The place may be a public area such as a shopping mall, a subway transfer center, or an airport.
  • Assign an ID number to the surveillance video collected by each monitoring device, such as ID1, ID2, ..., IDN.
  • The video data collected by these monitoring devices is transmitted to the back end for video image analysis.
  • Step S602: Use the face detection and face tracking methods to process the video stream collected by the monitoring device of each ID, obtaining a face trajectory image set corresponding to that ID.
  • Step S603: Use the lightweight face quality evaluation algorithm to score each face image in the face trajectory image set of each ID in the three dimensions of blur degree, deflection angle, and resolution, and also output the global quality score combining the three indicators.
  • Step S604: Generate the face candidate set corresponding to each ID according to the global quality score and the three-dimensional indicators.
  • Step S605: Perform face feature extraction, enhancement, and fusion on the images contained in the face candidate set of each ID, and output the face features corresponding to that ID.
  • Each ID then holds a certain number of face features, and these face features represent the people captured by the corresponding monitoring device during this period of time.
  • Step S607: Save the personnel trajectories in a database in the form of a time axis, or display them on an interface for operators to read and use.
  • Aiming at the susceptibility of traditional solutions to various types of noise interference in open monitoring scenarios, the solutions provided by the embodiments of the present application extensively optimize the data processing unit, greatly improving the overall performance of the face recognition monitoring system.
  • A single face image captured by a face detection algorithm is often disturbed by noise, and the face details often exhibit various kinds of defects.
  • For example, when the distance between the subject and the camera is large, the collected face image is often affected by out-of-focus blur, and when the distance is small, motion blur is often produced.
  • The face recognition method proposed in the present application uses multiple face images from one motion trajectory of the same subject for feature fusion and extraction, which effectively avoids the information loss that can occur when, as in traditional approaches, only a single face image is used.
  • In addition, the prior information of the face image in multiple dimensions is used to enhance the face features, so the finally obtained face features have a high degree of generalization.
  • Through feature enhancement, even highly incomplete face images, such as unevenly lit ("yin-yang") faces, faces at large deflection angles, or faces occluded by scarves and masks, retain considerable feature generalization.
  • The face recognition method proposed in this application follows a lightweight design principle: lightweight deep convolutional neural networks are used in the face quality evaluation, face feature fusion, and face feature enhancement modules.
  • The face quality evaluation algorithm computes only in the three dimensions of blur degree, deflection angle, and resolution, which avoids excessive consumption of system resources and better meets the real-time requirements of a face recognition monitoring system.
  • FIG. 5 shows an electronic device 70 provided by an embodiment of the present application. As shown in FIG. 5, the electronic device 70 includes but is not limited to:
  • a memory 72 configured to store a program; and
  • a processor 71 configured to execute the program stored in the memory 72.
  • When the processor 71 executes the program stored in the memory 72, the processor 71 performs the face recognition method described above.
  • the processor 71 and the memory 72 may be connected by a bus or in other ways.
  • The memory 72 can be configured to store non-transitory software programs and non-transitory computer-executable programs, such as a program implementing the face recognition method described in the embodiments of the present application.
  • the processor 71 executes the non-transitory software programs and instructions stored in the memory 72 to realize the aforementioned face recognition method.
  • The memory 72 may include a program storage area and a data storage area.
  • The program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created during execution of the face recognition method described above.
  • the memory 72 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid-state storage devices.
  • In some examples, the memory 72 may include memory located remotely from the processor 71, and such remote memory may be connected to the processor 71 via a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • The non-transitory software programs and instructions required to implement the face recognition method described above are stored in the memory 72; when executed by one or more processors 71, they perform the face recognition method, for example, method steps S100 to S400 described in FIG. 1, method steps S110 to S130 described in FIG. 2, method steps S111 to S113 described in FIG. 3, and method steps S131 to S134 described in FIG. 4.
  • An embodiment of the present application also provides a storage medium storing computer-executable instructions, and the computer-executable instructions are used to execute the face recognition method described above.
  • In some examples, the computer-executable instructions are executed by one or more control processors 71, for example by a processor 71 in the electronic device 70 described above, so that the one or more processors 71 execute the face recognition method described above, for example, method steps S100 to S400 described in FIG. 1, method steps S110 to S130 described in FIG. 2, method steps S111 to S113 described in FIG. 3, and method steps S131 to S134 described in FIG. 4.
  • In summary, the embodiments of the application include: extracting multiple frames of face images containing a target face from a video stream; performing face feature extraction on the multiple frames of face images to obtain first face features; performing feature enhancement on the first face features and fusing the enhanced first face features to obtain a second face feature; and comparing the second face feature with a pre-stored third face feature to determine the face recognition result.
  • The technical solution provided by the embodiments of the present application performs face recognition based on face features extracted from multiple frames of face images, so that the face feature samples are richer and more diverse, feature complementarity is achieved, and more information is available for face recognition. This overcomes the problem of traditional methods, which perform face recognition based on the features of a single image and whose recognition results are therefore strongly affected by noise interference.
  • In addition, the embodiment of the present application performs feature enhancement and fusion on the first face features extracted from the multiple frames of face images, compensating the face features and further improving the success rate and reliability of face recognition.
  • Computer storage media include volatile and non-volatile media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data.
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • Communication media usually contain computer-readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media.

Abstract

A facial recognition method, an electronic device, and a storage medium. The method comprises: extracting, from a video stream, multiple facial image frames comprising a target face (S100); separately performing facial feature extraction on the multiple facial image frames to obtain a first facial feature (S200); performing feature enhancement on the first facial feature, and fusing the enhanced first facial feature to obtain a second facial feature (S300); and comparing the second facial feature with a pre-stored third facial feature to determine a facial recognition result (S400).

Description

Face recognition method, electronic device, and storage medium

Cross-reference to related applications

This application is based on the Chinese patent application with application No. 202010587883.6, filed on June 24, 2020, and claims the priority of that Chinese patent application, the entire content of which is incorporated herein by reference.
技术领域Technical field
本申请涉及图像处理技术领域,特别是涉及一种人脸识别方法、电子设备以及存储介质。This application relates to the field of image processing technology, and in particular to a face recognition method, electronic equipment, and storage medium.
背景技术Background technique
目前,人脸识别广泛应用在安防监控、犯罪抓捕、人流统计分析等多种应用场景中。但是,人脸识别在实际应用过程中容易受到外界各类噪声的干扰。比如:人脸偏转;大幅度侧脸;运动模糊和失焦模糊;人脸有遮挡物(例如口罩,墨镜);低的光照强度和对比度;视频传输由于编解码过程产生的人造块等等。由于受到噪声的干扰,造成人脸识别的精度大幅度下降,从而限制了人脸识别技术的应用发展。At present, face recognition is widely used in various application scenarios such as security monitoring, criminal arrest, and crowd statistics analysis. However, face recognition is susceptible to interference from various external noises in the actual application process. For example: face deflection; large side face; motion blur and out-of-focus blur; face has obstructions (such as masks, sunglasses); low light intensity and contrast; artificial blocks generated by the encoding and decoding process of video transmission, etc. Due to the interference of noise, the accuracy of face recognition is greatly reduced, which limits the application and development of face recognition technology.
发明内容Summary of the invention
以下是对本文详细描述的主题的概述。本概述并非是为了限制权利要求的保护范围。The following is an overview of the topics detailed in this article. This summary is not intended to limit the scope of protection of the claims.
本申请实施例提供了一种人脸识别方法、电子设备以及存储介质,能够降低噪声干扰对人脸识别精度的影响,从而提高人脸识别的成功率。The embodiments of the present application provide a face recognition method, an electronic device, and a storage medium, which can reduce the influence of noise interference on the accuracy of face recognition, thereby improving the success rate of face recognition.
一方面,本申请实施例提供了一种人脸识别方法,包括:从视频流中提取出包含目标人脸的多帧人脸图像;对多帧所述人脸图像分别进行人脸特征 提取,得到第一人脸特征;对所述第一人脸特征进行特征增强,并对增强后的第一人脸特征进行融合,得到第二人脸特征;将所述第二人脸特征与预先存储的第三人脸特征进行比较,确定人脸识别结果。On the one hand, an embodiment of the present application provides a face recognition method, which includes: extracting multiple frames of face images containing a target face from a video stream; extracting face features of the multiple frames of the face images, respectively, Obtain a first face feature; perform feature enhancement on the first face feature, and fuse the enhanced first face feature to obtain a second face feature; combine the second face feature with pre-stored The third face feature is compared to determine the face recognition result.
另一方面,本申请实施例提供了一种电子设备,包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序,所述处理器执行所述程序时实现权利要求如上所述的人脸识别方法的步骤。On the other hand, an embodiment of the present application provides an electronic device that includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor. When the processor executes the program, the claims are as stated above. The steps of the face recognition method described.
再一方面,本申请实施例提供了一种计算机可读存储介质,存储有计算机程序,该程序被处理器执行时实现如上所述的人脸识别方法的步骤。In another aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program, which when executed by a processor, implements the steps of the face recognition method described above.
本申请的其它特征和优点将在随后的说明书中阐述,并且,部分地从说明书中变得显而易见,或者通过实施本申请而了解。本申请的目的和其他优点可通过在说明书、权利要求书以及附图中所特别指出的结构来实现和获得。Other features and advantages of the present application will be described in the following description, and partly become obvious from the description, or understood by implementing the present application. The purpose and other advantages of the application can be realized and obtained through the structures specifically pointed out in the description, claims and drawings.
附图说明Description of the drawings
附图用来提供对本申请技术方案的进一步理解,并且构成说明书的一部分,与本申请的实施例一起用于解释本申请的技术方案,并不构成对本申请技术方案的限制。The accompanying drawings are used to provide a further understanding of the technical solution of the present application, and constitute a part of the specification. Together with the embodiments of the present application, they are used to explain the technical solution of the present application, and do not constitute a limitation to the technical solution of the present application.
图1是本申请实施例提供的一种人脸识别方法的流程图;FIG. 1 is a flowchart of a face recognition method provided by an embodiment of the present application;
图2是图1中的步骤S100的子流程图;FIG. 2 is a sub-flow chart of step S100 in FIG. 1;
图3是图2中的步骤S110的子流程图;FIG. 3 is a sub-flow chart of step S110 in FIG. 2;
图4是图2中的步骤S130子流程图;Fig. 4 is a sub-flow chart of step S130 in Fig. 2;
图5是本申请实施例提供的一种电子设备的结构示意图。Fig. 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
具体实施方式detailed description
为了使本申请的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行进一步详细说明。应当理解,此处所描述的具体实施 例仅用以解释本申请,并不用于限定本申请。In order to make the purpose, technical solutions, and advantages of this application clearer, the following further describes this application in detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the application, and are not used to limit the application.
应了解,在本申请实施例的描述中,多个(或多项)的含义是两个以上,大于、小于、超过等理解为不包括本数,以上、以下、以内等理解为包括本数。如果有描述到“第一”、“第二”等只是用于区分技术特征为目的,而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征的数量或者隐含指明所指示的技术特征的先后关系。It should be understood that in the description of the embodiments of the present application, multiple (or multiple) means two or more, greater than, less than, exceeding, etc. are understood to not include the number, and above, below, and within are understood to include the number. If there are descriptions of "first", "second", etc., only for the purpose of distinguishing technical features, and cannot be understood as indicating or implying relative importance or implicitly indicating the number of indicated technical features or implicitly indicating the indicated The precedence of technical characteristics.
图1示出了本申请实施例提供的一种人脸识别方法的流程图。如图1所示,所述方法包括但不限于如下步骤S100至S400。Fig. 1 shows a flowchart of a face recognition method provided by an embodiment of the present application. As shown in FIG. 1, the method includes but is not limited to the following steps S100 to S400.
步骤S100,从视频流中提取出包含目标人脸的多帧人脸图像。Step S100: Extract multiple frames of face images containing the target face from the video stream.
在具体实现时,可以通过前端的摄像头完成视频的采集,然后对摄像头输出的视频流进行后续处理,获取包含目标人脸的多帧人脸图像。本申请实施例的步骤S100中,从视频流中提取出包含目标人脸的多帧人脸图像,可以通过如图2所示的步骤S110至S130实现。In a specific implementation, the video can be collected through the front-end camera, and then the video stream output by the camera is subjected to subsequent processing to obtain multiple frames of face images containing the target face. In step S100 of the embodiment of the present application, extracting a multi-frame face image containing a target face from a video stream can be implemented through steps S110 to S130 as shown in FIG. 2.
步骤S110,从视频流中提取出包含目标人脸的多帧第一人脸图像。Step S110: Extract multiple frames of first face images containing the target face from the video stream.
在一些示例中,步骤S110具体可以通过如图3所示的步骤S111和S112实现。In some examples, step S110 may be specifically implemented by steps S111 and S112 as shown in FIG. 3.
步骤S111,对视频流进行人脸检测,获取目标人脸在视频流当前帧的脸部位置信息。Step S111: Perform face detection on the video stream, and obtain face position information of the target face in the current frame of the video stream.
在一些示例中,可以采用如多任务级联神经网络(Multi-tasks cascade neural network,MTCNN)、RetinaFace等人脸检测网络,获取目标人脸在当前帧的视频画面的位置信息。其中,位置信息可以是诸如人脸关键点位置信息和人脸边界信息等信息。In some examples, a face detection network such as Multi-tasks Cascade Neural Network (MTCNN) and RetinaFace may be used to obtain the position information of the target face in the video screen of the current frame. Wherein, the location information may be information such as the location information of the key points of the face and the face boundary information.
步骤S112,根据脸部位置信息进行人脸轨迹跟踪,从视频流中提取出包 含目标人脸的多帧第一人脸图像。Step S112: Perform face trajectory tracking according to the face position information, and extract multiple frames of first face images containing the target face from the video stream.
在一些示例中,可以根据人脸检测时获取的目标人脸在当前帧的视频画面位置信息,预测目标人脸在下一帧视频画面中的位置,如此实现人脸轨迹跟踪。通过对目标人脸轨迹进行跟踪,可以从视频流的多帧视频画面中截取目标人脸的图像,如此得到包含目标人脸的一系列人脸轨迹图像,并将一系列的人脸轨迹图像作为多帧第一人脸图像。In some examples, the position of the target face in the video frame of the current frame obtained during face detection can be used to predict the position of the target face in the next frame of the video frame, thus achieving face trajectory tracking. By tracking the target face trajectory, the image of the target face can be intercepted from the multi-frame video images of the video stream, so as to obtain a series of face trajectory images containing the target face, and use a series of face trajectory images as Multiple frames of the first face image.
在一些实施例中,人脸关键点位置信息具体可以包括多个轮廓点位置信息。其中,多个轮廓点位置信息可以包括左眼位置信息、右眼位置信息、鼻子位置信息、左嘴角位置信息和右嘴角位置信息。In some embodiments, the key point position information of the face may specifically include multiple contour point position information. Wherein, the multiple contour point position information may include left eye position information, right eye position information, nose position information, left mouth corner position information, and right mouth corner position information.
对应的,如图3所示,步骤S110中还可以包括步骤S113,根据多个轮廓点位置信息,校准第一人脸图像中目标人脸的角度。Correspondingly, as shown in FIG. 3, step S110 may further include step S113, according to the position information of multiple contour points, calibrating the angle of the target face in the first face image.
在一些示例中,由于在采集的视频流中目标对象可能是活动的,因此通过人脸轨迹跟踪得到的由一系列人脸轨迹图像组成的多帧第一人脸图像中,可能存在部分图像中目标人脸的角度是倾斜的。如此,可以利用上述的轮廓点位置信息实现对第一人脸图像中的目标人脸进行校准。具体的,可以将上述的多个轮廓点位置信息输入至人脸校准算法中,利用人脸校准算法对第一人脸图像中的目标人脸进行倾斜校正。In some examples, since the target object may be active in the captured video stream, the first face image composed of a series of face trajectory images obtained through face trajectory tracking may be part of the first face image. The angle of the target face is oblique. In this way, the above-mentioned contour point position information can be used to achieve calibration of the target face in the first face image. Specifically, the aforementioned multiple contour point position information may be input into the face calibration algorithm, and the face calibration algorithm may be used to perform tilt correction on the target face in the first face image.
步骤S120,分别对多帧第一人脸图像进行人脸质量分析处理,得到每帧第一人脸图像的人脸先验信息。Step S120: Perform face quality analysis and processing on the first face images of multiple frames to obtain the prior information of the face of each frame of the first face image.
本申请实施例采用轻量级人脸质量评价算法对每一帧第一人脸图像进行人脸质量分析处理,得到对应于每一帧第一人脸图像的人脸先验信息。In this embodiment of the application, a lightweight face quality evaluation algorithm is used to perform face quality analysis processing on each frame of the first face image, and obtain the prior information of the face corresponding to each frame of the first face image.
具体的,可以对每一帧第一人脸图像进行多个维度的人脸质量评价,如此得到的人脸先验信息包括多个不同类型的指标参数。Specifically, multiple dimensions of face quality evaluation may be performed on each frame of the first face image, and the face prior information obtained in this way includes multiple different types of index parameters.
在一些示例中,指标参数可以包括模糊程度参数、偏转角参数和分辨率参数三种类型的指标参数。在具体实现时,可以利用轻量级人脸特征提取模型获取第一人脸图像的特征模长,并根据得到的特征模长确定模糊程度参数,一般来说,特征模长越大模糊程度越低;可以利用局部特征二值化LBP对第一人脸图像进行二值化处理,输出人脸对称性指数,并根据人脸对称性指数确定偏转角参数,比如对称性指数为1时表征为正脸角度,偏转角为0;利用人脸检测时得到的左眼位置信息与右眼位置信息确定瞳间距,并根据瞳间距确定分辨率参数,一般来说,瞳间距越大分辨率越高,瞳间距越小分辨率越低。In some examples, the index parameters may include three types of index parameters: blur degree parameters, deflection angle parameters, and resolution parameters. In specific implementation, the lightweight face feature extraction model can be used to obtain the feature length of the first face image, and the fuzzy degree parameters can be determined according to the obtained feature length. Generally speaking, the larger the feature length, the more the blur degree. Low; local feature binarization LBP can be used to binarize the first face image, output the face symmetry index, and determine the deflection angle parameter according to the face symmetry index, for example, when the symmetry index is 1, it is represented as Face angle, deflection angle is 0; use the left eye position information and right eye position information obtained during face detection to determine the interpupillary distance, and determine the resolution parameters according to the interpupillary distance. Generally speaking, the larger the interpupillary distance, the higher the resolution , The smaller the interpupillary distance, the lower the resolution.
应理解,本申请实施例不局限于上述三种指标参数,也可以包括其他不同类型指标参数,或者用其他不同类型指标参数替换上述三种指标参数中的任一种或多种,本申请实施例对此不作限定。It should be understood that the embodiments of this application are not limited to the above three index parameters, and may also include other different types of index parameters, or replace any one or more of the above three index parameters with other different types of index parameters. The implementation of this application The example does not limit this.
本申请实施例通过多个不同类型的指标参数对每一帧第一人脸图像进行多维的质量评价,从而能够反映第一人脸图像中的人脸细节特征在不同维度上的强弱情况。在前述的示例中,采用轻量级人脸质量评分方法,在模糊程度、人脸偏转角、分辨率三个维度上考察人脸质量,获得的人脸先验信息一方面用于后续选取质量较高的图像进行人脸特征提取,保证提取出的特征具有良好的丰富度,保证特征多样化;另一方面可以作用在后续的特征增强环节,以提高人脸特征的泛化性。The embodiment of the present application performs multi-dimensional quality evaluation on each frame of the first face image by using multiple different types of index parameters, so as to reflect the strength of the face detail features in the first face image in different dimensions. In the foregoing example, a lightweight face quality scoring method is used to examine face quality in the three dimensions of blurring, face deflection angle, and resolution, and the obtained prior face information is used for subsequent selection quality on the one hand Higher images are used for facial feature extraction to ensure that the extracted features have good richness and feature diversification; on the other hand, it can be used in subsequent feature enhancement links to improve the generalization of facial features.
步骤S130,根据每帧第一人脸图像的人脸先验信息,从多帧第一人脸图像中选出多帧第二人脸图像。具体的,步骤S130可以包括如图4所示的步骤S131至S133。Step S130, selecting multiple frames of second face images from the multiple frames of first face images according to the face prior information of each frame of the first face image. Specifically, step S130 may include steps S131 to S133 as shown in FIG. 4.
步骤S131,对多个指标参数线性加权得到全局质量评分,根据全局质量 评分从多帧第一人脸图像中获取第一预设数量的初选图像。Step S131: Linearly weight multiple index parameters to obtain a global quality score, and obtain a first preset number of primary selected images from multiple frames of first face images according to the global quality score.
在一些示例中,全局质量评分可以通过对步骤S120中的人脸先验信息所包含的多个指标参数进行线性加权计算得到。通过全局质量评分能够对每一帧第一人脸图像进行综合的质量评估,并根据全局质量评分对每一帧第一人脸图像进行排名,选取排名靠前的第一人脸图像作为初选图像。而且可以通过预先设定第一预设数量值确定要获取的初选图像数量。In some examples, the global quality score may be obtained by linearly weighting multiple index parameters included in the face prior information in step S120. Through the global quality score, the comprehensive quality evaluation of the first face image of each frame can be carried out, and the first face image of each frame can be ranked according to the global quality score, and the first face image with the highest ranking is selected as the primary selection image. Moreover, the number of primary selected images to be acquired can be determined by pre-setting the first preset number value.
作为示例,第一预设数量可以为百分比值,比如设定第一预设数量为30%。当通过步骤S110从视频流中提取出100帧第一人脸图像,以及根据步骤S120提供的方法获取每一帧第一人脸图像的人脸先验信息时,对人脸先验信息所包含的多个指标参数进行线性加权计算,得到每一帧第一人脸图像的全局质量评分。然后根据每一帧第一人脸图像的全局质量评分,按照评分从高至低对100帧第一人脸图像进行排名,取排名前30的第一人脸图像作为初选图像。As an example, the first preset quantity may be a percentage value, for example, the first preset quantity is set to 30%. When 100 frames of the first face image are extracted from the video stream through step S110, and the face prior information of each frame of the first face image is obtained according to the method provided in step S120, the face prior information contains Perform linear weighting calculation on multiple index parameters of, and obtain the global quality score of the first face image in each frame. Then, according to the global quality score of each frame of the first face image, the 100 frames of the first face image are ranked according to the score from high to low, and the top 30 first face images are taken as the primary selection image.
步骤S132,对第一预设数量的初选图像进行排列组合,得到多个初选图像组合,其中,每个初选图像组合中包含第二预设数量的初选图像。In step S132, the first preset number of primary selection images are arranged and combined to obtain multiple primary selection image combinations, wherein each primary selection image combination includes a second preset number of primary selection images.
继续沿用前述的示例,可以根据最终要获取的第二人脸图像数量设定第二预设数量,比如设定第二预设数量为3。如此可以对30帧初选图像进行排列组合,得到
Figure PCTCN2021098156-appb-000001
个初选图像组合,每个初选图像组合中包含了3帧初选图像。
Continuing to use the foregoing example, the second preset number can be set according to the number of second face images to be finally obtained, for example, the second preset number is set to 3. In this way, the 30 primary selected images can be permuted and combined to obtain
Figure PCTCN2021098156-appb-000001
A primary selection image combination, each primary selection image combination contains 3 frames of primary selection images.
步骤S133,根据多个指标参数获取每个初选图像组合的图像区分程度参数,并根据图像区分程度参数从多个初选图像组合中选出终选图像组合。Step S133: Obtain the image discrimination degree parameter of each primary selection image combination according to the multiple index parameters, and select the final selection image combination from the multiple primary selection image combinations according to the image discrimination degree parameter.
其中,图像区分程度参数用以表征初选图像组合所包含的多帧初选图像之间的人脸细节特征的差异性。一般来说,在保证人脸质量的前提下,人脸细节特征的区分程度越大,图像组合包含的可利用信息就越多,如此提取得到的人脸特征就表现出强的泛化性,更适应于开放场景下的人脸识别系统。Among them, the image discrimination degree parameter is used to characterize the difference of face detail features between the multiple frames of the primary images included in the primary image combination. Generally speaking, under the premise of ensuring the quality of the face, the greater the degree of differentiation of the face detail features, the more available information the image combination contains. The facial features extracted in this way show strong generalization and more generalization. Adapt to the face recognition system in open scenes.
初选图像组合的图像区分程度参数可以通过计算组合中的图像两两之间在多个维度上的累计距离,确定当前初选图像组合的图像区分程度参数。The image distinguishing degree parameter of the primary selected image combination can be determined by calculating the cumulative distance between the images in the combination in multiple dimensions to determine the image distinguishing degree parameter of the current primary selected image combination.
继续沿用前述的示例,假定当前初选图像组合为T1,组合T1中包含编号为P1、P2和P3的三幅图像,计算P1和P2、P1和P3、P2和P3分别在各个维度上的距离,比如:计算出P1和P2在模糊程度、人脸偏转角、分辨率三个维度上的距离分别为S1(P1P2)、S2(P1P2)、S3(P1P2),计算出P1和P3在模糊程度、人脸偏转角、分辨率三个维度上的距离分别为S1(P1P3)、S2(P1P3)、S3(P1P3),计算出P2和P3在模糊程度、人脸偏转角、分辨率三个维度上的距离分别为S1(P2P3)、S2(P2P3)、S3(P2P3),则初选图像组合T1的图像区分程度参数为:Continue to use the previous example, assuming that the current primary selection image combination is T1, the combination T1 contains three images numbered P1, P2, and P3, and calculate the distances between P1 and P2, P1 and P3, and P2 and P3 in each dimension. For example, calculate the distance between P1 and P2 in the three dimensions of blur degree, face deflection angle, and resolution as S1 (P1P2), S2 (P1P2), S3 (P1P2), and calculate the degree of blur of P1 and P3 The distances in the three dimensions of, face deflection angle, and resolution are S1 (P1P3), S2 (P1P3), and S3 (P1P3). Calculate P2 and P3 in the three dimensions of blur degree, face deflection angle, and resolution The distances above are respectively S1(P2P3), S2(P2P3), S3(P2P3), then the image discrimination degree parameter of the primary selected image combination T1 is:
S1=S1(P1P2)+S2(P1P2)+S3(P1P2)+S1(P1P3)+S2(P1P3)+S3(P1P3)+S1(P2P3)+S2(P2P3)+S3(P2P3)。S1=S1(P1P2)+S2(P1P2)+S3(P1P2)+S1(P1P3)+S2(P1P3)+S3(P1P3)+S1(P2P3)+S2(P2P3)+S3(P2P3).
根据计算得到的各个初选图像组合的图像区分程度参数,选取图像区分程度参数最大的初选图像组合作为终选图像组合。According to the calculated image discrimination degree parameters of each initial selection image combination, the initial selection image combination with the largest image discrimination degree parameter is selected as the final selection image combination.
步骤S134,将终选图像组合所包含的图像作为第二人脸图像。In step S134, the image included in the final selected image combination is used as the second face image.
当选出终选图像组合,将终选图像组合中所包含的图像作为第二人脸图像。比如选出组合T1作为终选图像组合,则将组合T1包含的图像P1、P2和P3作为第二人脸图像。When the final selected image combination is selected, the image included in the final selected image combination is used as the second face image. For example, if the combination T1 is selected as the final image combination, the images P1, P2, and P3 included in the combination T1 are used as the second face image.
步骤S200,对多帧人脸图像分别进行人脸特征提取,得到第一人脸特征。Step S200: Perform face feature extraction on multiple frames of face images to obtain a first face feature.
在一些示例中,步骤S200中的人脸图像可以是通过步骤S134得到的多帧第二人脸图像。In some examples, the face image in step S200 may be multiple frames of second face images obtained through step S134.
在一些示例中,可以使用神经网络对多帧人脸图像分别进行人脸特征提取,得到第一人脸特征。其中,提取到的第一人脸特征包括多维的人脸向量。 神经网络可以采用如Resnet152的人脸特征提取算法,如此输出一组256维的深度人脸特征。这些特征代表着原始未经过特征增强的人脸图像信息编码。In some examples, a neural network may be used to extract face features of multiple frames of face images to obtain the first face feature. Among them, the extracted first face feature includes a multi-dimensional face vector. The neural network can use a facial feature extraction algorithm such as Resnet152 to output a set of 256-dimensional deep facial features. These features represent the original face image information encoding without feature enhancement.
步骤S300,对第一人脸特征进行特征增强,并对增强后的第一人脸特征进行融合,得到第二人脸特征。In step S300, feature enhancement is performed on the first face feature, and the enhanced first face feature is merged to obtain a second face feature.
在一些示例中,使用深度卷积神经网络将第一人脸特征与人脸先验信息进行点乘操作,得到增强后的第一人脸特征。其中,人脸先验信息是在前述的步骤S120中通过对人脸图像进行人脸质量分析处理得到的。In some examples, a deep convolutional neural network is used to perform a dot product operation on the first face feature and face prior information to obtain the enhanced first face feature. Wherein, the face prior information is obtained by performing face quality analysis processing on the face image in the aforementioned step S120.
沿用前述的示例,将通过Resnet152算法从第二人脸图像提取得到的256维的深度人脸特征以及与第二人脸图像对应的人脸先验信息(模糊程度参数、人脸偏转角参数、分辨率参数)输入到深度卷积神经网络中,通过深度卷积神经网络对深度人脸特征与人脸先验信息进行点乘操作,以利用人脸质量评价算法输出的人脸先验信息对人脸特征进行增强处理。Following the previous example, the 256-dimensional deep face features extracted from the second face image by the Resnet152 algorithm and the face prior information corresponding to the second face image (blur degree parameter, face deflection angle parameter, Resolution parameter) is input to the deep convolutional neural network, and the deep face feature and the face prior information are multiplied by the deep convolutional neural network to use the face prior information output by the face quality evaluation algorithm. Face features are enhanced.
区别于传统的图像级增强方法,例如图像去模糊、超分辨率等,本申请实施例采用一种特征级的增强方法。相比图像级增强方法,采用特征级增强的好处在于,处理对象为一组多维的人脸向量,计算量小,从而可大大提高处理效率。Different from traditional image-level enhancement methods, such as image deblurring, super-resolution, etc., the embodiment of the present application adopts a feature-level enhancement method. Compared with image-level enhancement methods, the advantage of using feature-level enhancement is that the processing object is a set of multi-dimensional face vectors, and the amount of calculation is small, which can greatly improve the processing efficiency.
在一些示例中,用于实现特征增强的深度卷积神经网络可以为两个全连接层的串联,在人脸特征提取数据集上训练,得到特征增强模块,用于对原始特征进行补偿。人脸质量评分模块输出得到的三种质量指标,反应了人脸图像在模糊程度、偏转角以及分辨率三个维度上的强弱,这些指标通过点乘操作控制特征增强模块对原始特征进行增强处理。In some examples, the deep convolutional neural network used to implement feature enhancement may be a series connection of two fully connected layers, which are trained on a face feature extraction data set to obtain a feature enhancement module, which is used to compensate the original features. The three quality indicators output by the face quality scoring module reflect the strength of the face image in the three dimensions of blur, deflection angle and resolution. These indicators are used to control the feature enhancement module to enhance the original features through the dot multiplication operation. deal with.
After the enhancement of the first face feature is completed, the enhanced first face feature is fused to obtain the second face feature.
Specifically, the enhanced first face features may be fused through an average pooling operation to obtain the second face feature.
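The fusion step amounts to averaging the per-frame vectors into a single track-level vector, for example as follows; the optional L2 normalization is an assumption, added only because distance comparison is typically more stable on normalized embeddings.

```python
import torch

def fuse_features(enhanced_feats: torch.Tensor, normalize: bool = True) -> torch.Tensor:
    """Average-pool N enhanced per-frame features into one second face feature."""
    fused = enhanced_feats.mean(dim=0)  # (N, 256) -> (256,)
    if normalize:                       # assumption, not required by the text
        fused = fused / fused.norm().clamp_min(1e-12)
    return fused
```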
In step S400, the second face feature is compared with a pre-stored third face feature to determine a face recognition result.
Specifically, a Euclidean distance algorithm may be used to compare the second face feature with the pre-stored third face feature to determine the face recognition result.
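For instance, the comparison could be a nearest-neighbor search under Euclidean distance, as in the sketch below; the threshold value is a placeholder that would in practice be tuned on validation data.

```python
import torch

def recognize(query: torch.Tensor, gallery: torch.Tensor, threshold: float = 1.0):
    """Match one second face feature against the pre-stored third face features."""
    # query: (256,); gallery: (M, 256); threshold is a hypothetical value
    dists = torch.cdist(query.unsqueeze(0), gallery).squeeze(0)  # (M,) distances
    best = int(torch.argmin(dists))
    return best if dists[best] < threshold else None  # None means no match
```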
The face recognition method provided by the embodiments of the present application is further illustrated below in combination with specific application scenarios.
Scenario 1: smart city night-time face monitoring
Intelligent security monitoring systems play an important role in national smart city information construction. A traditional face recognition monitoring system performs well on sunny days with good lighting conditions. At night, however, recognition accuracy often drops sharply due to complex night scenes, low brightness, aging supplementary lighting equipment, poor angle configuration, temperature, rain, snow, and many other factors. Meanwhile, an urban night-time monitoring system is of great significance for tracking wanted fugitives and social idlers. Against this background, this example describes a face recognition monitoring system for an urban night-time deployment and control scenario. When the face recognition method provided by the embodiments of the present application is applied to this system, the method may specifically include the following steps:
Step S501: Collect sets of face images of fugitives, social idlers, and key surveillance targets. These face images are usually frontal, high-definition pictures, so no additional image processing is required. A face feature extraction algorithm is used to encode these face images, which are then stored to form a base database.
Step S502: Obtain surveillance videos collected at night by the monitoring devices in the surveillance area, which may be a residential community, a street, or another fixed area. The surveillance video may be transmitted as an online video stream or saved locally offline. The video streams are transmitted to a back-end data processing module in preparation for video image analysis.
Step S503: Perform face detection and trajectory tracking on the video information collected by each monitoring device to obtain a group of face trajectory images containing the target face.
Step S504: Use a lightweight face quality evaluation algorithm to score each face image in the trajectory along the three dimensions of blur degree, deflection angle, and resolution. A global quality score combining the three indicators is also output; it is a linear weighting of the three indicators, with the weighting coefficients obtained by a regression method. The quality of a face image is given by the global quality score, while the three indicators reflect the strength of the face detail features along the different dimensions.
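A sketch of the scoring rule is given below; the weight values shown are placeholders standing in for coefficients that, per the text, would be obtained by regression.

```python
import numpy as np

def global_quality_score(blur: float, deflection: float, resolution: float,
                         weights=(0.4, 0.3, 0.3)) -> float:
    """Linearly weight the three quality indicators into one global score.

    The weights here are hypothetical; in this embodiment they would be
    fitted by a regression method on annotated face quality data.
    """
    return float(np.dot(np.asarray(weights),
                        np.array([blur, deflection, resolution], dtype=float)))
```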
Step S505: According to the global quality score and the indicators in the three dimensions, select from the face trajectory images multiple images that are of relatively high quality and whose face detail features are well differentiated, to form the face candidate set.
Step S506: Perform face feature extraction on the images in the face candidate set, enhance the extracted face features with the feature-level enhancement method, and apply an average pooling operation to fuse all the enhanced face features in the candidate set, outputting the final face feature used for subsequent comparison and matching, as sketched below.
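Putting the earlier pieces together, step S506 is essentially extraction, enhancement, and average pooling in sequence; the composition below reuses the hypothetical modules sketched in the previous examples.

```python
import torch

def track_feature(frames: torch.Tensor, priors: torch.Tensor,
                  extractor, enhancer) -> torch.Tensor:
    """Compose extraction, enhancement, and fusion for one face candidate set.

    frames: (N, 3, H, W) candidate images; priors: (N, 3) quality indicators;
    extractor/enhancer: the hypothetical modules sketched earlier.
    """
    feats = extractor(frames)           # (N, 256) first face features
    enhanced = enhancer(feats, priors)  # (N, 256) enhanced features
    return enhanced.mean(dim=0)         # (256,) fused feature for matching
```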
Step S507: Compare the face feature output by step S506 with the face features stored in the base database, and calculate the Euclidean distance between them. When the distance is less than a certain threshold, the captured face is considered to match the ID of a fugitive or social idler stored in the base database. A signal is sent to the terminal device, and the recognition result is shown on the display device.
Scenario 2: monitoring and analysis of personnel activity trajectories in crowded environments
In the intelligent upgrading of large public places, using personnel activity trajectory information to count dwell time, crowd density, and crowd flow has high economic value and social significance. For example, by counting personnel trajectories and analyzing crowd flow at a subway transfer center, evacuation channels can be rationally allocated and transfer efficiency improved. As another example, in a large shopping mall, analyzing dwell time and crowd flow provides an important reference for rationally arranging exhibition areas and merchandise sales areas. When linking personnel activity trajectories, because the coverage of a single video capture device is limited, the face images often come from multiple capture devices; to obtain the correct face trajectory path, the face features must generalize well so that the correct trajectory paths can be matched. In crowded scenes, however, faces are prone to occlusion, side views, and motion blur, and such noise severely undermines the stability of face features. Against this background, this example describes a system for monitoring and analyzing personnel activity trajectories in crowded scenes. When the face recognition method provided by the embodiments of the present application is applied to this system, the method may specifically include the following steps:
Step S601: Obtain the surveillance videos of the monitoring devices in a certain public place over a period of time; the place may specifically be a public area such as a shopping mall, a subway transfer center, or an airport. Assign a corresponding ID number to the surveillance video collected by each monitoring device, for example ID1, ID2, ..., IDN. The video data collected by these monitoring devices is transmitted to the back end for video image analysis.
Step S602: Use face detection and face tracking methods to process the video stream collected by the monitoring device of each ID, obtaining a face trajectory image set corresponding to that ID.
Step S603: Use the lightweight face quality evaluation algorithm to score each face image in the face trajectory image set corresponding to each ID along the three dimensions of blur degree, deflection angle, and resolution. The global quality score combining the three indicators is also output.
Step S604: Generate a face candidate set corresponding to each ID according to the global quality score and the indicators in the three dimensions.
Step S605: Perform face feature extraction, enhancement, and fusion on the images contained in the face candidate set corresponding to each ID, and output the face features corresponding to that ID.
Step S606: Each ID contains a certain number of face features, which represent the number of people captured by that monitoring device during the period. Calculate the pairwise Euclidean distances between the face features of ID1, ID2, ..., IDN. When a distance is less than a certain threshold, the identity match is considered successful, and the trajectory information of the person under the different ID monitoring devices is associated.
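A sketch of this pairwise cross-camera association follows; the threshold and the data layout (a mapping from camera ID to a feature matrix) are assumptions made for illustration.

```python
import torch

def associate_tracks(features_by_id: dict, threshold: float = 1.0):
    """Pairwise Euclidean matching of fused track features across camera IDs.

    features_by_id: {camera_id: (K_i, 256) tensor of track features}.
    Returns (id_a, track_a, id_b, track_b) tuples for matched identities.
    """
    ids = list(features_by_id)
    matches = []
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            d = torch.cdist(features_by_id[a], features_by_id[b])  # (K_a, K_b)
            for ta, tb in (d < threshold).nonzero(as_tuple=False).tolist():
                matches.append((a, ta, b, tb))  # same person seen on both cameras
    return matches
```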
Step S607: Save the personnel trajectories in a database along a time axis, or display them on an interface for operators to retrieve or use.
The solution provided by the embodiments of the present application addresses the susceptibility of traditional solutions to various types of noise interference in open monitoring scenarios; extensive optimization of the data processing unit greatly improves the overall performance of the face recognition monitoring system. This is embodied specifically as follows:
Greatly improved recognition accuracy. In a typical surveillance scene, a single face image captured by a face detection algorithm is often disturbed by noise, and face detail features are frequently missing in various ways. For example, within a given surveillance area, when the distance between the subject and the capture device is large, the captured face image tends to suffer from out-of-focus blur, whereas when the distance is small, motion blur tends to occur. The face recognition method proposed in the present application performs feature fusion over multiple face images from one motion trajectory of the same subject, effectively avoiding the information loss that may arise when traditionally relying on a single face image. At the same time, the prior information of the face image in multiple dimensions is used to enhance the face features, so the resulting face features generalize well. Through feature enhancement, even highly degraded face images, such as faces with strongly uneven illumination, large deflection angles, or occlusion by scarves and masks, retain considerable feature generalization.
Greatly improved operating efficiency. While ensuring high recognition accuracy, the face recognition method proposed in the present application follows a lightweight design principle, using lightweight deep convolutional neural networks in the face quality evaluation, face feature fusion, and face feature enhancement modules. For example, the face quality evaluation algorithm computes only along the three dimensions of blur degree, deflection angle, and resolution, which avoids consuming excessive system resources and better meets the real-time requirements of a face recognition monitoring system.
FIG. 5 shows an electronic device 70 provided by an embodiment of the present application. As shown in FIG. 5, the electronic device 70 includes, but is not limited to:
a memory 72, configured to store a program; and
a processor 71, configured to execute the program stored in the memory 72, where, when the processor 71 executes the program stored in the memory 72, the processor 71 is configured to perform the face recognition method described above.
The processor 71 and the memory 72 may be connected by a bus or in other ways.
As a non-transitory computer-readable storage medium, the memory 72 may be configured to store non-transitory software programs and non-transitory computer-executable programs, such as those implementing the face recognition method described in the embodiments of the present application. The processor 71 implements the face recognition method by running the non-transitory software programs and instructions stored in the memory 72.
The memory 72 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data used in performing the face recognition method described above. In addition, the memory 72 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or another non-transitory solid-state storage device. In some implementations, the memory 72 includes memories remotely located with respect to the processor 71, and these remote memories may be connected to the processor 71 via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software programs and instructions required to implement the face recognition method are stored in the memory 72, and when executed by one or more processors 71, perform the face recognition method described above, for example, method steps S100 to S400 described in FIG. 1, method steps S110 to S130 described in FIG. 2, method steps S111 to S113 described in FIG. 3, and method steps S131 to S134 described in FIG. 4.
An embodiment of the present application further provides a storage medium storing computer-executable instructions, where the computer-executable instructions are used to perform the face recognition method described above.
In an embodiment, the storage medium stores computer-executable instructions which, when executed by one or more control processors 71, for example by one processor 71 in the electronic device 70 described above, may cause the one or more processors 71 to perform the face recognition method described above, for example, method steps S100 to S400 described in FIG. 1, method steps S110 to S130 described in FIG. 2, method steps S111 to S113 described in FIG. 3, and method steps S131 to S134 described in FIG. 4.
The embodiments of the present application include: extracting multiple frames of face images containing a target face from a video stream; performing face feature extraction on the multiple frames of face images, respectively, to obtain a first face feature; performing feature enhancement on the first face feature, and fusing the enhanced first face feature to obtain a second face feature; and comparing the second face feature with a pre-stored third face feature to determine a face recognition result. The technical solution provided by the embodiments of the present application performs face recognition based on face features extracted from multiple frames of face images, making the face feature samples richer and more diverse, achieving feature complementarity, and providing more usable information during face recognition. This overcomes the problem of traditional methods, which perform face recognition based only on the features of a single image, so that the recognition result is strongly affected by noise interference. The embodiments of the present application also perform feature enhancement and fusion on the first face features extracted from the multiple frames of face images, thereby compensating the face features and further improving the success rate and reliability of face recognition.
The embodiments described above are merely illustrative; units described as separate components may or may not be physically separate, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
A person of ordinary skill in the art will understand that all or some of the steps and systems in the methods disclosed above may be implemented as software, firmware, hardware, and appropriate combinations thereof. Some or all physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage medium includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tapes, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
The preferred implementations of the present application have been specifically described above, but the present application is not limited to the above embodiments. Those skilled in the art may make various equivalent modifications or substitutions without departing from the solution of the present application, and such equivalent modifications or substitutions are all included within the scope defined by the claims of the present invention.

Claims (12)

  1. A face recognition method, comprising:
    extracting multiple frames of face images containing a target face from a video stream;
    performing face feature extraction on the multiple frames of face images, respectively, to obtain a first face feature;
    performing feature enhancement on the first face feature, and fusing the enhanced first face feature to obtain a second face feature; and
    comparing the second face feature with a pre-stored third face feature to determine a face recognition result.
  2. The face recognition method according to claim 1, wherein the extracting multiple frames of face images containing a target face from a video stream comprises:
    extracting multiple frames of first face images containing the target face from the video stream;
    performing face quality analysis processing on the multiple frames of first face images, respectively, to obtain face prior information of each frame of the first face images; and
    selecting multiple frames of second face images from the multiple frames of first face images according to the face prior information of each frame of the first face images;
    and wherein the performing face feature extraction on the multiple frames of face images, respectively, to obtain a first face feature comprises:
    performing face feature extraction on the multiple frames of second face images, respectively, to obtain the first face feature.
  3. The face recognition method according to claim 2, wherein the face prior information comprises a plurality of index parameters of different types;
    and the selecting multiple frames of second face images from the multiple frames of first face images according to the face prior information comprises:
    linearly weighting the plurality of index parameters to obtain a global quality score, and acquiring a first preset number of primarily selected images from the multiple frames of first face images according to the global quality score;
    arranging and combining the first preset number of primarily selected images to obtain a plurality of primarily selected image combinations, wherein each primarily selected image combination contains a second preset number of the primarily selected images;
    acquiring an image discrimination degree parameter of each primarily selected image combination according to the plurality of index parameters, and selecting a finally selected image combination from the plurality of primarily selected image combinations according to the image discrimination degree parameters; and
    using the primarily selected images contained in the finally selected image combination as the second face images.
  4. The face recognition method according to claim 3, wherein the index parameters comprise a blur degree parameter, a deflection angle parameter, and a resolution parameter.
  5. The face recognition method according to claim 2, wherein the extracting multiple frames of first face images containing the target face from the video stream comprises:
    performing face detection on the video stream to acquire facial position information of the target face in a current frame of the video stream; and
    performing face trajectory tracking according to the facial position information, and extracting the multiple frames of first face images containing the target face from the video stream.
  6. The face recognition method according to claim 5, wherein the facial position information comprises position information of a plurality of contour points;
    and the extracting multiple frames of first face images containing the target face from the video stream further comprises:
    calibrating an angle of the target face in the first face images according to the position information of the plurality of contour points.
  7. The face recognition method according to claim 1, wherein the performing face feature extraction on the multiple frames of face images, respectively, to obtain a first face feature comprises:
    using a neural network to perform face feature extraction on the multiple frames of face images, respectively, to obtain the first face feature, wherein the extracted first face feature comprises a multi-dimensional face vector.
  8. The face recognition method according to claim 7, wherein the performing feature enhancement on the first face feature comprises:
    using a deep convolutional neural network to perform a dot product operation between the first face feature and face prior information to obtain the enhanced first face feature, wherein the face prior information is obtained by performing face quality analysis processing on the face images.
  9. The face recognition method according to claim 1 or 7, wherein the fusing the enhanced first face feature to obtain a second face feature comprises:
    fusing the enhanced first face feature through an average pooling operation to obtain the second face feature.
  10. The face recognition method according to claim 1, wherein the comparing the second face feature with a pre-stored third face feature to determine a face recognition result comprises:
    using a Euclidean distance algorithm to compare the second face feature with the pre-stored third face feature to determine the face recognition result.
  11. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the method according to any one of claims 1 to 10.
  12. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 10.
PCT/CN2021/098156 2020-06-24 2021-06-03 Facial recognition method, electronic device, and storage medium WO2021259033A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
BR112022026549A BR112022026549A2 (en) 2020-06-24 2021-06-03 FACE RECOGNITION METHOD, ELECTRONIC DEVICE AND COMPUTER READABLE STORAGE MEDIA

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010587883.6 2020-06-24
CN202010587883.6A CN113836980A (en) 2020-06-24 2020-06-24 Face recognition method, electronic device and storage medium

Publications (1)

Publication Number Publication Date
WO2021259033A1 true WO2021259033A1 (en) 2021-12-30

Family

ID=78964520

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/098156 WO2021259033A1 (en) 2020-06-24 2021-06-03 Facial recognition method, electronic device, and storage medium

Country Status (3)

Country Link
CN (1) CN113836980A (en)
BR (1) BR112022026549A2 (en)
WO (1) WO2021259033A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114419824A (en) * 2021-12-29 2022-04-29 厦门熙重电子科技有限公司 Face track system applied to campus interior and periphery

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103116756A (en) * 2013-01-23 2013-05-22 北京工商大学 Face detecting and tracking method and device
CN104008370A (en) * 2014-05-19 2014-08-27 清华大学 Video face identifying method
US20150205997A1 (en) * 2012-06-25 2015-07-23 Nokia Corporation Method, apparatus and computer program product for human-face features extraction
US20180204052A1 (en) * 2015-08-28 2018-07-19 Baidu Online Network Technology (Beijing) Co., Ltd. A method and apparatus for human face image processing
CN109948489A (en) * 2019-03-09 2019-06-28 闽南理工学院 A kind of face identification system and method based on the fusion of video multiframe face characteristic


Also Published As

Publication number Publication date
BR112022026549A2 (en) 2023-04-18
CN113836980A (en) 2021-12-24

Similar Documents

Publication Publication Date Title
CN109934176B (en) Pedestrian recognition system, recognition method, and computer-readable storage medium
US11704936B2 (en) Object tracking and best shot detection system
WO2018188453A1 (en) Method for determining human face area, storage medium, and computer device
CN105139040B (en) A kind of queueing condition information detecting method and its system
KR20210090139A (en) Information processing apparatus, information processing method, and storage medium
WO2019033574A1 (en) Electronic device, dynamic video face recognition method and system, and storage medium
US20090087038A1 (en) Image processing apparatus, image pickup apparatus, processing method for the apparatuses, and program for the apparatuses
US8130285B2 (en) Automated searching for probable matches in a video surveillance system
CN111209818A (en) Video individual identification method, system, equipment and readable storage medium
WO2020052275A1 (en) Image processing method and apparatus, terminal device, server and system
CN111898592B (en) Track data processing method and device and computer readable storage medium
CN111241928A (en) Face recognition base optimization method, system, equipment and readable storage medium
CN113947742A (en) Person trajectory tracking method and device based on face recognition
WO2021259033A1 (en) Facial recognition method, electronic device, and storage medium
US9286707B1 (en) Removing transient objects to synthesize an unobstructed image
CN112131984A (en) Video clipping method, electronic device and computer-readable storage medium
US20200043175A1 (en) Image processing device, image processing method, and recording medium storing program
WO2023019927A1 (en) Facial recognition method and apparatus, storage medium, and electronic device
WO2022134916A1 (en) Identity feature generation method and device, and storage medium
CN110543813A (en) Face image and gaze counting method and system based on scene
US20230076241A1 (en) Object detection systems and methods including an object detection model using a tailored training dataset
CN112232113B (en) Person identification method, person identification device, storage medium, and electronic apparatus
CN112329665B (en) Face snapshot system
CN114882576A (en) Face recognition method, electronic device, computer-readable medium, and program product
CN110390234B (en) Image processing apparatus and method, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21830216; Country of ref document: EP; Kind code of ref document: A1)
REG Reference to national code (Ref country code: BR; Ref legal event code: B01A; Ref document number: 112022026549; Country of ref document: BR)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 112022026549; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20221223)
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17-05-2023))
122 Ep: pct application non-entry in european phase (Ref document number: 21830216; Country of ref document: EP; Kind code of ref document: A1)