WO2020244032A1 - Face image detection method and apparatus - Google Patents

Face image detection method and apparatus

Info

Publication number
WO2020244032A1
WO2020244032A1 (PCT/CN2019/096575)
Authority
WO
WIPO (PCT)
Prior art keywords
face
face image
image
sequence
image frame
Prior art date
Application number
PCT/CN2019/096575
Other languages
French (fr)
Chinese (zh)
Inventor
连桄雷
张龙
Original Assignee
罗普特科技集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 罗普特科技集团股份有限公司 filed Critical 罗普特科技集团股份有限公司
Publication of WO2020244032A1 publication Critical patent/WO2020244032A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/0002: Inspection of images, e.g. flaw detection
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161: Detection; Localisation; Normalisation
    • G06V 40/168: Feature extraction; Face representation
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10016: Video; Image sequence
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30168: Image quality inspection
    • G06T 2207/30196: Human being; Person
    • G06T 2207/30201: Face

Definitions

  • determining the sharpness of each face image based on the key point information set of each face image includes: extracting target key point information from the key point information set of each face image; determining, based on the target key point information, a target area from each face image, and determining the average pixel gradient of the pixels included in the target area; and determining the sharpness of each face image based on the average pixel gradient.
  • the above-mentioned execution subject may input the image frame into a pre-trained face detection model to obtain face position information.
  • the face detection model is used to characterize the correspondence between the image sequence and the face position information.
  • Step 1: determine the face pose angle information of each face image based on the key point information set of each face image included in the face image sequence.
  • the face detection model is also used to generate the key point information set of the image frame, where the key point information is used to characterize the position of the face key point in the face image; and
  • the output module 504 includes: a first determining unit (not shown in the figure), configured to determine the face pose angle information of each face image based on the key point information set of each face image included in the face image sequence;
  • the second determining unit (not shown in the figure) is configured to determine the quality score of each face image based on the face pose angle information.
  • the computer-readable storage medium described in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • the computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

Embodiments of the present invention disclose a face image detection method and apparatus. A specific embodiment of the method comprises: acquiring a target image frame sequence; for each image frame comprised in the target image frame sequence, inputting the image frame into a pre-trained face detection model to acquire face position information; determining, on the basis of the acquired face position information, at least one face image sequence from the image frames comprised in the target image frame sequence, wherein a face image comprised in each face image sequence is used to indicate the same face; for each face image sequence in the at least one face image sequence, determining a quality score of each face image comprised in the face image sequence; and extracting a face image from the face image sequence on the basis of the acquired quality score, and outputting the face image. This embodiment extracts a high-quality face image from the target image sequence, thereby improving the accuracy for operations such as face recognition performed using the extracted face image.

Description

用于检测人脸图像的方法和装置Method and device for detecting face image
相关申请Related application
本申请要求保护在2019年6月3日提交的申请号为201910475881.5的中国专利申请的优先权,该申请的全部内容以引用的方式结合到本文中。This application claims the priority of the Chinese patent application with application number 201910475881.5 filed on June 3, 2019, and the entire content of the application is incorporated herein by reference.
技术领域Technical field
本公开实施例涉及计算机技术领域,具体涉及用于检测人脸图像的方法和装置。The embodiments of the present disclosure relate to the field of computer technology, and in particular to methods and devices for detecting face images.
背景技术Background technique
目前视频监控网络已覆盖中国各大中小城市，人脸识别技术可以应用在安防监控领域。通常，为了建立从云端到前端软硬一体的新型智能安防体系，就必须在前端部署足够丰富的人脸抓拍设备。在监控点位改造和社会资源接入的过程中，纯粹依靠后端抓拍、分析的模式不仅给网络的数据传输能力带来挑战，而且也给后端平台的数据处理能力带来很大的压力，存在运行效率缩减、运营成本大的问题。通常，可以把人脸抓拍功能分担到前端，但是大批量的更换抓拍摄像机会造成项目建设成本的骤增。目前随着5G时代的来临，边缘计算作为云计算的补充，可以充当替代解决方案，这样，就需要一种网关设备能实现前端视频的人脸抓拍，供后端进行分析。At present, video surveillance networks cover large, medium, and small cities across China, and face recognition technology can be applied in the field of security surveillance. Generally, to establish a new intelligent security system integrating software and hardware from the cloud to the front end, sufficient face capture equipment must be deployed at the front end. In the process of upgrading monitoring points and integrating social resources, a mode that relies purely on back-end capture and analysis not only challenges the data transmission capacity of the network, but also puts great pressure on the data processing capacity of the back-end platform, leading to reduced operating efficiency and high operating costs. The face capture function can generally be offloaded to the front end, but replacing capture cameras in large quantities would cause a sharp increase in project construction costs. With the advent of the 5G era, edge computing, as a complement to cloud computing, can serve as an alternative solution. Thus, a gateway device is needed that can perform face capture on front-end video for back-end analysis.
公开内容Summary of the disclosure
本公开实施例的目的在于提出了一种改进的用于检测人脸图像的方法和装置,来解决以上背景技术部分提到的技术问题。The purpose of the embodiments of the present disclosure is to provide an improved method and device for detecting a face image to solve the technical problems mentioned in the background art section above.
第一方面，本公开实施例提供了一种用于检测人脸图像的方法，该方法包括：获取目标图像帧序列；对于目标图像帧序列包括的每个图像帧，将该图像帧输入预先训练的人脸检测模型，得到人脸位置信息；基于所得到的人脸位置信息，从目标图像帧序列包括的图像帧中，确定至少一个人脸图像序列，其中，每个人脸图像序列包括的人脸图像用于指示同一个人脸；对于至少一个人脸图像序列中的每个人脸图像序列，确定该人脸图像序列包括的每个人脸图像的质量评分；基于所得到的质量评分，从该人脸图像序列中提取人脸图像及输出。In a first aspect, embodiments of the present disclosure provide a method for detecting face images. The method includes: acquiring a target image frame sequence; for each image frame included in the target image frame sequence, inputting the image frame into a pre-trained face detection model to obtain face position information; determining, based on the obtained face position information, at least one face image sequence from the image frames included in the target image frame sequence, where the face images included in each face image sequence are used to indicate the same face; for each face image sequence in the at least one face image sequence, determining a quality score of each face image included in the face image sequence; and extracting a face image from the face image sequence based on the obtained quality scores and outputting the extracted face image.
在一些实施例中，基于所得到的人脸位置信息，从目标图像帧序列包括的图像帧中，确定至少一个人脸图像序列，包括：对于目标图像帧序列中的每两个相邻的图像帧，确定该两个相邻的图像帧的第一图像帧中的每个人脸图像中的特征点，以及确定第一图像帧中的每个人脸图像对应的、在第二图像帧中的预测特征点；从第二图像帧中的人脸图像中，确定包括的预测特征点的数量大于等于预设数值的人脸图像作为与在第一图像帧中的对应人脸图像指示的人脸相同的人脸图像。In some embodiments, determining at least one face image sequence from the image frames included in the target image frame sequence based on the obtained face position information includes: for every two adjacent image frames in the target image frame sequence, determining feature points in each face image in the first image frame of the two adjacent image frames, and determining, for each face image in the first image frame, corresponding predicted feature points in the second image frame; and, from the face images in the second image frame, determining a face image that includes a number of predicted feature points greater than or equal to a preset value as a face image indicating the same face as the corresponding face image in the first image frame.
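The predicted-feature-point criterion above reduces to counting how many predicted points fall inside a candidate face box. A minimal sketch follows; the box format `(x1, y1, x2, y2)` and the threshold of 3 points are illustrative assumptions, not values from the patent.

```python
def points_in_box(points, box):
    # count predicted feature points (x, y) that fall inside a face
    # bounding box given as (x1, y1, x2, y2)
    x1, y1, x2, y2 = box
    return sum(1 for (x, y) in points if x1 <= x <= x2 and y1 <= y <= y2)

def indicates_same_face(predicted_points, face_box, min_points=3):
    # a face image in the second frame is taken to indicate the same face
    # as the one in the first frame when it contains at least the preset
    # number (min_points) of predicted feature points
    return points_in_box(predicted_points, face_box) >= min_points
```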
在一些实施例中，基于所得到的人脸位置信息，从目标图像帧序列包括的图像帧中，确定至少一个人脸图像序列，包括：对于目标图像帧序列中的每两个相邻的图像帧，将该两个相邻的图像帧中的第一图像帧中的人脸图像与第二图像帧中的人脸图像中的，面积重合度大于等于预设的重合度阈值的人脸图像确定为指示相同人脸的人脸图像。In some embodiments, determining at least one face image sequence from the image frames included in the target image frame sequence based on the obtained face position information includes: for every two adjacent image frames in the target image frame sequence, determining, among the face images in the first image frame and the face images in the second image frame of the two adjacent image frames, face images whose area coincidence degree is greater than or equal to a preset coincidence threshold as face images indicating the same face.
在一些实施例中，人脸检测模型还用于生成图像帧的关键点信息集合，其中，关键点信息用于表征人脸关键点在人脸图像中的位置；以及确定该人脸图像序列包括的每个人脸图像的质量评分，包括：基于该人脸图像序列包括的每个人脸图像的关键点信息集合，确定每个人脸图像的人脸姿态角信息；基于人脸姿态角信息，确定每个人脸图像的质量评分。In some embodiments, the face detection model is further used to generate a key point information set of an image frame, where the key point information is used to characterize the positions of face key points in a face image; and determining the quality score of each face image included in the face image sequence includes: determining face pose angle information of each face image based on the key point information set of each face image included in the face image sequence; and determining the quality score of each face image based on the face pose angle information.
在一些实施例中，基于该人脸图像序列包括的每个人脸图像的关键点信息集合，确定每个人脸图像的人脸姿态角信息，包括：基于该人脸图像序列包括的每个人脸图像的关键点信息集合，生成每个人脸图像对应的关键点特征向量；将所生成的关键点特征向量乘以预先拟合的特征矩阵，得到人脸姿态角特征向量作为人脸姿态角信息。In some embodiments, determining the face pose angle information of each face image based on the key point information set of each face image included in the face image sequence includes: generating a key point feature vector corresponding to each face image based on the key point information set of each face image included in the face image sequence; and multiplying the generated key point feature vector by a pre-fitted feature matrix to obtain a face pose angle feature vector as the face pose angle information.
在一些实施例中，基于人脸姿态角信息，确定每个人脸图像的质量评分，包括：基于每个人脸图像的关键点信息集合，确定每个人脸图像的清晰度；利用人脸姿态角信息和清晰度，确定每个人脸图像的质量评分。In some embodiments, determining the quality score of each face image based on the face pose angle information includes: determining the sharpness of each face image based on the key point information set of each face image; and determining the quality score of each face image using the face pose angle information and the sharpness.
在一些实施例中，基于每个人脸图像的关键点信息集合，确定每个人脸图像的清晰度，包括：从每个人脸图像的关键点信息集合中提取目标关键点信息；基于目标关键点信息，从每个人脸图像中确定目标区域，以及确定目标区域包括的像素点的平均像素梯度；基于平均像素梯度，确定每个人脸图像的清晰度。In some embodiments, determining the sharpness of each face image based on the key point information set of each face image includes: extracting target key point information from the key point information set of each face image; determining, based on the target key point information, a target area from each face image, and determining the average pixel gradient of the pixels included in the target area; and determining the sharpness of each face image based on the average pixel gradient.
在一些实施例中，人脸检测模型包括结构为深度可分离卷积的卷积层。In some embodiments, the face detection model includes a convolutional layer structured as a depthwise separable convolution.
在一些实施例中，人脸检测模型预先利用批标准化方式训练得到。In some embodiments, the face detection model is trained in advance using batch normalization.
第二方面，本公开实施例提供了一种用于检测人脸图像的装置，该装置包括：获取模块，用于获取目标图像帧序列；生成模块，用于对于目标图像帧序列包括的每个图像帧，将该图像帧输入预先训练的人脸检测模型，得到人脸位置信息，其中，人脸位置信息用于表征人脸图像在该图像帧中的位置；确定模块，用于基于所得到的人脸位置信息，从目标图像帧序列包括的图像帧中，确定至少一个人脸图像序列，其中，每个人脸图像序列包括的人脸图像用于指示同一个人脸；输出模块，用于对于至少一个人脸图像序列中的每个人脸图像序列，确定该人脸图像序列包括的每个人脸图像的质量评分；基于所得到的质量评分，从该人脸图像序列中提取人脸图像及输出。In a second aspect, embodiments of the present disclosure provide an apparatus for detecting face images. The apparatus includes: an acquisition module configured to acquire a target image frame sequence; a generation module configured to, for each image frame included in the target image frame sequence, input the image frame into a pre-trained face detection model to obtain face position information, where the face position information is used to characterize the position of a face image in the image frame; a determining module configured to determine, based on the obtained face position information, at least one face image sequence from the image frames included in the target image frame sequence, where the face images included in each face image sequence are used to indicate the same face; and an output module configured to, for each face image sequence in the at least one face image sequence, determine the quality score of each face image included in the face image sequence, and extract and output a face image from the face image sequence based on the obtained quality scores.
在一些实施例中，确定模块进一步配置用于：对于目标图像帧序列中的每两个相邻的图像帧，确定该两个相邻的图像帧的第一图像帧中的每个人脸图像中的特征点，以及确定第一图像帧中的每个人脸图像对应的、在第二图像帧中的预测特征点；从第二图像帧中的人脸图像中，确定包括的预测特征点的数量大于等于预设数值的人脸图像作为与在第一图像帧中的对应人脸图像指示的人脸相同的人脸图像。In some embodiments, the determining module is further configured to: for every two adjacent image frames in the target image frame sequence, determine feature points in each face image in the first image frame of the two adjacent image frames, and determine, for each face image in the first image frame, corresponding predicted feature points in the second image frame; and, from the face images in the second image frame, determine a face image that includes a number of predicted feature points greater than or equal to a preset value as a face image indicating the same face as the corresponding face image in the first image frame.
在一些实施例中，确定模块进一步配置用于：对于目标图像帧序列中的每两个相邻的图像帧，将该两个相邻的图像帧中的第一图像帧中的人脸图像与第二图像帧中的人脸图像中的，面积重合度大于等于预设的重合度阈值的人脸图像确定为指示相同人脸的人脸图像。In some embodiments, the determining module is further configured to: for every two adjacent image frames in the target image frame sequence, determine, among the face images in the first image frame and the face images in the second image frame of the two adjacent image frames, face images whose area coincidence degree is greater than or equal to a preset coincidence threshold as face images indicating the same face.
在一些实施例中，人脸检测模型还用于生成图像帧的关键点信息集合，其中，关键点信息用于表征人脸关键点在人脸图像中的位置；以及输出模块包括：第一确定单元，用于基于该人脸图像序列包括的每个人脸图像的关键点信息集合，确定每个人脸图像的人脸姿态角信息；第二确定单元，用于基于人脸姿态角信息，确定每个人脸图像的质量评分。In some embodiments, the face detection model is further used to generate a key point information set of an image frame, where the key point information is used to characterize the positions of face key points in a face image; and the output module includes: a first determining unit configured to determine face pose angle information of each face image based on the key point information set of each face image included in the face image sequence; and a second determining unit configured to determine the quality score of each face image based on the face pose angle information.
在一些实施例中，第一确定单元包括：第一生成子单元，用于基于该人脸图像序列包括的每个人脸图像的关键点信息集合，生成每个人脸图像对应的关键点特征向量；第二生成子单元，用于将所生成的关键点特征向量乘以预先拟合的特征矩阵，得到人脸姿态角特征向量作为人脸姿态角信息。In some embodiments, the first determining unit includes: a first generating subunit configured to generate a key point feature vector corresponding to each face image based on the key point information set of each face image included in the face image sequence; and a second generating subunit configured to multiply the generated key point feature vector by a pre-fitted feature matrix to obtain a face pose angle feature vector as the face pose angle information.
在一些实施例中，第二确定单元包括：第一确定子单元，用于基于每个人脸图像的关键点信息集合，确定每个人脸图像的清晰度；第二确定子单元，用于利用人脸姿态角信息和清晰度，确定每个人脸图像的质量评分。In some embodiments, the second determining unit includes: a first determining subunit configured to determine the sharpness of each face image based on the key point information set of each face image; and a second determining subunit configured to determine the quality score of each face image using the face pose angle information and the sharpness.
在一些实施例中，第一确定子单元包括：提取子模块，用于从每个人脸图像的关键点信息集合中提取目标关键点信息；第一确定子模块，用于基于目标关键点信息，从每个人脸图像中确定目标区域，以及确定目标区域包括的像素点的平均像素梯度；第二确定子模块，用于基于平均像素梯度，确定每个人脸图像的清晰度。In some embodiments, the first determining subunit includes: an extracting submodule configured to extract target key point information from the key point information set of each face image; a first determining submodule configured to determine a target area from each face image based on the target key point information, and to determine the average pixel gradient of the pixels included in the target area; and a second determining submodule configured to determine the sharpness of each face image based on the average pixel gradient.
在一些实施例中，人脸检测模型包括结构为深度可分离卷积的卷积层。In some embodiments, the face detection model includes a convolutional layer structured as a depthwise separable convolution.
在一些实施例中，人脸检测模型预先利用批标准化方式训练得到。In some embodiments, the face detection model is trained in advance using batch normalization.
第三方面，本公开实施例提供了一种电子设备，包括一个或多个处理器；存储装置，用于存储一个或多个程序，当一个或多个程序被一个或多个处理器执行，使得一个或多个处理器实现如第一方面中任一实现方式描述的方法。In a third aspect, embodiments of the present disclosure provide an electronic device, including one or more processors, and a storage apparatus for storing one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect.
第四方面,本公开实施例提供了一种计算机可读存储介质,其上存储有计算机程序,该计算机程序被处理器执行时实现如第一方面中任一实现方式描述的方法。In a fourth aspect, embodiments of the present disclosure provide a computer-readable storage medium having a computer program stored thereon, and the computer program, when executed by a processor, implements the method described in any implementation manner in the first aspect.
本公开实施例提供的用于检测人脸图像的方法和装置，通过从目标图像帧序列中确定至少一个人脸图像序列，其中，每个人脸图像序列用于指示同一个人脸，然后从每个人脸图像序列中确定每个人脸图像的质量评分，根据质量评分提取人脸图像及输出，从而实现了从目标图像序列中提取高质量的人脸图像，有利于提高利用提取出的人脸图像进行人脸识别等操作的准确性。According to the method and apparatus for detecting face images provided by the embodiments of the present disclosure, at least one face image sequence is determined from a target image frame sequence, where each face image sequence indicates the same face; the quality score of each face image in each face image sequence is then determined, and a face image is extracted and output according to the quality scores. A high-quality face image is thereby extracted from the target image sequence, which helps improve the accuracy of operations such as face recognition performed using the extracted face image.
附图说明Description of the drawings
通过阅读参照以下附图所作的对非限制性实施例所作的详细描述,本公开的其它特征、目的和优点将会变得更明显:By reading the detailed description of the non-limiting embodiments with reference to the following drawings, other features, purposes and advantages of the present disclosure will become more apparent:
图1是本公开可以应用于其中的示例性系统架构图;Fig. 1 is an exemplary system architecture diagram to which the present disclosure can be applied;
图2是根据本公开的用于检测人脸图像的方法的一个实施例的流程图;Fig. 2 is a flowchart of an embodiment of a method for detecting a face image according to the present disclosure;
图3是根据本公开的用于检测人脸图像的方法的又一个实施例的流程图;Fig. 3 is a flowchart of another embodiment of a method for detecting a face image according to the present disclosure;
图4是根据本公开的用于检测人脸图像的方法的人脸姿态角的示例性示意图；Fig. 4 is an exemplary schematic diagram of a face pose angle of the method for detecting a face image according to the present disclosure;
图5是根据本公开的用于检测人脸图像的装置的一个实施例的结构示意图;Fig. 5 is a schematic structural diagram of an embodiment of an apparatus for detecting a face image according to the present disclosure;
图6是适于用来实现本公开实施例的电子设备的计算机系统的结构示意图。Fig. 6 is a schematic structural diagram of a computer system suitable for implementing an electronic device of an embodiment of the present disclosure.
具体实施方式Detailed ways
下面结合附图和实施例对本公开作进一步的详细说明。可以理解的是,此处所描述的具体实施例仅仅用于解释相关公开,而非对该公开的限定。另外还需要说明的是,为了便于描述,附图中仅示出了与有关公开相关的部分。The present disclosure will be further described in detail below in conjunction with the drawings and embodiments. It can be understood that the specific embodiments described here are only used to explain the relevant disclosure, but not to limit the disclosure. In addition, it should be noted that, for ease of description, only the parts related to the relevant disclosure are shown in the drawings.
需要说明的是,在不冲突的情况下,本公开中的实施例及实施例中的特征可以相互组合。下面将参考附图并结合实施例来详细说明本公开。It should be noted that the embodiments in the present disclosure and the features in the embodiments can be combined with each other if there is no conflict. Hereinafter, the present disclosure will be described in detail with reference to the drawings and in conjunction with embodiments.
图1示出了可以应用本公开实施例的用于检测人脸图像的方法的示例性系统架构100。FIG. 1 shows an exemplary system architecture 100 to which embodiments of the method for detecting face images of the present disclosure can be applied.
如图1所示,系统架构100可以包括终端设备101,网络102,中间设备103和服务器104。网络102用以在终端设备101、中间设备103和服务器104之间提供通信链路的介质。网络102可以包括各种连接类型,例如有线、无线通信链路或者光纤电缆等等。As shown in FIG. 1, the system architecture 100 may include a terminal device 101, a network 102, an intermediate device 103, and a server 104. The network 102 is used to provide a medium for communication links between the terminal device 101, the intermediate device 103, and the server 104. The network 102 may include various connection types, such as wired, wireless communication links, or fiber optic cables.
服务器104可以是提供各种服务的服务器,例如对终端设备101上传的图像帧序列进行处理的图像处理服务器。图像处理服务器可以对接收的图像帧序列进行处理,并得到处理结果(例如高质量的人脸图像)。The server 104 may be a server that provides various services, for example, an image processing server that processes an image frame sequence uploaded by the terminal device 101. The image processing server can process the received image frame sequence and obtain the processing result (for example, a high-quality face image).
中间设备103可以是各种用于数据收发及处理的设备,包括但不限于以下至少一种:交换机、网关设备等。The intermediate device 103 may be various devices used for data transceiving and processing, including but not limited to at least one of the following: a switch, a gateway device, and the like.
需要说明的是，本公开实施例所提供的用于检测人脸图像的方法一般由中间设备103执行，相应地，用于检测人脸图像的装置一般设置于中间设备103中。还需要说明的是，本公开实施例所提供的用于检测人脸图像的方法还可以由终端设备101或服务器104执行，相应地，用于检测人脸图像的装置可以设置于终端设备101或服务器104中。It should be noted that the method for detecting face images provided by the embodiments of the present disclosure is generally executed by the intermediate device 103; accordingly, the apparatus for detecting face images is generally set in the intermediate device 103. It should also be noted that the method for detecting face images provided by the embodiments of the present disclosure may also be executed by the terminal device 101 or the server 104; accordingly, the apparatus for detecting face images may be set in the terminal device 101 or the server 104.
应该理解,图1中的数据服务器、网络和主服务器的数目仅仅是示意性的。根据实现需要,可以具有任意数目的终端设备、网络、中间设备和服务器。It should be understood that the numbers of data servers, networks, and main servers in Figure 1 are merely illustrative. According to implementation needs, there can be any number of terminal devices, networks, intermediate devices and servers.
继续参考图2，其示出了根据本公开的用于检测人脸图像的方法的一个实施例的流程200。该方法包括以下步骤：Continuing to refer to FIG. 2, it shows a flow 200 of an embodiment of the method for detecting face images according to the present disclosure. The method includes the following steps:
步骤201,获取目标图像帧序列。Step 201: Obtain a target image frame sequence.
在本实施例中，用于检测人脸图像的方法的执行主体（例如图1所示的中间设备或终端设备或服务器）可以获取目标图像帧序列。其中，目标图像帧序列可以是摄像头（例如上述执行主体包括的摄像头或与上述执行主体通信连接的电子设备包括的摄像头）对目标人脸（例如上述摄像头的拍摄范围内的人物的人脸）拍摄的视频包括的图像帧序列。通常，目标图像帧序列可以是摄像头当前拍摄的图像帧以及当前时间之前的预设时间段拍摄的图像帧组成的图像帧序列。In this embodiment, the execution subject of the method for detecting face images (for example, the intermediate device, terminal device, or server shown in FIG. 1) may acquire a target image frame sequence. The target image frame sequence may be the image frame sequence included in a video captured by a camera (for example, a camera included in the above-mentioned execution subject, or a camera included in an electronic device communicatively connected to the above-mentioned execution subject) of a target face (for example, the face of a person within the shooting range of the above-mentioned camera). Generally, the target image frame sequence may be composed of the image frame currently captured by the camera and the image frames captured during a preset time period before the current time.
步骤202,对于目标图像帧序列包括的每个图像帧,将该图像帧输入预先训练的人脸检测模型,得到人脸位置信息。Step 202: For each image frame included in the target image frame sequence, input the image frame into a pre-trained face detection model to obtain face position information.
在本实施例中,对于目标图像帧序列包括的每个图像帧,上述执行主体可以将该图像帧输入预先训练的人脸检测模型,得到人脸位置信息。其中,人脸检测模型用于表征图像序列和人脸位置信息的对应关系。In this embodiment, for each image frame included in the target image frame sequence, the above-mentioned execution subject may input the image frame into a pre-trained face detection model to obtain face position information. Among them, the face detection model is used to characterize the correspondence between the image sequence and the face position information.
作为示例，人脸检测模型可以是上述执行主体或其他电子设备，利用机器学习方法，将预设的训练样本集合中的训练样本包括的样本图像帧序列作为输入，将与输入的样本图像帧序列对应的样本位置信息作为期望输出，对初始模型（例如卷积神经网络、循环神经网络等）进行训练，针对每次训练输入的样本图像帧序列，可以得到实际输出。其中，实际输出是初始模型实际输出的数据，用于表征人脸图像的位置。然后，上述执行主体可以采用梯度下降法和反向传播法，基于实际输出和期望输出，调整初始模型的参数，将每次调整参数后得到的模型作为下次训练的初始模型，并在满足预设的训练结束条件的情况下，结束训练，从而训练得到人脸检测模型。这里预设的训练结束条件可以包括但不限于以下至少一项：训练时间超过预设时长；训练次数超过预设次数；利用预设的损失函数（例如交叉熵损失函数）计算所得的损失值小于预设损失值阈值。As an example, the face detection model may be obtained by the above-mentioned execution subject or another electronic device through machine learning: taking the sample image frame sequences included in the training samples of a preset training sample set as input and the sample position information corresponding to the input sample image frame sequences as expected output, an initial model (for example, a convolutional neural network or a recurrent neural network) is trained, and an actual output is obtained for each sample image frame sequence input during training. Here, the actual output is the data actually output by the initial model, which is used to characterize the position of the face image. Then, the above-mentioned execution subject may adjust the parameters of the initial model based on the actual output and the expected output using gradient descent and back propagation, take the model obtained after each parameter adjustment as the initial model for the next round of training, and end the training when a preset training end condition is met, thereby obtaining the trained face detection model. The preset training end condition may include, but is not limited to, at least one of the following: the training time exceeds a preset duration; the number of training iterations exceeds a preset number; or the loss value calculated by a preset loss function (for example, a cross-entropy loss function) is less than a preset loss value threshold.
上述初始模型可以是各种用于目标检测的模型,例如MTCNN(Multi-task convolutional neural network,多任务卷积神经网络)、RetinaFace等。The above-mentioned initial model may be various models for target detection, such as MTCNN (Multi-task Convolutional Neural Network), RetinaFace, etc.
在本实施例的一些可选的实现方式中，人脸检测模型可以包括结构为深度可分离卷积的卷积层。其中，采用深度可分离卷积（depthwise separable convolutions）结构的卷积神经网络可以降低卷积神经网络所占用的存储空间，以及能够降低卷积神经网络的计算量，从而有助于提高提取人脸图像的效率。采用深度可分离卷积结构的卷积神经网络是目前广泛研究和应用的公知技术，在此不再赘述。In some optional implementations of this embodiment, the face detection model may include a convolutional layer structured as a depthwise separable convolution. A convolutional neural network using depthwise separable convolutions can reduce the storage space occupied by the network and reduce its computational cost, thereby helping to improve the efficiency of extracting face images. Convolutional neural networks using depthwise separable convolution structures are a well-known technology that is currently widely studied and applied, and will not be described in detail here.
在本实施例的一些可选的实现方式中，人脸检测模型可以是预先利用批标准化方式训练得到模型。其中，批标准化（Batch Normalization，BN），又叫批量归一化，是一种用于改善人工神经网络的性能和稳定性的技术。采用批标准化方式训练模型，可以提升训练速度，收敛过程大大加快，另外可以简化调参过程，提高训练效率和模型处理数据的精度。In some optional implementations of this embodiment, the face detection model may be a model trained in advance using batch normalization. Batch Normalization (BN) is a technique used to improve the performance and stability of artificial neural networks. Training the model with batch normalization can increase the training speed and greatly accelerate convergence; in addition, it can simplify hyperparameter tuning and improve training efficiency and the accuracy of the model in processing data.
步骤203,基于所得到的人脸位置信息,从目标图像帧序列包括的图像帧中,确定至少一个人脸图像序列。Step 203: Determine at least one face image sequence from the image frames included in the target image frame sequence based on the obtained face position information.
在本实施例中,上述执行主体可以基于所得到的人脸位置信息,从目标图像帧序列包括的图像帧中,确定至少一个人脸图像序列。其中,每个人脸图像序列包括的人脸图像用于指示同一个人脸。In this embodiment, the above-mentioned execution subject may determine at least one face image sequence from the image frames included in the target image frame sequence based on the obtained face position information. Wherein, the face images included in each face image sequence are used to indicate the same face.
在本实施例的一些可选的实现方式中,上述执行主体可以按照如下步骤确定至少一个人脸图像序列:In some optional implementation manners of this embodiment, the above-mentioned execution subject may determine at least one face image sequence according to the following steps:
对于目标图像帧序列中的每两个相邻的图像帧,执行如下步骤:For every two adjacent image frames in the target image frame sequence, perform the following steps:
首先,确定该两个相邻的图像帧的第一图像帧中的每个人脸图像中的特征点,以及确定第一图像帧中的每个人脸图像对应的、在第二图像帧中的预测特征点。其中,第一图像帧为处于第二图像帧之前的图像帧。具体地,上述执行主体可以根据所得到的人脸位置信息,从各个图像帧中确定人脸图像,然后,利用各种方法确定人脸图像的特征点。例如采用SIFT(Scale-invariant feature transform,尺度不变特征转换)算法提取每个人脸图像的特征点。再然后,上述执行主体可以利用各种特征点预测算法(例如训练神经网络、条件随机场等),确定每个人脸图像对应的、在第二图像帧中的预测特征点。First, determine the feature point in each face image in the first image frame of the two adjacent image frames, and determine the prediction in the second image frame corresponding to each face image in the first image frame Feature points. Wherein, the first image frame is an image frame before the second image frame. Specifically, the above-mentioned execution subject may determine the face image from each image frame according to the obtained face position information, and then use various methods to determine the feature points of the face image. For example, the SIFT (Scale-invariant feature transform) algorithm is used to extract the feature points of each face image. Then, the above-mentioned execution subject can use various feature point prediction algorithms (for example, training a neural network, conditional random field, etc.) to determine the predicted feature point corresponding to each face image in the second image frame.
实践中,可以采用光流法确定人脸图像的特征点和预测特征点。其中,光流法是利用图像序列中像素在时间域上的变化以及相邻帧之间的相关性来找到相邻的两帧之间存在的对应关系,从而计算出相邻帧之间物体的运动信息的一种方法。光流法的优点在于它无须了解场景的信息,就可以准确地检测识别运动日标位置。而且光流不仅携带了运动物体的运动信息,而且还携带了有关景物三维结构的丰富信息,它能够在不知道场景的任何信息的情况下,检测出运动对象。In practice, the optical flow method can be used to determine the feature points and predict feature points of the face image. Among them, the optical flow method uses the changes in the time domain of pixels in the image sequence and the correlation between adjacent frames to find the correspondence between two adjacent frames, thereby calculating the object between adjacent frames. A method of sports information. The advantage of optical flow method is that it can accurately detect and identify the position of the moving day mark without knowing the information of the scene. Moreover, optical flow not only carries the movement information of the moving object, but also carries rich information about the three-dimensional structure of the scene. It can detect the moving object without knowing any information of the scene.
然后,从第二图像帧中的人脸图像中,确定包括的预测特征点的数量大于等 于预设数值的人脸图像作为与在第一图像帧中的对应人脸图像指示的人脸相同的人脸图像。具体地,上述执行主体可以根据第二图像帧对应的人脸位置信息确定人脸图像,以及确定每个人脸图像包括的预测特征点的数量。对于第二图像帧中的一个人脸图像,如果该人脸图像中的预测特征点的数量大于等于预设数值,且该人脸图像中的预测特征点是基于第一图像帧中的某个人脸图像生成的,则将这两个人脸图像确定为用于指示同一个人脸的人脸图像。通常,预测特征点具有对应的人脸图像标识(用于指示第一图像中的人脸图像),当第二图像中的某个人脸图像包括的预测特征点的数量大于等于预设数值时,将该人脸图像的人脸图像标识设置为与预测特征点对应的人脸图像标识。当第二图像中的某个人脸图像包括的预测特征点的数量小于预设数值时,为该人脸图像设置新的人脸图像标识。Then, from the face images in the second image frame, it is determined that the number of predicted feature points included is greater than or equal to the preset value as the face image that is the same as the face indicated by the corresponding face image in the first image frame. Face image. Specifically, the above-mentioned execution subject may determine the face image according to the face position information corresponding to the second image frame, and determine the number of predicted feature points included in each face image. For a face image in the second image frame, if the number of predicted feature points in the face image is greater than or equal to the preset value, and the predicted feature points in the face image are based on a person in the first image frame If the face image is generated, the two face images are determined as face images indicating the same face. Generally, the predicted feature point has a corresponding face image identifier (used to indicate the face image in the first image). When the number of predicted feature points included in a certain face image in the second image is greater than or equal to the preset value, The face image identifier of the face image is set as the face image identifier corresponding to the predicted feature point. When the number of predicted feature points included in a certain face image in the second image is less than the preset value, a new face image identifier is set for the face image.
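The identifier-propagation rule just described can be sketched as follows. The box format (x1, y1, x2, y2) and the helper names are illustrative assumptions; the predicted points are assumed to carry the face image identifier of the first-frame face they came from:

```python
def propagate_face_ids(boxes_frame2, predicted_points, min_points, new_id_start):
    """boxes_frame2: list of (x1, y1, x2, y2) face boxes in the second frame.
    predicted_points: list of ((x, y), face_id) pairs predicted from the first frame.
    A box inherits the dominant face_id among the predicted points it contains
    if that count reaches min_points; otherwise it is assigned a fresh identifier."""
    ids = []
    next_id = new_id_start
    for (x1, y1, x2, y2) in boxes_frame2:
        counts = {}
        for (px, py), fid in predicted_points:
            if x1 <= px <= x2 and y1 <= py <= y2:
                counts[fid] = counts.get(fid, 0) + 1
        best = max(counts, key=counts.get) if counts else None
        if best is not None and counts[best] >= min_points:
            ids.append(best)        # same face as in the first frame
        else:
            ids.append(next_id)     # a newly appearing face
            next_id += 1
    return ids
```

Boxes whose inherited identifier matches across consecutive frames are then grouped into one face image sequence.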
In some optional implementations of this embodiment, the above-mentioned execution subject may determine the at least one face image sequence according to the following steps:
For every two adjacent image frames in the target image frame sequence, determine a face image in the first image frame and a face image in the second image frame whose area overlap (also called the intersection-over-union, IoU, of their rectangles) is greater than or equal to a preset overlap threshold as face images indicating the same face.
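The IoU overlap test can be sketched directly; the (x1, y1, x2, y2) box format and the 0.5 threshold used in the example are assumptions for illustration:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle (zero width/height if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two boxes of a slowly moving face overlap heavily between adjacent frames:
same_face = iou((0, 0, 10, 10), (1, 1, 11, 11)) >= 0.5
```

A face box pair passing the threshold is treated as the same face; this matcher trades the feature-point tracking above for a much cheaper geometric test.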
Step 204: for each face image sequence in the at least one face image sequence, determine a quality score for each face image included in the face image sequence; based on the obtained quality scores, extract a face image from the face image sequence and output it.
In this embodiment, for each face image sequence in the at least one face image sequence, the above-mentioned execution subject may first determine a quality score for each face image included in the face image sequence, and then, based on the obtained quality scores, extract a face image from the face image sequence and output it. The quality score characterizes the quality of a face image: the higher the score, the higher the quality. As an example, the face image with the highest quality score may be output as the optimal face image.
The above-mentioned execution subject may determine the quality score of a face image in various ways. As an example, it may determine the sharpness of the face image and take the sharpness as the quality score. Sharpness can be obtained with existing image sharpness algorithms, which may include, but are not limited to, at least one of the following: a pixel gradient function, a gray-level variance function, a gray-level variance product function, and the like.
The above-mentioned execution subject may extract and output the sharpest face image from the face image sequence, or extract and output a preset number of face images in descending order of sharpness.
In this embodiment, the above-mentioned execution subject may output the extracted face image in various ways. For example, the extracted face image and its identifier may be displayed on a display included in the execution subject, or the extracted face image may be sent to another electronic device communicatively connected to the execution subject.
In the method provided by the above embodiment of the present disclosure, at least one face image sequence is determined from the target image frame sequence, where each face image sequence indicates the same face; a quality score is then determined for each face image in each face image sequence, and face images are extracted and output according to the quality scores. High-quality face images are thus extracted from the target image sequence, which helps improve the accuracy of operations such as face recognition performed on the extracted face images.
With further reference to FIG. 3, a flow 300 of another embodiment of the method for detecting face images according to the present disclosure is shown. The method includes the following steps:
Step 301: obtain a target image frame sequence.
In this embodiment, step 301 is substantially the same as step 201 in the embodiment corresponding to FIG. 2, and is not described again here.
Step 302: for each image frame included in the target image frame sequence, input the image frame into a pre-trained face detection model to obtain face position information.
In this embodiment, the face detection model can determine the face position information of the face images in an input image frame, and can also be used to generate a key point information set for the input image frame. In practice, the face detection model may be a model trained based on MTCNN (Multi-task Convolutional Neural Network). Such a model includes multiple cascaded sub-models, which can be used respectively to detect face positions and to determine the key point information set. The key point information in the key point information set characterizes the positions of face key points in the face image; generally, it may include the coordinates of the face key points in the image frame. Face key points are points in a face image that characterize specific positions (for example, the eyes, nose, and mouth).
Step 303: determine at least one face image sequence from the image frames included in the target image frame sequence based on the obtained face position information.
In this embodiment, step 303 is substantially the same as step 203 in the embodiment corresponding to FIG. 2, and is not described again here.
Step 304: for each face image sequence in the at least one face image sequence, determine the face pose angle information of each face image based on the key point information set of each face image included in the face image sequence, and determine the quality score of each face image based on the face pose angle information.
In this embodiment, for each face image sequence in the at least one face image sequence, the execution subject of the method for detecting face images (for example, the intermediate device, terminal device, or server shown in FIG. 1) may perform the following steps:
Step 1: determine the face pose angle information of each face image based on the key point information set of each face image included in the face image sequence.
The face pose angle information characterizes the degree of deflection of the frontal orientation of the face relative to the camera that captures it. It may include three angles, the pitch angle, the yaw angle, and the roll angle, representing up-down rotation, left-right rotation, and in-plane rotation, respectively. As shown in FIG. 4, the x-axis, y-axis, and z-axis are the three axes of a rectangular coordinate system, where the z-axis may be the optical axis of the target camera 401, and the y-axis may be the straight line that passes through the center point of the top contour of the person's head and is perpendicular to the horizontal plane when the head is not turned sideways. The pitch angle is the rotation of the face about the x-axis, the yaw angle the rotation about the y-axis, and the roll angle the rotation about the z-axis. In the rectangular coordinate system of FIG. 4, when the head turns, a ray is determined with the origin of the coordinate system as its endpoint and passing through the midpoint of the line connecting the centers of the two eyeballs; the angles between this ray and the x-axis, y-axis, and z-axis may be determined as the frontal pose angles.
The above-mentioned execution subject may determine the face pose angle information in various ways. For example, existing face pose angle estimation methods may be used to determine the face pose angle information based on the key point information set. Face pose angle estimation methods may include, but are not limited to, at least one of the following: model-based methods, appearance-based methods, classification-based methods, and the like.
In some optional implementations of this embodiment, the above-mentioned execution subject may determine the face pose angle information of each face image according to the following steps:
First, generate a key point feature vector for each face image based on the key point information set of each face image included in the face image sequence, where the elements of the key point feature vector include the coordinates of M face key points.
As an example, assuming M is 5, for a face image the corresponding key point feature vector A may be generated as [x1, x2, x3, x4, x5, y1, y2, y3, y4, y5, b], where x1–x5 are the x-axis coordinates of the five face key points, y1–y5 are their y-axis coordinates, and b is a preset bias term, for example 0. The key point feature vector A is a 1×11 vector.
Then, multiply the generated key point feature vector by a pre-fitted feature matrix to obtain a face pose angle feature vector as the face pose angle information.
Continuing the above example, assuming the feature matrix X is an 11×3 matrix, multiplying the feature vector A by the feature matrix X yields a 1×3 vector, which is the face pose angle vector comprising the pitch angle, the yaw angle, and the roll angle.
The above feature matrix may be fitted in advance as follows:
Assume there are N sample key point feature vectors, each expressed as V = [x1, x2, x3, x4, x5, y1, y2, y3, y4, y5, 1], where the value 1 is the preset bias term. The N sample key point feature vectors are combined into a feature matrix B, which is an N×11 matrix. Each sample key point feature vector corresponds to a sample face pose angle vector (comprising the pitch, yaw, and roll angles), and the N sample pose angle vectors are combined into an N×3 matrix C. The relation B×X = C is established; since B and C are known, X can be obtained by solving this relation with the least squares method.
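The least squares fit of B×X = C can be sketched with NumPy. The synthetic data below is purely illustrative: in practice B would hold annotated key point coordinates and C the corresponding annotated (pitch, yaw, roll) angles, and the made-up X_true stands in for the unknown ground-truth mapping:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200
B = np.hstack([rng.uniform(0, 112, size=(N, 10)),   # 5 x-coords and 5 y-coords per sample
               np.ones((N, 1))])                    # preset bias term (the trailing 1)
X_true = rng.normal(size=(11, 3))                   # hypothetical ground-truth mapping
C = B @ X_true                                      # N x 3 sample pose angle vectors

# Solve B @ X = C in the least squares sense for the 11 x 3 feature matrix X.
X, residuals, rank, _ = np.linalg.lstsq(B, C, rcond=None)

# A new 1 x 11 key point feature vector A then maps to its pose angles:
A = np.append(rng.uniform(0, 112, size=10), 1.0)
pose = A @ X                                        # [pitch, yaw, roll]
```

Because the fit is a single linear mapping, pose estimation at inference time reduces to one small matrix multiplication, which is far cheaper than a dedicated pose network.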
Step 2: determine the quality score of each face image based on the face pose angle information.
Specifically, the above-mentioned execution subject may determine the quality score of each face image using preset weights corresponding respectively to the three angles included in the face pose angle information. As an example, the quality score may be determined according to the following formula:
score1 = 0.2 × (15 − abs(roll)) + 0.5 × (15 − abs(yaw)) + 0.3 × (15 − abs(pitch))        Formula (1)
Here, score1 is the quality score of the face image; pitch, yaw, and roll are the pitch, yaw, and roll angles, respectively; 0.2, 0.5, and 0.3 are the weights corresponding to the three angles; abs() takes the absolute value of the enclosed angle; and 15 is the set angle threshold, i.e., when a pose angle exceeds 15 degrees, the corresponding term becomes negative.
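Formula (1) translates directly into code. The function and parameter names below are illustrative; the weights and the 15-degree threshold are the example values given above:

```python
def pose_quality_score(pitch, yaw, roll, threshold=15.0,
                       w_roll=0.2, w_yaw=0.5, w_pitch=0.3):
    """Formula (1): each pose angle contributes its weighted margin below the
    15-degree threshold, and a term goes negative once its angle exceeds it."""
    return (w_roll * (threshold - abs(roll))
            + w_yaw * (threshold - abs(yaw))
            + w_pitch * (threshold - abs(pitch)))

# A perfectly frontal face scores 15; a strongly yawed face is penalized most,
# consistent with yaw carrying the largest weight (0.5).
frontal = pose_quality_score(pitch=0, yaw=0, roll=0)
turned = pose_quality_score(pitch=0, yaw=30, roll=0)
```

With these example weights, a face yawed 30 degrees loses the entire frontal margin, so near-frontal faces dominate the ranking.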
In some optional implementations of this embodiment, the above-mentioned execution subject may determine the quality score of a face image according to the following steps:
First, determine the sharpness of each face image based on its key point information set. Sharpness can be obtained with existing image sharpness algorithms, which may include, but are not limited to, at least one of the following: a pixel gradient function, a gray-level variance function, a gray-level variance product function, and the like. Generally, the sharpness can be normalized to the interval [0, 1].
Then, determine the quality score of each face image using the face pose angle information and the sharpness. Specifically, the above-mentioned execution subject may use the face pose angle information to determine a first score, score1, according to Formula (1) above, take the sharpness as a second score, score2, and determine the quality score of the face image based on preset weights. As an example, the quality score of a face image may be determined according to the following Formula (2):
score = 0.6 × score1 + 0.4 × score2        Formula (2)
Here, score is the quality score, and 0.6 and 0.4 are the preset weights.
It should be noted that the first score and the second score share the same numerical range, for example both in [0, 1] or both in [0, 100].
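Formula (2) is then a straightforward weighted blend. The sketch below assumes both scores have already been normalized to the same range, as the note above requires, and the candidate score pairs are made-up examples:

```python
def combined_quality_score(score1, score2, w_pose=0.6, w_sharpness=0.4):
    """Formula (2): blend the pose-angle score and the sharpness score.
    Both inputs must share the same numerical range, e.g. [0, 1]."""
    return w_pose * score1 + w_sharpness * score2

# Picking the best face image in a sequence by combined score:
candidates = [(0.9, 0.4), (0.5, 0.95)]   # (score1, score2) per face image
best = max(candidates, key=lambda s: combined_quality_score(*s))
```

Here the well-posed but moderately blurry face (0.9, 0.4) wins over the sharp but turned one, reflecting the heavier 0.6 weight on pose.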
In some optional implementations of this embodiment, the above-mentioned execution subject may determine the sharpness of each face image according to the following steps:
First, extract target key point information from the key point information set of each face image. The target key point information may be preset key point information characterizing specific positions of the face; generally, it may be the key point information indicating the person's eyes and mouth.
Then, based on the target key point information, determine a target region in each face image, and determine the average pixel gradient of the pixels included in the target region. The target region may be a region that includes the face key points indicated by the target key point information, for example, the smallest rectangle containing those key points.
The above-mentioned execution subject may use existing methods for determining pixel gradients to determine the pixel gradient of each pixel in the target region, and average the determined pixel gradients to obtain the average pixel gradient.
Finally, determine the sharpness of each face image based on the average pixel gradient. Specifically, the sum S of the averages of the horizontal and vertical gradients of each pixel in the target region can be computed, and then the average gradient avg_g = S / (w × h × 255.0), where w and h are the width and height of the target region.
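The avg_g computation can be sketched as follows. Forward differences are an assumption here (the text does not fix the gradient operator), and the tiny test patches are illustrative:

```python
def sharpness_from_avg_gradient(region):
    """Average-gradient sharpness of a grayscale region (list of rows, values
    in 0-255). For each pixel, take the mean of its horizontal and vertical
    forward-difference gradients, sum these means into S, and normalize:
    avg_g = S / (w * h * 255.0), so a flat patch scores 0."""
    h = len(region)
    w = len(region[0])
    S = 0.0
    for y in range(h):
        for x in range(w):
            gx = abs(region[y][x + 1] - region[y][x]) if x + 1 < w else 0.0
            gy = abs(region[y + 1][x] - region[y][x]) if y + 1 < h else 0.0
            S += (gx + gy) / 2.0
    return S / (w * h * 255.0)

flat = sharpness_from_avg_gradient([[128, 128], [128, 128]])   # no edges
edges = sharpness_from_avg_gradient([[0, 255], [255, 0]])      # maximal contrast
```

Dividing by 255.0 keeps avg_g roughly within [0, 1], matching the normalized range expected by Formula (2).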
In the prior art, improving the accuracy of face image detection usually requires detecting images with a relatively deep neural network; if the network is too deep, extracting image features takes longer and overall inference slows down. Because the above optional implementation determines the quality score of a face image by combining the face pose angles with the image sharpness, it processes images faster and occupies fewer hardware resources than a deeper neural network. Therefore, the steps and optional implementations of the embodiments of the present disclosure can be combined and applied at the front end of a face image detection system (for example, the terminal device or intermediate device shown in FIG. 1), reducing the load on the back-end server.
Step 3: based on the obtained quality scores, extract a face image from the face image sequence and output it.
Step 3 is substantially the same as the method of extracting and outputting a face image in step 204 of the embodiment corresponding to FIG. 2, and is not described again here.
As can be seen from FIG. 3, compared with the embodiment corresponding to FIG. 2, the flow 300 of the method for detecting face images in this embodiment highlights the step of determining the quality score of each face image based on the face pose angle information. The face pose angle information can thus be used to further improve the accuracy of the determined quality scores, which helps further improve the quality of the extracted face images.
进一步参考图5,作为对上述各图所示方法的实现,本公开提供了一种用于检测人脸图像的装置的一个实施例,该装置实施例与图2所示的方法实施例相对应,该装置具体可以应用于各种电子设备中。With further reference to FIG. 5, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of a device for detecting a face image. The device embodiment corresponds to the method embodiment shown in FIG. 2 , The device can be specifically applied to various electronic equipment.
如图5所示,本实施例的用于检测人脸图像的装置500包括:获取模块501,用于获取目标图像帧序列;生成模块502,用于对于目标图像帧序列包括的每个图像帧,将该图像帧输入预先训练的人脸检测模型,得到人脸位置信息,其中,人脸位置信息用于表征人脸图像在该图像帧中的位置;确定模块503,用于基于所得到的人脸位置信息,从目标图像帧序列包括的图像帧中,确定至少一个人脸图像序列,其中,每个人脸图像序列包括的人脸图像用于指示同一个人脸;输出模块504,用于对于至少一个人脸图像序列中的每个人脸图像序列,确定该人脸图像序列包括的每个人脸图像的质量评分;基于所得到的质量评分,从该人脸图像序列中提取人脸图像及输出。As shown in FIG. 5, the apparatus 500 for detecting a face image of this embodiment includes: an acquisition module 501, which is used to acquire a target image frame sequence; and a generation module 502, which is used for each image frame included in the target image frame sequence. , Input the image frame into a pre-trained face detection model to obtain face position information, where the face position information is used to characterize the position of the face image in the image frame; the determination module 503 is used to The face position information is used to determine at least one face image sequence from the image frames included in the target image frame sequence, where the face image included in each face image sequence is used to indicate the same face; the output module 504 is used to For each face image sequence in at least one face image sequence, determine the quality score of each face image included in the face image sequence; based on the obtained quality score, extract the face image from the face image sequence and output .
在本实施例中,用于检测人脸图像的方法的获取模块501可以获取目标图像帧序列。其中,目标图像帧序列可以是摄像头对目标人脸(例如即上述摄像头的拍摄范围内的人物的人脸)拍摄的视频包括的图像帧序列。通常,目标图像帧可以是摄像头当前拍摄的图像帧以及当前时间之前的预设时间段拍摄的图像帧组成的图像帧序列。In this embodiment, the acquiring module 501 of the method for detecting a face image can acquire a target image frame sequence. The target image frame sequence may be a sequence of image frames included in a video captured by the camera on the target face (for example, the face of a person within the shooting range of the camera). Generally, the target image frame may be an image frame sequence composed of an image frame currently shot by the camera and an image frame shot by a preset time period before the current time.
在本实施例中,对于目标图像帧序列包括的每个图像帧,上述生成模块502可以将该图像帧输入预先训练的人脸检测模型,得到人脸位置信息。其中,人脸检测模型用于表征图像序列和人脸位置信息的对应关系。In this embodiment, for each image frame included in the target image frame sequence, the aforementioned generating module 502 can input the image frame into a pre-trained face detection model to obtain face position information. Among them, the face detection model is used to characterize the correspondence between the image sequence and the face position information.
作为示例,人脸检测模型可以是上述装置500或其他电子设备,利用机器学习方法,将预设的训练样本集合中的训练样本包括的样本图像帧序列作为输入,将与输入的样本图像帧序列对应的样本位置信息作为期望输出,对初始模型(例 如卷积神经网络、循环神经网络等)进行训练,针对每次训练输入的样本图像帧序列,可以得到实际输出。其中,实际输出是初始模型实际输出的数据,用于表征人脸图像的位置。然后,用于训练人脸检测模型的执行主体可以采用梯度下降法和反向传播法,基于实际输出和期望输出,调整初始模型的参数,将每次调整参数后得到的模型作为下次训练的初始模型,并在满足预设的训练结束条件的情况下,结束训练,从而训练得到语音识别模型。这里预设的训练结束条件可以包括但不限于以下至少一项:训练时间超过预设时长;训练次数超过预设次数;利用预设的损失函数(例如交叉熵损失函数)计算所得的损失值小于预设损失值阈值。As an example, the face detection model may be the aforementioned device 500 or other electronic equipment, using a machine learning method to take as input the sample image frame sequence included in the training samples in the preset training sample set, and compare it with the input sample image frame sequence The corresponding sample position information is used as the expected output, and the initial model (for example, convolutional neural network, cyclic neural network, etc.) is trained, and the actual output can be obtained for the sample image frame sequence input for each training. Among them, the actual output is the data actually output by the initial model, which is used to characterize the position of the face image. Then, the executive body used to train the face detection model can use the gradient descent method and the back propagation method to adjust the parameters of the initial model based on the actual output and the expected output, and use the model obtained after each adjustment of the parameters as the next training Initial model, and when the preset training termination conditions are met, the training ends, so that the training obtains the speech recognition model. The preset training end conditions here may include but are not limited to at least one of the following: training time exceeds the preset duration; training times exceeds the preset number of times; the loss value calculated by using the preset loss function (for example, the cross-entropy loss function) is less than Preset loss value threshold.
上述初始模型可以是各种用于目标检测的模型,例如MTCNN(Multi-task convolutional neural network,多任务卷积神经网络)、RetinaFace等。The above-mentioned initial model may be various models for target detection, such as MTCNN (Multi-task Convolutional Neural Network), RetinaFace, etc.
在本实施例中,确定模块503可以基于所得到的人脸位置信息,按照各种方式从目标图像帧序列包括的图像帧中,确定至少一个人脸图像序列。其中,每个人脸图像序列包括的人脸图像用于指示同一个人脸。In this embodiment, the determining module 503 may determine at least one face image sequence from the image frames included in the target image frame sequence in various ways based on the obtained face position information. Wherein, the face images included in each face image sequence are used to indicate the same face.
在本实施例中,对于上述至少一个人脸图像序列中的每个人脸图像序列,上述输出模块504可以首先确定该人脸图像序列包括的每个人脸图像的质量评分。然后,基于所得到的质量评分,从该人脸图像序列中提取人脸图像及输出。其中,人脸图像的质量评分可以用于表征人脸图像的质量,即,质量评分越高,表示人脸图像的质量越高。通常,可以将质量评分最大的人脸图像作为最优人脸图像输出。In this embodiment, for each face image sequence in the at least one face image sequence, the output module 504 may first determine the quality score of each face image included in the face image sequence. Then, based on the obtained quality score, the face image is extracted from the face image sequence and output. Among them, the quality score of the face image can be used to characterize the quality of the face image, that is, the higher the quality score, the higher the quality of the face image. Generally, the face image with the highest quality score can be output as the optimal face image.
上述输出模块504可以按照各种方法确定人脸图像的质量评分。作为示例,上述输出模块504可以确定人脸图像的清晰度,将清晰度确定为质量评分。其中,清晰度可以利用现有的确定图像清晰度的算法得到。例如,确定图像清晰度的算法可以包括但不限于以下至少一种:像素梯度函数、灰度方差函数、灰度方差乘积函数等。The aforementioned output module 504 can determine the quality score of the face image according to various methods. As an example, the aforementioned output module 504 may determine the sharpness of the face image, and determine the sharpness as the quality score. Among them, the definition can be obtained by using existing algorithms for determining the definition of an image. For example, the algorithm for determining the definition of an image may include, but is not limited to, at least one of the following: pixel gradient function, gray-scale variance function, gray-scale variance product function, and the like.
上述输出模块504可以从人脸图像序列中,提取清晰度最大的人脸图像并输出。或者,按照清晰度由大到小的顺序,提取预设数量个人脸图像并输出。The above-mentioned output module 504 can extract and output the face image with the greatest clarity from the face image sequence. Or, extract and output a preset number of face images in descending order of sharpness.
In some optional implementations of this embodiment, the determining module 503 is further configured to: for every two adjacent image frames in the target image frame sequence, determine the feature points of each face image in the first of the two adjacent frames, and determine the predicted feature points in the second frame corresponding to each face image in the first frame; then, among the face images in the second frame, determine a face image containing a number of predicted feature points greater than or equal to a preset value as a face image indicating the same face as the corresponding face image in the first frame.
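This matching rule can be sketched as follows, assuming the predicted feature points have already been computed (for example by optical flow, which this sketch omits); the names and the counting threshold are illustrative, not the embodiment's parameters:

```python
def count_points_in_box(points, box):
    # Count predicted feature points falling inside an (x1, y1, x2, y2) box.
    x1, y1, x2, y2 = box
    return sum(1 for (x, y) in points if x1 <= x <= x2 and y1 <= y <= y2)

def match_faces(predicted_points_per_face, boxes_frame2, min_points=3):
    """Map each face index in the first frame to the index of the first box
    in the second frame containing at least min_points predicted points."""
    matches = {}
    for i, pts in enumerate(predicted_points_per_face):
        for j, box in enumerate(boxes_frame2):
            if count_points_in_box(pts, box) >= min_points:
                matches[i] = j
                break
    return matches
```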
In some optional implementations of this embodiment, the determining module 503 is further configured to: for every two adjacent image frames in the target image frame sequence, determine a face image in the first frame and a face image in the second frame whose area coincidence is greater than or equal to a preset coincidence threshold as face images indicating the same face.
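A common way to realize such an area-coincidence test is intersection-over-union of the two face boxes; the sketch below assumes (x1, y1, x2, y2) boxes and an illustrative threshold, not the embodiment's specific parameters:

```python
def overlap_ratio(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def same_face(box1, box2, threshold=0.5):
    # Boxes in adjacent frames are taken to show the same face when
    # their area coincidence reaches the threshold.
    return overlap_ratio(box1, box2) >= threshold
```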
In some optional implementations of this embodiment, the face detection model is further used to generate a key point information set for an image frame, where key point information characterizes the positions of facial key points in a face image. The output module 504 includes: a first determining unit (not shown), configured to determine face pose angle information for each face image based on the key point information set of each face image included in the face image sequence; and a second determining unit (not shown), configured to determine the quality score of each face image based on the face pose angle information.
In some optional implementations of this embodiment, the first determining unit (not shown) includes: a first generating subunit (not shown), configured to generate a key point feature vector for each face image based on its key point information set; and a second generating subunit (not shown), configured to multiply the generated key point feature vector by a pre-fitted feature matrix to obtain a face pose angle feature vector as the face pose angle information.
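The linear mapping described here (a keypoint feature vector multiplied by a pre-fitted matrix) can be sketched as below; the matrix contents and the (yaw, pitch, roll) reading of the output are assumptions for illustration, not the disclosed fitted parameters:

```python
import numpy as np

def pose_from_keypoints(keypoints, fitted_matrix):
    """keypoints: (N, 2) array of key point coordinates; fitted_matrix:
    (2N, 3) pre-fitted matrix. Returns a pose angle feature vector,
    here read as (yaw, pitch, roll)."""
    feature = np.asarray(keypoints, dtype=float).reshape(-1)
    return feature @ fitted_matrix
```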
In some optional implementations of this embodiment, the second determining unit includes: a first determining subunit (not shown), configured to determine the sharpness of each face image based on its key point information set; and a second determining subunit (not shown), configured to determine the quality score of each face image using the face pose angle information and the sharpness.
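One plausible way to combine pose angle information and sharpness into a single quality score is a weighted trade-off; the absolute-angle penalty and the weights below are illustrative assumptions, not the disclosed formula:

```python
def quality_score(pose_angles, sharpness, w_pose=1.0, w_sharp=1.0):
    # Penalize large head rotations and reward sharpness, so frontal,
    # sharp faces score highest. Weights are illustrative.
    yaw, pitch, roll = pose_angles
    pose_penalty = abs(yaw) + abs(pitch) + abs(roll)
    return w_sharp * sharpness - w_pose * pose_penalty
```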
In some optional implementations of this embodiment, the first determining subunit includes: an extraction submodule (not shown), configured to extract target key point information from the key point information set of each face image; a first determining submodule (not shown), configured to determine a target region in each face image based on the target key point information and to determine the average pixel gradient of the pixels included in the target region; and a second determining submodule (not shown), configured to determine the sharpness of each face image based on the average pixel gradient.
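The target-region average pixel gradient could be computed as in this sketch, assuming the target region is taken as the bounding box of the target key points; both that choice and the function name are illustrative:

```python
import numpy as np

def region_sharpness(gray, target_keypoints):
    # Take the target region as the bounding box of the target key points
    # (given as (x, y) pairs), then use the mean pixel-gradient magnitude
    # inside that region as the sharpness value.
    pts = np.asarray(target_keypoints, dtype=float)
    x1, y1 = pts.min(axis=0)
    x2, y2 = pts.max(axis=0) + 1
    region = gray[int(y1):int(y2), int(x1):int(x2)].astype(float)
    gy, gx = np.gradient(region)
    return float(np.mean(np.hypot(gx, gy)))
```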
In some optional implementations of this embodiment, the face detection model includes a convolutional layer structured as a depthwise separable convolution.
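For reference, a depthwise separable convolution factors a standard convolution into a per-channel (depthwise) convolution followed by a 1x1 (pointwise) convolution, which is what reduces its computational cost; a minimal NumPy sketch with valid padding and stride 1, not tied to any particular deep-learning framework:

```python
import numpy as np

def depthwise_separable_conv(x, depthwise_k, pointwise_w):
    """x: (C, H, W) input; depthwise_k: (C, kh, kw), one filter per input
    channel; pointwise_w: (C_out, C) 1x1 convolution weights."""
    c, h, w = x.shape
    _, kh, kw = depthwise_k.shape
    oh, ow = h - kh + 1, w - kw + 1
    depthwise = np.zeros((c, oh, ow))
    for ch in range(c):  # depthwise: each channel convolved independently
        for i in range(oh):
            for j in range(ow):
                patch = x[ch, i:i + kh, j:j + kw]
                depthwise[ch, i, j] = np.sum(patch * depthwise_k[ch])
    # Pointwise: a 1x1 convolution that mixes channels.
    out = pointwise_w @ depthwise.reshape(c, -1)
    return out.reshape(-1, oh, ow)
```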
In some optional implementations of this embodiment, the face detection model is trained in advance using batch normalization.
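Batch normalization, as used in such training, normalizes each feature over the mini-batch and then applies learned scale and shift parameters; a minimal sketch for (N, D) activations:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization for x of shape (N, D): normalize
    each feature over the batch, then scale by gamma and shift by beta."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```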
The apparatus provided by the above embodiment of the present disclosure determines at least one face image sequence from a target image frame sequence, where each face image sequence indicates the same face; it then determines a quality score for each face image in each sequence and, according to the quality scores, extracts and outputs face images. High-quality face images are thereby extracted from the target image sequence, which helps improve the accuracy of subsequent operations, such as face recognition, performed with the extracted images.
Reference is now made to FIG. 6, which shows a schematic structural diagram of a computer system 600 suitable for implementing an electronic device of the embodiments of the present disclosure. The electronic device shown in FIG. 6 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present disclosure.
As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to an embodiment of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above-described functions defined in the method of the present disclosure are performed.
It should be noted that the computer-readable storage medium described in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by, or in combination with, an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, RF, and the like, or any suitable combination of the above.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, it may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The modules involved in the embodiments described in the present disclosure may be implemented in software or in hardware. The described modules may also be provided in a processor; for example, a processor may be described as including an acquisition module, a generation module, a determining module, and an output module. The names of these modules do not, in some cases, limit the modules themselves; for example, the acquisition module may also be described as "a module for acquiring a target image frame sequence".
As another aspect, the present disclosure also provides a computer-readable storage medium, which may be included in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer-readable storage medium carries one or more programs that, when executed by the electronic device, cause the electronic device to: acquire a target image frame sequence; for each image frame included in the target image frame sequence, input the image frame into a pre-trained face detection model to obtain face position information; based on the obtained face position information, determine at least one face image sequence from the image frames included in the target image frame sequence, where the face images included in each face image sequence indicate the same face; for each face image sequence in the at least one face image sequence, determine a quality score for each face image included in that sequence; and, based on the obtained quality scores, extract a face image from the sequence and output it.
The above description comprises only preferred embodiments of the present disclosure and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of the disclosure involved is not limited to technical solutions formed by the specific combination of the above technical features; it should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the disclosed concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in this application.

Claims (12)

  1. A method for detecting a face image, characterized in that the method comprises: acquiring a target image frame sequence; for each image frame included in the target image frame sequence, inputting the image frame into a pre-trained face detection model to obtain face position information; based on the obtained face position information, determining at least one face image sequence from the image frames included in the target image frame sequence, wherein the face images included in each face image sequence indicate the same face; for each face image sequence in the at least one face image sequence, determining a quality score of each face image included in the face image sequence; and, based on the obtained quality scores, extracting a face image from the face image sequence and outputting it.
  2. The method according to claim 1, wherein determining at least one face image sequence from the image frames included in the target image frame sequence based on the obtained face position information comprises: for every two adjacent image frames in the target image frame sequence, determining the feature points of each face image in the first of the two adjacent frames, and determining the predicted feature points in the second frame corresponding to each face image in the first frame; and, among the face images in the second frame, determining a face image containing a number of predicted feature points greater than or equal to a preset value as a face image indicating the same face as the corresponding face image in the first frame.
  3. The method according to claim 1, wherein determining at least one face image sequence from the image frames included in the target image frame sequence based on the obtained face position information comprises: for every two adjacent image frames in the target image frame sequence, determining a face image in the first frame and a face image in the second frame whose area coincidence is greater than or equal to a preset coincidence threshold as face images indicating the same face.
  4. The method according to claim 1, wherein the face detection model is further used to generate a key point information set for an image frame, the key point information characterizing the positions of facial key points in a face image; and wherein determining the quality score of each face image included in the face image sequence comprises: determining face pose angle information for each face image based on the key point information set of each face image included in the face image sequence; and determining the quality score of each face image based on the face pose angle information.
  5. The method according to claim 4, wherein determining the face pose angle information of each face image based on the key point information set of each face image included in the face image sequence comprises: generating a key point feature vector corresponding to each face image based on its key point information set; and multiplying the generated key point feature vector by a pre-fitted feature matrix to obtain a face pose angle feature vector as the face pose angle information.
  6. The method according to claim 4, wherein determining the quality score of each face image based on the face pose angle information comprises: determining the sharpness of each face image based on its key point information set; and determining the quality score of each face image using the face pose angle information and the sharpness.
  7. The method according to claim 6, wherein determining the sharpness of each face image based on its key point information set comprises: extracting target key point information from the key point information set of each face image; determining a target region in each face image based on the target key point information, and determining the average pixel gradient of the pixels included in the target region; and determining the sharpness of each face image based on the average pixel gradient.
  8. The method according to any one of claims 1-7, wherein the face detection model comprises a convolutional layer structured as a depthwise separable convolution.
  9. The method according to any one of claims 1-7, wherein the face detection model is trained in advance using batch normalization.
  10. An apparatus for detecting a face image, characterized in that the apparatus comprises: an acquisition module, configured to acquire a target image frame sequence; a generation module, configured to, for each image frame included in the target image frame sequence, input the image frame into a pre-trained face detection model to obtain face position information, the face position information characterizing the position of a face image in the image frame; a determining module, configured to determine, based on the obtained face position information, at least one face image sequence from the image frames included in the target image frame sequence, wherein the face images included in each face image sequence indicate the same face; and an output module, configured to, for each face image sequence in the at least one face image sequence, determine a quality score of each face image included in the face image sequence and, based on the obtained quality scores, extract a face image from the face image sequence and output it.
  11. An electronic device, comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-9.
  12. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-9.
PCT/CN2019/096575 2019-06-03 2019-07-18 Face image detection method and apparatus WO2020244032A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910475881.5A CN110276277A (en) 2019-06-03 2019-06-03 Method and apparatus for detecting facial image
CN201910475881.5 2019-06-03

Publications (1)

Publication Number Publication Date
WO2020244032A1 (en)

Family

ID=67960421


Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112528903A (en) * 2020-12-18 2021-03-19 平安银行股份有限公司 Face image acquisition method and device, electronic equipment and medium
CN112541433A (en) * 2020-12-11 2021-03-23 中国电子技术标准化研究院 Two-stage human eye pupil accurate positioning method based on attention mechanism
CN112560725A (en) * 2020-12-22 2021-03-26 四川云从天府人工智能科技有限公司 Key point detection model, detection method and device thereof and computer storage medium
CN112597944A (en) * 2020-12-29 2021-04-02 北京市商汤科技开发有限公司 Key point detection method and device, electronic equipment and storage medium
CN112633250A (en) * 2021-01-05 2021-04-09 北京经纬信息技术有限公司 Face recognition detection experimental method and device
CN112651321A (en) * 2020-12-21 2021-04-13 浙江商汤科技开发有限公司 File processing method and device and server
CN112926542A (en) * 2021-04-09 2021-06-08 博众精工科技股份有限公司 Performance detection method and device, electronic equipment and storage medium
CN113379877A (en) * 2021-06-08 2021-09-10 北京百度网讯科技有限公司 Face video generation method and device, electronic equipment and storage medium
CN113489897A (en) * 2021-06-28 2021-10-08 杭州逗酷软件科技有限公司 Image processing method and related device
CN113536900A (en) * 2021-05-31 2021-10-22 浙江大华技术股份有限公司 Method and device for evaluating quality of face image and computer readable storage medium
CN113627394A (en) * 2021-09-17 2021-11-09 平安银行股份有限公司 Face extraction method and device, electronic equipment and readable storage medium
CN113627290A (en) * 2021-07-27 2021-11-09 歌尔科技有限公司 Sound box control method and device, sound box and readable storage medium
CN116432152A (en) * 2023-04-18 2023-07-14 山东广电信通网络运营有限公司 Cross-platform collaborative manufacturing system

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110796108B (en) * 2019-11-04 2022-05-17 北京锐安科技有限公司 Method, device and equipment for detecting face quality and storage medium
CN110688994B (en) * 2019-12-10 2020-03-27 南京甄视智能科技有限公司 Human face detection method and device based on cross-over ratio and multi-model fusion and computer readable storage medium
CN113158706A (en) * 2020-01-07 2021-07-23 北京地平线机器人技术研发有限公司 Face snapshot method, device, medium and electronic equipment
CN111310562B (en) * 2020-01-10 2020-11-27 中国平安财产保险股份有限公司 Vehicle driving risk management and control method based on artificial intelligence and related equipment thereof
CN112188091B (en) * 2020-09-24 2022-05-06 北京达佳互联信息技术有限公司 Face information identification method and device, electronic equipment and storage medium
CN112183490A (en) * 2020-11-04 2021-01-05 北京澎思科技有限公司 Face snapshot picture filing method and device
CN112418098A (en) * 2020-11-24 2021-02-26 深圳云天励飞技术股份有限公司 Training method of video structured model and related equipment
WO2022133993A1 (en) * 2020-12-25 2022-06-30 京东方科技集团股份有限公司 Method and device for performing face registration on the basis of video data, and electronic whiteboard
CN115066712A (en) * 2020-12-28 2022-09-16 京东方科技集团股份有限公司 Identity recognition method, terminal, server and system
CN112954450B (en) * 2021-02-02 2022-06-17 北京字跳网络技术有限公司 Video processing method and device, electronic equipment and storage medium
CN113283319A (en) * 2021-05-13 2021-08-20 Oppo广东移动通信有限公司 Method and device for evaluating face ambiguity, medium and electronic equipment
CN113571051A (en) * 2021-06-11 2021-10-29 天津大学 Voice recognition system and method for lip voice activity detection and result error correction
CN113486829B (en) * 2021-07-15 2023-11-07 京东科技控股股份有限公司 Face living body detection method and device, electronic equipment and storage medium
CN113674224A (en) * 2021-07-29 2021-11-19 浙江大华技术股份有限公司 Monitoring point position management method and device
CN113793368A (en) * 2021-09-29 2021-12-14 北京朗达和顺科技有限公司 Video face privacy method based on optical flow
CN114332082B (en) * 2022-03-07 2022-05-27 飞狐信息技术(天津)有限公司 Definition evaluation method and device, electronic equipment and computer storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108090403A (en) * 2016-11-22 2018-05-29 上海银晨智能识别科技有限公司 Face dynamic identification method and system based on 3D convolutional neural network
CN109657612A (en) * 2018-12-19 2019-04-19 苏州纳智天地智能科技有限公司 A kind of quality-ordered system and its application method based on facial image feature
CN109753917A (en) * 2018-12-29 2019-05-14 中国科学院重庆绿色智能技术研究院 Face quality optimization method, system, computer readable storage medium and equipment
CN109784230A (en) * 2018-12-29 2019-05-21 中国科学院重庆绿色智能技术研究院 A kind of facial video image quality optimization method, system and equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012034174A1 (en) * 2010-09-14 2012-03-22 Dynamic Digital Depth Research Pty Ltd A method for enhancing depth maps
CN104517104B (en) * 2015-01-09 2018-08-10 苏州科达科技股份有限公司 A kind of face identification method and system based under monitoring scene
CN108256477B (en) * 2018-01-17 2023-04-07 百度在线网络技术(北京)有限公司 Method and device for detecting human face

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112541433A (en) * 2020-12-11 2021-03-23 中国电子技术标准化研究院 Two-stage human eye pupil accurate positioning method based on attention mechanism
CN112541433B (en) * 2020-12-11 2024-04-19 中国电子技术标准化研究院 Two-stage human eye pupil accurate positioning method based on attention mechanism
CN112528903B (en) * 2020-12-18 2023-10-31 平安银行股份有限公司 Face image acquisition method and device, electronic equipment and medium
CN112528903A (en) * 2020-12-18 2021-03-19 平安银行股份有限公司 Face image acquisition method and device, electronic equipment and medium
CN112651321A (en) * 2020-12-21 2021-04-13 浙江商汤科技开发有限公司 File processing method and device and server
CN112560725A (en) * 2020-12-22 2021-03-26 四川云从天府人工智能科技有限公司 Key point detection model, detection method and device thereof and computer storage medium
CN112597944B (en) * 2020-12-29 2024-06-11 北京市商汤科技开发有限公司 Key point detection method and device, electronic equipment and storage medium
CN112597944A (en) * 2020-12-29 2021-04-02 北京市商汤科技开发有限公司 Key point detection method and device, electronic equipment and storage medium
CN112633250A (en) * 2021-01-05 2021-04-09 北京经纬信息技术有限公司 Face recognition detection experimental method and device
CN112926542A (en) * 2021-04-09 2021-06-08 博众精工科技股份有限公司 Performance detection method and device, electronic equipment and storage medium
CN112926542B (en) * 2021-04-09 2024-04-30 博众精工科技股份有限公司 Performance detection method and device, electronic equipment and storage medium
CN113536900A (en) * 2021-05-31 2021-10-22 浙江大华技术股份有限公司 Method and device for evaluating quality of face image and computer readable storage medium
CN113379877A (en) * 2021-06-08 2021-09-10 北京百度网讯科技有限公司 Face video generation method and device, electronic equipment and storage medium
CN113379877B (en) * 2021-06-08 2023-07-28 北京百度网讯科技有限公司 Face video generation method and device, electronic equipment and storage medium
CN113489897A (en) * 2021-06-28 2021-10-08 杭州逗酷软件科技有限公司 Image processing method and related device
CN113627290A (en) * 2021-07-27 2021-11-09 歌尔科技有限公司 Sound box control method and device, sound box and readable storage medium
CN113627394B (en) * 2021-09-17 2023-11-17 平安银行股份有限公司 Face extraction method and device, electronic equipment and readable storage medium
CN113627394A (en) * 2021-09-17 2021-11-09 平安银行股份有限公司 Face extraction method and device, electronic equipment and readable storage medium
CN116432152A (en) * 2023-04-18 2023-07-14 山东广电信通网络运营有限公司 Cross-platform collaborative manufacturing system

Also Published As

Publication number Publication date
CN110276277A (en) 2019-09-24

Similar Documents

Publication Publication Date Title
WO2020244032A1 (en) Face image detection method and apparatus
WO2020199931A1 (en) Face key point detection method and apparatus, and storage medium and electronic device
US11928800B2 (en) Image coordinate system transformation method and apparatus, device, and storage medium
WO2018019126A1 (en) Video category identification method and device, data processing device and electronic apparatus
US11238272B2 (en) Method and apparatus for detecting face image
WO2020140723A1 (en) Method, apparatus and device for detecting dynamic facial expression, and storage medium
WO2018188453A1 (en) Method for determining human face area, storage medium, and computer device
US20220270348A1 (en) Face recognition method and apparatus, computer device, and storage medium
WO2019238114A1 (en) Three-dimensional dynamic model reconstruction method, apparatus and device, and storage medium
CN111160202B (en) Identity verification method, device, equipment and storage medium based on AR equipment
WO2022041830A1 (en) Pedestrian re-identification method and device
CN112149615B (en) Face living body detection method, device, medium and electronic equipment
WO2021083069A1 (en) Method and device for training face swapping model
CN112132847A (en) Model training method, image segmentation method, device, electronic device and medium
WO2016165614A1 (en) Method for expression recognition in instant video and electronic equipment
WO2020124994A1 (en) Liveness detection method and apparatus, electronic device, and storage medium
WO2020124993A1 (en) Liveness detection method and apparatus, electronic device, and storage medium
CN111222459A (en) Visual angle-independent video three-dimensional human body posture identification method
CN111931544B (en) Living body detection method, living body detection device, computing equipment and computer storage medium
CN113723306B (en) Push-up detection method, push-up detection device and computer readable medium
KR102494811B1 (en) Apparatus and method for realtime gaze tracking based on eye landmarks
KR20230166840A (en) Method for tracking object movement path based on artificial intelligence
CN109493349B (en) Image feature processing module, augmented reality equipment and corner detection method
JP2023512359A (en) Associated object detection method and apparatus
CN112149598A (en) Side face evaluation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19931824

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19931824

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 27.05.2022)