WO2023103145A1 - Head pose truth value acquisition method, apparatus and device, and storage medium - Google Patents

Info

Publication number
WO2023103145A1
Authority
WO
WIPO (PCT)
Prior art keywords
image acquisition
face
key point
target
point information
Prior art date
Application number
PCT/CN2022/071709
Other languages
French (fr)
Chinese (zh)
Inventor
周伟杰
刘威
袁淮
吕晋
周婷
武红娇
董德威
李萌
曹斌
Original Assignee
东软睿驰汽车技术(沈阳)有限公司
Priority date
Filing date
Publication date
Application filed by 东软睿驰汽车技术(沈阳)有限公司
Publication of WO2023103145A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/22: Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
    • G06V 20/20: Scenes; scene-specific elements in augmented reality scenes
    • G06V 20/59: Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V 40/16: Human faces, e.g. facial parts, sketches or expressions

Definitions

  • the present application relates to the technical field of image processing, and in particular to a method, device, equipment and storage medium for acquiring the true value of head posture.
  • the true value of the head pose includes the yaw angle Yaw, the pitch angle Pitch and the roll angle Roll.
  • the yaw angle, pitch angle and roll angle are the angles of rotation relative to the y-axis, x-axis and z-axis in the Euler angle vector coordinate system, respectively.
  • the true value of head posture has applications in many fields, such as driver fatigue analysis, virtual reality somatosensory games, product purchase desire analysis, face verification and other fields.
  • the true value of the head pose can be obtained through the sensors in the wearable device.
  • the wearing angle of the wearable device easily affects the accuracy of the acquired true value of the head pose.
  • if the accuracy of the true value of the head pose is poor, the accuracy of the data analysis results is easily affected.
  • for example, due to poor accuracy of the acquired true value of the driver's head pose, a driver with a high degree of fatigue may be mistakenly judged as not fatigued, so that a voice reminder is not given in a timely and accurate manner. It can be seen that improving the accuracy of obtaining the true value of the head pose is an urgent technical problem to be solved.
  • the present application provides a method, device, device and storage medium for acquiring the true value of head posture.
  • the present application provides a method for obtaining the true value of the head pose, including:
  • the target image is the image of the target object collected by the target image acquisition device at that moment.
  • the obtaining the true value of the head pose of the target object corresponding to the target image according to the coordinate system of the target image acquisition device and the face coordinate system includes:
  • a true value of the head pose of the target object corresponding to the target image is obtained according to the rotation matrix.
  • the establishment of the face coordinate system based on the 3D key point information of the face corresponding to the target image acquisition device includes:
  • a human face coordinate system of the target object is established based on the human face plane and the normal vector of the human face plane.
  • the reconstructing of the three-dimensional face key point information corresponding to each of the image acquisition devices, based on the face key point information corresponding to each of the image acquisition devices and the parameters of each of the image acquisition devices, includes:
  • reconstructing, by a triangulation reconstruction method, the three-dimensional face key point information corresponding to the target image acquisition device according to the face key point information corresponding to the target image acquisition device, the face key point information corresponding to the reference image acquisition device, the internal parameters of the target image acquisition device, the internal parameters of the reference image acquisition device, and the external parameters between the target image acquisition device and the reference image acquisition device;
  • the reference image acquisition device is any image acquisition device among the plurality of image acquisition devices other than the target image acquisition device;
  • obtaining the three-dimensional face key point information corresponding to the reference image acquisition device according to the three-dimensional face key point information corresponding to the target image acquisition device and the external parameters.
  • the method for obtaining the true value of the head pose further includes:
  • the image and head pose ground truths corresponding to each other are stored as image ground truth pairs.
  • the acquisition of images collected by multiple image acquisition devices on the target object at the same time includes:
  • when the multiple image acquisition devices acquire images of the target object, the target object is sitting on a seat, and the seat is used to simulate a seat in a real vehicle environment;
  • the setting parameters of the image acquisition device relative to the seat are determined according to the setting parameters of the simulation object in the real vehicle environment relative to the seat;
  • the simulated objects include at least one of the following:
  • A-pillar, B-pillar, instrument panel, front windshield or left side window glass.
  • the multiple image acquisition devices include: an infrared camera and an RGB camera.
  • the internal parameters of the same type of the multiple image acquisition devices have different value ranges.
  • the present application provides a device for obtaining the true value of the head posture, including:
  • An image acquisition module configured to acquire images collected by multiple image acquisition devices on the target object at the same time
  • a key point labeling module configured to mark the key points of the face in the image captured by each of the image capture devices among the plurality of image capture devices, and obtain the key point information of the face corresponding to each of the image capture devices;
  • the three-dimensional key point reconstruction module is used to reconstruct the three-dimensional key point information of the face corresponding to each of the image acquisition devices based on the key point information of the face corresponding to each of the image acquisition devices and the parameters of each of the image acquisition devices;
  • a coordinate system establishment module configured to establish a face coordinate system based on the three-dimensional key point information of the face corresponding to the target image acquisition device, where the target image acquisition device is one of the plurality of image acquisition devices;
  • a true value acquisition module configured to obtain the true value of the head posture of the target object corresponding to the target image according to the coordinate system of the target image acquisition device and the face coordinate system, the target image being the image of the target object collected by the target image acquisition device at the moment.
  • the present application provides a device for obtaining the true value of a head posture, including a processor and a memory; the memory is used to store a computer program; the processor is used to execute, according to the computer program, the method for obtaining the true value of the head pose as provided in the first aspect.
  • the present application provides a computer-readable storage medium for storing a computer program, and when the computer program is run by a processor, the method for obtaining the true value of the head posture as provided in the first aspect is executed.
  • the present application provides a method for obtaining the true value of head posture.
  • multiple image acquisition devices collect images of the target object at the same moment, the two-dimensional face key points marked in the multiple collected images serve as the data basis for three-dimensionally reconstructing the face key points of the target object, and the true value of the head pose of the target object corresponding to an image is then obtained.
  • since this solution does not require a wearable device to obtain the true value of the head pose, it is not affected by the wearing angle.
  • through the simultaneous use of multiple image acquisition devices and the three-dimensional reconstruction of face key points, the accuracy of the acquired true value of the head pose is ensured.
  • when the true value of the head posture is applied to fields such as driver fatigue analysis, virtual reality somatosensory games, product purchase desire analysis and face verification, the data analysis results can be made more accurate.
  • FIG. 1 is a flow chart of a method for obtaining a true value of a head posture provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of a scene in which images of a target object are collected by multiple image collection devices provided in an embodiment of the present application;
  • Fig. 3 is a schematic structural diagram of a device for obtaining a true value of a head pose provided by an embodiment of the present application.
  • the inventor proposes a technical solution to simultaneously collect images of people through multiple image acquisition devices, and use these images to obtain the true value of the head posture. This solution does not require people to wear wearable devices, and can obtain high-precision true head poses only by relying on images. Furthermore, the accuracy of the data analysis result obtained when the true value of the head posture is used for data analysis is ensured.
  • Referring to FIG. 1, which is a flow chart of a method for obtaining the true value of a head pose provided by an embodiment of the present application.
  • the method shown in Figure 1 includes the following steps:
  • S101 Acquire images of a target object collected by multiple image collection devices at the same time.
  • multiple image acquisition devices are used to acquire images of the target object at the same time.
  • the target object refers to the subject whose true value of the head pose needs to be obtained.
  • Mr. A is used as the target object, and multiple image acquisition devices are used to acquire images of Mr. A at the same time.
  • the image acquisition device may be any device capable of capturing and forming images, such as an RGB camera.
  • the type and model of the image acquisition device are not limited here.
  • the multiple image acquisition devices are specifically two or more image acquisition devices. That is, when it is required to acquire images of the target object at the same time, at least two image acquisition devices are used.
  • the setting parameters of multiple image acquisition devices are different or not completely the same. Setting parameters include position, height and angle, etc.
  • S102 Mark the key points of the face in the images captured by each of the multiple image capture devices, and obtain the key point information of the face corresponding to each of the image capture devices.
  • the face key points are marked on the images acquired by each image acquisition device.
  • the marked face key points may include but not limited to: eyebrow head, eyebrow peak, eyebrow tail, inner eye corner, outer eye corner, pupil center, nose wing, nostril, mouth corner, etc.
  • as an example, 70 key points of the face in an image are labeled. The number of labeled key points is not limited here.
  • the face key point information may include the pixel coordinates of the key points in the image (an illustrative layout is sketched below). Since the image is a two-dimensional image, the face key point information obtained by marking also refers to the information of the face key points in the two-dimensional image.
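  • For illustration only, a per-device annotation at one acquisition moment might be organized as below; the key point names, the device labels and the dictionary layout are hypothetical, since the patent only requires that each marked key point carries its pixel coordinates in the corresponding image.

```python
# Hypothetical annotation layout; names and coordinates are illustrative only.
from typing import Dict, Tuple

Keypoints2D = Dict[str, Tuple[float, float]]   # key point name -> (u, v) pixel coordinates

annotations_at_moment: Dict[str, Keypoints2D] = {
    "device_04": {"inner_eye_corner_left": (412.0, 305.5), "nose_tip": (448.2, 371.0)},
    "device_05": {"inner_eye_corner_left": (390.7, 312.1), "nose_tip": (430.9, 378.4)},
}
```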
  • S103: Reconstruct three-dimensional face key point information corresponding to each image acquisition device. The purpose of this step is to construct three-dimensional face key point information from the two-dimensional face key points marked in multiple images collected by different image acquisition devices at the same moment.
  • for each image, a set of corresponding three-dimensional face key point information is obtained. Since an image corresponds to an image acquisition device, it can be understood that the image acquisition device corresponds to the obtained three-dimensional face key point information.
  • different image acquisition devices have unique labels, for example, device No. 1, device No. 2, ... device No. 14.
  • the order of the labels is not limited here.
  • labeling may be performed in a manner of increasing ordinal numbers along one direction.
  • in an exemplary implementation, according to the face key point information corresponding to the target image acquisition device, the face key point information corresponding to the reference image acquisition device, the internal parameters of the target image acquisition device, the internal parameters of the reference image acquisition device, and the external parameters between the target image acquisition device and the reference image acquisition device, the three-dimensional face key point information corresponding to the target image acquisition device is reconstructed by a triangulation reconstruction method.
  • the reference image acquisition device is any image acquisition device other than the target image acquisition device among the plurality of image acquisition devices.
  • according to the three-dimensional face key point information corresponding to the target image acquisition device and the external parameters, the three-dimensional face key point information corresponding to the reference image acquisition device is obtained.
  • the 3D key point information of the face may include the 3D coordinates of the key points of the face in the coordinate system of the image acquisition device.
  • assume that device No. 4 is the target image acquisition device and device No. 5 is the reference image acquisition device.
  • the face key point information corresponding to device No. 4, the face key point information corresponding to device No. 5, the internal parameters of both devices, and the external parameters between devices No. 4 and No. 5 are used to reconstruct, by the triangulation reconstruction method, the three-dimensional face key point information corresponding to device No. 4.
  • the triangulation reconstruction function is a relatively mature technology in this field, so the specific implementation process will not be described here.
  • since the external parameters between different image acquisition devices are calibrated in advance, the three-dimensional face key point information corresponding to device No. 4 can be converted into the coordinate systems of the other image acquisition devices by means of those external parameters, and the three-dimensional face key point information corresponding to the other image acquisition devices is obtained (a code sketch of this triangulation and coordinate transfer is given below). In this way, the three-dimensional face key point information corresponding to each of the 14 image acquisition devices can be obtained.
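  • As an illustrative, non-authoritative sketch of the triangulation and coordinate transfer described above (the patent does not prescribe a specific library), the following uses OpenCV's triangulatePoints; the variable names and the convention that the extrinsics [R | t] map points from the target device's frame into the reference device's frame are assumptions of this sketch, not requirements of the patent.

```python
# Minimal sketch under the assumptions noted above, not the patent's implementation.
import numpy as np
import cv2

def reconstruct_3d_keypoints(K_target, K_ref, R, t, pts_target, pts_ref):
    """pts_* are (N, 2) pixel coordinates of the same labelled face key points."""
    # Projection matrices: the target device's coordinate system is used as the
    # reconstruction frame, so its pose is the identity.
    P_target = K_target @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P_ref = K_ref @ np.hstack([R, t.reshape(3, 1)])

    # cv2.triangulatePoints expects 2xN arrays and returns 4xN homogeneous points.
    pts_h = cv2.triangulatePoints(P_target, P_ref,
                                  pts_target.T.astype(np.float64),
                                  pts_ref.T.astype(np.float64))
    pts3d_target = (pts_h[:3] / pts_h[3]).T      # (N, 3) in the target device's frame

    # Transfer the reconstructed key points into the reference device's frame
    # with the pre-calibrated extrinsics, as described for the other devices.
    pts3d_ref = (R @ pts3d_target.T + t.reshape(3, 1)).T
    return pts3d_target, pts3d_ref
```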
  • the 3D key point information of the face corresponding to one target image acquisition device is taken as an example to introduce the method of obtaining the true value of the head pose.
  • the 3D face key point information corresponding to other image acquisition devices can also perform corresponding operations in the same implementation manner.
  • the image of the target object captured by the target image capture device at the above-mentioned moment is referred to as the target image.
  • S104 Establish a face coordinate system based on the three-dimensional key point information of the face corresponding to the target image acquisition device, where the target image acquisition device is one of multiple image acquisition devices.
  • the plane of the face can be determined based on the 3D key point information of the face corresponding to the target image acquisition device.
  • a face plane can be constructed based on the coordinates of 3 to 4 key points of the face in the coordinate system of the target image acquisition device.
  • the key points of the selected face are not limited.
  • the normal vector of the face plane can be determined through the cross function of NumPy (full name: Numerical Python, which is an open source numerical calculation extension of Python).
  • at this point, the face plane and the normal vector of the face plane have been obtained. Furthermore, the face coordinate system of the target object can be established based on the plane and the normal vector of the plane (a code sketch is given below). It can be understood that, since the three-dimensional face key point information is obtained in the coordinate system of the target image acquisition device, the face coordinate system established based on the three-dimensional face key point information is also based on the coordinate system of the target image acquisition device.
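  • A minimal sketch of one way to build such a face coordinate system from three non-collinear reconstructed key points. The choice of key points (two eye corners and a mouth corner) and the axis assignment are assumptions of this sketch; the patent only requires a face plane, its normal vector and a resulting coordinate system.

```python
# Illustrative sketch only; key point choice and axis convention are assumptions.
import numpy as np

def face_coordinate_system(p_left_eye, p_right_eye, p_mouth):
    """Each argument is a length-3 array of 3D coordinates in the camera frame."""
    v1 = p_right_eye - p_left_eye          # an in-plane vector (roughly horizontal)
    v2 = p_mouth - p_left_eye              # a second, non-parallel in-plane vector
    normal = np.cross(v1, v2)              # normal vector of the face plane
    z_axis = normal / np.linalg.norm(normal)

    x_axis = v1 / np.linalg.norm(v1)       # orthogonal to z_axis by construction
    y_axis = np.cross(z_axis, x_axis)      # completes a right-handed frame

    origin = (p_left_eye + p_right_eye + p_mouth) / 3.0
    R_face_to_cam = np.stack([x_axis, y_axis, z_axis], axis=1)  # columns are the axes
    return origin, R_face_to_cam
```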
  • S105: According to the coordinate system of the target image acquisition device and the face coordinate system, obtain the true value of the head pose of the target object corresponding to the target image, where the target image is the image of the target object collected by the target image acquisition device at that moment.
  • on the basis of the known coordinate system of the target image acquisition device and the face coordinate system, the rotation matrix of the face coordinate system relative to the coordinate system of the target image acquisition device can be obtained. Since this rotation matrix is related to the three angles in the true value of the head pose (the yaw angle Yaw, the pitch angle Pitch and the roll angle Roll), the true value of the head pose of the target object can be obtained from the rotation matrix based on this relationship, as sketched below.
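  • The patent states only that the rotation matrix and the three angles are related. The following sketch recovers yaw, pitch and roll from a 3x3 rotation matrix using a common Z-Y-X Euler decomposition, which is an assumed convention rather than one prescribed by the patent.

```python
# Assumed Z-Y-X (roll about z, yaw about y, pitch about x) decomposition;
# other angle conventions are possible.
import numpy as np

def rotation_matrix_to_euler(R):
    """R is a 3x3 rotation matrix; returns (yaw, pitch, roll) in degrees."""
    sy = np.sqrt(R[0, 0] ** 2 + R[1, 0] ** 2)
    if sy > 1e-6:
        pitch = np.arctan2(R[2, 1], R[2, 2])   # rotation about the x-axis
        yaw = np.arctan2(-R[2, 0], sy)         # rotation about the y-axis
        roll = np.arctan2(R[1, 0], R[0, 0])    # rotation about the z-axis
    else:                                      # gimbal-lock fallback
        pitch = np.arctan2(-R[1, 2], R[1, 1])
        yaw = np.arctan2(-R[2, 0], sy)
        roll = 0.0
    return np.degrees(yaw), np.degrees(pitch), np.degrees(roll)
```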
  • since the true value of the head pose is obtained on the basis of the coordinate system of the target image acquisition device and the face coordinate system, the face coordinate system is itself based on the coordinate system of the target image acquisition device, and the target image acquisition device corresponds to the target image captured at that moment, the true value of the head pose obtained in this step can be associated with the target image.
  • the model may be a model for determining the true value of the head posture through images, or further, the model may be a model for determining the driving safety factor through images or a model for analyzing driving fatigue, etc.
  • the specific functions of the trained model are not limited here.
  • the above is the method for obtaining the true value of the head pose provided by the embodiment of the present application.
  • multiple image acquisition devices are used to collect images of the target object at the same time, and the two-dimensional face key points marked in the multiple collected images are used as the data basis for three-dimensional reconstruction of the target object face key points, and then obtained The true value of the head pose of the target object corresponding to the image. Since this solution does not require the use of wearable devices to obtain the true value of the head pose, it is not affected by the wearing angle. Through the simultaneous use of multiple image acquisition devices and the 3D reconstruction of key points of the face, the accuracy of the acquired true value of the head pose is guaranteed. Furthermore, when the true value of the head posture is applied to areas such as driver fatigue analysis, virtual reality somatosensory games, product purchase desire analysis, face verification, etc., the data analysis results can be made more accurate.
  • the true value of the head posture obtained by executing this method under different lighting conditions may be different.
  • this can help improve the accuracy of data analysis. For example, when data analysis needs to be performed under first-level lighting conditions, because the preceding stage collected images and obtained true values of the head pose under first-level lighting conditions, rather than only under other lighting conditions, the data analysis results under this condition are more accurate.
  • the first-level, second-level, and third-level lighting conditions described above are examples of different lighting conditions, and the levels of different lighting conditions are not limited here. For example, it could be to include 4 levels of lighting conditions. The higher the level, the weaker the light intensity; the lower the level, the higher the light intensity.
  • the lighting conditions are divided into sunlight lighting conditions, infrared lighting conditions, ultraviolet lighting conditions, and the like.
  • the above lighting conditions can be realized by the natural environment, and can also be realized by lighting devices. For example, adjust or control the selection of light source type, light on and off, intensity, and irradiation angle in the lighting device to achieve different lighting conditions.
  • the embodiment of the present application can take adaptive measures in the image acquisition stage. For example, when multiple image acquisition devices acquire images of a target object, the target object sits on a seat, and the seat is used to simulate a seat in a real vehicle environment. For example, the target object represents the driver, and the seat it is on represents the driver's seat.
  • the setting parameters of the image acquisition device relative to the seat are determined according to the setting parameters of the simulated object relative to the seat in the real vehicle environment.
  • the setting parameters include: position, height, angle, etc.
  • the simulated object includes at least one of the following: A-pillar, B-pillar, instrument panel, front windshield or left side window glass. As an example:
  • device No. 1 is set up to simulate the A-pillar;
  • device No. 2 is set up to simulate the B-pillar;
  • device No. 3 is set up to simulate the instrument panel;
  • device No. 4 is set up to simulate the front windshield;
  • device No. 5 is set up to simulate the left side window glass.
  • FIG. 2 is a schematic diagram of a scene in which images of a target object are collected by multiple image collection devices according to an embodiment of the present application.
  • Fig. 2 shows 14 cameras collecting images of the target object on the seat, with lighting devices arranged around and above the target object in the scene to change the lighting conditions.
  • multiple image acquisition devices can simultaneously record video of the same person during acquisition, and continuous true values of the head posture can be calculated from the three-dimensional face key point information of each video frame (a pipeline sketch is given below). Therefore, the technical solution of the present application provides continuity in obtaining the true value of the head posture.
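  • A hedged end-to-end sketch of such per-frame processing, reusing the illustrative helpers sketched earlier in this document; annotate_keypoints() and the cams container are hypothetical stand-ins for the 2D key point labeling step and the pre-calibrated camera parameters, neither of which is specified as code in the patent.

```python
# Hypothetical pipeline sketch; helper names and data layout are assumptions.
def build_ground_truth_sequence(frames_by_moment, target_id, ref_id, cams):
    """frames_by_moment: list of {device_id: image} dicts, one per synchronized frame."""
    pairs = []
    for frames in frames_by_moment:
        # 2D face key points per device (annotate_keypoints is a hypothetical labeler).
        kp = {dev: annotate_keypoints(img) for dev, img in frames.items()}
        # Triangulate in the target device's frame (see the earlier sketch);
        # cams[ref_id].R, cams[ref_id].t are assumed extrinsics relative to the target device.
        pts3d, _ = reconstruct_3d_keypoints(cams[target_id].K, cams[ref_id].K,
                                            cams[ref_id].R, cams[ref_id].t,
                                            kp[target_id], kp[ref_id])
        # Face coordinate system and head pose angles (see the earlier sketches).
        _, R_face = face_coordinate_system(pts3d[0], pts3d[1], pts3d[2])
        yaw, pitch, roll = rotation_matrix_to_euler(R_face)
        # Store mutually corresponding image / ground-truth pairs.
        pairs.append((frames[target_id], (yaw, pitch, roll)))
    return pairs
```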
  • This method adopts a non-contact head posture acquisition method, which is relatively simple and convenient to implement.
  • types of multiple image acquisition devices used to acquire images may include infrared cameras, RGB cameras, and the like.
  • value ranges of the same type of internal parameters of multiple image acquisition devices may be different. These internal parameters with different value ranges may be focal length, optical center value, and the like.
  • the focal length range of device A is f1-f2
  • the focal length range of device B is f3-f4.
  • various types of images, such as infrared images and RGB images, can be formed by configuring multiple image acquisition devices of different types.
  • a variety of images with different imaging effects can be obtained through a small number of acquisitions, which can meet the acquisition requirements of different research and development projects for images with different imaging effects in practical applications, and save acquisition time and acquisition costs.
  • various types of devices and devices with various internal parameter value ranges are set to make the acquired image data more diverse and meet actual use requirements.
  • Fig. 3 is a schematic structural diagram of a device for obtaining a true value of a head pose.
  • the device 300 shown in Figure 3 includes:
  • An image acquisition module 301 configured to acquire images collected by multiple image acquisition devices on the target object at the same time;
  • the key point labeling module 302 is configured to label the key points of the face in the image captured by each of the image capture devices among the plurality of image capture devices, and obtain the key point information of the face corresponding to each of the image capture devices ;
  • the three-dimensional key point reconstruction module 303 is used to reconstruct the three-dimensional key point information of the face corresponding to each of the image acquisition devices based on the key point information of the face corresponding to each of the image acquisition devices and the parameters of each of the image acquisition devices;
  • a coordinate system establishment module 304 configured to establish a face coordinate system based on the three-dimensional key point information of the face corresponding to the target image acquisition device, where the target image acquisition device is one of the plurality of image acquisition devices;
  • the true value acquisition module 305 is configured to obtain the true value of the head posture of the target object corresponding to the target image according to the coordinate system of the target image acquisition device and the face coordinate system, the target image being the image of the target object collected by the target image acquisition device at the moment.
  • the truth acquisition module 305 includes:
  • a rotation matrix acquiring unit configured to obtain a rotation matrix of the face coordinate system relative to the coordinate system of the target image capture device according to the face coordinate system and the coordinate system of the target image capture device;
  • a true value obtaining unit configured to obtain the true value of the head pose of the target object corresponding to the target image according to the rotation matrix.
  • the coordinate system establishment module 304 includes:
  • a face plane determining unit configured to determine a face plane based on the three-dimensional key point information of the face corresponding to the target image acquisition device
  • a normal vector determination unit configured to determine the normal vector of the face plane according to the face plane
  • a coordinate system establishing unit configured to establish the face coordinate system of the target object based on the face plane and the normal vector of the face plane.
  • the three-dimensional key point reconstruction module 303 is specifically configured to:
  • reconstruct, by a triangulation reconstruction method, the three-dimensional face key point information corresponding to the target image acquisition device according to the face key point information corresponding to the target image acquisition device, the face key point information corresponding to the reference image acquisition device, the internal parameters of the target image acquisition device, the internal parameters of the reference image acquisition device, and the external parameters between the target image acquisition device and the reference image acquisition device; the reference image acquisition device being any image acquisition device among the plurality of image acquisition devices other than the target image acquisition device;
  • obtain the three-dimensional face key point information corresponding to the reference image acquisition device according to the three-dimensional face key point information corresponding to the target image acquisition device and the external parameters.
  • the acquisition device 300 of the true value of the head posture also includes:
  • the storage module is used for storing the images corresponding to each other and the true value of the head posture as image true value pairs.
  • the image acquisition module 301 includes:
  • a lighting unit configured to provide a variety of different lighting conditions to the space where the target object is located
  • An image acquisition unit configured to acquire images of the target object acquired by the plurality of image acquisition devices at the same time under the same lighting condition.
  • the present application also provides a device for obtaining the true value of the head posture, including a processor and a memory; the memory is used to store computer programs; The processor is configured to execute, according to the computer program, the method for obtaining the true value of the head posture as provided in the foregoing method embodiments.
  • the processor in the device can also be used to control the lighting device to provide variable lighting conditions.
  • the present application also provides a computer-readable storage medium for storing a computer program; when the computer program is run by a processor, the method for obtaining the true value of the head pose as provided in the foregoing method embodiments is executed.

Abstract

The present application discloses a head pose truth value acquisition method, apparatus and device, and a storage medium. A plurality of image acquisition devices collect images of a target object at the same moment, two-dimensional face key points annotated in the plurality of collected images are used as the data basis for three-dimensional reconstruction of the face key points of the target object, and a head pose truth value of the target object corresponding to an image is acquired. According to the present solution, the head pose truth value can be acquired without using a wearable device; therefore, the head pose truth value is not affected by a wearing angle. By means of the use of the plurality of image acquisition devices at the same moment and the three-dimensional reconstruction of the face key points, the accuracy of the acquired head pose truth value is ensured. Furthermore, when the head pose truth value is applied to fields such as driver fatigue level analysis, virtual reality motion sensing games, commodity purchase intention analysis, and face verification, the data analysis result can be more accurate.

Description

A method, device, equipment and storage medium for obtaining the true value of head posture
This application claims priority to the Chinese patent application filed with the State Intellectual Property Office of the People's Republic of China on December 09, 2021, with application number 202111501742.9 and entitled "A Method, Device, Equipment and Storage Medium for Acquiring the True Value of Head Posture", the entire contents of which are incorporated into this application by reference.
Technical Field
The present application relates to the technical field of image processing, and in particular to a method, device, equipment and storage medium for acquiring the true value of head posture.
Background
The true value of the head pose includes the yaw angle Yaw, the pitch angle Pitch and the roll angle Roll. The yaw angle, pitch angle and roll angle are the angles of rotation about the y-axis, x-axis and z-axis, respectively, in the Euler angle vector coordinate system. The true value of the head pose has applications in many fields, such as driver fatigue analysis, virtual reality somatosensory games, product purchase desire analysis and face verification.
At present, the true value of the head pose can be obtained through the sensors in a wearable device. However, the wearing angle of the wearable device easily affects the accuracy of the acquired true value of the head pose. In view of the many application fields mentioned above, if the accuracy of the true value of the head pose is poor, the accuracy of the data analysis results is easily affected. For example, due to poor accuracy of the acquired true value of the driver's head pose, a driver with a high degree of fatigue may be mistakenly judged as not fatigued, so that a voice reminder is not given in a timely and accurate manner. It can be seen that improving the accuracy of obtaining the true value of the head pose is an urgent technical problem to be solved.
Summary of the Invention
Based on the above problems, the present application provides a method, device, equipment and storage medium for acquiring the true value of head posture.
The embodiments of the present application disclose the following technical solutions:
In a first aspect, the present application provides a method for obtaining the true value of the head pose, including:
acquiring images of a target object collected by multiple image acquisition devices at the same moment;
marking face key points in the images collected by each of the multiple image acquisition devices, to obtain face key point information corresponding to each image acquisition device;
reconstructing three-dimensional face key point information corresponding to each image acquisition device based on the face key point information corresponding to each image acquisition device and the parameters of each image acquisition device;
establishing a face coordinate system based on the three-dimensional face key point information corresponding to a target image acquisition device, the target image acquisition device being one of the multiple image acquisition devices;
obtaining, according to the coordinate system of the target image acquisition device and the face coordinate system, the true value of the head pose of the target object corresponding to a target image, the target image being the image of the target object collected by the target image acquisition device at the moment.
In an optional implementation, the obtaining, according to the coordinate system of the target image acquisition device and the face coordinate system, the true value of the head pose of the target object corresponding to the target image includes:
obtaining a rotation matrix of the face coordinate system relative to the coordinate system of the target image acquisition device according to the face coordinate system and the coordinate system of the target image acquisition device;
obtaining the true value of the head pose of the target object corresponding to the target image according to the rotation matrix.
In an optional implementation, the establishing a face coordinate system based on the three-dimensional face key point information corresponding to the target image acquisition device includes:
determining a face plane based on the three-dimensional face key point information corresponding to the target image acquisition device;
determining a normal vector of the face plane according to the face plane;
establishing the face coordinate system of the target object based on the face plane and the normal vector of the face plane.
In an optional implementation, the reconstructing three-dimensional face key point information corresponding to each image acquisition device based on the face key point information corresponding to each image acquisition device and the parameters of each image acquisition device includes:
reconstructing, by a triangulation reconstruction method, the three-dimensional face key point information corresponding to the target image acquisition device according to the face key point information corresponding to the target image acquisition device, the face key point information corresponding to a reference image acquisition device, the internal parameters of the target image acquisition device, the internal parameters of the reference image acquisition device, and the external parameters between the target image acquisition device and the reference image acquisition device; the reference image acquisition device being any image acquisition device among the multiple image acquisition devices other than the target image acquisition device;
obtaining the three-dimensional face key point information corresponding to the reference image acquisition device according to the three-dimensional face key point information corresponding to the target image acquisition device and the external parameters.
In an optional implementation, the method for obtaining the true value of the head pose further includes:
storing images and their corresponding head pose true values as image/ground-truth pairs.
In an optional implementation, the acquiring images of a target object collected by multiple image acquisition devices at the same moment includes:
providing a plurality of different lighting conditions to the space in which the target object is located;
acquiring images of the target object collected by the multiple image acquisition devices at the same moment under the same lighting condition.
In an optional implementation, when the multiple image acquisition devices acquire images of the target object, the target object sits on a seat, and the seat is used to simulate a seat in a real vehicle environment; the setting parameters of the image acquisition devices relative to the seat are determined according to the setting parameters of simulated objects relative to the seat in the real vehicle environment;
the simulated objects include at least one of the following:
A-pillar, B-pillar, instrument panel, front windshield or left side window glass.
In an optional implementation, the multiple image acquisition devices include an infrared camera and an RGB camera.
In an optional implementation, internal parameters of the same type of the multiple image acquisition devices have different value ranges.
In a second aspect, the present application provides a device for obtaining the true value of the head posture, including:
an image acquisition module, configured to acquire images of a target object collected by multiple image acquisition devices at the same moment;
a key point labeling module, configured to mark face key points in the images collected by each of the multiple image acquisition devices, to obtain face key point information corresponding to each image acquisition device;
a three-dimensional key point reconstruction module, configured to reconstruct three-dimensional face key point information corresponding to each image acquisition device based on the face key point information corresponding to each image acquisition device and the parameters of each image acquisition device;
a coordinate system establishment module, configured to establish a face coordinate system based on the three-dimensional face key point information corresponding to a target image acquisition device, the target image acquisition device being one of the multiple image acquisition devices;
a true value acquisition module, configured to obtain, according to the coordinate system of the target image acquisition device and the face coordinate system, the true value of the head pose of the target object corresponding to a target image, the target image being the image of the target object collected by the target image acquisition device at the moment.
In a third aspect, the present application provides a device for obtaining the true value of a head posture, including a processor and a memory; the memory is used to store a computer program; the processor is used to execute, according to the computer program, the method for obtaining the true value of the head pose as provided in the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium for storing a computer program; when the computer program is run by a processor, the method for obtaining the true value of the head pose as provided in the first aspect is executed.
Compared with the prior art, the present application has the following beneficial effects:
The present application provides a method for obtaining the true value of head posture. Multiple image acquisition devices collect images of the target object at the same moment, the two-dimensional face key points marked in the multiple collected images serve as the data basis for three-dimensionally reconstructing the face key points of the target object, and the true value of the head pose of the target object corresponding to an image is then obtained. Since this solution does not require a wearable device to obtain the true value of the head pose, it is not affected by the wearing angle. Through the simultaneous use of multiple image acquisition devices and the three-dimensional reconstruction of face key points, the accuracy of the acquired true value of the head pose is ensured. Furthermore, when the true value of the head posture is applied to fields such as driver fatigue analysis, virtual reality somatosensory games, product purchase desire analysis and face verification, the data analysis results can be made more accurate.
Brief Description of the Drawings
In order to more clearly illustrate the specific embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the specific embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some implementations of the present application, and those of ordinary skill in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a flow chart of a method for obtaining the true value of a head posture provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of a scene in which images of a target object are collected by multiple image acquisition devices, provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a device for obtaining the true value of a head pose provided by an embodiment of the present application.
Detailed Description of Embodiments
As described above, obtaining the true value of the head posture currently usually relies on the sensors of a wearable device. However, the angle at which a person wears the device may affect the accuracy of the obtained true value of the head pose, so that when the true value of the head pose is used for data analysis, the accuracy of the analysis results is also affected. After research, the inventor proposes a technical solution in which images of a person are simultaneously collected by multiple image acquisition devices, and these images are used to obtain the true value of the head posture. This solution does not require the person to wear a wearable device, and a high-precision true value of the head pose can be obtained from the images alone. Furthermore, the accuracy of the data analysis results obtained when the true value of the head posture is used for data analysis is ensured.
In order to enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below in conjunction with the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. Based on the embodiments in the present application, all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the scope of protection of the present application.
Referring to FIG. 1, which is a flow chart of a method for obtaining the true value of a head pose provided by an embodiment of the present application. The method shown in FIG. 1 includes the following steps:
S101: Acquire images of a target object collected by multiple image acquisition devices at the same moment.
In the embodiment of the present application, it is proposed that multiple image acquisition devices collect images of the target object at the same moment. The target object refers to the subject whose true value of the head pose needs to be obtained. For example, when the true value of Mr. A's head pose needs to be obtained, Mr. A is taken as the target object, and multiple image acquisition devices collect images of Mr. A at the same moment.
The image acquisition device may be any device capable of capturing and forming images, for example an RGB camera. The type and model of the image acquisition device are not limited here. The multiple image acquisition devices are specifically two or more image acquisition devices; that is, when images of the target object are to be acquired at the same moment, at least two image acquisition devices are used. The setting parameters of the multiple image acquisition devices are different or not completely the same. The setting parameters include position, height, angle, and the like.
S102: Mark face key points in the images collected by each of the multiple image acquisition devices, to obtain face key point information corresponding to each image acquisition device.
In this step, face key points are marked on the images collected by each image acquisition device. The marked face key points may include, but are not limited to: the eyebrow head, eyebrow peak, eyebrow tail, inner eye corner, outer eye corner, pupil center, nose wing, nostril, mouth corner, and the like. As an example, 70 key points of the face in an image are labeled. The number of labeled key points is not limited here.
Recognizing and marking face key points in an image containing a face is a relatively mature technique in this field, so the specific implementation of this step is not limited here. The face key point information may include the pixel coordinates of the key points in the image. Since the image is a two-dimensional image, the face key point information obtained by marking also refers to the information of the face key points in the two-dimensional image.
S103:基于各个图像采集设备对应的人脸关键点信息和各个图像采集设备的参数,重建出各个图像采集设备对应的人脸三维关键点信息。S103: Based on the key point information of the face corresponding to each image acquisition device and the parameters of each image acquisition device, reconstruct the three-dimensional key point information of the face corresponding to each image acquisition device.
本步骤的执行目的是通过在多幅同时刻不同图像采集设备采集到的图像中标记的二维人脸关键点,构建出三维人脸关键点的信息。具体实现时,对于每一幅图像,都要获取到一组对应的人脸三维关键点信息。由于图像与图像采集设备具有对应性,因此可以理解为图像采集设备与获得的人脸三维关键点信息具有对应性。以某一采集的时刻为例,本步骤中需要重建出每个图像采集设备各自对应的人脸三维关键点信息。例如,共有14个图像采集设备,对于同一采集时刻,需要分别获取14个图像采集设备各自对应的人脸三维关键点信息。The execution purpose of this step is to construct the information of the key points of the three-dimensional face by marking the key points of the two-dimensional face in multiple images collected by different image acquisition devices at the same time. In specific implementation, for each image, a set of corresponding three-dimensional key point information of the human face must be obtained. Since the image corresponds to the image acquisition device, it can be understood that the image acquisition device corresponds to the obtained three-dimensional key point information of the human face. Taking a certain acquisition moment as an example, in this step, it is necessary to reconstruct the 3D key point information of the face corresponding to each image acquisition device. For example, there are 14 image acquisition devices in total, and for the same acquisition time, it is necessary to obtain the 3D key point information of the face corresponding to each of the 14 image acquisition devices.
在14个图像采集设备中,不同的图像采集设备具有唯一的标号,例如1号设备、2号设备、…14号设备。对于标号的顺序,此处不做限定。例如可以沿着一个方向依次递增序数的方式进行标号。Among the 14 image acquisition devices, different image acquisition devices have unique labels, for example, device No. 1, device No. 2, ... device No. 14. The order of the labels is not limited here. For example, labeling may be performed in a manner of increasing ordinal numbers along one direction.
下面介绍S103的一种示例性实现方式:An exemplary implementation of S103 is introduced below:
根据目标图像采集设备对应的人脸关键点信息、参考图像采集设备对应的人脸关键点信息、目标图像采集设备的内部参数、参考图像采集设备的内部参数以及目标图像采集设备和参考图像采集设备之间的外部参数,通过三角化重建法重建出目标图像采集设备对应的人脸三维关键点信息。参考图像采集设备为多个图像采集设备之中目标图像采集设备以外的任一图像采集设备。根据目标图像采集设备对应的人脸三维关键点信息以及外部参数,获得参考图像采集设备对应的人脸三维关键点信息。人脸三维关键点信息可以包括人脸关键点在所处的图像采集设备坐标系中的三维坐标。According to the face key point information corresponding to the target image capture device, the face key point information corresponding to the reference image capture device, the internal parameters of the target image capture device, the internal parameters of the reference image capture device, and the target image capture device and the reference image capture device Between the external parameters, the three-dimensional key point information of the face corresponding to the target image acquisition device is reconstructed by the triangulation reconstruction method. The reference image acquisition device is any image acquisition device other than the target image acquisition device among the plurality of image acquisition devices. According to the 3D key point information of the face corresponding to the target image capture device and the external parameters, the 3D key point information of the face corresponding to the reference image capture device is obtained. The 3D key point information of the face may include the 3D coordinates of the key points of the face in the coordinate system of the image acquisition device.
假设4号设备为目标图像采集设备,5号设备为参考图像采集设备。在执行本步骤时,将4号设备对应的人脸关键点信息、5号设备对应的人脸关键点信息、4号设备的内部参数、5号设备的内部参数以及4号设备与5号设备之间的外部参数,通过三角化重建法重建出4号设备对应的人脸三维关键点的信息。在三角化重建人脸三维关键点信息时,可以通过三 角化重建函数实现,三角化重建属于本领域较为成熟的技术,故此处对于具体的实现过程不做赘述。当获得4号设备对应的人脸三维关键点信息后,由于不同的图像采集设备之间的外部参数已经预先标定好,因此,可以借助其他图像采集设备与4号设备之间的外部参数,将4号设备对应的人脸三维关键点信息转换到其他图像采集设备的坐标系中,获得其他图像采集设备对应的人脸三维关键点信息。如此,可以获得14个图像采集设备中每一个设备对应的人脸三维关键点信息。It is assumed that No. 4 device is the target image acquisition device, and No. 5 device is the reference image acquisition device. When performing this step, the face key point information corresponding to No. 4 device, the face key point information corresponding to No. 5 device, the internal parameters of No. 4 device, the internal parameters of No. 5 device, and Between the external parameters, the information of the 3D key points of the face corresponding to the No. 4 device is reconstructed through the triangulation reconstruction method. When triangulating and reconstructing the 3D key point information of the face, it can be realized by the triangulation reconstruction function. The triangulation reconstruction is a relatively mature technology in this field, so the specific implementation process will not be described here. After obtaining the 3D key point information of the face corresponding to No. 4 device, since the external parameters between different image acquisition devices have been calibrated in advance, it is possible to use the external parameters between other image acquisition devices and No. 4 device. The 3D key point information of the face corresponding to the No. 4 device is converted into the coordinate system of other image acquisition devices, and the 3D key point information of the face corresponding to other image acquisition devices is obtained. In this way, the three-dimensional key point information of the human face corresponding to each of the 14 image acquisition devices can be obtained.
In the following, only the three-dimensional face key point information corresponding to one target image acquisition device is used as an example to describe how the true value of the head pose is obtained; the three-dimensional face key point information corresponding to the other image acquisition devices can be processed in the same way. For ease of description, in the embodiments of the present application the image of the target object captured by the target image acquisition device at the above-mentioned moment is referred to as the target image.
S104: Establish a face coordinate system based on the three-dimensional face key point information corresponding to the target image acquisition device, the target image acquisition device being one of the plurality of image acquisition devices.
In a specific implementation of this step, a face plane can be determined based on the three-dimensional face key point information corresponding to the target image acquisition device. As an example, the face plane can be constructed from the coordinates of three or four face key points in the coordinate system of the target image acquisition device. Which face key points are selected is not limited here.
Next, the normal vector of the face plane is determined from the face plane. As mentioned above, the coordinates of three or four face key points are used when determining the face plane, and two non-parallel space vectors can be obtained from these coordinates. In a specific implementation, the normal vector of the face plane can be obtained as the cross product of these two vectors. In addition, the unit normal vector can be computed with the help of the cross function of NumPy (Numerical Python, an open-source numerical computing extension of Python).
As described above, the face plane and its normal vector have now been obtained, and the face coordinate system of the target object can be established based on the plane and its normal vector. It should be understood that, since the three-dimensional face key point information is expressed in the coordinate system of the target image acquisition device, the face coordinate system established from it is also based on the coordinate system of that target image acquisition device.
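A minimal sketch of this construction is shown below; the choice of three key points (left eye corner, right eye corner, chin) and the axis convention are assumptions made only for illustration.

```python
# Illustrative sketch only: building the face coordinate system from three non-collinear
# 3D face key points expressed in the target device's coordinate system.
import numpy as np

def build_face_coordinate_system(p_left_eye, p_right_eye, p_chin):
    v1 = p_right_eye - p_left_eye              # first in-plane vector
    v2 = p_chin - p_left_eye                   # second, non-parallel in-plane vector
    normal = np.cross(v1, v2)                  # face-plane normal via the cross product
    z_axis = normal / np.linalg.norm(normal)
    x_axis = v1 / np.linalg.norm(v1)           # perpendicular to z_axis by construction
    y_axis = np.cross(z_axis, x_axis)          # completes a right-handed orthonormal basis
    origin = (p_left_eye + p_right_eye + p_chin) / 3.0
    # The columns of R are the face axes expressed in the device coordinate system,
    # i.e. the rotation of the face frame relative to the device frame used in S105.
    R = np.stack([x_axis, y_axis, z_axis], axis=1)
    return R, origin
```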
S105: Obtain, according to the coordinate system of the target image acquisition device and the face coordinate system, the true value of the head pose of the target object corresponding to the target image, the target image being the image of the target object captured by the target image acquisition device at the above-mentioned moment.
Given the coordinate system of the target image acquisition device and the face coordinate system, the rotation matrix of the face coordinate system relative to the coordinate system of the target image acquisition device can be obtained. Since this rotation matrix is related to the three angles in the true value of the head pose (the yaw angle Yaw, the pitch angle Pitch and the roll angle Roll), the true value of the head pose of the target object can be obtained from the rotation matrix through this relationship. Because the true value of the head pose is derived from the coordinate system of the target image acquisition device and the face coordinate system, the face coordinate system is itself based on the coordinate system of the target image acquisition device, and the target image acquisition device corresponds to the target image captured at that moment, the true value of the head pose obtained in this step can be associated with the target image.
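As an illustration of that relationship, the sketch below converts such a rotation matrix into the three angles using SciPy's Rotation class; the intrinsic "YXZ" decomposition order is an assumption chosen to match yaw about the y-axis, pitch about the x-axis and roll about the z-axis, and other conventions are equally possible.

```python
# Illustrative sketch only: converting the rotation matrix of the face coordinate system
# relative to the target device coordinate system into yaw, pitch and roll angles.
from scipy.spatial.transform import Rotation

def head_pose_true_value(R):
    yaw, pitch, roll = Rotation.from_matrix(R).as_euler("YXZ", degrees=True)
    return {"yaw": float(yaw), "pitch": float(pitch), "roll": float(roll)}
```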
For example, the image and the computed true value of the head pose can also be stored as an image/true-value pair, thereby establishing a one-to-one correspondence between images and true values. Storing image/true-value pairs facilitates the use of the head pose true values in subsequent applications, such as model training. As an example, the model may be a model that determines the true value of the head pose from an image, or further, a model that determines a driving safety factor from an image, a model for analyzing driving fatigue, and so on. The specific function of the trained model is not limited here.
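One possible way to persist such image/true-value pairs is sketched below; the JSON-per-image layout and the field names are illustrative assumptions rather than part of the method.

```python
# Illustrative sketch only: storing an image/true-value pair as a JSON record next to
# the image so that later training pipelines can reload the pair.
import json
from pathlib import Path

def save_image_truth_pair(image_path, pose, out_dir="truth_pairs"):
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    record = {"image": str(image_path),
              "yaw": pose["yaw"], "pitch": pose["pitch"], "roll": pose["roll"]}
    out_file = Path(out_dir) / (Path(image_path).stem + ".json")
    out_file.write_text(json.dumps(record, indent=2))
```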
The above is the method for obtaining the true value of the head pose provided by the embodiments of the present application. In this method, a plurality of image acquisition devices capture images of the target object at the same moment, the two-dimensional face key points annotated in the captured images serve as the data basis for the three-dimensional reconstruction of the target object's face key points, and the true value of the head pose of the target object corresponding to an image is then obtained. Since this solution does not rely on a wearable device to obtain the true value of the head pose, it is not affected by the wearing angle. The simultaneous use of multiple image acquisition devices and the three-dimensional reconstruction of the face key points ensure the accuracy of the obtained true value of the head pose. Consequently, when the true value of the head pose is applied to fields such as driver fatigue analysis, virtual reality somatosensory games, product purchase desire analysis and face verification, the data analysis results become more accurate.
In specific applications, in order to provide more diverse and richer data for subsequent use of the head pose true values, the embodiments of the present application propose that, when S101 is performed, a plurality of different lighting conditions may be provided to the space in which the target object is located, and the images captured by the plurality of image acquisition devices of the target object at the same moment are acquired under the same lighting condition. As an example, images captured simultaneously by all image acquisition devices are obtained under a first-level lighting condition, under a second-level lighting condition and under a third-level lighting condition, respectively. In practice, even for the same image acquisition device and with the target object holding the head pose unchanged, the true value of the head pose obtained by this method under different lighting conditions may differ. Capturing images under different lighting conditions and obtaining the true value of the head pose for each of them helps improve the accuracy of data analysis. For example, when data analysis has to be performed under the first-level lighting condition, the analysis for that condition is more accurate because images were previously captured and head pose true values obtained under the first-level lighting condition, rather than only under other lighting conditions.
The first-level, second-level and third-level lighting conditions described above are merely examples of different lighting conditions; the number of levels is not limited here. For example, four levels of lighting conditions may be used, where a higher level corresponds to weaker illumination and a lower level corresponds to stronger illumination. In another example, the lighting conditions are divided into daylight, infrared and ultraviolet lighting conditions, and so on. These lighting conditions can be provided by the natural environment or by a lighting device, for example by adjusting or controlling the type of light source, switching lights on and off, and changing their intensity and illumination angle, so as to realize different lighting conditions.
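One way such controllable lighting conditions could be described in software is sketched below; the field names and the example presets are purely illustrative assumptions.

```python
# Illustrative sketch only: a lighting-condition description that could be used to drive
# a controllable lighting device; the presets are placeholders, not measured settings.
from dataclasses import dataclass

@dataclass
class LightingCondition:
    name: str          # e.g. "level-1" or "infrared"
    source: str        # light source type: "visible", "infrared", "ultraviolet", ...
    intensity: float   # relative intensity in [0, 1]; a higher level means lower intensity
    angle_deg: float   # illumination angle of the lamp relative to the target object

LIGHTING_PRESETS = [
    LightingCondition("level-1", "visible", 1.0, 45.0),
    LightingCondition("level-2", "visible", 0.6, 45.0),
    LightingCondition("level-3", "visible", 0.3, 45.0),
]
```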
When the true value of the head pose is to be applied to data analysis scenarios in the driving field, the embodiments of the present application can take adaptive measures at the image acquisition stage. For example, when the plurality of image acquisition devices capture images of the target object, the target object sits on a seat that simulates a seat in a real vehicle environment; for example, the target object represents a driver and the seat represents the driver's seat. The setting parameters of the image acquisition devices relative to the seat are determined according to the setting parameters of simulated objects relative to the seat in the real vehicle environment, where the setting parameters include position, height, angle and the like. The simulated objects include at least one of the following: the A-pillar, the B-pillar, the instrument panel, the front windshield or the left-side window. As an example, device No. 1 is placed to simulate the A-pillar, device No. 2 the B-pillar, device No. 3 the instrument panel, device No. 4 the front windshield, and device No. 5 the left-side window.
FIG. 2 is a schematic diagram of a scene in which images of a target object are captured by a plurality of image acquisition devices according to an embodiment of the present application. FIG. 2 shows 14 cameras capturing images of the target object on the seat, with a lighting device arranged around and above the target object for varying the lighting conditions.
In addition to the advantages mentioned above, the technical solutions of the embodiments of the present application have further advantages:
In the embodiments of the present application, the plurality of image acquisition devices can simultaneously record videos of the same person, and a continuous head pose true-value track can be computed from the three-dimensional face key point information per video frame number. The technical solution of the present application therefore provides continuity in obtaining the true value of the head pose.
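As a sketch of how this continuity could be realized, the loop below walks over synchronized video frames and reuses the illustrative helpers given earlier; detect_keypoints is an assumed 2D landmark detector, and the landmark indices follow the common 68-point convention only as an example.

```python
# Illustrative sketch only: computing a continuous head-pose true-value track from videos
# recorded simultaneously by two devices. detect_keypoints() is an assumed 2D landmark
# detector; the other helpers refer to the earlier sketches and are equally illustrative.
def pose_track(frames_dev4, frames_dev5, K4, K5, R_45, t_45, landmark_ids=(36, 45, 8)):
    track = []
    for f4, f5 in zip(frames_dev4, frames_dev5):   # frames paired by shared frame number
        kp4 = detect_keypoints(f4)                  # (N, 2) landmarks in the device-4 frame
        kp5 = detect_keypoints(f5)                  # (N, 2) landmarks in the device-5 frame
        pts3d = triangulate_face_keypoints(K4, K5, R_45, t_45, kp4, kp5)
        left_eye, right_eye, chin = (pts3d[i] for i in landmark_ids)
        R, _ = build_face_coordinate_system(left_eye, right_eye, chin)
        track.append(head_pose_true_value(R))
    return track
```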
Since some of the image acquisition devices in this method capture side-face images of the person, the solution is able to acquire head pose true values at large angles.
This method uses a contactless head pose acquisition approach, which is relatively simple and easy to implement.
In the embodiments of the present application, the types of the plurality of image acquisition devices used to capture images may include infrared cameras, RGB cameras and the like. Moreover, the value ranges of the same type of internal parameter may differ between image acquisition devices; such internal parameters with different value ranges may be the focal length, the optical center values and so on. As an example, among the plurality of image acquisition devices, the focal length of device A lies in the range f1 to f2, while the focal length of device B lies in the range f3 to f4.
In practical applications, configuring several different types of image acquisition devices makes it possible to produce several types of images, for example infrared images and RGB images. In this way, images with a variety of imaging effects can be obtained with fewer acquisition sessions, meeting the needs of different research and development projects for images with different imaging effects and saving acquisition time and cost.
Likewise, configuring image acquisition devices whose internal parameters have different value ranges makes it possible to obtain images with different imaging effects with fewer acquisition sessions, again meeting the needs of different research and development projects and saving acquisition time and cost.
By providing devices of diverse types and with diverse internal parameter value ranges, the embodiments of the present application make the acquired image data more varied and better suited to practical use.
Based on the method for obtaining the true value of the head pose provided in the foregoing embodiments, the present application correspondingly also provides an apparatus for obtaining the true value of the head pose. A specific implementation of the apparatus is described below with reference to the embodiments. FIG. 3 is a schematic structural diagram of the apparatus for obtaining the true value of the head pose. The apparatus 300 shown in FIG. 3 includes:
an image acquisition module 301, configured to acquire images of a target object captured by a plurality of image acquisition devices at the same moment;
a key point labeling module 302, configured to label face key points in the images captured by each of the plurality of image acquisition devices, to obtain face key point information corresponding to each of the image acquisition devices;
a three-dimensional key point reconstruction module 303, configured to reconstruct, based on the face key point information corresponding to each of the image acquisition devices and the parameters of each of the image acquisition devices, three-dimensional face key point information corresponding to each of the image acquisition devices;
a coordinate system establishment module 304, configured to establish a face coordinate system based on the three-dimensional face key point information corresponding to a target image acquisition device, the target image acquisition device being one of the plurality of image acquisition devices; and
a true value acquisition module 305, configured to obtain, according to the coordinate system of the target image acquisition device and the face coordinate system, the true value of the head pose of the target object corresponding to a target image, the target image being the image of the target object captured by the target image acquisition device at the moment.
In this apparatus, a plurality of image acquisition devices capture images of the target object at the same moment, the two-dimensional face key points labeled in the captured images serve as the data basis for the three-dimensional reconstruction of the target object's face key points, and the true value of the head pose of the target object corresponding to an image is then obtained. Since this solution does not rely on a wearable device to obtain the true value of the head pose, it is not affected by the wearing angle. The simultaneous use of multiple image acquisition devices and the three-dimensional reconstruction of the face key points ensure the accuracy of the obtained true value of the head pose. Consequently, when the true value of the head pose is applied to fields such as driver fatigue analysis, virtual reality somatosensory games, product purchase desire analysis and face verification, the data analysis results become more accurate.
Optionally, the true value acquisition module 305 includes:
a rotation matrix acquisition unit, configured to obtain, according to the face coordinate system and the coordinate system of the target image acquisition device, a rotation matrix of the face coordinate system relative to the coordinate system of the target image acquisition device; and
a true value acquisition unit, configured to obtain, according to the rotation matrix, the true value of the head pose of the target object corresponding to the target image.
Optionally, the coordinate system establishment module 304 includes:
a face plane determination unit, configured to determine a face plane based on the three-dimensional face key point information corresponding to the target image acquisition device;
a normal vector determination unit, configured to determine a normal vector of the face plane according to the face plane; and
a coordinate system establishment unit, configured to establish the face coordinate system of the target object based on the face plane and the normal vector of the face plane.
Optionally, the three-dimensional key point reconstruction module 303 is specifically configured to:
reconstruct, by triangulation, the three-dimensional face key point information corresponding to the target image acquisition device according to the face key point information corresponding to the target image acquisition device, face key point information corresponding to a reference image acquisition device, the internal parameters of the target image acquisition device, the internal parameters of the reference image acquisition device, and the external parameters between the target image acquisition device and the reference image acquisition device, the reference image acquisition device being any image acquisition device among the plurality of image acquisition devices other than the target image acquisition device; and
obtain the three-dimensional face key point information corresponding to the reference image acquisition device according to the three-dimensional face key point information corresponding to the target image acquisition device and the external parameters.
Optionally, the apparatus 300 for obtaining the true value of the head pose further includes:
a storage module, configured to store images and head pose true values that correspond to each other as image/true-value pairs.
Optionally, the image acquisition module 301 includes:
a lighting unit, configured to provide a plurality of different lighting conditions to the space in which the target object is located; and
an image acquisition unit, configured to acquire images of the target object captured by the plurality of image acquisition devices at the same moment under the same lighting condition.
Based on the method and apparatus for obtaining the true value of the head pose provided in the foregoing embodiments, the present application correspondingly further provides a device for obtaining the true value of the head pose, which includes a processor and a memory. The memory is used to store a computer program, and the processor is used to execute, according to the computer program, the method for obtaining the true value of the head pose provided in the foregoing method embodiments. In addition, the processor in the device may also be used to control a lighting device so as to provide variable lighting conditions.
Based on the method, apparatus and device for obtaining the true value of the head pose provided in the foregoing embodiments, the present application correspondingly further provides a computer-readable storage medium for storing a computer program which, when run by a processor, performs the method for obtaining the true value of the head pose provided in the foregoing method embodiments.
The above is only a specific embodiment of the present application, but the protection scope of the present application is not limited thereto. Any change or replacement that can readily be conceived by a person skilled in the art within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (12)

  1. A method for obtaining a true value of a head pose, comprising:
    acquiring images of a target object captured by a plurality of image acquisition devices at the same moment;
    labeling face key points in the images captured by each of the plurality of image acquisition devices, to obtain face key point information corresponding to each of the image acquisition devices;
    reconstructing, based on the face key point information corresponding to each of the image acquisition devices and parameters of each of the image acquisition devices, three-dimensional face key point information corresponding to each of the image acquisition devices;
    establishing a face coordinate system based on the three-dimensional face key point information corresponding to a target image acquisition device, the target image acquisition device being one of the plurality of image acquisition devices; and
    obtaining, according to a coordinate system of the target image acquisition device and the face coordinate system, a true value of the head pose of the target object corresponding to a target image, the target image being the image of the target object captured by the target image acquisition device at the moment.
  2. The method according to claim 1, wherein the obtaining, according to the coordinate system of the target image acquisition device and the face coordinate system, the true value of the head pose of the target object corresponding to the target image comprises:
    obtaining, according to the face coordinate system and the coordinate system of the target image acquisition device, a rotation matrix of the face coordinate system relative to the coordinate system of the target image acquisition device; and
    obtaining, according to the rotation matrix, the true value of the head pose of the target object corresponding to the target image.
  3. The method according to claim 1, wherein the establishing a face coordinate system based on the three-dimensional face key point information corresponding to the target image acquisition device comprises:
    determining a face plane based on the three-dimensional face key point information corresponding to the target image acquisition device;
    determining a normal vector of the face plane according to the face plane; and
    establishing the face coordinate system of the target object based on the face plane and the normal vector of the face plane.
  4. The method according to claim 1, wherein the reconstructing, based on the face key point information corresponding to each of the image acquisition devices and the parameters of each of the image acquisition devices, the three-dimensional face key point information corresponding to each of the image acquisition devices comprises:
    reconstructing, by triangulation, the three-dimensional face key point information corresponding to the target image acquisition device according to the face key point information corresponding to the target image acquisition device, face key point information corresponding to a reference image acquisition device, internal parameters of the target image acquisition device, internal parameters of the reference image acquisition device, and external parameters between the target image acquisition device and the reference image acquisition device, the reference image acquisition device being any image acquisition device among the plurality of image acquisition devices other than the target image acquisition device; and
    obtaining the three-dimensional face key point information corresponding to the reference image acquisition device according to the three-dimensional face key point information corresponding to the target image acquisition device and the external parameters.
  5. The method according to any one of claims 1 to 4, further comprising:
    storing images and head pose true values that correspond to each other as image/true-value pairs.
  6. The method according to any one of claims 1 to 4, wherein the acquiring images of the target object captured by the plurality of image acquisition devices at the same moment comprises:
    providing a plurality of different lighting conditions to a space in which the target object is located; and
    acquiring images of the target object captured by the plurality of image acquisition devices at the same moment under the same lighting condition.
  7. The method according to any one of claims 1 to 4, wherein, when the plurality of image acquisition devices capture the images of the target object, the target object sits on a seat, the seat being used to simulate a seat in a real vehicle environment; and setting parameters of the image acquisition devices relative to the seat are determined according to setting parameters of a simulated object relative to the seat in the real vehicle environment;
    wherein the simulated object comprises at least one of the following:
    an A-pillar, a B-pillar, an instrument panel, a front windshield or a left-side window.
  8. The method according to any one of claims 1 to 4, wherein the plurality of image acquisition devices comprise an infrared camera and an RGB camera.
  9. The method according to any one of claims 1 to 4, wherein internal parameters of a same type of the plurality of image acquisition devices have different value ranges.
  10. An apparatus for obtaining a true value of a head pose, comprising:
    an image acquisition module, configured to acquire images of a target object captured by a plurality of image acquisition devices at the same moment;
    a key point labeling module, configured to label face key points in the images captured by each of the plurality of image acquisition devices, to obtain face key point information corresponding to each of the image acquisition devices;
    a three-dimensional key point reconstruction module, configured to reconstruct, based on the face key point information corresponding to each of the image acquisition devices and parameters of each of the image acquisition devices, three-dimensional face key point information corresponding to each of the image acquisition devices;
    a coordinate system establishment module, configured to establish a face coordinate system based on the three-dimensional face key point information corresponding to a target image acquisition device, the target image acquisition device being one of the plurality of image acquisition devices; and
    a true value acquisition module, configured to obtain, according to a coordinate system of the target image acquisition device and the face coordinate system, a true value of the head pose of the target object corresponding to a target image, the target image being the image of the target object captured by the target image acquisition device at the moment.
  11. A device for obtaining a true value of a head pose, comprising a processor and a memory, wherein the memory is configured to store a computer program, and the processor is configured to execute, according to the computer program, the method for obtaining a true value of a head pose according to any one of claims 1 to 9.
  12. A computer-readable storage medium, configured to store a computer program which, when run by a processor, performs the method for obtaining a true value of a head pose according to any one of claims 1 to 9.
PCT/CN2022/071709 2021-12-09 2022-01-13 Head pose truth value acquisition method, apparatus and device, and storage medium WO2023103145A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111501742.9A CN114220149A (en) 2021-12-09 2021-12-09 Method, device, equipment and storage medium for acquiring true value of head posture
CN202111501742.9 2021-12-09

Publications (1)

Publication Number Publication Date
WO2023103145A1 true WO2023103145A1 (en) 2023-06-15

Family

ID=80700651

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/071709 WO2023103145A1 (en) 2021-12-09 2022-01-13 Head pose truth value acquisition method, apparatus and device, and storage medium

Country Status (2)

Country Link
CN (1) CN114220149A (en)
WO (1) WO2023103145A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200218883A1 (en) * 2017-12-25 2020-07-09 Beijing Sensetime Technology Development Co., Ltd. Face pose analysis method, electronic device, and storage medium
CN111414798A (en) * 2019-02-03 2020-07-14 沈阳工业大学 Head posture detection method and system based on RGB-D image
CN111832373A (en) * 2019-05-28 2020-10-27 北京伟景智能科技有限公司 Automobile driving posture detection method based on multi-view vision
CN113454684A (en) * 2021-05-24 2021-09-28 华为技术有限公司 Key point calibration method and device
CN113689503A (en) * 2021-10-25 2021-11-23 北京市商汤科技开发有限公司 Target object posture detection method, device, equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117351540A (en) * 2023-09-27 2024-01-05 东莞莱姆森科技建材有限公司 Bathroom mirror integrated with LED and head action recognition
CN117351540B (en) * 2023-09-27 2024-04-02 东莞莱姆森科技建材有限公司 Bathroom mirror integrated with LED and head action recognition

Also Published As

Publication number Publication date
CN114220149A (en) 2022-03-22

Similar Documents

Publication Publication Date Title
CN111414798B (en) Head posture detection method and system based on RGB-D image
CN111325823B (en) Method, device and equipment for acquiring face texture image and storage medium
CN111126272B (en) Posture acquisition method, and training method and device of key point coordinate positioning model
CN106796449A (en) Eye-controlling focus method and device
CN108256504A (en) A kind of Three-Dimensional Dynamic gesture identification method based on deep learning
CN104035557B (en) Kinect action identification method based on joint activeness
CN109559332B (en) Sight tracking method combining bidirectional LSTM and Itracker
US11945125B2 (en) Auxiliary photographing device for dyskinesia analysis, and control method and apparatus for auxiliary photographing device for dyskinesia analysis
CN103356163A (en) Fixation point measurement device and method based on video images and artificial neural network
CN110096925A (en) Enhancement Method, acquisition methods and the device of Facial Expression Image
Selim et al. AutoPOSE: Large-scale Automotive Driver Head Pose and Gaze Dataset with Deep Head Orientation Baseline.
JP2023545200A (en) Parameter estimation model training method, parameter estimation model training apparatus, device, and storage medium
CN106981091A (en) Human body three-dimensional modeling data processing method and processing device
CN110717391A (en) Height measuring method, system, device and medium based on video image
CN110148177A (en) For determining the method, apparatus of the attitude angle of camera, calculating equipment, computer readable storage medium and acquisition entity
CN111145865A (en) Vision-based hand fine motion training guidance system and method
CN116051631A (en) Light spot labeling method and system
Lüsi et al. Sase: Rgb-depth database for human head pose estimation
WO2023103145A1 (en) Head pose truth value acquisition method, apparatus and device, and storage medium
WO2019098872A1 (en) Method for displaying a three-dimensional face of an object, and device for same
US11250592B2 (en) Information processing apparatus
CN111898552A (en) Method and device for distinguishing person attention target object and computer equipment
Martin et al. An evaluation of different methods for 3d-driver-body-pose estimation
CN114067422A (en) Sight line detection method and device for driving assistance and storage medium
Cui et al. Trajectory simulation of badminton robot based on fractal brown motion

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22902590

Country of ref document: EP

Kind code of ref document: A1