CN117152839A - Human body posture data set acquisition method and data acquisition platform - Google Patents
- Publication number: CN117152839A
- Application number: CN202311099846.0A
- Authority
- CN
- China
- Prior art keywords
- camera
- rgb
- human body
- data
- millimeter wave
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V40/20 — Movements or behaviour, e.g. gesture recognition (recognition of biometric, human-related or animal-related patterns in image or video data)
- G01S13/86 — Combinations of radar systems with non-radar systems, e.g. sonar, direction finder
- G01S13/867 — Combination of radar systems with cameras
- G06V10/143 — Sensing or illuminating at different wavelengths (details of image acquisition arrangements)
- G06V10/803 — Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level, of input or preprocessed data
- G06V20/70 — Labelling scene content, e.g. deriving syntactic or semantic representations
- Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a human body posture data set acquisition method and a data acquisition platform, relating to the technical field of data set production. The technical scheme comprises S1: constructing a multi-modal human body posture estimation data acquisition platform, wherein the data acquisition platform is connected with a data processing system and comprises an RGB camera, a millimeter wave radar and an infrared camera; and S2: acquiring posture data of one or more human bodies with the data acquisition platform. In the invention, the data acquisition platform combines sensors of multiple modalities (millimeter wave radar, infrared camera and RGB camera). The infrared camera and millimeter wave radar overcome the sensitivity of the RGB camera to illumination and weather conditions, and because of the sensing principles of these two sensors the acquired data raise no privacy-leakage concern, so the system can be applied to a wider range of scenes.
Description
Technical Field
The invention relates to the technical field of data set production, and in particular to a human body posture data set acquisition method and data acquisition platform.
Background
Human perception and modeling are foundational technologies in fields such as computer vision, human-computer interaction, pervasive computing and computer graphics. Human body posture reconstruction and motion perception have a wide range of applications in real life, such as games, home automation, autonomous driving, augmented and virtual reality, animation and rehabilitation.
However, existing human body posture estimation methods rely mainly on RGB cameras and wearable inertial sensors, both of which have significant limitations in practical scenes. RGB images contain rich texture detail, which can easily leak user privacy in sensitive settings, and RGB cameras are sensitive to illumination conditions, so images readily distort under strong or weak light. Wearable inertial sensors, although cheaper and immune to illumination, must be worn by the user at all times, imposing a high compliance requirement and thus a poor user experience.
Accordingly, the present invention is directed to a method and a platform for acquiring a human body posture data set, so as to solve the above-mentioned related problems.
Disclosure of Invention
The invention aims to provide a human body posture data set acquisition method and platform that address the related problems of easily leaked user privacy and low accuracy caused by poor image quality.
The technical aim of the invention is realized by the following technical scheme: a method of acquiring a human body pose data set, comprising the steps of:
s1: constructing a multi-mode human body posture estimation data acquisition platform, wherein the data acquisition platform is connected with a data processing system and comprises an RGB camera, a millimeter wave radar and an infrared camera;
s2: acquiring one or more human body posture data by using the data acquisition platform;
s3: respectively recording time stamp information when the RGB camera, the millimeter wave radar and the infrared camera acquire the human body posture data, and performing time calibration synchronization based on the time stamp information;
s4: taking an RGB camera coordinate system as a space synchronization reference, and converting and synchronizing a millimeter wave radar and an infrared camera coordinate system to realize the space synchronization of the RGB camera, the millimeter wave radar and the infrared camera;
s5: 2D and 3D human body key point information marking is carried out on human body gesture data acquired by the millimeter wave radar and the infrared camera by taking data acquired by the RGB camera as a reference, and action categories are defined based on the human body key point marking information;
s6: and (5) storing the human body key point labeling information and the corresponding action category in the step (S5) to obtain a human body posture data set.
The invention is further provided with: the specific steps of acquiring one or more human body posture data by using the data acquisition platform comprise:
s201: creating ROS data acquisition nodes and node calling files based on a data processing system;
s202: triggering the data acquisition platform by utilizing the ROS data acquisition node to respectively acquire one or more human body postures;
s203: recording the relevant human body posture topics with the node call file to obtain the human body posture data.
The invention is further provided with: the method for performing time calibration synchronization based on the timestamp information comprises the specific steps of:
s301: creating ROS nodes and initializing a data processing system;
s302: subscribing the point cloud of the millimeter wave radar, the image and depth image of the RGB camera and the image of the infrared camera respectively in a callback mode and outputting corresponding timestamp information;
s303: the multi-sensor synchronizing data time stamp realizes the time synchronization of the multi-sensor.
The invention is further provided with: taking an RGB camera coordinate system as a space synchronization reference, converting and synchronizing the millimeter wave radar and an infrared camera coordinate system, and realizing the space synchronization of the RGB camera, the millimeter wave radar and the infrared camera comprises the following specific steps:
s401: respectively obtaining a first internal reference matrix of the RGB camera and a second internal reference matrix of the infrared camera;
s402: processing the millimeter wave radar point cloud, and converting the radar coordinate system into the camera coordinate system;
s403: converting the radar point cloud from a camera coordinate system to a pixel coordinate system based on a first internal reference matrix of the RGB camera;
s404: calculating a first external parameter of the infrared camera relative to the RGB camera;
s405: and converting pixel points in the infrared image into an RGB camera coordinate system to realize the space synchronous calibration of the infrared camera and the RGB camera.
The invention is further provided with: the specific steps of marking human body key point information of 2D and 3D on human body gesture data acquired by a millimeter wave radar and an infrared camera by taking data acquired by an RGB camera as a reference, and defining action categories based on the human body key point marking information comprise the following steps:
s501: carrying out reasoning and identification on the acquired images of the RGB camera under one view angle to obtain first 2D human skeleton key point information which is used as 2D labeling information;
s502: acquiring images of the RGB camera under the other view angle and performing reasoning calculation to obtain second 2D human skeleton key point information;
s503: and performing triangularization processing on the first 2D human skeleton key point information and the second 2D human skeleton key point information to obtain 3D human skeleton key point information serving as 3D labeling information.
The invention also provides a human body posture data set acquisition platform. The data acquisition platform comprises RGB cameras, a millimeter wave radar, an infrared camera, a fixed support and a control processing system. There are two RGB cameras, an RGB-1 camera and an RGB-2 camera, arranged at intervals at the top of the fixed support; the millimeter wave radar and the infrared camera are arranged at intervals above the RGB-1 camera and are parallel to each other. The RGB cameras, the millimeter wave radar and the infrared camera are all connected with the control processing system, and the control processing system is connected with the data processing system.
The invention is further provided with: the data processing system builds an operating environment of the data acquisition system based on the Ubuntu operating system and the ROS.
In summary, the invention has the following beneficial effects. The invention provides a human body posture data set acquisition method and data acquisition platform based on the fusion of a millimeter wave radar and an infrared camera. The platform combines sensors of multiple modalities (millimeter wave radar, infrared camera and RGB camera); the infrared camera and millimeter wave radar overcome the sensitivity of the RGB camera to illumination and weather conditions, and because of the sensing principles of these two sensors the acquired data raise no privacy-leakage concern, so the built system applies to a wider range of scenes. In addition, all sensors are synchronized in time and space based on ROS, ensuring consistency and accuracy across the acquired multi-modal data. By labeling the acquired data with 2D and 3D human body key point information in the RGB camera coordinate system and defining action categories, a rich and varied human body posture estimation data set can be constructed, providing a new source of training data for the development and performance evaluation of human body posture estimation algorithms.
Drawings
FIG. 1 is a flow chart of a method for acquiring a human body posture data set in embodiment 2 of the present invention;
FIG. 2 is a schematic structural diagram of a data acquisition platform for a human body posture data set in embodiment 1 of the present invention;
FIG. 3 is a schematic layout diagram of an acquisition platform of a method for acquiring a human body posture data set in embodiment 2 of the present invention;
FIG. 4 is a flow chart of synchronization of multiple sensors in a method for acquiring a human body posture data set according to embodiment 2 of the present invention;
fig. 5 is a time synchronization flowchart of multiple sensors in a method for acquiring a human body posture data set in embodiment 2 of the present invention.
In the figure: 1. an RGB-1 camera; 2. an RGB-2 camera; 3. an infrared camera; 4. millimeter wave radar; 5. a depth sensor; 6. a fixed bracket; 7. the processing system is controlled.
Detailed Description
The invention is described in further detail below with reference to figs. 1-5.
Example 1
This embodiment provides a data acquisition platform for a human body posture data set. The platform comprises RGB cameras, a millimeter wave radar 4, an infrared camera 3, a fixed support 6 and a control processing system 7 (not shown in the figure). There are two RGB cameras, an RGB-1 camera 1 and an RGB-2 camera 2, arranged at intervals at the top of the fixed support 6; the millimeter wave radar 4 and the infrared camera 3 are arranged at intervals above the RGB-1 camera 1 and are parallel to each other. The RGB cameras, the millimeter wave radar 4 and the infrared camera 3 are all connected with the control processing system 7, and the control processing system 7 is connected with the data processing system.
Note that the RGB cameras selected in this embodiment are Kinect DK cameras with a depth sensor 5. The resolution of the RGB-1 camera 1 is 4096×3072 with a horizontal field of view of ±90°; the resolution of the depth sensor 5 is 640×576 with a horizontal field of view of ±75°, the depth sensor 5 belonging to the RGB-1 camera 1. The millimeter wave radar 4 operates at 76-81 GHz with a horizontal field of view of ±60°; the selected infrared camera 3 has a resolution of 384×288 and a horizontal field of view of ±40°.
Example 2
A method for acquiring a human body posture data set, as shown in fig. 1 and 2, comprises the following steps:
s1: constructing a multi-mode human body posture estimation data acquisition platform, wherein the data acquisition platform is connected with a data processing system and comprises an RGB camera, a millimeter wave radar and an infrared camera; the data processing system builds an operating environment of the data acquisition system based on the Ubuntu operating system and the ROS.
In this embodiment, compared with the conventional method, the data acquisition system designed by the method uses the infrared camera and the millimeter wave radar as main sensors, and the non-invasive device can protect the privacy of the user. The detection systems of the infrared camera and the millimeter wave radar are basically different from those of the RGB camera, so that the influence of illumination conditions on imaging effects can be resisted, and the infrared camera and the millimeter wave radar can be suitable for various environments. And the equipment is non-contact equipment, does not need to be worn, and improves the experience of users to a certain extent.
S2: acquiring one or more human body posture data by using a data acquisition platform;
it should be noted that, in the ROS environment, the process of collecting data based on the sensor may be mainly divided into writing a data node, writing a node launch file, and storing a data file.
S201: creating ROS data acquisition nodes and launch node calling files based on a data processing system;
in this embodiment, the ROS data collection node may collect and issue corresponding data in a callback manner, and when receiving the data, call an image data parsing function to process the data using a preset format and then issue the data as a main body; for convenience of distinction, a theme of RGB image release is defined as "image_view", a theme of infrared image release is defined as "thermal_view", and a theme of millimeter wave radar release is defined as "mmwave_view". In order to call the three nodes defined above at the same time, a calling file of the launch node needs to be written, wherein the relevant configuration information of the three hardware devices and the node information needed to be started need to be listed in the file format of xml.
S202: different scenes are selected, the ROS data acquisition nodes are utilized to trigger the data acquisition platform, and one or more human body gestures are respectively acquired;
in this embodiment, in order to ensure diversity of collected data, different time periods (early morning, afternoon and evening) are selected to perform data collection to change illumination intensity of a collected sample; the indoor space and the outdoor space which are relatively open are respectively selected to simulate different scenes when the system is actually used; and respectively acquiring gesture data of a single person and multiple persons to adapt to richer model training tasks.
In this example, 27 single-person actions were collected in total, comprising 15 daily actions and 12 clinically recommended rehabilitation actions, including waving, bowing, clapping, etc.; the 6 multi-person action types are all interactions between 2 persons, including handshaking, passing objects, charging, etc.
S203: recording the human body posture of the related topic by using the node calling file to obtain the human body posture data.
In this embodiment, the launch file written above is started, the published topics are checked with the rostopic command, and the RGB camera topic usb_cam/image_raw, the millimeter wave radar point cloud topic mmwave/radar_scan_pcl_0 and the infrared image topic usb_cam/thermal_image_raw are subscribed respectively. The three topics are recorded with rosbag and the data are stored in bag format.
S3: respectively recording time stamp information when the RGB camera, the millimeter wave radar and the infrared camera acquire human body posture data, and performing time calibration synchronization based on the time stamp information;
it should be noted that the time synchronization of the multiple sensors in the ROS environment includes hard synchronization and soft synchronization. Hard synchronization, i.e., changing the sampling frequency of multiple sensors by a hardware trigger; the soft synchronization is then time synchronized by ROS timestamp information. The sampling frequency of the sensors adopted in the example has a large difference, so that a soft synchronization mode is adopted, and the time synchronization of multiple sensors is realized by using a message_filters library under the ROS. The time synchronization flow is shown in fig. 3.
S301: creating ROS nodes and initializing a data processing system;
s302: subscribing the point cloud of the millimeter wave radar, the image and depth image of the RGB camera and the image of the infrared camera respectively in a callback mode and outputting corresponding timestamp information;
in this embodiment, a subscriber provided by ROS is used to subscribe to the point cloud of the millimeter wave radar, the image and depth image of the RGB camera, and the infrared camera image data, and an appurmeratitume in the message_filters library is adopted as a synchronization policy. The principle of this strategy is as follows: when the information arrives at the synchronizer, the information is not output immediately, but the current information is output when all the sensor time stamps are the same so as to achieve the effect of time synchronization, and finally the publisher is used for publishing the synchronized multi-sensor data. Recording the synchronized data by using a record command of the rosbag, and naming the synchronized data as syn_data.
S303: the multi-sensor synchronizing data time stamp realizes the time synchronization of the multi-sensor.
In this embodiment, the synchronized multi-sensor data syn_data.bag is played back and the timestamp information of the sensors is checked to ensure that time-mismatched messages have been filtered out.
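As a concrete illustration of the soft-synchronization policy described above, the sketch below matches messages from three sensor streams by timestamp in plain Python. It only mimics the behaviour of the message_filters ApproximateTime policy outside ROS; the stream contents, timestamps and tolerance value are invented for the example.

```python
# Illustrative soft synchronization of multi-rate sensor streams by timestamp.
# All names and the tolerance value are assumptions for this sketch.

def approximate_sync(streams, tolerance):
    """streams: list of time-sorted (timestamp, payload) lists, one per sensor.
    Returns tuples of payloads whose timestamps all lie within `tolerance`."""
    indices = [0] * len(streams)
    matched = []
    while all(i < len(s) for i, s in zip(indices, streams)):
        stamps = [s[i][0] for i, s in zip(indices, streams)]
        if max(stamps) - min(stamps) <= tolerance:
            # all heads agree in time: emit a synchronized tuple
            matched.append(tuple(s[i][1] for i, s in zip(indices, streams)))
            indices = [i + 1 for i in indices]
        else:
            # discard the oldest head message; it can never be matched later
            lag = stamps.index(min(stamps))
            indices[lag] += 1
    return matched

radar    = [(0.00, "r0"), (0.11, "r1"), (0.35, "r2")]
rgb      = [(0.01, "c0"), (0.10, "c1"), (0.20, "c2"), (0.34, "c3")]
infrared = [(0.02, "t0"), (0.12, "t1"), (0.33, "t2")]
pairs = approximate_sync([radar, rgb, infrared], tolerance=0.03)
print(pairs)  # the lone rgb frame at t=0.20 is dropped as unmatched
```

Note how the RGB frame at t=0.20 has no radar or infrared counterpart within tolerance and is filtered out, which is exactly the "time-mismatched messages are filtered" behaviour checked in S303.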
S4: taking an RGB camera coordinate system as a space synchronization reference, and converting and synchronizing a millimeter wave radar and an infrared camera coordinate system to realize the space synchronization of the RGB camera, the millimeter wave radar and the infrared camera; FIG. 4 shows a joint calibration process of three sensors, which is mainly divided into an external reference calibration and an internal reference calibration of an RGB-1 camera.
S401: respectively obtaining a first internal reference matrix of the RGB camera and a second internal reference matrix of the infrared camera based on a camera calibration algorithm;
in this embodiment, first, a checkerboard calibration plate for camera calibration is manufactured, and an infrared camera should be a calibration plate made of different materials. A plurality of checkerboard images of known size are then captured using a camera. These images should cover various angles and attitudes to ensure accuracy of the calibration results. The upward corner of each figure is detected by adopting an image processing technology. And calculating the physical coordinates of each calibration plate corner point under the calibration plate coordinate system through the size information of the checkerboard. And calculating an internal reference matrix of the camera by using the extracted angular points and the corresponding physical coordinates.
S402: translating and rotating the millimeter wave Lei Dadian cloud through rigid transformation, and converting a radar coordinate system into a camera coordinate system taking an RGB-1 camera as a center;
in this embodiment, in order to establish a spatial fusion relationship between the RGB image and the millimeter wave Lei Dadian cloud, it is necessary to calculate a relative positional relationship between the RGB image and the millimeter wave Lei Dadian cloud, that is, calculate an external matrix of the millimeter wave radar with respect to the RGB camera. The above procedure can be expressed as a transformation of the radar coordinate system into the RGB camera coordinate system, where the transformation includes a translation and rotation transformation, i.e. the multiplication of the translation matrix and rotation matrix R is required m2c Three-dimensional coordinates (x m ,y m ,z m ) Is converted into three-dimensional coordinates (x) c ,y c ,z c ) The transformation process is shown in the formula:
the millimeter wave radar point cloud rotates around the x axis, the y axis and the z axis by alpha, beta and gamma degrees in sequence to obtain a rotation matrix, and the calculation process can be expressed as follows:
the translation matrix may be determined by measuring the relative position of the camera coordinate system and the radar coordinate system, i.e. t x 、t y 、t z Representing the Euclidean distance between the origin of the millimeter wave radar coordinate system and the origin of the camera coordinate system, respectively, the translation matrix may be represented as:
s403: converting Lei Dadian cloud from a camera coordinate system to a pixel coordinate system through perspective projection and affine transformation based on a first internal reference matrix of the RGB camera;
firstly, converting a millimeter wave radar from an RGB-1 camera coordinate system to an image coordinate system, wherein the conversion process is as follows:
wherein f is the focal length of the camera, and x and y are two-dimensional coordinates of the millimeter wave radar in an image coordinate system.
The point cloud data of the millimeter wave radar are then converted from the image coordinate system to the pixel coordinate system through an affine transformation, which can be expressed as

u = x / dx + u_0,  v = y / dy + v_0

where dx and dy represent the physical dimensions of a pixel, (u_0, v_0) is the principal point, and (u, v) is the position of the millimeter wave radar point cloud in the pixel coordinate system.
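The two-stage conversion of step S403 (perspective projection onto the image plane, then the affine map to pixel coordinates) can be sketched as below; the focal length, pixel pitch and principal point are invented for illustration.

```python
import numpy as np

def camera_to_pixel(p_cam, f, dx, dy, u0, v0):
    """Project a 3-D point in the camera frame to pixel coordinates:
    perspective projection with focal length f, then the affine map with
    pixel pitch (dx, dy) and principal point (u0, v0)."""
    x_c, y_c, z_c = p_cam
    x = f * x_c / z_c          # image-plane coordinates (metric)
    y = f * y_c / z_c
    u = x / dx + u0            # pixel coordinates
    v = y / dy + v0
    return u, v

# Hypothetical intrinsics: 4 mm focal length, 2 micrometre pixels,
# principal point at the centre of a 2048x1536 sensor.
u, v = camera_to_pixel((0.10, -0.05, 2.0), f=0.004, dx=2e-6, dy=2e-6,
                       u0=1024.0, v0=768.0)
print(u, v)
```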
S404: calculating a first external parameter of the infrared camera relative to the RGB camera;
the process is the joint calibration of the combination of the RGB-1 camera and the infrared camera, and the calculation method of the rotation matrix and the translation matrix is the same as that of the millimeter wave radar relative to the external reference matrix of the RGB-1 camera, namely, the steps S401-S403 are performed. Through the operation, the pixel points in the infrared image can be converted into the RGB camera coordinate system to realize the calibration of the pixel points and the RGB camera coordinate system.
S405: and converting pixel points in the infrared image into an RGB camera coordinate system to realize the space synchronous calibration of the infrared camera and the RGB camera.
It should be noted that, since the resolution and the angle of view of the infrared camera and the visible camera are different, the pixel sizes of the images acquired by the two sensors are also different after the two sensors are calibrated by the external reference matrix. In this example, the field angle of the RGB camera is larger than that of the infrared camera, so it is necessary to find the corresponding portions in the two modality images.
In this example, pixel-level registration of the infrared image and the RGB image is achieved through depth information. First, the RGB image and the depth information obtained by the Kinect sensor are combined to calculate the three-dimensional coordinates (x_w, y_w, z_w) of each pixel. Then, according to the measured internal reference matrices of the infrared camera and the RGB camera, all points are projected onto the infrared image plane and the RGB image plane respectively. Only the positions covered by a point in both images are retained as the corresponding portions, finally realizing the spatial synchronization of the cameras of the two modalities.
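The depth-based registration described above can be sketched as follows: a pixel of the RGB image is back-projected to 3-D using its depth value, transformed into the infrared camera's frame, and re-projected with the infrared intrinsics. All intrinsic and extrinsic values here are hypothetical.

```python
import numpy as np

def backproject(u, v, depth, K):
    """Pixel plus depth -> 3-D point in that camera's frame."""
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    return np.array([x, y, depth])

def project(p, K):
    """3-D point in a camera frame -> pixel coordinates."""
    uvw = K @ p
    return uvw[:2] / uvw[2]

# Hypothetical intrinsics for the RGB and infrared cameras, and a
# hypothetical pose (R, t) of the infrared camera relative to the RGB one.
K_rgb = np.array([[1000.0, 0, 960], [0, 1000.0, 540], [0, 0, 1]])
K_ir  = np.array([[400.0, 0, 192], [0, 400.0, 144], [0, 0, 1]])
R = np.eye(3)                    # assume parallel optical axes
t = np.array([0.06, 0.0, 0.0])   # 6 cm baseline along x

p_rgb = backproject(1000.0, 600.0, depth=2.0, K=K_rgb)  # RGB pixel + depth
p_ir  = R.T @ (p_rgb - t)        # same point in the infrared frame
u_ir, v_ir = project(p_ir, K_ir)
print(u_ir, v_ir)
```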
Through the above steps, with the RGB-1 camera as reference, the external reference matrices of the millimeter wave radar and the infrared camera relative to the RGB-1 camera are calculated to realize the spatial synchronization of the three sensors, ensuring that the spatial position of the target is consistent across all modalities under the same timestamp of the acquired human body posture data. Compared with the prior art, the method maintains a pixel-level correspondence between data of different modalities, simplifying the subsequent multi-modal data fusion step.
S5: 2D and 3D human body key point information labeling is carried out on the human body posture data acquired by the millimeter wave radar and the infrared camera, taking the data acquired by the RGB-1 camera as the reference, and action categories are defined based on the human body key point labeling information;
s501: carrying out reasoning and identification on the acquired image of the RGB-1 camera by adopting an image-based human body posture estimation algorithm HRNet to obtain first 2D human body skeleton key point information which is used as 2D labeling information;
s502: acquiring images of the RGB-2 camera under the other view angle, and performing reasoning calculation to obtain second 2D human skeleton key point information;
s503: and triangulating the first 2D human skeleton key point information and the second 2D human skeleton key point information under two visual angles to obtain 3D human skeleton key point information which is used as 3D labeling information of human posture data.
S6: and (5) storing the human body key point labeling information and the corresponding action category in the step S5 to obtain a human body posture data set.
The present embodiment is intended only to explain the present invention and is not to be construed as limiting it. After reading this specification, those skilled in the art may make modifications to this embodiment that involve no creative contribution as required, and all such modifications are protected by patent law within the scope of the claims of the present invention.
Claims (7)
1. A method of acquiring a human body pose data set, comprising the steps of:
s1: constructing a multi-mode human body posture estimation data acquisition platform, wherein the data acquisition platform is connected with a data processing system and comprises an RGB camera, a millimeter wave radar and an infrared camera;
s2: acquiring one or more human body posture data by using the data acquisition platform;
s3: respectively recording time stamp information when the RGB camera, the millimeter wave radar and the infrared camera acquire the human body posture data, and performing time calibration synchronization based on the time stamp information;
s4: taking an RGB camera coordinate system as a space synchronization reference, and converting and synchronizing a millimeter wave radar and an infrared camera coordinate system to realize the space synchronization of the RGB camera, the millimeter wave radar and the infrared camera;
s5: 2D and 3D human body key point information marking is carried out on human body gesture data acquired by the millimeter wave radar and the infrared camera by taking data acquired by the RGB camera as a reference, and action categories are defined based on the human body key point marking information;
s6: and (5) storing the human body key point labeling information and the corresponding action category in the step (S5) to obtain a human body posture data set.
2. The method according to claim 1, wherein the specific step of acquiring one or more human body posture data by using the data acquisition platform comprises:
s201: creating ROS data acquisition nodes and node calling files based on a data processing system;
s202: triggering the data acquisition platform by utilizing the ROS data acquisition node to respectively acquire one or more human body postures;
s203: recording the human body posture of the related topic by using the node calling file to obtain the human body posture data.
3. The method according to claim 1, wherein the specific step of recording time stamp information when the RGB camera, the millimeter wave radar, and the infrared camera acquire the human body posture data, respectively, and performing time calibration synchronization based on the time stamp information comprises:
s301: creating ROS nodes and initializing a data processing system;
s302: subscribing the point cloud of the millimeter wave radar, the image and depth image of the RGB camera and the image of the infrared camera respectively in a callback mode and outputting corresponding timestamp information;
s303: the multi-sensor synchronizing data time stamp realizes the time synchronization of the multi-sensor.
4. The method according to claim 1, wherein the specific steps of converting and synchronizing the millimeter wave radar and infrared camera coordinate systems by taking the RGB camera coordinate system as the spatial synchronization reference, to realize the spatial synchronization of the RGB camera, the millimeter wave radar and the infrared camera, comprise:
s401: respectively obtaining a first internal reference matrix of the RGB camera and a second internal reference matrix of the infrared camera;
s402: processing millimeter wave Lei Dadian cloud, and converting a radar coordinate system into a camera coordinate system;
s403: converting the radar point cloud from a camera coordinate system to a pixel coordinate system based on a first internal reference matrix of the RGB camera;
s404: calculating a first external parameter of the infrared camera relative to the RGB camera;
s405: and converting pixel points in the infrared image into an RGB camera coordinate system to realize the space synchronous calibration of the infrared camera and the RGB camera.
5. The method according to claim 1, wherein the specific steps of carrying out 2D and 3D human body key point information labeling on the human body posture data collected by the millimeter wave radar and the infrared camera, taking the data collected by the RGB camera as the reference, and defining action categories based on the human body key point labeling information, comprise:
s501: carrying out reasoning and identification on the acquired images of the RGB camera under one view angle to obtain first 2D human skeleton key point information which is used as 2D labeling information;
s502: acquiring images of the RGB camera under the other view angle and performing reasoning calculation to obtain second 2D human skeleton key point information;
s503: and performing triangularization processing on the first 2D human skeleton key point information and the second 2D human skeleton key point information to obtain 3D human skeleton key point information serving as 3D labeling information.
6. The data acquisition platform for the human body posture data set according to any one of claims 1 to 5, characterized in that the data acquisition platform comprises an RGB camera, a millimeter wave radar, an infrared camera, a fixed support and a control processing system, wherein the number of the RGB cameras is 2, the RGB cameras are an RGB-1 camera and an RGB-2 camera respectively, the RGB-1 camera and the RGB-2 camera are arranged at the top of the fixed support at intervals, the millimeter wave radar and the infrared camera are arranged at the top of the RGB-1 camera at intervals, the millimeter wave radar and the infrared camera are parallel to each other, the RGB cameras, the millimeter wave radar and the infrared camera are all connected with the control processing system, and the control processing system is connected with the data processing system.
7. The data acquisition platform of claim 6, wherein the data processing system builds an operating environment for the data acquisition system based on the Ubuntu operating system and ROS.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311099846.0A CN117152839A (en) | 2023-08-29 | 2023-08-29 | Human body posture data set acquisition method and data acquisition platform |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117152839A true CN117152839A (en) | 2023-12-01 |
Family
ID=88907369
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311099846.0A Pending CN117152839A (en) | 2023-08-29 | 2023-08-29 | Human body posture data set acquisition method and data acquisition platform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117152839A (en) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI722280B (en) | Controller tracking for multiple degrees of freedom | |
CN109801374B (en) | Method, medium, and system for reconstructing three-dimensional model through multi-angle image set | |
JP6458371B2 (en) | Method for obtaining texture data for a three-dimensional model, portable electronic device, and program | |
Barandiaran et al. | Real-time optical markerless tracking for augmented reality applications | |
US20140104394A1 (en) | System and method for combining data from multiple depth cameras | |
CN113396442A (en) | System and method for rendering digital assets in an artificial environment through a loosely coupled relocation service and asset management service | |
CN105488775A (en) | Six-camera around looking-based cylindrical panoramic generation device and method | |
SG189284A1 (en) | Rapid 3d modeling | |
AU2013219082A1 (en) | Image processing device, and computer program product | |
CN108564653B (en) | Human body skeleton tracking system and method based on multiple Kinects | |
CN108154533A (en) | A kind of position and attitude determines method, apparatus and electronic equipment | |
TW202025719A (en) | Method, apparatus and electronic device for image processing and storage medium thereof | |
JP2003514298A (en) | How to capture motion capture data | |
CN108430032B (en) | Method and equipment for realizing position sharing of VR/AR equipment | |
Mutis et al. | Challenges and enablers of augmented reality technology for in situ walkthrough applications. | |
CN112001926A (en) | RGBD multi-camera calibration method and system based on multi-dimensional semantic mapping and application | |
CN107977082A (en) | A kind of method and system for being used to AR information be presented | |
US11373329B2 (en) | Method of generating 3-dimensional model data | |
CN116109684B (en) | Online video monitoring two-dimensional and three-dimensional data mapping method and device for variable electric field station | |
JP2023164844A (en) | Information processing apparatus, information processing method, and program | |
CN103176606A (en) | Plane interaction system and method based on binocular vision recognition | |
Afif et al. | Orientation control for indoor virtual landmarks based on hybrid-based markerless augmented reality | |
CN111292411A (en) | Real-time dynamic human body three-dimensional reconstruction method based on inward looking-around multiple RGBD cameras | |
CN117152839A (en) | Human body posture data set acquisition method and data acquisition platform | |
CN116485953A (en) | Data processing method, device, equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||