Skeleton information action feature extraction method based on stereo camera
Technical Field
The invention relates to a skeleton information action feature extraction method, in particular to a skeleton information action feature extraction method based on a stereo camera.
Background
Human motion characteristics are studied in biomedical engineering, physiotherapy, medical diagnosis and rehabilitation. Detection of human motion characteristics is in wide demand in places such as nursing homes and hospitals, and also has many applications in fields such as security protection and battlefield reconnaissance. Driven by developments in motion performance analysis, visual monitoring and biometrics, methods for extracting and analyzing different human motions have received wide attention.
Currently, the most common method for detecting human motion characteristics is to use a sequence of visual images. However, visual perception of human motion is affected by distance, lighting changes, clothing changes and occlusion of body parts by one another, all of which reduce detection performance. Radar is an electromagnetic sensor with a long operating range; it can work both day and night and can penetrate walls and the ground, so it is also commonly used for detecting human motion characteristics. However, traditional radar operates at a relatively low frequency, at which the micro-Doppler effect produced by human motion is very small, so it is difficult to detect human motion characteristics with high resolution.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a method for representing action features from skeleton information extracted by a stereo camera, which is highly accurate and convenient to apply.
The purpose of the invention can be realized by the following technical scheme: a skeleton information action feature extraction method based on a stereo camera comprises the following steps:
(1) acquiring coordinates of each joint point of the human body by using a stereo camera;
(2) determining a human body coordinate system;
(3) calculating a first position matrix from a camera coordinate system to a human body coordinate system, wherein the first position matrix comprises a rotating part matrix and a translating part matrix;
(4) calculating a second position matrix from each joint point of the human body to a camera coordinate system;
(5) calculating the relative position of each joint point of the human body relative to a human body coordinate system, and taking the relative position as a part of the required characteristics of motion recognition;
(6) smoothing the rotation-translation matrix between two adjacent frames, and accumulating the variation between two adjacent frames to obtain the translation variation and the rotation variation of the human body coordinate system from the beginning of the action to the current moment;
(7) generating a feature vector of the current moment.
The human body coordinate system is defined as follows: the origin O is the intersection point of the line connecting the left shoulder L and the right shoulder R of the human body with the human body symmetry axis; the x axis is the ray from the origin O to the left shoulder L; the y axis is the ray from the origin O to the human body center T; and the z axis is perpendicular to the plane containing the x axis and the y axis and follows the right-hand rule.
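The construction of the human body coordinate system can be sketched as follows. This is a minimal illustration, not the patented computation: it assumes the symmetry axis crosses the shoulder line at its midpoint, and the function name `body_frame` and its helpers are hypothetical.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return [a[i] - b[i] for i in range(3)]


def unit(v):
    """Scale a 3D vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def cross(a, b):
    """Right-handed cross product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def body_frame(left_shoulder, right_shoulder, center):
    """Return (origin, x_axis, y_axis, z_axis) in camera coordinates."""
    # Assumption: the origin O is the midpoint of the shoulder line.
    origin = [(left_shoulder[i] + right_shoulder[i]) / 2.0 for i in range(3)]
    x_axis = unit(sub(left_shoulder, origin))   # ray from O to L
    y_axis = unit(sub(center, origin))          # ray from O to T
    z_axis = unit(cross(x_axis, y_axis))        # right-hand rule
    return origin, x_axis, y_axis, z_axis
```

With measured data the x and y rays are generally not exactly perpendicular, which is why the text later orthogonalizes the resulting rotating part matrix.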
The step (3) is specifically as follows: let the coordinates of the left shoulder L of the human body acquired by the stereo camera be $({}^{C}X_L, {}^{C}Y_L, {}^{C}Z_L)$, the coordinates of the right shoulder R of the human body be $({}^{C}X_R, {}^{C}Y_R, {}^{C}Z_R)$, and the coordinates of the human body center T be $({}^{C}X_T, {}^{C}Y_T, {}^{C}Z_T)$. The first position matrix is assembled from the rotating part matrix and the translating part matrix ${}^{O}P_{CORG}$, wherein ${}^{O}P_{CORG} = [{}^{C}O_x, {}^{C}O_y, {}^{C}O_z]^{T}$.
The rotating part matrix is calculated according to formula (1),
wherein $\overrightarrow{RL}$ represents the vector from the right shoulder to the left shoulder of the human body, $\overrightarrow{TO}$ represents the vector from the human body center to the origin of the human body coordinate system, and $\overrightarrow{RT}$ represents the vector from the right shoulder of the human body to the human body center.
The step (4) is specifically as follows: the rotating part matrix, which is calculated from the acquired point cloud data on the basis of the three points left shoulder, right shoulder and human body center, is not necessarily orthogonal, so it needs to be orthogonalized. Orthogonalizing the rotating part matrix in the first position matrix yields the second position matrix, wherein U and λ satisfy a relation determined by the rotating part matrix.
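To make the need for orthogonalization concrete, the sketch below repairs a nearly orthogonal 3x3 matrix with Gram-Schmidt. This is a substituted technique chosen for illustration, not the U and λ based formula of the method, and the name `orthonormalize_rows` is hypothetical. It assumes the rows of the matrix are the approximate x, y and z axes.

```python
import math


def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(a[i] * b[i] for i in range(3))


def unit(v):
    """Scale a 3D vector to unit length."""
    n = math.sqrt(dot(v, v))
    return [c / n for c in v]


def cross(a, b):
    """Right-handed cross product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def orthonormalize_rows(rot):
    """Gram-Schmidt on the first two rows of a 3x3 near-rotation matrix.

    The first row is kept as the reference direction, the second row is
    made perpendicular to it, and the third row is rebuilt as their cross
    product so that the result is right-handed (the input third row is
    ignored).
    """
    x = unit(rot[0])
    d = dot(rot[1], x)
    y = unit([rot[1][i] - d * x[i] for i in range(3)])
    z = cross(x, y)
    return [x, y, z]
```

Gram-Schmidt favors the first row (here the shoulder direction); a symmetric scheme would spread the correction over all axes instead.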
The step (5) is specifically as follows: the coordinates of each joint point of the human body are converted into the human body coordinate system through coordinate transformation, so that the relative position of each joint point with respect to the human body coordinate system is obtained and the influence of viewing-angle changes on action recognition is removed. However, for motion sequences in which the limb movement is not distinctive, the change of the human body coordinate system during the motion needs to be added in order to distinguish actions such as rotation and jumping.
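The coordinate transformation of the joint points can be illustrated as below. This is a sketch under an assumed convention: the rows of `axes` are the orthonormal body axes expressed in camera coordinates and `origin_cam` is the body origin O in camera coordinates. The function name `to_body_frame` is hypothetical.

```python
def to_body_frame(point_cam, origin_cam, axes):
    """Express a camera-frame point in the human body coordinate system.

    Assumed convention: p_body = A * (p_cam - O), where the rows of A are
    the body x, y and z axes written in camera coordinates.
    """
    d = [point_cam[i] - origin_cam[i] for i in range(3)]
    return [sum(axis[i] * d[i] for i in range(3)) for axis in axes]


# Usage: a joint one unit along the body x axis maps to (1, 0, 0)
# regardless of where the body origin sits in the camera frame.
axes = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
left_shoulder_body = to_body_frame([1.0, 0.0, 2.0], [0.0, 0.0, 2.0], axes)
```

Because every joint is expressed relative to the body's own axes, moving the camera changes `origin_cam` and `axes` but not the returned coordinates.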
The smoothing treatment is specifically as follows: the obtained rotation-translation matrices between two adjacent frames are arithmetically averaged. Because the moving human body is treated as equivalent to a rigid body, its motion between two adjacent frames is gradual and does not change abruptly, so the obtained variation between two adjacent frames is smoothed.
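One possible reading of this arithmetic averaging is a short sliding window over the inter-frame matrices, sketched below. The window length of 2 and the names `average_transforms` and `smooth_sequence` are assumptions made for illustration; the text does not specify how many matrices are averaged.

```python
def average_transforms(transforms):
    """Element-wise arithmetic mean of equally sized matrices."""
    n = len(transforms)
    rows = len(transforms[0])
    cols = len(transforms[0][0])
    return [[sum(t[r][c] for t in transforms) / n for c in range(cols)]
            for r in range(rows)]


def smooth_sequence(deltas, window=2):
    """Replace each inter-frame matrix by the mean of the last `window` ones.

    The first entries use a shorter window because no earlier matrices
    exist yet.
    """
    smoothed = []
    for i in range(len(deltas)):
        start = max(0, i - window + 1)
        smoothed.append(average_transforms(deltas[start:i + 1]))
    return smoothed
```

An element-wise mean of rotation matrices is only approximately a rotation; the approximation is reasonable here because inter-frame changes are small.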
The step (7) is specifically as follows: the relative position obtained in step (5) and the translation variation and rotation variation obtained in step (6) are taken together as the feature vector of the current moment.
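Assembling the feature vector amounts to concatenating these quantities. The sketch below assumes three coordinates per joint, a 3-component accumulated translation and a 3-component accumulated rotation; the ordering of the components and the name `build_feature_vector` are hypothetical.

```python
def build_feature_vector(joint_positions_body, translation_change, rotation_change):
    """Concatenate joint positions and body-frame changes into one flat list.

    joint_positions_body: list of [x, y, z] joint positions in the body frame.
    translation_change:   accumulated [dx, dy, dz] of the body frame.
    rotation_change:      accumulated [alpha, beta, gamma] of the body frame.
    """
    feature = []
    for joint in joint_positions_body:
        feature.extend(joint)
    feature.extend(translation_change)
    feature.extend(rotation_change)
    return feature


# Usage with two joints: 2 * 3 position values plus 6 frame-change values.
example = build_feature_vector([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]],
                               [0.2, 0.0, 0.1], [0.0, 0.3, 0.0])
```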
Compared with the prior art, the invention has the following advantages:
(1) according to the invention, a human body coordinate system is established, so that the scene information does not need to be modeled, and the extra error caused by calculating a horizontal or vertical reference coordinate system is avoided;
(2) after the human body is identified, the method removes the influence of viewing-angle changes on action recognition by calculating the relative position of the human body joint points with respect to the human body coordinate system;
(3) in the actual test, the method reduces the error and improves the accuracy of action identification by accumulating the variable quantity between two adjacent frames;
(4) the stereo camera used by the invention compensates for the loss of spatial information in a traditional monocular camera, is increasingly widely used because of its price advantage over binocular cameras and TOF cameras, and improves the accuracy and capability of skeleton recognition.
Drawings
Fig. 1 is a flowchart of a skeleton information action feature extraction method based on a stereo camera according to the present application;
Fig. 2 is a schematic diagram of the human body coordinate system;
Fig. 3 is a schematic diagram illustrating the transformation of the human body from point A to point B at time t.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments.
As shown in fig. 1, a skeleton information action feature extraction method based on a stereo camera includes the following steps:
(1) acquiring coordinates of each joint point of the human body by using a stereo camera;
(2) determining a human body coordinate system, wherein the origin O of the human body coordinate system is the intersection point of the line connecting the left shoulder L and the right shoulder R of the human body with the human body symmetry axis, the x-axis is the ray from the origin O to the left shoulder L, the y-axis is the ray from the origin O to the human body center T, and the z-axis is perpendicular to the plane containing the x-axis and the y-axis and follows the right-hand rule, as shown in Fig. 2.
(3) Calculating a first position matrix from the camera coordinate system to the human body coordinate system, wherein the first position matrix comprises a rotating part matrix and a translating part matrix. Let the coordinates of the left shoulder L of the human body acquired by the stereo camera be $({}^{C}X_L, {}^{C}Y_L, {}^{C}Z_L)$, the coordinates of the right shoulder R of the human body be $({}^{C}X_R, {}^{C}Y_R, {}^{C}Z_R)$, and the coordinates of the human body center T be $({}^{C}X_T, {}^{C}Y_T, {}^{C}Z_T)$. The first position matrix is then assembled from the rotating part matrix and the translating part matrix ${}^{O}P_{CORG}$, wherein ${}^{O}P_{CORG} = [{}^{C}O_x, {}^{C}O_y, {}^{C}O_z]^{T}$.
The rotating part matrix is calculated according to formula (1),
wherein $\overrightarrow{RL}$ represents the vector from the right shoulder to the left shoulder of the human body, $\overrightarrow{TO}$ represents the vector from the human body center to the origin of the human body coordinate system, and $\overrightarrow{RT}$ represents the vector from the right shoulder of the human body to the human body center.
(4) Since the rotating part matrix, which is calculated from the acquired point cloud data on the basis of the three points left shoulder, right shoulder and human body center, is not necessarily orthogonal, the rotating part matrix is orthogonalized to obtain a second position matrix with higher accuracy, wherein U and λ satisfy a relation determined by the rotating part matrix.
(5) The coordinates of each joint point of the human body are converted into the human body coordinate system through coordinate transformation to obtain the relative position of each joint point with respect to the human body coordinate system. This relative position is used as part of the features required for action recognition, which removes the influence of viewing-angle changes on action recognition. However, for motion sequences in which the limb movement is not distinctive, the change of the human body coordinate system during the motion needs to be added in order to distinguish actions such as rotation and jumping.
(6) Because the moving human body is treated as equivalent to a rigid body, its motion between two adjacent frames is gradual and does not change abruptly. The obtained rotation-translation matrices between two adjacent frames are therefore smoothed, specifically by taking their arithmetic average. After the smoothing treatment, the variation between two adjacent frames is accumulated to obtain the translation variation and the rotation variation of the human body coordinate system from the beginning of the action to the current moment.
Assume that the human body is located at point A at time t and at point B at time t+1, and let α, β and γ be the rotation angles of the human body around the x-axis, the y-axis and the z-axis, respectively, as shown in Fig. 3. The rotation matrix between the two positions is expressed in terms of these angles, and the Euler angles can be calculated from it. The variation obtained in this way is a change between two frames and is therefore small; for action recognition, these variations need to be accumulated to calculate the translation and rotation changes of the human body coordinate system from the start of the action to the current moment.
(7) The relative position obtained in step (5) and the translation variation and rotation variation obtained in step (6) are taken together as the feature vector of the current moment.
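The Euler-angle extraction and accumulation described in step (6) can be sketched as follows. The rotation order is not recoverable from the text, so the sketch assumes the common composition R = Rz(γ)·Ry(β)·Rx(α); the function names are hypothetical, and summing per-frame Euler angles is an approximation that relies on the inter-frame rotations being small.

```python
import math


def rotation_zyx(alpha, beta, gamma):
    """Build R = Rz(gamma) * Ry(beta) * Rx(alpha) (assumed convention)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [[cb * cg, sa * sb * cg - ca * sg, ca * sb * cg + sa * sg],
            [cb * sg, sa * sb * sg + ca * cg, ca * sb * sg - sa * cg],
            [-sb, sa * cb, ca * cb]]


def euler_zyx(rot):
    """Recover (alpha, beta, gamma) from a matrix built by rotation_zyx.

    Valid away from the singularity at beta = +/- 90 degrees.
    """
    beta = math.atan2(-rot[2][0], math.hypot(rot[0][0], rot[1][0]))
    alpha = math.atan2(rot[2][1], rot[2][2])
    gamma = math.atan2(rot[1][0], rot[0][0])
    return alpha, beta, gamma


def accumulate(deltas):
    """Sum per-frame changes from the start of the action.

    deltas: list of (translation, rotation_matrix) pairs, one per frame pair.
    Returns [dx, dy, dz, alpha, beta, gamma] accumulated so far.
    """
    total = [0.0] * 6
    for translation, rot in deltas:
        angles = euler_zyx(rot)
        for i in range(3):
            total[i] += translation[i]
            total[3 + i] += angles[i]
    return total
```

The six accumulated values correspond to the translation variation and rotation variation that enter the feature vector in step (7).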