CN111435535A - Method and device for acquiring joint point information - Google Patents

Info

Publication number
CN111435535A
CN111435535A
Authority
CN
China
Prior art keywords
joint point
coordinate information
image
joint
information
Prior art date
Legal status
Granted
Application number
CN201910031696.7A
Other languages
Chinese (zh)
Other versions
CN111435535B (en)
Inventor
张盼
张杨
Current Assignee
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date
Filing date
Publication date
Application filed by Hitachi Ltd
Priority to CN201910031696.7A
Publication of CN111435535A
Application granted
Publication of CN111435535B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/292 Multi-camera tracking
    • G06T 7/50 Depth or shape recovery
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The invention provides a method and a device for acquiring joint point information. The method and the device comprehensively utilize the joint point coordinate information from a plurality of depth cameras, so that the loss of individual joint points in the images of some cameras can be compensated, improving the integrity and accuracy of the joint point information and providing better data support for subsequent gait analysis based on the joint point information. In addition, embodiments of the invention improve the correctness and integrity of the joint point information through completion and/or correction processing of the joint points. Furthermore, based on the precision of the depth information of the target object in each frame image, embodiments of the invention assign each joint point position coordinate a weight positively correlated with that precision and integrate the position coordinates according to these weights, thereby further improving the accuracy of the joint point position coordinates.

Description

Method and device for acquiring joint point information
Technical Field
The invention relates to the technical field of video image processing, in particular to a method and a device for acquiring joint point information.
Background
In the field of video image processing, one application scenario is to acquire the gait of a target object (such as a pedestrian or an animal) from video images and then obtain health information about the target object by analyzing that gait. Gait generally refers to the manner and style in which a subject walks. Taking a pedestrian as an example, a person's gait is the outward manifestation, while walking, of his or her physical structure and function, motor control system, personal behavior, and psychological activity. Therefore, in fields such as rehabilitation, medical treatment, and elderly care, gait analysis can provide an important basis for diagnosing diseases and determining rehabilitation and treatment plans.
Gait characteristics comprise various parameters, such as kinematic parameters (e.g., the spatial position coordinates, angles, velocities, angular velocities, and accelerations of joint points), dynamics parameters (ground reaction forces, joint forces, moments, etc.), myoelectric signals, and energy consumption. The spatial position coordinates are the basic data for kinematic analysis; herein, the spatial position coordinates of the joint points are of primary concern.
A prior-art gait analysis method based on digital video and digital image processing attaches marker points to a human body, captures video of a walking pedestrian with a single camera, stores the digital video file on a hard disk, and then splits the file into an image sequence to facilitate subsequent image processing. Through image processing, the coordinates of the marker points in each image are identified, and the kinematic parameters of the gait can be calculated from those coordinates.
The above method tracks different pedestrians through image analysis of the manually attached marker points (a template matching technique) and calculates motion information (a motion estimation technique). In practical application, it generally has the following problems:
When a single camera captures walking people, each person's angle relative to the camera changes as they walk, so some joint points are easily lost, leaving the gait information incomplete and impossible to analyze effectively. In addition, when several people are present in the scene and stand close to one another, the precision of joint point detection decreases and joint points are easily misassigned, producing erroneous gait information that cannot be analyzed effectively.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present invention is to provide a method and a device for acquiring joint point information, so as to improve the accuracy and integrity of the joint point information.
To solve the foregoing technical problem, an embodiment of the present invention provides a method for acquiring joint point information, including:
acquiring videos of a target object in an observation area shot by at least 2 cameras, wherein the at least 2 cameras are located at different positions of the observation area, and shooting the observation area through different shooting angles;
respectively acquiring the joint points of the target object in each frame of image according to the video shot by each camera and the corresponding depth information to obtain the coordinate information of the joint points in the image;
and integrating the joint point coordinate information in the images shot by different cameras at the same time to obtain the integrated joint point coordinate information.
Preferably, the step of respectively obtaining the joint points of the target object in each frame of image according to the video shot by each camera and the corresponding depth information to obtain the coordinate information of the joint points in the image includes:
for each frame of image of the video shot by each camera, the following processing is respectively executed:
detecting the joint point of the target object in the image based on a target object joint point detection algorithm, and obtaining joint point coordinate information in the image by combining depth information of a depth camera, wherein the joint point coordinate information comprises the spatial position of the detected joint point;
predicting a predicted spatial position of each joint at the image capturing time based on motion information of each joint detected in a history image before the image;
according to the predicted spatial position, performing completion processing and/or correction processing on joint point coordinate information in the image;
wherein the completion processing is as follows: when the joint point coordinate information in the image does not contain the complete joint point of the target object, determining a missing joint point in the image, and supplementing the predicted spatial position of the missing joint point into the joint point coordinate information in the image;
the correction processing is as follows: and correcting the coordinate information of the joint points in the image according to the difference between the predicted spatial position of each joint point and the spatial position of the corresponding joint point detected in the current frame image, wherein when the difference exceeds a preset threshold, the predicted spatial position of the joint point is used for replacing the spatial position of the detected joint point.
Preferably, the step of integrating the joint point coordinate information in the images captured by different cameras at the same time includes:
and accumulating the joint point coordinate information of the same joint point in the images shot by different cameras at the same moment and calculating the average value to obtain the integrated joint point coordinate information.
Preferably, when acquiring the joint point of the target object in each frame of image, further acquiring depth value precision information of the target object in the image, where the depth value precision information is used to indicate the precision of the depth information of the target object acquired by the camera when shooting the image;
the step of integrating the joint point coordinate information in the images shot by different cameras at the same time comprises the following steps:
according to the weight value of each piece of joint point coordinate information, carrying out weighted summation on the joint point coordinate information in the images shot by different cameras at the same time to obtain the integrated joint point coordinate information, wherein the weight value of the joint point coordinate information in the images is positively correlated with the precision represented by the depth value precision information of the target object in the images.
Preferably, the depth value precision information includes at least one of the following information: the spatial distance between the target object and the camera; a movement included angle between the movement direction of the target object and the optical axis of the camera;
before weighted summation of joint coordinate information in images taken by different cameras at the same time, the method further comprises:
determining a weight of the joint point coordinate information in the image according to the depth value precision information of the target object in the image, wherein the weight of the joint point coordinate information is positively correlated with a first proximity degree and/or a second proximity degree, the first proximity degree being the proximity of the spatial distance to the optimal precision distance of the camera, and the second proximity degree being the proximity of the motion included angle to 90 degrees.
Preferably, the depth value precision information includes at least one of the following information: the spatial distance between the target object and the camera; a movement included angle between the movement direction of the target object and the optical axis of the camera;
before weighted summation of joint coordinate information in images taken by different cameras at the same time, the method further comprises:
determining the weight of the joint point coordinate information in the image according to the depth value precision information of the target object in the image and the joint point coordinate information attribute, wherein the joint point coordinate information attribute comprises: original joint point coordinate information detected from the image; and non-original joint point coordinate information obtained through completion or correction processing; wherein the weight of the joint point coordinate information is positively correlated with the first proximity degree and/or the second proximity degree; the weight of the joint point coordinate information is positively correlated with the value of the joint point coordinate information attribute; the first proximity degree is the proximity of the spatial distance to the optimal precision distance of the camera, and the second proximity degree is the proximity of the motion included angle to 90 degrees; and the value of the joint point coordinate information attribute for original joint point coordinate information is higher than that for non-original joint point coordinate information.
Preferably, the step of integrating the joint point coordinate information in the images shot by different cameras at the same time according to the weight of each joint point coordinate information includes:
aiming at the same joint point in images shot by different cameras at the same moment:
according to the weight of the joint point coordinate information, weighting and summing the joint point coordinate information of the joint point shot by different cameras to obtain the integrated joint point coordinate information of the joint point; or,
selecting the joint point coordinate information with the highest weight from the joint point coordinate information of the joint point shot by different cameras as the integrated joint point coordinate information.
The embodiment of the present invention further provides an apparatus for acquiring joint point information, including:
the video acquisition unit is used for acquiring videos of a target object in an observation area shot by at least 2 cameras, wherein the at least 2 cameras are positioned at different positions of the observation area and shoot the observation area through different shooting angles;
the joint point detection unit is used for respectively acquiring joint points of the target object in each frame of image according to the video shot by each camera and the corresponding depth information to obtain joint point coordinate information in the image;
and the information integration unit is used for integrating the joint point coordinate information in the images shot by different cameras at the same time to obtain the integrated joint point coordinate information.
Preferably, the joint point detecting unit is further configured to perform the following processing for each frame of image of the video captured by each camera:
detecting the joint point of the target object in the image, and obtaining joint point coordinate information in the image by combining corresponding depth information, wherein the joint point coordinate information comprises the spatial position of the detected joint point;
predicting a predicted spatial position of each joint at the image capturing time based on motion information of each joint detected in a history image before the image;
according to the predicted spatial position, performing completion processing and/or correction processing on joint point coordinate information in the image;
wherein the completion processing is as follows: when the joint point coordinate information in the image does not contain the complete joint point of the target object, determining a missing joint point in the image, and supplementing the predicted spatial position of the missing joint point into the joint point coordinate information in the image;
the correction processing is as follows: and correcting the coordinate information of the joint points in the image according to the difference between the predicted spatial position of each joint point and the spatial position of the corresponding joint point detected in the current frame image, wherein when the difference exceeds a preset threshold, the predicted spatial position of the joint point is used for replacing the spatial position of the detected joint point.
Preferably, the information integrating unit is further configured to accumulate joint point coordinate information of the same joint point in images captured by different cameras at the same time and calculate an average value to obtain integrated joint point coordinate information.
Preferably, the joint point detecting unit is further configured to, when acquiring a joint point of the target object in each frame of image, further acquire depth value precision information of the target object in the image, where the depth value precision information is used to indicate precision of depth information of the target object acquired by the camera when capturing the image;
the information integration unit is further configured to perform weighted summation on the joint coordinate information in the images shot by different cameras at the same time according to the weight of each joint coordinate information to obtain integrated joint coordinate information, where the weight of the joint coordinate information in the image is positively correlated with the precision represented by the depth value precision information of the target object in the image.
Preferably, the depth value precision information includes at least one of the following information: the spatial distance between the target object and the camera; a movement included angle between the movement direction of the target object and the optical axis of the camera;
the acquisition device further includes:
the first weight determining unit is used for determining the weight of the joint point coordinate information in the image according to the depth value precision information of the target object in the image, wherein the weight of the joint point coordinate information is positively correlated with a first proximity degree and/or a second proximity degree, the first proximity degree being the proximity of the spatial distance to the optimal precision distance of the camera, and the second proximity degree being the proximity of the motion included angle to 90 degrees.
Preferably, the depth value precision information includes at least one of the following information: the spatial distance between the target object and the camera; a movement included angle between the movement direction of the target object and the optical axis of the camera;
the acquisition device further includes:
a second weight determining unit, configured to determine the weight of the joint point coordinate information in the image according to the depth value precision information of the target object in the image and the joint point coordinate information attribute, where the joint point coordinate information attribute comprises: original joint point coordinate information detected from the image; and non-original joint point coordinate information obtained through completion or correction processing; wherein the weight of the joint point coordinate information is positively correlated with the first proximity degree and/or the second proximity degree; the weight of the joint point coordinate information is positively correlated with the value of the joint point coordinate information attribute; the first proximity degree is the proximity of the spatial distance to the optimal precision distance of the camera, and the second proximity degree is the proximity of the motion included angle to 90 degrees; and the value of the joint point coordinate information attribute for original joint point coordinate information is higher than that for non-original joint point coordinate information.
Preferably, the information integration unit is further configured to, for a same joint point in images captured by different cameras at a same time:
according to the weight of the joint point coordinate information, weighting and summing the joint point coordinate information of the joint point shot by different cameras to obtain the integrated joint point coordinate information of the joint point; or,
selecting the joint point coordinate information with the highest weight from the joint point coordinate information of the joint point shot by different cameras as the integrated joint point coordinate information.
The embodiment of the present invention further provides another apparatus for acquiring joint point information, including: a memory, a processor, and a computer program stored on the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the method for acquiring joint point information described above.
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method for acquiring joint point information described above.
Compared with the prior art, the method and the device for acquiring joint point information comprehensively utilize joint point coordinate information from a plurality of cameras, so that the loss of individual joint points in the images of some cameras can be compensated, improving the integrity and accuracy of the joint point information and providing better data support for subsequent gait analysis based on that information. In addition, embodiments of the invention improve the correctness and integrity of the joint point information through completion and/or correction processing of the joint points. Furthermore, based on the precision of the depth information of the target object in each frame image, embodiments of the invention assign each joint point position coordinate a weight positively correlated with that precision and integrate the position coordinates according to these weights, thereby further improving the accuracy of the joint point position coordinates.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without inventive effort.
Fig. 1 is a schematic flowchart of a method for acquiring joint point information according to an embodiment of the present invention;
Fig. 2 is a schematic view of an application scenario of the method for acquiring joint point information according to an embodiment of the present invention;
Fig. 3 is another schematic flowchart of a method for acquiring joint point information according to an embodiment of the present invention;
Fig. 4 is an exemplary diagram of the correspondence between the accuracy of depth information obtained by a camera and distance according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an apparatus for acquiring joint point information according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of an apparatus for acquiring joint point information according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an apparatus for acquiring joint point information according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of an apparatus for acquiring joint point information according to an embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments. In the following description, specific details such as specific configurations and components are provided only to help the full understanding of the embodiments of the present invention. Thus, it will be apparent to those skilled in the art that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the invention. In addition, descriptions of well-known functions and constructions are omitted for clarity and conciseness.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In various embodiments of the present invention, it should be understood that the sequence numbers of the following processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
As described in the background section, the prior art generally suffers from low accuracy and integrity when acquiring the joint point information of a target object. To solve this problem, an embodiment of the present invention provides a method for acquiring joint point information. Referring to fig. 1, the method includes:
and step 11, acquiring videos of the target object in the observation area shot by at least 2 cameras.
In order to improve the accuracy and integrity of the joint point information, the embodiment of the invention uses two or more cameras to shoot an observation area. Each camera can be fixed at a different position in the observation area and monitors the area from a different shooting angle, so that videos of the target objects shot by the multiple cameras are obtained respectively. Fig. 2 shows an application scenario of the embodiment of the present invention, in which four cameras 23 to 26 are arranged in the observation area 20; the four cameras are located at different positions of the observation area and shoot target objects, such as pedestrians 21 to 22, in the observation area 20 from different shooting angles.
In the embodiment of the invention, each camera can be a depth camera that acquires a color live image and corresponding depth information. The joint points of the target object are obtained from the color live image using a joint point detection algorithm, and the spatial coordinate information of the joint points is obtained by combining the depth information. Here, the camera may specifically be a binocular camera.
And step 12, respectively acquiring the joint points of the target object in each frame of image according to the video shot by each camera and the corresponding depth information to obtain the coordinate information of the joint points in the image.
Here, the embodiment of the present invention obtains the coordinate information of the joint point of the target object under each camera by using the joint point detection algorithm based on the video shot by each camera, respectively. For example, joint points (such as skeletal joint points) of people in an image can be acquired by a joint point detection algorithm based on a two-dimensional color image and depth information acquired by a binocular camera; joint point coordinate information (coordinates of spatial positions) of each joint point can be acquired based on the three-dimensional depth information. The selection of a specific joint point can be preset according to the requirements of subsequent applications, for example, gait analysis usually requires extraction of a plurality of joint points on a person (one application requirement is 17 joint points, including joint points at the wrist, ankle, knee, shoulder, etc.).
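As an illustration only (the patent does not prescribe a specific algorithm), back-projecting a detected 2D joint pixel and its depth value into a 3D spatial coordinate could look like the following sketch, assuming pinhole intrinsics fx, fy, cx, cy from camera calibration and a detector that returns per-joint pixel coordinates; all function and variable names are hypothetical.

```python
import numpy as np

def backproject_joints(joints_2d, depth_map, fx, fy, cx, cy):
    """Back-project detected 2D joint pixels into 3D camera-frame
    coordinates with the pinhole model. joints_2d maps a joint name to
    (u, v) pixel coordinates; depth_map is an HxW array of metric depth."""
    joints_3d = {}
    for name, (u, v) in joints_2d.items():
        z = float(depth_map[int(v), int(u)])  # depth value at the joint pixel
        if z <= 0:
            continue  # invalid depth reading: treat the joint as missing
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        joints_3d[name] = np.array([x, y, z])
    return joints_3d
```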
And step 13, integrating the joint point coordinate information in the images shot by different cameras at the same time to obtain the integrated joint point coordinate information.
Here, the embodiment of the present invention acquires joint point coordinate information of a target object based on videos of a plurality of cameras, respectively, and then integrates the joint point coordinate information. The comprehensive processing may specifically be that joint point coordinate information of the same joint point in images shot by different cameras at the same time is accumulated and an average value is calculated to obtain the comprehensive joint point coordinate information.
For example, assume there are 3 cameras and 8 joint points of interest on the target object, joint points 1 to 8. The coordinates of joint points 1 to 8 are acquired from the image shot by camera 1 at time tn; the coordinates of joint points 1 to 7 are acquired from the image shot by camera 2 at time tn, with the coordinates of joint point 8 missing; and the coordinates of joint points 1 to 8 are acquired from the image shot by camera 3 at time tn. One possible way of integration is: for joint points 1 to 7, accumulate the corresponding coordinates from the 3 cameras and divide by 3 to obtain the integrated coordinates; for joint point 8, accumulate the corresponding coordinates from camera 1 and camera 3 and divide by 2 to obtain the integrated coordinates.
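A minimal sketch of this averaging, under the assumption that each camera's detection result is a mapping from joint name to a 3D coordinate already expressed in a common coordinate system (illustrative names, not from the patent):

```python
import numpy as np

def integrate_by_average(per_camera_joints):
    """Average each joint over the cameras that detected it. With the
    example above, joints 1-7 are averaged over 3 cameras while joint 8,
    missing from camera 2, is averaged over the remaining 2."""
    joint_names = set().union(*(j.keys() for j in per_camera_joints))
    return {
        name: np.mean([j[name] for j in per_camera_joints if name in j], axis=0)
        for name in joint_names
    }
```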
In addition, in the embodiment of the present invention, since the installation position and the shooting angle of each camera are preset, before integrating the joint coordinate information under each camera, it is necessary to convert all the joint coordinate information into the same coordinate system and then perform the integration process.
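Assuming extrinsic calibration yields, for each camera, a rotation R and a translation t mapping camera coordinates into the shared world frame, the conversion could be sketched as follows (illustrative, not mandated by the patent):

```python
import numpy as np

def to_world_frame(joints_cam, R, t):
    """Map camera-frame joint coordinates into the shared world frame:
    p_world = R @ p_cam + t, with R (3x3) and t (3,) taken from the
    camera's extrinsic calibration."""
    return {name: R @ p + t for name, p in joints_cam.items()}
```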
In practical applications, due to factors such as shooting angle or occlusion, some joint point information may be unobtainable from certain images of a given camera. In step 13, the joint point coordinate information from multiple cameras is comprehensively utilized, which compensates for the loss of individual joint points in the images of some cameras, improves the integrity and accuracy of the joint point information, and thus provides better data support for subsequent gait analysis (such as assessing a health state) based on the joint point information.
In another embodiment provided by the present invention, in step 12, the motion information of the joint points may be used to perform completion processing on missing joint points and/or correction processing on erroneous joint points, so as to provide better support for the subsequent integration in step 13 and obtain a better integrated result. Here, a missing joint point is a joint point that is not detected in an image; some joint points are usually invisible due to factors such as occlusion. An erroneous joint point is a joint point whose detected coordinates deviate too far from the real coordinates, usually because of instability in the joint point detection algorithm.
Specifically, in step 12, the embodiment of the present invention may respectively perform the following processing for each frame of image of the video captured by each camera:
A) detecting the joint point of the target object in the image, and obtaining joint point coordinate information in the image by combining corresponding depth information, wherein the joint point coordinate information comprises the spatial position of the detected joint point.
B) Predict the spatial position of each joint point at the image capturing time based on the motion information of each joint point detected in the historical images preceding the image.
Here, the motion estimation algorithm in the prior art may be used to predict the position coordinates of the joint point in the current image based on the historical coordinate information of the joint point, and the specific algorithm may refer to related implementations in the prior art, which is not specifically limited in the embodiment of the present invention.
C) According to the predicted spatial position, performing completion processing and/or correction processing on joint point coordinate information in the image;
wherein the completion processing is as follows: when the joint point coordinate information in the image does not contain the complete set of joint points of the target object, determine the missing joint points in the image and supplement their predicted spatial positions into the joint point coordinate information of the image. The correction processing is as follows: correct the joint point coordinate information in the image according to the difference between the predicted spatial position of each joint point and its detected spatial position; when the difference exceeds a preset threshold, the joint point was probably detected incorrectly, so its predicted spatial position can replace the detected spatial position, improving the accuracy of the joint point information. Here, the complete set of joint points means all joint points of interest, selected in advance according to the requirements of the application scenario.
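One possible realization of the completion and correction processing is sketched below, using a simple constant-velocity predictor as the motion-estimation step; the patent leaves the concrete prediction algorithm open, and the names and uniform-frame-spacing assumption are illustrative:

```python
import numpy as np

def predict_joint(history_positions):
    """Constant-velocity prediction from the last two historical positions,
    assuming uniform frame spacing."""
    p_prev, p_last = history_positions[-2], history_positions[-1]
    return p_last + (p_last - p_prev)

def complete_and_correct(detected, history, all_joint_names, threshold):
    """Completion: supplement joints missing from `detected` with their
    predicted spatial position. Correction: replace a detected joint with
    its prediction when the deviation exceeds `threshold`."""
    result = dict(detected)
    for name in all_joint_names:
        if len(history.get(name, [])) < 2:
            continue  # not enough history to predict this joint
        pred = predict_joint(history[name])
        if name not in result:                                 # completion
            result[name] = pred
        elif np.linalg.norm(result[name] - pred) > threshold:  # correction
            result[name] = pred
    return result
```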
Through the processing, the joint points acquired by each camera can be ensured to be complete, and the joint points which are possibly detected wrongly are corrected through the motion information, so that the correctness and the integrity of the joint point information acquired in the subsequent step 13 can be further improved.
Considering that the accuracy of the joint point position coordinates acquired from a given frame of a camera is closely related to the accuracy of the depth information of the target object available to that camera when the frame was shot, the embodiment of the invention further provides another method for acquiring joint point information. As shown in fig. 3, the method includes:
step 31, acquiring a video of a target object in an observation area shot by at least 2 cameras, wherein the at least 2 cameras are located at different positions of the observation area, and shooting the observation area through different shooting angles.
Here, in order to improve the accuracy and integrity of the joint point information, in the embodiment of the present invention, two or more cameras are used to capture images in an observation area, each camera may be fixed at a different position in the observation area, and the observation area is monitored by different capturing angles, so as to obtain videos of target objects captured by the multiple cameras respectively.
In the embodiment of the invention, each camera can acquire the depth image of the target object, and the joint point information of the target object can be respectively acquired based on the image of each camera. Here, the camera may be specifically a binocular camera.
And step 32, respectively obtaining the joint points of the target object in each frame of image according to the video shot by each camera and the corresponding depth information, obtaining the coordinate information of the joint points in the image, and obtaining the depth value precision information of the target object in the image, wherein the depth value precision information is used for representing the precision of the depth information of the target object obtained when the camera shoots the image.
Here, the embodiment of the present invention acquires the joint point information of the target object under each camera based on the video shot by that camera. For example, joint points (such as skeletal joint points) of people in an image can be acquired from the two-dimensional color image captured by a binocular camera using a joint point detection algorithm, and the joint point coordinate information (spatial position coordinates) of each joint point can be acquired based on the three-dimensional depth information. The selection of specific joint points can be preset according to the requirements of subsequent applications; for example, gait analysis usually requires extracting a number of joint points on a person (one application requires 17 joint points, including joints at the wrist, ankle, knee, shoulder, etc.).
In the embodiment of the invention, the depth value precision information of the target object in the image is also acquired while the joint point coordinate information in the image is acquired. Here, the depth value accuracy information is used to indicate the accuracy of the depth information of the target object acquired by the camera when capturing the image. Specifically, the depth value precision information may include at least one of the following information: the spatial distance between the target object and the camera; and a movement included angle between the movement direction of the target object and the optical axis of the camera.
For example, a conventional binocular camera obtains depth information with the highest accuracy when the target object is at a distance L0 from the camera. This distance L0 is referred to herein as the optimal precision distance, i.e., the camera-to-target spatial distance at which the accuracy of the acquired depth information is highest. The closer the spatial distance is to the optimal precision distance (for convenience of description, this closeness is referred to as the first proximity), the higher the precision, so the depth value precision information can be characterized by the spatial distance.
In addition, the angle between the moving direction of the target object and the optical axis direction of the camera (i.e., the motion included angle described herein) also affects the accuracy of depth information acquisition. Generally, the depth information obtained when the moving direction is perpendicular to the optical axis has higher precision; that is, the closeness of the motion included angle to 90 degrees (for convenience of description, referred to as the second proximity) is positively correlated with the depth value precision: the closer the angle is to 90 degrees, the higher the precision. Therefore, the depth value precision information can also be characterized by the motion included angle. For example, the embodiment of the present invention may pre-establish a correspondence (specifically implemented as a function or a table) between the depth value precision information and the motion included angle.
Of course, in the embodiment of the present invention, the spatial distance and the motion included angle may also be used together to represent the depth value precision information. In this case, the depth value precision is positively correlated with both the first proximity and the second proximity: when the motion included angles are the same, the depth value precision is positively correlated with the first proximity; when the spatial distances are the same, it is positively correlated with the second proximity. Similarly, following the above principle, the embodiment of the present invention may pre-establish a correspondence (implemented as a function or a table) between the depth value precision information and the motion included angle and spatial distance.
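Such a correspondence could be realized, for instance, by a function like the sketch below. The Gaussian falloff shapes and sigma values are assumptions introduced here for illustration; the patent only requires positive correlation with the first and second proximities.

```python
import numpy as np

def depth_precision(distance, angle_deg, best_distance,
                    sigma_d=1.0, sigma_a=30.0):
    """Illustrative precision score in (0, 1]: maximal when the target sits
    at the optimal precision distance and moves perpendicular (90 degrees)
    to the optical axis. Gaussian falloffs and sigma values are assumed."""
    d_term = np.exp(-((distance - best_distance) ** 2) / (2 * sigma_d ** 2))
    a_term = np.exp(-((angle_deg - 90.0) ** 2) / (2 * sigma_a ** 2))
    return d_term * a_term
```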
And step 33, determining a weight for the joint point coordinate information in the image according to the depth value precision information of the target object in the image, wherein the weight of the joint point coordinate information is positively correlated with the first proximity and/or the second proximity, the first proximity being the proximity of the spatial distance to the optimal precision distance of the camera, and the second proximity being the proximity of the motion included angle to 90 degrees.
Here, in step 33, the weight value of the joint coordinate information acquired in each frame image may be set according to the depth value precision information of the target object in the image. Wherein, the weight of the joint point coordinate information is in positive correlation with the first proximity and/or the second proximity, specifically:
1) When the depth value precision information is represented only by the spatial distance, the weight of the joint point coordinate information is positively correlated with the first proximity, i.e., the closer the spatial distance is to the optimal precision distance of the camera, the larger the weight.
2) When the depth value precision information is represented only by the motion included angle, the weight of the joint point coordinate information is positively correlated with the second proximity, i.e., the closer the motion included angle is to 90 degrees, the larger the weight.
3) When both the spatial distance and the motion included angle are used to represent the depth value precision information, the weight of the joint point coordinate information is positively correlated with both the first proximity and the second proximity, i.e., when the motion included angles are the same, the weight is positively correlated with the first proximity; when the spatial distances are the same, the weight is positively correlated with the second proximity.
According to the principle, the embodiment of the present invention may also establish a corresponding relationship (specifically, the corresponding relationship may be implemented by a function or a table) between the weight and the motion included angle and the spatial distance, so that the weight of the joint coordinate information in each frame of image may be determined by the function or the table.
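Continuing the illustrative assumption above, a precision score of this kind can be turned into per-camera weights by any monotonically increasing mapping, for example plain normalization:

```python
def camera_weights(precision_scores):
    """Turn per-camera precision scores into weights for one joint. Any
    monotonically increasing mapping preserves the required positive
    correlation; plain normalization is used here for simplicity."""
    total = sum(precision_scores)
    return [score / total for score in precision_scores]
```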
And step 34, integrating the joint point coordinate information in the images shot by different cameras at the same time according to the weight of each joint point coordinate information to obtain the integrated joint point coordinate information, wherein the weight of the joint point coordinate information in the images is positively correlated with the precision represented by the depth value precision information of the target object in the images.
Here, in the integrated processing, the embodiment of the present invention may process the same joint point in the images captured by different cameras at the same time in any one of the following manners:
1) According to the weight of the joint point coordinate information, weight and sum the joint point coordinate information of the joint point shot by different cameras to obtain the integrated joint point coordinate information of the joint point.
The method comprehensively considers the joint point positions in each frame image, so that the accuracy of the joint point position information is relatively high.
2) Select the joint point coordinate information with the highest weight from the joint point coordinate information of the joint point shot by different cameras as the integrated joint point coordinate information.
In the method, the joint point coordinate with the highest weight is used as the final coordinate of the joint point, and the joint point coordinate with the higher weight has higher reliability, so that the method can also improve the accuracy of the position information of the joint point.
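Both integration ways can be sketched briefly, assuming the coordinates of one joint have already been collected across cameras together with their weights (illustrative names):

```python
import numpy as np

def integrate_weighted_sum(coords, weights):
    """Way 1): weighted sum of one joint's coordinates across cameras."""
    w = np.asarray(weights, dtype=float)
    w /= w.sum()
    return np.sum([wi * np.asarray(c) for wi, c in zip(w, coords)], axis=0)

def integrate_highest_weight(coords, weights):
    """Way 2): keep the coordinate whose weight is highest."""
    return coords[int(np.argmax(weights))]
```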
In addition, in the embodiment of the present invention, since the installation position and the shooting angle of each camera are preset, before integrating the joint coordinate information under each camera, it is necessary to convert all the joint coordinate information into the same coordinate system and then perform the integration process.
Through the above steps, when integrating the position coordinates of the same joint point across multiple frames of images, the embodiment of the invention assigns each joint point position coordinate a weight positively correlated with the precision of the target object's depth information in the corresponding frame image and performs the integration according to these weights, thereby further improving the accuracy of the joint point position coordinates.
In addition, similarly, in step 32, the embodiment of the present invention may also follow the processing manner of step 12 and perform the completion processing and/or correction processing on each frame of image of the video shot by each camera, so as to improve the integrity and accuracy of the joint point information, which is not repeated here for brevity.
After the completion and/or correction processing in step 32, the supplemented coordinates of a missing joint point, as well as the replaced coordinates of a corrected joint point, are no longer original joint point coordinate information detected from the image. The confidence of such supplemented or corrected joint point coordinate information is typically lower than that of original coordinate information. Based on this, when setting the weight of the joint point coordinate information, the embodiment of the present invention may further introduce a joint point coordinate information attribute. This attribute indicates whether a joint point position coordinate is original coordinate information detected from an image, and specifically takes one of two values: original joint point coordinate information detected from the image; or non-original joint point coordinate information obtained through completion or correction processing.
As an alternative implementation of step 33, in this embodiment of the present invention, the weight of the joint point coordinate information in the image may be determined according to the depth value precision information of the target object in the image and the joint point coordinate information attribute, wherein the weight of the joint point coordinate information is positively correlated with the first proximity and/or the second proximity; the weight of the joint point coordinate information is positively correlated with the value of the joint point coordinate information attribute; the first proximity is the proximity of the spatial distance to the optimal precision distance of the camera, and the second proximity is the proximity of the motion included angle to 90 degrees; and the value of the joint point coordinate information attribute for original joint point coordinate information is higher than that for non-original joint point coordinate information.
Specifically, when the weight is determined from the spatial distance and the joint point coordinate information attribute: if the spatial distances are the same, the weight is higher when the attribute is original joint point coordinate information; if the attribute values are the same, the weight is positively correlated with the first proximity.
When the weight is determined from the motion included angle and the joint point coordinate information attribute: if the motion included angles are the same, the weight is higher when the attribute is original joint point coordinate information; if the attribute values are the same, the weight is positively correlated with the second proximity.
When the weight is determined from all 3 parameters (the spatial distance, the motion included angle, and the joint point coordinate information attribute): if the spatial distance and the motion included angle are the same, the weight is higher when the attribute is original joint point coordinate information; if the motion included angle and the attribute are the same, the weight is positively correlated with the first proximity; and if the spatial distance and the attribute are the same, the weight is positively correlated with the second proximity.
Similarly, according to the above principle, the embodiment of the present invention may establish a corresponding relationship (specifically, the corresponding relationship may be implemented by a function or a table) between the weight and the motion angle, the spatial distance, and the attribute of the joint point coordinate information, so that the weight of the joint point coordinate information in each frame of image may be determined by the function or the table.
In order to simplify the processing, a possible implementation manner is that, for a joint point whose joint point coordinate information attribute is non-original joint point coordinate information, the weight of the joint point coordinate information of the joint point may be directly set to 0, that is, the joint point may be ignored during subsequent comprehensive processing.
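One hypothetical way to combine the precision-based weight with the attribute, including the simplified weight-0 variant, is sketched below; the factor values are assumptions, not from the patent:

```python
def attribute_weight(precision_weight, is_original,
                     original_factor=1.0, non_original_factor=0.5):
    """Scale a precision-based weight by the joint point coordinate
    information attribute: original (detected) coordinates weigh more
    than completed/corrected ones. Factor values are illustrative;
    setting non_original_factor to 0 gives the simplified variant that
    ignores completed/corrected joints during integration."""
    return precision_weight * (original_factor if is_original
                               else non_original_factor)
```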
Based on the above method, an embodiment of the present invention further provides a device implementing it. Referring to fig. 5, an embodiment of the present invention provides an apparatus 50 for acquiring joint point information, including:
a video acquiring unit 51, configured to acquire a video of a target object in an observation area captured by at least 2 cameras, where the at least 2 cameras are located at different positions of the observation area, and capture the observation area through different capturing angles;
a joint point detecting unit 52, configured to obtain joint points of the target object in each frame of image according to the video shot by each camera and the corresponding depth information, to obtain coordinate information of the joint points in the image;
and the information integration unit 53 is configured to integrate the joint point coordinate information in the images captured by different cameras at the same time to obtain the integrated joint point coordinate information.
Preferably, the joint point detecting unit 52 is further configured to perform the following processing for each frame of image of the video captured by each camera:
detecting the joint point of the target object in the image, and obtaining joint point coordinate information in the image by combining corresponding depth information, wherein the joint point coordinate information comprises the spatial position of the detected joint point;
predicting a predicted spatial position of each joint at the image capturing time based on motion information of each joint detected in a history image before the image;
according to the predicted spatial position, performing completion processing and/or correction processing on joint point coordinate information in the image;
wherein the completion processing is as follows: when the joint point coordinate information in the image does not contain the complete joint point of the target object, determining a missing joint point in the image, and supplementing the predicted spatial position of the missing joint point into the joint point coordinate information in the image;
the correction processing is as follows: and correcting the coordinate information of the joint points in the image according to the difference between the predicted spatial position of each joint point and the spatial position of the corresponding joint point detected in the current frame image, wherein when the difference exceeds a preset threshold, the predicted spatial position of the joint point is used for replacing the spatial position of the detected joint point.
Preferably, the information integrating unit 53 is further configured to accumulate joint point coordinate information of the same joint point in images captured by different cameras at the same time and calculate an average value to obtain integrated joint point coordinate information.
The above apparatus 50 for acquiring joint point information comprehensively utilizes the joint point coordinate information from multiple cameras, so that the loss of individual joint points in the images of some cameras can be compensated, improving the integrity and accuracy of the joint point information and thereby providing better data support for subsequent gait analysis based on the joint point information.
Referring to fig. 6, an apparatus 60 for acquiring joint point information according to an embodiment of the present invention includes:
a video acquiring unit 61, configured to acquire a video of a target object in an observation area captured by at least 2 cameras, where the at least 2 cameras are located at different positions of the observation area, and capture the observation area through different capturing angles;
a joint point detecting unit 62, configured to obtain joint points of the target object in each frame of image according to the video captured by each camera and corresponding depth information, to obtain coordinate information of the joint points in the image, and obtain depth value precision information of the target object in the image, where the depth value precision information is used to indicate precision of the depth information of the target object obtained when the camera captures the image;
a first weight determining unit 63, configured to determine the weight of joint point coordinate information in an image according to the depth value precision information of the target object in the image, where the weight of the joint point coordinate information is positively correlated with a first proximity and/or a second proximity, the first proximity being the proximity of the spatial distance to the optimal precision distance of the camera, and the second proximity being the proximity of the motion included angle to 90 degrees;
and the information integration unit 64 is configured to perform weighted summation on the joint coordinate information in the images shot by different cameras at the same time according to the weight of each joint coordinate information, so as to obtain integrated joint coordinate information, where the weight of the joint coordinate information in the image is positively correlated with the precision represented by the depth value precision information of the target object in the image.
Here, the depth value precision information includes at least one of the following information: the spatial distance between the target object and the camera; and a movement included angle between the movement direction of the target object and the optical axis of the camera.
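A weight that is positively correlated with both proximities can be realised in many ways; one possibility is a product of two decaying terms, as in the sketch below. The Gaussian form, the optimal precision distance of 2 m and the decay scales are assumptions chosen for illustration; the embodiments only require that the weight grow as the spatial distance approaches the optimal precision distance and as the motion angle approaches 90 degrees:

```python
import math

OPTIMAL_DISTANCE_M = 2.0  # assumed optimal precision distance of the depth camera
DIST_SCALE_M = 1.0        # assumed decay scale of the first proximity
ANGLE_SCALE_DEG = 45.0    # assumed decay scale of the second proximity

def coordinate_weight(spatial_distance_m, motion_angle_deg):
    # first proximity: closeness of the target-camera distance to the optimal
    # precision distance of the camera
    first = math.exp(-((spatial_distance_m - OPTIMAL_DISTANCE_M) / DIST_SCALE_M) ** 2)
    # second proximity: closeness of the angle between the movement direction
    # and the camera's optical axis to 90 degrees
    second = math.exp(-((motion_angle_deg - 90.0) / ANGLE_SCALE_DEG) ** 2)
    return first * second  # positively correlated with both proximities
```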
Preferably, the joint point detecting unit 62 is further configured to perform the following processing for each frame of image of the video captured by each camera:
detecting the joint points of the target object in the image, and obtaining joint point coordinate information in the image by combining the corresponding depth information, wherein the joint point coordinate information comprises the spatial positions of the detected joint points;
predicting the spatial position of each joint point at the image capture time based on motion information of each joint point detected in the historical images preceding the image;
performing completion processing and/or correction processing on the joint point coordinate information in the image according to the predicted spatial positions;
wherein the completion processing is as follows: when the joint point coordinate information in the image does not contain the complete set of joint points of the target object, determining the missing joint points in the image, and supplementing the predicted spatial positions of the missing joint points into the joint point coordinate information in the image;
the correction processing is as follows: correcting the joint point coordinate information in the image according to the difference between the predicted spatial position of each joint point and the spatial position of the corresponding joint point detected in the current frame image, wherein when the difference exceeds a preset threshold, the predicted spatial position of the joint point replaces the spatial position of the detected joint point.
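The prediction step is not tied to a particular motion model. As one minimal sketch, a constant-velocity extrapolation over the two most recent historical frames can supply the predicted spatial positions; the history layout and the unit frame interval are assumptions made for illustration:

```python
import numpy as np

def predict_positions(history, dt=1.0):
    """history: time-ordered list of (joint_id -> np.array([x, y, z])) maps
    detected in the historical images; requires at least two entries.
    Returns the position of each joint point extrapolated one frame
    interval dt ahead under a constant-velocity assumption."""
    prev, last = history[-2], history[-1]
    predicted = {}
    for jid, pos in last.items():
        velocity = (pos - prev[jid]) / dt if jid in prev else np.zeros(3)
        predicted[jid] = pos + velocity * dt
    return predicted
```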
Preferably, the information integrating unit 64 is further configured to, for the same joint point in the images captured by different cameras at the same time:
according to the weight of the joint point coordinate information, weighting and summing the joint point coordinate information of the joint point shot by different cameras to obtain the integrated joint point coordinate information of the joint point; or,
selecting the joint point coordinate information with the highest weight from the joint point coordinate information of the joint point shot by different cameras as the integrated joint point coordinate information.
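Both integration alternatives reduce to a few lines, as sketched below; the `(weight, position)` pair layout and the normalisation of the weighted sum are illustrative assumptions:

```python
import numpy as np

def integrate_joint(entries, mode="weighted_sum"):
    """entries: list of (weight, position) pairs for one joint point, one
    entry per camera that observed it at the given moment."""
    if mode == "weighted_sum":
        weights = np.array([w for w, _ in entries])
        positions = np.array([p for _, p in entries])
        # normalised weighted summation of the per-camera coordinates
        return (weights[:, None] * positions).sum(axis=0) / weights.sum()
    # otherwise keep the coordinate information carrying the highest weight
    return max(entries, key=lambda e: e[0])[1]
```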
Referring to fig. 7, an apparatus 70 for acquiring joint point information according to an embodiment of the present invention includes:
a video acquisition unit 71, configured to acquire a video of a target object in an observation area captured by at least 2 cameras, where the at least 2 cameras are located at different positions of the observation area, and capture the observation area through different capturing angles;
a joint point detecting unit 72, configured to obtain joint points of the target object in each frame of image according to the video captured by each camera and corresponding depth information, to obtain coordinate information of the joint points in the image, and obtain depth value precision information of the target object in the image, where the depth value precision information is used to indicate precision of the depth information of the target object obtained when the camera captures the image;
a second weight determining unit 73, configured to determine the weight of the joint point coordinate information in the image according to the depth value precision information of the target object in the image and the joint point coordinate information attribute, where the joint point coordinate information attribute comprises: original joint point coordinate information, obtained by detection from the image; and non-original joint point coordinate information, obtained through completion processing or correction processing; wherein the weight of the joint point coordinate information is positively correlated with the first proximity and/or the second proximity; the weight of the joint point coordinate information is positively correlated with the value of the joint point coordinate information attribute; the first proximity is the proximity of the spatial distance to the optimal precision distance of the camera, and the second proximity is the proximity of the motion angle to 90 degrees; and the value of the joint point coordinate information attribute for original joint point coordinate information is higher than that for non-original joint point coordinate information.
And the information integration unit 74, configured to perform weighted summation on the joint point coordinate information in the images shot by different cameras at the same time according to the weight of each piece of joint point coordinate information, so as to obtain the integrated joint point coordinate information, where the weight of the joint point coordinate information in an image is positively correlated with the precision represented by the depth value precision information of the target object in that image.
Here, the depth value precision information includes at least one of the following information: the spatial distance between the target object and the camera; and a movement included angle between the movement direction of the target object and the optical axis of the camera.
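The joint point coordinate information attribute described for the second weight determining unit 73 can be folded into the depth-precision weight as a multiplicative factor; the factor values 1.0 and 0.5 below are assumptions that merely satisfy the stated ordering (original joint point coordinate information weighted above non-original):

```python
ATTRIBUTE_VALUE = {
    "original": 1.0,      # coordinate information detected directly from the image
    "non_original": 0.5,  # coordinate information obtained by completion or correction
}

def attribute_adjusted_weight(base_weight, attribute):
    # base_weight: the depth-precision weight from the earlier sketch; the
    # result remains positively correlated with the attribute's value
    return base_weight * ATTRIBUTE_VALUE[attribute]
```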
Preferably, the joint point detecting unit 72 is further configured to perform the following processing for each frame of image of the video captured by each camera:
detecting the joint points of the target object in the image, and obtaining joint point coordinate information in the image by combining the corresponding depth information, wherein the joint point coordinate information comprises the spatial positions of the detected joint points;
predicting the spatial position of each joint point at the image capture time based on motion information of each joint point detected in the historical images preceding the image;
performing completion processing and/or correction processing on the joint point coordinate information in the image according to the predicted spatial positions;
wherein the completion processing is as follows: when the joint point coordinate information in the image does not contain the complete set of joint points of the target object, determining the missing joint points in the image, and supplementing the predicted spatial positions of the missing joint points into the joint point coordinate information in the image;
the correction processing is as follows: correcting the joint point coordinate information in the image according to the difference between the predicted spatial position of each joint point and the spatial position of the corresponding joint point detected in the current frame image, wherein when the difference exceeds a preset threshold, the predicted spatial position of the joint point replaces the spatial position of the detected joint point.
Preferably, the information integrating unit 74 is further configured to, for the same joint point in the images captured by different cameras at the same time:
according to the weight of the joint point coordinate information, weighting and summing the joint point coordinate information of the joint point shot by different cameras to obtain the integrated joint point coordinate information of the joint point; or,
selecting the joint point coordinate information with the highest weight from the joint point coordinate information of the joint point shot by different cameras as the integrated joint point coordinate information.
Referring to fig. 8, another structure of an apparatus 800 for acquiring joint point information according to an embodiment of the present invention includes: a processor 801, a network interface 802, a memory 803, a user interface 804, and a bus interface, wherein:
in this embodiment of the present invention, the obtaining apparatus 800 further includes: a computer program stored on the memory 803 and executable on the processor 801, the computer program when executed by the processor 801 implementing the steps of:
acquiring videos of a target object in an observation area shot by at least 2 cameras, wherein the at least 2 cameras are located at different positions of the observation area, and shooting the observation area through different shooting angles;
respectively acquiring the joint points of the target object in each frame of image according to the video shot by each camera and the corresponding depth information to obtain the coordinate information of the joint points in the image;
and integrating the joint point coordinate information in the images shot by different cameras at the same time to obtain the integrated joint point coordinate information.
In FIG. 8, the bus architecture may include any number of interconnected buses and bridges, linking together one or more processors, represented by the processor 801, and various circuits, represented by the memory 803. The bus architecture may also link together various other circuits, such as peripherals, voltage regulators and power management circuits, which are well known in the art and therefore are not described further herein. The bus interface provides the interface between these components. The network interface 802 may be a wired or wireless network card device that implements data transceiving functions over a network. For different user devices, the user interface 804 may also be an interface capable of connecting the desired devices, including but not limited to a keypad, display, speaker, microphone, joystick, etc.
The processor 801 is responsible for managing the bus architecture and general processing, and the memory 803 may store data used by the processor 801 in performing operations.
Optionally, the computer program when executed by the processor 801 may also implement the following steps:
for each frame of image of the video shot by each camera, the following processing is respectively executed:
detecting the joint points of the target object in the image, and obtaining joint point coordinate information in the image by combining the corresponding depth information, wherein the joint point coordinate information comprises the spatial positions of the detected joint points;
predicting the spatial position of each joint point at the image capture time based on motion information of each joint point detected in the historical images preceding the image;
performing completion processing and/or correction processing on the joint point coordinate information in the image according to the predicted spatial positions;
wherein the completion processing is as follows: when the joint point coordinate information in the image does not contain the complete set of joint points of the target object, determining the missing joint points in the image, and supplementing the predicted spatial positions of the missing joint points into the joint point coordinate information in the image;
the correction processing is as follows: correcting the joint point coordinate information in the image according to the difference between the predicted spatial position of each joint point and the spatial position of the corresponding joint point detected in the current frame image, wherein when the difference exceeds a preset threshold, the predicted spatial position of the joint point replaces the spatial position of the detected joint point.
Optionally, the computer program when executed by the processor 801 may also implement the following steps:
and accumulating the joint point coordinate information of the same joint point in the images shot by different cameras at the same moment and calculating the average value to obtain the integrated joint point coordinate information.
Optionally, the computer program when executed by the processor 801 may also implement the following steps:
when the joint point of the target object in each frame of image is obtained, further obtaining depth value precision information of the target object in the image, wherein the depth value precision information is used for representing the precision of the depth information of the target object obtained when the camera shoots the image;
the step of integrating the joint point coordinate information in the images shot by different cameras at the same time comprises the following steps:
according to the weight value of each piece of joint point coordinate information, carrying out weighted summation on the joint point coordinate information in the images shot by different cameras at the same time to obtain the integrated joint point coordinate information, wherein the weight value of the joint point coordinate information in the images is positively correlated with the precision represented by the depth value precision information of the target object in the images.
Optionally, the depth value precision information includes at least one of the following information: the spatial distance between the target object and the camera; a movement included angle between the movement direction of the target object and the optical axis of the camera; the computer program, when executed by the processor 801, may further implement the steps of:
determining a weight of the joint point coordinate information in the image according to the depth value precision information of the target object in the image, wherein the weight of the joint point coordinate information is positively correlated with a first proximity degree and/or a second proximity degree, the first proximity degree being the proximity of the spatial distance to the optimal precision distance of the camera, and the second proximity degree being the proximity of the motion included angle to 90 degrees.
Optionally, the depth value precision information includes at least one of the following information: the spatial distance between the target object and the camera; a movement included angle between the movement direction of the target object and the optical axis of the camera; the computer program, when executed by the processor 801, may further implement the steps of:
determining the weight of the joint point coordinate information in the image according to the depth value precision information of the target object in the image and the joint point coordinate information attribute, wherein the joint point coordinate information attribute comprises: original joint point coordinate information, obtained by detection from the image; and non-original joint point coordinate information, obtained through completion processing or correction processing; wherein the weight of the joint point coordinate information is positively correlated with the first proximity degree and/or the second proximity degree; the weight of the joint point coordinate information is positively correlated with the value of the joint point coordinate information attribute; the first proximity degree is the proximity of the spatial distance to the optimal precision distance of the camera, and the second proximity degree is the proximity of the motion included angle to 90 degrees; and the value of the joint point coordinate information attribute for original joint point coordinate information is higher than that for non-original joint point coordinate information.
Optionally, the computer program when executed by the processor 801 may further implement the following steps:
aiming at the same joint point in images shot by different cameras at the same moment:
according to the weight of the joint point coordinate information, weighting and summing the joint point coordinate information of the joint point shot by different cameras to obtain the integrated joint point coordinate information of the joint point; or,
selecting the joint point coordinate information with the highest weight from the joint point coordinate information of the joint point shot by different cameras as the integrated joint point coordinate information.
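Tying the steps above together, one per-frame loop over the assumed helper functions from the earlier sketches might look as follows; it is a sketch of the overall flow under those same assumptions, not of the claimed implementation:

```python
def process_frame(per_camera_detections, per_camera_history, per_camera_meta):
    """per_camera_detections: camera_id -> (joint_id -> position) at the
    current moment; per_camera_history: camera_id -> list of past frames;
    per_camera_meta: camera_id -> (spatial_distance_m, motion_angle_deg)."""
    per_joint_entries = {}
    for cam, detected in per_camera_detections.items():
        predicted = predict_positions(per_camera_history[cam])
        coords = complete_and_correct(detected, predicted)
        base = coordinate_weight(*per_camera_meta[cam])
        for jid, pos in coords.items():
            # simplified attribute assignment: joints absent from the raw
            # detections are non-original (completed); corrections are not
            # tracked separately in this sketch
            attr = "original" if jid in detected else "non_original"
            per_joint_entries.setdefault(jid, []).append(
                (attribute_adjusted_weight(base, attr), pos))
    # integrate every joint point over all cameras that observed it
    return {jid: integrate_joint(entries)
            for jid, entries in per_joint_entries.items()}
```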
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (14)

1. A method for acquiring joint point information is characterized by comprising the following steps:
acquiring videos of a target object in an observation area shot by at least 2 cameras, wherein the at least 2 cameras are located at different positions of the observation area, and shooting the observation area through different shooting angles;
respectively acquiring the joint points of the target object in each frame of image according to the video shot by each camera and the corresponding depth information to obtain the coordinate information of the joint points in the image;
and integrating the joint point coordinate information in the images shot by different cameras at the same time to obtain the integrated joint point coordinate information.
2. The method according to claim 1, wherein the step of obtaining the joint point of the target object in each frame of image according to the video shot by each camera and the corresponding depth information to obtain the coordinate information of the joint point in the image comprises:
for each frame of image of the video shot by each camera, the following processing is respectively executed:
detecting the joint points of the target object in the current frame image, and obtaining joint point coordinate information in the image by combining the corresponding depth information, wherein the joint point coordinate information comprises the spatial positions of the detected joint points;
predicting the spatial position of each joint point at the image capture time based on motion information of each joint point detected in the historical images preceding the image;
performing completion processing and/or correction processing on the joint point coordinate information in the image according to the predicted spatial positions;
wherein the completion processing is as follows: when the joint point coordinate information in the image does not contain the complete set of joint points of the target object, determining the missing joint points in the image, and supplementing the predicted spatial positions of the missing joint points into the joint point coordinate information in the image;
the correction processing is as follows: correcting the joint point coordinate information in the image according to the difference between the predicted spatial position of each joint point and the spatial position of the corresponding joint point detected in the current frame image, wherein when the difference exceeds a preset threshold, the predicted spatial position of the joint point replaces the spatial position of the detected joint point.
3. The acquisition method according to claim 1 or 2, wherein the step of integrating the joint coordinate information in the images taken by different cameras at the same time comprises:
and accumulating the joint point coordinate information of the same joint point in the images shot by different cameras at the same moment and calculating the average value to obtain the integrated joint point coordinate information.
4. The acquisition method according to claim 2,
when the joint point of the target object in each frame of image is obtained, further obtaining depth value precision information of the target object in the image, wherein the depth value precision information is used for representing the precision of the depth information of the target object obtained when the camera shoots the image;
the step of integrating the joint point coordinate information in the images shot by different cameras at the same time comprises the following steps:
according to the weight value of each piece of joint point coordinate information, carrying out weighted summation on the joint point coordinate information in the images shot by different cameras at the same time to obtain the integrated joint point coordinate information, wherein the weight value of the joint point coordinate information in the images is positively correlated with the precision represented by the depth value precision information of the target object in the images.
5. The acquisition method according to claim 4,
the depth value precision information includes at least one of: the spatial distance between the target object and the camera; a movement included angle between the movement direction of the target object and the optical axis of the camera;
before weighted summation of joint coordinate information in images taken by different cameras at the same time, the method further comprises:
determining a weight of the joint point coordinate information in the image according to the depth value precision information of the target object in the image, wherein the weight of the joint point coordinate information is positively correlated with a first proximity degree and/or a second proximity degree, the first proximity degree being the proximity of the spatial distance to the optimal precision distance of the camera, and the second proximity degree being the proximity of the motion included angle to 90 degrees.
6. The acquisition method according to claim 4,
the depth value precision information includes at least one of: the spatial distance between the target object and the camera; a movement included angle between the movement direction of the target object and the optical axis of the camera;
before weighted summation of joint coordinate information in images taken by different cameras at the same time, the method further comprises:
determining the weight of the joint point coordinate information in the image according to the depth value precision information of the target object in the image and the joint point coordinate information attribute, wherein the joint point coordinate information attribute comprises: original joint point coordinate information, obtained by detection from the image; and non-original joint point coordinate information, obtained through completion processing or correction processing; wherein the weight of the joint point coordinate information is positively correlated with the first proximity degree and/or the second proximity degree; the weight of the joint point coordinate information is positively correlated with the value of the joint point coordinate information attribute; the first proximity degree is the proximity of the spatial distance to the optimal precision distance of the camera, and the second proximity degree is the proximity of the motion included angle to 90 degrees; and the value of the joint point coordinate information attribute for original joint point coordinate information is higher than that for non-original joint point coordinate information.
7. The method according to any one of claims 4 to 6, wherein the step of integrating the coordinate information of the joint points in the images captured by different cameras at the same time according to the weight of the coordinate information of each joint point comprises:
aiming at the same joint point in images shot by different cameras at the same moment:
according to the weight of the joint point coordinate information, weighting and summing the joint point coordinate information of the joint point shot by different cameras to obtain the integrated joint point coordinate information of the joint point; or,
selecting the joint point coordinate information with the highest weight from the joint point coordinate information of the joint point shot by different cameras as the integrated joint point coordinate information.
8. An apparatus for acquiring information of a joint, comprising:
the video acquisition unit is used for acquiring videos of a target object in an observation area shot by at least 2 cameras, wherein the at least 2 cameras are positioned at different positions of the observation area and shoot the observation area through different shooting angles;
the joint point detection unit is used for respectively acquiring joint points of the target object in each frame of image according to the video shot by each camera and the corresponding depth information to obtain joint point coordinate information in the image;
and the information integration unit is used for integrating the joint point coordinate information in the images shot by different cameras at the same time to obtain the integrated joint point coordinate information.
9. The acquisition apparatus according to claim 8,
the joint point detection unit is further configured to perform the following processing for each frame of image of the video captured by each camera:
detecting the joint points of the target object in the current frame image, and obtaining joint point coordinate information in the image by combining the corresponding depth information, wherein the joint point coordinate information comprises the spatial positions of the detected joint points;
predicting the spatial position of each joint point at the image capture time based on motion information of each joint point detected in the historical images preceding the image;
performing completion processing and/or correction processing on the joint point coordinate information in the image according to the predicted spatial positions;
wherein the completion processing is as follows: when the joint point coordinate information in the image does not contain the complete set of joint points of the target object, determining the missing joint points in the image, and supplementing the predicted spatial positions of the missing joint points into the joint point coordinate information in the image;
the correction processing is as follows: correcting the joint point coordinate information in the image according to the difference between the predicted spatial position of each joint point and the spatial position of the corresponding joint point detected in the current frame image, wherein when the difference exceeds a preset threshold, the predicted spatial position of the joint point replaces the spatial position of the detected joint point.
10. The acquisition device according to claim 8 or 9,
the information integration unit is also used for accumulating the joint point coordinate information of the same joint point in the images shot by different cameras at the same time and calculating the average value to obtain the integrated joint point coordinate information.
11. The acquisition device as set forth in claim 9,
the joint point detection unit is further configured to, when obtaining a joint point of the target object in each frame of image, further obtain depth value precision information of the target object in the image, where the depth value precision information is used to indicate precision of depth information of the target object obtained by the camera when shooting the image;
the information integration unit is further configured to perform weighted summation on the joint point coordinate information in the images shot by different cameras at the same time according to the weight of each piece of joint point coordinate information to obtain the integrated joint point coordinate information, where the weight of the joint point coordinate information in the image is positively correlated with the precision represented by the depth value precision information of the target object in the image.
12. The acquisition device as set forth in claim 11,
the depth value precision information includes at least one of: the spatial distance between the target object and the camera; a movement included angle between the movement direction of the target object and the optical axis of the camera;
the acquisition device further includes:
the first weight determining unit is used for determining the weight of the joint point coordinate information in the image according to the depth value precision information of the target object in the image, wherein the weight of the joint point coordinate information is positively correlated with a first proximity degree and/or a second proximity degree, the first proximity degree being the proximity of the spatial distance to the optimal precision distance of the camera, and the second proximity degree being the proximity of the motion included angle to 90 degrees.
13. The acquisition device as set forth in claim 11,
the depth value precision information includes at least one of: the spatial distance between the target object and the camera; a movement included angle between the movement direction of the target object and the optical axis of the camera;
the acquisition device further includes:
a second weight determining unit, configured to determine the weight of the joint point coordinate information in the image according to the depth value precision information of the target object in the image and the joint point coordinate information attribute, wherein the joint point coordinate information attribute comprises: original joint point coordinate information, obtained by detection from the image; and non-original joint point coordinate information, obtained through completion processing or correction processing; wherein the weight of the joint point coordinate information is positively correlated with the first proximity degree and/or the second proximity degree; the weight of the joint point coordinate information is positively correlated with the value of the joint point coordinate information attribute; the first proximity degree is the proximity of the spatial distance to the optimal precision distance of the camera, and the second proximity degree is the proximity of the motion included angle to 90 degrees; and the value of the joint point coordinate information attribute for original joint point coordinate information is higher than that for non-original joint point coordinate information.
14. The acquisition device according to any one of claims 11 to 13,
the information integration unit is also used for aiming at the same joint point in the images shot by different cameras at the same moment:
according to the weight of the joint point coordinate information, weighting and summing the joint point coordinate information of the joint point shot by different cameras to obtain the integrated joint point coordinate information of the joint point; or,
selecting the joint point coordinate information with the highest weight from the joint point coordinate information of the joint point shot by different cameras as the integrated joint point coordinate information.
CN201910031696.7A 2019-01-14 2019-01-14 Method and device for acquiring joint point information Active CN111435535B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910031696.7A CN111435535B (en) 2019-01-14 2019-01-14 Method and device for acquiring joint point information

Publications (2)

Publication Number Publication Date
CN111435535A true CN111435535A (en) 2020-07-21
CN111435535B CN111435535B (en) 2024-03-08

Family

ID=71579879

Country Status (1)

Country Link
CN (1) CN111435535B (en)


Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0330273A1 (en) * 1988-02-26 1989-08-30 Philips Norden AB A method and a device in connection with radar
JPH11266459A (en) * 1998-03-17 1999-09-28 Toshiba Corp Moving image coder, moving object detector and motion prediction device employed for the devices
JP2006244181A (en) * 2005-03-03 2006-09-14 Nippon Hoso Kyokai <Nhk> Device and program for correcting online coordinate value
JP2012242121A (en) * 2011-05-16 2012-12-10 Mitsubishi Electric Corp Tracking apparatus
CN104966284A (en) * 2015-05-29 2015-10-07 北京旷视科技有限公司 Method and equipment for acquiring object dimension information based on depth data
WO2017000115A1 (en) * 2015-06-29 2017-01-05 北京旷视科技有限公司 Person re-identification method and device
WO2017007166A1 (en) * 2015-07-08 2017-01-12 고려대학교 산학협력단 Projected image generation method and device, and method for mapping image pixels and depth values
KR20180032400A (en) * 2016-09-22 2018-03-30 한국전자통신연구원 multiple object tracking apparatus based Object information of multiple camera and method therefor
CN106846403A (en) * 2017-01-04 2017-06-13 北京未动科技有限公司 The method of hand positioning, device and smart machine in a kind of three dimensions
US20180316907A1 (en) * 2017-04-28 2018-11-01 Panasonic Intellectual Property Management Co., Ltd. Image capturing apparatus, image processing method, and recording medium
CN108230437A (en) * 2017-12-15 2018-06-29 深圳市商汤科技有限公司 Scene reconstruction method and device, electronic equipment, program and medium
CN108460338A (en) * 2018-02-02 2018-08-28 北京市商汤科技开发有限公司 Estimation method of human posture and device, electronic equipment, storage medium, program
CN108898624A (en) * 2018-06-12 2018-11-27 浙江大华技术股份有限公司 A kind of method, apparatus of moving body track, electronic equipment and storage medium
CN109176512A (en) * 2018-08-31 2019-01-11 南昌与德通讯技术有限公司 A kind of method, robot and the control device of motion sensing control robot

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHUANG JUN: "Multi-criterion stereo gesture pose detection based on multi-view vision", Wanfang Dissertations, pages 20-21 *
CHAO NING ET AL.: "Data processing algorithm using spline curve approximation in dim and small target tracking", Infrared and Laser Engineering, vol. 39, no. 3, pages 555-560 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113012126A (en) * 2021-03-17 2021-06-22 武汉联影智融医疗科技有限公司 Mark point reconstruction method and device, computer equipment and storage medium
CN113012126B (en) * 2021-03-17 2024-03-22 武汉联影智融医疗科技有限公司 Method, device, computer equipment and storage medium for reconstructing marking point

Also Published As

Publication number Publication date
CN111435535B (en) 2024-03-08

Similar Documents

Publication Publication Date Title
US11069144B2 (en) Systems and methods for augmented reality body movement guidance and measurement
Dikovski et al. Evaluation of different feature sets for gait recognition using skeletal data from Kinect
CN111881887A (en) Multi-camera-based motion attitude monitoring and guiding method and device
Ye et al. A depth camera motion analysis framework for tele-rehabilitation: Motion capture and person-centric kinematics analysis
US10445930B1 (en) Markerless motion capture using machine learning and training with biomechanical data
KR101118654B1 (en) rehabilitation device using motion analysis based on motion capture and method thereof
CN113658211B (en) User gesture evaluation method and device and processing equipment
CN110544302A (en) Human body action reconstruction system and method based on multi-view vision and action training system
CN111160088A (en) VR (virtual reality) somatosensory data detection method and device, computer equipment and storage medium
CN111401340B (en) Method and device for detecting motion of target object
Yang et al. Multiple marker tracking in a single-camera system for gait analysis
CN112568898A (en) Method, device and equipment for automatically evaluating injury risk and correcting motion of human body motion based on visual image
CN110910449B (en) Method and system for identifying three-dimensional position of object
WO2020261404A1 (en) Person state detecting device, person state detecting method, and non-transient computer-readable medium containing program
CN114973048A (en) Method and device for correcting rehabilitation action, electronic equipment and readable medium
CN113221815A (en) Gait identification method based on automatic detection technology of skeletal key points
CN111435535B (en) Method and device for acquiring joint point information
CN108885087B (en) Measuring apparatus, measuring method, and computer-readable recording medium
US20220084244A1 (en) Information processing apparatus, information processing method, and program
JP7113274B2 (en) Lower limb muscle strength estimation system, lower limb muscle strength estimation method, and program
CN114463663A (en) Method and device for calculating height of person, electronic equipment and storage medium
US20230141494A1 (en) Markerless motion capture of hands with multiple pose estimation engines
CN114758016B (en) Camera equipment calibration method, electronic equipment and storage medium
Akhavizadegan et al. Camera based arm motion tracking for stroke rehabilitation patients
CN109785364B (en) Method for capturing motion track of motor vehicle user

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant