CN115578753A

CN115578753A - Human body key point detection method and device, electronic equipment and storage medium

Info

Publication number: CN115578753A
Application number: CN202211167898.2A
Authority: CN
Inventors: 宁欣; 冉航; 张玉贵; 李爽; 李卫军
Original assignee: Institute of Semiconductors of CAS
Current assignee: Institute of Semiconductors of CAS
Priority date: 2022-09-23
Filing date: 2022-09-23
Publication date: 2023-01-06
Anticipated expiration: 2042-09-23
Also published as: CN115578753B

Abstract

The invention provides a human body key point detection method and device, an electronic device and a storage medium. The method comprises: acquiring a human body image to be detected; generating, based on the human body image to be detected, a first feature map and a second feature map corresponding to the image; and determining a key point detection result corresponding to the image based on the first feature map, the second feature map and prior knowledge of key point density, wherein the prior knowledge of key point density is used to distinguish the parts of the human body with dense key points from the parts with sparse key points in the second feature map. With this method, feature maps of different resolutions can be used to detect key points at different parts, so that parts with dense key points are detected accurately while the detection accuracy of parts with sparse key points is preserved, which improves the detection accuracy of whole-body key points.

Description

Human body key point detection method and device, electronic equipment and storage medium

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a human body key point detection method and device, electronic equipment and a storage medium.

Background

With the progress of society and the development of science and technology, industries such as online education, intelligent medical treatment and intelligent robots continue to grow, and the demand for interaction functions based on human body key point information keeps increasing. Human body key point detection is the basis of many high-level visual tasks; for example, behavior recognition, clothing analysis and person re-identification can be performed by using human body key point detection technology.

In the related art, a convolutional neural network is usually used to learn the human body key points in a picture or video by supervised learning, but this approach mainly targets the parts of the human body where key points are sparse.

For dense human body key points, however, the distribution density of the key points differs greatly from one body part to another, so detecting all human body key points with the same network structure, as in the related art, leads to low detection accuracy at the parts where key points are dense. How to improve the detection accuracy of whole-body key points is therefore a problem that urgently needs to be solved.

Disclosure of Invention

In order to solve the problems in the prior art, embodiments of the present invention provide a method and an apparatus for detecting a key point of a human body, an electronic device, and a storage medium.

The invention provides a human body key point detection method, which comprises the following steps:

acquiring a human body image to be detected;

generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected; the first feature map and the second feature map have different resolutions;

determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and prior knowledge of key point density; the prior knowledge of key point density is used for distinguishing the parts with dense human body key points from the parts with sparse human body key points in the second feature map.

Optionally, the prior knowledge of the key point density is obtained by:

acquiring the number of key points of each human body part;

and determining the prior knowledge of the key point density based on the number of the key points and the two-dimensional rectangular area corresponding to each human body part.

Optionally, the determining, based on the first feature map, the second feature map, and the prior knowledge of the key point density, a key point detection result corresponding to the human body image to be detected includes:

determining a first detection result corresponding to the parts with sparse human body key points in the human body image to be detected based on the first feature map;

determining a plurality of body part detection frames corresponding to the second feature map based on the first feature map, wherein the body part detection frames are used for cropping the second feature map;

and determining a second detection result corresponding to the parts with dense human body key points in the human body image to be detected based on the second feature map, the body part detection frames and the prior knowledge of key point density.

Optionally, the determining, based on the first feature map, a first detection result corresponding to a sparse part of a human body key point in the human body image to be detected includes:

carrying out down-sampling operation on the first feature map to obtain a first target feature map;

and inputting the first target feature map into a convolutional layer to obtain the first detection result output by the convolutional layer.

Optionally, the determining, based on the second feature map, the body part detection frames and the prior knowledge of key point density, a second detection result corresponding to the parts with dense human body key points in the human body image to be detected includes:

cutting the second feature map by using the body part detection frame to obtain a third feature map corresponding to each human body part;

determining at least one second target feature map based on the prior knowledge of the key point density, wherein the second target feature map is a feature map corresponding to the dense part of the key points belonging to the human body in each third feature map;

and determining the second detection result based on each second target feature map.

Optionally, the generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected includes:

inputting the human body image to be detected into a residual network model to obtain an initial feature map output by the residual network model;

and performing multiple upsampling operations on the initial feature map to obtain the first feature map and the second feature map.

The invention also provides a human body key point detection device, which comprises:

the first acquisition module is used for acquiring a human body image to be detected;

the generating module is used for generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected; the first feature map and the second feature map have different resolutions;

the first determining module is used for determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and prior knowledge of key point density; the prior knowledge of key point density is used for distinguishing the parts with dense human body key points from the parts with sparse human body key points in the second feature map.

The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the human body key point detection method.

The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method of human keypoint detection as described in any of the above.

The invention also provides a computer program product comprising a computer program which, when executed by a processor, implements any of the human key point detection methods described above.

According to the human body key point detection method and device, electronic device and storage medium provided by the invention, a first feature map and a second feature map with different resolutions are generated based on the acquired human body image to be detected. Because the prior knowledge of key point density distinguishes the parts with dense human body key points from the parts with sparse human body key points in the second feature map, feature maps of different resolutions can be used to detect key points at different parts based on the first feature map, the second feature map and the prior knowledge of key point density. Parts with dense key points are thus detected accurately while the detection accuracy of parts with sparse key points is preserved, which improves the detection accuracy of whole-body key points.

Drawings

In order to more clearly illustrate the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.

FIG. 1 is a schematic flow chart of a method for detecting key points of a human body according to the present invention;

FIG. 2 is a second schematic flow chart of the method for detecting key points of a human body according to the present invention;

FIG. 3 is a schematic structural diagram of a human body key point detection device provided by the present invention;

FIG. 4 is a schematic structural diagram of an electronic device provided by the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.

To facilitate a clearer understanding of the embodiments of the present application, some relevant background information is first presented below.

Because the key points of human joints are distributed sparsely and uniformly, a convolutional neural network can be applied directly, and the mapping from the image to the human body key points can be learned directly by supervised learning.

For dense human body key points, however, the distribution density differs greatly between body parts, so using the same network structure for all key points results in low detection accuracy for the dense key points.

In addition, in the related art, the means for detecting key points of the whole body of the human body is to divide the human body into different parts and train a model for each part, and the non-end-to-end method brings extra challenges and burdens to model management, training, data production and the like, so that the actual use is greatly limited.

Therefore, in order to improve the detection accuracy of the key points of the whole human body, the invention provides a method and a device for detecting the key points of the human body, an electronic device and a storage medium.

The following describes the human body key point detection method provided by the present invention with reference to fig. 1. Fig. 1 is a schematic flow chart of a human body key point detection method provided by the present invention, and as shown in fig. 1, the method includes steps 101 to 103, where:

Step 101, acquiring a human body image to be detected.

It should be noted that the execution subject of the present invention may be any electronic device with human body key point detection function, for example, any one of a smart phone, a smart watch, a desktop computer, a portable computer, and the like.

Because the distribution density difference of the dense key points of the human body at each part of the human body is large, if the same network structure is adopted to detect all the key points of the human body, the detection precision of the dense parts of the key points is low.

Therefore, in order to improve the detection accuracy of the key points of the whole body of the human body, in the embodiment, the human body image to be detected needs to be acquired first.

Specifically, the human body image to be detected is a whole-body image of a human body on which key points are to be detected, and it includes parts such as the head, upper arm, lower arm, trunk, thigh, lower leg, hand and foot of the human body.

Step 102, generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected; the first feature map and the second feature map have different resolutions.

In this embodiment, after a human body image to be detected is acquired, a first feature map and a second feature map with different resolutions need to be generated based on the human body image to be detected;

wherein the first feature map is a low-resolution feature map, for example, the resolution is 128 × 128; the second feature map is a high resolution feature map, for example, 512 × 512 resolution.

Step 103, determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and prior knowledge of key point density; the prior knowledge of key point density is used for distinguishing the parts with dense human body key points from the parts with sparse human body key points in the second feature map.

In this embodiment, the key point detection result corresponding to the human body image to be detected needs to be determined based on the first feature map, the second feature map and the key point density prior knowledge.

Specifically, since the prior knowledge of the key point density is used to distinguish the dense parts of the human key points from the sparse parts of the human key points in the second feature map, the detection results of the key points corresponding to the human image to be detected can be specifically divided into: the first detection result corresponding to the sparse part of the human key points and the second detection result corresponding to the dense part of the human key points.

Optionally, in a possible implementation of the embodiment of the present invention, the prior knowledge of key point density may be obtained through the following steps [1] to [2]:

step [1], obtaining the number of key points of each human body part;

and step [2], determining the prior knowledge of the key point density based on the number of the key points and the two-dimensional rectangular area corresponding to each human body part.

In this embodiment, the prior knowledge of key point density is used for distinguishing the parts with dense human body key points from the parts with sparse human body key points. The human body parts are the head, upper arm, lower arm, trunk, thigh, lower leg, hand and foot, and each part has a preset number of key points.

After the number of key points of each human body part is obtained, for each human body part the number of key points of the part is divided by the two-dimensional rectangular area corresponding to the part to obtain a target result for the part.

A threshold is then set. When the target result is greater than the threshold, the part is a part with dense human body key points; conversely, when the target result is smaller than the threshold, the part is a part with sparse human body key points.

In the embodiment, the prior knowledge of the key point density is introduced, and the dense part of the human key points and the sparse part of the human key points in the second feature map can be determined based on the number of the key points and the two-dimensional rectangular area corresponding to each human body part; and further, the characteristic graphs with different resolutions can be adopted for detecting different key point positions.
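As an illustration only, the density calculation and threshold comparison described above can be sketched in Python as follows. The part names follow the description, but the key point counts, rectangle sizes and threshold value are hypothetical, since the embodiment gives no concrete numbers, and the function names are likewise illustrative. The description does not say how a result exactly equal to the threshold is treated; the sketch assumes it counts as sparse.

```python
# Hypothetical per-part key point counts and 2D rectangle sizes (width, height).
# These values are assumptions for illustration, not taken from the patent.
PARTS = {
    "head":  {"keypoints": 68, "box": (40, 50)},
    "hand":  {"keypoints": 21, "box": (20, 20)},
    "trunk": {"keypoints": 4,  "box": (80, 120)},
    "thigh": {"keypoints": 2,  "box": (40, 90)},
}


def keypoint_density(num_keypoints, box):
    """Number of key points divided by the area of the part's rectangle."""
    width, height = box
    area = width * height
    if area <= 0:
        raise ValueError("rectangle area must be positive")
    return num_keypoints / area


def classify_parts(parts, threshold):
    """Label each part 'dense' if its density exceeds the threshold, else 'sparse'.

    Treating density == threshold as sparse is an assumption of this sketch.
    """
    labels = {}
    for name, info in parts.items():
        density = keypoint_density(info["keypoints"], info["box"])
        labels[name] = "dense" if density > threshold else "sparse"
    return labels


# The threshold 0.01 is a hypothetical value chosen for this example.
labels = classify_parts(PARTS, threshold=0.01)
```

With these assumed numbers, the head and hand exceed the threshold and are treated as dense parts, while the trunk and thigh are treated as sparse parts.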

The human body key point detection method provided by the invention generates a first feature map and a second feature map with different resolutions based on the acquired human body image to be detected. Because the prior knowledge of key point density distinguishes the parts with dense human body key points from the parts with sparse human body key points in the second feature map, feature maps of different resolutions can be used to detect key points at different parts based on the first feature map, the second feature map and the prior knowledge of key point density. Parts with dense key points are thus detected accurately while the detection accuracy of parts with sparse key points is preserved, which improves the detection accuracy of human body key points.

Optionally, the generating of the first feature map and the second feature map corresponding to the human body image to be detected based on the human body image to be detected may be specifically implemented by the following steps a) to b):

step a), inputting the human body image to be detected into a residual network model to obtain an initial feature map output by the residual network model;

step b), performing multiple upsampling operations on the initial feature map to obtain the first feature map and the second feature map.

In this embodiment, after the human body image to be detected (for example, with a resolution of 512 × 512) is acquired, it is input into a residual network model (for example, ResNet-50) for basic feature extraction, so as to obtain the initial feature map (for example, with a resolution of 16 × 16) output by the residual network model.

Multiple upsampling operations are then performed on the initial feature map to obtain the first feature map and the second feature map.

Specifically, after the initial feature map with a resolution of 16 × 16 is obtained, it is input into a plurality of upsampling modules, and the first feature map (for example, with a resolution of 128 × 128) and the second feature map (for example, with a resolution of 512 × 512) are obtained as the outputs of different upsampling modules;

wherein each upsampling module comprises a 2× upsampling layer, a 3 × 3 convolution layer, a batch normalization layer and a ReLU layer.
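As a worked check of the example resolutions, the following Python sketch counts how many 2× upsampling modules are needed to reach each feature map from the 16 × 16 initial feature map. The description does not state the number of modules; the sketch assumes the modules are chained one after another, and the function name is illustrative.

```python
def upsampling_stages(src, dst, factor=2):
    """Count the `factor`x upsampling modules needed to grow side length src to dst."""
    if src <= 0 or dst < src:
        raise ValueError("expected 0 < src <= dst")
    stages = 0
    size = src
    while size < dst:
        size *= factor
        stages += 1
    if size != dst:
        raise ValueError("dst is not src multiplied by a power of factor")
    return stages


# Example resolutions from the description: 16 x 16 initial feature map,
# 128 x 128 first feature map, 512 x 512 second feature map.
first_stages = upsampling_stages(16, 128)
second_stages = upsampling_stages(16, 512)
```

Under this assumption, the first feature map would be taken after the third module (16 → 32 → 64 → 128) and the second feature map after the fifth (→ 256 → 512).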

In the above embodiment, the human body image to be detected is input into the residual network model, and multiple upsampling operations are then performed on the initial feature map output by the residual network model to obtain the first feature map and the second feature map with different resolutions. Feature maps of different resolutions can therefore be used to detect key points at different parts, so that parts with dense key points are detected accurately while the detection accuracy of parts with sparse key points is preserved, which improves the detection accuracy of whole-body key points.

Optionally, the determining the key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and the key point density priori knowledge may be implemented by the following steps 1) to 3):

step 1), determining a first detection result corresponding to the parts with sparse human body key points in the human body image to be detected based on the first feature map;

step 2), determining a plurality of body part detection frames corresponding to the second feature map based on the first feature map, wherein the body part detection frames are used for cropping the second feature map;

step 3), determining a second detection result corresponding to the parts with dense human body key points in the human body image to be detected based on the second feature map, the body part detection frames and the prior knowledge of key point density.

In this embodiment, since the sparse part of the human key points does not need to be detected by the high-resolution feature map, the first detection result corresponding to the sparse part of the human key points in the human image to be detected can be determined directly based on the first feature map.

In order to cut out the features of the parts with dense human body key points from the high-resolution feature map (namely the second feature map) and to eliminate feature interference from other parts, in this embodiment a human body part detection network is connected after the first feature map. This network outputs, on the first feature map, body part detection frames for the head, upper arm, lower arm, trunk, thigh, lower leg, hand and foot, and the body part detection frames are then enlarged onto the second feature map. Each body part detection frame is used for cropping the second feature map to obtain the feature map corresponding to each human body part.

And finally, determining a second detection result corresponding to the dense parts of the key points of the human body in the human body image to be detected based on the second feature map, the detection frames of the body parts and the prior knowledge of the density of the key points.
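As a minimal sketch of enlarging a body part detection frame from the first feature map onto the second feature map, the following Python code scales the frame coordinates by the ratio of the two resolutions. The (x0, y0, x1, y1) frame format, the example coordinates and the function name are assumptions made for illustration; the description does not specify how the detection frames are represented.

```python
def scale_box(box, src_size, dst_size):
    """Map a frame (x0, y0, x1, y1) from a src_size-wide map to a dst_size-wide map."""
    if src_size <= 0 or dst_size <= 0:
        raise ValueError("feature map sizes must be positive")
    ratio = dst_size / src_size
    x0, y0, x1, y1 = box
    return (round(x0 * ratio), round(y0 * ratio),
            round(x1 * ratio), round(y1 * ratio))


# Hypothetical hand frame on the 128 x 128 first feature map.
hand_box_low = (10, 20, 18, 30)
# The same frame on the 512 x 512 second feature map (ratio 4).
hand_box_high = scale_box(hand_box_low, src_size=128, dst_size=512)
```

With the example resolutions, every coordinate is multiplied by 4, so the frame covers the same region of the body on the higher-resolution map.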

In the above embodiment, based on the first feature map, the second feature map and the prior knowledge of the key point density, feature maps with different resolutions can be used for detecting different key point positions, so that on the basis of ensuring the detection precision of the sparse human key point positions, the precise detection of the dense human key point positions is realized, and the detection precision of the key points of the whole human body is further improved.

Optionally, the determining, based on the first feature map, a first detection result corresponding to a sparse part of a human body key point in the human body image to be detected may specifically be implemented by the following steps (1) to (2):

step (1), performing a down-sampling operation on the first feature map to obtain a first target feature map;

step (2), inputting the first target feature map into a convolutional layer to obtain the first detection result output by the convolutional layer.

In the embodiment, since the sparse part of the key points of the human body does not need to be detected by a high-resolution feature map, the first feature map is directly subjected to downsampling operation to obtain a first target feature map; and then inputting the first target feature map into the convolutional layer to obtain a first detection result.

Specifically, the first feature map (for example, with a resolution of 128 × 128) is first input into a down-sampling module, where the down-sampling module comprises at least a 3 × 3 convolution layer, a max-pooling layer, a batch normalization layer and a ReLU layer.

After the first feature map with a resolution of 128 × 128 is input into the down-sampling module, the first target feature map is obtained, with its resolution reduced to 64 × 64.

Finally, the first target feature map with a resolution of 64 × 64 is input into a 1 × 1 convolution layer to obtain a multichannel heatmap, and the first detection result corresponding to the parts with sparse human body key points in the human body image to be detected is output in the form of the heatmap.
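To illustrate how the down-sampling module can halve the resolution from 128 × 128 to 64 × 64, the following Python sketch applies max pooling to a single-channel feature map stored as a list of rows. The 2 × 2 window with stride 2 is an assumption, since the description names a max-pooling layer without giving its parameters; the convolution, batch normalization and ReLU layers are omitted, and the function name is illustrative.

```python
def max_pool_2x2(feature_map):
    """Max pooling with an assumed 2 x 2 window and stride 2 on a list of rows."""
    height = len(feature_map)
    width = len(feature_map[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


# Synthetic 128 x 128 single-channel map used only to show the shape change.
first_feature_map = [[(r * 7 + c * 3) % 11 for c in range(128)] for r in range(128)]
first_target_feature_map = max_pool_2x2(first_feature_map)
```

Each output value is the maximum of one 2 × 2 block of the input, so the pooled map has half the height and half the width of the input.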

In this embodiment, key points at the parts with sparse human body key points are detected directly from the first feature map, which ensures the detection accuracy for those parts.

Optionally, the determining, based on the second feature map, the body part detection frames and the prior knowledge of key point density, a second detection result corresponding to the parts with dense human body key points in the human body image to be detected may specifically be implemented by the following steps [a] to [c]:

step [a], cropping the second feature map by using the body part detection frames to obtain a third feature map corresponding to each human body part;

step [b], determining at least one second target feature map based on the prior knowledge of key point density, wherein a second target feature map is a third feature map that corresponds to a part with dense human body key points;

step [c], determining the second detection result based on each second target feature map.

In this embodiment, since the position and size information of each part of the human body on the second feature map is already obtained based on the body part detection frame, the second feature map can be cropped by using the body part detection frame to obtain the third feature map corresponding to each part of the human body.

For example, a human body part detection network is connected after the first feature map and outputs, on the first feature map, body part detection frames for the head, upper arm, lower arm, trunk, thigh, lower leg, hand and foot. The body part detection frames are enlarged proportionally onto the second feature map, and the second feature map is cropped by using the body part detection frames to obtain the third feature maps corresponding to the head, upper arm, lower arm, trunk, thigh, lower leg, hand and foot;

and then determining feature maps corresponding to the dense parts of the key points of the human body in each third feature map as second target feature maps based on the prior knowledge of the key point density.

So that the same key point detection network as that used for the first feature map can be applied to all feature maps, each second target feature map is scaled until its resolution is the same as that of the first feature map.

Finally, the scaled second target feature map is input into a down-sampling module and a 1 × 1 convolution layer to obtain the second detection result corresponding to the parts with dense human body key points in the human body image to be detected; the down-sampling module comprises at least a 3 × 3 convolution layer, a max-pooling layer, a batch normalization layer and a ReLU layer.
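The cropping and scaling described above can be sketched in Python as follows, using a single-channel feature map stored as a list of rows. The (x0, y0, x1, y1) frame format, the example frame and the nearest-neighbour scaling are assumptions made for illustration; the description only states that the second target feature map is scaled to the resolution of the first feature map, without naming an interpolation method.

```python
def crop(feature_map, box):
    """Cut the region (x0, y0, x1, y1) out of a feature map stored as rows."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in feature_map[y0:y1]]


def resize_nearest(feature_map, out_h, out_w):
    """Scale a feature map to out_h x out_w by nearest-neighbour sampling (assumed)."""
    in_h = len(feature_map)
    in_w = len(feature_map[0])
    return [
        [feature_map[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# Synthetic 512 x 512 second feature map; each value encodes its own position.
second_feature_map = [[r * 512 + c for c in range(512)] for r in range(512)]

# Hypothetical hand frame already expressed in second-feature-map coordinates.
hand_box = (40, 80, 72, 120)

third_feature_map = crop(second_feature_map, hand_box)             # 40 rows x 32 columns
second_target_scaled = resize_nearest(third_feature_map, 128, 128)  # 128 x 128
```

After scaling, the cropped part has the same 128 × 128 resolution as the first feature map, so it can be passed to the same down-sampling module and 1 × 1 convolution layer.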

In this embodiment, a first feature map and a second feature map with different resolutions are generated based on the acquired human body image to be detected. Because the prior knowledge of key point density distinguishes the parts with dense human body key points from the parts with sparse human body key points in the second feature map, feature maps of different resolutions can be used to detect key points at different parts based on the first feature map, the second feature map and the prior knowledge of key point density. Parts with dense key points are thus detected accurately while the detection accuracy of parts with sparse key points is preserved, which improves the detection accuracy of whole-body key points. In addition, the method is an end-to-end detection approach: the same key point detection network is used for the parts with dense key points and the parts with sparse key points, and different models do not need to be trained for different human body parts. This reduces the complexity of human body key point detection, avoids the problems of non-end-to-end pipelines, such as complex training of key point detection models, difficult model management and high data production cost, and saves manpower and material resources.

Fig. 2 is a second schematic flow chart of the human body key point detection method provided by the present invention, and as shown in fig. 2, the method includes steps 201 to 208, wherein:

step 201, obtaining a human body image to be detected.

Step 202, inputting the human body image to be detected into the residual network model to obtain an initial feature map output by the residual network model.

Step 203, performing multiple upsampling operations on the initial feature map to obtain a first feature map and a second feature map; the first feature map and the second feature map have different resolutions.

Step 204, performing a down-sampling operation on the first feature map to obtain a first target feature map.

Step 205, inputting the first target feature map into the convolutional layer to obtain a first detection result output by the convolutional layer.

Step 206, cropping the second feature map by using the body part detection frames to obtain a third feature map corresponding to each human body part.

Step 207, determining at least one second target feature map based on the prior knowledge of key point density, wherein a second target feature map is a third feature map that corresponds to a part with dense human body key points.

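It should be noted that the prior knowledge of key point density can be obtained through the following steps [1] to [2]:

step [1], obtaining the number of key points of each human body part;

step [2], determining the prior knowledge of key point density based on the number of key points and the two-dimensional rectangular area corresponding to each human body part.

Step 208, determining a second detection result based on each second target feature map.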

It should be noted that the present invention does not limit the execution order of steps 204 to 205 relative to steps 206 to 208; that is, the two groups of steps need not be executed in any particular order.

The human body key point detection method provided by the invention generates a first feature map and a second feature map with different resolutions based on the acquired human body image to be detected. Because the prior knowledge of key point density distinguishes the parts with dense human body key points from the parts with sparse human body key points in the second feature map, feature maps of different resolutions can be used to detect key points at different parts based on the first feature map, the second feature map and the prior knowledge of key point density. Parts with dense key points are thus detected accurately while the detection accuracy of parts with sparse key points is preserved, which improves the detection accuracy of whole-body key points. In addition, the method is an end-to-end detection approach: the same key point detection network is used for the parts with dense key points and the parts with sparse key points, and different models do not need to be trained for different human body parts. This reduces the complexity of human body key point detection, avoids the problems of non-end-to-end pipelines, such as complex training of key point detection models, difficult model management and high data production cost, and saves manpower and material resources.

The following describes the human body key point detection device provided by the present invention; the device described below and the human body key point detection method described above may be cross-referenced. Fig. 3 is a schematic structural diagram of the human body key point detection device provided by the present invention. As shown in Fig. 3, the human body key point detection device 300 includes: a first acquisition module 301, a generating module 302 and a first determining module 303, wherein:

a first acquisition module 301, configured to acquire a human body image to be detected;

a generating module 302, configured to generate a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected; the first feature map and the second feature map have different resolutions;

a first determining module 303, configured to determine, based on the first feature map, the second feature map and the key point density prior knowledge, a key point detection result corresponding to the human body image to be detected; the key point density prior knowledge is used for distinguishing the dense parts of the human key points from the sparse parts of the human key points in the second feature map.

The human body key point detection device provided by the invention generates a first feature map and a second feature map with different resolutions based on the acquired human body image to be detected. Because the key point density prior knowledge can distinguish the dense parts of the human key points from the sparse parts of the human key points in the second feature map, feature maps with different resolutions can be used to detect different key point parts based on the first feature map, the second feature map and the key point density prior knowledge. The dense parts of the human key points are thus detected accurately while the detection accuracy of the sparse parts is maintained, which improves the detection accuracy of whole-body human key points.

Optionally, the apparatus further comprises:

the second acquisition module is used for acquiring the number of key points of each human body part;

and the second determining module is used for determining the prior knowledge of the key point density based on the number of the key points and the two-dimensional rectangular area corresponding to each human body part.
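As a minimal sketch of how such prior knowledge might be derived, the example below divides the number of key points of each body part by the area of its two-dimensional rectangle and compares the result with a threshold. The ratio, the threshold value, the function name `keypoint_density_prior` and all numbers are assumptions made for illustration; the text above only states that the prior is determined from the number of key points and the rectangular area.

```python
from typing import Dict, Tuple


def keypoint_density_prior(counts: Dict[str, int],
                           box_sizes: Dict[str, Tuple[float, float]],
                           threshold: float) -> Dict[str, bool]:
    """Return True for parts treated as key-point dense, False for sparse.

    Density is assumed here to be (number of key points) / (rectangle area).
    """
    prior: Dict[str, bool] = {}
    for part, n in counts.items():
        width, height = box_sizes[part]
        area = width * height
        if area <= 0:
            raise ValueError(f"rectangle area for {part!r} must be positive")
        prior[part] = (n / area) >= threshold
    return prior


# Illustrative values only (key point counts and rectangle sizes in pixels).
counts = {"face": 68, "hand": 21, "torso": 12}
box_sizes = {"face": (40.0, 50.0), "hand": (30.0, 30.0), "torso": (200.0, 400.0)}

density_prior = keypoint_density_prior(counts, box_sizes, threshold=0.005)
```

With these illustrative numbers the face and hand are marked dense and the torso sparse, because many key points fall inside a small rectangle for the former and few key points are spread over a large rectangle for the latter.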

Optionally, the first determining module 303 is further configured to:

performing down-sampling operation on the first feature map to obtain a first target feature map;

and inputting the first target feature map into a convolutional layer to obtain the first detection result output by the convolutional layer.
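The two operations above can be sketched in plain Python. The sketch assumes 2x2 average pooling as the down-sampling operation and a 1x1 convolution as the convolutional layer; neither choice is specified in the text, and the helper names `downsample_2x` and `conv1x1`, the weights and the input values are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def downsample_2x(fmap: Matrix) -> Matrix:
    """2x2 average pooling with stride 2 (one possible down-sampling operation)."""
    rows, cols = len(fmap), len(fmap[0])
    return [
        [
            (fmap[i][j] + fmap[i][j + 1] + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
            for j in range(0, cols - 1, 2)
        ]
        for i in range(0, rows - 1, 2)
    ]


def conv1x1(channels: List[Matrix], weights: List[List[float]],
            bias: List[float]) -> List[Matrix]:
    """1x1 convolution: each output channel is a per-pixel weighted sum of inputs."""
    rows, cols = len(channels[0]), len(channels[0][0])
    out: List[Matrix] = []
    for k, w_row in enumerate(weights):
        out.append([
            [
                bias[k] + sum(w_row[c] * channels[c][i][j]
                              for c in range(len(channels)))
                for j in range(cols)
            ]
            for i in range(rows)
        ])
    return out


# Illustrative first feature map: 2 channels of size 4x4.
first_feature_map = [
    [[1.0, 1.0, 2.0, 2.0],
     [1.0, 1.0, 2.0, 2.0],
     [3.0, 3.0, 4.0, 4.0],
     [3.0, 3.0, 4.0, 4.0]],
    [[0.0] * 4 for _ in range(4)],
]

# Down-sample each channel to obtain the first target feature map (2 x 2x2).
first_target = [downsample_2x(ch) for ch in first_feature_map]

# Apply the assumed 1x1 convolution to obtain the first detection result.
first_detection = conv1x1(first_target,
                          weights=[[1.0, 0.0], [0.5, 0.5]],
                          bias=[0.0, 0.0])
```

In a real network the convolution weights would be learned and the output would typically contain one channel per key point of the sparse parts; the fixed weights here only make the arithmetic easy to follow.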

Optionally, the first determining module 303 is further configured to:

Optionally, the generating module 302 is further configured to:

Fig. 4 is a schematic diagram of the physical structure of an electronic device. As shown in Fig. 4, the electronic device may include: a processor 410, a communication interface 420, a memory 430 and a communication bus 440, wherein the processor 410, the communication interface 420 and the memory 430 communicate with each other via the communication bus 440. The processor 410 may invoke logic instructions in the memory 430 to perform a human body key point detection method, the method comprising: acquiring a human body image to be detected; generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected, the first feature map and the second feature map having different resolutions; and determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and key point density prior knowledge, wherein the key point density prior knowledge is used for distinguishing the dense parts of the human key points from the sparse parts of the human key points in the second feature map.

In addition, the logic instructions in the memory 430 may be implemented in the form of software functional units and, when sold or used as independent products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, or the part of it that contributes over the prior art, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.

In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, wherein, when the computer program is executed by a processor, a computer is able to execute the human body key point detection method provided by the above method embodiments, the method comprising: acquiring a human body image to be detected; generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected, the first feature map and the second feature map having different resolutions; and determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and key point density prior knowledge, wherein the key point density prior knowledge is used for distinguishing the dense parts of the human key points from the sparse parts of the human key points in the second feature map.

In still another aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, performs the human body key point detection method provided by the above method embodiments, the method comprising: acquiring a human body image to be detected; generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected, the first feature map and the second feature map having different resolutions; and determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and key point density prior knowledge, wherein the key point density prior knowledge is used for distinguishing the dense parts of the human key points from the sparse parts of the human key points in the second feature map.

The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A human body key point detection method is characterized by comprising the following steps:

acquiring a human body image to be detected;

determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and key point density prior knowledge; wherein the key point density prior knowledge is used for distinguishing the dense parts of the human key points from the sparse parts of the human key points in the second feature map.

2. The method of claim 1, wherein the key point density prior knowledge is obtained by:

acquiring the number of key points of each human body part;

3. The method for detecting human key points according to claim 1 or 2, wherein the determining a key point detection result corresponding to the human image to be detected based on the first feature map, the second feature map and key point density prior knowledge comprises:

4. The method according to claim 3, wherein the determining a first detection result corresponding to the sparse part of the human key points in the human image to be detected based on the first feature map comprises:

5. The method according to claim 3, wherein the determining a second detection result corresponding to a dense part of the human body key points in the human body image to be detected based on the second feature map, the detection frames of the body parts, and the prior knowledge of the key point density comprises:

6. The method according to claim 1, wherein the generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected comprises:

7. A human key point detection device, comprising:

the first determining module is used for determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and key point density prior knowledge; wherein the key point density prior knowledge is used for distinguishing the dense parts of the human key points from the sparse parts of the human key points in the second feature map.

8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of human keypoint detection according to any of claims 1 to 6 when executing the program.

9. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the human keypoint detection method according to any one of claims 1 to 6.

10. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the human keypoint detection method of any one of claims 1 to 6.