CN115578753B - Human body key point detection method and device, electronic equipment and storage medium

Info

Publication number
CN115578753B
Authority
CN
China
Prior art keywords
human body
feature map
key point
detected
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211167898.2A
Other languages
Chinese (zh)
Other versions
CN115578753A (en
Inventor
宁欣
冉航
张玉贵
李爽
李卫军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Semiconductors of CAS
Original Assignee
Institute of Semiconductors of CAS
Application filed by Institute of Semiconductors of CAS
Priority to CN202211167898.2A
Publication of CN115578753A
Application granted
Publication of CN115578753B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a human body key point detection method and device, an electronic device, and a storage medium. The method comprises: acquiring a human body image to be detected; generating, based on the human body image to be detected, a first feature map and a second feature map corresponding to it; and determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map, and prior knowledge of key point density, where the prior knowledge of key point density is used to distinguish the dense key point parts from the sparse key point parts of the human body in the second feature map. With this method, feature maps of different resolutions can be used to detect different key point parts, so that dense key point parts are detected accurately while the detection accuracy of sparse key point parts is maintained, which improves the detection accuracy of whole-body key points.

Description

Human body key point detection method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a method and apparatus for detecting key points of a human body, an electronic device, and a storage medium.
Background
With the development of society and technology, industries such as online education, intelligent medical treatment, and intelligent robots continue to grow, and demand is increasing for interaction functions based on human body key point information. Human body key point detection is the basis of many high-level vision tasks; for example, it can be used for behavior recognition, clothing parsing, and pedestrian re-identification.
In the related art, a convolutional neural network is generally used to learn the key points of the human body in a picture or video through supervised learning, but this approach is mainly used to detect key points of sparse parts of the human body.
However, the distribution density of dense human body key points differs greatly between body parts, and the related art uses the same network structure to detect all human body key points, which results in low detection accuracy for parts with dense key points. How to improve the detection accuracy of whole-body key points is therefore a problem to be solved.
Disclosure of Invention
In view of the problems in the prior art, embodiments of the invention provide a human body key point detection method and device, an electronic device, and a storage medium.
The invention provides a human body key point detection method, which comprises the following steps:
acquiring a human body image to be detected;
generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected; the first feature map and the second feature map have different resolutions;
determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map, and prior knowledge of key point density; the prior knowledge of key point density is used for distinguishing the dense key point parts from the sparse key point parts of the human body in the second feature map.
Optionally, the prior knowledge of key point density is obtained by:
acquiring the number of key points of each human body part;
and determining the prior knowledge of key point density based on the number of key points and the two-dimensional rectangular area corresponding to each human body part.
Optionally, the determining, based on the first feature map, the second feature map, and the prior knowledge of key point density, a key point detection result corresponding to the human body image to be detected includes:
determining, based on the first feature map, a first detection result corresponding to the sparse key point parts of the human body in the human body image to be detected;
determining, based on the first feature map, a plurality of body part detection frames corresponding to the second feature map, wherein the body part detection frames are used for cropping the second feature map;
and determining, based on the second feature map, each body part detection frame, and the prior knowledge of key point density, a second detection result corresponding to the dense key point parts of the human body in the human body image to be detected.
Optionally, the determining, based on the first feature map, a first detection result corresponding to the sparse key point parts of the human body in the human body image to be detected includes:
performing downsampling operation on the first feature map to obtain a first target feature map;
and inputting the first target feature map into a convolution layer to obtain the first detection result output by the convolution layer.
Optionally, the determining, based on the second feature map, each body part detection frame, and the prior knowledge of key point density, a second detection result corresponding to the dense key point parts of the human body in the human body image to be detected includes:
cropping the second feature map with the body part detection frames to obtain third feature maps corresponding to the human body parts;
determining at least one second target feature map based on the prior knowledge of key point density, wherein a second target feature map is a third feature map that corresponds to a dense key point part of the human body;
and determining the second detection result based on each second target feature map.
Optionally, the generating, based on the human body image to be detected, a first feature map and a second feature map corresponding to the human body image to be detected includes:
inputting the human body image to be detected into a residual network model to obtain an initial feature map output by the residual network model;
and performing a plurality of up-sampling operations on the initial feature map to obtain the first feature map and the second feature map.
The invention also provides a human body key point detection device, which comprises:
the first acquisition module is used for acquiring a human body image to be detected;
the generation module is used for generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected; the first feature map and the second feature map have different resolutions;
the first determining module is used for determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map, and prior knowledge of key point density; the prior knowledge of key point density is used for distinguishing the dense key point parts from the sparse key point parts of the human body in the second feature map.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the human body key point detection method according to any one of the above when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a human key point detection method as described in any of the above.
The invention also provides a computer program product comprising a computer program which when executed by a processor implements a human body key point detection method as described in any one of the above.
According to the human body key point detection method and device, electronic device, and storage medium provided by the invention, a first feature map and a second feature map with different resolutions are generated based on the acquired human body image to be detected. Because the prior knowledge of key point density distinguishes the dense key point parts from the sparse key point parts of the human body in the second feature map, feature maps of different resolutions can be used to detect different key point parts based on the first feature map, the second feature map, and the prior knowledge of key point density. Dense key point parts are thus detected accurately while the detection accuracy of sparse key point parts is maintained, which improves the detection accuracy of whole-body key points.
Drawings
To illustrate the technical solutions of the invention or of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of the human body key point detection method provided by the invention;
FIG. 2 is a second schematic flow chart of the human body key point detection method provided by the invention;
FIG. 3 is a schematic structural diagram of the human body key point detection device provided by the invention;
FIG. 4 is a schematic structural diagram of the electronic device provided by the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In order to facilitate a clearer understanding of the various embodiments of the present application, some relevant background knowledge is first presented below.
Because human joint key points are sparsely and uniformly distributed, a convolutional neural network can directly learn the mapping from an image to the human body key points through supervised learning.
However, the distribution density of dense human body key points differs greatly between body parts, and using the same network structure for all key points leads to low detection accuracy for parts with dense key points.
In addition, the related art detects whole-body key points by dividing the human body into different parts and training a separate model for each part. Such non-end-to-end methods bring additional challenges and burdens in model management, training, and data production, which greatly limits their practical use.
Therefore, to improve the detection accuracy of whole-body key points, the invention provides a human body key point detection method and device, an electronic device, and a storage medium.
The human body key point detection method provided by the invention is described in detail below with reference to FIG. 1. FIG. 1 is a schematic flow chart of the human body key point detection method provided by the invention. As shown in FIG. 1, the method includes steps 101 to 103, wherein:
Step 101, acquiring a human body image to be detected.
It should first be noted that the method of the invention may be executed by any electronic device with a human body key point detection function, for example a smartphone, a smart watch, a desktop computer, or a laptop computer.
Because the distribution density of dense human body key points differs greatly between body parts, using the same network structure to detect all human body key points results in low detection accuracy for parts with dense key points.
Therefore, to improve the detection accuracy of whole-body key points, this embodiment first acquires the human body image to be detected.
Specifically, the human body image to be detected is a whole-body image of a person on which key point detection is to be performed, and it includes parts such as the head, upper arms, lower arms, torso, thighs, calves, hands, and feet.
Step 102, generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected; the first and second feature maps have different resolutions.
In this embodiment, after the human body image to be detected is acquired, a first feature map and a second feature map with different resolutions are generated based on it.
The first feature map is a low-resolution feature map, for example with a resolution of 128×128; the second feature map is a high-resolution feature map, for example with a resolution of 512×512.
Step 103, determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map, and prior knowledge of key point density; the prior knowledge of key point density is used for distinguishing the dense key point parts from the sparse key point parts of the human body in the second feature map.
In this embodiment, the key point detection result corresponding to the human body image to be detected is determined based on the first feature map, the second feature map, and the prior knowledge of key point density.
Specifically, because the prior knowledge of key point density distinguishes the dense key point parts from the sparse key point parts of the human body in the second feature map, the key point detection result can be divided into a first detection result corresponding to the sparse key point parts and a second detection result corresponding to the dense key point parts.
Optionally, in one possible implementation of the embodiment of the invention, the prior knowledge of key point density may be obtained through the following steps [1] to [2]:
Step [1], acquiring the number of key points of each human body part;
Step [2], determining the prior knowledge of key point density based on the number of key points and the two-dimensional rectangular area corresponding to each human body part.
In this embodiment, the prior knowledge of key point density is used to distinguish the dense key point parts from the sparse key point parts of the human body. The human body parts are the head, upper arm, lower arm, torso, thigh, calf, hand, and foot, and each part has a predetermined number of key points.
After the number of key points of each human body part is acquired, the number of key points of each part is divided by the two-dimensional rectangular area corresponding to that part to obtain a target result for the part, that is, its key point density.
A threshold is then set: if the target result of a part is greater than the threshold, the part is a dense key point part; if the target result is less than the threshold, the part is a sparse key point part.
In the above embodiment, prior knowledge of key point density is introduced, so that the dense and sparse key point parts of the human body in the second feature map can be determined from the number of key points and the two-dimensional rectangular area of each human body part, and feature maps of different resolutions can then be used to detect different key point parts.
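The density rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the key point counts, rectangle sizes, and threshold below are hypothetical values chosen for the example, and the function names are not from the patent. The text does not say how a target result exactly equal to the threshold is handled; the sketch treats it as sparse.

```python
# Hypothetical per-part statistics: key point count and 2D rectangle (width, height).
# None of these numbers come from the patent; they only illustrate the rule.
PART_STATS = {
    "head": {"num_keypoints": 68, "rect": (40.0, 50.0)},
    "hand": {"num_keypoints": 21, "rect": (20.0, 25.0)},
    "torso": {"num_keypoints": 4, "rect": (80.0, 120.0)},
    "thigh": {"num_keypoints": 2, "rect": (40.0, 90.0)},
}


def keypoint_density(num_keypoints, rect):
    """Target result of a part: key point count divided by rectangle area."""
    width, height = rect
    area = width * height
    if area <= 0:
        raise ValueError("rectangle area must be positive")
    return num_keypoints / area


def build_density_prior(part_stats, threshold):
    """Label each part 'dense' or 'sparse' by comparing its density to the threshold."""
    prior = {}
    for part, stats in part_stats.items():
        density = keypoint_density(stats["num_keypoints"], stats["rect"])
        # Assumption: a density exactly equal to the threshold counts as sparse.
        prior[part] = "dense" if density > threshold else "sparse"
    return prior


prior = build_density_prior(PART_STATS, threshold=0.01)
```

With these illustrative numbers the head and hand exceed the threshold and are labelled dense, while the torso and thigh fall below it and are labelled sparse.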
According to the human body key point detection method provided by the invention, a first feature map and a second feature map with different resolutions are generated based on the acquired human body image to be detected. Because the prior knowledge of key point density distinguishes the dense key point parts from the sparse key point parts of the human body in the second feature map, feature maps of different resolutions can be used to detect different key point parts based on the first feature map, the second feature map, and the prior knowledge of key point density. Dense key point parts are thus detected accurately while the detection accuracy of sparse key point parts is maintained, which improves the detection accuracy of whole-body key points.
Optionally, generating the first feature map and the second feature map corresponding to the human body image to be detected based on the human body image to be detected may be implemented through the following steps a) to b):
Step a), inputting the human body image to be detected into a residual network model to obtain an initial feature map output by the residual network model;
Step b), performing a plurality of up-sampling operations on the initial feature map to obtain the first feature map and the second feature map.
In this embodiment, after the human body image to be detected (for example, with a resolution of 512×512) is acquired, it is input into a residual network model (for example, ResNet-50) for basic feature extraction, and the residual network model outputs an initial feature map (for example, with a resolution of 16×16).
A plurality of up-sampling operations are then performed on the initial feature map to obtain the first feature map and the second feature map.
Specifically, after the initial feature map with a resolution of 16×16 is obtained, it is passed through a plurality of up-sampling modules, and the first feature map (for example, with a resolution of 128×128) and the second feature map (for example, with a resolution of 512×512) are taken from the outputs of different up-sampling modules.
Each up-sampling module includes a 2× up-sampling layer, a 3×3 convolution layer, a batch normalization layer, and a ReLU layer.
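The resolution arithmetic behind the example figures can be checked with a short Python sketch. It assumes that every up-sampling module doubles the spatial resolution (as the 2× up-sampling layer suggests) and that the convolution, batch normalization, and ReLU layers leave the resolution unchanged. The helper names are hypothetical, and nearest-neighbour interpolation is an assumption, since the text does not name the interpolation method.

```python
def upsample_schedule(initial_size, target_sizes, factor=2):
    """Return {target_size: number of chained up-sampling modules needed}."""
    schedule = {}
    size, modules = initial_size, 0
    for target in sorted(target_sizes):
        while size < target:
            size *= factor
            modules += 1
        if size != target:
            raise ValueError(f"{target} is not reachable from {initial_size}")
        schedule[target] = modules
    return schedule


def upsample2x_nearest(grid):
    """2x nearest-neighbour up-sampling of one [H][W] channel (assumed method)."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


# 16 -> 32 -> 64 -> 128 uses three modules; continuing to 512 uses five in total.
schedule = upsample_schedule(16, [128, 512])
doubled = upsample2x_nearest([[1, 2], [3, 4]])
```

Under these assumptions the first feature map would be taken after the third up-sampling module and the second feature map after the fifth.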
In the above embodiment, the human body image to be detected is input into the residual network model, and a plurality of up-sampling operations are performed on the initial feature map output by the residual network model to obtain a first feature map and a second feature map with different resolutions. Feature maps of different resolutions can therefore be used to detect different key point parts, so that dense key point parts are detected accurately while the detection accuracy of sparse key point parts is maintained, which improves the detection accuracy of whole-body key points.
Optionally, determining the key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map, and the prior knowledge of key point density may be implemented through the following steps 1) to 3):
Step 1), determining, based on the first feature map, a first detection result corresponding to the sparse key point parts of the human body in the human body image to be detected;
Step 2), determining, based on the first feature map, a plurality of body part detection frames corresponding to the second feature map, wherein the body part detection frames are used for cropping the second feature map;
Step 3), determining, based on the second feature map, each body part detection frame, and the prior knowledge of key point density, a second detection result corresponding to the dense key point parts of the human body in the human body image to be detected.
In this embodiment, because the sparse key point parts of the human body do not need to be detected on the high-resolution feature map, the first detection result corresponding to the sparse key point parts in the human body image to be detected can be determined directly based on the first feature map.
The features of the dense key point parts of the human body need to be cropped out of the high-resolution feature map (namely the second feature map) so that feature interference from other parts is eliminated.
Therefore, in this embodiment, a body part detection network is also connected after the first feature map. It outputs body part detection frames for the head, upper arm, lower arm, torso, thigh, calf, hand, and foot on the first feature map, and these frames are scaled up proportionally to the second feature map. Each body part detection frame is then used to crop the second feature map to obtain the feature map corresponding to each human body part.
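The proportional scaling and cropping described above can be sketched as follows. This is a minimal illustration, assuming axis-aligned frames given as (x0, y0, x1, y1) in feature-map coordinates and a single-channel feature map stored as nested lists. The frame values and function names are hypothetical, and the body part detection network that would produce the frames is not modelled.

```python
def scale_box(box, src_size, dst_size):
    """Scale an (x0, y0, x1, y1) frame from a src_size map to a dst_size map."""
    ratio = dst_size / src_size
    return tuple(int(round(coord * ratio)) for coord in box)


def crop(feature_map, box):
    """Crop one [H][W] channel with an (x0, y0, x1, y1) frame."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in feature_map[y0:y1]]


# Stand-in for one channel of the 512x512 second feature map; each cell
# stores its own flattened index so the crop is easy to verify.
second_map = [[y * 512 + x for x in range(512)] for y in range(512)]

hand_box_128 = (10, 20, 18, 30)  # hypothetical hand frame on the 128x128 map
hand_box_512 = scale_box(hand_box_128, 128, 512)  # ratio 512 / 128 = 4
third_map = crop(second_map, hand_box_512)
```

Because both example maps are square and differ by a factor of 4, a frame of 8×10 cells on the first feature map becomes a 32×40 crop on the second feature map.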
Finally, the second detection result corresponding to the dense key point parts of the human body in the human body image to be detected is determined based on the second feature map, each body part detection frame, and the prior knowledge of key point density.
In the above embodiment, based on the first feature map, the second feature map, and the prior knowledge of key point density, feature maps of different resolutions can be used to detect different key point parts, so that dense key point parts are detected accurately while the detection accuracy of sparse key point parts is maintained, which improves the detection accuracy of whole-body key points.
Optionally, determining, based on the first feature map, the first detection result corresponding to the sparse key point parts of the human body in the human body image to be detected may be implemented through the following steps (1) to (2):
Step (1), performing a down-sampling operation on the first feature map to obtain a first target feature map;
Step (2), inputting the first target feature map into a convolution layer to obtain the first detection result output by the convolution layer.
In this embodiment, because the sparse key point parts of the human body do not need to be detected on the high-resolution feature map, a down-sampling operation is performed directly on the first feature map to obtain the first target feature map, and the first target feature map is then input into a convolution layer to obtain the first detection result.
Specifically, the first feature map (for example, with a resolution of 128×128) is first input into a down-sampling module, which includes at least a 3×3 convolution layer, a max-pooling layer, a batch normalization layer, and a ReLU layer.
After the first feature map with a resolution of 128×128 passes through the down-sampling module, the first target feature map is obtained, with its resolution reduced to 64×64.
Finally, the first target feature map with a resolution of 64×64 is input into a 1×1 convolution layer to obtain a multi-channel heatmap, and the first detection result corresponding to the sparse key point parts of the human body in the human body image to be detected is output in the form of heatmaps.
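The two operations that change the tensor shape in this branch, max pooling and the 1×1 convolution, can be illustrated with a small pure-Python sketch. It is a simplification under stated assumptions: the 3×3 convolution, batch normalization, and ReLU layers of the down-sampling module are omitted, the pooling window is assumed to be 2×2 with stride 2 (which matches the 128×128 to 64×64 reduction), and the weights are hypothetical. Reading each output channel as the heatmap of one key point is a common convention rather than something the text states.

```python
def max_pool_2x2(channel):
    """2x2 max pooling with stride 2 on one [H][W] channel (H and W even)."""
    height, width = len(channel), len(channel[0])
    return [
        [max(channel[y][x], channel[y][x + 1],
             channel[y + 1][x], channel[y + 1][x + 1])
         for x in range(0, width, 2)]
        for y in range(0, height, 2)
    ]


def conv1x1(feature, weights, biases):
    """1x1 convolution: feature [C_in][H][W], weights [C_out][C_in] -> [C_out][H][W]."""
    height, width = len(feature[0]), len(feature[0][0])
    return [
        [[bias + sum(wt * feature[c][y][x] for c, wt in enumerate(row))
          for x in range(width)]
         for y in range(height)]
        for row, bias in zip(weights, biases)
    ]


# Toy 2-channel 4x4 input standing in for the 128x128 first feature map.
feature = [
    [[1, 2, 0, 0], [3, 4, 0, 0], [0, 0, 5, 6], [0, 0, 7, 8]],
    [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 0, 0], [2, 2, 0, 0]],
]
pooled = [max_pool_2x2(channel) for channel in feature]
# Hypothetical weights producing three heatmap channels from two input channels.
heatmaps = conv1x1(pooled, weights=[[1, 0], [0, 1], [1, 1]], biases=[0, 0, 0])
```

The pooling halves each spatial dimension, and the 1×1 convolution only mixes channels at each location, so the heatmaps keep the resolution of the first target feature map.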
In this embodiment, key point detection for the sparse key point parts of the human body is performed directly on the first feature map, which ensures the detection accuracy of the sparse key point parts.
Optionally, determining, based on the second feature map, each body part detection frame, and the prior knowledge of key point density, the second detection result corresponding to the dense key point parts of the human body in the human body image to be detected may be implemented through the following steps [a] to [c]:
Step [a], cropping the second feature map with the body part detection frames to obtain third feature maps corresponding to the human body parts;
Step [b], determining at least one second target feature map based on the prior knowledge of key point density, wherein a second target feature map is a third feature map that corresponds to a dense key point part of the human body;
Step [c], determining the second detection result based on each second target feature map.
In this embodiment, because the position and size of each human body part on the second feature map are already given by the body part detection frames, the second feature map can be cropped with the body part detection frames to obtain the third feature map corresponding to each human body part.
For example, a body part detection network is connected after the first feature map and outputs body part detection frames for the head, upper arm, lower arm, torso, thigh, calf, hand, and foot on the first feature map; the body part detection frames are scaled up proportionally to the second feature map, and the second feature map is cropped with them to obtain the third feature maps corresponding to the head, upper arm, lower arm, torso, thigh, calf, hand, and foot.
Then, based on the prior knowledge of key point density, the third feature maps that correspond to dense key point parts of the human body are determined as second target feature maps.
So that the same key point detection network used for the first feature map can be applied to all feature maps, each second target feature map is scaled until its resolution is the same as that of the first feature map.
Finally, the scaled second target feature map is input into a down-sampling module and a 1×1 convolution layer to obtain the second detection result corresponding to the dense key point parts of the human body in the human body image to be detected; the down-sampling module includes at least a 3×3 convolution layer, a max-pooling layer, a batch normalization layer, and a ReLU layer.
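Selecting the dense-part crops and scaling them to a common resolution might look like the following sketch. The prior dictionary, the toy crops, and the target size of 4 (standing in for the 128×128 resolution of the first feature map) are hypothetical, and nearest-neighbour resizing is an assumption, since the text does not name the scaling method.

```python
def resize_nearest(feature_map, out_h, out_w):
    """Nearest-neighbour resize of one [H][W] channel to [out_h][out_w]."""
    in_h, in_w = len(feature_map), len(feature_map[0])
    return [
        [feature_map[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def select_dense_maps(third_maps, density_prior, target_size):
    """Keep only crops of dense parts and resize them to target_size x target_size."""
    return {
        part: resize_nearest(fmap, target_size, target_size)
        for part, fmap in third_maps.items()
        if density_prior.get(part) == "dense"
    }


# Hypothetical prior and toy crops of different sizes.
density_prior = {"hand": "dense", "torso": "sparse"}
third_maps = {
    "hand": [[1, 2, 3], [4, 5, 6]],
    "torso": [[7, 8], [9, 10]],
}
second_targets = select_dense_maps(third_maps, density_prior, target_size=4)
```

Only the hand crop survives the selection here, and after resizing it has the same shape as every other input that the shared detection head would receive.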
In the above embodiment, a first feature map and a second feature map with different resolutions are generated based on the acquired human body image to be detected. Because the prior knowledge of key point density distinguishes the dense key point parts from the sparse key point parts of the human body in the second feature map, feature maps of different resolutions can be used to detect different key point parts based on the first feature map, the second feature map, and the prior knowledge of key point density, so that dense key point parts are detected accurately while the detection accuracy of sparse key point parts is maintained, which improves the detection accuracy of whole-body key points. In addition, the method detects end to end and uses the same key point detection network for both dense and sparse key point parts, so different models do not need to be trained for different body parts. This reduces the complexity of human body key point detection, avoids the complicated training, difficult model management, and high data production cost of non-end-to-end key point detection models, and saves manpower and material resources.
Fig. 2 is the second schematic flow chart of the human body key point detection method provided by the present invention. Referring to Fig. 2, the method includes steps 201 to 208, wherein:
Step 201, acquiring a human body image to be detected.
Step 202, inputting the human body image to be detected into a residual network model to obtain an initial feature map output by the residual network model.
Step 203, performing multiple upsampling operations on the initial feature map to obtain a first feature map and a second feature map; the first feature map and the second feature map have different resolutions.
Step 204, performing a downsampling operation on the first feature map to obtain a first target feature map.
Step 205, inputting the first target feature map into a convolution layer to obtain a first detection result output by the convolution layer.
Step 206, cutting the second feature map using the body part detection frames to obtain a third feature map corresponding to each human body part.
Step 207, determining at least one second target feature map based on the key point density prior knowledge, wherein a second target feature map is a third feature map that corresponds to a human body key point dense part.
It should be noted that the key point density prior knowledge may be obtained through the following steps 1 and 2:
Step 1, acquiring the number of key points of each human body part;
Step 2, determining the key point density prior knowledge based on the number of key points and the two-dimensional rectangular area corresponding to each human body part.
Step 208, determining a second detection result based on each second target feature map.
It should be noted that the present invention does not limit the execution order of steps 204-205 relative to steps 206-208; that is, the two groups of steps may be executed in either order.
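Since the order of the two groups of steps is not limited, they could in principle even run side by side. The following is only a sketch under that reading: `sparse_branch` and `dense_branch` are hypothetical placeholders for steps 204-205 and steps 206-208, and the use of a thread pool is an illustrative choice, not something the text prescribes.

```python
from concurrent.futures import ThreadPoolExecutor


def sparse_branch(first_map):
    # Hypothetical placeholder for steps 204-205 (downsampling + convolution).
    return {"branch": "sparse", "input": first_map}


def dense_branch(second_map):
    # Hypothetical placeholder for steps 206-208 (cutting, selection, detection).
    return {"branch": "dense", "input": second_map}


def detect(first_map, second_map):
    """Run both branches; neither one waits for the other's result."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        first_future = pool.submit(sparse_branch, first_map)
        second_future = pool.submit(dense_branch, second_map)
        return first_future.result(), second_future.result()


first_result, second_result = detect("F1", "F2")
```

Running the branches sequentially in either order would give the same pair of results, which is the property the note relies on.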
According to the human body key point detection method provided by the present invention, a first feature map and a second feature map with different resolutions are generated from the acquired human body image to be detected. Because the key point density prior knowledge distinguishes the human body key point dense parts from the human body key point sparse parts in the second feature map, feature maps of different resolutions can be used to detect parts with different key point densities, based on the first feature map, the second feature map and the key point density prior knowledge. The dense parts are thus detected accurately while the detection accuracy of the sparse parts is maintained, which improves the detection accuracy of whole-body human key points. In addition, the method is end to end and applies the same key point detection network to both the dense parts and the sparse parts, so different models need not be trained for different body parts. This reduces the complexity of human body key point detection, avoids the problems of non-end-to-end key point detection models, such as complicated training, difficult model management and high data production cost, and saves manpower and material resources.
The human body key point detection device provided by the present invention is described below; the device described below and the method described above may be referred to in correspondence with each other. Fig. 3 is a schematic structural diagram of the human body key point detection device provided by the present invention. As shown in Fig. 3, the human body key point detection device 300 includes a first acquisition module 301, a generation module 302 and a first determining module 303, wherein:
the first acquisition module 301 is configured to acquire a human body image to be detected;
the generation module 302 is configured to generate, based on the human body image to be detected, a first feature map and a second feature map corresponding to that image; the first feature map and the second feature map have different resolutions;
the first determining module 303 is configured to determine a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and the key point density prior knowledge; the key point density prior knowledge is used to distinguish the human body key point dense parts from the human body key point sparse parts in the second feature map.
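The division into three modules can be sketched as a small composition of callables. This is an illustration only: the class name `KeypointDetectionDevice`, its field names and the stub functions in the usage lines are hypothetical, and the real modules would wrap a camera or file reader, a backbone network and a detection head.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class KeypointDetectionDevice:
    acquire: Callable[[], Any]                               # role of module 301
    generate: Callable[[Any], Tuple[Any, Any]]               # role of module 302
    determine: Callable[[Any, Any, Dict[str, bool]], Any]    # role of module 303
    density_prior: Dict[str, bool]                           # part name -> is dense

    def run(self) -> Any:
        image = self.acquire()
        first_map, second_map = self.generate(image)
        return self.determine(first_map, second_map, self.density_prior)


# Usage with stand-in stubs (all values are placeholders).
device = KeypointDetectionDevice(
    acquire=lambda: "img",
    generate=lambda img: (img + "_f1", img + "_f2"),
    determine=lambda f1, f2, prior: (f1, f2, sorted(prior)),
    density_prior={"hand": True},
)
result = device.run()
```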
The human body key point detection device provided by the present invention generates a first feature map and a second feature map with different resolutions from the acquired human body image to be detected. Because the key point density prior knowledge distinguishes the human body key point dense parts from the human body key point sparse parts in the second feature map, feature maps of different resolutions can be used to detect parts with different key point densities, based on the first feature map, the second feature map and the key point density prior knowledge. The dense parts are thus detected accurately while the detection accuracy of the sparse parts is maintained, which improves the detection accuracy of whole-body human key points.
Optionally, the apparatus further comprises:
a second acquisition module, configured to acquire the number of key points of each human body part;
and a second determining module, configured to determine the key point density prior knowledge based on the number of key points and the two-dimensional rectangular area corresponding to each human body part.
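One plausible way to turn key point counts and rectangular areas into a density prior is sketched below. The text only says the prior is determined from these two quantities; the ratio count/area, the threshold, the function names and every number in the example are assumptions made for illustration.

```python
def keypoint_density(num_keypoints, box):
    """Key points per unit area of a part's rectangle (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    area = (x1 - x0) * (y1 - y0)
    if area <= 0:
        raise ValueError("rectangle must have positive area")
    return num_keypoints / area


def build_density_prior(parts, threshold):
    """Map each part name to True (dense) or False (sparse).

    Assumption: a part counts as dense when its density reaches `threshold`.
    """
    return {
        name: keypoint_density(count, box) >= threshold
        for name, (count, box) in parts.items()
    }


# Hypothetical counts and rectangles, chosen only to show the calculation.
parts = {
    "hand": (21, (0, 0, 30, 30)),      # 21 / 900   is about 0.023
    "torso": (4, (0, 0, 100, 200)),    # 4 / 20000  = 0.0002
}
prior = build_density_prior(parts, threshold=0.01)
```

With these illustrative values the hand is classified as a dense part and the torso as a sparse part.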
Optionally, the first determining module 303 is further configured to:
determining a first detection result corresponding to a sparse part of a human body key point in the human body image to be detected based on the first feature map;
determining a plurality of body part detection frames corresponding to the second feature map based on the first feature map, wherein the body part detection frames are used for cutting the second feature map;
and determining, based on the second feature map, each body part detection frame and the key point density prior knowledge, a second detection result corresponding to the human body key point dense parts in the human body image to be detected.
Optionally, the first determining module 303 is further configured to:
performing a downsampling operation on the first feature map to obtain a first target feature map;
and inputting the first target feature map into a convolution layer to obtain the first detection result output by the convolution layer.
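A toy version of this sparse-part branch is sketched below. It assumes that the downsampling is a 2x2 max pooling and that the convolution layer is 1x1; the text does not fix either choice for this branch, and learned 3x3 convolutions, batch normalization and ReLU are left out. The function names are hypothetical.

```python
def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2 over one channel (list of rows)."""
    h = len(fmap) // 2 * 2
    w = len(fmap[0]) // 2 * 2
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, w, 2)]
        for r in range(0, h, 2)
    ]


def conv_1x1(channels, weights):
    """1x1 convolution with one output channel: a per-pixel weighted sum
    across the input channels."""
    h, w = len(channels[0]), len(channels[0][0])
    return [
        [sum(wt * ch[r][c] for wt, ch in zip(weights, channels)) for c in range(w)]
        for r in range(h)
    ]


# Two 4x4 input channels standing in for the first feature map.
ch0 = [[1, 2, 0, 0], [3, 4, 0, 0], [0, 0, 5, 6], [0, 0, 7, 8]]
ch1 = [[1] * 4 for _ in range(4)]
first_target = [max_pool_2x2(ch0), max_pool_2x2(ch1)]   # first target feature map
heatmap = conv_1x1(first_target, weights=[1.0, 0.5])    # first detection result
```

In a trained network the weights would be learned and there would be one output channel per key point; the single channel here only shows the data flow.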
Optionally, the first determining module 303 is further configured to:
cutting the second feature map by using the body part detection frame to obtain third feature maps corresponding to the human body parts;
determining at least one second target feature map based on the key point density prior knowledge, wherein a second target feature map is a third feature map that corresponds to a human body key point dense part;
and determining the second detection result based on each second target feature map.
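The cutting and selection operations can be sketched as follows. Boxes are assumed to be given as (x0, y0, x1, y1) in feature-map coordinates, and the density prior is assumed to be a mapping from part name to a dense/sparse flag; both representations, and the function names, are illustrative assumptions.

```python
def crop(fmap, box):
    """Cut the region box = (x0, y0, x1, y1) out of a 2-D feature map."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in fmap[y0:y1]]


def second_target_maps(second_map, part_boxes, density_prior):
    """Crop one third feature map per part, then keep only the dense parts."""
    third_maps = {name: crop(second_map, box) for name, box in part_boxes.items()}
    return {
        name: fmap
        for name, fmap in third_maps.items()
        if density_prior.get(name, False)
    }


# A 4x4 second feature map whose entries encode their own position (row * 4 + col).
second_map = [[r * 4 + c for c in range(4)] for r in range(4)]
part_boxes = {"hand": (2, 0, 4, 2), "torso": (0, 1, 2, 4)}
density_prior = {"hand": True, "torso": False}
targets = second_target_maps(second_map, part_boxes, density_prior)
```

Only the crops that pass this filter go on to the scaling and detection steps that produce the second detection result.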
Optionally, the generation module 302 is further configured to:
inputting the human body image to be detected into a residual network model to obtain an initial feature map output by the residual network model;
and performing multiple upsampling operations on the initial feature map to obtain the first feature map and the second feature map.
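How two feature maps of different resolution can come out of repeated upsampling is sketched below. The backbone is omitted, each upsampling is assumed to double the resolution by pixel repetition, and the first and second feature maps are assumed to be taken after different numbers of upsampling steps; none of these details is stated in the text, and a real network would more likely use learned or bilinear upsampling.

```python
def upsample_2x(fmap):
    """Double the height and width of a 2-D map by repeating each value."""
    out = []
    for row in fmap:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def generate_feature_maps(initial_map, first_steps, second_steps):
    """Upsample repeatedly and return the maps after the requested step counts."""
    maps = [initial_map]
    for _ in range(max(first_steps, second_steps)):
        maps.append(upsample_2x(maps[-1]))
    return maps[first_steps], maps[second_steps]


# A 2x2 stand-in for the initial feature map output by the residual network.
initial = [[1, 2],
           [3, 4]]
first_map, second_map = generate_feature_maps(initial, first_steps=1, second_steps=2)
```

Here the first feature map is 4x4 and the second is 8x8, so the two maps differ in resolution as required.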
Fig. 4 is a schematic diagram of the physical structure of an electronic device. As shown in Fig. 4, the electronic device may include a processor 410, a communication interface 420, a memory 430 and a communication bus 440, wherein the processor 410, the communication interface 420 and the memory 430 communicate with one another via the communication bus 440. The processor 410 may invoke logic instructions in the memory 430 to perform a human body key point detection method comprising: acquiring a human body image to be detected; generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected; the first feature map and the second feature map have different resolutions; determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and the key point density prior knowledge; the key point density prior knowledge is used to distinguish the human body key point dense parts from the human body key point sparse parts in the second feature map.
Further, the logic instructions in the memory 430 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, the present invention also provides a computer program product. The computer program product includes a computer program that can be stored on a non-transitory computer-readable storage medium; when the computer program is executed by a processor, the computer can perform the human body key point detection method provided by the above method embodiments, the method comprising: acquiring a human body image to be detected; generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected; the first feature map and the second feature map have different resolutions; determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and the key point density prior knowledge; the key point density prior knowledge is used to distinguish the human body key point dense parts from the human body key point sparse parts in the second feature map.
In still another aspect, the present invention also provides a non-transitory computer-readable storage medium having a computer program stored thereon; when executed by a processor, the computer program performs the human body key point detection method provided by the above method embodiments, the method comprising: acquiring a human body image to be detected; generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected; the first feature map and the second feature map have different resolutions; determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and the key point density prior knowledge; the key point density prior knowledge is used to distinguish the human body key point dense parts from the human body key point sparse parts in the second feature map.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. The human body key point detection method is characterized by comprising the following steps of:
acquiring a human body image to be detected;
generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected; the first feature map and the second feature map have different resolutions;
determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and the key point density prior knowledge; the key point density prior knowledge is used for distinguishing a human body key point dense part from a human body key point sparse part in the second feature map;
the determining, based on the first feature map, the second feature map, and the prior knowledge of the key point density, a key point detection result corresponding to the to-be-detected human body image includes:
determining a first detection result corresponding to a sparse part of a human body key point in the human body image to be detected based on the first feature map;
determining a plurality of body part detection frames corresponding to the second feature map based on the first feature map, wherein the body part detection frames are used for cutting the second feature map;
and determining, based on the second feature map, each body part detection frame and the key point density prior knowledge, a second detection result corresponding to the human body key point dense part in the human body image to be detected.
2. The human body key point detection method according to claim 1, wherein the key point density prior knowledge is obtained by:
acquiring the number of key points of each human body part;
and determining the key point density prior knowledge based on the number of key points and the two-dimensional rectangular area corresponding to each human body part.
3. The method for detecting human keypoints according to claim 1, wherein determining, based on the first feature map, a first detection result corresponding to a human keypoint sparse part in the human image to be detected comprises:
performing downsampling operation on the first feature map to obtain a first target feature map;
and inputting the first target feature map into a convolution layer to obtain the first detection result output by the convolution layer.
4. The method for detecting human keypoints according to claim 1, wherein determining the second detection result corresponding to the dense part of the human keypoints in the human body image to be detected based on the second feature map, each of the body part detection frames, and the prior knowledge of the concentration of the keypoints comprises:
cutting the second feature map by using the body part detection frame to obtain third feature maps corresponding to the human body parts;
determining at least one second target feature map based on the key point density prior knowledge, wherein a second target feature map is a third feature map that corresponds to a human body key point dense part;
and determining the second detection result based on each second target feature map.
5. The method for detecting human body key points according to claim 1, wherein the generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected includes:
inputting the human body image to be detected into a residual network model to obtain an initial feature map output by the residual network model;
and performing multiple upsampling operations on the initial feature map to obtain the first feature map and the second feature map.
6. A human body key point detection device, characterized by comprising:
the first acquisition module is used for acquiring a human body image to be detected;
the generation module is used for generating a first feature map and a second feature map corresponding to the human body image to be detected based on the human body image to be detected; the first feature map and the second feature map have different resolutions;
the first determining module is used for determining a key point detection result corresponding to the human body image to be detected based on the first feature map, the second feature map and the key point density prior knowledge; the key point density prior knowledge is used for distinguishing a human body key point dense part from a human body key point sparse part in the second feature map;
the determining, based on the first feature map, the second feature map, and the prior knowledge of the key point density, a key point detection result corresponding to the to-be-detected human body image includes:
determining a first detection result corresponding to a sparse part of a human body key point in the human body image to be detected based on the first feature map;
determining a plurality of body part detection frames corresponding to the second feature map based on the first feature map, wherein the body part detection frames are used for cutting the second feature map;
and determining, based on the second feature map, each body part detection frame and the key point density prior knowledge, a second detection result corresponding to the human body key point dense part in the human body image to be detected.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the human keypoint detection method according to any one of claims 1 to 5 when executing the program.
8. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the human keypoint detection method according to any one of claims 1 to 5.
CN202211167898.2A 2022-09-23 2022-09-23 Human body key point detection method and device, electronic equipment and storage medium Active CN115578753B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211167898.2A CN115578753B (en) 2022-09-23 2022-09-23 Human body key point detection method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115578753A CN115578753A (en) 2023-01-06
CN115578753B true CN115578753B (en) 2023-05-05

Family

ID=84580804

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211167898.2A Active CN115578753B (en) 2022-09-23 2022-09-23 Human body key point detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115578753B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110309706A (en) * 2019-05-06 2019-10-08 深圳市华付信息技术有限公司 Face critical point detection method, apparatus, computer equipment and storage medium
WO2021197466A1 (en) * 2020-04-03 2021-10-07 百果园技术(新加坡)有限公司 Eyeball detection method, apparatus and device, and storage medium

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109711273B (en) * 2018-12-04 2020-01-17 北京字节跳动网络技术有限公司 Image key point extraction method and device, readable storage medium and electronic equipment
CN110705365A (en) * 2019-09-06 2020-01-17 北京达佳互联信息技术有限公司 Human body key point detection method and device, electronic equipment and storage medium
KR102250756B1 (en) * 2019-11-29 2021-05-10 연세대학교 산학협력단 Method and Apparatus for Extracting Key Point Using Bidirectional Message Passing Structure
CN111860276B (en) * 2020-07-14 2023-04-11 咪咕文化科技有限公司 Human body key point detection method, device, network equipment and storage medium
CN112686097A (en) * 2020-12-10 2021-04-20 天津中科智能识别产业技术研究院有限公司 Human body image key point posture estimation method
CN112967216B (en) * 2021-03-08 2023-06-09 平安科技(深圳)有限公司 Method, device, equipment and storage medium for detecting key points of face image
CN113657321B (en) * 2021-08-23 2024-04-26 平安科技(深圳)有限公司 Dog face key point detection method, device, equipment and medium based on artificial intelligence
CN114913541A (en) * 2021-09-24 2022-08-16 同济大学 Human body key point detection method, device and medium based on orthogonal matching pursuit
CN114332484A (en) * 2021-11-10 2022-04-12 腾讯科技(深圳)有限公司 Key point detection method and device, computer equipment and storage medium
CN114821087A (en) * 2022-04-15 2022-07-29 苏州立创致恒电子科技有限公司 Detection and description model and method for key points of depth image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant