CN113111850A - Human body key point detection method, device and system based on region-of-interest transformation
- Publication number
- CN113111850A CN202110478213.5A
- Authority
- CN
- China
- Prior art keywords
- face
- image
- human body
- key points
- test
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a method, a device and a system for detecting human body key points based on region-of-interest transformation. During model training, region-of-interest transformation is applied to the human body key point data, and the human body key point model is trained on the transformed data. During detection, human body key points are detected with the trained model and then inverse-transformed to obtain the human body key points of the image before transformation. The invention normalizes the data to a uniform form, which mitigates the large data variation found in open scenes and reduces training difficulty. The region-of-interest transformation also increases the proportion of the image occupied by the face, which benefits face key point prediction and thereby improves the overall accuracy of the human body key points. Compared with methods that predict body and face key points separately, the method needs only one face detector and one key point detector, so its computational cost is low.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to human face detection and recognition, and specifically relates to a human body key point detection method, device and system based on region-of-interest transformation.
Background
The task of human body key point detection is to detect the positions of the face and limb key points in a human body image. Human body image data in uncontrolled scenes varies widely, for example in the people, clothing, poses, occlusions and background environments, and the face occupies only a small proportion of the image, which makes training a human body key point detection model difficult.
Existing human body key point detection methods fall mainly into two types. The first detects the human body position in an image, crops the human body image, and then detects the key points in the cropped image. However, because the face occupies a small proportion of the image, face key point prediction is inaccurate; and since face key points are usually numerous while limb key points are few, the overall accuracy suffers.
The second type detects both the human body position and the face position: it crops the human body image and the face image and then detects the limb key points and the face key points separately. Although this approach is accurate, it requires multiple model predictions and is computationally time-consuming.
Disclosure of Invention
The invention aims to provide a method, a device and a system for detecting human key points based on region-of-interest transformation.
In order to achieve the above object, a first aspect of the present invention provides a method for detecting human key points based on region of interest transformation, including the following steps:
step 1, obtaining M color images containing a human body, wherein M is a natural number more than 1000;
step 2, marking N human body key points on each color image to obtain marking data; the human body key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points;
step 3, determining a face boundary frame of the color image according to the coordinates of the labeled face key points;
step 4, performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain transformed images and transformed human body key point coordinates; the face central point and the face size are determined according to the face bounding box;
step 5, training a human body key point detection model for detecting the human body key points based on the image after the region of interest is transformed and the transformed human body key point coordinates;
step 6, detecting a human face boundary frame by using a human face detector for the input image to be detected containing the human body, and then carrying out region-of-interest transformation according to the method in the step 4, so as to improve the proportion of the human face in the image and obtain a transformed image;
step 7, detecting the human key points in the transformed image by using the human key point detection model obtained by training in the step 5; and
step 8, carrying out region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
The second aspect of the present invention further provides a human body key point detection device based on region of interest transformation, including:
a module for acquiring M color images containing a human body, M being a natural number greater than 1000;
a module for labeling N human body key points on each color image to obtain labeling data; the human body key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points;
a module for determining a face bounding box of the color image according to the coordinates of the labeled face key points;
a module for performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain a transformed image and transformed coordinates of key points of the human body; the face central point and the face size are determined according to the face bounding box;
a module for training a human body key point detection model for detecting human body key points based on the image after the transformation of the region of interest and the transformed human body key point coordinates;
a module for detecting a human face boundary box by using a human face detector for an input image to be detected containing a human body, then carrying out region-of-interest transformation, improving the proportion of the human face in the image and obtaining a transformed image;
a module for detecting human key points in the transformed image using a trained human key point detection model; and
a module for carrying out region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
The third aspect of the present invention further provides a system for human body keypoint detection based on region of interest transformation, comprising:
one or more processors;
a memory storing instructions that are operable, when executed by the one or more processors, to cause the one or more processors to perform operations comprising a flow of a human keypoint detection method based on region of interest transformation as previously described.
Compared with the prior art, the technical scheme of the invention has the following remarkable beneficial effects:
To address the difficulty that large scene variation and a small face proportion in open-environment human body images pose for human body key point detection, the invention performs a face-centered region-of-interest transformation on the data and trains the human body key point detection model on the transformed data. During actual detection, after the face detector detects the face, the region-of-interest transformation is applied to the image, the human body key points are detected, and an inverse transformation yields the key point data of the original image to be detected. On the one hand, this adjusts the data to a uniform form and reduces training difficulty. On the other hand, because face key points far outnumber body key points, increasing the proportion of the face in the image allows the face key points to be predicted more accurately, which improves the overall performance and accuracy of the human body key point detection model. In addition, the method needs only one face detector and one key point detector, so its computational cost is low.
It should be understood that all combinations of the foregoing concepts and additional concepts described in greater detail below can be considered as part of the inventive subject matter of this disclosure unless such concepts are mutually inconsistent. In addition, all combinations of claimed subject matter are considered a part of the presently disclosed subject matter.
The foregoing and other aspects, embodiments and features of the present teachings can be more fully understood from the following description taken in conjunction with the accompanying drawings. Additional aspects of the present invention, such as features and/or advantages of exemplary embodiments, will be apparent from the description which follows, or may be learned by practice of specific embodiments in accordance with the teachings of the present invention.
Drawings
The drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures may be represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. Embodiments of various aspects of the present invention will now be described, by way of example, with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of a training process of a human body key point detection model based on region of interest transformation according to an exemplary embodiment of the present invention.
FIG. 2 is a schematic diagram of a model structure of the human body key point detection model of the present invention.
FIG. 3 is a schematic diagram illustrating a process of detecting key points of a human body by using the model shown in FIG. 1 according to the embodiment of the invention.
Detailed Description
In order to better understand the technical content of the present invention, specific embodiments are described below with reference to the accompanying drawings.
In this disclosure, aspects of the present invention are described with reference to the accompanying drawings, in which a number of illustrative embodiments are shown. Embodiments of the present disclosure are not necessarily intended to include all aspects of the invention. It should be appreciated that the various concepts and embodiments described above, as well as those described in greater detail below, may be implemented in any of numerous ways, as the disclosed concepts and embodiments are not limited to any one implementation. In addition, some aspects of the present disclosure may be used alone, or in any suitable combination with other aspects of the present disclosure.
Referring to FIGS. 1, 2 and 3, the method for detecting human body key points based on region-of-interest transformation provided by the invention comprises a model training process and a model detection process. During model training, region-of-interest transformation is applied to the human body key point data, and the human body key point model is trained on the transformed data. During detection, human body key points are detected with the trained model and then inverse-transformed to obtain the human body key points of the image before transformation.
In the model training process, normalizing the data to a uniform form mitigates the large data variation found in open scenes and reduces training difficulty. The region-of-interest transformation also increases the proportion of the image occupied by the face, which benefits face key point prediction and thereby improves the overall accuracy of the human body key points. Compared with methods that predict body and face key points separately, the method needs only one face detector and one key point detector, so its computational cost is low.
As shown in FIGS. 1 and 3, the method for detecting human body key points based on region-of-interest transformation according to the embodiment of the present disclosure includes the following steps:
step 1, obtaining M color images containing a human body, wherein M is a natural number more than 1000;
step 2, marking N human body key points on each color image to obtain marking data;
step 3, determining a face boundary frame of the color image according to the coordinates of the labeled face key points;
step 4, performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain transformed images and transformed human body key point coordinates; the face central point and the face size are determined according to the face bounding box;
step 5, training a human body key point detection model for detecting the human body key points based on the image after the region of interest is transformed and the transformed human body key point coordinates;
step 6, detecting a human face boundary frame by using a human face detector for the input image to be detected containing the human body, and then carrying out region-of-interest transformation according to the method in the step 4, so as to improve the proportion of the human face in the image and obtain a transformed image;
step 7, detecting the human key points in the transformed image by using the human key point detection model trained in the step 5; and
step 8, carrying out region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
Human body key point data acquisition and labeling
In step 1, a large number M of color images containing a human body (M greater than 1000) are acquired to build the base images of the training set. In particular, the image data should cover as many scenes as possible, such as different people, clothing, poses, occlusions, and background environments.
In step 2, labeling N human body key points to each color image, and obtaining labeling data as follows:
The labeled human key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points. From the face key points, a face bounding box for the face can be determined.
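The text does not say how the bounding box is derived from the face key points. As a minimal sketch, assuming the tightest axis-aligned box around the labeled points, the box and the face center point and face size used in step 4 could be computed as follows. The function names and the sample coordinates are hypothetical.

```python
def face_bounding_box(face_points):
    """Tightest axis-aligned box (x_min, y_min, x_max, y_max) around the
    labeled face key points. Using min/max is an assumption of this sketch."""
    xs = [x for x, _ in face_points]
    ys = [y for _, y in face_points]
    return min(xs), min(ys), max(xs), max(ys)


def face_center_and_size(box):
    """Face center = center of the bounding box; face size = its long side."""
    x_min, y_min, x_max, y_max = box
    center = ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)
    size = max(x_max - x_min, y_max - y_min)
    return center, size


# Hypothetical labeled face key points (pixel coordinates).
face_points = [(110, 80), (150, 82), (130, 105), (118, 125), (144, 126)]
box = face_bounding_box(face_points)
center, size = face_center_and_size(box)
print(box, center, size)
```

Other choices, such as padding the box by a margin, would fit the description equally well.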
Region of interest transformation
In step 4, performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain transformed images and transformed key point coordinates, including:
The center point of the face bounding box is taken as the face center point, and the length of the long side of the bounding box is taken as the face size. According to the face center point and face size, region-of-interest transformation is performed on each image and its corresponding human body key points, and the transformed data is expressed as:
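$$\{[I_0,(p_{0,0},p_{0,1},\dots,p_{0,N-1})],\,[I_1,(p_{1,0},p_{1,1},\dots,p_{1,N-1})],\,\dots,\,[I_{M-1},(p_{M-1,0},p_{M-1,1},\dots,p_{M-1,N-1})]\}$$
where $p_{m,n}=(x_{m,n},y_{m,n})$ is the coordinate of the n-th transformed human body key point in the m-th transformed image $I_m$; the side length of the transformed image is L, where L is a positive integer; in a preferred example, $L \ge 64$; in the present example, L is 64 or 128;
each pixel value of the transformed image $I_m$ is sampled from the image before transformation; $x_{\mathrm{indices},m}$ is the list of abscissas of the sampling positions in the image before transformation, and $y_{\mathrm{indices},m}$ is the list of ordinates of the sampling positions, obtained as follows:
$$x_{\mathrm{indices},m}=\big(x_{\mathrm{face},m}+\mathrm{warp}_{\mathrm{RoI},m}(0),\;x_{\mathrm{face},m}+\mathrm{warp}_{\mathrm{RoI},m}(1),\;\dots,\;x_{\mathrm{face},m}+\mathrm{warp}_{\mathrm{RoI},m}(L-1)\big)$$
$$y_{\mathrm{indices},m}=\big(y_{\mathrm{face},m}+\mathrm{warp}_{\mathrm{RoI},m}(0),\;y_{\mathrm{face},m}+\mathrm{warp}_{\mathrm{RoI},m}(1),\;\dots,\;y_{\mathrm{face},m}+\mathrm{warp}_{\mathrm{RoI},m}(L-1)\big)$$
$$\mathrm{warp}_{\mathrm{RoI},m}(t)=\frac{a_m}{2}\cdot\operatorname{arctanh}\!\left(\frac{2t}{L}-0.9\right)$$
where $\mathrm{warp}_{\mathrm{RoI},m}(t)$ is the region-of-interest transformation function of the m-th image, t is the function input with t = 0, 1, 2, ..., L-1, $(x_{\mathrm{face},m}, y_{\mathrm{face},m})$ is the face center point, and $a_m$ is the face size of the m-th image.
The region-of-interest transformation of the image uses the remap method of the OpenCV image processing library, with the parameter map1 set to $x_{\mathrm{indices},m}$ and the parameter map2 set to $y_{\mathrm{indices},m}$;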
The human body key points $(x_{m,n}, y_{m,n})$ after region-of-interest transformation are calculated by a traversal method, as follows:
traverse t over the range t = 0, 1, 2, ..., L-1; the value of t selected by the traversal is the abscissa $x_{m,n}$ of the transformed key point; and the ordinate $y_{m,n}$ is obtained by the same traversal.
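The sampling lists and the traversal can be sketched in plain Python as below. The sketch evaluates the printed formula $\frac{a}{2}\operatorname{arctanh}(2t/L-0.9)$ literally. Because arctanh is only defined on the open interval (-1, 1), the argument leaves that interval for t slightly above 0.95·L, and the sketch returns NaN there rather than substituting a different formula. The selection criterion of the traversal is not spelled out in the text, so the sketch assumes the nearest sampling position is chosen. All function names and parameter values are hypothetical.

```python
import math


def warp_roi(t, a, L):
    """Literal reading of the printed formula a/2 * arctanh(2t/L - 0.9).
    Returns NaN where the arctanh argument falls outside (-1, 1)."""
    u = 2.0 * t / L - 0.9
    if abs(u) >= 1.0:
        return math.nan
    return a / 2.0 * math.atanh(u)


def sampling_indices(face_coord, a, L):
    """Sampling positions in the source image along one axis (length L)."""
    return [face_coord + warp_roi(t, a, L) for t in range(L)]


def forward_keypoint(src_coord, face_coord, a, L):
    """Assumed traversal: return the t whose sampling position is closest
    to the source coordinate; undefined positions are skipped."""
    best_t, best_dist = None, math.inf
    for t in range(L):
        w = warp_roi(t, a, L)
        if math.isnan(w):
            continue
        dist = abs(face_coord + w - src_coord)
        if dist < best_dist:
            best_t, best_dist = t, dist
    return best_t


# Hypothetical values: face centered at x = 100, face size 40, output side 100.
x_indices = sampling_indices(100.0, 40, 100)
x_transformed = forward_keypoint(100.0, 100.0, 40, 100)
print(len(x_indices), x_transformed)
```

OpenCV's remap takes per-pixel maps, so in practice the two 1-D lists would typically be expanded to L×L float32 arrays (for example with `numpy.meshgrid`) before being passed as map1 and map2.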
Human body key point training detection model
In step 5, the human body key point detection model for detecting human body key points is implemented as a CNN. The model structure shown in FIG. 2 is composed of convolutional layers, max pooling layers, and fully connected layers.
Each convolutional layer has a 3 × 3 kernel, a stride of 1, and Same zero padding; the number of convolution kernels is given in parentheses for each convolutional layer in FIG. 2.
Each max pooling layer has a 2 × 2 pooling window and a stride of 2.
The first fully connected layer has 1024 neurons and the second fully connected layer has 2N neurons.
Each convolutional layer and the first fully connected layer are followed by a ReLU activation function.
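The layer settings above determine the tensor sizes through the network. As a rough sketch of that bookkeeping: a 3 × 3 convolution with stride 1 and Same padding keeps the spatial size, and each 2 × 2 max pooling with stride 2 halves it (assuming no padding in the pooling layers). The number of pooling layers and the value of N used below are hypothetical, since they are given only in FIG. 2 and the labeling scheme.

```python
def feature_map_side(input_side, num_pool_layers):
    """Spatial side length after the conv/pool stack.
    Same-padded 3x3 convs with stride 1 leave the side unchanged;
    each 2x2 max pool with stride 2 halves it (floor division)."""
    side = input_side
    for _ in range(num_pool_layers):
        side //= 2
    return side


def fully_connected_sizes(num_keypoints):
    """First FC layer: 1024 neurons; second FC layer: 2N (x and y per key point)."""
    return 1024, 2 * num_keypoints


# Hypothetical configuration: 3 pooling layers, N = 30 key points.
for L in (64, 128):
    print(L, feature_map_side(L, 3), fully_connected_sizes(30))
```

The 2N outputs correspond to one (x, y) pair per key point in the transformed image.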
During the training process, the loss function of the mth data is expressed as:
where $(x_{m,n}, y_{m,n})$ is the n-th human body key point of the m-th training sample in the data set after region-of-interest transformation, and $(x'_{m,n}, y'_{m,n})$ is the model's prediction of the n-th human body key point for the m-th training image after region-of-interest transformation.
In this way, a detection model for detecting key points in a human body image after region-of-interest transformation is trained from the transformed images and the transformed human body key point coordinates.
Human body key point detection application
As an example shown in fig. 3, the human body key point detection process for an input image to be detected containing a human body includes:
firstly, detecting a human face boundary frame by using a human face detector, then carrying out region-of-interest transformation according to the method in the step 4, and improving the proportion of the human face in the image to obtain a transformed image;
then, detecting the human key points in the transformed image by using a human key point detection model obtained by training; and
and finally, carrying out region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
The face detector may be, for example, the Dlib toolkit, which is used to detect the face in the human body image and determine its bounding box. It should be understood that, in the implementation of the present invention, face detection is not limited to Dlib and may also be implemented with other pre-trained face detection models.
According to the center point $(x_{\mathrm{test,face}}, y_{\mathrm{test,face}})$ of the face bounding box and the length $a_{\mathrm{test}}$ of its long side, region-of-interest transformation is applied to the image to be detected using the remap method of the OpenCV image processing library, with the parameter map1 set to $x_{\mathrm{test,indices}}$ and the parameter map2 set to $y_{\mathrm{test,indices}}$, calculated as follows:
$$x_{\mathrm{test,indices}}=\big(x_{\mathrm{test,face}}+\mathrm{warp}_{\mathrm{RoI,test}}(0),\;x_{\mathrm{test,face}}+\mathrm{warp}_{\mathrm{RoI,test}}(1),\;\dots,\;x_{\mathrm{test,face}}+\mathrm{warp}_{\mathrm{RoI,test}}(L-1)\big)$$
$$y_{\mathrm{test,indices}}=\big(y_{\mathrm{test,face}}+\mathrm{warp}_{\mathrm{RoI,test}}(0),\;y_{\mathrm{test,face}}+\mathrm{warp}_{\mathrm{RoI,test}}(1),\;\dots,\;y_{\mathrm{test,face}}+\mathrm{warp}_{\mathrm{RoI,test}}(L-1)\big)$$
$$\mathrm{warp}_{\mathrm{RoI,test}}(t)=\frac{a_{\mathrm{test}}}{2}\cdot\operatorname{arctanh}\!\left(\frac{2t}{L}-0.9\right)$$
where $\mathrm{warp}_{\mathrm{RoI,test}}(t)$ is the region-of-interest transformation function of the image to be detected and t is the function input, with t = 0, 1, 2, ..., L-1.
The key points $(x_{\mathrm{test},n}, y_{\mathrm{test},n})$ in the transformed image are detected using the human body key point detection model trained in step 5.
Then, region-of-interest inverse transformation is applied to the key points in the transformed image to obtain the human body key points $(x_{\mathrm{src,test},n}, y_{\mathrm{src,test},n})$ of the image to be detected before transformation:
$$x_{\mathrm{src,test},n}=x_{\mathrm{test,face}}+\mathrm{warp}_{\mathrm{RoI,test}}(x_{\mathrm{test},n})$$
$$y_{\mathrm{src,test},n}=y_{\mathrm{test,face}}+\mathrm{warp}_{\mathrm{RoI,test}}(y_{\mathrm{test},n})$$
This yields the human body key point data of the image to be detected before transformation.
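The inverse transformation is a direct evaluation of the warp function at the predicted coordinates. A minimal sketch follows, again taking the printed formula literally and returning NaN where the arctanh argument leaves (-1, 1). The function names and the numeric values are hypothetical.

```python
import math


def warp_roi(t, a, L):
    """Literal reading of a/2 * arctanh(2t/L - 0.9); NaN outside the domain."""
    u = 2.0 * t / L - 0.9
    if abs(u) >= 1.0:
        return math.nan
    return a / 2.0 * math.atanh(u)


def inverse_keypoint(pred, face_center, a_test, L):
    """Map a key point predicted in the transformed image back to the
    coordinates of the image to be detected before transformation."""
    x_t, y_t = pred
    x_face, y_face = face_center
    return (x_face + warp_roi(x_t, a_test, L),
            y_face + warp_roi(y_t, a_test, L))


# Hypothetical values: face center (100, 200), face size 40, L = 100.
print(inverse_keypoint((45, 70), (100.0, 200.0), 40, 100))
```

Since the model outputs real-valued coordinates, the warp function is evaluated here at non-integer t as well, which the formula permits.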
It should be understood that in step 4 and step 6, the side length L of the image after the region of interest transformation has the same value.
Test procedure
Following steps 1 and 2, 12000 groups of labeled human body key point data are prepared, comprising 10000 groups of training data and 2000 groups of test data. The data covers a variety of people, clothing, poses, occlusions and background environments. Region-of-interest transformation is applied to the 10000 groups of training data, a detection model is trained on them, and the trained human body key point model is verified on the test data. For comparison, a human body key point detection model trained directly on the original data is evaluated on the same test data.
The normalized mean error is used as the evaluation index, that is, the Euclidean distance between a predicted coordinate and the labeled coordinate divided by the length of the diagonal of the human body bounding box. The comparison results are shown in Table 1.
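As an illustration of this metric, the sketch below computes the error for one sample. Averaging the per-key-point distances before normalizing is an assumption of the sketch, and the function name and sample values are hypothetical.

```python
import math


def normalized_mean_error(pred, gt, body_box):
    """Mean Euclidean distance between predicted and labeled key points,
    divided by the diagonal of the human body bounding box
    (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = body_box
    diagonal = math.hypot(x_max - x_min, y_max - y_min)
    distances = [math.hypot(px - gx, py - gy)
                 for (px, py), (gx, gy) in zip(pred, gt)]
    return sum(distances) / len(distances) / diagonal


# Hypothetical sample: two key points, body box of 6 x 8 pixels (diagonal 10).
pred = [(3.0, 4.0), (10.0, 10.0)]
gt = [(0.0, 0.0), (10.0, 10.0)]
nme = normalized_mean_error(pred, gt, (0.0, 0.0, 6.0, 8.0))
print(f"{nme:.2%}")
```

A dataset-level figure such as those in Table 1 would then be the average of this value over all test samples.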
TABLE 1 Comparison of test results of the existing method and the method of the invention
Method | Normalized mean error
Existing method | 6.32%
Method of the invention | 4.94%
The comparison shows that the model training method effectively improves model accuracy: relative to the existing method, the test error falls by 1.38 percentage points, from 6.32% to 4.94%.
Human key point detection device based on region of interest transformation
According to the disclosure of the present invention, there is also provided a human body key point detection device based on region of interest transformation, comprising:
a module for acquiring M color images containing a human body, M being a natural number greater than 1000;
a module for labeling N human body key points on each color image to obtain labeling data; the human body key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points;
a module for determining a face bounding box of the color image according to the coordinates of the labeled face key points;
a module for performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain a transformed image and transformed coordinates of key points of the human body; the face central point and the face size are determined according to the face bounding box;
a module for training a human body key point detection model for detecting human body key points based on the image after the transformation of the region of interest and the transformed human body key point coordinates;
a module for detecting a human face boundary box by using a human face detector for an input image to be detected containing a human body, then carrying out region-of-interest transformation, improving the proportion of the human face in the image and obtaining a transformed image;
a module for detecting human key points in the transformed image using a trained human key point detection model; and
a module for carrying out region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
It should be understood that the functions and implementation of the modules of the human body key point detection apparatus based on region of interest transformation of the present embodiment can be implemented based on the specific operations of the aforementioned human body key point detection method based on region of interest transformation.
System for human body key point detection based on region of interest transformation
According to the disclosure of the present invention, there is also provided a system for human keypoint detection based on region of interest transformation, comprising:
one or more processors;
a memory storing instructions that are operable, when executed by the one or more processors, to cause the one or more processors to perform operations comprising a flow of a region of interest transformation based human keypoint detection method as previously described, in particular the procedures of the detection method as implemented in connection with FIGS. 1 and 3.
Although the present invention has been described with reference to the preferred embodiments, it is not intended to be limited thereto. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the invention. Therefore, the protection scope of the present invention should be determined by the appended claims.
Claims (10)
1. A human body key point detection method based on region of interest transformation is characterized by comprising the following steps:
step 1, obtaining M color images containing a human body, wherein M is a natural number more than 1000;
step 2, marking N human body key points on each color image to obtain marking data; the human body key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points;
step 3, determining a face boundary frame of the color image according to the coordinates of the labeled face key points;
step 4, performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain transformed images and transformed human body key point coordinates; the face central point and the face size are determined according to the face bounding box;
step 5, training a human body key point detection model for detecting the human body key points based on the image after the region of interest is transformed and the transformed human body key point coordinates;
step 6, detecting a human face boundary frame by using a human face detector for an input image to be detected containing a human body, and then carrying out region-of-interest transformation on the image to be detected according to the method in the step 4, so as to improve the proportion of the human face in the image and obtain a transformed image;
step 7, detecting the human key points in the transformed image by using the human key point detection model trained in the step 5; and
step 8, carrying out region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
2. The method for detecting key points of a human body based on region of interest transformation according to claim 1, wherein in the step 2, N key points of the human body are labeled to each color image, and the obtained labeled data is expressed as:
3. The method for detecting key points of a human body based on region-of-interest transformation according to claim 1, wherein in the step 4, the region-of-interest transformation is performed on each color image and the labeled data according to the center point of the human face and the size of the human face to obtain the transformed image and the transformed key point coordinates, and the method comprises the following steps:
taking the center point of the face bounding box as the face center point and the length of the long edge of the bounding box as the face size, and performing region-of-interest transformation on each image and its corresponding human body key points according to the face center point and the face size, to obtain transformed data expressed as follows:
$$\{[I_0,(p_{0,0},p_{0,1},\ldots,p_{0,N-1})],\ [I_1,(p_{1,0},p_{1,1},\ldots,p_{1,N-1})],\ \ldots,\ [I_{M-1},(p_{M-1,0},p_{M-1,1},\ldots,p_{M-1,N-1})]\}$$
wherein $p_{m,n}=(x_{m,n},y_{m,n})$ is the coordinate of the n-th transformed human body key point of the m-th transformed image $I_m$, the side length of the transformed image is L, and L is a positive integer;
each pixel value of the transformed image $I_m$ is obtained by sampling from the m-th original image, $x_{indices,m}$ is the list of abscissas of the sampling positions in that image, and $y_{indices,m}$ is the list of ordinates of the sampling positions in that image, obtained as follows:
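$$x_{indices,m}=\left(x_{face,m}+\mathrm{warp}_{RoI,m}(0),\ x_{face,m}+\mathrm{warp}_{RoI,m}(1),\ \ldots,\ x_{face,m}+\mathrm{warp}_{RoI,m}(L-1)\right)$$
$$y_{indices,m}=\left(y_{face,m}+\mathrm{warp}_{RoI,m}(0),\ y_{face,m}+\mathrm{warp}_{RoI,m}(1),\ \ldots,\ y_{face,m}+\mathrm{warp}_{RoI,m}(L-1)\right)$$
$$\mathrm{warp}_{RoI,m}(t)=\frac{a_m}{2}\cdot\operatorname{arctanh}\!\left(\frac{2t}{L}-0.9\right)$$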
wherein $\mathrm{warp}_{RoI,m}(t)$ is the region-of-interest transformation function of the m-th image, t is the function input, and t = 0, 1, 2, ..., L-1;
the region-of-interest transformation of the image uses the remap method of the OpenCV image processing library, with the parameter map1 set to $x_{indices,m}$ and the parameter map2 set to $y_{indices,m}$;
the human body key points $(x_{m,n},y_{m,n})$ after region-of-interest transformation are calculated by a traversal method.
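The sampling lists of claim 3 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patented implementation. It assumes that $a_m$ is the face size (long edge of the bounding box) and evaluates the warp formula exactly as printed. As printed, the arctanh argument 2t/L - 0.9 leaves the interval (-1, 1) once t reaches 0.95·L, so the sketch returns NaN there rather than guessing at a correction. OpenCV's remap expects map1 and map2 to have the size of the output image, so the sketch also expands the two 1-D lists into L×L grids; that expansion is an assumption about how the lists would be passed. All function names and sample values are hypothetical.

```python
import math


def warp_roi(t, a, L):
    """Warp formula as printed in the claim: a/2 * arctanh(2t/L - 0.9).

    Returns NaN where the arctanh argument falls outside (-1, 1).
    """
    u = 2.0 * t / L - 0.9
    if not -1.0 < u < 1.0:
        return math.nan
    return a / 2.0 * math.atanh(u)


def sample_positions(center, a, L):
    """Sampling coordinates center + warp_roi(t) for t = 0 .. L-1."""
    return [center + warp_roi(t, a, L) for t in range(L)]


def build_maps(x_face, y_face, a, L):
    """Expand the two 1-D lists into L x L grids (assumed layout for remap)."""
    xs = sample_positions(x_face, a, L)
    ys = sample_positions(y_face, a, L)
    map_x = [list(xs) for _ in range(L)]  # map_x[row][col] = xs[col]
    map_y = [[y] * L for y in ys]         # map_y[row][col] = ys[row]
    return map_x, map_y


# Hypothetical values: face centered at (50, 60), face size 100, L = 10.
map_x, map_y = build_maps(50.0, 60.0, 100.0, 10)
```

Converted to float32 arrays, `map_x` and `map_y` would be candidates for the map1 and map2 arguments of remap. Because arctanh is steep near the ends of its domain, the sampling positions are dense around the face center and sparse far from it, which is what enlarges the face in the transformed image.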
4. The method for detecting human body key points based on region-of-interest transformation according to claim 3, wherein in step 4, the human body key points $(x_{m,n},y_{m,n})$ after region-of-interest transformation are calculated by a traversal method, and the traversal calculation process comprises the following steps:
traversing t over the range t = 0, 1, 2, ...; at this time, the value of t is the abscissa $x_{m,n}$ of the transformed key point; and
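One way to read the traversal of claim 4 is sketched below. The selection criterion did not survive in the text above, so this sketch assumes the traversal picks the t whose sampling position, center + warp(t), lies nearest to the key point's coordinate in the original image; that criterion is an assumption, not the claimed condition. The sketch also assumes the ordinate is handled the same way as the abscissa. The warp formula is taken as printed in claim 3, and all function names and sample values are hypothetical.

```python
import math


def warp_roi(t, a, L):
    """Warp formula as printed in claim 3; NaN outside the arctanh domain."""
    u = 2.0 * t / L - 0.9
    if not -1.0 < u < 1.0:
        return math.nan
    return a / 2.0 * math.atanh(u)


def forward_coordinate(src_coord, center, a, L):
    """Traverse t = 0 .. L-1 and return the t with the nearest sampling position.

    Assumption: 'nearest sampling position' stands in for the lost criterion.
    """
    best_t, best_dist = None, math.inf
    for t in range(L):
        pos = center + warp_roi(t, a, L)
        if math.isnan(pos):
            continue  # formula undefined for this t; skip it
        dist = abs(pos - src_coord)
        if dist < best_dist:
            best_t, best_dist = t, dist
    return best_t


def forward_keypoint(src_point, face_center, a, L):
    """Map an original-image key point to transformed-image coordinates."""
    return (forward_coordinate(src_point[0], face_center[0], a, L),
            forward_coordinate(src_point[1], face_center[1], a, L))


# Hypothetical values: key point (34.5, 45.0), face center (50, 50), size 100.
transformed = forward_keypoint((34.5, 45.0), (50.0, 50.0), 100.0, 10)
```

A linear scan over L values is cheap for the small side lengths involved and avoids inverting the warp function analytically.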
5. The method for detecting human body key points based on region-of-interest transformation according to claim 3, wherein in step 5, the human body key point detection model for detecting the human body key points is implemented with a CNN-based network, and in the training process the loss function of the m-th data item is expressed as:
wherein $(x_{m,n},y_{m,n})$ is the n-th human body key point of the m-th training sample in the data set after region-of-interest transformation, and $(x'_{m,n},y'_{m,n})$ is the n-th human body key point predicted by the model for the m-th training image after region-of-interest transformation.
6. The method for detecting human key points based on region-of-interest transformation according to claim 3, wherein in the step 8, the human key points in the transformed image are subjected to region-of-interest inverse transformation to obtain the human key points of the image to be detected before transformation, and the method comprises the following steps:
for the human body key points $(x_{test,n},y_{test,n})$ output by the human body key point detection model trained in step 5, obtaining the human body key points $(x_{src,test,n},y_{src,test,n})$ of the image to be detected before transformation by using the following region-of-interest inverse transformation formulas:
$$x_{src,test,n}=x_{test,face}+\mathrm{warp}_{RoI,test}(x_{test,n})$$
$$y_{src,test,n}=y_{test,face}+\mathrm{warp}_{RoI,test}(y_{test,n})$$
wherein $(x_{test,face},y_{test,face})$ represents the center point of the face bounding box, and $a_{test}$ represents the length of the long side of the face bounding box; $x_{test,indices}$ and $y_{test,indices}$ represent the positions sampled from the image to be detected before transformation when the region-of-interest transformation is applied to it, $x_{test,indices}$ being the list of sampling position abscissas and $y_{test,indices}$ being the list of sampling position ordinates;
wherein the region-of-interest transformation of the image to be detected before transformation uses the remap method of the OpenCV image processing library, with the parameter map1 set to $x_{test,indices}$ and the parameter map2 set to $y_{test,indices}$;
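$$x_{test,indices}=\left(x_{test,face}+\mathrm{warp}_{RoI,test}(0),\ x_{test,face}+\mathrm{warp}_{RoI,test}(1),\ \ldots,\ x_{test,face}+\mathrm{warp}_{RoI,test}(L-1)\right)$$
$$y_{test,indices}=\left(y_{test,face}+\mathrm{warp}_{RoI,test}(0),\ y_{test,face}+\mathrm{warp}_{RoI,test}(1),\ \ldots,\ y_{test,face}+\mathrm{warp}_{RoI,test}(L-1)\right)$$
$$\mathrm{warp}_{RoI,test}(t)=\frac{a_{test}}{2}\cdot\operatorname{arctanh}\!\left(\frac{2t}{L}-0.9\right)$$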
wherein $\mathrm{warp}_{RoI,test}(t)$ is the region-of-interest transformation function of the image to be detected, t is the function input, and t = 0, 1, 2, ..., L-1.
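The inverse transformation of claim 6 amounts to evaluating the warp function at each predicted coordinate and adding the face center. A minimal sketch follows, assuming the warp formula exactly as printed and returning NaN where its arctanh argument leaves (-1, 1); the function names and sample values are hypothetical.

```python
import math


def warp_roi(t, a, L):
    """Warp formula as printed in the claim; NaN outside the arctanh domain."""
    u = 2.0 * t / L - 0.9
    if not -1.0 < u < 1.0:
        return math.nan
    return a / 2.0 * math.atanh(u)


def inverse_keypoint(test_point, face_center, a_test, L):
    """Map a key point predicted on the transformed image back to the
    image to be detected: src = face_center + warp_roi(predicted coordinate).
    """
    x_test, y_test = test_point
    x_face, y_face = face_center
    return (x_face + warp_roi(x_test, a_test, L),
            y_face + warp_roi(y_test, a_test, L))


# Hypothetical values: predicted key point (3, 4) on a 10 x 10 transformed
# image, face center (50, 50), face size 100.
src_point = inverse_keypoint((3, 4), (50.0, 50.0), 100.0, 10)
```

Unlike the forward direction, no traversal is needed here, and the predicted coordinates do not have to be integers.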
7. The method for detecting human body key points based on region of interest transformation according to claim 3, wherein in the step 4 and the step 6, the side lengths L of the images after the region of interest transformation have the same value.
8. The method for detecting human key points based on region of interest transformation according to claim 6, wherein the side length L of the image after the region of interest transformation is 64 or 128.
9. A human key point detection device based on region of interest transform is characterized by comprising:
a module for acquiring M color images containing a human body, M being a natural number greater than 1000;
a module for labeling N human body key points on each color image to obtain labeling data; the human body key points comprise face key points and limb key points, and the number of the face key points is more than that of the limb key points;
a module for determining a face bounding box of the color image according to the coordinates of the labeled face key points;
a module for performing region-of-interest transformation on each color image and the labeled data according to the face center point and the face size to obtain a transformed image and transformed coordinates of key points of the human body; the face central point and the face size are determined according to the face bounding box;
a module for training a human body key point detection model for detecting human body key points based on the image after the transformation of the region of interest and the transformed human body key point coordinates;
a module for detecting a human face boundary box by using a human face detector for an input image to be detected containing a human body, then carrying out region-of-interest transformation, improving the proportion of the human face in the image and obtaining a transformed image;
a module for detecting human key points in the transformed image using a trained human key point detection model; and
a module for performing region-of-interest inverse transformation on the human body key points in the transformed image to obtain the human body key points of the image to be detected before transformation.
10. A system for human keypoint detection based on region of interest transformations, comprising:
one or more processors;
a memory storing instructions that are operable, when executed by the one or more processors, to cause the one or more processors to perform operations comprising a flow of a region of interest transform based human keypoint detection method according to any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110478213.5A CN113111850B (en) | 2021-04-30 | 2021-04-30 | Human body key point detection method, device and system based on region-of-interest transformation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110478213.5A CN113111850B (en) | 2021-04-30 | 2021-04-30 | Human body key point detection method, device and system based on region-of-interest transformation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113111850A (en) | 2021-07-13 |
CN113111850B (en) | 2022-08-16 |
Family
ID=76720661
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110478213.5A Active CN113111850B (en) | 2021-04-30 | 2021-04-30 | Human body key point detection method, device and system based on region-of-interest transformation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113111850B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117422721A (en) * | 2023-12-19 | 2024-01-19 | 天河超级计算淮海分中心 | Intelligent labeling method based on lower limb CT image |
CN118509542A (en) * | 2024-07-18 | 2024-08-16 | 圆周率科技(常州)有限公司 | Video generation method, device, computer equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190370537A1 (en) * | 2018-05-29 | 2019-12-05 | Umbo Cv Inc. | Keypoint detection to highlight subjects of interest |
CN110807448A (en) * | 2020-01-07 | 2020-02-18 | 南京甄视智能科技有限公司 | Human face key point data enhancement method, device and system and model training method |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190370537A1 (en) * | 2018-05-29 | 2019-12-05 | Umbo Cv Inc. | Keypoint detection to highlight subjects of interest |
CN110807448A (en) * | 2020-01-07 | 2020-02-18 | 南京甄视智能科技有限公司 | Human face key point data enhancement method, device and system and model training method |
CN111178337A (en) * | 2020-01-07 | 2020-05-19 | 南京甄视智能科技有限公司 | Human face key point data enhancement method, device and system and model training method |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117422721A (en) * | 2023-12-19 | 2024-01-19 | 天河超级计算淮海分中心 | Intelligent labeling method based on lower limb CT image |
CN117422721B (en) * | 2023-12-19 | 2024-03-08 | 天河超级计算淮海分中心 | Intelligent labeling method based on lower limb CT image |
CN118509542A (en) * | 2024-07-18 | 2024-08-16 | 圆周率科技(常州)有限公司 | Video generation method, device, computer equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN113111850B (en) | 2022-08-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022002150A1 (en) | Method and device for constructing visual point cloud map | |
CN108388896B (en) | License plate identification method based on dynamic time sequence convolution neural network | |
CN108256394B (en) | Target tracking method based on contour gradient | |
CN109118473B (en) | Angular point detection method based on neural network, storage medium and image processing system | |
CN106599830B (en) | Face key point positioning method and device | |
CN108108764B (en) | Visual SLAM loop detection method based on random forest | |
WO2020177432A1 (en) | Multi-tag object detection method and system based on target detection network, and apparatuses | |
CN111445459B (en) | Image defect detection method and system based on depth twin network | |
CN109753891A (en) | Football player's orientation calibration method and system based on human body critical point detection | |
CN111709980A (en) | Multi-scale image registration method and device based on deep learning | |
CN113111850B (en) | Human body key point detection method, device and system based on region-of-interest transformation | |
CN111461113B (en) | Large-angle license plate detection method based on deformed plane object detection network | |
WO2018035794A1 (en) | System and method for measuring image resolution value | |
CN112818969A (en) | Knowledge distillation-based face pose estimation method and system | |
CN113808180B (en) | Heterologous image registration method, system and device | |
CN111415339B (en) | Image defect detection method for complex texture industrial product | |
CN113011401A (en) | Face image posture estimation and correction method, system, medium and electronic equipment | |
CN103353941A (en) | Natural marker registration method based on viewpoint classification | |
CN117541652A (en) | Dynamic SLAM method based on depth LK optical flow method and D-PROSAC sampling strategy | |
CN117372777A (en) | Compact shelf channel foreign matter detection method based on DER incremental learning | |
CN111523586A (en) | Noise-aware-based full-network supervision target detection method | |
CN108992033B (en) | Grading device, equipment and storage medium for vision test | |
CN117253062A (en) | Relay contact image characteristic quick matching method under any gesture | |
CN113111849B (en) | Human body key point detection method, device, system and computer readable medium | |
CN114419716B (en) | Calibration method for face image face key point calibration |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CP01 | Change in the name or title of a patent holder | Address after: No.568 longmian Avenue, gaoxinyuan, Jiangning District, Nanjing City, Jiangsu Province, 211000; Patentee after: Xiaoshi Technology (Jiangsu) Co.,Ltd.; Address before: No.568 longmian Avenue, gaoxinyuan, Jiangning District, Nanjing City, Jiangsu Province, 211000; Patentee before: NANJING ZHENSHI INTELLIGENT TECHNOLOGY Co.,Ltd. |