CN111476184A - Human body key point detection method based on dual-attention mechanism - Google Patents
Human body key point detection method based on dual-attention mechanism
- Publication number
- CN111476184A CN111476184A CN202010284037.7A CN202010284037A CN111476184A CN 111476184 A CN111476184 A CN 111476184A CN 202010284037 A CN202010284037 A CN 202010284037A CN 111476184 A CN111476184 A CN 111476184A
- Authority
- CN
- China
- Prior art keywords
- human body
- data set
- key point
- attention
- point detection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000001514 detection method Methods 0.000 title claims abstract description 66
- 238000012549 training Methods 0.000 claims abstract description 35
- 238000012360 testing method Methods 0.000 claims abstract description 25
- 230000007246 mechanism Effects 0.000 claims abstract description 12
- 238000007781 pre-processing Methods 0.000 claims abstract description 5
- 238000011176 pooling Methods 0.000 claims description 8
- 238000000605 extraction Methods 0.000 claims description 6
- 230000004927 fusion Effects 0.000 claims description 6
- 230000036544 posture Effects 0.000 claims description 5
- 238000004364 calculation method Methods 0.000 claims description 4
- 230000008859 change Effects 0.000 claims description 2
- 238000012795 verification Methods 0.000 claims description 2
- 238000005070 sampling Methods 0.000 claims 1
- 238000000034 method Methods 0.000 abstract description 10
- 238000004422 calculation algorithm Methods 0.000 description 5
- 238000013135 deep learning Methods 0.000 description 4
- 238000010586 diagram Methods 0.000 description 4
- 230000006872 improvement Effects 0.000 description 3
- 230000006399 behavior Effects 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 210000003423 ankle Anatomy 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 210000004556 brain Anatomy 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 238000013527 convolutional neural network Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 210000000887 face Anatomy 0.000 description 1
- 230000005021 gait Effects 0.000 description 1
- 210000003127 knee Anatomy 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000003058 natural language processing Methods 0.000 description 1
- 238000003909 pattern recognition Methods 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 210000000707 wrist Anatomy 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a human body key point detection method based on a dual-attention mechanism, comprising the following steps: obtaining a human body key point detection data set comprising a training data set and a test data set; preprocessing the training and test data sets; building a human body key point detection network in which channel attention and spatial attention modules are added to the feature-extraction residual blocks; training the network on the preprocessed training data set for a specified number of epochs; evaluating and saving the trained network model; and testing the trained model on the test data set. Compared with existing methods, the proposed method detects human body key points, especially difficult key points, more accurately.
Description
Technical Field
The invention relates to the fields of computer vision and deep learning, and in particular to a human body key point detection method based on deep learning.
Background
Human body key point detection underlies many computer vision tasks and plays a fundamental role in related fields such as behavior recognition, person tracking and gait recognition. It detects key points of the human body in an image (joint points such as wrists, knees, ankles and the face), and is important for describing human postures and predicting human behavior. Researchers have therefore attempted to solve this problem over the last decade with different approaches, from the early pictorial structures and graph models to later depth-map-based methods. Although these conventional methods made some progress, their accuracy was low and they were difficult to put into practical use. In 2014, DeepPose applied deep neural networks to the human key point detection problem for the first time, predicting human key points through cascaded convolutional neural networks. In 2016, with the rise of deep learning, a series of deep-learning-based algorithms appeared, such as Hourglass, CPM, OpenPose, G-RMI, RMPE, CPN and HRNet [1], and the detection accuracy of human key points improved continuously. Although these algorithms raised the overall accuracy of human key point detection, it still needs improvement in complex scenes, particularly for the detection of difficult key points.
In recent years, attention mechanisms have been used successfully in image processing, speech recognition and natural language processing. The attention mechanism in computer vision mimics a signal-processing mechanism specific to human vision: the eye rapidly scans the global image to find the region needing attention, then devotes more processing resources to that region to obtain the detailed information needed and suppress useless information. Current attention mechanisms mainly comprise channel attention (Channel Attention), which focuses on different features of a picture, and spatial attention (Spatial Attention), which focuses on different regions of the picture. A representative example is the channel attention module proposed by SENet [2], which improves the model's sensitivity to different channel features and thereby its performance. CBAM [3] and BAM [4] introduce both attention mechanisms at once; compared with SENet [2], which attends only to channels, they add attention to different regions of the picture and further improve overall network performance. Adding both channel attention and spatial attention to a human key point detection algorithm therefore helps the model assign different weights to each part of the input features, extract the more critical and important information, and make more accurate judgements.
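The channel and spatial attention computations described above can be sketched in a few lines of numpy. This is a minimal illustration, not the patented implementation: the shared MLP weights `w1`/`w2` are passed in as toy parameters, and the learned 7 × 7 convolution of spatial attention is replaced by a simple average over the two pooled maps for brevity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Mc(F) = sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F))).
    feat: (C, H, W); w1: (C, C//r) and w2: (C//r, C) are the shared MLP weights."""
    avg = feat.mean(axis=(1, 2))                  # (C,) global average pooling
    mx = feat.max(axis=(1, 2))                    # (C,) global max pooling
    mlp = lambda v: np.maximum(v @ w1, 0) @ w2    # shared two-layer perceptron
    return sigmoid(mlp(avg) + mlp(mx))            # (C,) per-channel weights

def spatial_attention(feat):
    """Ms(F) = sigmoid(conv7x7([AvgPool(F); MaxPool(F)])) along the channel axis.
    The learned 7x7 convolution is replaced here by a plain average of the
    two pooled maps, purely for illustration."""
    avg = feat.mean(axis=0)                       # (H, W) channel-wise average pool
    mx = feat.max(axis=0)                         # (H, W) channel-wise max pool
    stacked = np.stack([avg, mx])                 # concatenation along channels
    return sigmoid(stacked.mean(axis=0))          # (H, W) per-position weights
```

The per-channel and per-position weights are then multiplied back onto the feature map, which is how both CBAM and BAM reweight features.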
Reference documents:
1. Ke Sun, Bin Xiao, Dong Liu, and Jingdong Wang. Deep High-Resolution Representation Learning for Human Pose Estimation. In: Proc. of Computer Vision and Pattern Recognition (CVPR). 2019.
2. J. Hu, L. Shen, and G. Sun. Squeeze-and-Excitation Networks. In: Proc. of Computer Vision and Pattern Recognition (CVPR). 2018.
3. S. Woo, J. Park, J. Y. Lee, and I. So Kweon. CBAM: Convolutional Block Attention Module. In: Proc. of European Conf. on Computer Vision (ECCV). 2018.
4. J. Park, S. Woo, J. Y. Lee, and I. So Kweon. BAM: Bottleneck Attention Module. In: Proc. of British Machine Vision Conference (BMVC). 2018.
Disclosure of Invention
Aiming at the problem that conventional human body key point detection methods have a high error rate on difficult key points in complex scenes, the invention designs a network structure based on a dual-attention mechanism to detect human body key points, mainly comprising the following steps:
step S1: acquiring a human body key point detection data set, which comprises a training data set and a test data set;
step S2: training a human body key point detection network;
step S21: preprocessing the training data set and the testing data set acquired in the step S1;
step S22: building a human body key point detection network, and adding a channel attention and space attention module in the residual block for extracting the characteristics;
step S23: performing model training on the training data set processed in step S21 using the network of step S22 for a specified number of epochs;
step S24: evaluating and storing the model trained in step S23;
step S3: the keypoint model trained in step S2 is tested on the test data set acquired in step S1.
Aiming at the problem that conventional human body key point detection methods have a high error rate, especially on difficult key points in complex scenes, the human body key point detection method based on a dual-attention mechanism provided by the invention adds a channel attention module and a spatial attention module both to the feature-extracting BasicBlock of the high-resolution network (HRNet) and to the parallel-improved Bottleneck residual block, improving the accuracy of key point detection. Compared with existing methods, the proposed method detects human body key points, especially difficult key points, more accurately.
Drawings
FIG. 1 is a flow chart of the human body key point detection method based on a dual-attention mechanism.
Fig. 2 is a block diagram of a high resolution network used in the present invention.
Fig. 3 is a structural diagram of the parallel-modified Bottleneck block with the attention modules added, according to the present invention.
FIG. 4 is a structural diagram of the BasicBlock with the attention modules added, according to the present invention.
Detailed Description
Fig. 1 shows a flowchart of the human body key point detection method based on a dual-attention mechanism provided by the invention. The method mainly comprises: obtaining a human body key point detection data set comprising a training data set and a test data set; preprocessing both; building a human body key point detection network in which channel attention and spatial attention modules are added to the feature-extraction residual blocks; training the network on the preprocessed training data for a specified number of epochs; evaluating and saving the trained model; and testing the model on the test data set. The specific implementation details of each step are as follows:
step S1: acquiring a human body key point detection data set comprising a training data set and a test data set. The data consist of pictures containing different human postures and annotation files with the ground-truth joint positions. The public data sets MPII and COCO2017 are used. The MPII human pose data set contains 25k pictures and 40k human instances with 16 key points; its training and test sets contain 28k and 12k instances respectively. The COCO2017 keypoint detection data set contains 200k pictures and 250k human instances with 17 key points; the training set train2017 contains 58k pictures and 150k human instances, and the validation set val2017 and the test set test2017 contain 5k and 20k pictures respectively.
Step S2: training a human body key point detection network, wherein the specific mode comprises the following steps of S21, S22, S23 and S24:
step S21, preprocessing the training data set and the test data set obtained in the step S1, wherein the human body key point detection network only detects key points of a human body, for detecting the human body in the picture, the MPII data set uses a provided human body frame, the COCO data set uses a faster-RCNN to carry out human body detection to obtain a human body detection frame, the height-width ratio of the human body detection frames of the MPII and COCO training data sets is fixed to be 4:3, then the human body detection frame is cut out from the picture, the sizes of the human body detection frame are respectively adjusted to be 256 × 256 and 256 × 192, meanwhile, data enhancement is carried out to the data, the data enhancement comprises random rotation (-45 degrees, 45 degrees), random scaling (0.65, 1.35) and turning.
Step S22: building the human body key point detection network and adding channel attention and spatial attention modules to the feature-extraction residual blocks. The network uses the high-resolution network (HRNet) as its overall framework: it connects high-to-low-resolution subnetworks in parallel and produces a reliable high-resolution representation by repeatedly fusing them. The HRNet structure is divided into five stages, as shown in Fig. 2.
In the first stage, the input image passes through two 3 × 3 convolutions with stride 2, so the height (H) and width (W) become H/4 and W/4 and the number of channels becomes 64. Feature extraction is then performed by 4 improved Bottleneck residual blocks: the Bottleneck residual block of ResNet is improved in parallel by connecting the 3 × 3 convolution layer of ResNet in parallel with the 3 × 3, group-32 convolution layer of ResNeXt, after which the channel attention and spatial attention modules are added. Channel attention compresses the convolved feature map over the spatial dimensions using max pooling and average pooling, giving two spatial context descriptions Fc_avg and Fc_max; these are passed through a shared network of multi-layer perceptrons, Mc(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))), yielding the channel attention feature map Mc(F). Spatial attention then applies max pooling and average pooling along the channel dimension of the channel-refined feature map, giving two feature descriptions Fs_avg and Fs_max; these are merged by a concatenation operation, and a convolution produces the spatial attention feature map Ms(F) = σ(f^(7×7)([AvgPool(F); MaxPool(F)])), where f^(7×7) denotes a 7 × 7 convolution. The improved module is called PRAB (Parallel Residual Attention Block); its structure is shown in Fig. 3. After the first stage the feature map size is [H/4, W/4] and the number of channels becomes 256.
The second stage first changes the number of channels of the feature map to 32 by a 3 × 3 convolution with stride 1, and generates a low-resolution branch from the previous stage by a 3 × 3 convolution with stride 2, giving a feature map of size [H/8, W/8] with the channels changed from 256 to 64. The two branches are then processed by 4 BasicBlocks each, with channel and spatial attention added to the BasicBlocks as in the first stage (structure shown in Fig. 4). Repeated multi-scale fusion follows: the high-resolution branch is downsampled to the low resolution by a 3 × 3 convolution with stride 2 and added to it to form the low-resolution output, while the low-resolution branch is upsampled to the high resolution by nearest-neighbour interpolation and added to it to form the high-resolution output. The two branches finally output feature maps of sizes [H/4, W/4, 32] and [H/8, W/8, 64].
The third stage takes the branches obtained by the multi-scale fusion of the second stage as input and generates a new low-resolution branch [H/16, W/16, 128] from the [H/8, W/8, 64] branch. Each branch is processed by 4 attention-augmented BasicBlocks and, as in the second stage, multi-scale fusion yields 3 branches: [H/4, W/4, 32], [H/8, W/8, 64], [H/16, W/16, 128]. The fourth stage proceeds in the same way and yields 4 branches: [H/4, W/4, 32], [H/8, W/8, 64], [H/16, W/16, 128], [H/32, W/32, 256]. The fifth stage upsamples the 3 low-resolution branches, merges them with the [H/4, W/4, 32] branch, and applies a 1 × 1 convolution to obtain the final output: a heat map for the key points. For the 17 key points of the COCO data set, the final heat map has size [H/4, W/4, 17].
Step S23: performing model training on the training data set processed in step S21 with the network of step S22 for a specified number of epochs. Specifically, an Adam optimizer is used with an initial learning rate of 1e-3; the learning rate is reduced to 1e-4 at epoch 170 and to 1e-5 at epoch 200, and training stops at epoch 310.
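The step schedule of step S23 can be written as a small function. One assumption is made: the drops are taken as inclusive at epochs 170 and 200 (the text does not specify boundary behaviour).

```python
def learning_rate(epoch):
    """Step learning-rate schedule described in step S23: start at 1e-3,
    drop to 1e-4 at epoch 170 and to 1e-5 at epoch 200; training itself
    stops at epoch 310 (boundary inclusivity is an assumption here)."""
    if epoch < 170:
        return 1e-3
    if epoch < 200:
        return 1e-4
    return 1e-5
```

Such a function plugs directly into most training loops as a per-epoch lookup before the optimizer step.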
Step S24: evaluating and saving the model trained in step S23. Specifically, the MPII data set is evaluated with PCKh (head-normalized Probability of Correct Keypoints) and the COCO data set with OKS (Object Keypoint Similarity); the final model is saved after the network has trained for the specified number of epochs.
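The OKS metric mentioned in step S24 has a standard form in COCO evaluation, which can be sketched as follows. The per-keypoint constants `k` are published by COCO; here they are passed in as a parameter rather than hard-coded.

```python
import numpy as np

def oks(pred, gt, vis, scale, k):
    """Object Keypoint Similarity as used for COCO keypoint evaluation.
    pred, gt: (N, 2) predicted / ground-truth keypoint coordinates,
    vis: (N,) visibility flags (> 0 means labeled),
    scale: object scale s, k: (N,) per-keypoint constants."""
    d2 = ((pred - gt) ** 2).sum(axis=1)            # squared Euclidean distances
    e = np.exp(-d2 / (2.0 * scale ** 2 * k ** 2))  # per-keypoint similarity
    labeled = vis > 0
    return float(e[labeled].mean()) if labeled.any() else 0.0
```

A perfect prediction scores 1.0, and the score decays with distance relative to the object scale, which is what makes OKS robust across person sizes.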
Step S3: testing the key point model trained in step S2 on the test data set acquired in step S1. Specifically, the test data are fed into the trained model, the heat maps of the original image and the horizontally flipped image are averaged to obtain the final predicted heat map, and the predicted position of each key point is taken at a 1/4 offset from the highest-response location toward the second-highest one.
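The quarter-offset decoding described in step S3 can be sketched for a single keypoint heatmap. This reads the text literally (shift 1/4 of a pixel from the maximum toward the second maximum); the function name is illustrative.

```python
import numpy as np

def decode_keypoint(heatmap):
    """Decode one keypoint from a 2-D heatmap: take the arg-max position and
    shift it by 1/4 pixel toward the second-highest position, as described
    for the test phase. Returns (x, y) in heatmap coordinates."""
    order = np.argsort(heatmap.ravel())
    y1, x1 = np.unravel_index(order[-1], heatmap.shape)  # highest response
    y2, x2 = np.unravel_index(order[-2], heatmap.shape)  # second highest
    d = np.array([x2 - x1, y2 - y1], dtype=float)
    n = np.linalg.norm(d)
    if n > 0:
        d /= n                                           # unit direction
    return np.array([x1, y1], dtype=float) + 0.25 * d
```

The resulting sub-pixel coordinate would then be scaled by 4 (the heat map is at 1/4 resolution) and mapped back through the crop transform to image coordinates.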
Aiming at the problem that conventional human body key point detection methods have a high error rate, especially on difficult key points in complex scenes, the human body key point detection method based on a dual-attention mechanism provided by the invention adds a channel attention module and a spatial attention module both to the feature-extracting BasicBlock of the high-resolution network (HRNet) and to the parallel-improved Bottleneck residual block, improving the accuracy of key point detection. Compared with existing methods, the proposed method detects human body key points, especially difficult key points, more accurately.
Claims (1)
1. A human body key point detection method based on a dual-attention mechanism, characterized by comprising the following steps:
step S1: acquiring a human body key point detection data set comprising a training data set and a test data set, the data consisting of pictures containing different human postures and annotation files with the ground-truth joint positions; the public data sets MPII and COCO2017 are used, wherein the MPII human pose data set contains 25k pictures and 40k human instances with 16 key points, its training and test sets containing 28k and 12k instances respectively, and the COCO2017 keypoint detection data set contains 200k pictures and 250k human instances with 17 key points, wherein the training set train2017 contains 58k pictures and 150k human instances, and the validation set val2017 and the test set test2017 contain 5k and 20k pictures respectively;
step S2: training a human body key point detection network, wherein the specific mode comprises the following steps of S21, S22, S23 and S24:
step S21: preprocessing the training and test data sets acquired in step S1, specifically: the human body key point detection network only detects the key points of a single human body, so for detecting human bodies in a picture the MPII data set uses the provided person boxes while the COCO data set uses Faster R-CNN to obtain person detection boxes; the height-to-width ratio of the boxes of the MPII and COCO training data is fixed to 4:3, the boxes are cropped from the picture and resized to 256 × 256 and 256 × 192 respectively, and data enhancement is applied, including random rotation (−45°, 45°), random scaling (0.65, 1.35) and flipping;
step S22: building the human body key point detection network with channel attention and spatial attention modules added to the feature-extraction residual blocks, specifically: the network uses the high-resolution network (HRNet) as its overall framework, connecting high-to-low-resolution subnetworks in parallel and producing a reliable high-resolution representation by repeatedly fusing them, the HRNet structure being divided into five stages; in the first stage, the input image passes through two 3 × 3 convolutions with stride 2, so the height (H) and width (W) become H/4 and W/4 and the number of channels becomes 64, and feature extraction is then performed by 4 improved Bottleneck residual blocks, namely the Bottleneck residual block of ResNet is improved in parallel by connecting the 3 × 3 convolution layer of ResNet in parallel with the 3 × 3, group-32 convolution layer of ResNeXt, after which channel attention and spatial attention are added; channel attention compresses the convolved feature map over the spatial dimensions using max pooling and average pooling, giving two spatial context descriptions Fc_avg and Fc_max, which are passed through a shared network of multi-layer perceptrons, Mc(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))), yielding the channel attention feature map Mc(F); spatial attention then applies max pooling and average pooling along the channel dimension of the channel-refined feature map, giving two feature descriptions Fs_avg and Fs_max, which are merged by a concatenation operation, and a convolution produces the spatial attention feature map Ms(F) = σ(f^(7×7)([AvgPool(F); MaxPool(F)])), where f^(7×7) denotes a 7 × 7 convolution; the improved module is called the Parallel Residual Attention Block (PRAB); after the first stage the feature map size is [H/4, W/4] and the number of channels becomes 256; the second stage first changes the number of channels of the feature map to 32 by a 3 × 3 convolution with stride 1, and generates a low-resolution branch from the previous stage by a 3 × 3 convolution with stride 2, giving a feature map of size [H/8, W/8] with the channels changed from 256 to 64; the two branches are then processed by 4 BasicBlocks each, with channel and spatial attention added to the BasicBlocks as in the first stage, followed by repeated multi-scale fusion: the high-resolution branch is downsampled to the low resolution by a 3 × 3 convolution with stride 2 and added to it to form the low-resolution output, while the low-resolution branch is upsampled to the high resolution by nearest-neighbour interpolation and added to it to form the high-resolution output, the two branches finally outputting feature maps of sizes [H/4, W/4, 32] and [H/8, W/8, 64]; the third stage takes the branches obtained by the multi-scale fusion of the second stage as input and generates a new low-resolution branch [H/16, W/16, 128] from the [H/8, W/8, 64] branch, each branch is processed by 4 attention-augmented BasicBlocks and, as in the second stage, multi-scale fusion yields 3 branches [H/4, W/4, 32], [H/8, W/8, 64], [H/16, W/16, 128]; the fourth stage proceeds in the same way and yields 4 branches [H/4, W/4, 32], [H/8, W/8, 64], [H/16, W/16, 128], [H/32, W/32, 256]; the fifth stage upsamples the 3 low-resolution branches, merges them with the [H/4, W/4, 32] branch, and applies a 1 × 1 convolution to obtain the final output, namely a heat map of the key points; for the 17 key points of the COCO data set, the final heat map has size [H/4, W/4, 17];
step S23: performing model training on the training data set processed in step S21 with the network of step S22 for a specified number of epochs, specifically using an Adam optimizer with an initial learning rate of 1e-3, reducing the learning rate to 1e-4 at epoch 170 and to 1e-5 at epoch 200, and stopping training at epoch 310;
step S24: evaluating and saving the model trained in step S23, specifically evaluating the MPII data set with PCKh (head-normalized Probability of Correct Keypoints) and the COCO data set with OKS (Object Keypoint Similarity), and saving the final model after training for the specified number of epochs;
step S3: testing the key point model trained in step S2 on the test data set acquired in step S1, specifically feeding the test data into the trained model, averaging the heat maps of the original image and the horizontally flipped image to obtain the final predicted heat map, and taking the predicted position of each key point at a 1/4 offset from the highest-response location toward the second-highest one.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010284037.7A CN111476184B (en) | 2020-04-13 | 2020-04-13 | Human body key point detection method based on double-attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010284037.7A CN111476184B (en) | 2020-04-13 | 2020-04-13 | Human body key point detection method based on double-attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111476184A true CN111476184A (en) | 2020-07-31 |
CN111476184B CN111476184B (en) | 2023-12-22 |
Family
ID=71752170
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010284037.7A Active CN111476184B (en) | 2020-04-13 | 2020-04-13 | Human body key point detection method based on double-attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111476184B (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180349359A1 (en) * | 2017-05-19 | 2018-12-06 | salesforce.com,inc. | Natural language processing using a neural network |
CN110678881A (en) * | 2017-05-19 | 2020-01-10 | 易享信息技术有限公司 | Natural language processing using context-specific word vectors |
CN110276316A (en) * | 2019-06-26 | 2019-09-24 | 电子科技大学 | A kind of human body critical point detection method based on deep learning |
Non-Patent Citations (1)
Title |
---|
WANG Ziniu; WANG Hongjie; GAO Jianling: "Text Classification Based on Semantic Reinforcement and Feature Fusion", Software * |
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112084911A (en) * | 2020-08-28 | 2020-12-15 | 安徽清新互联信息科技有限公司 | Human face feature point positioning method and system based on global attention |
CN112084911B (en) * | 2020-08-28 | 2023-03-07 | 安徽清新互联信息科技有限公司 | Human face feature point positioning method and system based on global attention |
CN112084928A (en) * | 2020-09-04 | 2020-12-15 | 东南大学 | Road traffic accident detection method based on visual attention mechanism and ConvLSTM network |
CN112149558A (en) * | 2020-09-22 | 2020-12-29 | 驭势科技(南京)有限公司 | Image processing method, network and electronic equipment for key point detection |
CN112149613A (en) * | 2020-10-12 | 2020-12-29 | 萱闱(北京)生物科技有限公司 | Motion estimation evaluation method based on improved LSTM model |
CN112270213A (en) * | 2020-10-12 | 2021-01-26 | 萱闱(北京)生物科技有限公司 | Improved HRnet based on attention mechanism |
CN112149613B (en) * | 2020-10-12 | 2024-01-05 | 萱闱(北京)生物科技有限公司 | Action pre-estimation evaluation method based on improved LSTM model |
CN114373091A (en) * | 2020-10-14 | 2022-04-19 | 南京工业大学 | Gait recognition method based on deep learning fusion SVM |
CN112347865A (en) * | 2020-10-21 | 2021-02-09 | 四川长虹电器股份有限公司 | Bill correction method based on key point detection |
CN112541409B (en) * | 2020-11-30 | 2021-09-14 | 北京建筑大学 | Attention-integrated residual network expression recognition method |
CN112541409A (en) * | 2020-11-30 | 2021-03-23 | 北京建筑大学 | Attention-integrated residual network expression recognition method |
CN112712015B (en) * | 2020-12-28 | 2024-05-28 | 康佳集团股份有限公司 | Human body key point identification method and device, intelligent terminal and storage medium |
CN112712015A (en) * | 2020-12-28 | 2021-04-27 | 康佳集团股份有限公司 | Human body key point identification method and device, intelligent terminal and storage medium |
CN113011304A (en) * | 2021-03-12 | 2021-06-22 | 山东大学 | Human body posture estimation method and system based on attention multi-resolution network |
CN113034545A (en) * | 2021-03-26 | 2021-06-25 | 河海大学 | Vehicle tracking method based on CenterNet multi-target tracking algorithm |
CN113469193A (en) * | 2021-06-16 | 2021-10-01 | 中国科学院合肥物质科学研究院 | Low-power-consumption pest image identification method based on addition multiplication mixed convolution |
CN113469193B (en) * | 2021-06-16 | 2023-08-22 | 中国科学院合肥物质科学研究院 | Low-power consumption pest image identification method based on addition multiplication mixed convolution |
CN113420641A (en) * | 2021-06-21 | 2021-09-21 | 梅卡曼德(北京)机器人科技有限公司 | Image data processing method, image data processing device, electronic equipment and storage medium |
CN113420641B (en) * | 2021-06-21 | 2024-06-14 | 梅卡曼德(北京)机器人科技有限公司 | Image data processing method, device, electronic equipment and storage medium |
CN113920535B (en) * | 2021-10-12 | 2023-11-17 | 广东电网有限责任公司广州供电局 | Electronic region detection method based on YOLOv5 |
CN113920535A (en) * | 2021-10-12 | 2022-01-11 | 广东电网有限责任公司广州供电局 | Electronic region detection method based on YOLOv5 |
CN113947814A (en) * | 2021-10-28 | 2022-01-18 | 山东大学 | Cross-visual angle gait recognition method based on space-time information enhancement and multi-scale saliency feature extraction |
CN113947814B (en) * | 2021-10-28 | 2024-05-28 | 山东大学 | Cross-view gait recognition method based on space-time information enhancement and multi-scale saliency feature extraction |
CN114373226A (en) * | 2021-12-31 | 2022-04-19 | 华南理工大学 | Human body posture estimation method based on improved HRNet network in operating room scene |
CN114373226B (en) * | 2021-12-31 | 2024-09-06 | 华南理工大学 | Human body posture estimation method based on improved HRNet network in operating room scene |
CN115019338B (en) * | 2022-04-27 | 2023-09-22 | 淮阴工学院 | Multi-person gesture estimation method and system based on GAMHR-Net |
CN115019338A (en) * | 2022-04-27 | 2022-09-06 | 淮阴工学院 | Multi-person posture estimation method and system based on GAMIHR-Net |
CN114998453A (en) * | 2022-08-08 | 2022-09-02 | 国网浙江省电力有限公司宁波供电公司 | Stereo matching model based on high-scale unit and application method thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111476184B (en) | Human body key point detection method based on double-attention mechanism | |
CN110782462B (en) | Semantic segmentation method based on double-flow feature fusion | |
CN110188768B (en) | Real-time image semantic segmentation method and system | |
CN108062754B (en) | Segmentation and identification method and device based on dense network image | |
CN111291739B (en) | Face detection and image detection neural network training method, device and equipment | |
CN110929736A (en) | Multi-feature cascade RGB-D significance target detection method | |
CN112884073B (en) | Image rain removing method, system, terminal and storage medium | |
CN112232134B (en) | Human body posture estimation method based on hourglass network and attention mechanism | |
CN113989283B (en) | 3D human body posture estimation method and device, electronic equipment and storage medium | |
CN114529982A (en) | Lightweight human body posture estimation method and system based on stream attention | |
CN116030537A (en) | Three-dimensional human body posture estimation method based on multi-branch attention-seeking convolution | |
CN108229432A (en) | Face calibration method and device | |
CN114255514A (en) | Human body tracking system and method based on Transformer and camera device | |
CN113066089A (en) | Real-time image semantic segmentation network based on attention guide mechanism | |
CN116092190A (en) | Human body posture estimation method based on self-attention high-resolution network | |
CN115588116A (en) | Pedestrian action identification method based on double-channel attention mechanism | |
CN103208109A (en) | Local restriction iteration neighborhood embedding-based face hallucination method | |
Zhou et al. | Towards locality similarity preserving to 3D human pose estimation | |
CN117115855A (en) | Human body posture estimation method and system based on multi-scale transducer learning rich visual features | |
CN111401335A (en) | Key point detection method and device and storage medium | |
WO2022252519A1 (en) | Image processing method and apparatus, terminal, medium, and program | |
WO2021176985A1 (en) | Signal processing device, signal processing method, and program | |
Huo et al. | Deep high-resolution network with double attention residual blocks for human pose estimation | |
CN112528899B (en) | Image salient object detection method and system based on implicit depth information recovery | |
CN113744255A (en) | Skin mirror image segmentation method, segmentation network and segmentation network construction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||