CN111476184A - Human body keypoint detection method based on a dual-attention mechanism - Google Patents

Human body keypoint detection method based on a dual-attention mechanism

Info

Publication number
CN111476184A
CN111476184A (application CN202010284037.7A; granted publication CN111476184B)
Authority
CN
China
Prior art keywords
human body
data set
key point
attention
point detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010284037.7A
Other languages
Chinese (zh)
Other versions
CN111476184B (en)
Inventor
霍占强
靳晗
乔应旭
宋素玲
雒芬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan University of Technology
Original Assignee
Henan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan University of Technology
Priority to CN202010284037.7A
Publication of CN111476184A
Application granted
Publication of CN111476184B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103: Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a human body keypoint detection method based on a dual-attention mechanism, comprising the following steps: acquiring a human body keypoint detection dataset comprising a training dataset and a test dataset; preprocessing the training and test datasets; building a human body keypoint detection network in which channel attention and spatial attention modules are added to the feature-extraction residual blocks; training the model on the preprocessed training dataset for a specified number of epochs; evaluating and saving the trained network model; and testing the trained model on the test dataset. Compared with existing methods, the proposed method detects human body keypoints more accurately, especially difficult keypoints.

Description

Human body keypoint detection method based on a dual-attention mechanism
Technical Field
The invention relates to the fields of computer vision and deep learning, and in particular to a human body keypoint detection method based on deep learning.
Background
Human body keypoint detection underpins many computer vision tasks and plays a fundamental role in related research fields such as behavior recognition, person tracking and gait recognition. It aims to detect the key points of a human body in an image (joint points such as the wrists, knees and ankles, and facial points), which is essential for describing human posture and predicting human behavior. Researchers have therefore attacked the problem with different approaches over the last decade, from early pictorial structures and graphical models to later depth-based methods. Although these traditional methods made some progress, their accuracy was low and they were difficult to put into practical use. In 2014, DeepPose applied deep neural networks to human keypoint detection for the first time, predicting human keypoints with cascaded convolutional neural networks. From 2016 on, with the rapid rise of deep learning, a series of deep-learning-based algorithms appeared, such as Hourglass, CPM, OpenPose, G-RMI, RMPE, CPN and HRNet [1], continuously improving detection accuracy. Although these algorithms improved human keypoint detection overall, their accuracy still needs improvement in complex scenes, especially for difficult keypoints.
Attention mechanisms have been used successfully in recent years in image processing, speech recognition and natural language processing. In computer vision, the attention mechanism mimics a signal-processing behavior specific to human vision: the eye rapidly scans the global image to find the regions that deserve focus, then devotes more attention resources to those regions to extract the detailed information needed while suppressing useless information. Current attention mechanisms mainly comprise channel attention, which focuses on different features of a picture, and spatial attention, which focuses on different regions of it. A representative example is the channel attention module proposed by SENet [2], which improves a model's sensitivity to different channel features and thereby its performance. CBAM [3] and BAM [4] introduce both mechanisms at once; compared with SENet [2], which attends only to channels, they add attention over different regions of the picture and further improve overall network performance. Adding both channel and spatial attention to a human keypoint detection algorithm therefore helps the model assign different weights to the parts of its input features, extract the more critical and important information, and make more accurate judgments.
Reference documents:
1. Ke Sun, Bin Xiao, Dong Liu, and Jingdong Wang. Deep High-Resolution Representation Learning for Human Pose Estimation. In: Proc. of Computer Vision and Pattern Recognition (CVPR). 2019.
2. J. Hu, L. Shen, and G. Sun. Squeeze-and-Excitation Networks. In: Proc. of Computer Vision and Pattern Recognition (CVPR). 2018.
3. S. Woo, J. Park, J.-Y. Lee, and I. So Kweon. CBAM: Convolutional Block Attention Module. In: Proc. of European Conf. on Computer Vision (ECCV). 2018.
4. J. Park, S. Woo, J.-Y. Lee, and I. So Kweon. BAM: Bottleneck Attention Module. In: Proc. of British Machine Vision Conference (BMVC). 2018.
Disclosure of the Invention
Aiming at the problem that conventional human body keypoint detection methods have a high error rate on difficult keypoints in complex scenes, the invention designs a network structure based on a dual-attention mechanism to detect human body keypoints, mainly comprising the following steps:
Step S1: acquire a human body keypoint detection dataset comprising a training dataset and a test dataset;
Step S2: train the human body keypoint detection network;
Step S21: preprocess the training and test datasets acquired in step S1;
Step S22: build the human body keypoint detection network, adding channel attention and spatial attention modules to the feature-extraction residual blocks;
Step S23: train the model on the dataset processed in step S21 with the network of step S22 for a specified number of epochs;
Step S24: evaluate and save the model trained in step S23;
Step S3: test the keypoint model trained in step S2 on the test dataset acquired in step S1.
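The step list above can be sketched as a minimal runnable skeleton. Every function below is a stub standing in for real dataset and network code; none of the names come from the patent:

```python
# Hypothetical skeleton of steps S1-S3; every function is a placeholder.

def acquire_dataset():                         # S1: would load MPII / COCO2017
    return list(range(6)), list(range(2))      # toy "train" / "test" sample ids

def preprocess(split):                         # S21: crop to 4:3, resize, augment
    return [("sample", s) for s in split]

def build_detection_network():                 # S22: HRNet + dual attention
    return {"name": "HRNet+dual-attention", "epochs_trained": 0}

def train_one_epoch(net, data):                # S23: one pass over the data
    net["epochs_trained"] += 1

def evaluate_and_save(net):                    # S24: PCKh / OKS, then save
    return net

def predict_heatmaps(net, data):               # S3: heatmaps for the test set
    return [("heatmap", s) for s in data]

train_set, test_set = acquire_dataset()
net = build_detection_network()
for _ in range(3):                             # the patent trains for 310 epochs
    train_one_epoch(net, preprocess(train_set))
net = evaluate_and_save(net)
preds = predict_heatmaps(net, preprocess(test_set))
print(net["epochs_trained"], len(preds))       # -> 3 2
```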
Aiming at the problem that the conventional human body key point detection method is high in error rate especially for difficult key point detection in a complex scene, the human body key point detection method based on the double-attention mechanism provided by the invention improves the accuracy of key point detection by respectively adding a channel attention module and a space attention module to a Basicblock for extracting features of a high-resolution network (HRNet) and a Bottleneck residual block after parallel improvement. Compared with the existing method, the method provided by the invention is more accurate in detection of key points of the human body, especially in detection of difficult key points.
Drawings
FIG. 1 is a flow chart of the human body keypoint detection method based on a dual-attention mechanism.
FIG. 2 is a block diagram of the high-resolution network used in the invention.
FIG. 3 is a structural diagram of the parallel-improved Bottleneck block with the attention modules added, according to the invention.
FIG. 4 is a structural diagram of the BasicBlock with the attention modules added, according to the invention.
Detailed Description
FIG. 1 shows the flow chart of the proposed human body keypoint detection method based on a dual-attention mechanism. The method mainly comprises the following steps: acquiring a human body keypoint detection dataset comprising a training dataset and a test dataset; preprocessing the training and test datasets; building a human body keypoint detection network in which channel attention and spatial attention modules are added to the feature-extraction residual blocks; training the model on the preprocessed training dataset for a specified number of epochs; evaluating and saving the trained network model; and testing the trained model on the test dataset. The implementation details of each step are as follows:
Step S1: acquire the human body keypoint detection dataset, comprising a training dataset and a test dataset. The data consist of pictures containing different human postures together with annotation files giving the ground-truth joint positions. The public datasets MPII and COCO2017 are used. The MPII human pose dataset contains 25k pictures and 40k human instances with 16 keypoints, with 28k instances for training and 12k for testing. The COCO2017 keypoint detection dataset contains 200k pictures and 250k human instances with 17 keypoints, of which the training set train2017 comprises 58k pictures and 150k human instances, and the validation set val2017 and test set test2017 contain 5k and 20k pictures respectively.
Step S2: train the human body keypoint detection network through steps S21 to S24:
step S21, preprocessing the training data set and the test data set obtained in the step S1, wherein the human body key point detection network only detects key points of a human body, for detecting the human body in the picture, the MPII data set uses a provided human body frame, the COCO data set uses a faster-RCNN to carry out human body detection to obtain a human body detection frame, the height-width ratio of the human body detection frames of the MPII and COCO training data sets is fixed to be 4:3, then the human body detection frame is cut out from the picture, the sizes of the human body detection frame are respectively adjusted to be 256 × 256 and 256 × 192, meanwhile, data enhancement is carried out to the data, the data enhancement comprises random rotation (-45 degrees, 45 degrees), random scaling (0.65, 1.35) and turning.
Step S22: build the human body keypoint detection network, adding channel attention and spatial attention modules to the feature-extraction residual blocks. Specifically, the network uses the high-resolution network (HRNet) as its overall framework: HRNet connects high-to-low-resolution subnetworks in parallel and produces reliable high-resolution representations by repeatedly fusing them. The overall HRNet structure is divided into five stages, as shown in FIG. 2. In the first stage, the input image is convolved twice by 3×3 convolutions with stride 2, so that the height (H) and width (W) of the image become H/4 and W/4 with 64 channels; feature extraction is then performed by 4 improved Bottleneck residual blocks. The improvement parallelizes the ResNet Bottleneck block: its 3×3 convolution layer is placed in parallel with a 3×3 grouped convolution with 32 groups from ResNeXt, and channel attention and spatial attention modules are then added. The channel attention module compresses the convolved feature map along the spatial dimension using max pooling and average pooling, yielding two spatial context descriptors Fc_avg and Fc_max; both are passed through a shared network composed of a multilayer perceptron and combined as Mc(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) to obtain the channel attention map Mc(F). The spatial attention module then applies max pooling and average pooling along the channel dimension of the channel-refined feature map, yielding two feature descriptors Fs_avg and Fs_max; these are merged by a concatenation operation, and a convolution generates the spatial attention map Ms(F) = σ(f^{7×7}([AvgPool(F); MaxPool(F)])), where f^{7×7} denotes a 7×7 convolution. The improved module is called PRAB (Parallel Residual Attention Block); its structure is shown in FIG. 3. After this stage the feature map size is [H/4, W/4] and the channel number becomes 256. The second stage first changes the channel number of the feature map to 32 with a 3×3 convolution of stride 1, and at the same time generates a low-resolution branch from the previous stage with a 3×3 convolution of stride 2, giving a feature map of size [H/8, W/8] whose channel number is changed from 256 to 64. The two branches are then processed by 4 BasicBlocks each, with channel attention and spatial attention added to the BasicBlocks as in the first stage; the structure is shown in FIG. 4. Repeated multi-scale fusion follows: the high-resolution branch is downsampled to the low resolution by a 3×3 convolution with stride 2 and added to the low-resolution branch to form its output, while the low-resolution branch is upsampled to the high resolution by nearest-neighbor interpolation and added to the high-resolution branch to form its output. The two branches finally output feature maps of sizes [H/4, W/4, 32] and [H/8, W/8, 64]. The third stage takes the multi-scale-fused branches of the second stage as input and generates a new low-resolution branch [H/16, W/16, 128] from the [H/8, W/8, 64] branch; each branch is processed by 4 attention-augmented BasicBlocks and, as in the second stage, multi-scale fusion yields 3 branches: [H/4, W/4, 32], [H/8, W/8, 64], [H/16, W/16, 128]. The fourth stage proceeds like the third and yields 4 branches: [H/4, W/4, 32], [H/8, W/8, 64], [H/16, W/16, 128], [H/32, W/32, 256]. The fifth stage upsamples the 3 low-resolution branches, merges them with the [H/4, W/4, 32] branch, and applies a 1×1 convolution to obtain the final output, a heatmap of the keypoints; for the 17 keypoints of the COCO dataset, for example, the final heatmap has size [H/4, W/4, 17].
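The two CBAM-style attention maps Mc(F) and Ms(F) used in step S22 can be sketched in NumPy as follows. The weights are random stand-ins (a trained module would learn them as convolutional/MLP parameters), and the 7×7 convolution is implemented naively for clarity, so this is illustrative only:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(F, W1, W2):
    """Mc(F) = sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F))).
    F: (C, H, W); W1: (C//r, C) and W2: (C, C//r) are the shared-MLP weights."""
    avg = F.mean(axis=(1, 2))            # Fc_avg: (C,) spatial average pool
    mx = F.max(axis=(1, 2))              # Fc_max: (C,) spatial max pool
    mlp = lambda v: W2 @ np.maximum(W1 @ v, 0)   # shared 2-layer MLP with ReLU
    return sigmoid(mlp(avg) + mlp(mx))   # (C,) channel attention map

def spatial_attention(F, k):
    """Ms(F) = sigmoid(f7x7([AvgPool(F); MaxPool(F)])), pooling over channels,
    concatenating, then a 7x7 convolution with kernel k: (2, 7, 7)."""
    avg = F.mean(axis=0)                 # Fs_avg: (H, W) channel average pool
    mx = F.max(axis=0)                   # Fs_max: (H, W) channel max pool
    desc = np.stack([avg, mx])           # concatenation: (2, H, W)
    pad = np.pad(desc, ((0, 0), (3, 3), (3, 3)))   # same-padding for 7x7
    H, W = avg.shape
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = (pad[:, i:i + 7, j:j + 7] * k).sum()
    return sigmoid(out)                  # (H, W) spatial attention map

rng = np.random.default_rng(0)
C, H, W = 8, 6, 6
F = rng.standard_normal((C, H, W))
Mc = channel_attention(F, rng.standard_normal((C // 2, C)),
                       rng.standard_normal((C, C // 2)))
F = F * Mc[:, None, None]                # channel-refined features
Ms = spatial_attention(F, rng.standard_normal((2, 7, 7)))
out = F * Ms[None]                       # dual-attention output
print(out.shape)                         # -> (8, 6, 6)
```

As in CBAM, the spatial map is computed from the channel-refined features and both maps rescale the input rather than change its shape.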
Step S23: train the model on the training dataset processed in step S21 with the network of step S22 for a specified number of epochs. Specifically, an Adam optimizer is used with an initial learning rate of 1e-3; the learning rate is reduced to 1e-4 at epoch 170 and to 1e-5 at epoch 200, and training stops at epoch 310.
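The stepped schedule of step S23 can be written as a small function; whether each drop takes effect exactly at the named epoch or just after it is an assumption:

```python
def learning_rate(epoch):
    """Learning-rate schedule from step S23: 1e-3 initially, reduced to
    1e-4 from epoch 170 and to 1e-5 from epoch 200 (drop-at-epoch is an
    assumption about the patent's wording)."""
    if epoch >= 200:
        return 1e-5
    if epoch >= 170:
        return 1e-4
    return 1e-3

print([learning_rate(e) for e in (0, 170, 200)])  # -> [0.001, 0.0001, 1e-05]
```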
Step S24: evaluate and save the model trained in step S23. Specifically, the MPII dataset is evaluated with PCKh (head-normalized Probability of Correct Keypoints) and the COCO dataset with OKS (Object Keypoint Similarity); the final model is saved after the network has been trained for the specified number of epochs.
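OKS, the COCO metric named in step S24, is conventionally defined as OKS = Σ_i exp(-d_i² / (2 s² k_i²)) · δ(v_i > 0) / Σ_i δ(v_i > 0), where d_i is the distance between the predicted and ground-truth keypoint i, s² the object area, k_i a per-keypoint constant and v_i the visibility flag. A sketch of this standard definition (not code from the patent; the k_i values below are the usual COCO constants for shoulder and elbow):

```python
import math

def oks(pred, gt, vis, area, k):
    """Object Keypoint Similarity over one person instance.
    pred, gt: lists of (x, y); vis: visibility flags; area: object area s^2;
    k: per-keypoint constants. Only labelled keypoints (v > 0) count."""
    num = den = 0.0
    for (px, py), (gx, gy), v, ki in zip(pred, gt, vis, k):
        if v > 0:
            d2 = (px - gx) ** 2 + (py - gy) ** 2
            num += math.exp(-d2 / (2 * area * ki ** 2))
            den += 1
    return num / den if den else 0.0

# Perfect predictions give OKS = 1.0
gt = [(10.0, 20.0), (30.0, 40.0)]
print(oks(gt, gt, vis=[1, 1], area=100.0, k=[0.079, 0.072]))  # -> 1.0
```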
Step S3: test the keypoint model trained in step S2 on the test dataset acquired in step S1. Specifically, the test data are fed into the trained model, the heatmaps of the original image and its flipped copy are averaged to obtain the final predicted heatmap, and the predicted position of each keypoint is taken at a quarter offset from the highest heat value toward the second-highest.
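The quarter-offset decoding in step S3 is commonly implemented by shifting the argmax a quarter pixel toward the stronger neighboring response; a sketch under that assumption (the patent's exact tie-breaking and border handling are not specified):

```python
import numpy as np

def decode_keypoint(heatmap):
    """Decode one keypoint from its heatmap: take the highest-response
    location, then shift a quarter pixel toward the neighbouring pixel
    with the higher response, per axis."""
    h, w = heatmap.shape
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    px, py = float(x), float(y)
    if 0 < x < w - 1:
        px += 0.25 * float(np.sign(heatmap[y, x + 1] - heatmap[y, x - 1]))
    if 0 < y < h - 1:
        py += 0.25 * float(np.sign(heatmap[y + 1, x] - heatmap[y - 1, x]))
    return px, py

hm = np.zeros((8, 8))
hm[3, 4] = 1.0
hm[3, 5] = 0.6   # a stronger right-hand neighbour pulls the estimate right
print(decode_keypoint(hm))  # -> (4.25, 3.0)
```

Because the heatmap is at 1/4 input resolution ([H/4, W/4]), the decoded coordinates would then be scaled back to the input image.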

Claims (1)

1. A human body keypoint detection method based on a dual-attention mechanism, characterized by comprising the following steps:
Step S1: acquire a human body keypoint detection dataset comprising a training dataset and a test dataset, where the data consist of pictures containing different human postures together with annotation files giving the ground-truth joint positions; the public datasets MPII and COCO2017 are used: the MPII human pose dataset contains 25k pictures and 40k human instances with 16 keypoints, with 28k instances for training and 12k for testing, and the COCO2017 keypoint detection dataset contains 200k pictures and 250k human instances with 17 keypoints, of which the training set train2017 comprises 58k pictures and 150k human instances, and the validation set val2017 and test set test2017 contain 5k and 20k pictures respectively;
Step S2: train the human body keypoint detection network through steps S21 to S24:
Step S21: preprocess the training and test datasets acquired in step S1; specifically, the network detects only the keypoints of a human body, so the human bodies in each picture are detected first: the MPII dataset uses the provided person boxes, while person boxes for the COCO dataset are obtained with Faster R-CNN; the height-to-width ratio of the person boxes in the MPII and COCO training data is fixed to 4:3, each box is cropped from the picture and resized to 256×256 and 256×192 respectively, and data augmentation is applied, including random rotation (-45°, 45°), random scaling (0.65, 1.35) and flipping;
Step S22: build the human body keypoint detection network, adding channel attention and spatial attention modules to the feature-extraction residual blocks; specifically, the network uses the high-resolution network (HRNet) as its overall framework, which connects high-to-low-resolution subnetworks in parallel and produces reliable high-resolution representations by repeatedly fusing them; the overall HRNet structure is divided into five stages: in the first stage, the input image is convolved twice by 3×3 convolutions with stride 2, so that the height (H) and width (W) of the image become H/4 and W/4 with 64 channels, and feature extraction is then performed by 4 improved Bottleneck residual blocks, where the improvement parallelizes the ResNet Bottleneck block by placing its 3×3 convolution layer in parallel with a 3×3 grouped convolution with 32 groups from ResNeXt and then adding channel attention and spatial attention; the channel attention module compresses the convolved feature map along the spatial dimension using max pooling and average pooling, yielding two spatial context descriptors Fc_avg and Fc_max, which are passed through a shared network composed of a multilayer perceptron and combined as Mc(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) to obtain the channel attention map Mc(F); the spatial attention module then applies max pooling and average pooling along the channel dimension of the channel-refined feature map, yielding two feature descriptors Fs_avg and Fs_max, which are merged by a concatenation operation, after which a convolution generates the spatial attention map Ms(F) = σ(f^{7×7}([AvgPool(F); MaxPool(F)])), where f^{7×7} denotes a 7×7 convolution; the improved module is called PRAB (Parallel Residual Attention Block); after this stage the feature map size is [H/4, W/4] and the channel number becomes 256; the second stage first changes the channel number of the feature map to 32 with a 3×3 convolution of stride 1 and at the same time generates a low-resolution branch from the previous stage with a 3×3 convolution of stride 2, giving a feature map of size [H/8, W/8] whose channel number is changed from 256 to 64; the two branches are then processed by 4 BasicBlocks each, with channel attention and spatial attention added to the BasicBlocks as in the first stage, followed by repeated multi-scale fusion: the high-resolution branch is downsampled to the low resolution by a 3×3 convolution with stride 2 and added to the low-resolution branch to form its output, while the low-resolution branch is upsampled to the high resolution by nearest-neighbor interpolation and added to the high-resolution branch to form its output, the two branches finally outputting feature maps of sizes [H/4, W/4, 32] and [H/8, W/8, 64]; the third stage takes the multi-scale-fused branches of the second stage as input and generates a new low-resolution branch [H/16, W/16, 128] from the [H/8, W/8, 64] branch, each branch is processed by 4 attention-augmented BasicBlocks and, as in the second stage, multi-scale fusion yields 3 branches [H/4, W/4, 32], [H/8, W/8, 64], [H/16, W/16, 128]; the fourth stage proceeds like the third and yields 4 branches [H/4, W/4, 32], [H/8, W/8, 64], [H/16, W/16, 128], [H/32, W/32, 256]; the fifth stage upsamples the 3 low-resolution branches, merges them with the [H/4, W/4, 32] branch, and applies a 1×1 convolution to obtain the final output, a heatmap of the keypoints, for example of size [H/4, W/4, 17] for the 17 keypoints of the COCO dataset;
Step S23: train the model on the training dataset processed in step S21 with the network of step S22 for a specified number of epochs; specifically, an Adam optimizer is used with an initial learning rate of 1e-3, the learning rate is reduced to 1e-4 at epoch 170 and to 1e-5 at epoch 200, and training stops at epoch 310;
Step S24: evaluate and save the model trained in step S23; specifically, the MPII dataset is evaluated with PCKh (head-normalized Probability of Correct Keypoints) and the COCO dataset with OKS (Object Keypoint Similarity), and the final model is saved after the network has been trained for the specified number of epochs;
Step S3: test the keypoint model trained in step S2 on the test dataset acquired in step S1; specifically, the test data are fed into the trained model, the heatmaps of the original image and its flipped copy are averaged to obtain the final predicted heatmap, and the predicted position of each keypoint is taken at a quarter offset from the highest heat value toward the second-highest.
CN202010284037.7A 2020-04-13 2020-04-13 Human body key point detection method based on double-attention mechanism Active CN111476184B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010284037.7A CN111476184B (en) 2020-04-13 2020-04-13 Human body key point detection method based on double-attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010284037.7A CN111476184B (en) 2020-04-13 2020-04-13 Human body key point detection method based on double-attention mechanism

Publications (2)

Publication Number Publication Date
CN111476184A true CN111476184A (en) 2020-07-31
CN111476184B CN111476184B (en) 2023-12-22

Family

ID=71752170

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010284037.7A Active CN111476184B (en) 2020-04-13 2020-04-13 Human body key point detection method based on double-attention mechanism

Country Status (1)

Country Link
CN (1) CN111476184B (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112084911A (en) * 2020-08-28 2020-12-15 安徽清新互联信息科技有限公司 Human face feature point positioning method and system based on global attention
CN112084928A (en) * 2020-09-04 2020-12-15 东南大学 Road traffic accident detection method based on visual attention mechanism and ConvLSTM network
CN112149613A (en) * 2020-10-12 2020-12-29 萱闱(北京)生物科技有限公司 Motion estimation evaluation method based on improved LSTM model
CN112149558A (en) * 2020-09-22 2020-12-29 驭势科技(南京)有限公司 Image processing method, network and electronic equipment for key point detection
CN112270213A (en) * 2020-10-12 2021-01-26 萱闱(北京)生物科技有限公司 Improved HRnet based on attention mechanism
CN112347865A (en) * 2020-10-21 2021-02-09 四川长虹电器股份有限公司 Bill correction method based on key point detection
CN112541409A (en) * 2020-11-30 2021-03-23 北京建筑大学 Attention-integrated residual network expression recognition method
CN112712015A (en) * 2020-12-28 2021-04-27 康佳集团股份有限公司 Human body key point identification method and device, intelligent terminal and storage medium
CN113011304A (en) * 2021-03-12 2021-06-22 山东大学 Human body posture estimation method and system based on attention multi-resolution network
CN113034545A (en) * 2021-03-26 2021-06-25 河海大学 Vehicle tracking method based on CenterNet multi-target tracking algorithm
CN113420641A (en) * 2021-06-21 2021-09-21 梅卡曼德(北京)机器人科技有限公司 Image data processing method, image data processing device, electronic equipment and storage medium
CN113469193A (en) * 2021-06-16 2021-10-01 中国科学院合肥物质科学研究院 Low-power-consumption pest image identification method based on addition multiplication mixed convolution
CN113920535A (en) * 2021-10-12 2022-01-11 广东电网有限责任公司广州供电局 Electronic region detection method based on YOLOv5
CN113947814A (en) * 2021-10-28 2022-01-18 山东大学 Cross-visual angle gait recognition method based on space-time information enhancement and multi-scale saliency feature extraction
CN114373226A (en) * 2021-12-31 2022-04-19 华南理工大学 Human body posture estimation method based on improved HRNet network in operating room scene
CN114373091A (en) * 2020-10-14 2022-04-19 南京工业大学 Gait recognition method based on deep learning fusion SVM
CN114998453A (en) * 2022-08-08 2022-09-02 国网浙江省电力有限公司宁波供电公司 Stereo matching model based on high-scale unit and application method thereof
CN115019338A (en) * 2022-04-27 2022-09-06 淮阴工学院 Multi-person posture estimation method and system based on GAMIHR-Net

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180349359A1 (en) * 2017-05-19 2018-12-06 salesforce.com, inc. Natural language processing using a neural network
CN110276316A (en) * 2019-06-26 2019-09-24 电子科技大学 Human body key point detection method based on deep learning

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180349359A1 (en) * 2017-05-19 2018-12-06 salesforce.com, inc. Natural language processing using a neural network
CN110678881A (en) * 2017-05-19 2020-01-10 易享信息技术有限公司 Natural language processing using context-specific word vectors
CN110276316A (en) * 2019-06-26 2019-09-24 电子科技大学 Human body key point detection method based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Ziniu; Wang Hongjie; Gao Jianling: "Text Classification Based on Semantic Reinforcement and Feature Fusion", Software *

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112084911A (en) * 2020-08-28 2020-12-15 安徽清新互联信息科技有限公司 Human face feature point positioning method and system based on global attention
CN112084911B (en) * 2020-08-28 2023-03-07 安徽清新互联信息科技有限公司 Human face feature point positioning method and system based on global attention
CN112084928A (en) * 2020-09-04 2020-12-15 东南大学 Road traffic accident detection method based on visual attention mechanism and ConvLSTM network
CN112149558A (en) * 2020-09-22 2020-12-29 驭势科技(南京)有限公司 Image processing method, network and electronic equipment for key point detection
CN112149613A (en) * 2020-10-12 2020-12-29 萱闱(北京)生物科技有限公司 Motion estimation evaluation method based on improved LSTM model
CN112270213A (en) * 2020-10-12 2021-01-26 萱闱(北京)生物科技有限公司 Improved HRnet based on attention mechanism
CN112149613B (en) * 2020-10-12 2024-01-05 萱闱(北京)生物科技有限公司 Action pre-estimation evaluation method based on improved LSTM model
CN114373091A (en) * 2020-10-14 2022-04-19 南京工业大学 Gait recognition method based on deep learning fusion SVM
CN112347865A (en) * 2020-10-21 2021-02-09 四川长虹电器股份有限公司 Bill correction method based on key point detection
CN112541409B (en) * 2020-11-30 2021-09-14 北京建筑大学 Attention-integrated residual network expression recognition method
CN112541409A (en) * 2020-11-30 2021-03-23 北京建筑大学 Attention-integrated residual network expression recognition method
CN112712015B (en) * 2020-12-28 2024-05-28 康佳集团股份有限公司 Human body key point identification method and device, intelligent terminal and storage medium
CN112712015A (en) * 2020-12-28 2021-04-27 康佳集团股份有限公司 Human body key point identification method and device, intelligent terminal and storage medium
CN113011304A (en) * 2021-03-12 2021-06-22 山东大学 Human body posture estimation method and system based on attention multi-resolution network
CN113034545A (en) * 2021-03-26 2021-06-25 河海大学 Vehicle tracking method based on CenterNet multi-target tracking algorithm
CN113469193A (en) * 2021-06-16 2021-10-01 中国科学院合肥物质科学研究院 Low-power-consumption pest image identification method based on addition multiplication mixed convolution
CN113469193B (en) * 2021-06-16 2023-08-22 中国科学院合肥物质科学研究院 Low-power consumption pest image identification method based on addition multiplication mixed convolution
CN113420641A (en) * 2021-06-21 2021-09-21 梅卡曼德(北京)机器人科技有限公司 Image data processing method, image data processing device, electronic equipment and storage medium
CN113420641B (en) * 2021-06-21 2024-06-14 梅卡曼德(北京)机器人科技有限公司 Image data processing method, device, electronic equipment and storage medium
CN113920535B (en) * 2021-10-12 2023-11-17 广东电网有限责任公司广州供电局 Electronic region detection method based on YOLOv5
CN113920535A (en) * 2021-10-12 2022-01-11 广东电网有限责任公司广州供电局 Electronic region detection method based on YOLOv5
CN113947814A (en) * 2021-10-28 2022-01-18 山东大学 Cross-view gait recognition method based on spatio-temporal information enhancement and multi-scale saliency feature extraction
CN113947814B (en) * 2021-10-28 2024-05-28 山东大学 Cross-view gait recognition method based on space-time information enhancement and multi-scale saliency feature extraction
CN114373226A (en) * 2021-12-31 2022-04-19 华南理工大学 Human body posture estimation method based on improved HRNet network in operating room scene
CN114373226B (en) * 2021-12-31 2024-09-06 华南理工大学 Human body posture estimation method based on improved HRNet network in operating room scene
CN115019338B (en) * 2022-04-27 2023-09-22 淮阴工学院 Multi-person posture estimation method and system based on GAMHR-Net
CN115019338A (en) * 2022-04-27 2022-09-06 淮阴工学院 Multi-person posture estimation method and system based on GAMIHR-Net
CN114998453A (en) * 2022-08-08 2022-09-02 国网浙江省电力有限公司宁波供电公司 Stereo matching model based on high-scale unit and application method thereof

Also Published As

Publication number Publication date
CN111476184B (en) 2023-12-22

Similar Documents

Publication Publication Date Title
CN111476184B (en) Human body key point detection method based on double-attention mechanism
CN110782462B (en) Semantic segmentation method based on double-flow feature fusion
CN110188768B (en) Real-time image semantic segmentation method and system
CN108062754B (en) Segmentation and identification method and device based on dense network image
CN111291739B (en) Face detection and image detection neural network training method, device and equipment
CN110929736A (en) Multi-feature cascade RGB-D significance target detection method
CN112884073B (en) Image rain removing method, system, terminal and storage medium
CN112232134B (en) Human body posture estimation method based on hourglass network and attention mechanism
CN113989283B (en) 3D human body posture estimation method and device, electronic equipment and storage medium
CN114529982A (en) Lightweight human body posture estimation method and system based on stream attention
CN116030537A (en) Three-dimensional human body posture estimation method based on multi-branch attention-seeking convolution
CN108229432A (en) Face calibration method and device
CN114255514A (en) Human body tracking system and method based on Transformer and camera device
CN113066089A (en) Real-time image semantic segmentation network based on attention guide mechanism
CN116092190A (en) Human body posture estimation method based on self-attention high-resolution network
CN115588116A (en) Pedestrian action identification method based on double-channel attention mechanism
CN103208109A (en) Local restriction iteration neighborhood embedding-based face hallucination method
Zhou et al. Towards locality similarity preserving to 3D human pose estimation
CN117115855A (en) Human body posture estimation method and system based on multi-scale transducer learning rich visual features
CN111401335A (en) Key point detection method and device and storage medium
WO2022252519A1 (en) Image processing method and apparatus, terminal, medium, and program
WO2021176985A1 (en) Signal processing device, signal processing method, and program
Huo et al. Deep high-resolution network with double attention residual blocks for human pose estimation
CN112528899B (en) Image salient object detection method and system based on implicit depth information recovery
CN113744255A (en) Skin mirror image segmentation method, segmentation network and segmentation network construction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant