CN112131959A - 2D human body posture estimation method based on multi-scale feature reinforcement - Google Patents

2D human body posture estimation method based on multi-scale feature reinforcement

Info

Publication number
CN112131959A
CN112131959A (application CN202010883889.8A)
Authority
CN
China
Prior art keywords
network
features
feature
human body
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010883889.8A
Other languages
Chinese (zh)
Other versions
CN112131959B (en)
Inventor
邵展鹏
刘鹏
胡超群
周小龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN202010883889.8A priority Critical patent/CN112131959B/en
Publication of CN112131959A publication Critical patent/CN112131959A/en
Application granted granted Critical
Publication of CN112131959B publication Critical patent/CN112131959B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A 2D human body posture estimation method based on multi-scale feature reinforcement comprises the following steps: 1) extracting features with high representation capability from the input picture, and performing cross-channel interaction between features of different scales through a split-attention module; 2) constructing a multi-stage prediction network on the obtained feature maps of different scales, and fusing the features of each stage by lateral and downward propagation, so that more spatial-resolution information is incorporated while semantic information is preserved; 3) constructing a high-resolution adjustment network to fine-tune the localization result of the multi-stage prediction network, upsampling the multi-stage features to the highest resolution through transposed convolution and then concatenating them, so as to locate the key points with large loss; 4) after the whole network structure is constructed, processing the input data and setting the training parameters. The invention improves the detection capability of the whole network for key points of different scales.

Description

2D human body posture estimation method based on multi-scale feature reinforcement
Technical Field
The invention relates to a human body posture estimation task in computer vision, in particular to a 2D human body posture estimation method based on multi-scale feature reinforcement.
Background
Human posture estimation, as the basis of many visual tasks such as action recognition, pose tracking and human-computer interaction, is one of the currently popular research fields. It has broad application prospects in virtual reality, intelligent surveillance, robotics and other areas. With the development of deep convolutional neural networks, many excellent solutions to the human posture estimation task have emerged. However, the scenes in which human bodies appear are complex and variable, the number of people in a picture varies, and mutual occlusion and self-occlusion occur easily. Different camera-to-body distances and viewing angles produce people of different sizes in the picture, and picture quality is easily affected by environmental factors such as illumination. Human posture estimation therefore remains a significant challenge that needs to be addressed.
In early research, human bodies were mainly modeled with hand-selected features and suitable models, most of them tree models or random forests; such traditional methods place high demands on image processing and have clear limitations in practical applications. With the application of deep architectures to human posture estimation, the performance of posture estimation has improved greatly. The current research focus is multi-person posture estimation, which faces more challenges and is closer to real scenes; the mainstream solutions are divided into top-down and bottom-up methods.
The bottom-up methods first detect all key points in the image and then assign the obtained key points to different individuals by clustering. Their advantage is that the processing time does not grow linearly with the number of people in the picture, at the cost of lower accuracy than top-down methods. Some researchers proposed part affinity fields, modeling the relationship between two key points as a two-dimensional vector field, which largely avoids wrong connections of key points across different human bodies. The top-down methods first detect the human bodies in the picture and then predict key points for each detected body; they therefore need to address the challenges of single-person posture estimation while also facing inaccurate and duplicated human proposals. Some methods divide the human key points into two classes that are handled separately: key points that are easy to detect are first located by a global positioning network, and key points that are hard to detect are then located by a cascaded network. However, current networks still cannot localize human bodies of different sizes well, because too much semantic and resolution information is lost during propagation.
Disclosure of Invention
In order to solve the problems of existing human body posture estimation methods, the invention provides a 2D human body posture estimation method based on multi-scale feature reinforcement. Because the network loses considerable spatial information during propagation, the invention upsamples the features of different scales by transposed convolution, concatenates the features of the four stages, fine-tunes the localization result of the multi-stage prediction network, fuses the results of the two stages and outputs the final localization result.
The technical scheme adopted by the invention for solving the technical problems is as follows:
A 2D human body posture estimation method based on multi-scale feature reinforcement comprises the following steps:
1) obtaining abstract features with high representation capacity:
inputting the preprocessed pictures into a ResNeSt backbone network, performing cross-channel interaction on features of different scales through a split-attention module, removing the final classification layer, and outputting the features of four stages;
2) constructing a multi-stage prediction network:
acquiring four features with different resolutions through step 1), constructing a feature-enhanced feature pyramid from the features of the four stages, and, because the top-level features lose more semantic information during propagation, performing fusion enhancement on the high-level features with a feature enhancement strategy;
3) constructing a high-resolution adjusting network:
constructing a high-resolution adjustment network to adjust the positions of the key points with large prediction loss in the previous stage, upsampling the features in the multi-stage prediction network through transposed convolution so that upsampling and convolution are combined well, and concatenating the expanded features so that richer spatial details are introduced for the key points of smaller scale;
4) training setting of the whole network:
setting all input pictures to a 4:3 aspect ratio, then using a human detector to obtain the human body instances in each picture, setting the size of each input instance to 384 × 288, and using an MSE loss function for gradient back-propagation of the errors during training; the initial learning rate of the network is set to 5e-4, the weight decay to 1e-5, an Adam optimizer is used, the learning rate is halved after every 6 training epochs, and 20 epochs are trained in total.
Further, in step 1), considering that features with strong representation capability are crucial to the final localization result, the feature extraction network ResNeSt, designed for pixel-level vision tasks, is used, and cross-channel interaction is performed on features of different scales through the split-attention module;

the feature map is first divided into K cardinal groups, each cardinal group is further divided into R splits, so that the total number of feature groups is G = KR, and a separate transformation is applied to the features of each group; the intermediate representation of the i-th group is

$$U_i = \mathcal{F}_i(X), \quad i \in \{1, 2, \dots, G\},$$

where $\{\mathcal{F}_1, \mathcal{F}_2, \dots, \mathcal{F}_G\}$ denote the different transformation functions and G denotes the total number of feature groups; the input to the k-th cardinal group is

$$\hat{U}^k = \sum_{j=R(k-1)+1}^{Rk} U_j,$$

where, for $k \in \{1, 2, \dots, K\}$, $\hat{U}^k \in \mathbb{R}^{H \times W \times C/K}$, and H, W and C represent the height, width and number of channels of the feature map, respectively.
Furthermore, in step 2), after the four features with different resolutions are obtained through the backbone network, a multi-stage prediction network with a pyramid structure is constructed to maintain semantic information and spatial-resolution information at different scales; because the top-level features are reduced in dimension by a 1 × 1 convolution kernel, more semantic information is lost, which directly causes semantic information loss at every layer; a feature enhancement module is therefore used to enhance the top-level features effectively, which effectively improves the representation capability of the whole multi-stage prediction network;
predictions are then made on each level of the feature network separately: a 1 × 1 convolution first eliminates the aliasing effect caused by feature superposition, a BN (batch normalization) layer then normalizes the features, a ReLU activation is applied, a 3 × 3 convolution reduces the 256-dimensional features to the finally required 17 dimensions, and the obtained heat maps are finally upsampled to the output size and normalized again, which effectively improves the generalization capability of the model.
Furthermore, in step 3), after global localization is performed according to the method in step 2), some small or occluded key points still have large detection errors; a high-resolution fine-tuning network is therefore constructed to integrate features of different scales, the feature maps in the multi-stage prediction network are refined by several bottleneck modules, and transposed convolution layers applied different numbers of times then upsample them to the output size;
four high-resolution features of the same size are obtained through transposed convolution, each feature is processed by scale normalization and a ReLU function, the features are concatenated along the first dimension, a convolution kernel of size 3 × 3 then makes the final prediction on them, and the output result is scale-normalized and output; in order to prevent the key-point positions of larger human bodies from being disturbed while smaller targets are corrected, only the positions of key points with larger loss values are corrected during the gradient back-propagation of network training.
The technical concept of the invention is as follows: a backbone network is used to obtain features with high expression capability, a feature-enhanced multi-stage prediction network is built on these features to perform a preliminary localization of all key points, a high-resolution adjustment network is then constructed, more spatial context information is introduced into the feature maps through transposed convolution and concatenation, and the positions of key points with larger errors are adjusted. Finally, the outputs of the two stages are fused to obtain the final localization result.
The invention has the following beneficial effects: the ResNeSt backbone network is applied to the human body posture estimation task; a multi-stage prediction network is constructed on the obtained features, and a feature reinforcement strategy counters the loss during feature propagation, which effectively guarantees the excellent performance of the multi-stage prediction network; a high-resolution fine-tuning network is constructed for key points with large errors, with transposed convolution effectively combining upsampling and convolution, which improves the detection capability of the whole network for key points of different scales. The prediction results of the two stages are integrated, giving better performance and a degree of robustness for key-point prediction in different scenes.
Drawings
FIG. 1 is a schematic overall flow diagram of the present invention;
FIG. 2 is a diagram of the network architecture of the present invention;
FIG. 3 is a block diagram of a feature extraction network;
FIG. 4 is a schematic flow diagram of the feature enhancement strategy.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1 to 4, a 2D human body posture estimation method based on multi-scale feature reinforcement includes the following steps:
1) extracting features with high representation capability for an input picture:
in the invention, features with strong representation capability are considered crucial to the final localization result, so the feature extraction network ResNeSt, designed for pixel-level vision tasks, is used; as shown in FIG. 3, cross-channel interaction is performed on features of different scales through the split-attention module;

the feature map is first divided into K cardinal groups, and each cardinal group is further divided into R splits, so that the total number of feature groups is G = KR; a separate transformation is applied to the features of each group, and the intermediate representation of the i-th group is

$$U_i = \mathcal{F}_i(X), \quad i \in \{1, 2, \dots, G\},$$

where $\{\mathcal{F}_1, \mathcal{F}_2, \dots, \mathcal{F}_G\}$ denote the different transformation functions and G denotes the total number of feature groups. The input to the k-th cardinal group is the sum of its splits:

$$\hat{U}^k = \sum_{j=R(k-1)+1}^{Rk} U_j,$$

where, for $k \in \{1, 2, \dots, K\}$, $\hat{U}^k \in \mathbb{R}^{H \times W \times C/K}$, and H, W and C represent the height, width and number of channels of the feature map, respectively. The channel-wise global context statistics $s^k \in \mathbb{R}^{C/K}$ are obtained by global average pooling over the spatial dimensions:

$$s_c^k = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} \hat{U}_c^k(i, j).$$

Each channel of the feature map is generated by a weighted combination after the split-attention module; the c-th channel is computed as

$$V_c^k = \sum_{i=1}^{R} a_i^k(c)\, U_{R(k-1)+i},$$

where $a_i^k(c)$ is the weight obtained from the softmax layer:

$$a_i^k(c) = \frac{\exp\big(\mathcal{G}_i^c(s^k)\big)}{\sum_{j=1}^{R} \exp\big(\mathcal{G}_j^c(s^k)\big)},$$

with $s^k$ representing the global context information and the mapping $\mathcal{G}_i^c$ determining the weight of each channel from it. The outputs of the cardinal groups are then concatenated along the channel dimension, i.e. $V = \mathrm{Concat}\{V^1, V^2, \dots, V^K\}$, and the output of each module is

$$Y = V + T(X),$$

where V denotes the concatenated output of the cardinal groups and T(X) denotes the output of the skip connection.
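For illustration, the split-attention computation above can be summarized in a short PyTorch sketch. This is a minimal sketch under stated assumptions, not the patent's implementation: the class name `SplitAttention`, the default `radix` (R) and `cardinality` (K) values, the 3 × 3 grouped convolution and the channel-reduction factor are all illustrative choices, and a full ResNeSt block contains additional layers.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SplitAttention(nn.Module):
    """Minimal split-attention block: R splits per cardinal group (K groups)."""
    def __init__(self, channels, radix=2, cardinality=1, reduction=4):
        super().__init__()
        self.radix = radix
        inter = max(channels * radix // reduction, 32)
        # one grouped convolution produces all R splits at once (G = K * R groups)
        self.conv = nn.Conv2d(channels, channels * radix, kernel_size=3, padding=1,
                              groups=cardinality * radix, bias=False)
        self.bn = nn.BatchNorm2d(channels * radix)
        # attention branch: global context s^k -> per-split, per-channel logits G_i^c(s^k)
        self.fc1 = nn.Conv2d(channels, inter, kernel_size=1, groups=cardinality)
        self.fc2 = nn.Conv2d(inter, channels * radix, kernel_size=1, groups=cardinality)

    def forward(self, x):
        b, c = x.shape[0], x.shape[1]
        u = F.relu(self.bn(self.conv(x)))                        # U_1 ... U_G
        splits = u.view(b, self.radix, c, *u.shape[2:])          # (B, R, C, H, W)
        gap = splits.sum(dim=1).mean(dim=(2, 3), keepdim=True)   # \hat{U}^k, then s^k by pooling
        attn = self.fc2(F.relu(self.fc1(gap)))                   # logits for each split/channel
        attn = F.softmax(attn.view(b, self.radix, c, 1, 1), dim=1)  # weights a_i^k(c)
        v = (attn * splits).sum(dim=1)                           # weighted combination V
        return v + x                                             # Y = V + skip connection
```

With the default cardinality of 1, the softmax over the radix dimension realizes exactly the weights $a_i^k(c)$ in the formulas above; a call such as `SplitAttention(256)(torch.randn(1, 256, 96, 72))` returns a tensor of the same shape as its input.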
2) constructing a multi-stage prediction network and performing a preliminary localization of all key points, specifically as follows:
after the four features with different resolutions are obtained through the backbone network, a multi-stage prediction network with a pyramid structure is constructed to maintain semantic information and spatial-resolution information at different scales. Because the top-level features are reduced in dimension by a 1 × 1 convolution kernel, more semantic information is lost, which directly causes semantic information loss at every layer. The invention therefore uses a feature enhancement module, as shown in FIG. 4, to enhance the top-level features effectively and improve the representation capability of the whole multi-stage prediction network. The top-level feature is first passed through spatially adaptive pooling to obtain features at three resolutions, each with 256 dimensions; the three feature maps are then upsampled to the original size and combined by weighted fusion, giving a feature with reduced dimensionality and unchanged resolution, which is finally fused with the original feature, as shown in the network structure of FIG. 2.
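One possible form of this feature enhancement module is sketched below. It is only an interpretation of the description above: the pooling ratios, the per-pixel softmax weighting and the names `FeatureEnhancement` and `pool_ratios` are assumptions, not values given in the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureEnhancement(nn.Module):
    """Sketch of top-level feature enhancement: adaptive pooling to three scales,
    weighted fusion back at the original resolution, then a residual add."""
    def __init__(self, channels=256, pool_ratios=(0.1, 0.2, 0.3)):
        super().__init__()
        self.pool_ratios = pool_ratios
        # 1x1 convolutions keep every pooled branch at 256 channels
        self.reduce = nn.ModuleList(nn.Conv2d(channels, channels, 1) for _ in pool_ratios)
        # per-pixel fusion weights for the three upsampled branches
        self.weight = nn.Conv2d(channels * len(pool_ratios), len(pool_ratios), 1)

    def forward(self, top):                        # top: (B, 256, H, W) top-level feature
        h, w = top.shape[2:]
        branches = []
        for ratio, conv in zip(self.pool_ratios, self.reduce):
            size = (max(1, int(h * ratio)), max(1, int(w * ratio)))
            p = conv(F.adaptive_avg_pool2d(top, size))          # spatially adaptive pooling
            branches.append(F.interpolate(p, size=(h, w), mode='bilinear',
                                          align_corners=False))  # back to original size
        w_map = torch.softmax(self.weight(torch.cat(branches, dim=1)), dim=1)
        fused = sum(w_map[:, i:i + 1] * branches[i] for i in range(len(branches)))
        return top + fused                         # fuse with the original top-level feature
```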
predictions are then made on each level of the feature network separately: a 1 × 1 convolution first eliminates the aliasing effect caused by feature superposition, a BN (batch normalization) layer then normalizes the features, a ReLU activation is applied, a 3 × 3 convolution reduces the 256-dimensional features to the finally required 17 dimensions (the number of human key points), and the obtained heat maps are finally upsampled to the output size and normalized again, which effectively improves the generalization capability of the model.
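The per-level prediction head described in this paragraph could look roughly like the following sketch. The heatmap output size of 96 × 72 (a quarter of the 384 × 288 input) and the use of a second batch-normalization layer for the final "normalization" step are assumptions made here for concreteness.

```python
import torch.nn as nn
import torch.nn.functional as F

class HeatmapHead(nn.Module):
    """Per-level head: 1x1 conv -> BN -> ReLU -> 3x3 conv to 17 joints -> upsample -> BN."""
    def __init__(self, in_ch=256, num_joints=17, out_size=(96, 72)):
        super().__init__()
        self.out_size = out_size
        self.smooth = nn.Conv2d(in_ch, in_ch, kernel_size=1)   # removes aliasing after fusion
        self.bn = nn.BatchNorm2d(in_ch)
        self.predict = nn.Conv2d(in_ch, num_joints, kernel_size=3, padding=1)
        self.bn_out = nn.BatchNorm2d(num_joints)               # final normalization (assumed BN)

    def forward(self, x):
        x = F.relu(self.bn(self.smooth(x)))
        heat = self.predict(x)                                 # 256 -> 17 heatmap channels
        heat = F.interpolate(heat, size=self.out_size,
                             mode='bilinear', align_corners=False)
        return self.bn_out(heat)
```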
3) constructing a high-resolution fine-tuning network and further adjusting the positions of key points of smaller scale, with the specific steps as follows:
after global localization is performed according to the method in step 2, some small or occluded key points still have large detection errors. As shown in FIG. 2, the invention constructs a high-resolution fine-tuning network that integrates features of different scales. The feature maps in the multi-stage prediction network are refined by several bottleneck modules and then upsampled to the output size by transposed convolution layers applied different numbers of times. The input and output channels of the transposed convolutions are 256-dimensional, the convolution kernel size is set to 4 × 4, the stride is 2, and the padding is 1.
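Under the parameters given above (256 channels, 4 × 4 kernels, stride 2, padding 1), one branch of the fine-tuning network could be sketched as follows. The number of bottleneck modules per branch and the reuse of torchvision's `Bottleneck` are assumptions made purely for illustration.

```python
import torch.nn as nn
from torchvision.models.resnet import Bottleneck

def upsample_branch(num_up, channels=256, num_bottlenecks=1):
    """Bottleneck refinement followed by `num_up` transposed convolutions
    (4x4 kernel, stride 2, padding 1), each of which doubles the spatial resolution."""
    layers = [Bottleneck(channels, channels // 4) for _ in range(num_bottlenecks)]
    for _ in range(num_up):
        layers += [
            nn.ConvTranspose2d(channels, channels, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        ]
    return nn.Sequential(*layers)

# The four pyramid levels need 0, 1, 2 and 3 doublings to reach the highest resolution.
branches = nn.ModuleList(upsample_branch(n) for n in range(4))
```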
obtaining four high-resolution features with the same size through transposition convolution, respectively carrying out scale normalization and ReLU function processing on each feature, cascading the features together according to a first dimension, then carrying out final prediction on the features by using a convolution core with the size of 3 multiplied by 3, and carrying out scale normalization on an output result and then outputting the result. In order to prevent the position of a key point of a larger human body from being interfered while a smaller target is modified, only the position of the key point with a larger loss value is modified in the gradient return process of network training;
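The concatenation, the final 3 × 3 prediction and the "only correct the hard key points" rule might be sketched as follows, continuing the previous snippet. The number of hard key points kept per sample (`topk`) is not specified in the patent and is a placeholder.

```python
import torch
import torch.nn as nn

# Fusion head: 3x3 convolution on the concatenated branches, then scale normalization.
fuse = nn.Sequential(
    nn.Conv2d(256 * 4, 17, kernel_size=3, padding=1),
    nn.BatchNorm2d(17),
)

def refine_forward(features):
    """`features` are the four multi-stage feature maps; each branch ends in BN + ReLU."""
    ups = [branch(f) for branch, f in zip(branches, features)]  # four same-size features
    return fuse(torch.cat(ups, dim=1))                          # concatenate along channels

def refine_loss(pred, target, topk=8):
    """Per-joint MSE; gradient is kept only for the `topk` joints with the largest loss."""
    per_joint = ((pred - target) ** 2).mean(dim=(2, 3))         # (B, 17)
    _, hard = per_joint.topk(min(topk, per_joint.shape[1]), dim=1)
    mask = torch.zeros_like(per_joint).scatter_(1, hard, 1.0)
    return (per_joint * mask).sum() / mask.sum()
```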
4) after the whole network structure is constructed, the input data of the network structure needs to be processed and parameters need to be set, and the steps are as follows:
all input pictures are set to a 4:3 aspect ratio, a human detector is then used to obtain the human body instances in each picture, the size of each input instance is set to 384 × 288, and an MSE loss function is used for gradient back-propagation of the errors during training. The initial learning rate of the network is set to 5e-4, the weight decay to 1e-5, an Adam optimizer is used, the learning rate is halved after every 6 training epochs, and 20 epochs are trained in total.
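These settings translate into a training loop along the following lines; this is a sketch only, in which `train_loader` is assumed to yield 384 × 288 person crops together with their 17 target heatmaps (dataset construction and the detector are outside the snippet).

```python
from torch import nn, optim

def train(model: nn.Module, train_loader, epochs: int = 20):
    """Training loop with the settings above: MSE loss, Adam, lr 5e-4, weight decay 1e-5."""
    criterion = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr=5e-4, weight_decay=1e-5)
    scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=6, gamma=0.5)  # halve every 6 epochs
    for _ in range(epochs):
        for images, target_heatmaps in train_loader:            # images: (B, 3, 384, 288)
            optimizer.zero_grad()
            loss = criterion(model(images), target_heatmaps)    # MSE between heatmaps
            loss.backward()                                     # gradient back-propagation
            optimizer.step()
        scheduler.step()
```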
through the operation of the steps, the 2D human body posture estimation with strengthened characteristics can be realized.
The objects, technical solutions and advantages of the present invention are further described in detail with reference to the detailed description illustrated in the drawings, it should be understood that the above description is only an exemplary embodiment of the present invention, and is not intended to limit the scope of the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (4)

1. A 2D human body posture estimation method based on multi-scale feature reinforcement, characterized by comprising the following steps:
1) obtaining abstract features with high representation capacity:
inputting the preprocessed pictures into a ResNeSt backbone network, performing cross-channel interaction on features of different scales through a split-attention module, removing the final classification layer, and outputting the features of four stages;
2) constructing a multi-stage prediction network:
acquiring four features with different resolutions through step 1), constructing a feature-enhanced feature pyramid from the features of the four stages, and, because the top-level features lose more semantic information during propagation, performing fusion enhancement on the high-level features with a feature enhancement strategy;
3) constructing a high-resolution adjusting network:
constructing a high-resolution adjustment network to adjust the positions of the key points with large prediction loss in the previous stage, upsampling the features in the multi-stage prediction network through transposed convolution so that upsampling and convolution are combined well, and concatenating the expanded features so that richer spatial details are introduced for the key points of smaller scale;
4) training setting of the whole network:
setting all input pictures to a 4:3 aspect ratio, then using a human detector to obtain the human body instances in each picture, setting the size of each input instance to 384 × 288, and using an MSE loss function for gradient back-propagation of the errors during training; the initial learning rate of the network is set to 5e-4, the weight decay to 1e-5, an Adam optimizer is used, the learning rate is halved after every 6 training epochs, and 20 epochs are trained in total.
2. The 2D human body posture estimation method based on multi-scale feature reinforcement according to claim 1, wherein in step 1), considering that features with strong characterization capability are crucial to the final localization result, the feature extraction network ResNeSt, designed for pixel-level vision tasks, is used, and cross-channel interaction is performed on features of different scales through the split-attention module;

the feature map is first divided into K cardinal groups, each cardinal group is further divided into R splits, so that the total number of feature groups is G = KR, and a separate transformation is applied to the features of each group; the intermediate representation of the i-th group is

$$U_i = \mathcal{F}_i(X), \quad i \in \{1, 2, \dots, G\},$$

where $\{\mathcal{F}_1, \mathcal{F}_2, \dots, \mathcal{F}_G\}$ denote the different transformation functions and G denotes the total number of feature groups; the input to the k-th cardinal group is

$$\hat{U}^k = \sum_{j=R(k-1)+1}^{Rk} U_j,$$

where, for $k \in \{1, 2, \dots, K\}$, $\hat{U}^k \in \mathbb{R}^{H \times W \times C/K}$, and H, W and C represent the height, width and number of channels of the feature map, respectively.
3. The 2D human body posture estimation method based on multi-scale feature reinforcement according to claim 1 or 2, characterized in that in step 2), after the four features with different resolutions are obtained through the backbone network, a multi-stage prediction network with a pyramid structure is constructed to maintain semantic information and spatial-resolution information at different scales; because the top-level features are reduced in dimension by a 1 × 1 convolution kernel, more semantic information is lost, which directly causes semantic information loss at every layer; a feature enhancement module is used to enhance the top-level features effectively, which effectively improves the representation capability of the whole multi-stage prediction network;
predictions are then made on each level of the feature network separately: a 1 × 1 convolution first eliminates the aliasing effect caused by feature superposition, a BN (batch normalization) layer then normalizes the features, a ReLU activation is applied, a 3 × 3 convolution reduces the 256-dimensional features to the finally required 17 dimensions, and the obtained heat maps are finally upsampled to the output size and normalized again, which effectively improves the generalization capability of the model.
4. The 2D human body posture estimation method based on multi-scale feature reinforcement according to claim 1 or 2, characterized in that in step 3), after global localization is performed according to the method in step 2), some small key points still have large detection errors; a high-resolution fine-tuning network is constructed to integrate features of different scales, the feature maps in the multi-stage prediction network are refined by several bottleneck modules, and transposed convolution layers applied different numbers of times then upsample them to the output size;
four high-resolution features of the same size are obtained through transposed convolution, each feature is processed by scale normalization and a ReLU function, the features are concatenated along the first dimension, a convolution kernel of size 3 × 3 then makes the final prediction on them, and the output result is scale-normalized and output; in order to prevent the key-point positions of larger human bodies from being disturbed while smaller targets are corrected, only the positions of key points with larger loss values are corrected during the gradient back-propagation of network training.
CN202010883889.8A 2020-08-28 2020-08-28 2D human body posture estimation method based on multi-scale feature reinforcement Active CN112131959B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010883889.8A CN112131959B (en) 2020-08-28 2020-08-28 2D human body posture estimation method based on multi-scale feature reinforcement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010883889.8A CN112131959B (en) 2020-08-28 2020-08-28 2D human body posture estimation method based on multi-scale feature reinforcement

Publications (2)

Publication Number Publication Date
CN112131959A (en) 2020-12-25
CN112131959B (en) 2024-03-22

Family

ID=73847628

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010883889.8A Active CN112131959B (en) 2020-08-28 2020-08-28 2D human body posture estimation method based on multi-scale feature reinforcement

Country Status (1)

Country Link
CN (1) CN112131959B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112837367A (en) * 2021-01-27 2021-05-25 清华大学 Semantic decomposition type object pose estimation method and system
CN113221787A (en) * 2021-05-18 2021-08-06 西安电子科技大学 Pedestrian multi-target tracking method based on multivariate difference fusion
CN113221626A (en) * 2021-03-04 2021-08-06 北京联合大学 Human body posture estimation method based on Non-local high-resolution network
CN113284146A (en) * 2021-07-23 2021-08-20 天津御锦人工智能医疗科技有限公司 Colorectal polyp image recognition method and device and storage medium
CN113792641A (en) * 2021-09-08 2021-12-14 南京航空航天大学 High-resolution lightweight human body posture estimation method combined with multispectral attention mechanism
CN114155560A (en) * 2022-02-08 2022-03-08 成都考拉悠然科技有限公司 Light weight method of high-resolution human body posture estimation model based on space dimension reduction
CN117456562A (en) * 2023-12-25 2024-01-26 深圳须弥云图空间科技有限公司 Attitude estimation method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229445A (en) * 2018-02-09 2018-06-29 深圳市唯特视科技有限公司 A kind of more people's Attitude estimation methods based on cascade pyramid network
CN108710830A (en) * 2018-04-20 2018-10-26 浙江工商大学 A kind of intensive human body 3D posture estimation methods for connecting attention pyramid residual error network and equidistantly limiting of combination
CN110135375A (en) * 2019-05-20 2019-08-16 中国科学院宁波材料技术与工程研究所 More people's Attitude estimation methods based on global information integration
CN110276316A (en) * 2019-06-26 2019-09-24 电子科技大学 A kind of human body critical point detection method based on deep learning
CN110659565A (en) * 2019-08-15 2020-01-07 电子科技大学 3D multi-person human body posture estimation method based on porous convolution

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229445A (en) * 2018-02-09 2018-06-29 深圳市唯特视科技有限公司 A kind of more people's Attitude estimation methods based on cascade pyramid network
CN108710830A (en) * 2018-04-20 2018-10-26 浙江工商大学 A kind of intensive human body 3D posture estimation methods for connecting attention pyramid residual error network and equidistantly limiting of combination
CN110135375A (en) * 2019-05-20 2019-08-16 中国科学院宁波材料技术与工程研究所 More people's Attitude estimation methods based on global information integration
CN110276316A (en) * 2019-06-26 2019-09-24 电子科技大学 A kind of human body critical point detection method based on deep learning
CN110659565A (en) * 2019-08-15 2020-01-07 电子科技大学 3D multi-person human body posture estimation method based on porous convolution

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112837367B (en) * 2021-01-27 2022-11-25 清华大学 Semantic decomposition type object pose estimation method and system
CN112837367A (en) * 2021-01-27 2021-05-25 清华大学 Semantic decomposition type object pose estimation method and system
CN113221626A (en) * 2021-03-04 2021-08-06 北京联合大学 Human body posture estimation method based on Non-local high-resolution network
CN113221626B (en) * 2021-03-04 2023-10-20 北京联合大学 Human body posture estimation method based on Non-local high-resolution network
CN113221787A (en) * 2021-05-18 2021-08-06 西安电子科技大学 Pedestrian multi-target tracking method based on multivariate difference fusion
CN113221787B (en) * 2021-05-18 2023-09-29 西安电子科技大学 Pedestrian multi-target tracking method based on multi-element difference fusion
CN113284146A (en) * 2021-07-23 2021-08-20 天津御锦人工智能医疗科技有限公司 Colorectal polyp image recognition method and device and storage medium
CN113284146B (en) * 2021-07-23 2021-10-22 天津御锦人工智能医疗科技有限公司 Colorectal polyp image recognition method and device and storage medium
CN113792641A (en) * 2021-09-08 2021-12-14 南京航空航天大学 High-resolution lightweight human body posture estimation method combined with multispectral attention mechanism
CN113792641B (en) * 2021-09-08 2024-05-03 南京航空航天大学 High-resolution lightweight human body posture estimation method combined with multispectral attention mechanism
CN114155560A (en) * 2022-02-08 2022-03-08 成都考拉悠然科技有限公司 Light weight method of high-resolution human body posture estimation model based on space dimension reduction
CN117456562A (en) * 2023-12-25 2024-01-26 深圳须弥云图空间科技有限公司 Attitude estimation method and device
CN117456562B (en) * 2023-12-25 2024-04-12 深圳须弥云图空间科技有限公司 Attitude estimation method and device

Also Published As

Publication number Publication date
CN112131959B (en) 2024-03-22

Similar Documents

Publication Publication Date Title
CN112131959A (en) 2D human body posture estimation method based on multi-scale feature reinforcement
CN113033570B (en) Image semantic segmentation method for improving void convolution and multilevel characteristic information fusion
CN114202672A (en) Small target detection method based on attention mechanism
CN112396607B (en) Deformable convolution fusion enhanced street view image semantic segmentation method
CN112288011B (en) Image matching method based on self-attention deep neural network
CN110210551A (en) A kind of visual target tracking method based on adaptive main body sensitivity
CN110929578A (en) Anti-blocking pedestrian detection method based on attention mechanism
CN108427921A (en) A kind of face identification method based on convolutional neural networks
CN109558862A (en) The people counting method and system of attention refinement frame based on spatial perception
CN116452937A (en) Multi-mode characteristic target detection method based on dynamic convolution and attention mechanism
CN114119975A (en) Language-guided cross-modal instance segmentation method
CN112784756B (en) Human body identification tracking method
CN113792641A (en) High-resolution lightweight human body posture estimation method combined with multispectral attention mechanism
CN116863539A (en) Fall figure target detection method based on optimized YOLOv8s network structure
CN115222998B (en) Image classification method
CN111860124A (en) Remote sensing image classification method based on space spectrum capsule generation countermeasure network
CN116863194A (en) Foot ulcer image classification method, system, equipment and medium
CN113344110A (en) Fuzzy image classification method based on super-resolution reconstruction
CN116309632A (en) Three-dimensional liver semantic segmentation method based on multi-scale cascade feature attention strategy
CN116092190A (en) Human body posture estimation method based on self-attention high-resolution network
CN116486080A (en) Lightweight image semantic segmentation method based on deep learning
CN113850182B (en) DAMR _ DNet-based action recognition method
CN113780140A (en) Gesture image segmentation and recognition method and device based on deep learning
CN117011655A (en) Adaptive region selection feature fusion based method, target tracking method and system
CN114863133A (en) Flotation froth image feature point extraction method based on multitask unsupervised algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant