CN112131959A - 2D human body posture estimation method based on multi-scale feature reinforcement - Google Patents
2D human body posture estimation method based on multi-scale feature reinforcement
- Publication number
- CN112131959A (application CN202010883889.8A)
- Authority
- CN
- China
- Prior art keywords
- network
- features
- feature
- human body
- convolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
A 2D human body posture estimation method based on multi-scale feature reinforcement comprises the following steps: 1) extracting features with high representation capability from the input picture, and performing cross-channel interaction on features of different scales through a split-attention module; 2) constructing a multi-stage prediction network on the obtained feature maps of different scales, and fusing the features of each stage through lateral and top-down propagation, so that more spatial-resolution information is preserved while semantic information is retained; 3) constructing a high-resolution adjustment network to fine-tune the localization results of the multi-stage prediction network: the multi-stage features are upsampled to the maximum resolution through transposed convolution and then concatenated, so as to re-localize keypoints with large loss; 4) after the whole network structure is constructed, processing the input data and setting the training parameters. The invention improves the detection capability of the whole network for keypoints of different scales.
Description
Technical Field
The invention relates to a human body posture estimation task in computer vision, in particular to a 2D human body posture estimation method based on multi-scale feature reinforcement.
Background
As the basis of many visual tasks such as action recognition, pose tracking and human-computer interaction, human posture estimation is currently a popular research field, with broad application prospects in virtual reality, intelligent surveillance, robotics and other areas. With the development of deep convolutional neural networks, many excellent solutions to the human posture estimation task have emerged. However, the scenes in which human bodies appear are complex and variable, the number of people in a picture varies, and mutual occlusion and self-occlusion occur easily. The distance between the camera and the human body and different viewing angles lead to different sizes of people in the picture, and picture quality is easily affected by environmental factors such as illumination. Human posture estimation therefore remains a significant challenge.
Early research modeled the human body mainly with manually selected features and suitable models, mostly tree models and random forests; such traditional methods place high demands on image processing and have certain limitations in practical applications. With the application of deep architectures to human posture estimation, performance has improved greatly. The current research focus is multi-person posture estimation, which faces more challenges and is closer to real scenes; mainstream solutions are divided into top-down and bottom-up methods.
Bottom-up methods first detect all keypoints in the image and then assign the obtained keypoints to different individuals by clustering. The advantage of this approach is that processing time does not grow linearly with the number of people in the picture, at the cost of lower accuracy than top-down methods. Some researchers proposed part affinity fields, modeling the relationship between two keypoints as a two-dimensional vector, which effectively avoids wrongly connecting keypoints across different human bodies. Top-down methods first detect the human bodies in the picture and then predict keypoints for each detected body; they must therefore solve the challenges of single-person posture estimation while also coping with inaccurate and duplicate human-body proposals. Some methods divide keypoints into two classes processed separately: keypoints that are easy to detect are first localized by a global positioning network, and hard keypoints are then localized by a cascaded network. However, because much semantic and resolution information is lost during propagation, current networks still cannot localize human bodies of different sizes well.
Disclosure of Invention
To solve the problems of existing human body posture estimation methods, the invention provides a 2D human body posture estimation method based on multi-scale feature reinforcement. Since the network loses considerable spatial information during propagation, the invention upsamples features of different scales through transposed convolution, concatenates the features of the four stages, fine-tunes the localization results of the multi-stage prediction network, fuses the results of the two stages, and outputs the final localization result.
The technical scheme adopted by the invention for solving the technical problems is as follows:
A 2D human body posture estimation method based on multi-scale feature reinforcement comprises the following steps:
1) obtaining abstract features with high representation capability:
inputting the preprocessed pictures into a ResNeSt backbone network, performing cross-channel interaction on features of different scales through a split-attention module, removing the final classification layer, and outputting the features of four stages;
2) constructing a multi-stage prediction network:
four features with different resolutions are obtained through step 1); a feature-enhanced feature pyramid is constructed on the features of the four stages, and because the top-level features lose more semantic information during propagation, a feature-enhancement strategy is used to fuse and enhance the high-level features;
3) constructing a high-resolution adjustment network:
a high-resolution adjustment network is constructed to adjust the positions of keypoints with large prediction loss in the previous stage; the features in the multi-stage prediction network are upsampled through transposed convolution, which combines the upsampling and convolution operations, and the enlarged features are concatenated, introducing richer spatial details for small-scale keypoints;
4) training settings of the whole network:
all input pictures are set to a 4:3 aspect ratio, a human detector is then used to obtain the human body instances in each picture, the size of each input instance is set to 384 × 288, and an MSE loss function is used to back-propagate the errors during training; the initial learning rate of the network is set to 5e-4 with weight decay 1e-5, an Adam optimizer is used, the learning rate is halved every 6 epochs, and training runs for 20 epochs.
Further, in step 1), considering that features with strong representation capability are crucial to the final localization result, the feature extraction network ResNeSt designed for pixel-level visual tasks is used, and cross-channel interaction is performed on features of different scales through a split-attention module;
the feature map is first divided into $K$ cardinal groups, and each cardinal group is further divided into $R$ splits, so the total number of feature splits is $G = KR$. A transformation $F_i$ is applied to each split individually, and the intermediate representation of each split is:

$$U_i = F_i(X), \quad i \in \{1, 2, \ldots, G\}$$

where $F_i$ denotes the transformation function of the $i$-th split and $G$ denotes the total number of splits. The input to the $k$-th cardinal group is the sum of its splits:

$$\hat{U}^k = \sum_{j=R(k-1)+1}^{Rk} U_j, \qquad \hat{U}^k \in \mathbb{R}^{H \times W \times C/K}, \quad k \in \{1, 2, \ldots, K\}$$

where $H$, $W$ and $C$ denote the height, width and number of channels of the feature map, respectively.
Furthermore, in step 2), after four features with different resolutions are obtained through the backbone network, a multi-stage prediction network with a pyramid structure is constructed to maintain semantic information and spatial-resolution information at different scales; because the top-level features are reduced in dimension by a 1 × 1 convolution kernel, more semantic information is lost there, which in turn degrades the semantic information of every layer below; a feature-enhancement module is used to enhance the top-level features, effectively improving the representation capability of the whole multi-stage prediction network;
prediction is then performed on each level of the feature network: a 1 × 1 convolution first eliminates the aliasing effect produced by feature superposition, a BN (batch normalization) layer then normalizes the features, a ReLU activation function is applied, a 3 × 3 convolution reduces the 256-dimensional features to the finally required 17 dimensions, and the resulting heatmaps are upsampled to the output size and normalized again, effectively improving the generalization capability of the model.
Furthermore, in step 3), after global localization according to step 2), some small or occluded keypoints still have large detection errors; a high-resolution fine-tuning network is constructed to integrate features of different scales: the feature maps in the multi-stage prediction network are refined through several bottleneck modules and then upsampled to the output size by transposed convolution layers applied a different number of times;
four high-resolution features of the same size are obtained through transposed convolution; each feature is processed by normalization and a ReLU function, the features are concatenated along the first dimension, a 3 × 3 convolution kernel then produces the final prediction, and the output is normalized before being emitted; to prevent the positions of keypoints of larger human bodies from being disturbed while smaller targets are corrected, only the positions of keypoints with larger loss values are modified during back-propagation in network training.
The technical concept of the invention is as follows: a backbone network is used to obtain features with high representation capability; on these features, a feature-enhanced multi-stage prediction network is constructed to perform initial localization of all keypoints; a high-resolution adjustment network is then constructed, which introduces more spatial context information into the feature maps through transposed convolution and concatenation, and adjusts the positions of keypoints with larger errors. Finally, the outputs of the two stages are fused to obtain the final localization result.
The invention has the following beneficial effects: the ResNeSt backbone network is applied to the human body posture estimation task; a multi-stage prediction network is constructed on the obtained features, and a feature-reinforcement strategy is used to counter the loss during feature propagation, effectively ensuring the strong performance of the multi-stage prediction network; a high-resolution fine-tuning network is constructed for keypoints with large errors, and transposed convolution effectively combines the upsampling and convolution operations, improving the detection capability of the whole network for keypoints of different scales. The prediction results of the two stages are integrated, giving better performance and a degree of robustness for keypoint prediction in different scenes.
Drawings
FIG. 1 is a schematic overall flow diagram of the present invention;
FIG. 2 is a diagram of a network architecture according to an aspect of the present invention;
FIG. 3 is a block diagram of a feature extraction network;
fig. 4 is a schematic diagram of a feature enhancement policy flow.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1 to 4, a 2D human body posture estimation method based on multi-scale feature reinforcement includes the following steps:
1) extracting features with high representation capability from the input picture:
in the invention, the final positioning result of the features with stronger representation capability is considered to be important, so that a feature extraction network ResNeSt aiming at a pixel-level visual task is used, and as shown in FIG. 3, cross-channel interaction is carried out on the features with different scales through a separation attention module;
the feature map is first divided into $K$ cardinal groups, and each cardinal group is further divided into $R$ splits, so the total number of feature splits is $G = KR$. A transformation $F_i$ is applied to each split individually, and the intermediate representation of each split is:

$$U_i = F_i(X), \quad i \in \{1, 2, \ldots, G\}$$

where $F_i$ denotes the transformation function of the $i$-th split and $G$ denotes the total number of splits. The input to the $k$-th cardinal group is the sum of its splits:

$$\hat{U}^k = \sum_{j=R(k-1)+1}^{Rk} U_j, \qquad \hat{U}^k \in \mathbb{R}^{H \times W \times C/K}, \quad k \in \{1, 2, \ldots, K\}$$

where $H$, $W$ and $C$ denote the height, width and number of channels of the feature map, respectively. The channel-wise global context statistics $s^k \in \mathbb{R}^{C/K}$, which carry the weights for the channels, are obtained by global average pooling over the spatial dimensions:

$$s^k_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} \hat{U}^k_c(i, j)$$

After the split-attention module, each channel of the feature map is generated by a weighted combination over the splits; the $c$-th channel is computed as:

$$V^k_c = \sum_{i=1}^{R} a^k_i(c)\, U_{R(k-1)+i}$$

where $s^k$ represents the global spatial information and a mapping $\mathcal{G}$ determines the weight $a^k_i(c)$ of each channel from it. The outputs of the cardinal groups are then concatenated along the channel dimension, i.e. $V = \mathrm{Concat}\{V^1, V^2, \ldots, V^K\}$, and the output $Y$ of each module can be expressed as:

$$Y = V + T(X)$$

where $V$ denotes the concatenated output of the cardinal groups and $T(X)$ denotes the output of the shortcut connection;
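The split-attention computation above can be sketched numerically. The following is a minimal NumPy illustration under stated assumptions: the per-split transformations are taken as identity, and the attention mapping $\mathcal{G}$ is replaced by a simplified context-similarity score (the real ResNeSt uses two fully connected layers followed by an r-softmax). It only demonstrates the grouping, global pooling, softmax weighting, and channel-wise concatenation:

```python
import numpy as np

def split_attention(x, K=2, R=2):
    """Illustrative split-attention sketch over G = K*R feature splits.

    x: array of shape (G, H, W, C) holding the G split feature maps.
    Returns the concatenated cardinal-group outputs, shape (H, W, K*C).
    NOTE: the attention logits below are a hypothetical simplification,
    not the FC-layer mapping used in the actual ResNeSt module.
    """
    G, H, W, C = x.shape
    assert G == K * R
    outputs = []
    for k in range(K):
        splits = x[k * R:(k + 1) * R]          # the R splits of cardinal group k
        u_hat = splits.sum(axis=0)             # fused representation U^k
        s = u_hat.mean(axis=(0, 1))            # global average pooling -> (C,)
        # per-split, per-channel attention logits from the global context
        logits = np.stack([(splits[r] * s).mean(axis=(0, 1)) for r in range(R)])
        a = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)  # r-softmax
        v = sum(a[r] * splits[r] for r in range(R))  # weighted combination V^k
        outputs.append(v)
    return np.concatenate(outputs, axis=-1)    # channel-wise concat over groups
```

With K = 2 cardinal groups of R = 2 splits, four (H, W, C) splits are reduced to one (H, W, 2C) output, matching the channel bookkeeping in the formulas above.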
2) constructing a multi-stage prediction network and performing initial localization of all keypoints, as follows:
after 4 features with different resolutions are obtained through a backbone network, a pyramid-structured multi-level prediction network is constructed to maintain semantic information and spatial resolution information with different scales. Because the dimension of the top-layer features is reduced through the convolution kernel with the size of 1 multiplied by 1, more semantic information is lost, and the loss of each layer of semantic information is directly caused. The invention uses a feature enhancement module, as shown in fig. 4, to effectively enhance the top-level features and effectively improve the characterization capability of the whole multi-level prediction network. The top-level features are firstly subjected to spatial adaptive pooling, are features with three resolutions and have 256 dimensions, then the three feature maps are sampled to the original size in a weighted fusion mode for fusion, so that a feature with reduced dimensionality and unchanged resolution is obtained, and finally the feature is fused with the original feature, as shown in a network structure in FIG. 2;
Prediction is then performed on each level of the feature network: a 1 × 1 convolution first eliminates the aliasing effect produced by feature superposition, a BN (batch normalization) layer then normalizes the features, a ReLU activation function is applied, a 3 × 3 convolution reduces the 256-dimensional features to the finally required 17 dimensions (the number of human-body keypoints), and the resulting heatmaps are upsampled to the output size and normalized again, effectively improving the generalization capability of the model;
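The final upsample-and-renormalize step of the prediction head can be illustrated with a small NumPy sketch. Nearest-neighbor upsampling and per-joint min-max normalization are assumptions here; the text does not name the interpolation mode or the exact normalization used:

```python
import numpy as np

def upsample_nearest(hm, factor):
    """Nearest-neighbor upsampling of a (J, H, W) stack of joint heatmaps."""
    return hm.repeat(factor, axis=1).repeat(factor, axis=2)

def normalize_heatmaps(hm, eps=1e-8):
    """Per-joint min-max normalization to [0, 1] (assumed form of the
    'normalization processing' applied after upsampling)."""
    mn = hm.min(axis=(1, 2), keepdims=True)
    mx = hm.max(axis=(1, 2), keepdims=True)
    return (hm - mn) / (mx - mn + eps)
```

For example, 17 heatmaps at the 96 × 72 backbone resolution could be brought to a 384 × 288 output with `normalize_heatmaps(upsample_nearest(hm, 4))`.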
3) constructing a high-resolution fine-tuning network and further adjusting the positions of small-scale keypoints, as follows:
after global positioning is carried out according to the method in the step 2, some small key points still exist, and the detection error of the blocked key points is large. As shown in fig. 2, the present invention constructs a high resolution fine tuning network that integrates features of different dimensions. Feature thinning is carried out on a feature map in the multistage prediction network through a plurality of bottleneck modules, and then the feature map is up-sampled to the output size through the transposition convolutional layers of different times. The input channel and the output channel of the transposed convolution are 256-dimensional, the size of a convolution kernel is set to be 4 multiplied by 4, the step length is 2, and the filling is 1;
four high-resolution features of the same size are obtained through transposed convolution; each feature is processed by normalization and a ReLU function, the features are concatenated along the first dimension, a 3 × 3 convolution kernel then produces the final prediction, and the output is normalized before being emitted. To prevent the positions of keypoints of larger human bodies from being disturbed while smaller targets are corrected, only the positions of keypoints with larger loss values are modified during back-propagation in network training;
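With the transposed-convolution hyperparameters given above (4 × 4 kernel, stride 2, padding 1), each layer exactly doubles the spatial size, which is why applying it a different number of times brings every pyramid stage to the same resolution. The standard output-size formula makes this concrete:

```python
def transposed_conv_out(size: int, kernel: int = 4, stride: int = 2,
                        padding: int = 1) -> int:
    """Output spatial size of a transposed convolution (no output padding):
    out = (in - 1) * stride - 2 * padding + kernel."""
    return (size - 1) * stride - 2 * padding + kernel
```

For a 384 × 288 input instance, backbone stages at strides 4, 8, 16 and 32 have resolutions 96 × 72, 48 × 36, 24 × 18 and 12 × 9; applying the layer 0, 1, 2 and 3 times respectively maps them all to 96 × 72 (these stage resolutions are inferred from the stated input size, not listed explicitly in the text).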
4) after the whole network structure is constructed, the input data of the network need to be processed and the parameters need to be set, as follows:
all input pictures are set to a 4:3 aspect ratio, a human detector is then used to obtain the human body instances in each picture, the size of each input instance is set to 384 × 288, and an MSE loss function is used to back-propagate the errors during training. The initial learning rate of the network is set to 5e-4 with weight decay 1e-5, an Adam optimizer is used, the learning rate is halved every 6 epochs, and training runs for 20 epochs;
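The schedule described above (initial rate 5e-4, halved every 6 epochs, 20 epochs total) is a plain step decay, which can be written as a one-line function; `step` and `gamma` mirror the stated settings:

```python
def learning_rate(epoch: int, base_lr: float = 5e-4, step: int = 6,
                  gamma: float = 0.5) -> float:
    """Step-decay schedule: the rate is multiplied by `gamma` every `step` epochs."""
    return base_lr * gamma ** (epoch // step)
```

Over the 20-epoch run this yields 5e-4 for epochs 0-5, 2.5e-4 for 6-11, 1.25e-4 for 12-17, and 6.25e-5 for 18-19.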
through the operation of the steps, the 2D human body posture estimation with strengthened characteristics can be realized.
The objects, technical solutions and advantages of the present invention are further described in detail with reference to the detailed description illustrated in the drawings, it should be understood that the above description is only an exemplary embodiment of the present invention, and is not intended to limit the scope of the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (4)
1. A 2D human body posture estimation method based on multi-scale feature reinforcement, characterized by comprising the following steps:
1) obtaining abstract features with high representation capability:
inputting the preprocessed pictures into a ResNeSt backbone network, performing cross-channel interaction on features of different scales through a split-attention module, removing the final classification layer, and outputting the features of four stages;
2) constructing a multi-stage prediction network:
four features with different resolutions are obtained through step 1); a feature-enhanced feature pyramid is constructed on the features of the four stages, and because the top-level features lose more semantic information during propagation, a feature-enhancement strategy is used to fuse and enhance the high-level features;
3) constructing a high-resolution adjustment network:
a high-resolution adjustment network is constructed to adjust the positions of keypoints with large prediction loss in the previous stage; the features in the multi-stage prediction network are upsampled through transposed convolution, which combines the upsampling and convolution operations, and the enlarged features are concatenated, introducing richer spatial details for small-scale keypoints;
4) training settings of the whole network:
all input pictures are set to a 4:3 aspect ratio, a human detector is then used to obtain the human body instances in each picture, the size of each input instance is set to 384 × 288, and an MSE loss function is used to back-propagate the errors during training; the initial learning rate of the network is set to 5e-4 with weight decay 1e-5, an Adam optimizer is used, the learning rate is halved every 6 epochs, and training runs for 20 epochs.
2. The 2D human body posture estimation method based on multi-scale feature reinforcement according to claim 1, characterized in that in step 1), considering that features with strong representation capability are crucial to the final localization result, the feature extraction network ResNeSt designed for pixel-level visual tasks is used, and cross-channel interaction is performed on features of different scales through a split-attention module;
the feature map is first divided into $K$ cardinal groups, and each cardinal group is further divided into $R$ splits, so the total number of feature splits is $G = KR$. A transformation $F_i$ is applied to each split individually, and the intermediate representation of each split is:

$$U_i = F_i(X), \quad i \in \{1, 2, \ldots, G\}$$

where $F_i$ denotes the transformation function of the $i$-th split and $G$ denotes the total number of splits. The input to the $k$-th cardinal group is the sum of its splits:

$$\hat{U}^k = \sum_{j=R(k-1)+1}^{Rk} U_j, \qquad k \in \{1, 2, \ldots, K\}$$
3. The 2D human body posture estimation method based on multi-scale feature reinforcement according to claim 1 or 2, characterized in that in step 2), after four features with different resolutions are obtained through the backbone network, a multi-stage prediction network with a pyramid structure is constructed to maintain semantic information and spatial-resolution information at different scales; because the top-level features are reduced in dimension by a 1 × 1 convolution kernel, more semantic information is lost there, which in turn degrades the semantic information of every layer below; a feature-enhancement module is used to enhance the top-level features, effectively improving the representation capability of the whole multi-stage prediction network;
prediction is then performed on each level of the feature network: a 1 × 1 convolution first eliminates the aliasing effect produced by feature superposition, a BN (batch normalization) layer then normalizes the features, a ReLU activation function is applied, a 3 × 3 convolution reduces the 256-dimensional features to the finally required 17 dimensions, and the resulting heatmaps are upsampled to the output size and normalized again, effectively improving the generalization capability of the model.
4. The 2D human body posture estimation method based on multi-scale feature reinforcement according to claim 1 or 2, characterized in that in step 3), after global localization is performed according to the method in step 2), some small or occluded keypoints still have large detection errors; a high-resolution fine-tuning network is constructed to integrate features of different scales: the feature maps in the multi-stage prediction network are refined through several bottleneck modules and then upsampled to the output size by transposed convolution layers applied a different number of times;
four high-resolution features of the same size are obtained through transposed convolution; each feature is processed by normalization and a ReLU function, the features are concatenated along the first dimension, a 3 × 3 convolution kernel then produces the final prediction, and the output is normalized before being emitted; to prevent the positions of keypoints of larger human bodies from being disturbed while smaller targets are corrected, only the positions of keypoints with larger loss values are modified during back-propagation in network training.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010883889.8A CN112131959B (en) | 2020-08-28 | 2020-08-28 | 2D human body posture estimation method based on multi-scale feature reinforcement |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010883889.8A CN112131959B (en) | 2020-08-28 | 2020-08-28 | 2D human body posture estimation method based on multi-scale feature reinforcement |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112131959A (en) | 2020-12-25
CN112131959B (en) | 2024-03-22
Family
ID=73847628
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010883889.8A Active CN112131959B (en) | 2020-08-28 | 2020-08-28 | 2D human body posture estimation method based on multi-scale feature reinforcement |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112131959B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108229445A (en) * | 2018-02-09 | 2018-06-29 | 深圳市唯特视科技有限公司 | Multi-person pose estimation method based on a cascaded pyramid network
CN108710830A (en) * | 2018-04-20 | 2018-10-26 | 浙江工商大学 | Dense human body 3D pose estimation method combining a connected attention pyramid residual network with equidistant constraints
CN110135375A (en) * | 2019-05-20 | 2019-08-16 | 中国科学院宁波材料技术与工程研究所 | Multi-person pose estimation method based on global information integration
CN110276316A (en) * | 2019-06-26 | 2019-09-24 | 电子科技大学 | Human body key point detection method based on deep learning
CN110659565A (en) * | 2019-08-15 | 2020-01-07 | 电子科技大学 | 3D multi-person pose estimation method based on atrous (dilated) convolution
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112837367B (en) * | 2021-01-27 | 2022-11-25 | 清华大学 | Semantic decomposition type object pose estimation method and system |
CN112837367A (en) * | 2021-01-27 | 2021-05-25 | 清华大学 | Semantic decomposition type object pose estimation method and system |
CN113221626A (en) * | 2021-03-04 | 2021-08-06 | 北京联合大学 | Human body posture estimation method based on Non-local high-resolution network |
CN113221626B (en) * | 2021-03-04 | 2023-10-20 | 北京联合大学 | Human body posture estimation method based on Non-local high-resolution network |
CN113221787A (en) * | 2021-05-18 | 2021-08-06 | 西安电子科技大学 | Pedestrian multi-target tracking method based on multivariate difference fusion |
CN113221787B (en) * | 2021-05-18 | 2023-09-29 | 西安电子科技大学 | Pedestrian multi-target tracking method based on multi-element difference fusion |
CN113284146A (en) * | 2021-07-23 | 2021-08-20 | 天津御锦人工智能医疗科技有限公司 | Colorectal polyp image recognition method and device and storage medium |
CN113284146B (en) * | 2021-07-23 | 2021-10-22 | 天津御锦人工智能医疗科技有限公司 | Colorectal polyp image recognition method and device and storage medium |
CN113792641A (en) * | 2021-09-08 | 2021-12-14 | 南京航空航天大学 | High-resolution lightweight human body posture estimation method combined with multispectral attention mechanism |
CN113792641B (en) * | 2021-09-08 | 2024-05-03 | 南京航空航天大学 | High-resolution lightweight human body posture estimation method combined with multispectral attention mechanism |
CN114155560A (en) * | 2022-02-08 | 2022-03-08 | 成都考拉悠然科技有限公司 | Light weight method of high-resolution human body posture estimation model based on space dimension reduction |
CN117456562A (en) * | 2023-12-25 | 2024-01-26 | 深圳须弥云图空间科技有限公司 | Attitude estimation method and device |
CN117456562B (en) * | 2023-12-25 | 2024-04-12 | 深圳须弥云图空间科技有限公司 | Attitude estimation method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112131959A (en) | 2D human body posture estimation method based on multi-scale feature reinforcement | |
CN113033570B (en) | Image semantic segmentation method with improved dilated convolution and multi-level feature fusion | |
CN114202672A (en) | Small target detection method based on attention mechanism | |
CN112396607B (en) | Deformable convolution fusion enhanced street view image semantic segmentation method | |
CN112288011B (en) | Image matching method based on self-attention deep neural network | |
CN110210551A (en) | Visual target tracking method based on adaptive subject sensitivity | |
CN110929578A (en) | Occlusion-resistant pedestrian detection method based on an attention mechanism | |
CN108427921A (en) | Face recognition method based on convolutional neural networks | |
CN109558862A (en) | Crowd counting method and system with a spatially-aware attention refinement framework | |
CN116452937A (en) | Multi-mode characteristic target detection method based on dynamic convolution and attention mechanism | |
CN114119975A (en) | Language-guided cross-modal instance segmentation method | |
CN112784756B (en) | Human body identification tracking method | |
CN113792641A (en) | High-resolution lightweight human body posture estimation method combined with multispectral attention mechanism | |
CN116863539A (en) | Fall figure target detection method based on optimized YOLOv8s network structure | |
CN115222998B (en) | Image classification method | |
CN111860124A (en) | Remote sensing image classification method based on space spectrum capsule generation countermeasure network | |
CN116863194A (en) | Foot ulcer image classification method, system, equipment and medium | |
CN113344110A (en) | Fuzzy image classification method based on super-resolution reconstruction | |
CN116309632A (en) | Three-dimensional liver semantic segmentation method based on multi-scale cascade feature attention strategy | |
CN116092190A (en) | Human body posture estimation method based on self-attention high-resolution network | |
CN116486080A (en) | Lightweight image semantic segmentation method based on deep learning | |
CN113850182B (en) | DAMR _ DNet-based action recognition method | |
CN113780140A (en) | Gesture image segmentation and recognition method and device based on deep learning | |
CN117011655A (en) | Adaptive region selection feature fusion based method, target tracking method and system | |
CN114863133A (en) | Flotation froth image feature point extraction method based on multitask unsupervised algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||