CN110175575A - A single-person pose estimation method based on a novel high-resolution network model - Google Patents
- Publication number
- CN110175575A CN110175575A CN201910454096.1A CN201910454096A CN110175575A CN 110175575 A CN110175575 A CN 110175575A CN 201910454096 A CN201910454096 A CN 201910454096A CN 110175575 A CN110175575 A CN 110175575A
- Authority
- CN
- China
- Prior art keywords
- network
- picture
- resolution
- key point
- parallel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The present invention discloses a method for single-person pose estimation based on a novel high-resolution network architecture. First, a detector detects the single pedestrian in the input image and inaccurate detection boxes are removed; the data set is then extended by data augmentation. In the instantiated network structure, parallel multi-resolution subnetworks keep the feature maps at high resolution, with no need to recover resolution; exchange units are introduced among the parallel subnetworks so that each subnetwork repeatedly receives information from the other parallel subnetworks, improving the accuracy of single-person pose estimation. Since keypoints are often occluded in complex scenes, a keypoint-masking data augmentation scheme is further proposed; it effectively fine-tunes the trained convolutional neural network so that occluded keypoints are robustly localized through adjacent cues, improving accuracy under occlusion and yielding a better model.
Description
Technical field
The present invention relates to a method based on a novel high-resolution network architecture, and belongs to the interdisciplinary fields of deep learning, computer vision, and machine learning.
Background technique
2D human pose estimation has long been a fundamental yet challenging problem in computer vision. The goal of single-person pose estimation is to localize human anatomical keypoints (for example, elbows and wrists) or parts. Pose estimation has broad applications, concentrated mainly in intelligent video surveillance, human-computer interaction, virtual reality, and smart homes. This patent focuses on single-person pose estimation, which underlies related problems such as multi-person pose estimation, video pose estimation, action recognition, and tracking.
Recent developments show that deep convolutional neural networks achieve the highest accuracy to date on single-person pose estimation, far surpassing traditional methods. Most existing methods pass the input through a network that downsamples high-resolution feature maps to low resolution and then recovers high resolution from the low-resolution maps (once or repeatedly), thereby realizing multi-resolution feature extraction. In contrast, the network proposed in this patent keeps its feature maps at high resolution throughout, which differs markedly from the previous mainstream: high resolution is maintained by gradually adding lower-resolution feature-map subnetworks in parallel to the high-resolution main network.
As a widely recognized vision problem, pose estimation has challenged researchers for years. Real-life scenes are often complex, and pedestrians are frequently occluded; such scenes are among the most challenging. A drawback of training a network on the original training set alone is that it usually lacks enough single-person images with occlusion to train a deep network for accurate keypoint detection and localization. An outstanding single-person pose estimation system must be robust to occlusion and to keypoint detection on severely deformed pedestrians, and must succeed on rare and novel poses, yet this problem has never been well solved. This patent proposes a novel keypoint-masking data augmentation scheme that enlarges the training data to fine-tune the network and improves accuracy in complex scenes.
In recent years, with deepening research on deep learning, and because deep network models require no elaborate hand-crafted features and can exploit rich image information, more and more researchers have applied deep learning to single-person pose estimation and improved its accuracy.
Summary of the invention
Technical problem: the problem to be solved by the invention is addressed by proposing a single-person pose estimation method based on a novel high-resolution network model. Parallel multi-resolution subnetworks and repeated multi-scale fusion keep the network at high-resolution feature maps throughout, with no need to recover resolution; a new data augmentation method fine-tunes the convolutional neural network, enlarging the training data for occluded cases and improving accuracy in complex scenes where the single pedestrian is occluded.
Technical solution: to achieve the above goals, the invention adopts the following technical scheme.
A single-person pose estimation method based on a novel high-resolution network model, comprising the following steps:
Step 1) Input RGB pictures of single-pedestrian poses annotated with keypoint coordinates as the data set; use a detector to outline the pedestrian in each picture with a rectangular box, and denote the box region as R;
Step 2) Save the rectangular region R as a picture and apply data augmentation to it, the augmentation including random rotation, flipping, and the addition of Gaussian noise;
Step 3) Split the data set 7:3 into a training set and a test set, for training and testing the convolutional neural network;
Step 4) Resize the training set to a fixed resolution;
Step 5) Instantiate the network structure according to the principles of parallel multi-resolution subnetworks and repeated multi-scale fusion; the parallel multi-resolution subnetworks are obtained by adding, in parallel to the high-resolution feature-map main network, feature-map subnetworks of lower resolution; the repeated multi-scale fusion means that the parallel networks repeatedly exchange information with one another, performing multi-scale fusion multiple times;
Step 6) Train a convolutional neural network on the training set; the network convolves the input picture with multiple windows and multiple kernels and comprises convolutional layers, pooling layers, and fully connected layers, wherein: the linear rectification (ReLU) function serves as the activation function, the ReLU activation function being f(x) = max(x, 0), where x is the output of the preceding layer, max(x, 0) takes the larger of x and 0, and f(x) receives the return value of max(x, 0); mean squared error serves as the loss function, and the dropout mechanism and weight regularization constrain the training model, the dropout mechanism being a method for optimizing deep artificial neural networks that randomly zeroes part of the hidden-layer weights or outputs during learning, reducing the interdependency between nodes and regularizing the network; the learning rate is adjusted dynamically during training: the base learning rate is set to 10⁻³ and dropped to 10⁻⁴ and 10⁻⁵ at epochs 150 and 200 respectively, training stops at epoch 250, and every epoch performs forward propagation and backpropagation over all the training data;
Step 7) Apply a novel keypoint-masking data augmentation scheme that robustly localizes occluded keypoints through adjacent cues, enlarging the training data to fine-tune the network and improving accuracy in complex scenes; the novel keypoint-masking augmentation means artificially occluding a keypoint on the image, or copying a patch around a body keypoint and pasting it elsewhere on the image;
Step 8) Input a three-channel RGB picture containing a single pedestrian's pose, the picture being supplied by the user;
Step 9) Detect the pedestrian in the user's picture and outline it with a rectangular box;
Step 10) Feed the box region into the novel high-resolution network and run forward propagation to obtain heatmaps, each heatmap giving the probability of a joint at each pixel;
Step 11) Shift the position of the maximum response by a quarter-pixel offset from the highest response toward the second-highest response, predict the position of each keypoint accordingly, then connect the corresponding keypoints to obtain the pose estimate of the single pedestrian in the picture.
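The keypoint-masking augmentation of step 7 can be sketched as follows. This is an illustrative NumPy-only sketch, not the patent's implementation; the function names, patch size, zero-fill choice, and border handling are assumptions.

```python
import numpy as np

def mask_keypoint(img, kp, size=9):
    """Artificially occlude a keypoint: zero out a size x size patch
    around it. `img` is HxWxC, `kp` is (x, y) in pixel coordinates."""
    out = img.copy()
    h, w = out.shape[:2]
    x, y = int(kp[0]), int(kp[1])
    half = size // 2
    y0, y1 = max(0, y - half), min(h, y + half + 1)
    x0, x1 = max(0, x - half), min(w, x + half + 1)
    out[y0:y1, x0:x1] = 0
    return out

def paste_keypoint_patch(img, src_kp, dst_kp, size=9):
    """Copy the patch around one body keypoint and paste it over another
    location, simulating a confusing duplicate body part. Assumes both
    keypoints lie at least size//2 pixels from the image border."""
    out = img.copy()
    half = size // 2
    sx, sy = int(src_kp[0]), int(src_kp[1])
    dx, dy = int(dst_kp[0]), int(dst_kp[1])
    patch = img[sy - half:sy + half + 1, sx - half:sx + half + 1].copy()
    ph, pw = patch.shape[:2]
    out[dy - half:dy - half + ph, dx - half:dx - half + pw] = patch
    return out
```

In a fine-tuning pipeline these transforms would be applied to training crops while the ground-truth keypoint coordinates are left unchanged, forcing the network to infer the hidden joint from its neighbors.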
Further, the step 5) comprises the following steps:
Step 51) Parallel multi-resolution subnetworks: lower-resolution feature-map subnetworks are added in parallel to the high-resolution feature-map main network; therefore, the resolutions of the parallel subnetworks of a later stage comprise the resolution of the previous stage plus one resolution lower than it;
Step 52) Repeated multi-scale fusion: exchange units are introduced among the parallel subnetworks so that each subnetwork repeatedly receives information from the other parallel subnetworks. Let the inputs be {X_1, X_2, ..., X_n} and the outputs {Y_1, Y_2, ..., Y_n}, where X denotes an input response map, Y an output response map, X_n the n-th input response map, and Y_n the n-th output response map; the resolution of each output equals that of its input, and each output aggregates the resampled inputs: Y_k = Σ_{i=1}^{n} a(X_i, k). An exchange unit across stages has an additional output map Y_{n+1} = a(Y_n, n+1), where i and k are response-map indices, X_i is the i-th input response map, Y_{n+1} the (n+1)-th output response map, and Y_k the k-th output response map; the function a(X_i, k) upsamples or downsamples X_i from resolution i to resolution k, and a(Y_n, n+1) likewise resamples Y_n from n to n+1. Both are realized with convolutions: one stride-2 3×3 convolution performs 2× downsampling, and two consecutive stride-2 3×3 convolutions perform 4× downsampling; upsampling uses a 1×1 convolution followed by interpolation, interpolation meaning that new pixels are inserted between existing pixels of the original image by an interpolation algorithm. If i = k, then a(X_i, k) = X_i. Heatmaps are regressed from the output of the last exchange unit;
Step 53) Network instantiation: the keypoint heatmap estimation network is instantiated. The network body contains four parallel subnetworks; from one subnetwork to the next, the resolution is halved and the width (number of channels) doubled. The 1st stage contains 4 residual units, followed by one 3×3 convolution that reduces the width of the feature maps to S, S being the subnet width; the 2nd, 3rd, and 4th stages contain 1, 4, and 3 exchange blocks respectively. Each exchange block contains 4 residual units (each residual unit consisting of two 3×3 convolutions at each resolution) and one exchange unit across the different resolutions. In total there are 8 exchange units, i.e., 8 rounds of multi-scale fusion.
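The exchange-unit aggregation Y_k = Σ_i a(X_i, k) of step 52 can be sketched as follows. A minimal NumPy sketch under stated assumptions: the patent's learned stride-2 3×3 convolutions and 1×1-convolution-plus-interpolation are stood in for here by 2× average pooling and nearest-neighbor upsampling, and all names are illustrative.

```python
import numpy as np

def down2(x):
    # stand-in for one stride-2 3x3 convolution: 2x average pooling
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up2(x):
    # stand-in for a 1x1 convolution + interpolation: nearest-neighbor upsampling
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def a(x, i, k):
    """Resample response map x from branch i to branch k, where branch j
    has resolution scaled by 2^-j; a(x, i, i) is the identity."""
    while i < k:
        x, i = down2(x), i + 1
    while i > k:
        x, i = up2(x), i - 1
    return x

def exchange(xs):
    """One exchange unit: Y_k = sum_i a(X_i, k) over all parallel branches."""
    n = len(xs)
    return [sum(a(x, i, k) for i, x in enumerate(xs)) for k in range(n)]
```

Chaining several such exchange units over three or four parallel branches reproduces the repeated multi-scale fusion described in step 52.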
Further, in step 2), the random rotation of a picture is between -45° and 45°, i.e., from 45° counterclockwise to 45° clockwise.
Further, in step 2), the flipping of a picture includes horizontal flipping and vertical flipping.
Further, in step 2), each picture is randomly put through any two of the data augmentation modes.
Further, in step 4), the training-set resolution is fixed at 256px × 192px, 256 × 192 being a pixel matrix in which 256 is the number of pixels per column and 192 the number of pixels per row.
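The three augmentation modes of step 2) — random rotation in [-45°, 45°], horizontal/vertical flipping, and additive Gaussian noise — with any two applied per picture, can be sketched as follows. This is a NumPy-only illustration under assumptions: rotation is done by nearest-neighbor resampling with zero fill, and the noise standard deviation is an illustrative choice; a real pipeline would typically rotate via OpenCV or PIL.

```python
import numpy as np

def rotate_nn(img, deg):
    """Rotate HxWxC image about its center by `deg` degrees
    (nearest-neighbor sampling, zero fill outside the source)."""
    h, w = img.shape[:2]
    t = np.deg2rad(deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # inverse mapping: where does each output pixel come from in the source?
    xsrc = np.cos(t) * (xs - cx) + np.sin(t) * (ys - cy) + cx
    ysrc = -np.sin(t) * (xs - cx) + np.cos(t) * (ys - cy) + cy
    xi, yi = np.rint(xsrc).astype(int), np.rint(ysrc).astype(int)
    valid = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    out = np.zeros_like(img)
    out[valid] = img[yi[valid], xi[valid]]
    return out

def augment(img, rng):
    """Apply any two of the three augmentation modes, chosen at random."""
    for op in rng.choice(3, size=2, replace=False):
        if op == 0:                                    # random rotation
            img = rotate_nn(img, rng.uniform(-45, 45))
        elif op == 1:                                  # vertical or horizontal flip
            img = np.flip(img, axis=int(rng.integers(0, 2))).copy()
        else:                                          # additive Gaussian noise
            img = np.clip(img + rng.normal(0, 5.0, img.shape), 0, 255)
    return img
```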
Beneficial effects: compared with the prior art, the above technical scheme has the following advantages. The invention performs single-person pose estimation based on a novel high-resolution network architecture, instantiating the network on the principles of parallel multi-resolution subnetworks and repeated multi-scale fusion, which improves the accuracy of single-person pose estimation; it also proposes a novel keypoint-masking data augmentation scheme that enlarges the occluded training data to fine-tune the network, improving accuracy in complex scenes where the single pedestrian is occluded.
Specifically:
(1) The invention builds the network on the principle of parallel multi-resolution: starting from a high-resolution subnet as the first stage, subnets of progressively lower resolution are added and connected in parallel, so the resolutions of the parallel subnetworks of a later stage comprise the previous stage's resolution plus a lower one. The parallel multi-resolution subnets keep high-resolution feature maps throughout, with no need to recover resolution;
(2) The invention builds the network on the principle of repeated multi-scale fusion, introducing exchange units among the parallel subnets so that each subnet repeatedly receives information from the other parallel subnets;
(3) The invention proposes a novel keypoint-masking data augmentation scheme that robustly localizes occluded keypoints through adjacent cues, enlarging the training data to fine-tune the network and improving accuracy for a single pedestrian under heavy occlusion in complex scenes.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention.
Specific embodiment
The technical solution of the invention is further described in detail below with reference to the accompanying drawing.
As shown in Fig. 1, a single-person pose estimation method based on a novel high-resolution network model comprises the following steps:
Step 1) Input RGB pictures of single-pedestrian poses annotated with keypoint coordinates as the data set; use a detector to outline the pedestrian in each picture with a rectangular box, and denote the box region as R;
Step 2) Save the rectangular region R as a picture and apply data augmentation to it, the augmentation including random rotation, flipping, and the addition of Gaussian noise. Specifically: the random rotation of a picture is between -45° and 45°, i.e., from 45° counterclockwise to 45° clockwise; flipping includes horizontal and vertical flips; and each picture is randomly put through any two of the augmentation modes.
Step 3) Split the data set 7:3 into a training set and a test set, for training and testing the convolutional neural network;
Step 4) Resize the training set to the fixed resolution 256px × 192px, 256 × 192 being a pixel matrix in which 256 is the number of pixels per column and 192 the number of pixels per row.
Step 5) Instantiate the network structure according to the principles of parallel multi-resolution subnetworks and repeated multi-scale fusion; the parallel multi-resolution subnetworks are obtained by adding, in parallel to the high-resolution feature-map main network, feature-map subnetworks of lower resolution; the repeated multi-scale fusion means that the parallel networks repeatedly exchange information with one another, performing multi-scale fusion multiple times.
In particular, the step 5) comprises the following steps:
Step 51) Parallel multi-resolution subnetworks: lower-resolution feature-map subnetworks are gradually added in parallel to the high-resolution feature-map main network. The initial network is taken to be the high-resolution one, and the feature-map subnetworks added in parallel have lower resolution than the initial network; therefore, the resolutions of the parallel subnetworks of a later stage comprise the resolution of the previous stage plus one resolution lower than it;
Step 52) Repeated multi-scale fusion: exchange units are introduced among the parallel subnetworks so that each subnetwork repeatedly receives information from the other parallel subnetworks. Let the inputs be {X_1, X_2, ..., X_n} and the outputs {Y_1, Y_2, ..., Y_n}, where X denotes an input response map, Y an output response map, X_n the n-th input response map, and Y_n the n-th output response map; the resolution of each output equals that of its input, and each output aggregates the resampled inputs: Y_k = Σ_{i=1}^{n} a(X_i, k). An exchange unit across stages has an additional output map Y_{n+1} = a(Y_n, n+1), where i and k are response-map indices, X_i is the i-th input response map, Y_{n+1} the (n+1)-th output response map, and Y_k the k-th output response map; the function a(X_i, k) upsamples or downsamples X_i from resolution i to resolution k, and a(Y_n, n+1) likewise resamples Y_n from n to n+1. Both are realized with convolutions: one stride-2 3×3 convolution performs 2× downsampling, and two consecutive stride-2 3×3 convolutions perform 4× downsampling; upsampling uses a 1×1 convolution followed by interpolation, interpolation meaning that new pixels are inserted between existing pixels of the original image by an interpolation algorithm. If i = k, then a(X_i, k) = X_i. Heatmaps are regressed from the output of the last exchange unit;
Step 53) Network instantiation: the keypoint heatmap estimation network is instantiated. The network body contains four parallel subnetworks; from one subnetwork to the next, the resolution is halved and the width (number of channels) doubled. The 1st stage contains 4 residual units, followed by one 3×3 convolution that reduces the width of the feature maps to S, S being the subnet width; the 2nd, 3rd, and 4th stages contain 1, 4, and 3 exchange blocks respectively. Each exchange block contains 4 residual units (each residual unit consisting of two 3×3 convolutions at each resolution) and one exchange unit across the different resolutions. In total there are 8 exchange units, i.e., 8 rounds of multi-scale fusion.
Step 6) Train a convolutional neural network on the training set; the network convolves the input picture with multiple windows and multiple kernels and comprises convolutional layers, pooling layers, and fully connected layers, wherein: the linear rectification (ReLU) function serves as the activation function, the ReLU activation function being f(x) = max(x, 0), where x is the output of the preceding layer, max(x, 0) takes the larger of x and 0, and f(x) receives the return value of max(x, 0); mean squared error serves as the loss function, and the dropout mechanism and weight regularization constrain the training model, the dropout mechanism being a method for optimizing deep artificial neural networks that randomly zeroes part of the hidden-layer weights or outputs during learning, reducing the interdependency between nodes and regularizing the network; the learning rate is adjusted dynamically during training: the base learning rate is set to 10⁻³ and dropped to 10⁻⁴ and 10⁻⁵ at epochs 150 and 200 respectively, training stops at epoch 250, and every epoch performs forward propagation and backpropagation over all the training data.
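The activation function, loss function, and learning-rate schedule of step 6 can be sketched as follows. The epoch thresholds and rates come from the text; everything else (function names, the convention that the drop takes effect at the stated epoch) is an illustrative stand-in for a full training loop, which is omitted.

```python
import numpy as np

def relu(x):
    # f(x) = max(x, 0): the larger of x and 0, elementwise
    return np.maximum(x, 0)

def mse_loss(pred, target):
    # mean squared error between predicted and ground-truth heatmaps
    return np.mean((pred - target) ** 2)

def learning_rate(epoch):
    """Step schedule: base 1e-3, dropped to 1e-4 at epoch 150 and to
    1e-5 at epoch 200; training stops after epoch 250."""
    if epoch < 150:
        return 1e-3
    if epoch < 200:
        return 1e-4
    return 1e-5
```

In a framework such as PyTorch the same schedule would typically be expressed as a multi-step decay with milestones at 150 and 200 and a decay factor of 0.1.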
Since pedestrians are often occluded, and in order to cope with such challenging scenes, data augmentation is performed by artificially occluding a keypoint or by copying a patch around a body keypoint and pasting it elsewhere on the image, enlarging the training data to fine-tune the network. The user inputs a three-channel RGB picture containing a single pedestrian. The pedestrian in the user's picture is detected and outlined with a rectangular box; the box region is fed into the network to obtain heatmaps. By shifting the position of the maximum response a quarter pixel from the highest response toward the second-highest, the position of each keypoint can be predicted; connecting the corresponding keypoints then yields the pose estimate of the single pedestrian in the picture.
Step 7) Apply a novel keypoint-masking data augmentation scheme that robustly localizes occluded keypoints through adjacent cues, enlarging the training data to fine-tune the network and improving accuracy in complex scenes; the novel keypoint-masking augmentation means artificially occluding a keypoint on the image, or copying a patch around a body keypoint and pasting it elsewhere on the image;
Step 8) Input a three-channel RGB picture containing a single pedestrian's pose, the picture being supplied by the user;
Step 9) Detect the pedestrian in the user's picture and outline it with a rectangular box;
Step 10) Feed the box region into the novel high-resolution network and run forward propagation to obtain heatmaps, each heatmap giving the probability of a joint at each pixel;
Step 11) Shift the position of the maximum response by a quarter-pixel offset from the highest response toward the second-highest response, predict the position of each keypoint accordingly, then connect the corresponding keypoints to obtain the pose estimate of the single pedestrian in the picture.
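The heatmap decoding of steps 10) and 11) — take the maximum response and shift it a quarter pixel toward the second-highest response — can be sketched as follows. A NumPy sketch under assumptions: the function name is illustrative, the quarter-pixel shift is taken along the normalized direction from the highest to the second-highest response, and the final assembly of decoded keypoints into a skeleton is omitted.

```python
import numpy as np

def decode_heatmap(hm):
    """Return (x, y): the argmax of one joint heatmap, shifted a quarter
    pixel toward the second-highest response."""
    h, w = hm.shape
    flat = hm.ravel().copy()
    y1, x1 = divmod(int(flat.argmax()), w)      # highest response
    flat[y1 * w + x1] = -np.inf
    y2, x2 = divmod(int(flat.argmax()), w)      # second-highest response
    d = np.array([x2 - x1, y2 - y1], dtype=float)
    n = np.linalg.norm(d)
    if n > 0:
        d /= n                                   # unit direction toward 2nd peak
    return np.array([x1, y1], dtype=float) + 0.25 * d
```

Applying this per joint heatmap yields one sub-pixel keypoint position per joint; the pose is then obtained by connecting the corresponding keypoints.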
The above is only a preferred embodiment of the present invention. It should be pointed out that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications should also be regarded as falling within the protection scope of the present invention.
Claims (6)
1. A single-person pose estimation method based on a novel high-resolution network model, characterized by comprising the following steps:
Step 1) Input RGB pictures of single-pedestrian poses annotated with keypoint coordinates as the data set; use a detector to outline the pedestrian in each picture with a rectangular box, and denote the box region as R;
Step 2) Save the rectangular region R as a picture and apply data augmentation to it, the augmentation including random rotation, flipping, and the addition of Gaussian noise;
Step 3) Split the data set 7:3 into a training set and a test set, for training and testing the convolutional neural network;
Step 4) Resize the training set to a fixed resolution;
Step 5) Instantiate the network structure according to the principles of parallel multi-resolution subnetworks and repeated multi-scale fusion; the parallel multi-resolution subnetworks are obtained by adding, in parallel to the high-resolution feature-map main network, feature-map subnetworks of lower resolution; the repeated multi-scale fusion means that the parallel networks repeatedly exchange information with one another, performing multi-scale fusion multiple times;
Step 6) Train a convolutional neural network on the training set; the network convolves the input picture with multiple windows and multiple kernels and comprises convolutional layers, pooling layers, and fully connected layers, wherein: the linear rectification (ReLU) function serves as the activation function, the ReLU activation function being f(x) = max(x, 0), where x is the output of the preceding layer, max(x, 0) takes the larger of x and 0, and f(x) receives the return value of max(x, 0); mean squared error serves as the loss function, and the dropout mechanism and weight regularization constrain the training model, the dropout mechanism being a method for optimizing deep artificial neural networks that randomly zeroes part of the hidden-layer weights or outputs during learning, reducing the interdependency between nodes and regularizing the network; the learning rate is adjusted dynamically during training: the base learning rate is set to 10⁻³ and dropped to 10⁻⁴ and 10⁻⁵ at epochs 150 and 200 respectively, training stops at epoch 250, and every epoch performs forward propagation and backpropagation over all the training data;
Step 7) Apply a novel keypoint-masking data augmentation scheme that robustly localizes occluded keypoints through adjacent cues, enlarging the training data to fine-tune the network and improving accuracy in complex scenes; the novel keypoint-masking augmentation means artificially occluding a keypoint on the image, or copying a patch around a body keypoint and pasting it elsewhere on the image;
Step 8) Input a three-channel RGB picture containing a single pedestrian's pose, the picture being supplied by the user;
Step 9) Detect the pedestrian in the user's picture and outline it with a rectangular box;
Step 10) Feed the box region into the novel high-resolution network and run forward propagation to obtain heatmaps, each heatmap giving the probability of a joint at each pixel;
Step 11) Shift the position of the maximum response by a quarter-pixel offset from the highest response toward the second-highest response, predict the position of each keypoint accordingly, then connect the corresponding keypoints to obtain the pose estimate of the single pedestrian in the picture.
2. a kind of single Attitude estimation method based on novel high-resolution network model according to claim 1, special
Sign is, the step 5), comprising the following steps:
Step 51) parallel multi-resolution subnetworks: lower-resolution subnetworks are added in parallel to the high-resolution main network; therefore, the resolutions of the parallel subnetworks of a later stage comprise the resolutions of the previous stage together with one resolution lower than those of the previous stage;
Step 52) repeated multi-scale fusion: exchange units are introduced among the parallel subnetworks, so that each subnetwork repeatedly receives information from the other parallel subnetworks; if the input is {X1, X2, …, Xn}, the output is {Y1, Y2, …, Yn}, wherein: X denotes an input response map, Y denotes an output response map, Xn denotes the n-th input response map, and Yn denotes the n-th output response map; the resolution of each output is identical to the resolution of the corresponding input, and each output is an aggregation of the input maps: Yk = Σ(i=1 to n) a(Xi, k); an exchange unit across stages has an additional output map Yn+1: Yn+1 = a(Yn, n+1), wherein: i denotes the serial number of the i-th response map, k denotes the serial number of the k-th response map, Xi denotes the i-th input response map, Yn+1 denotes the (n+1)-th output response map, and Yk denotes the k-th output response map; the function a(Xi, k) denotes up-sampling or down-sampling Xi from resolution i to resolution k, and a(Yn, n+1) denotes up-sampling or down-sampling Yn from resolution n to resolution n+1, wherein: up-sampling and down-sampling are both implemented by convolution, with strided 3×3 convolutions used for down-sampling: one 3×3 convolution with stride 2 performs 2× down-sampling, and two consecutive 3×3 convolutions with stride 2 perform 4× down-sampling;
for up-sampling, a 1×1 convolution is applied and then followed by interpolation, the interpolation referring to inserting new pixels between the pixels of the original image by means of an interpolation algorithm; if i = k, the function is the identity: a(Xi, k) = Xi; the key point heat maps are regressed from the output of the last exchange unit;
Step 53) network instantiation: the key point heat map estimation network is instantiated; the main body of the network comprises four parallel subnetworks, the resolution of each subnetwork being halved and the width, i.e. the number of channels, being doubled relative to the previous one; the 1st stage comprises 4 residual units, each followed by a 3×3 convolution that reduces the width of the feature maps to S, the S being the width of the subnetwork; the 2nd, 3rd and 4th stages comprise 1, 4 and 3 exchange blocks, respectively; one exchange block comprises 4 residual units, wherein each residual unit comprises two 3×3 convolutions at each resolution and one exchange unit across the different resolutions; there are 8 exchange units in total, i.e. 8 rounds of
multi-scale fusion are performed.
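The exchange-unit aggregation Yk = Σ a(Xi, k) of step 52) can be sketched as follows. This is an illustrative stand-in, not the patented implementation: strided slicing stands in for the stride-2 3×3 convolutions, and nearest-neighbour repetition stands in for the 1×1-convolution-plus-interpolation up-sampling.

```python
import numpy as np

def downsample(x, factor):
    # Stand-in for the stride-2 3x3 convolutions of the claim:
    # keep every `factor`-th pixel in each spatial dimension.
    return x[::factor, ::factor]

def upsample(x, factor):
    # Stand-in for the 1x1 convolution followed by interpolation:
    # nearest-neighbour repetition of each pixel.
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def a(x_i, i, k):
    # Resample response map x_i from resolution index i to index k;
    # each index step halves the spatial resolution.
    if i == k:
        return x_i                      # identity case: a(Xi, k) = Xi
    if i < k:
        return downsample(x_i, 2 ** (k - i))
    return upsample(x_i, 2 ** (i - k))

def exchange_unit(inputs):
    # Yk = sum over i of a(Xi, k): each output aggregates all inputs,
    # resampled to its own resolution.
    n = len(inputs)
    return [sum(a(x, i, k) for i, x in enumerate(inputs)) for k in range(n)]

# Two parallel response maps: 8x8 (high resolution) and 4x4 (half resolution).
X = [np.ones((8, 8)), np.ones((4, 4))]
Y = exchange_unit(X)
print(Y[0].shape, Y[1].shape)  # (8, 8) (4, 4): output resolutions match inputs
```

A cross-stage exchange unit would additionally emit a(Yn, n+1), i.e. one further down-sampling of the lowest-resolution output.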
3. The single-person pose estimation method based on a novel high-resolution network model according to claim 1, characterized in that, in the step 2), in the random picture-rotation mode, the rotation angle of the picture is -45° to 45°, i.e. the rotation angle of the picture ranges from 45° counterclockwise to 45° clockwise.
4. The single-person pose estimation method based on a novel high-resolution network model according to claim 1, characterized in that, in the step 2), the picture-flipping modes include horizontal flipping and vertical flipping.
5. The single-person pose estimation method based on a novel high-resolution network model according to claim 1, characterized in that, in the step 2), each picture randomly undergoes any two of the data enhancement methods.
6. The single-person pose estimation method based on a novel high-resolution network model according to claim 1, characterized in that, in the step 4), the resolution of the training set is set to the fixed size 256px × 192px, the 256 × 192 being the pixel-value matrix of the picture, wherein 256 is the number of pixels per column and 192 is the number of pixels per row.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910454096.1A CN110175575A (en) | 2019-05-29 | 2019-05-29 | A kind of single Attitude estimation method based on novel high-resolution network model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910454096.1A CN110175575A (en) | 2019-05-29 | 2019-05-29 | A kind of single Attitude estimation method based on novel high-resolution network model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110175575A true CN110175575A (en) | 2019-08-27 |
Family
ID=67695846
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910454096.1A Pending CN110175575A (en) | 2019-05-29 | 2019-05-29 | A kind of single Attitude estimation method based on novel high-resolution network model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110175575A (en) |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110705365A (en) * | 2019-09-06 | 2020-01-17 | 北京达佳互联信息技术有限公司 | Human body key point detection method and device, electronic equipment and storage medium |
CN110889858A (en) * | 2019-12-03 | 2020-03-17 | 中国太平洋保险(集团)股份有限公司 | Automobile part segmentation method and device based on point regression |
CN110969105A (en) * | 2019-11-22 | 2020-04-07 | 清华大学深圳国际研究生院 | Human body posture estimation method |
CN111274865A (en) * | 2019-12-14 | 2020-06-12 | 深圳先进技术研究院 | Remote sensing image cloud detection method and device based on full convolution neural network |
CN111339903A (en) * | 2020-02-21 | 2020-06-26 | 河北工业大学 | Multi-person human body posture estimation method |
CN111950412A (en) * | 2020-07-31 | 2020-11-17 | 陕西师范大学 | Hierarchical dance action attitude estimation method with sequence multi-scale depth feature fusion |
CN112364738A (en) * | 2020-10-30 | 2021-02-12 | 深圳点猫科技有限公司 | Human body posture estimation method, device, system and medium based on deep learning |
CN112580721A (en) * | 2020-12-19 | 2021-03-30 | 北京联合大学 | Target key point detection method based on multi-resolution feature fusion |
CN112861872A (en) * | 2020-12-31 | 2021-05-28 | 浙大城市学院 | Penaeus vannamei phenotype data determination method, device, computer equipment and storage medium |
CN112883761A (en) * | 2019-11-29 | 2021-06-01 | 北京达佳互联信息技术有限公司 | Method, device and equipment for constructing attitude estimation model and storage medium |
CN113221626A (en) * | 2021-03-04 | 2021-08-06 | 北京联合大学 | Human body posture estimation method based on Non-local high-resolution network |
CN113343762A (en) * | 2021-05-07 | 2021-09-03 | 北京邮电大学 | Human body posture estimation grouping model training method, posture estimation method and device |
CN113361378A (en) * | 2021-06-02 | 2021-09-07 | 合肥工业大学 | Human body posture estimation method using adaptive data enhancement |
CN113449609A (en) * | 2021-06-09 | 2021-09-28 | 东华大学 | Subway violation early warning method based on improved HigherHRNet model and DNN (deep neural network) |
CN114241051A (en) * | 2021-12-21 | 2022-03-25 | 盈嘉互联(北京)科技有限公司 | Object attitude estimation method for indoor complex scene |
CN114492216A (en) * | 2022-04-19 | 2022-05-13 | 中国石油大学(华东) | Pumping unit operation track simulation method based on high-resolution representation learning |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108229445A (en) * | 2018-02-09 | 2018-06-29 | 深圳市唯特视科技有限公司 | A kind of more people's Attitude estimation methods based on cascade pyramid network |
CN109447906A (en) * | 2018-11-08 | 2019-03-08 | 北京印刷学院 | A kind of picture synthetic method based on generation confrontation network |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108229445A (en) * | 2018-02-09 | 2018-06-29 | 深圳市唯特视科技有限公司 | A kind of more people's Attitude estimation methods based on cascade pyramid network |
CN109447906A (en) * | 2018-11-08 | 2019-03-08 | 北京印刷学院 | A kind of picture synthetic method based on generation confrontation network |
Non-Patent Citations (2)
Title |
---|
Ke Sun et al.: "Deep High-Resolution Representation Learning for Human Pose Estimation", arXiv * |
Lipeng Ke et al.: "Multi-Scale Structure-Aware Network for Human Pose Estimation", arXiv * |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110705365A (en) * | 2019-09-06 | 2020-01-17 | 北京达佳互联信息技术有限公司 | Human body key point detection method and device, electronic equipment and storage medium |
CN110969105A (en) * | 2019-11-22 | 2020-04-07 | 清华大学深圳国际研究生院 | Human body posture estimation method |
CN112883761A (en) * | 2019-11-29 | 2021-06-01 | 北京达佳互联信息技术有限公司 | Method, device and equipment for constructing attitude estimation model and storage medium |
CN112883761B (en) * | 2019-11-29 | 2023-12-12 | 北京达佳互联信息技术有限公司 | Construction method, device, equipment and storage medium of attitude estimation model |
CN110889858A (en) * | 2019-12-03 | 2020-03-17 | 中国太平洋保险(集团)股份有限公司 | Automobile part segmentation method and device based on point regression |
CN111274865A (en) * | 2019-12-14 | 2020-06-12 | 深圳先进技术研究院 | Remote sensing image cloud detection method and device based on full convolution neural network |
CN111274865B (en) * | 2019-12-14 | 2023-09-19 | 深圳先进技术研究院 | Remote sensing image cloud detection method and device based on full convolution neural network |
CN111339903A (en) * | 2020-02-21 | 2020-06-26 | 河北工业大学 | Multi-person human body posture estimation method |
CN111339903B (en) * | 2020-02-21 | 2022-02-08 | 河北工业大学 | Multi-person human body posture estimation method |
CN111950412A (en) * | 2020-07-31 | 2020-11-17 | 陕西师范大学 | Hierarchical dance action attitude estimation method with sequence multi-scale depth feature fusion |
CN111950412B (en) * | 2020-07-31 | 2023-11-24 | 陕西师范大学 | Hierarchical dance motion gesture estimation method based on sequence multi-scale depth feature fusion |
CN112364738A (en) * | 2020-10-30 | 2021-02-12 | 深圳点猫科技有限公司 | Human body posture estimation method, device, system and medium based on deep learning |
CN112580721A (en) * | 2020-12-19 | 2021-03-30 | 北京联合大学 | Target key point detection method based on multi-resolution feature fusion |
CN112580721B (en) * | 2020-12-19 | 2023-10-24 | 北京联合大学 | Target key point detection method based on multi-resolution feature fusion |
CN112861872A (en) * | 2020-12-31 | 2021-05-28 | 浙大城市学院 | Penaeus vannamei phenotype data determination method, device, computer equipment and storage medium |
CN113221626A (en) * | 2021-03-04 | 2021-08-06 | 北京联合大学 | Human body posture estimation method based on Non-local high-resolution network |
CN113221626B (en) * | 2021-03-04 | 2023-10-20 | 北京联合大学 | Human body posture estimation method based on Non-local high-resolution network |
CN113343762A (en) * | 2021-05-07 | 2021-09-03 | 北京邮电大学 | Human body posture estimation grouping model training method, posture estimation method and device |
CN113361378B (en) * | 2021-06-02 | 2023-03-10 | 合肥工业大学 | Human body posture estimation method using adaptive data enhancement |
CN113361378A (en) * | 2021-06-02 | 2021-09-07 | 合肥工业大学 | Human body posture estimation method using adaptive data enhancement |
CN113449609A (en) * | 2021-06-09 | 2021-09-28 | 东华大学 | Subway violation early warning method based on improved HigherHRNet model and DNN (deep neural network) |
CN114241051A (en) * | 2021-12-21 | 2022-03-25 | 盈嘉互联(北京)科技有限公司 | Object attitude estimation method for indoor complex scene |
CN114492216A (en) * | 2022-04-19 | 2022-05-13 | 中国石油大学(华东) | Pumping unit operation track simulation method based on high-resolution representation learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110175575A (en) | A kind of single Attitude estimation method based on novel high-resolution network model | |
CN105976378B (en) | Conspicuousness object detection method based on graph model | |
CN105631861B (en) | Restore the method for 3 D human body posture from unmarked monocular image in conjunction with height map | |
CN109064405A (en) | A kind of multi-scale image super-resolution method based on dual path network | |
CN112836597B (en) | Multi-hand gesture key point estimation method based on cascade parallel convolution neural network | |
CN108492316A (en) | A kind of localization method and device of terminal | |
CN103839277B (en) | A kind of mobile augmented reality register method of outdoor largescale natural scene | |
CN109271888A (en) | Personal identification method, device, electronic equipment based on gait | |
CN110427937A (en) | A kind of correction of inclination license plate and random length licence plate recognition method based on deep learning | |
CN109377530A (en) | A kind of binocular depth estimation method based on deep neural network | |
CN108229497A (en) | Image processing method, device, storage medium, computer program and electronic equipment | |
CN110795982A (en) | Apparent sight estimation method based on human body posture analysis | |
CN110020989A (en) | A kind of depth image super resolution ratio reconstruction method based on deep learning | |
CN110472542A (en) | A kind of infrared image pedestrian detection method and detection system based on deep learning | |
CN106599805A (en) | Supervised data driving-based monocular video depth estimating method | |
CN103020912B (en) | The remote sensing image restored method of a kind of combination wave band cluster and sparse expression | |
CN104318569A (en) | Space salient region extraction method based on depth variation model | |
CN112465827A (en) | Contour perception multi-organ segmentation network construction method based on class-by-class convolution operation | |
CN109887029A (en) | A kind of monocular vision mileage measurement method based on color of image feature | |
CN106886986B (en) | Image interfusion method based on adaptive group structure sparse dictionary study | |
CN110246181A (en) | Attitude estimation model training method, Attitude estimation method and system based on anchor point | |
US20120299906A1 (en) | Model-Based Face Image Super-Resolution | |
CN110246084A (en) | A kind of super-resolution image reconstruction method and its system, device, storage medium | |
Park et al. | Biologically inspired saliency map model for bottom-up visual attention | |
CN106372597B (en) | CNN Vehicle Detection method based on adaptive contextual information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20190827 |
|
RJ01 | Rejection of invention patent application after publication |