CN110175575A - Single-person pose estimation method based on a novel high-resolution network model - Google Patents

Single-person pose estimation method based on a novel high-resolution network model

Info

Publication number
CN110175575A
Authority
CN
China
Prior art keywords
network
picture
resolution
key point
parallel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910454096.1A
Other languages
Chinese (zh)
Inventor
陈志�
任杰
岳文静
周传
陈璐
刘玲
江婧
周松颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University
Priority to CN201910454096.1A
Publication of CN110175575A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses a method for single-person pose estimation based on a novel high-resolution network architecture. First, a detector is applied to input images that each contain a single pedestrian, inaccurate detection boxes are removed, and the dataset is expanded through data augmentation. The network structure is then instantiated so that parallel multi-resolution subnetworks maintain high-resolution feature maps throughout, with no need to recover resolution; exchange units introduced across the parallel subnetworks let each subnetwork repeatedly receive information from the other parallel subnetworks, which improves the accuracy of single-person pose estimation. Because keypoints are often occluded in complex scenes, a keypoint-masking data augmentation scheme is also proposed. With this scheme the trained convolutional neural network can be fine-tuned effectively so that occluded keypoints are robustly located through adjacent matching, which improves accuracy under occlusion and yields a better model.

Description

Single-person pose estimation method based on a novel high-resolution network model
Technical field
The present invention relates to a method based on a novel high-resolution network architecture and belongs to the interdisciplinary technical field of deep learning, computer vision, and machine learning.
Background art
2D human pose estimation has long been a fundamental yet challenging problem in computer vision. The goal of single-person pose estimation is to locate human anatomical keypoints (for example, elbows and wrists) or body parts. Pose estimation has a wide range of applications, mainly in intelligent video surveillance, human-computer interaction, virtual reality, and smart homes. This patent focuses on single-person pose estimation, which underlies other related problems such as multi-person pose estimation, video pose estimation, action recognition, and tracking.
Recent developments show that deep convolutional neural networks achieve the highest accuracy to date in single-person pose estimation, far exceeding traditional methods. Most existing methods pass the input through a network that downsamples high-resolution feature maps to low resolution and then recovers high resolution from the low-resolution feature maps (a process performed once or repeated several times), thereby achieving multi-scale feature extraction. In contrast, the network proposed in this patent maintains high-resolution feature maps throughout the whole process, which differs markedly from earlier mainstream methods. High resolution is maintained by gradually adding low-resolution feature-map subnetworks in parallel to the high-resolution feature-map main network.
As a widely recognized vision problem, pose estimation has troubled researchers for years. Real life often presents complex scenes in which pedestrians are occluded, and these scenes are challenging. A drawback of training a network only on the original training set is that it usually contains too few single-person images with occlusion to train a deep network for accurate keypoint detection and localization. A good single-person pose estimation system must be robust to occlusion and severe deformation and must succeed on rare and novel poses, but this problem has never been solved well. This patent proposes a novel keypoint-masking data augmentation scheme that adds training data for fine-tuning the network and improves accuracy in complex scenes.
In recent years, as research on deep learning has deepened, and because deep network models need no complicated feature engineering and can make full use of the information in images, more and more researchers have applied deep learning to single-person pose estimation and improved its accuracy.
Summary of the invention
Technical problem: the problem to be solved by the invention is to provide a single-person pose estimation method based on a novel high-resolution network model. The method uses parallel multi-resolution subnetworks and repeated multi-scale fusion so that the network maintains high-resolution feature maps throughout, without recovering resolution. It also fine-tunes the convolutional neural network with a new data augmentation method that adds training data containing occlusion, improving accuracy when a single pedestrian is occluded in complex scenes.
Technical solution: to achieve the above object, the invention adopts the following technical solution:
A single-person pose estimation method based on a novel high-resolution network model, comprising the following steps:
Step 1) Input a set of RGB images of single-pedestrian poses annotated with keypoint coordinates as the dataset, use a detector to mark the pedestrian in each image with a rectangular box, and denote the box region as R;
Step 2) Save the box region R as an image and apply data augmentation to it; the data augmentation includes randomly rotating the image, flipping the image, and adding Gaussian noise to the image;
Step 3) Split the dataset into a training set and a test set at a ratio of 7:3 for training and testing the convolutional neural network;
Step 4) Resize the training-set images to a fixed resolution;
Step 5) Instantiate the network structure according to the principles of parallel multi-resolution subnetworks and repeated multi-scale fusion; parallel multi-resolution subnetworks means that feature-map subnetworks with a resolution lower than that of the main network are added in parallel to the high-resolution feature-map main network; repeated multi-scale fusion means that the parallel networks exchange information with one another to achieve multi-scale fusion, and that this fusion is performed multiple times;
Step 6) Train a convolutional neural network on the training set. The network convolves the input image using multiple windows and multiple convolution kernels and comprises convolutional layers, pooling layers, and fully connected layers. The rectified linear unit (ReLU) is used as the activation function, $f(x) = \max(x, 0)$, where x is the output of the previous layer and $\max(x, 0)$ returns the larger of x and 0. Mean squared error is used as the loss function, and the model is constrained during training by dropout and weight regularization. Dropout is a method for optimizing deep artificial neural networks: during learning, some weights or outputs of the hidden layers are randomly set to 0, which reduces the interdependence between nodes and regularizes the network. The learning rate is adjusted dynamically during training: the base learning rate is set to $10^{-3}$ and is reduced to $10^{-4}$ at epoch 150 and to $10^{-5}$ at epoch 200; training ends at epoch 250, and every epoch performs forward and backward propagation over all training data;
Step 7) Use a novel keypoint-masking data augmentation scheme, which robustly locates occluded keypoints through adjacent matching, to add training data for fine-tuning the network and improve accuracy in complex scenes; keypoint-masking data augmentation means either artificially occluding a keypoint or copying a body keypoint patch and pasting it onto the image;
Step 8) Input a three-channel RGB image containing a single-pedestrian pose; this is the user input image;
Step 9) Detect the pedestrian in the user input image and mark it with a rectangular box;
Step 10) Feed the box region into the novel high-resolution network and run forward propagation to obtain heatmaps; a heatmap gives the probability of a joint at each pixel;
Step 11) Predict the position of each keypoint by shifting the location of the maximum heatmap value by a quarter offset in the direction from the highest response toward the second-highest response, then connect the corresponding keypoints to obtain the pose estimate of the single pedestrian in the image.
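The learning-rate schedule in step 6 can be sketched as a small step function. This is a minimal illustration rather than the inventors' code: the name `learning_rate`, the 1-based epoch numbering, and the choice that each reduction takes effect at the start of epoch 150 and epoch 200 are assumptions.

```python
TOTAL_EPOCHS = 250  # training ends at epoch 250 (from step 6)


def learning_rate(epoch: int) -> float:
    """Step schedule for a 1-based epoch number (numbering is an assumption)."""
    if epoch < 150:
        return 1e-3  # base learning rate
    if epoch < 200:
        return 1e-4  # first reduction, at epoch 150
    return 1e-5      # second reduction, at epoch 200


# One learning rate per epoch, for epochs 1..250.
schedule = [learning_rate(epoch) for epoch in range(1, TOTAL_EPOCHS + 1)]
```

In a training loop, the value for the current epoch would be handed to the optimizer before the forward and backward passes over the training set.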
Further, step 5) comprises the following steps:
Step 51) Parallel multi-resolution subnetworks: low-resolution feature-map subnetworks are added in parallel to the high-resolution feature-map main network; consequently, the resolutions of the parallel subnetworks at a later stage consist of the resolutions of the previous stage plus one lower resolution;
Step 52) Repeated multi-scale fusion: exchange units are introduced across the parallel subnetworks so that each subnetwork repeatedly receives information from the other parallel subnetworks. Let the inputs be $\{X_1, X_2, \ldots, X_n\}$ and the outputs be $\{Y_1, Y_2, \ldots, Y_n\}$, where X denotes an input response map, Y denotes an output response map, $X_n$ is the n-th input response map, and $Y_n$ is the n-th output response map; each output has the same resolution as the corresponding input. Each output is an aggregation of the input maps: $Y_k = \sum_{i=1}^{n} a(X_i, k)$. An exchange unit that crosses stages has an additional output map $Y_{n+1} = a(Y_n, n+1)$. Here i and k are response-map indices, $X_i$ is the i-th input response map, $Y_k$ is the k-th output response map, and $Y_{n+1}$ is the (n+1)-th output response map. The function $a(X_i, k)$ upsamples or downsamples $X_i$ from resolution i to resolution k, and $a(Y_n, n+1)$ upsamples or downsamples $Y_n$ from resolution n to resolution n+1. Both upsampling and downsampling are implemented with convolutions. Downsampling uses strided 3 × 3 convolutions: one 3 × 3 convolution with stride 2 gives 2× downsampling, and two consecutive 3 × 3 convolutions with stride 2 give 4× downsampling. Upsampling applies a 1 × 1 convolution followed by interpolation, that is, an interpolation algorithm inserts new elements between the pixels of the original image. If i = k, then $a(X_i, k) = X_i$. The heatmaps are regressed from the output of the last exchange unit;
Step 53) Network instantiation: a keypoint heatmap estimation network is instantiated. The main body of the network contains four parallel subnetworks; from one subnetwork to the next the resolution is halved and the width (number of channels) is doubled. Stage 1 contains 4 residual units followed by one 3 × 3 convolution that reduces the width of the feature maps to S, where S is the subnetwork width. Stages 2, 3, and 4 contain 1, 4, and 3 exchange blocks respectively. One exchange block contains 4 residual units, each of which contains two 3 × 3 convolutions at each resolution, and one exchange unit across resolutions. In total there are 8 exchange units, that is, 8 multi-scale fusions are performed.
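As an illustration of step 52, the following sketch lists which resampling operation a(X_i, k) an exchange unit would apply to each input before the inputs are summed into Y_k. The helper names `resample_plan` and `exchange_unit_plan` are hypothetical, branch indices are assumed to be 1-based with 1 as the highest resolution, and adjacent branches are assumed to differ by a factor of 2 as described in step 53.

```python
def resample_plan(i: int, k: int) -> str:
    """Describe a(X_i, k) for 1-based branch indices (1 = highest resolution)."""
    if i == k:
        return "identity"  # a(X_i, k) = X_i when i == k
    if i < k:
        # Higher to lower resolution: (k - i) consecutive stride-2 3x3 convolutions.
        n = k - i
        return f"{n} x (3x3 conv, stride 2) -> {2 ** n}x downsampling"
    # Lower to higher resolution: 1x1 convolution, then interpolation.
    n = i - k
    return f"1x1 conv, then {2 ** n}x interpolation upsampling"


def exchange_unit_plan(n: int) -> dict:
    """For each output Y_k, map every input index i to the operation applied to X_i."""
    return {
        k: {i: resample_plan(i, k) for i in range(1, n + 1)}
        for k in range(1, n + 1)
    }


plan = exchange_unit_plan(3)
```

Each output Y_k is then the element-wise sum of the n resampled inputs listed under `plan[k]`.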
Further, in step 2), the random rotation angle of the image lies in the range -45° to 45°, that is, from 45° counterclockwise to 45° clockwise.
Further, in step 2), flipping the image includes horizontal flipping and vertical flipping.
Further, in step 2), each image is randomly passed through any two of the data augmentation methods.
Further, in step 4), the training-set images are resized to a fixed 256 px × 192 px, a pixel matrix in which 256 is the number of pixels per column and 192 is the number of pixels per row.
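The augmentation settings in the refinements above can be sketched as a sampler that picks two of the three modes for each image. The names `sample_augmentations` and `AUGMENTATIONS` and the noise strength `sigma` are assumptions for illustration; the text specifies only the rotation range, the flip directions, the two-of-three rule, and the 256 × 192 input size.

```python
import random

INPUT_HEIGHT, INPUT_WIDTH = 256, 192  # pixels per column, pixels per row (step 4)
AUGMENTATIONS = ("rotate", "flip", "gaussian_noise")


def sample_augmentations(rng: random.Random) -> list:
    """Pick two distinct augmentation modes and draw their parameters."""
    plan = []
    for name in rng.sample(AUGMENTATIONS, 2):
        if name == "rotate":
            # Negative = counterclockwise, positive = clockwise, per the text.
            plan.append((name, {"angle_deg": rng.uniform(-45.0, 45.0)}))
        elif name == "flip":
            plan.append((name, {"direction": rng.choice(("horizontal", "vertical"))}))
        else:
            # The noise strength is not given in the text; 0.01 is a placeholder.
            plan.append((name, {"sigma": 0.01}))
    return plan


plan = sample_augmentations(random.Random(0))
```

Passing a seeded `random.Random` keeps the sampled plan reproducible; the plan would then be applied to the cropped region R and to its keypoint coordinates together so the annotations stay aligned with the image.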
Beneficial effects: compared with the prior art, the invention with the above technical solution has the following advantages:
The invention performs single-person pose estimation based on a novel high-resolution network architecture. The network is instantiated according to the principles of parallel multi-resolution subnetworks and repeated multi-scale fusion, which improves the accuracy of single-person pose estimation. The invention also proposes a novel keypoint-masking data augmentation scheme that adds training data containing occlusion for fine-tuning the network and improves accuracy when a single pedestrian is occluded in complex scenes.
Specifically:
(1) The invention builds the network according to the principle of parallel multi-resolution subnetworks: starting from a high-resolution subnetwork as the first stage, subnetworks are added gradually from high to low resolution, and the multi-resolution subnetworks are connected in parallel. The resolutions of the parallel subnetworks at a later stage therefore consist of the resolutions of the previous stage plus one lower resolution. The parallel multi-resolution subnetworks maintain high-resolution feature maps throughout, without recovering resolution;
(2) The invention builds the network according to the principle of repeated multi-scale fusion: exchange units are introduced across the parallel subnetworks so that each subnetwork repeatedly receives information from the other parallel subnetworks;
(3) The invention proposes a novel keypoint-masking data augmentation scheme that robustly locates occluded keypoints through adjacent matching, adding training data for fine-tuning the network and improving accuracy for a single pedestrian under strong occlusion in complex scenes.
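A minimal sketch of the two keypoint-masking variants in item (3), using an image stored as a list of pixel rows. The function names, the square patch shape, the `half_size` parameter, and the fill value are assumptions; the text does not say how large the patch is or what an occluded region is filled with.

```python
def mask_keypoint(image, keypoint, half_size=1, fill=0):
    """Return a copy of `image` with a square patch around `keypoint` overwritten.

    image: list of rows of pixel values; keypoint: (x, y) pixel coordinates.
    """
    height, width = len(image), len(image[0])
    x, y = keypoint
    masked = [row[:] for row in image]  # leave the input untouched
    for yy in range(max(0, y - half_size), min(height, y + half_size + 1)):
        for xx in range(max(0, x - half_size), min(width, x + half_size + 1)):
            masked[yy][xx] = fill
    return masked


def paste_keypoint_patch(image, source, target, half_size=1):
    """Return a copy of `image` with the patch around `source` pasted at `target`."""
    height, width = len(image), len(image[0])
    sx, sy = source
    tx, ty = target
    pasted = [row[:] for row in image]
    for dy in range(-half_size, half_size + 1):
        for dx in range(-half_size, half_size + 1):
            src_y, src_x = sy + dy, sx + dx
            dst_y, dst_x = ty + dy, tx + dx
            inside_src = 0 <= src_y < height and 0 <= src_x < width
            inside_dst = 0 <= dst_y < height and 0 <= dst_x < width
            if inside_src and inside_dst:
                # Read from the original so overlapping patches copy cleanly.
                pasted[dst_y][dst_x] = image[src_y][src_x]
    return pasted
```

Both helpers return new images, so the original sample and its augmented copies can be kept side by side in the fine-tuning set.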
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention.
Detailed description of the embodiments
The technical solution of the invention is described in further detail below with reference to the accompanying drawing:
As shown in Fig. 1, a single-person pose estimation method based on a novel high-resolution network model comprises the following steps:
Step 1) Input a set of RGB images of single-pedestrian poses annotated with keypoint coordinates as the dataset, use a detector to mark the pedestrian in each image with a rectangular box, and denote the box region as R;
Step 2) Save the box region R as an image and apply data augmentation to it; the data augmentation includes randomly rotating the image, flipping the image, and adding Gaussian noise to the image. Specifically, the random rotation angle lies in the range -45° to 45°, that is, from 45° counterclockwise to 45° clockwise; flipping includes horizontal flipping and vertical flipping; and each image is randomly passed through any two of the data augmentation methods.
Step 3) Split the dataset into a training set and a test set at a ratio of 7:3 for training and testing the convolutional neural network;
Step 4) Resize the training-set images to a fixed 256 px × 192 px, a pixel matrix in which 256 is the number of pixels per column and 192 is the number of pixels per row.
Step 5) Instantiate the network structure according to the principles of parallel multi-resolution subnetworks and repeated multi-scale fusion; parallel multi-resolution subnetworks means that feature-map subnetworks with a resolution lower than that of the main network are added in parallel to the high-resolution feature-map main network; repeated multi-scale fusion means that the parallel networks exchange information with one another to achieve multi-scale fusion, and that this fusion is performed multiple times;
In particular, step 5) comprises the following steps:
Step 51) Parallel multi-resolution subnetworks: low-resolution feature-map subnetworks are gradually added in parallel to the high-resolution feature-map main network. The initial network is designated as high resolution, and each feature-map subnetwork added in parallel has a lower resolution than the initial network; consequently, the resolutions of the parallel subnetworks at a later stage consist of the resolutions of the previous stage plus one lower resolution;
Step 52) Repeated multi-scale fusion: exchange units are introduced across the parallel subnetworks so that each subnetwork repeatedly receives information from the other parallel subnetworks. Let the inputs be $\{X_1, X_2, \ldots, X_n\}$ and the outputs be $\{Y_1, Y_2, \ldots, Y_n\}$, where X denotes an input response map, Y denotes an output response map, $X_n$ is the n-th input response map, and $Y_n$ is the n-th output response map; each output has the same resolution as the corresponding input. Each output is an aggregation of the input maps: $Y_k = \sum_{i=1}^{n} a(X_i, k)$. An exchange unit that crosses stages has an additional output map $Y_{n+1} = a(Y_n, n+1)$. Here i and k are response-map indices, $X_i$ is the i-th input response map, $Y_k$ is the k-th output response map, and $Y_{n+1}$ is the (n+1)-th output response map. The function $a(X_i, k)$ upsamples or downsamples $X_i$ from resolution i to resolution k, and $a(Y_n, n+1)$ upsamples or downsamples $Y_n$ from resolution n to resolution n+1. Both upsampling and downsampling are implemented with convolutions. Downsampling uses strided 3 × 3 convolutions: one 3 × 3 convolution with stride 2 gives 2× downsampling, and two consecutive 3 × 3 convolutions with stride 2 give 4× downsampling. Upsampling applies a 1 × 1 convolution followed by interpolation, that is, an interpolation algorithm inserts new elements between the pixels of the original image. If i = k, then $a(X_i, k) = X_i$. The heatmaps are regressed from the output of the last exchange unit;
Step 53) Network instantiation: a keypoint heatmap estimation network is instantiated. The main body of the network contains four parallel subnetworks; from one subnetwork to the next the resolution is halved and the width (number of channels) is doubled. Stage 1 contains 4 residual units followed by one 3 × 3 convolution that reduces the width of the feature maps to S, where S is the subnetwork width. Stages 2, 3, and 4 contain 1, 4, and 3 exchange blocks respectively. One exchange block contains 4 residual units, each of which contains two 3 × 3 convolutions at each resolution, and one exchange unit across resolutions. In total there are 8 exchange units, that is, 8 multi-scale fusions are performed.
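The instantiation in step 53 can be checked with a little bookkeeping. In this sketch, `S = 32` is an assumed example value (the text leaves S unspecified), resolutions are expressed relative to the highest-resolution branch, and it is assumed that stage k runs k parallel branches and that each exchange block contains exactly one exchange unit.

```python
S = 32  # assumed width of the highest-resolution branch; not given in the text

EXCHANGE_BLOCKS = {2: 1, 3: 4, 4: 3}  # stage -> number of exchange blocks (step 53)
RESIDUAL_UNITS_PER_BLOCK = 4


def branches(stage: int, base_width: int = S) -> list:
    """Branches active at `stage` as (relative resolution, channel width) pairs."""
    # Each added branch halves the resolution and doubles the width.
    return [(1 / 2 ** b, base_width * 2 ** b) for b in range(stage)]


# One exchange unit per exchange block (assumption stated above).
total_exchange_units = sum(EXCHANGE_BLOCKS.values())

for stage in range(1, 5):
    print(stage, branches(stage))
```

With these assumptions the fourth stage has branches at relative resolutions 1, 1/2, 1/4, and 1/8 with widths S, 2S, 4S, and 8S, and the exchange-unit count matches the total of 8 given in the text.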
Step 6) Train a convolutional neural network on the training set. The network convolves the input image using multiple windows and multiple convolution kernels and comprises convolutional layers, pooling layers, and fully connected layers. The rectified linear unit (ReLU) is used as the activation function, $f(x) = \max(x, 0)$, where x is the output of the previous layer and $\max(x, 0)$ returns the larger of x and 0. Mean squared error is used as the loss function, and the model is constrained during training by dropout and weight regularization. Dropout is a method for optimizing deep artificial neural networks: during learning, some weights or outputs of the hidden layers are randomly set to 0, which reduces the interdependence between nodes and regularizes the network. The learning rate is adjusted dynamically during training: the base learning rate is set to $10^{-3}$ and is reduced to $10^{-4}$ at epoch 150 and to $10^{-5}$ at epoch 200; training ends at epoch 250, and every epoch performs forward and backward propagation over all training data;
Because pedestrians are often occluded, these challenging scenes are handled by augmenting the data either by artificially occluding a keypoint or by copying a body keypoint patch and pasting it onto the image, which adds training data for fine-tuning the network. The user inputs a three-channel RGB image containing a single pedestrian. The pedestrian in the user input image is detected and marked with a rectangular box, and the box region is fed into the network to obtain heatmaps. The position of each keypoint is predicted by shifting the location of the maximum heatmap value by a quarter offset in the direction from the highest response toward the second-highest response, and the corresponding keypoints are then connected to obtain the pose estimate of the single pedestrian in the image.
Step 7) Use a novel keypoint-masking data augmentation scheme, which robustly locates occluded keypoints through adjacent matching, to add training data for fine-tuning the network and improve accuracy in complex scenes; keypoint-masking data augmentation means either artificially occluding a keypoint or copying a body keypoint patch and pasting it onto the image;
Step 8) Input a three-channel RGB image containing a single-pedestrian pose; this is the user input image;
Step 9) Detect the pedestrian in the user input image and mark it with a rectangular box;
Step 10) Feed the box region into the novel high-resolution network and run forward propagation to obtain heatmaps; a heatmap gives the probability of a joint at each pixel;
Step 11) Predict the position of each keypoint by shifting the location of the maximum heatmap value by a quarter offset in the direction from the highest response toward the second-highest response, then connect the corresponding keypoints to obtain the pose estimate of the single pedestrian in the image.
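Step 11 can be sketched as follows for a single heatmap stored as a list of rows. The direction toward the second-highest response is approximated here by the sign of the difference between the two neighbors of the maximum along each axis; this is a common simplification that is assumed for illustration and is not taken from the text. `decode_keypoint` is a hypothetical name.

```python
def decode_keypoint(heatmap):
    """Return (x, y) of the heatmap maximum, refined by a quarter-pixel offset."""
    height, width = len(heatmap), len(heatmap[0])
    # Location of the highest response (first one wins on ties).
    best_y, best_x = max(
        ((y, x) for y in range(height) for x in range(width)),
        key=lambda p: heatmap[p[0]][p[1]],
    )
    x, y = float(best_x), float(best_y)

    def sign(value):
        return (value > 0) - (value < 0)

    # Shift a quarter pixel toward the larger neighbor along each axis.
    if 0 < best_x < width - 1:
        x += 0.25 * sign(heatmap[best_y][best_x + 1] - heatmap[best_y][best_x - 1])
    if 0 < best_y < height - 1:
        y += 0.25 * sign(heatmap[best_y + 1][best_x] - heatmap[best_y - 1][best_x])
    return x, y


example = [
    [0.0, 0.0, 0.0],
    [0.1, 1.0, 0.5],
    [0.0, 0.2, 0.0],
]
position = decode_keypoint(example)
```

In the example the maximum sits at (1, 1) and the larger neighbors lie to the right and below, so the refined position moves a quarter pixel in both directions. Coordinates are in heatmap pixels and would still need to be mapped back to the detected box in the input image.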
The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the invention.

Claims (6)

1. A single-person pose estimation method based on a novel high-resolution network model, characterized by comprising the following steps:
Step 1) Input a set of RGB images of single-pedestrian poses annotated with keypoint coordinates as the dataset, use a detector to mark the pedestrian in each image with a rectangular box, and denote the box region as R;
Step 2) Save the box region R as an image and apply data augmentation to it; the data augmentation includes randomly rotating the image, flipping the image, and adding Gaussian noise to the image;
Step 3) Split the dataset into a training set and a test set at a ratio of 7:3 for training and testing the convolutional neural network;
Step 4) Resize the training-set images to a fixed resolution;
Step 5) Instantiate the network structure according to the principles of parallel multi-resolution subnetworks and repeated multi-scale fusion; parallel multi-resolution subnetworks means that feature-map subnetworks with a resolution lower than that of the main network are added in parallel to the high-resolution feature-map main network; repeated multi-scale fusion means that the parallel networks exchange information with one another to achieve multi-scale fusion, and that this fusion is performed multiple times;
Step 6) Train a convolutional neural network on the training set. The network convolves the input image using multiple windows and multiple convolution kernels and comprises convolutional layers, pooling layers, and fully connected layers. The rectified linear unit (ReLU) is used as the activation function, $f(x) = \max(x, 0)$, where x is the output of the previous layer and $\max(x, 0)$ returns the larger of x and 0. Mean squared error is used as the loss function, and the model is constrained during training by dropout and weight regularization. Dropout is a method for optimizing deep artificial neural networks: during learning, some weights or outputs of the hidden layers are randomly set to 0, which reduces the interdependence between nodes and regularizes the network. The learning rate is adjusted dynamically during training: the base learning rate is set to $10^{-3}$ and is reduced to $10^{-4}$ at epoch 150 and to $10^{-5}$ at epoch 200; training ends at epoch 250, and every epoch performs forward and backward propagation over all training data;
Step 7) Use a novel keypoint-masking data augmentation scheme, which robustly locates occluded keypoints through adjacent matching, to add training data for fine-tuning the network and improve accuracy in complex scenes; keypoint-masking data augmentation means either artificially occluding a keypoint or copying a body keypoint patch and pasting it onto the image;
Step 8) Input a three-channel RGB image containing a single-pedestrian pose; this is the user input image;
Step 9) Detect the pedestrian in the user input image and mark it with a rectangular box;
Step 10) Feed the box region into the novel high-resolution network and run forward propagation to obtain heatmaps; a heatmap gives the probability of a joint at each pixel;
Step 11) Predict the position of each keypoint by shifting the location of the maximum heatmap value by a quarter offset in the direction from the highest response toward the second-highest response, then connect the corresponding keypoints to obtain the pose estimate of the single pedestrian in the image.
2. a kind of single Attitude estimation method based on novel high-resolution network model according to claim 1, special Sign is, the step 5), comprising the following steps:
The parallel multiresolution subnet of step 51): by the way that low resolution characteristic pattern is added parallel in high-resolution features figure master network Sub-network;Therefore, the resolution ratio of the parallel sub-network of the latter half includes that the resolution ratio of previous stage and one compare previous stage Lower resolution ratio;
Step 52) Multiscale Fusion repeatedly: introducing crosspoint in parallel sub-network, each sub-network repeatedly from other simultaneously Row sub-network receives information, if input is { X1, X2..., Xn, it exports as { Y1, Y2..., Yn, in which: X indicates the response of input Figure, Y indicate the response diagram of output, XnIndicate n-th of input response diagram, YnIndicate n-th of output response figure, the resolution ratio of output Identical as the resolution ratio of input, each output is the polymerization of input mappingCrosspoint across the stage has Additional output maps Yn+1: Yn+1=a (Yn, n+1), in which: i indicates i-th of response diagram serial number, and k indicates k-th of response diagram sequence Number, XiIndicate i-th of input response diagram, Yn+1Indicate (n+1)th output response figure, YkIndicate k-th of output response figure, the letter Number a (Xi, k) indicate by from i to k to XiCarry out up-sampling or down-sampling, a (Yn, n+1) and it indicates from n to n+1 to YnAdopt Sample or down-sampling, in which: up-sample and down-sampling is completed by convolution, adopt using 3 × 3 convolution to stride Sample, a hip walks 3 × 3 convolution and stride 2 carries out 2 × down-sampling, walks 3 × 3 convolution with the continuous hip of stride 2 and carries out 4 × down-sampling; For up-sampling, passed through interpolation value method later using 1 × 1 convolution and up-sampled, the interpolation value method is referred in original figure As on the basis of, carry out being inserted into new element using interpolation algorithm between pixel;If i=k, then function representation are as follows: a (Xi, k) and=Xi, thermal map is returned from the output of the last one crosspoint;
Step 53) Network instantiation: the key point heat map estimation network is instantiated; the body of the network comprises four parallel subnetworks, and from each subnetwork to the next the resolution is halved and the width (number of channels) is doubled; the 1st stage contains 4 residual units followed by a 3 × 3 convolution that reduces the width of the feature maps to S, where S is the width of the subnetwork; the 2nd, 3rd and 4th stages contain 1, 4 and 3 exchange blocks respectively; each exchange block contains 4 residual units, each residual unit containing two 3 × 3 convolutions at each resolution, and one exchange unit across the different resolutions; in total there are 8 exchange units, i.e. multi-scale fusion is performed 8 times.
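The layout described in steps 51) to 53) can be sketched in plain Python as follows. This is only an illustrative outline, not the applicant's implementation: the function names are hypothetical, the resampling operations are represented as text labels rather than real convolutions, and the 64 × 48 highest-resolution feature map and the width S = 32 used in the usage note are assumed values that this section does not state.

```python
# Illustrative sketch of steps 51) to 53); all names are hypothetical.

def branch_shapes(height, width, base_width, num_branches):
    """Return (height, width, channels) for each parallel subnetwork.

    Each additional branch halves the resolution and doubles the channel width.
    """
    return [
        (height // 2 ** b, width // 2 ** b, base_width * 2 ** b)
        for b in range(num_branches)
    ]


def resample_plan(i, k):
    """Describe a(X_i, k): how branch i is brought to the resolution of branch k.

    Branch indices start at 1; a larger index means a lower resolution.
    """
    if i == k:
        return ["identity"]
    if i < k:
        # Down-sampling: one stride-2 3x3 convolution per factor of 2.
        return ["conv3x3(stride=2)"] * (k - i)
    # Up-sampling: a 1x1 convolution, then interpolation by the required factor.
    return ["conv1x1", f"interpolate(x{2 ** (i - k)})"]


def fusion_plan(n):
    """For each output Y_k, list the plan applied to every input X_i before summing."""
    return {
        k: {i: resample_plan(i, k) for i in range(1, n + 1)}
        for k in range(1, n + 1)
    }


# Stage number -> number of exchange blocks, as given in step 53).
EXCHANGE_BLOCKS_PER_STAGE = {2: 1, 3: 4, 4: 3}
total_exchange_units = sum(EXCHANGE_BLOCKS_PER_STAGE.values())
```

With the assumed values, `branch_shapes(64, 48, 32, 4)` lists four branches whose resolution halves and whose width doubles from one branch to the next, `resample_plan(1, 3)` yields two consecutive stride-2 convolutions (4× down-sampling), and `total_exchange_units` equals the 8 fusions stated in step 53).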
3. The single-person pose estimation method based on a novel high-resolution network model according to claim 1, characterized in that, in step 2), when a picture is randomly rotated, the rotation angle is between -45° and 45°, i.e. from 45° counterclockwise to 45° clockwise.
4. The single-person pose estimation method based on a novel high-resolution network model according to claim 1, characterized in that, in step 2), picture flipping includes horizontal flipping and vertical flipping.
5. The single-person pose estimation method based on a novel high-resolution network model according to claim 1, characterized in that, in step 2), each picture is randomly processed by any two of the data augmentation modes.
6. The single-person pose estimation method based on a novel high-resolution network model according to claim 1, characterized in that, in step 4), the training set resolution is set to a fixed size of 256px × 192px, where 256 × 192 is the pixel matrix, 256 being the number of pixels per column and 192 the number of pixels per row.
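The data handling in claims 3 to 6 can be illustrated with a short Python sketch using only the standard library. It is a minimal outline under stated assumptions: pictures are modelled as lists of pixel rows, all function names are hypothetical, and the third mode name in the usage note is a placeholder, because the full list of augmentation modes is given in claim 1 rather than here. A negative angle is taken to mean counterclockwise, following the wording of claim 3.

```python
import random

# Fixed training size from claim 6: 256 pixels per column, 192 pixels per row.
TRAIN_HEIGHT, TRAIN_WIDTH = 256, 192


def random_rotation_angle(rng):
    """Draw a rotation angle in degrees from the range -45 to 45 (claim 3)."""
    return rng.uniform(-45.0, 45.0)


def flip_horizontal(image):
    """Mirror each row left to right; image is a list of pixel rows."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reverse the order of the rows, top to bottom."""
    return [list(row) for row in image[::-1]]


def choose_two_modes(modes, rng):
    """Pick two distinct augmentation modes at random (claim 5)."""
    return rng.sample(list(modes), 2)


def has_training_size(image):
    """Check that an image has 256 rows of 192 pixels each (claim 6)."""
    return len(image) == TRAIN_HEIGHT and all(
        len(row) == TRAIN_WIDTH for row in image
    )
```

For example, `choose_two_modes(["rotation", "flip", "other_mode"], random.Random(0))` returns two different entries of the list, and `has_training_size([[0] * 192 for _ in range(256)])` is true for a picture already resized to the fixed training resolution.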
CN201910454096.1A 2019-05-29 2019-05-29 A kind of single Attitude estimation method based on novel high-resolution network model Pending CN110175575A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910454096.1A CN110175575A (en) 2019-05-29 2019-05-29 A kind of single Attitude estimation method based on novel high-resolution network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910454096.1A CN110175575A (en) 2019-05-29 2019-05-29 A kind of single Attitude estimation method based on novel high-resolution network model

Publications (1)

Publication Number Publication Date
CN110175575A true CN110175575A (en) 2019-08-27

Family

ID=67695846

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910454096.1A Pending CN110175575A (en) 2019-05-29 2019-05-29 A kind of single Attitude estimation method based on novel high-resolution network model

Country Status (1)

Country Link
CN (1) CN110175575A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229445A (en) * 2018-02-09 2018-06-29 深圳市唯特视科技有限公司 A kind of more people's Attitude estimation methods based on cascade pyramid network
CN109447906A (en) * 2018-11-08 2019-03-08 北京印刷学院 A kind of picture synthetic method based on generation confrontation network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KE SUN ET AL: "Deep High-Resolution Representation Learning for Human Pose Estimation", 《ARXIV》 *
LIPENG KE ET AL.: "Multi-Scale Structure-Aware Network for Human Pose Estimation", 《ARXIV》 *

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110705365A (en) * 2019-09-06 2020-01-17 北京达佳互联信息技术有限公司 Human body key point detection method and device, electronic equipment and storage medium
CN110969105A (en) * 2019-11-22 2020-04-07 清华大学深圳国际研究生院 Human body posture estimation method
CN112883761A (en) * 2019-11-29 2021-06-01 北京达佳互联信息技术有限公司 Method, device and equipment for constructing attitude estimation model and storage medium
CN112883761B (en) * 2019-11-29 2023-12-12 北京达佳互联信息技术有限公司 Construction method, device, equipment and storage medium of attitude estimation model
CN110889858A (en) * 2019-12-03 2020-03-17 中国太平洋保险(集团)股份有限公司 Automobile part segmentation method and device based on point regression
CN111274865A (en) * 2019-12-14 2020-06-12 深圳先进技术研究院 Remote sensing image cloud detection method and device based on full convolution neural network
CN111274865B (en) * 2019-12-14 2023-09-19 深圳先进技术研究院 Remote sensing image cloud detection method and device based on full convolution neural network
CN111339903A (en) * 2020-02-21 2020-06-26 河北工业大学 Multi-person human body posture estimation method
CN111339903B (en) * 2020-02-21 2022-02-08 河北工业大学 Multi-person human body posture estimation method
CN111950412A (en) * 2020-07-31 2020-11-17 陕西师范大学 Hierarchical dance action attitude estimation method with sequence multi-scale depth feature fusion
CN111950412B (en) * 2020-07-31 2023-11-24 陕西师范大学 Hierarchical dance motion gesture estimation method based on sequence multi-scale depth feature fusion
CN112364738A (en) * 2020-10-30 2021-02-12 深圳点猫科技有限公司 Human body posture estimation method, device, system and medium based on deep learning
CN112580721A (en) * 2020-12-19 2021-03-30 北京联合大学 Target key point detection method based on multi-resolution feature fusion
CN112580721B (en) * 2020-12-19 2023-10-24 北京联合大学 Target key point detection method based on multi-resolution feature fusion
CN112861872A (en) * 2020-12-31 2021-05-28 浙大城市学院 Penaeus vannamei phenotype data determination method, device, computer equipment and storage medium
CN113221626A (en) * 2021-03-04 2021-08-06 北京联合大学 Human body posture estimation method based on Non-local high-resolution network
CN113221626B (en) * 2021-03-04 2023-10-20 北京联合大学 Human body posture estimation method based on Non-local high-resolution network
CN113343762A (en) * 2021-05-07 2021-09-03 北京邮电大学 Human body posture estimation grouping model training method, posture estimation method and device
CN113361378B (en) * 2021-06-02 2023-03-10 合肥工业大学 Human body posture estimation method using adaptive data enhancement
CN113361378A (en) * 2021-06-02 2021-09-07 合肥工业大学 Human body posture estimation method using adaptive data enhancement
CN113449609A (en) * 2021-06-09 2021-09-28 东华大学 Subway violation early warning method based on improved HigherHRNet model and DNN (deep neural network)
CN114241051A (en) * 2021-12-21 2022-03-25 盈嘉互联(北京)科技有限公司 Object attitude estimation method for indoor complex scene
CN114492216A (en) * 2022-04-19 2022-05-13 中国石油大学(华东) Pumping unit operation track simulation method based on high-resolution representation learning

Similar Documents

Publication Publication Date Title
CN110175575A (en) A kind of single Attitude estimation method based on novel high-resolution network model
CN105976378B (en) Conspicuousness object detection method based on graph model
CN105631861B (en) Restore the method for 3 D human body posture from unmarked monocular image in conjunction with height map
CN109064405A (en) A kind of multi-scale image super-resolution method based on dual path network
CN112836597B (en) Multi-hand gesture key point estimation method based on cascade parallel convolution neural network
CN108492316A (en) A kind of localization method and device of terminal
CN103839277B (en) A kind of mobile augmented reality register method of outdoor largescale natural scene
CN109271888A (en) Personal identification method, device, electronic equipment based on gait
CN110427937A (en) A kind of correction of inclination license plate and random length licence plate recognition method based on deep learning
CN109377530A (en) A kind of binocular depth estimation method based on deep neural network
CN108229497A (en) Image processing method, device, storage medium, computer program and electronic equipment
CN110795982A (en) Apparent sight estimation method based on human body posture analysis
CN110020989A (en) A kind of depth image super resolution ratio reconstruction method based on deep learning
CN110472542A (en) A kind of infrared image pedestrian detection method and detection system based on deep learning
CN106599805A (en) Supervised data driving-based monocular video depth estimating method
CN103020912B (en) The remote sensing image restored method of a kind of combination wave band cluster and sparse expression
CN104318569A (en) Space salient region extraction method based on depth variation model
CN112465827A (en) Contour perception multi-organ segmentation network construction method based on class-by-class convolution operation
CN109887029A (en) A kind of monocular vision mileage measurement method based on color of image feature
CN106886986B (en) Image interfusion method based on adaptive group structure sparse dictionary study
CN110246181A (en) Attitude estimation model training method, Attitude estimation method and system based on anchor point
US20120299906A1 (en) Model-Based Face Image Super-Resolution
CN110246084A (en) A kind of super-resolution image reconstruction method and its system, device, storage medium
Park et al. Biologically inspired saliency map model for bottom-up visual attention
CN106372597B (en) CNN Vehicle Detection method based on adaptive contextual information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication; application publication date: 20190827