CN108198192A

CN108198192A - A fast, high-precision human body segmentation method based on deep learning

Info

Publication number: CN108198192A
Application number: CN201810035086.XA
Authority: CN
Inventors: 任俊芬
Original assignee: Individual
Current assignee: Individual
Priority date: 2018-01-15
Filing date: 2018-01-15
Publication date: 2018-06-22

Abstract

A fast, high-precision human body segmentation method based on deep learning. The input image size is determined from the statistical proportions of the human body. A customized symmetric neural network structure is used, consisting of a convolutional network and a deconvolution network, and the outputs of convolutional layers of matching size are fed into the deconvolution network, which greatly enriches the network's ability to learn details. Finally, the network is trained on a large number of images containing human bodies; once trained, it can be used to segment human body images. The method has the advantages of high precision and high speed.

Description

A fast, high-precision human body segmentation method based on deep learning

Technical field

The present invention relates to technical fields such as artificial neural networks and computer vision, and in particular to a fast, high-precision human body segmentation method based on deep learning.

Background art

Human body segmentation is a classic problem in computer vision applications. The task is to distinguish, in a still image, the pixels that belong to a human body from those that do not. Seen another way, it can be treated as a pixel-level classification problem.

High-precision human body segmentation has to resolve many fine details of the human figure, for example the gaps between fingers, the gap between the legs, and the gap between an arm and the torso. Variations in clothing, illumination, accessories, and posture make pixel-level human body segmentation a considerable challenge.

Among methods that do not use deep learning, the standard framework for human body segmentation combines hand-crafted pixel features with a graph cut algorithm. The hand-crafted pixel features may include color features, SIFT (Scale-Invariant Feature Transform) features, LBP (Local Binary Pattern) features, HOG (Histogram of Oriented Gradients) features, and so on. In a graph cut algorithm, each pixel in the image is treated as a vertex of a graph, and the connection between two neighboring pixels is treated as an edge. The similarity between pixels is defined using their hand-crafted features, and the segmentation result is then obtained by solving the minimum cut problem on the graph. The drawback of this kind of scheme is that hand-crafted pixel features can only describe the region within a limited range around a pixel, and the minimum cut algorithm likewise only considers relationships between neighboring pixels. Human body segmentation, however, depends strongly on semantic context, so these methods struggle to achieve high-precision segmentation and in some cases even lose part of a limb. In terms of running speed, extracting the various hand-crafted features for every pixel is time-consuming, and solving the minimum cut problem is also costly, so real-time segmentation is extremely difficult to achieve.

Deep learning has greatly improved the precision of human body segmentation. In particular, fully convolutional networks (FCN, Fully Convolutional Networks) were the first to realize an end-to-end segmentation algorithm using a deep neural network. FCN applies an equivalent transformation to the fully connected layers of a classical classification network, replacing matrix multiplication with convolution. This change allows a traditional convolutional network to accept input images of arbitrary size, laying the groundwork for segmentation networks. FCN takes several of the more abstract layers of the classification network as outputs and uses deconvolution to restore these outputs to the input size, so that pixel-by-pixel classification training can be carried out. FCN can associate each pixel with semantic information from a larger range while providing an end-to-end learning scheme, and it is both more precise and faster than traditional segmentation methods. However, although it exploits high-level semantic information, it ignores local detail, so the details mentioned above are segmented poorly. Several later improvements based on FCN compensate to some extent for this insufficient use of local information: they adopt a symmetric network structure and use Pooling and Unpooling operations to enrich local information. However, the symmetric networks used by these methods are excessively large; although segmentation precision improves further, they cannot run in real time.

Summary of the invention

To further improve the precision and speed of existing human body segmentation algorithms, the present invention proposes a high-precision human body segmentation method based on deep learning.

The present invention is achieved through the following technical solutions：

A fast, high-precision human body segmentation method based on deep learning, comprising the following steps:

Step 1) Design the neural network: determine the input image size from the average height-to-width ratio of the human body. The neural network uses a symmetric funnel structure: the first half of the funnel uses convolutional layers, each with a stride of 2, and the second half uses deconvolution layers, each with a stride of 2. Each feature map from the first half of the funnel is concatenated onto the feature map of the same resolution in the second half, and the deconvolution operation acts on the concatenated feature map. The final output of the neural network is a feature map with a single channel;

Step 2) Train the neural network: collect a sufficient number of images containing human bodies and manually label every pixel of every image as one of two classes, "human body" or "non-human body"; accessories closely attached to the body are labeled as "human body" as well. The labeled image is the segmentation map. Using the original images and segmentation maps of the collected images, train the neural network with the stochastic gradient descent algorithm. During training, periodically test the performance of the neural network on a validation set; once the test meets the requirements, training can stop;

Step 3) Segment human bodies with the neural network: obtain an image containing a human body using a human body detection device and, after processing, feed it into the trained neural network. The neural network outputs the probability that each pixel in the image belongs to a human body. A probability threshold is chosen according to the application; pixels whose probability exceeds the threshold are human body pixels, which yields the segmentation map of the human body.

As a further improvement, the input image is a rectangle 150 pixels high and 70 pixels wide.

As a further improvement, in step 2) the original images and segmentation maps of the images containing human bodies are preprocessed as follows:

Mark the tight bounding rectangle of the human body according to the segmentation map, and expand each of the rectangle's four sides outward by a random margin; the resulting rectangle must not exceed the bounds of the original image;

Extract the region corresponding to the expanded rectangle from the original image, and perform the same extraction on the segmentation map;

Keeping the height-to-width ratio of the region extracted in the previous step unchanged, scale it to the network's input size, and fill the remaining area with pure black.

As a further improvement, the image data used for training is divided into two parts, a training set and a validation set; the training set is used to train the neural network, and the validation set is used to confirm whether training is complete.

Compared with the prior art, the present invention has the following advantages:

The present invention relates to a fast, high-precision human body segmentation method based on deep learning. Compared with existing methods, it uses a symmetric funnel structure built from convolutional layers and deconvolution layers, and every layer has a stride of 2, which keeps the number of layers in the neural network as small as possible. In particular, the method concatenates the convolutional-layer feature maps from the first half of the funnel with the feature maps of the deconvolution layers, so that the network can learn global features and also capture fine details, which greatly improves segmentation performance. In summary, the method has the advantages of high speed and high precision.

Description of the drawings

Fig. 1 is a flow diagram of Embodiment 1.

Detailed description of the embodiments

Embodiment 1

As shown in Fig. 1, a fast, high-precision human body segmentation method based on deep learning determines the input image size from the statistical proportions of the human body and uses a customized symmetric neural network structure consisting of a convolutional network and a deconvolution network. The outputs of convolutional layers of matching size are fed into the deconvolution network, which greatly enriches the network's ability to learn details. Finally, the network is trained on a large number of images containing human bodies; once trained, it can be used to segment human body images. The method has the advantages of high precision and high speed.

The steps of the method are as follows:

The design process S1 of the neural network is: compute the average height-to-width ratio of the human body from collected human body data, and use the result to guide the choice of input image size. The network uses a symmetric funnel structure, in which the first half of the funnel uses convolutional layers and the second half uses deconvolution layers. In addition, the outputs of the convolutional layers are concatenated with the deconvolution-layer outputs of matching size, so that detail information is carried along.

The training process S2 of the neural network is: collect a sufficient number of images containing human bodies and label every pixel as either human body or non-human body. Train the customized neural network with the labeled data.

The usage process S3 of the neural network is: apply the trained human body segmentation network to a new image to obtain the probability that each pixel in the image belongs to a human body. By choosing a probability threshold and taking the pixels above it as human body pixels, the segmentation map of the human body is obtained.

Specifically, process S1 includes the following steps：

S101. First determine the size of the neural network's input image. A standing human body is taller than it is wide; statistics over the collected data give an average height-to-width ratio of 2.8:1. Accordingly, the network's input image size is set to 150 pixels high and 70 pixels wide.

S102. Design of the first half of the funnel network. The first half of the funnel uses convolutional layers. Pooling layers are not used to downsample the feature maps; convolutional layers with a stride of 2 are used directly instead. To reduce network depth and increase speed, every convolutional layer has a stride of 2, so the feature-map resolution shrinks rapidly until it reaches 1x1.

S103. Design of the second half of the funnel network. The second half uses deconvolution layers with a stride of 2 to gradually restore the feature-map resolution until it matches the input size. Likewise, all deconvolution layers have a stride of 2, which quickly raises the resolution of each layer's output and reduces the number of layers.
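The size schedule implied by S102 and S103 can be sketched as follows. This is only an illustration: the text fixes the stride at 2 but does not state kernel size or padding, so the 3x3 kernel, the padding of 1, and the helper names (`conv_out`, `deconv_out`, `encoder_sizes`, `decoder_plan`) are assumptions. The formulas are the usual output-length formulas for strided and transposed convolutions.

```python
# Assumption: 3x3 kernels with padding 1. The text only fixes the stride at 2.
KERNEL, STRIDE, PAD = 3, 2, 1

def conv_out(n):
    """Output length of a strided convolution along one axis."""
    return (n + 2 * PAD - KERNEL) // STRIDE + 1

def deconv_out(n, output_padding):
    """Output length of a strided transposed convolution along one axis."""
    return (n - 1) * STRIDE - 2 * PAD + KERNEL + output_padding

def encoder_sizes(height, width):
    """Feature-map sizes of the first half, from the input down to 1x1."""
    sizes = [(height, width)]
    while sizes[-1] != (1, 1):
        h, w = sizes[-1]
        sizes.append((conv_out(h), conv_out(w)))
    return sizes

def decoder_plan(sizes):
    """For each deconvolution layer: the target size and the per-axis output
    padding needed to mirror the encoder sizes exactly."""
    plan = []
    reversed_sizes = sizes[::-1]
    for (h_in, w_in), (h_out, w_out) in zip(reversed_sizes, reversed_sizes[1:]):
        pad_h = h_out - deconv_out(h_in, 0)
        pad_w = w_out - deconv_out(w_in, 0)
        assert pad_h in (0, 1) and pad_w in (0, 1)
        plan.append(((h_out, w_out), (pad_h, pad_w)))
    return plan

sizes = encoder_sizes(150, 70)   # 150 high, 70 wide, as in S101
plan = decoder_plan(sizes)
for size in sizes:
    print("encoder", size)
for target, padding in plan:
    print("decoder", target, "output_padding", padding)
```

Because odd sizes appear on the way down, the second half has to be told which of the two possible output sizes to produce at each layer; the sketch records that choice as an output padding of 0 or 1 per axis.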

S104. To improve the network's ability to learn details, each feature map from the first half of the funnel is concatenated onto the feature map of the same resolution in the second half, and the deconvolution operation acts on the concatenated feature map.
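As a rough illustration of the concatenation in S104, the sketch below joins two feature maps of equal resolution along the channel axis. Feature maps are modeled as plain nested lists in channel, row, column order; the channel counts 16 and 32 and the helper names are hypothetical, since the text does not give layer widths.

```python
def zeros(channels, height, width):
    """Build a feature map of zeros laid out as [channel][row][column]."""
    return [[[0.0] * width for _ in range(height)] for _ in range(channels)]

def shape(fmap):
    """Return (channels, height, width) of a nested-list feature map."""
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))

def concat_channels(encoder_map, decoder_map):
    """Concatenate two feature maps of the same resolution along channels."""
    if shape(encoder_map)[1:] != shape(decoder_map)[1:]:
        raise ValueError("feature maps must share the same resolution")
    return decoder_map + encoder_map

# Hypothetical channel counts: 16 from the first half, 32 from the second half.
encoder_map = zeros(16, 19, 9)
decoder_map = zeros(32, 19, 9)
merged = concat_channels(encoder_map, decoder_map)
print(shape(merged))  # (48, 19, 9)
```

The next deconvolution layer would then take the merged map as input, so its input channel count is the sum of the two.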

S105. The final output of the network is a feature map with a single channel, so that it matches the labeled segmentation map.

Specifically, process S2 includes the following steps：

S201. Collect images containing human bodies. Wherever possible, the human body should occupy more than half of the image area, so that it is reasonably clear and sufficiently large.

S202. Manually label each pixel as one of two classes, "human body" or "non-human body". Accessories closely attached to the body, such as hats and backpacks, should be labeled as part of the human body as well.

S203. Prepare the data for training the network. Find the tight bounding rectangle of the human body from the segmentation map, and expand each of the rectangle's four sides outward by a random margin; this margin can be chosen as, for example, 10% of the rectangle's height or width. Make sure the resulting rectangle does not exceed the bounds of the original image. Extract the region corresponding to the new rectangle from the original image, and perform the same operation on its segmentation map. Then, keeping the height-to-width ratio of the extracted image unchanged, scale it to the network's input size of 150 pixels high by 70 pixels wide. Fill the remaining area with pure black.
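The geometry of S203 can be sketched as below, under several stated assumptions: each side is expanded by an independent random margin, the margin is drawn uniformly up to a fraction `max_frac` of the box size, and the black padding is split evenly on both sides. None of these choices is specified in the text, and the function names are illustrative only.

```python
import random

NET_H, NET_W = 150, 70  # network input size from the text

def tight_box(mask):
    """Tight bounding box (left, top, right, bottom) of the nonzero pixels
    in a 2D mask; right and bottom are exclusive."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)

def expand_box(box, image_w, image_h, max_frac=0.10, rng=random):
    """Push each side outward by a random margin, clamped to the image.
    Assumption: independent margins per side, uniform in [0, max_frac]."""
    left, top, right, bottom = box
    box_w, box_h = right - left, bottom - top
    left = max(0, left - int(rng.uniform(0, max_frac) * box_w))
    right = min(image_w, right + int(rng.uniform(0, max_frac) * box_w))
    top = max(0, top - int(rng.uniform(0, max_frac) * box_h))
    bottom = min(image_h, bottom + int(rng.uniform(0, max_frac) * box_h))
    return (left, top, right, bottom)

def letterbox_geometry(region_w, region_h):
    """Scale factor, scaled size, and top-left offset for fitting a region
    into the network input without changing its aspect ratio. Whatever the
    scaled region does not cover is left black."""
    scale = min(NET_W / region_w, NET_H / region_h)
    new_w = max(1, round(region_w * scale))
    new_h = max(1, round(region_h * scale))
    pad_left = (NET_W - new_w) // 2   # assumption: padding is centered
    pad_top = (NET_H - new_h) // 2
    return scale, (new_w, new_h), (pad_left, pad_top)

# Toy 8x6 segmentation map with a 4x3 "human body" block.
mask = [[0] * 6 for _ in range(8)]
for y in range(2, 6):
    for x in range(1, 4):
        mask[y][x] = 1
print(tight_box(mask))

box = expand_box((100, 50, 200, 350), image_w=640, image_h=480,
                 rng=random.Random(0))
print(box, letterbox_geometry(box[2] - box[0], box[3] - box[1]))
```

The same box and the same letterbox geometry would be applied to both the original image and its segmentation map so that the two stay aligned.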

S204. Divide the training data into two parts: a training set and a validation set. The validation set does not take part in the actual training and is used to confirm whether training is complete.

S205. Train the designed network with the stochastic gradient descent algorithm. During training, periodically test the network's performance on the validation set; once the requirements are met, training can stop.
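The schedule in S204 and S205 (hold out a validation set, run stochastic gradient descent, check periodically, stop when the requirement is met) can be illustrated with a deliberately tiny stand-in. The model below is a one-feature logistic classifier on synthetic data, not the funnel network, and the 20% validation fraction, learning rate, check interval, and 0.9 accuracy target are all assumed values chosen for the sketch.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def split(samples, val_fraction=0.2, seed=0):
    """Shuffle and split into (training set, validation set)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]

def accuracy(w, b, samples):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum((sigmoid(w * x + b) > 0.5) == (y == 1) for x, y in samples)
    return hits / len(samples)

def train(train_set, val_set, lr=0.5, check_every=50,
          target_acc=0.9, max_steps=5000, seed=0):
    """SGD on single samples; every `check_every` steps the validation
    accuracy is measured and training stops once it reaches the target."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for step in range(1, max_steps + 1):
        x, y = rng.choice(train_set)       # one random training sample
        grad = sigmoid(w * x + b) - y      # gradient of the log loss w.r.t. z
        w -= lr * grad * x
        b -= lr * grad
        if step % check_every == 0 and accuracy(w, b, val_set) >= target_acc:
            return w, b, step              # requirement met: stop training
    return w, b, max_steps

# Synthetic stand-in data: feature x in [0, 1], label 1 when x > 0.5.
samples = [(i / 199, 1 if i / 199 > 0.5 else 0) for i in range(200)]
train_set, val_set = split(samples)
w, b, steps = train(train_set, val_set)
print(steps, accuracy(w, b, val_set))
```

The validation samples never enter the gradient updates; they are only read at the periodic checks, which is the role the text assigns to the validation set.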

Specifically, process S3 includes the following steps：

S301. Starting from the detected human body region (which can usually be obtained by human body detection), crop a new image that keeps the height-to-width ratio of the network input. Scale the new image to the network's input size and run the network to obtain the output.
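One way to read the cropping step in S301 is sketched below: the detected box is grown around its center along whichever axis is too short, until it has the 150:70 height-to-width ratio of the network input. Growing rather than shrinking, and keeping the center fixed, are assumptions made for the sketch; `fit_box_to_aspect` is a hypothetical helper.

```python
def fit_box_to_aspect(box, target_h=150, target_w=70):
    """Grow a (left, top, right, bottom) box around its center until its
    height-to-width ratio equals target_h:target_w."""
    left, top, right, bottom = box
    w, h = right - left, bottom - top
    if h * target_w < w * target_h:
        # Box is too short for its width: grow the height.
        new_w, new_h = float(w), w * target_h / target_w
    else:
        # Box is too narrow for its height: grow the width.
        new_w, new_h = h * target_w / target_h, float(h)
    cx, cy = (left + right) / 2, (top + bottom) / 2
    return (cx - new_w / 2, cy - new_h / 2, cx + new_w / 2, cy + new_h / 2)

print(fit_box_to_aspect((0, 0, 70, 100)))     # (0.0, -25.0, 70.0, 125.0)
print(fit_box_to_aspect((10, 10, 110, 310)))  # (-10.0, 10.0, 130.0, 310.0)
```

As both examples show, the adjusted box can extend past the image border; a caller would need to shift it back inside or fill the missing area, for instance with black as in the training data.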

S302. The network outputs the probability that each pixel belongs to a human body. Using a suitable probability threshold, the pixels that exceed the threshold are defined as the human body region, which yields the segmentation map of the human body.
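The thresholding in S302 amounts to a per-pixel comparison, sketched here on a small hand-written probability map. The default threshold of 0.5 is an assumption; the text leaves the value to the application.

```python
def threshold_mask(prob_map, threshold=0.5):
    """Turn per-pixel human body probabilities into a binary segmentation
    map: 1 where the probability exceeds the threshold, 0 elsewhere."""
    return [[1 if p > threshold else 0 for p in row] for row in prob_map]

prob_map = [
    [0.05, 0.60, 0.10],
    [0.20, 0.95, 0.55],
    [0.01, 0.80, 0.30],
]
print(threshold_mask(prob_map))
# [[0, 1, 0], [0, 1, 1], [0, 1, 0]]
print(threshold_mask(prob_map, threshold=0.7))
# [[0, 0, 0], [0, 1, 0], [0, 1, 0]]
```

Raising the threshold keeps only the pixels the network is most confident about, which trades missed body pixels for fewer false positives.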

The above embodiment is only intended to illustrate the present invention and does not limit its scope. Any simple modification, equivalent change, or variation made to the embodiment in accordance with the technical essence of the present invention still falls within the scope of the technical solution of the present invention.

Claims

1. A fast, high-precision human body segmentation method based on deep learning, characterized by comprising the following steps:

2. The fast, high-precision human body segmentation method based on deep learning according to claim 1, characterized in that the input image is a rectangle 150 pixels high and 70 pixels wide.

3. The fast, high-precision human body segmentation method based on deep learning according to claim 1, characterized in that in step 2) the original images and segmentation maps of the images containing human bodies are preprocessed as follows:

Keeping the height-to-width ratio of the extracted region unchanged, scale it to the network's input size, and fill the remaining area with pure black.

4. The fast, high-precision human body segmentation method based on deep learning according to claim 1, characterized in that the image data used for training is divided into two parts, a training set and a validation set; the training set is used to train the neural network, and the validation set is used to confirm whether training is complete.