CN106778705B - Pedestrian individual segmentation method and device

Pedestrian individual segmentation method and device

Info

Publication number
CN106778705B
Authority
CN
China
Prior art keywords: segmentation, grained, coarse, shaped contour, image
Prior art date
Legal status
Active
Application number
CN201710065013.0A
Other languages
Chinese (zh)
Other versions
CN106778705A (en)
Inventor
王亮
黄永祯
宋纯锋
Current Assignee
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN201710065013.0A
Publication of CN106778705A
Application granted
Publication of CN106778705B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 20/53 Recognition of crowd images, e.g. recognition of crowd congestion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/19 Recognition using electronic means
    • G06V 30/192 Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V 30/194 References adjustable by an adaptive method, e.g. learning

Abstract

The invention discloses a pedestrian individual segmentation method and device. The method comprises the following steps: performing pedestrian segmentation on an image to be processed with a pre-trained coarse-grained human-shaped contour segmentation model to obtain a block segmentation result, the block segmentation result comprising a plurality of blocks labeled as background or foreground, where a block labeled as background contains no part of the pedestrian body and a block labeled as foreground contains a partial image of the pedestrian body; removing the background image from the portion of the image to be processed corresponding to the block segmentation result to obtain a coarse-grained segmentation image; inputting the coarse-grained segmentation image into a pre-trained fine-grained human-shaped contour segmentation model; and outputting an individual pedestrian segmentation result from the pre-trained fine-grained human-shaped contour segmentation model. Both the coarse-grained and fine-grained human-shaped contour segmentation models are obtained by training fully convolutional neural networks.

Description

Pedestrian individual segmentation method and device
Technical Field
The invention relates to the technical field of computer vision and pattern recognition, and in particular to a pedestrian individual segmentation method and device based on combining coarse and fine granularity with fully convolutional neural networks.
Background
Pedestrian individual segmentation is one of the most important problems in scene understanding, biometric recognition and related fields. Most traditional pedestrian segmentation methods require that the background contain no other pedestrians, and obtain the segmentation result by distinguishing the human body from its surroundings. In real scenes, however, pedestrians frequently occlude one another, and traditional pedestrian segmentation methods cannot produce satisfactory results in such cases. Combining individual detection with pedestrian segmentation can partially solve this problem, but individual detection is time-consuming, and in many cases, even when the individual's position is detected accurately, a clean individual segmentation result still cannot be obtained because the detected region contains body information from several people. A method combining coarse and fine granularity can better solve this problem.
Disclosure of Invention
To address the problems encountered in individual pedestrian segmentation in the prior art, the invention combines a coarse-grained segmentation model with a fine-grained segmentation model: the coarse-grained model is used to mask out other pedestrians appearing in the background, and the fine-grained model then performs fine segmentation on that basis to obtain the individual segmentation result. First, a coarse-grained human-shaped contour segmentation model, a multi-layer fully convolutional neural network, is trained on a large number of labeled human-shape images. Next, this coarse-grained model produces human-shape segmentation results for all images, and the background regions are subtracted from the images according to these results (that is, the corresponding pixels are set to 0); the resulting images serve as input to the fine-grained segmentation model. Finally, the fine-grained human-shape segmentation model is trained using the background-masked images as input and the fine human-shape labels as supervision information.
In order to achieve the above object, a first aspect of the present invention provides a pedestrian individual segmentation method, including:
carrying out pedestrian segmentation on the image to be processed by utilizing a pre-trained coarse-grained human-shaped contour segmentation model to obtain a block segmentation result; the block segmentation result comprises a plurality of blocks labeled as background or foreground, wherein the blocks labeled as background contain no part of the pedestrian body and the blocks labeled as foreground contain a partial image of the pedestrian body;
removing the background image from the portion of the image to be processed corresponding to the block segmentation result to obtain a coarse-grained segmentation image;
inputting the coarse-grained segmentation image into a pre-trained fine-grained human-shaped contour segmentation model, which outputs an individual pedestrian segmentation result;
the coarse-grained human-shaped contour segmentation model and the fine-grained human-shaped contour segmentation model are both obtained by training fully convolutional neural networks.
The first fully convolutional neural network, corresponding to the coarse-grained human-shaped contour segmentation model, comprises a plurality of convolutional layers and one deconvolution layer; the second fully convolutional neural network, corresponding to the fine-grained human-shaped contour segmentation model, comprises a plurality of convolutional layers and a plurality of deconvolution layers, which form a centrally symmetric structure that combines into a funnel shape.
The method further comprises a step of training the coarse-grained human-shaped contour segmentation model, comprising:
performing block processing on the pedestrian-labeled training samples in the training data set to obtain block-processing results for the training samples;
normalizing the training samples to a uniform size, and then feeding the normalized training samples into the first fully convolutional neural network corresponding to the coarse-grained human-shaped contour segmentation model;
comparing the block segmentation result output by the coarse-grained human-shaped contour segmentation model with the block-processing result of the corresponding training sample to obtain a prediction error;
and reducing the prediction error with the back-propagation algorithm and stochastic gradient descent to train the first fully convolutional neural network corresponding to the coarse-grained human-shaped contour segmentation model, the final coarse-grained human-shaped contour segmentation model being obtained through multiple training iterations.
The method further comprises a step of training the fine-grained human-shaped contour segmentation model, comprising:
inputting the pedestrian-labeled training samples in the training data set into the trained coarse-grained human-shaped contour segmentation model to obtain block segmentation results;
subtracting the background image from the portion of each training sample corresponding to its block segmentation result to obtain a coarse-grained segmentation image;
normalizing the coarse-grained segmentation image to a uniform size;
feeding the normalized coarse-grained segmentation image into the second fully convolutional neural network corresponding to the fine-grained human-shaped contour segmentation model;
comparing the fine segmentation result output by the second fully convolutional neural network with the fine segmentation label of the corresponding training sample to obtain a second prediction error;
and reducing the second prediction error with the back-propagation algorithm and stochastic gradient descent to train the second fully convolutional neural network corresponding to the fine-grained human-shaped contour segmentation model, the final fine-grained human-shaped contour segmentation model being obtained through multiple training iterations.
The supervision information of the coarse-grained human-shaped contour segmentation model is the segmentation label produced by block processing, and it is used to mask out the background in the image.
A second aspect of the present invention provides a pedestrian individual segmentation apparatus including:
the block segmentation module is configured to perform pedestrian segmentation on the image to be processed by utilizing a pre-trained coarse-grained human-shaped contour segmentation model to obtain a block segmentation result; the block segmentation result comprises a plurality of blocks labeled as background or foreground, wherein the blocks labeled as background contain no part of the pedestrian body and the blocks labeled as foreground contain a partial image of the pedestrian body;
the background removing module is configured to remove the background image from the portion of the image to be processed corresponding to the block segmentation result to obtain a coarse-grained segmentation image;
a fine segmentation module configured to input the coarse-grained segmentation image to a pre-trained fine-grained human-shaped contour segmentation model;
the coarse-grained human-shaped contour segmentation model and the fine-grained human-shaped contour segmentation model are both obtained by training fully convolutional neural networks.
The first fully convolutional neural network, corresponding to the coarse-grained human-shaped contour segmentation model, comprises a plurality of convolutional layers and one deconvolution layer; the second fully convolutional neural network, corresponding to the fine-grained human-shaped contour segmentation model, comprises a plurality of convolutional layers and a plurality of deconvolution layers, which form a centrally symmetric structure that combines into a funnel shape.
The device further comprises a training module of the coarse-grained human-shaped contour segmentation model, comprising:
the marking sub-module is configured to perform block processing on the pedestrian-labeled training samples in the training data set to obtain block-processing results for the training samples;
a first normalization sub-module configured to normalize the training samples to a uniform size;
the first training sub-module is configured to feed the normalized training samples into the first fully convolutional neural network corresponding to the coarse-grained human-shaped contour segmentation model;
the first comparison sub-module is configured to compare the block segmentation result output by the coarse-grained human-shaped contour segmentation model with the block-processing result of the corresponding training sample to obtain a prediction error;
and the first iteration sub-module is configured to reduce the prediction error with the back-propagation algorithm and stochastic gradient descent so as to train the first fully convolutional neural network corresponding to the coarse-grained human-shaped contour segmentation model, the final coarse-grained human-shaped contour segmentation model being obtained through multiple training iterations.
The device further comprises a training module of the fine-grained human-shaped contour segmentation model, comprising:
the blocking sub-module is configured to input the pedestrian-labeled training samples in the training data set into the trained coarse-grained human-shaped contour segmentation model to obtain block segmentation results;
the background removal sub-module is configured to subtract the background image from the portion of each training sample corresponding to its block segmentation result to obtain a coarse-grained segmentation image;
a second normalization sub-module configured to normalize the coarse-grained segmentation image to a uniform size;
the second training sub-module is configured to feed the normalized coarse-grained segmentation image into the second fully convolutional neural network corresponding to the fine-grained human-shaped contour segmentation model;
a second comparison sub-module configured to compare the fine segmentation result output by the second fully convolutional neural network with the fine segmentation label of the corresponding training sample to obtain a second prediction error;
and the second iteration sub-module is configured to reduce the second prediction error with the back-propagation algorithm and stochastic gradient descent so as to train the second fully convolutional neural network corresponding to the fine-grained human-shaped contour segmentation model, the final fine-grained human-shaped contour segmentation model being obtained through multiple training iterations.
The supervision information of the coarse-grained human-shaped contour segmentation model is the segmentation label produced by block processing, and it is used to mask out the background in the image.
The pedestrian individual segmentation method provided by the invention, based on combining coarse and fine granularity in fully convolutional neural networks, trains the coarse-grained and fine-grained segmentation models separately with deep learning and uses the result of the coarse-grained model to mask part of the background, which improves the accuracy of human-shape segmentation and is particularly suitable for cases where other pedestrians appear in the background. In this technical scheme the results of the coarse-grained and fine-grained segmentation models are combined: the coarse-grained result is used to remove background blocks from the image and serves as input to the fine-grained segmentation model, which greatly reduces the difficulty of fine-grained segmentation and improves the segmentation effect. Both segmentation models are fully convolutional neural networks containing only convolutional and deconvolutional layers; their structures are simple and their parameters few, so they run fast. The coarse-grained network contains only one deconvolution layer and predicts the block segmentation result, while the fine-grained network has a front-to-back symmetric funnel-shaped structure and predicts the fine segmentation result. The supervision information of the proposed coarse-grained segmentation model is the block-processed segmentation label; through training, the background in the image, and in particular other pedestrians in the background, can be effectively masked, so that finally only the region containing a single pedestrian is retained.
Drawings
FIG. 1 is a schematic diagram of training data and labeling methods according to the present invention;
FIG. 2 is a flow chart illustrating a pedestrian individual segmentation method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a coarse-grained segmentation model according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a fine-grained segmentation model in an embodiment of the present invention.
Detailed Description
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and embodiments.
An embodiment of the invention provides a pedestrian individual segmentation method. The method comprises the following steps:
carrying out pedestrian segmentation on the image to be processed by utilizing a pre-trained coarse-grained human-shaped contour segmentation model to obtain a block segmentation result; the block segmentation result comprises a plurality of blocks labeled as background or foreground, wherein the blocks labeled as background contain no part of the pedestrian body and the blocks labeled as foreground contain a partial image of the pedestrian body;
removing a background image in a part corresponding to the blocked pedestrian segmentation result in the image to be processed to obtain a coarse-grained segmentation image;
inputting the coarse-grained segmentation image into a pre-trained fine-grained human-shaped contour segmentation model;
and outputting an individual pedestrian segmentation result by the pre-trained fine-grained human-shaped contour segmentation model.
In an embodiment, the block segmentation result divides the image to be processed into a plurality of blocks of equal size, each of which is labeled as a background block or a foreground block; the image corresponding to a background block contains no part of the pedestrian subject, while the image corresponding to a foreground block contains a partial image of the pedestrian subject. As shown in FIG. 1, d is the block segmentation result and e is the block segmentation image, that is, the block segmentation image corresponding to the block segmentation result.
In an embodiment, the coarse-grained human-shaped contour segmentation model and the fine-grained human-shaped contour segmentation model are both fully convolutional neural networks, that is, both models comprise only convolutional and deconvolutional layers; their structures are simple and their parameters few, so they run fast.
In an embodiment, the coarse-grained human-shaped contour segmentation model comprises a plurality of convolutional layers and one deconvolution layer and is used to predict the block segmentation result, while the fine-grained human-shaped contour segmentation model comprises a plurality of convolutional layers and a plurality of deconvolution layers; its convolutional part and deconvolutional part are centrally symmetric and combine into a funnel shape. That is, the innermost layer of the fine-grained human-shaped contour segmentation model is a convolutional layer, the first half of the network consists of convolutional layers, the second half consists of deconvolution layers, and the two halves are centrally symmetric.
In an embodiment, the supervision information of the coarse-grained human-shaped contour segmentation model is the segmentation label produced by block processing; through training, the background in the image, and especially other pedestrians in the background, can be effectively masked, so that finally only the region containing a single pedestrian is retained.
On the basis of the coarse-grained segmentation, the fine-grained model can obtain a very fine individual pedestrian segmentation result through its symmetric funnel-shaped fully convolutional network. The method is highly robust to various background changes in the image and can better solve the human-shape segmentation problem when multiple pedestrians occlude one another.
In the following, a large human-shape segmentation database is taken as an example; it contains 5,000 pedestrian images and the corresponding human-shape segmentation labels.
FIG. 2 is a flowchart of the individual pedestrian segmentation method of the present invention; as shown in the figure, the method specifically includes the following steps:
Step S0: perform block processing on the 5,000 pedestrian-labeled samples in the data set. As shown in FIG. 1, each pedestrian segmentation label image is first divided uniformly into 10 × 5 blocks, and each block is then labeled as foreground or background according to whether it contains any pedestrian segmentation label, yielding a 10 × 5 block segmentation label; in this way 5,000 pairs of pedestrian images and block segmentation labels are obtained;
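As an illustration of this block-labeling step, the following is a minimal sketch, not the patent's own code; the function name, the use of NumPy and the handling of grid boundaries are assumptions. It marks a grid cell as foreground whenever it contains at least one labeled pedestrian pixel.

```python
import numpy as np

def block_labels(seg_mask: np.ndarray, rows: int = 10, cols: int = 5) -> np.ndarray:
    """Divide a binary pedestrian segmentation mask into rows x cols blocks and
    mark each block as foreground (1) if it contains any pedestrian pixel."""
    h, w = seg_mask.shape
    labels = np.zeros((rows, cols), dtype=np.uint8)
    # Block boundaries; the mask height/width need not be exact multiples of the grid.
    row_edges = np.linspace(0, h, rows + 1, dtype=int)
    col_edges = np.linspace(0, w, cols + 1, dtype=int)
    for i in range(rows):
        for j in range(cols):
            block = seg_mask[row_edges[i]:row_edges[i + 1], col_edges[j]:col_edges[j + 1]]
            labels[i, j] = 1 if block.any() else 0
    return labels

# Example: a 50 x 25 segmentation label yields a 10 x 5 block label map.
mask = np.zeros((50, 25), dtype=np.uint8)
mask[10:40, 8:18] = 1                 # a pedestrian-shaped blob
print(block_labels(mask))             # 10 x 5 array of 0/1 block marks
```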
Step S1: normalize the pedestrian images used for training to a uniform size (50 × 25 pixels) and feed them into a fully convolutional neural network (the coarse-grained segmentation network), which contains several convolutional layers and a deconvolution layer; its specific structure, shown in FIG. 3, comprises 4 convolutional layers and 1 deconvolution layer in total. The first convolutional layer contains 48 filters (of size 3 × 3) with a stride of 2; similarly, the second, third and fourth convolutional layers contain 96, 96 and 128 filters respectively (all of size 3 × 3), each with a stride of 2. The fifth layer is a deconvolution layer containing 1 filter (of size 10 × 5) with a stride of 1, and its output is the coarse-grained segmentation result.
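The coarse-grained network described above might be sketched as follows. The patent does not name a framework, so PyTorch is assumed here; the input is assumed to be a 3-channel image, and the padding and the transposed-convolution kernel are assumptions chosen so that a 50 × 25 input yields exactly the 10 × 5 block map (the patent itself specifies a single 10 × 5 deconvolution filter with a stride of 1, without giving padding values).

```python
import torch
import torch.nn as nn

class CoarseSegNet(nn.Module):
    """Coarse-grained segmentation network sketch: 4 convolutional layers
    (48/96/96/128 filters, 3x3, stride 2) followed by one deconvolution layer
    producing the 10 x 5 block segmentation map."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 48, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(48, 96, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(96, 96, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(96, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # A 50 x 25 input gives a 4 x 2 feature map; this transposed convolution
        # (an assumed kernel/stride choice) expands it to the 10 x 5 block map.
        self.deconv = nn.ConvTranspose2d(128, 1, kernel_size=(4, 3), stride=2)

    def forward(self, x):                                # x: (N, 3, 50, 25)
        return self.deconv(self.features(x))            # logits of shape (N, 1, 10, 5)

coarse = CoarseSegNet()
print(coarse(torch.randn(1, 3, 50, 25)).shape)           # torch.Size([1, 1, 10, 5])
```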
Step S2: the last layer of the coarse-grained segmentation network outputs an image representation, i.e. the block segmentation result (of size 10 × 5);
Step S3: compare the output block segmentation result with the corresponding block segmentation label (shown as d in FIG. 1) to obtain a prediction error; the errors at each of the 10 × 5 positions are summed to obtain the final prediction error;
Step S4: reduce the prediction error with the back-propagation algorithm and stochastic gradient descent to train the coarse-grained segmentation network. A good coarse-grained segmentation model is obtained through multiple training iterations, about 10,000 being required; the error loss can be further reduced by adjusting the weight learning rate until the error no longer decreases, at which point training of the coarse-grained segmentation model is finished;
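Steps S3 and S4 could be realised with a training loop such as the hedged sketch below. The loss function (binary cross-entropy summed over the 10 × 5 positions), the momentum and learning-rate values, and the data-loader interface are all assumptions; the patent only states that the per-position errors are summed and that back-propagation with stochastic gradient descent is used.

```python
import torch
import torch.nn as nn

def train_coarse(model, loader, iters=10_000, lr=0.01, device="cpu"):
    """Minimal training sketch for the coarse-grained network."""
    model.to(device).train()
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.BCEWithLogitsLoss(reduction="sum")       # sum over the 10 x 5 positions
    it = 0
    while it < iters:
        for images, targets in loader:                     # images: (N,3,50,25), targets: (N,1,10,5)
            images, targets = images.to(device), targets.float().to(device)
            opt.zero_grad()
            loss = loss_fn(model(images), targets)         # prediction error
            loss.backward()                                # back-propagation
            opt.step()                                     # stochastic gradient descent update
            it += 1
            if it >= iters:
                break
    return model
```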
Step S5: input the normalized pedestrian image (of size 50 × 25) into the trained coarse-grained model to obtain the block segmentation result (of size 10 × 5), and subtract the background region from the non-normalized original pedestrian image according to this segmentation result (i.e., set the corresponding pixels to 0) to obtain a background-free pedestrian image, as shown in FIG. 1;
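The background subtraction of step S5 can be sketched as follows; this is an illustrative helper, not the patent's code. The nearest-neighbour expansion of the 10 × 5 block map to the original image size is an assumption, since the patent only states that the corresponding pixels are set to 0.

```python
import numpy as np

def remove_background(image: np.ndarray, block_result: np.ndarray) -> np.ndarray:
    """Zero out the pixels of `image` (H x W x C) that fall in blocks predicted as
    background in the 10 x 5 `block_result` (1 = foreground, 0 = background)."""
    h, w = image.shape[:2]
    rows, cols = block_result.shape
    row_idx = np.arange(h) * rows // h            # map each pixel row to its block row
    col_idx = np.arange(w) * cols // w            # map each pixel column to its block column
    mask = block_result[row_idx[:, None], col_idx[None, :]].astype(bool)
    out = image.copy()
    out[~mask] = 0                                # background pixels set to 0
    return out
```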
Step S6: normalize the background-removed pedestrian image obtained in step S5 to a uniform size (e.g., 150 × 75 pixels) and feed it into a fully convolutional neural network (the fine-grained segmentation network). This network contains several convolutional and deconvolution layers; its convolutional and deconvolutional parts are centrally symmetric and combine into a funnel shape. As shown in FIG. 4, it comprises 4 convolutional layers and 3 deconvolution layers. The first convolutional layer contains 48 filters (of size 3 × 3) with a stride of 2; similarly, the second, third and fourth convolutional layers contain 64, 96 and 128 filters respectively (all of size 3 × 3), each with a stride of 2. The fifth layer is a deconvolution layer, symmetric to the third convolutional layer, containing 96 filters (of size 3 × 3) with a stride of 2; the sixth layer is a deconvolution layer, symmetric to the second convolutional layer, containing 64 filters (of size 3 × 3) with a stride of 2; the seventh layer is a deconvolution layer, symmetric to the first convolutional layer, containing 1 filter (of size 3 × 3) with a stride of 2, and it outputs the fine segmentation result of size 150 × 75;
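A hedged sketch of the funnel-shaped fine-grained network follows, again assuming PyTorch and a 3-channel input. The patent does not give padding values, and four stride-2 convolutions followed by only three stride-2 deconvolutions do not by themselves return to the 150 × 75 input resolution, so a final bilinear resize to 150 × 75 is added here as an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FineSegNet(nn.Module):
    """Fine-grained funnel network sketch: 4 convolutional layers (48/64/96/128
    filters, 3x3, stride 2) and 3 deconvolution layers symmetric to convolutions
    3, 2 and 1 (96/64/1 filters, 3x3, stride 2)."""
    def __init__(self):
        super().__init__()
        self.down = nn.Sequential(
            nn.Conv2d(3, 48, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(48, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 96, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(96, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.up = nn.Sequential(
            nn.ConvTranspose2d(128, 96, 3, stride=2, padding=1, output_padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(96, 64, 3, stride=2, padding=1, output_padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 1, 3, stride=2, padding=1, output_padding=1),
        )

    def forward(self, x):                                  # x: (N, 3, 150, 75)
        y = self.up(self.down(x))
        # Final resize to the 150 x 75 target (an assumption; padding is unspecified in the patent).
        return F.interpolate(y, size=(150, 75), mode="bilinear", align_corners=False)

fine = FineSegNet()
print(fine(torch.randn(1, 3, 150, 75)).shape)               # torch.Size([1, 1, 150, 75])
```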
Step S7: the last layer of the fine-grained segmentation network outputs an image representation, i.e. the fine segmentation result (of size 150 × 75);
Step S8: compare the output fine segmentation result with the corresponding normalized segmentation label (of size 150 × 75, shown as b in FIG. 1) to obtain a prediction error, which is the sum of the errors over the 150 × 75 pixels; the normalized segmentation label is obtained by normalizing the exact segmentation label of the original sample to the size of the fine segmentation result (150 × 75);
Step S9: reduce the prediction error with the back-propagation algorithm and stochastic gradient descent to train the fine-grained segmentation network, and obtain the final fine-grained segmentation model through multiple training iterations. Because the network is relatively large, about 100,000 iterations are usually needed; the error loss can be further reduced by adjusting the weight learning rate until the error no longer decreases, at which point training of the fine-grained segmentation model is finished;
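Training of the fine-grained network mirrors the coarse-grained loop; the short sketch below sums the per-pixel error over the 150 × 75 output. As before, the binary cross-entropy loss and the optimiser settings are assumptions rather than values given in the patent.

```python
import torch
import torch.nn as nn

def train_fine(model, loader, iters=100_000, lr=0.01, device="cpu"):
    """Minimal training sketch for the fine-grained network (steps S8 and S9)."""
    model.to(device).train()
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.BCEWithLogitsLoss(reduction="sum")        # sum of the 150 x 75 pixel errors
    it = 0
    while it < iters:
        for masked_images, fine_labels in loader:           # (N,3,150,75), (N,1,150,75)
            masked_images = masked_images.to(device)
            fine_labels = fine_labels.float().to(device)
            opt.zero_grad()
            loss = loss_fn(model(masked_images), fine_labels)
            loss.backward()
            opt.step()
            it += 1
            if it >= iters:
                break
    return model
```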
Step S10: test with the trained coarse-grained and fine-grained segmentation models. First normalize the test image containing a pedestrian to 50 × 25 pixels and feed it into the coarse-grained segmentation model to obtain a coarse-grained segmentation result (i.e., the block segmentation result, of size 10 × 5);
Step S11: use the block segmentation result obtained in S10 to subtract the background region from the original pedestrian image (i.e., set the corresponding pixels to 0) to obtain a background-free pedestrian image, then normalize the pedestrian image to a uniform size (150 × 75 pixels), and finally feed the image into the fine-grained segmentation network;
Step S12: a refined individual pedestrian segmentation result can now be obtained at the output of the fine-grained segmentation network.
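Chaining the two trained models for testing (steps S10 to S12) might look like the sketch below. It reuses the CoarseSegNet, FineSegNet and remove_background helpers sketched earlier; the 0.5 threshold, the resizing calls and the conversion between NumPy arrays and tensors are assumptions made for illustration.

```python
import numpy as np
import torch
import torch.nn.functional as F

def segment_pedestrian(image: np.ndarray, coarse, fine) -> np.ndarray:
    """End-to-end test sketch: coarse block segmentation, background removal,
    then fine segmentation. `image` is an H x W x 3 uint8 array; `coarse` and
    `fine` are trained models as in the sketches above."""
    def to_tensor(a, size):
        t = torch.from_numpy(a).float().permute(2, 0, 1)[None] / 255.0
        return F.interpolate(t, size=size, mode="bilinear", align_corners=False)

    with torch.no_grad():
        # S10: normalise to 50 x 25 and predict the 10 x 5 block segmentation result.
        blocks = torch.sigmoid(coarse(to_tensor(image, (50, 25))))[0, 0] > 0.5
        # S11: mask the background on the original image, then normalise to 150 x 75.
        masked = remove_background(image, blocks.numpy().astype(np.uint8))
        # S12: the fine-grained network outputs the refined segmentation.
        fine_out = torch.sigmoid(fine(to_tensor(masked, (150, 75))))[0, 0]
    return (fine_out.numpy() > 0.5).astype(np.uint8)        # 150 x 75 binary mask
```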
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention and are not intended to limit the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (8)

1. A pedestrian individual segmentation method comprising:
carrying out pedestrian segmentation on the image to be processed by utilizing a pre-trained coarse-grained human-shaped contour segmentation model to obtain a block segmentation result; the block segmentation result comprises a plurality of blocks labeled as background or foreground, the blocks labeled as background in the image to be processed containing no part of the pedestrian body and the blocks labeled as foreground containing a partial image of the pedestrian body;
removing the background image from the portion of the image to be processed corresponding to the block segmentation result to obtain a coarse-grained segmentation image;
inputting the coarse-grained segmentation image into a pre-trained fine-grained human-shaped contour segmentation model;
outputting an individual pedestrian segmentation result by the pre-trained fine-grained human-shaped contour segmentation model;
the coarse-grained human-shaped contour segmentation model and the fine-grained human-shaped contour segmentation model are obtained by training fully convolutional neural networks;
the first fully convolutional neural network corresponding to the coarse-grained human-shaped contour segmentation model comprises a plurality of convolutional layers and one deconvolution layer; the second fully convolutional neural network corresponding to the fine-grained human-shaped contour segmentation model comprises a plurality of convolutional layers and a plurality of deconvolution layers, which form a centrally symmetric structure that combines into a funnel shape.
2. The method of claim 1, further comprising the step of training the coarse-grained human-shaped contour segmentation model, comprising:
performing block processing on the pedestrian-labeled training samples in the training data set to obtain block-processing results for the training samples;
normalizing the training samples to a uniform size, and then feeding the normalized training samples into the first fully convolutional neural network corresponding to the coarse-grained human-shaped contour segmentation model;
comparing the block segmentation result output by the coarse-grained human-shaped contour segmentation model with the block-processing result of the corresponding training sample to obtain a prediction error;
and reducing the prediction error with the back-propagation algorithm and stochastic gradient descent to train the first fully convolutional neural network corresponding to the coarse-grained human-shaped contour segmentation model, the final coarse-grained human-shaped contour segmentation model being obtained through multiple training iterations.
3. The method of claim 1, further comprising the step of training the fine-grained human-shaped contour segmentation model, comprising:
inputting the pedestrian-labeled training samples in the training data set into the trained coarse-grained human-shaped contour segmentation model to obtain block segmentation results;
subtracting the background image from the portion of each training sample corresponding to its block segmentation result to obtain a coarse-grained segmentation image;
normalizing the coarse-grained segmentation image to a uniform size;
feeding the normalized coarse-grained segmentation image into the second fully convolutional neural network corresponding to the fine-grained human-shaped contour segmentation model;
comparing the fine segmentation result output by the second fully convolutional neural network with the fine segmentation label of the corresponding training sample to obtain a second prediction error;
and reducing the second prediction error with the back-propagation algorithm and stochastic gradient descent to train the second fully convolutional neural network corresponding to the fine-grained human-shaped contour segmentation model, the final fine-grained human-shaped contour segmentation model being obtained through multiple training iterations.
4. The method according to claim 1, wherein the supervision information of the coarse-grained human-shaped contour segmentation model is the segmentation label produced by block processing, which is used to mask out the background in the image.
5. A pedestrian individual segmentation apparatus comprising:
the block segmentation module is configured to perform pedestrian segmentation on the image to be processed by utilizing a pre-trained coarse-grained human-shaped contour segmentation model to obtain a block segmentation result; the block segmentation result comprises a plurality of blocks labeled as background or foreground, wherein the blocks labeled as background contain no part of the pedestrian body and the blocks labeled as foreground contain a partial image of the pedestrian body;
the background removing module is configured to remove the background image from the portion of the image to be processed corresponding to the block segmentation result to obtain a coarse-grained segmentation image;
a fine segmentation module configured to input the coarse-grained segmentation image to a pre-trained fine-grained human-shaped contour segmentation model;
the result output module is configured to output an individual pedestrian segmentation result from the pre-trained fine-grained human-shaped contour segmentation model;
the coarse-grained human-shaped contour segmentation model and the fine-grained human-shaped contour segmentation model are obtained by training fully convolutional neural networks;
the first fully convolutional neural network corresponding to the coarse-grained human-shaped contour segmentation model comprises a plurality of convolutional layers and one deconvolution layer; the second fully convolutional neural network corresponding to the fine-grained human-shaped contour segmentation model comprises a plurality of convolutional layers and a plurality of deconvolution layers, which form a centrally symmetric structure that combines into a funnel shape.
6. The apparatus of claim 5, further comprising a training module of the coarse-grained human-shaped contour segmentation model, comprising:
the marking sub-module is configured to perform block processing on the pedestrian-labeled training samples in the training data set to obtain block-processing results for the training samples;
a first normalization sub-module configured to normalize the training samples to a uniform size;
the first training sub-module is configured to feed the normalized training samples into the first fully convolutional neural network corresponding to the coarse-grained human-shaped contour segmentation model;
the first comparison sub-module is configured to compare the block segmentation result output by the coarse-grained human-shaped contour segmentation model with the block-processing result of the corresponding training sample to obtain a prediction error;
and the first iteration sub-module is configured to reduce the prediction error with the back-propagation algorithm and stochastic gradient descent so as to train the first fully convolutional neural network corresponding to the coarse-grained human-shaped contour segmentation model, the final coarse-grained human-shaped contour segmentation model being obtained through multiple training iterations.
7. The apparatus of claim 5, further comprising a training module of the fine-grained human-shaped contour segmentation model, comprising:
the blocking sub-module is configured to input the pedestrian-labeled training samples in the training data set into the trained coarse-grained human-shaped contour segmentation model to obtain block segmentation results;
the background removal sub-module is configured to subtract the background image from the portion of each training sample corresponding to its block segmentation result to obtain a coarse-grained segmentation image;
a second normalization sub-module configured to normalize the coarse-grained segmentation image to a uniform size;
the second training sub-module is configured to feed the normalized coarse-grained segmentation image into the second fully convolutional neural network corresponding to the fine-grained human-shaped contour segmentation model;
a second comparison sub-module configured to compare the fine segmentation result output by the second fully convolutional neural network with the fine segmentation label of the corresponding training sample to obtain a second prediction error;
and the second iteration sub-module is configured to reduce the second prediction error with the back-propagation algorithm and stochastic gradient descent so as to train the second fully convolutional neural network corresponding to the fine-grained human-shaped contour segmentation model, the final fine-grained human-shaped contour segmentation model being obtained through multiple training iterations.
8. The apparatus according to claim 5, wherein the supervision information of the coarse-grained human-shaped contour segmentation model is the segmentation label produced by block processing, which is used to mask out the background in the image.
CN201710065013.0A 2017-02-04 2017-02-04 Pedestrian individual segmentation method and device Active CN106778705B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710065013.0A CN106778705B (en) 2017-02-04 2017-02-04 Pedestrian individual segmentation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710065013.0A CN106778705B (en) 2017-02-04 2017-02-04 Pedestrian individual segmentation method and device

Publications (2)

Publication Number Publication Date
CN106778705A CN106778705A (en) 2017-05-31
CN106778705B true CN106778705B (en) 2020-03-17

Family

ID=58955591

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710065013.0A Active CN106778705B (en) 2017-02-04 2017-02-04 Pedestrian individual segmentation method and device

Country Status (1)

Country Link
CN (1) CN106778705B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107766890B (en) * 2017-10-31 2021-09-14 天津大学 Improved method for discriminant graph block learning in fine-grained identification
CN107944399A (en) * 2017-11-28 2018-04-20 广州大学 A kind of pedestrian's recognition methods again based on convolutional neural networks target's center model
CN109993187A (en) * 2017-12-29 2019-07-09 深圳市优必选科技有限公司 A kind of modeling method, robot and the storage device of object category for identification
CN108198192A (en) * 2018-01-15 2018-06-22 任俊芬 A kind of quick human body segmentation's method of high-precision based on deep learning
CN108510000B (en) * 2018-03-30 2021-06-15 北京工商大学 Method for detecting and identifying fine-grained attribute of pedestrian in complex scene
CN108711150B (en) * 2018-05-22 2022-03-25 电子科技大学 End-to-end pavement crack detection and identification method based on PCA
CN110689542A (en) * 2018-07-04 2020-01-14 清华大学 Portrait segmentation processing method and device based on multi-stage convolution neural network
CN108960190B (en) * 2018-07-23 2021-11-30 西安电子科技大学 SAR video target detection method based on FCN image sequence model
CN110855875A (en) * 2018-08-20 2020-02-28 珠海格力电器股份有限公司 Method and device for acquiring background information of image
CN109636806B (en) * 2018-11-22 2022-12-27 浙江大学山东工业技术研究院 Three-dimensional nuclear magnetic resonance pancreas image segmentation method based on multi-step learning
CN109635812B (en) * 2018-11-29 2019-11-08 中国科学院空间应用工程与技术中心 The example dividing method and device of image
CN110084156B (en) * 2019-04-12 2021-01-29 中南大学 Gait feature extraction method and pedestrian identity recognition method based on gait features
CN110516583A (en) * 2019-08-21 2019-11-29 中科视语(北京)科技有限公司 A kind of vehicle recognition methods, system, equipment and medium again
CN111368788B (en) * 2020-03-17 2023-10-27 北京迈格威科技有限公司 Training method and device for image recognition model and electronic equipment

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609686A (en) * 2012-01-19 2012-07-25 宁波大学 Pedestrian detection method
CN106355188A (en) * 2015-07-13 2017-01-25 阿里巴巴集团控股有限公司 Image detection method and device
CN105760835A (en) * 2016-02-17 2016-07-13 天津中科智能识别产业技术研究院有限公司 Gait segmentation and gait recognition integrated method based on deep learning
CN106022221A (en) * 2016-05-09 2016-10-12 腾讯科技(深圳)有限公司 Image processing method and processing system
CN106127164A (en) * 2016-06-29 2016-11-16 北京智芯原动科技有限公司 The pedestrian detection method with convolutional neural networks and device is detected based on significance

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Fully convolutional networks for semantic segmentation; E. Shelhamer et al.; IEEE Transactions on Pattern Analysis and Machine Intelligence; 2016-05-31; pp. 640-649 *
深度卷积神经网络在计算机视觉中的应用研究综述 (A review of applications of deep convolutional neural networks in computer vision); 卢宏涛, 张秦川; 数据采集与处理 (Journal of Data Acquisition and Processing); 2016-01-30; vol. 31, no. 1; pp. 1-17 *

Also Published As

Publication number Publication date
CN106778705A (en) 2017-05-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant