CN110084826B - Hair segmentation method based on TOF camera - Google Patents

Hair segmentation method based on TOF camera

Info

Publication number
CN110084826B
CN110084826B CN201811452085.1A CN201811452085A CN 110084826 B
Authority
CN
China
Prior art keywords
hair
mask
pixel
variance
tof camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811452085.1A
Other languages
Chinese (zh)
Other versions
CN110084826A (en
Inventor
马原曦
蒋琪雷
李思远
张迎梁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Plex VR Digital Technology Shanghai Co Ltd
Original Assignee
Plex VR Digital Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Plex VR Digital Technology Shanghai Co Ltd filed Critical Plex VR Digital Technology Shanghai Co Ltd
Priority to CN201811452085.1A priority Critical patent/CN110084826B/en
Publication of CN110084826A publication Critical patent/CN110084826A/en
Application granted granted Critical
Publication of CN110084826B publication Critical patent/CN110084826B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/187Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

A hair segmentation method based on a TOF camera, comprising: establishing a deep learning network based on a dataset of hair color images and their masks; obtaining a depth map with a TOF camera and deriving a variance map from it; optimizing the preliminary hair mask produced by deep learning to obtain an optimized hair mask; and re-optimizing the optimized hair mask with the variance map to obtain the final, accurate hair mask. The invention skillfully exploits, in reverse, the characteristic that time-of-flight measurement produces a certain degree of noise on hair, and thereby achieves higher-precision segmentation of the hair.

Description

Hair segmentation method based on TOF camera
Technical Field
The invention relates to the field of three-dimensional images, in particular to a hair segmentation method that uses a TOF camera and reversely exploits its noise.
Background
At present, hair segmentation methods are not very robust, mainly because segmentation based on RGB images is severely affected by illumination and background. Traditional image-based segmentation methods cannot complete the task well, and hair segmentation in high-definition images in particular is a very difficult problem.
With the advent of the artificial intelligence era, attempts have been made to segment hair using deep learning methods. A very obvious problem, however, is that there are currently few datasets for hair. The main reasons are that hair varies greatly between people and its boundary is very irregular, which makes data annotation very difficult. Given that dataset annotation is so difficult and inaccurate, deep learning alone cannot segment hair very well.
A TOF (Time of Flight) camera works by continuously sending light pulses toward the object, receiving the light returned from the object with a sensor, and obtaining the distance to the object by measuring the round-trip flight time of the light pulses, finally yielding depth information for the whole image. However, if hair is photographed directly with a TOF camera, the resulting depth image turns out to be quite noisy, mainly because the return times of the light pulses are not exactly the same: hair does not present a uniform depth from any one direction. The more jagged the hair, the more severe the noise in the resulting depth map; in general, the depth map is much noisier in the hair region than in the face and background regions.
Typically, capturing hair with a TOF camera produces a significant amount of noise. Noise means the depth image varies discontinuously and edges are unclear, which is normally very harmful. In this case, however, when only the head is photographed, the noise level of the hair region stands out sharply against that of the face and background. This noise information, which is visually harmful and would normally require smoothing, can instead be used in reverse: the noise indicates the approximate probability of hair being at each location.
Disclosure of Invention
The invention aims to solve the existing problems and provides a hair segmentation method based on a TOF camera.
In order to achieve the above purpose, the technical scheme adopted by the invention comprises the following steps:
step one, establishing a deep learning network based on a data set of a hair color chart and a mask thereof;
step two, obtaining a depth map by using a TOF camera, and obtaining a corresponding variance map;
step three, optimizing the preliminary hair mask obtained by deep learning to obtain an optimized hair mask; and step four, re-optimizing the optimized hair mask by using the variance map to obtain the final, accurate hair mask.
Noise is added to the hair in the dataset and the brightness and saturation are varied to obtain the training data; an encoding-decoding deep learning network is then constructed and, in combination with gradient information of the images, trained to fit, finally yielding a relatively accurate hair segmentation result.
In step two, for each pixel, a small patch centered on that pixel is taken and a threshold is set; the variance is computed over the pixels in the patch whose difference from the center pixel is within the threshold, finally yielding a variance map. The formula is:

Var = (1/n) Σ_i (d_i − d̄)², over all d_i with |d_i − d_center| < threshold,

where the d_i are the depth values in the patch, d̄ is the mean of the selected d_i, and n is the number of such d_i.
In step three, the preliminary hair mask is optimized by using the distance, color, and variance information between pixels, in combination with a Dense CRF model, to obtain the optimized hair mask.
Compared with the prior art, the method skillfully and reversely exploits the characteristic that time of flight produces a certain degree of noise on hair, and optimizes the hair mask twice in combination with the variance map, thereby achieving higher-precision segmentation of the hair; optimizing the hair mask with the distance, color, and variance information between pixels yields a mask of higher quality.
Drawings
FIG. 1 is a schematic diagram of a constructed deep learning network;
FIG. 2 is a flow chart of one embodiment of the present invention.
Detailed Description
The invention will now be further described with reference to the accompanying drawings.
Referring to figs. 1 and 2, which show an embodiment of the present invention: the embodiment reversely exploits the characteristic that time of flight produces a certain degree of noise on hair captured by a TOF camera, and thereby achieves higher-precision segmentation of the hair.
First, deep learning achieves a preliminary segmentation of the hair.
That is, a hair segmentation deep learning network is constructed from an existing dataset — small in quantity and not very finely annotated — of hair color images plus masks; the important gradient information of the hair is incorporated, and a preliminary hair segmentation map is obtained through training.
Referring to fig. 1, the present embodiment uses the Figaro 1K hair dataset, which contains 1050 hair images. Data enhancement is performed on this dataset — adding noise to the hair, changing brightness and saturation, and so on — finally yielding about 10,000 training images. An encoding-decoding (Encoder-Decoder) deep learning network is then constructed and, in combination with gradient information of the images, trained to fit, finally yielding a relatively accurate hair segmentation result.
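The augmentation step described above (noise plus brightness/saturation jitter, expanding roughly 1k images to roughly 10k) can be sketched as follows; this is an illustrative sketch only — the noise level, the jitter ranges, and the helper name `augment` are assumptions, not values from the patent:

```python
import numpy as np

def augment(rgb, rng):
    """Illustrative augmentation: additive noise plus brightness and
    saturation jitter on an HxWx3 uint8 image."""
    img = rgb.astype(np.float64)
    img += rng.normal(0.0, 8.0, img.shape)        # additive Gaussian noise
    img *= rng.uniform(0.7, 1.3)                  # brightness scale
    gray = img.mean(axis=2, keepdims=True)        # per-pixel luminance proxy
    img = gray + rng.uniform(0.7, 1.3) * (img - gray)  # saturation scale
    return np.clip(img, 0, 255).astype(np.uint8)
```

Running `augment` several times per source image with different random draws yields the enlarged training set.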
Next, a variance map is computed from the depth map obtained by the TOF camera.
Referring to fig. 2, for each pixel a small patch centered on that pixel is taken, and a threshold is set (obtained by preprocessing); the variance is computed over the pixels in the patch whose difference from the center pixel is within the threshold, finally yielding a variance map. The formula is:

Var = (1/n) Σ_i (d_i − d̄)², over all d_i with |d_i − d_center| < threshold,

where the d_i are the depth values in the patch, d̄ is the mean of the selected d_i, and n is the number of such d_i.
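The thresholded patch-variance computation above can be sketched as follows; the patch size and the depth threshold are illustrative assumptions (the patent only says the threshold is obtained by preprocessing):

```python
import numpy as np

def variance_map(depth, patch=3, threshold=50.0):
    """Per-pixel variance over the patch pixels whose depth differs
    from the center pixel by less than `threshold`."""
    h, w = depth.shape
    r = patch // 2
    out = np.zeros((h, w), dtype=np.float64)
    padded = np.pad(depth.astype(np.float64), r, mode='edge')
    for y in range(h):
        for x in range(w):
            block = padded[y:y + patch, x:x + patch]
            center = float(depth[y, x])
            sel = block[np.abs(block - center) < threshold]
            if sel.size > 1:          # variance needs at least two samples
                out[y, x] = sel.var()
    return out
```

On a flat depth region the result is zero; in the jagged hair region, where neighboring depths fluctuate, the variance is large — which is exactly the cue the method exploits.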
Next, the hair mask is optimized a first time based on the Dense CRF method.
The traditional Dense CRF method performs global optimization of an image based mainly on the color information and the position information between pixels, so that edges become sharper. However, when background colors such as clothing are similar to the hair color, this yields a very poor result. In this embodiment, the distance, color, and variance information between pixels are all used to optimize the hair mask, so a better result can be obtained.
The preliminary hair mask obtained by deep learning is optimized by combining the distance, color, and variance information between pixels with the Dense CRF model, yielding the optimized hair mask:
the formula is as above. Where x is the label of the pixel point of the input image. Psi u (x i ) Is a first order mapping function. Psi phi type p (x i ,x j ) Mapping for the interrelationship between pixels. At psi p (x i ,x j ) In (x) i ,x j ) Mu (x) i ,x j )=[x i ≠x j ]。k(f i ,f j ) The first term is the relationship between pixel color value, pixel location, and pixel variance value as a relationship function. The second term is a smoothing function, which depends mainly on the relation between pixel locations.
And finally, performing second optimization on the optimized hair mask by using the variance diagram.
The optimized result obtained with the Dense CRF model incorrectly includes parts such as the eyes and eyebrows, mainly because the variance map and color map give those regions high weight as well, so they cannot be effectively excluded. Meanwhile, because color information is taken into account, the result is easily affected by illumination.
In this example the hair is essentially black, but due to lighting a large portion of it appears white, so the Dense CRF produces an erroneous result there.
However, this embodiment still wants to retain the great advantage of the Dense CRF: it recovers fine edge detail.
Therefore, to avoid these errors, the resulting mask is optimized once more using only the variance map. As can be understood from the formula above, because color information is heavily weighted, the obtained mask is small relative to the truly correct mask; the mask is therefore expanded based on variance, so that the final result accurately matches the hair mask.
The specific algorithm traverses, pixel by pixel, the whole binary mask computed in the previous step. If the current pixel is a non-mask pixel, an optimization check based on color and variance is performed on this pixel and its surrounding 4×4 region. If more than half of the surrounding pixels differ little from the center pixel in color and variance, and they are all confirmed mask pixels, the tested pixel is changed to a mask pixel. Rounds of optimization iterations are performed until the mask area no longer changes. Finally, only the mask's largest connected region is kept, and the smaller connected regions are removed.
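The mask-growing pass and the largest-connected-region filter above can be sketched as follows; the color and variance tolerances are illustrative assumptions, and the neighborhood is taken as the 4×4 window around each pixel as described:

```python
import numpy as np
from collections import deque

def grow_mask(mask, color, var, color_tol=30.0, var_tol=10.0):
    """Flip non-mask pixels whose 4x4 neighborhood is mostly mask pixels
    with similar color and variance; iterate until the mask stabilizes."""
    mask = mask.astype(bool).copy()
    h, w = mask.shape
    changed = True
    while changed:
        changed = False
        for y in range(h):
            for x in range(w):
                if mask[y, x]:
                    continue
                y0, y1 = max(0, y - 2), min(h, y + 2)
                x0, x1 = max(0, x - 2), min(w, x + 2)
                nb = mask[y0:y1, x0:x1]
                similar = ((np.abs(color[y0:y1, x0:x1] - color[y, x]) < color_tol)
                           & (np.abs(var[y0:y1, x0:x1] - var[y, x]) < var_tol)
                           & nb)
                if similar.sum() > nb.size // 2:
                    mask[y, x] = True
                    changed = True
    return mask

def largest_component(mask):
    """Keep only the largest 4-connected region of a boolean mask (BFS)."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    best = []
    for y in range(h):
        for x in range(w):
            if mask[y, x] and not seen[y, x]:
                comp, q = [], deque([(y, x)])
                seen[y, x] = True
                while q:
                    cy, cx = q.popleft()
                    comp.append((cy, cx))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = cy + dy, cx + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                if len(comp) > len(best):
                    best = comp
    out = np.zeros((h, w), dtype=bool)
    for y, x in best:
        out[y, x] = True
    return out
```

`grow_mask` fills holes and expands the color-starved Dense CRF mask outward wherever the variance cue agrees, and `largest_component` then discards the small spurious regions.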
The embodiments of the present invention have been described above with reference to the accompanying drawings and examples; they are not to be construed as limiting the invention, and those skilled in the art may make modifications as required, all within the scope of the appended claims.

Claims (3)

1. A hair segmentation method based on a TOF camera, comprising:
step one, establishing a deep learning network based on a data set of a hair color chart and a mask thereof;
step two, obtaining a depth map by using a TOF camera, and obtaining a corresponding variance map;
step three, optimizing the preliminary hair mask obtained by deep learning, using the distance, color, and variance information between pixels in combination with a Dense CRF model, to obtain an optimized hair mask:
wherein x is the labeling of the pixels of the input image; ψ_u(x_i) is the first-order (unary) potential;
ψ_p(x_i, x_j) models the pairwise relationship between pixels;
in ψ_p(x_i, x_j) = μ(x_i, x_j) k(f_i, f_j), μ(x_i, x_j) = [x_i ≠ x_j]; k(f_i, f_j) is the kernel function, whose first term relates the pixel position, pixel color value, and pixel variance value, and whose second term is a smoothing term that depends mainly on the relation between pixel positions;
step four, optimizing the optimized hair mask again by using the variance map to obtain the final accurate hair mask: traversing, pixel by pixel, the hair mask obtained in step three; if the current pixel is determined to be a non-mask pixel, performing an optimization check based on color and variance on the current pixel and its surrounding region; and if more than half of the surrounding pixels differ little from the center pixel in color and variance, and they are all confirmed mask pixels, changing the tested pixel to a mask pixel.
2. The TOF camera-based hair segmentation method according to claim 1, wherein: noise is added to the hair in the dataset and the brightness and saturation are varied to obtain the training data; an encoding-decoding deep learning network is then constructed and, in combination with gradient information of the images, trained to fit, finally yielding a relatively accurate hair segmentation result.
3. The TOF camera-based hair segmentation method according to claim 1, wherein: in step two, for each pixel, a threshold is set for a small patch centered on that pixel; the variance is computed over the pixels in the patch whose difference from the center pixel is within the threshold, finally yielding a variance map.
CN201811452085.1A 2018-11-30 2018-11-30 Hair segmentation method based on TOF camera Active CN110084826B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811452085.1A CN110084826B (en) 2018-11-30 2018-11-30 Hair segmentation method based on TOF camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811452085.1A CN110084826B (en) 2018-11-30 2018-11-30 Hair segmentation method based on TOF camera

Publications (2)

Publication Number Publication Date
CN110084826A CN110084826A (en) 2019-08-02
CN110084826B true CN110084826B (en) 2023-09-12

Family

ID=67412863

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811452085.1A Active CN110084826B (en) 2018-11-30 2018-11-30 Hair segmentation method based on TOF camera

Country Status (1)

Country Link
CN (1) CN110084826B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102147852A (en) * 2010-02-04 2011-08-10 三星电子株式会社 Method for detecting hair area
CN102609686A (en) * 2012-01-19 2012-07-25 宁波大学 Pedestrian detection method
CN106373096A (en) * 2016-08-30 2017-02-01 电子科技大学 Multi-feature weight adaptive shadow elimination method
JP2017188071A (en) * 2016-03-30 2017-10-12 大日本印刷株式会社 Pattern change simulation device, pattern change simulation method and program
CN107277491A (en) * 2012-11-01 2017-10-20 谷歌公司 Generate the method and corresponding medium of the depth map of image
WO2017181332A1 (en) * 2016-04-19 2017-10-26 浙江大学 Single image-based fully automatic 3d hair modeling method
CN108133182A (en) * 2017-12-18 2018-06-08 北京国能日新系统控制技术有限公司 A kind of generation of electricity by new energy Forecasting Methodology and device based on cloud imaging
CN108447060A (en) * 2018-01-29 2018-08-24 上海数迹智能科技有限公司 Front and back scape separation method based on RGB-D images and its front and back scene separation device
AU2018101336A4 (en) * 2018-09-12 2018-10-11 Hu, Yuan Miss Building extraction application based on machine learning in Urban-Suburban-Integration Area
CN108765371A (en) * 2018-04-25 2018-11-06 浙江大学 The dividing method of unconventional cell in a kind of pathological section

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110194762A1 (en) * 2010-02-04 2011-08-11 Samsung Electronics Co., Ltd. Method for detecting hair region
US9928601B2 (en) * 2014-12-01 2018-03-27 Modiface Inc. Automatic segmentation of hair in images
US10582907B2 (en) * 2016-10-31 2020-03-10 Siemens Healthcare Gmbh Deep learning based bone removal in computed tomography angiography

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102147852A (en) * 2010-02-04 2011-08-10 三星电子株式会社 Method for detecting hair area
CN102609686A (en) * 2012-01-19 2012-07-25 宁波大学 Pedestrian detection method
CN107277491A (en) * 2012-11-01 2017-10-20 谷歌公司 Generate the method and corresponding medium of the depth map of image
JP2017188071A (en) * 2016-03-30 2017-10-12 大日本印刷株式会社 Pattern change simulation device, pattern change simulation method and program
WO2017181332A1 (en) * 2016-04-19 2017-10-26 浙江大学 Single image-based fully automatic 3d hair modeling method
CN106373096A (en) * 2016-08-30 2017-02-01 电子科技大学 Multi-feature weight adaptive shadow elimination method
CN108133182A (en) * 2017-12-18 2018-06-08 北京国能日新系统控制技术有限公司 A kind of generation of electricity by new energy Forecasting Methodology and device based on cloud imaging
CN108447060A (en) * 2018-01-29 2018-08-24 上海数迹智能科技有限公司 Front and back scape separation method based on RGB-D images and its front and back scene separation device
CN108765371A (en) * 2018-04-25 2018-11-06 浙江大学 The dividing method of unconventional cell in a kind of pathological section
AU2018101336A4 (en) * 2018-09-12 2018-10-11 Hu, Yuan Miss Building extraction application based on machine learning in Urban-Suburban-Integration Area

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhu Qingtang et al. "Deep Learning," in Biofabrication and Clinical Evaluation of Peripheral Nerve Defect Repair Materials (《周围神经缺损修复材料的生物制造与临床评估》), Sun Yat-sen University Press, 2018, pp. 136-137, 139-141. *

Also Published As

Publication number Publication date
CN110084826A (en) 2019-08-02

Similar Documents

Publication Publication Date Title
US11457138B2 (en) Method and device for image processing, method for training object detection model
CN107771336B (en) Feature detection and masking in images based on color distribution
Li et al. Multi-angle head pose classification when wearing the mask for face recognition under the COVID-19 coronavirus epidemic
US9547908B1 (en) Feature mask determination for images
US9530071B2 (en) Hierarchical interlinked multi-scale convolutional network for image parsing
CN108537782B (en) Building image matching and fusing method based on contour extraction
US11854244B2 (en) Labeling techniques for a modified panoptic labeling neural network
CN110866871A (en) Text image correction method and device, computer equipment and storage medium
CN103035013B (en) A kind of precise motion shadow detection method based on multi-feature fusion
US20100098331A1 (en) System and method for segmenting foreground and background in a video
CN103810473B (en) A kind of target identification method of human object based on HMM
CN111160291B (en) Human eye detection method based on depth information and CNN
CN102982334B (en) The sparse disparities acquisition methods of based target edge feature and grey similarity
CN109657612B (en) Quality sorting system based on facial image features and application method thereof
CN107944437B (en) A kind of Face detection method based on neural network and integral image
CN102024156A (en) Method for positioning lip region in color face image
CN110245600B (en) Unmanned aerial vehicle road detection method for self-adaptive initial quick stroke width
CN109740572A (en) A kind of human face in-vivo detection method based on partial color textural characteristics
CN103632153B (en) Region-based image saliency map extracting method
US11042986B2 (en) Method for thinning and connection in linear object extraction from an image
CN104851089A (en) Static scene foreground segmentation method and device based on three-dimensional light field
CN107689050A (en) A kind of depth image top sampling method based on Color Image Edge guiding
CN110009670A (en) The heterologous method for registering images described based on FAST feature extraction and PIIFD feature
CN114693760A (en) Image correction method, device and system and electronic equipment
CN107730508A (en) Color documents images multichannel binary processing method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant