CN114119386A - Data enhancement method and device based on a deep convolutional adversarial network and Poisson fusion - Google Patents

Data enhancement method and device based on a deep convolutional adversarial network and Poisson fusion

Info

Publication number: CN114119386A
Application number: CN202111183233.6A
Authority: CN (China)
Original language: Chinese (zh)
Legal status: Pending
Prior art keywords: road, pit defect, image, sample, defect image
Inventors: 李琦, 于令君, 王鑫, 白卓玉
Applicant and assignee (original and current): Inner Mongolia University of Science and Technology

(The legal status and assignee listings are assumptions, not legal conclusions; Google has not performed a legal analysis and makes no representation as to their accuracy.)

Classifications

    • G06T 5/70 (Physics — Computing — Image data processing or generation) — Image enhancement or restoration: denoising; smoothing
    • G06F 18/214 (Electric digital data processing) — Pattern recognition: generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N 3/045 (Computing arrangements based on biological models) — Neural networks: combinations of networks
    • G06T 7/11 (Image analysis) — Segmentation; edge detection: region-based segmentation
    • G06T 2207/20081 (Indexing scheme for image analysis or image enhancement) — Training; learning
    • G06T 2207/20084 (Indexing scheme for image analysis or image enhancement) — Artificial neural networks [ANN]
    • G06T 2207/20132 (Indexing scheme for image analysis or image enhancement) — Image segmentation details: image cropping


Abstract

The invention relates to a data enhancement method based on a deep convolutional adversarial network and Poisson fusion, comprising the following steps: acquiring urban road images and building a road damage data set; cropping the pit defect regions in the road damage data set and normalizing them; feeding the normalized pit defect images into a deep convolutional generative adversarial network (DCGAN), which receives uniformly distributed random noise to generate synthetic pit defect images, and discriminating the pit defect images; and smoothly inserting the pit defect images generated by the DCGAN into road images through Poisson fusion to generate new road defect images. The method can augment the set of road pit defect images and meet the training-sample requirements of road pit defect image recognition.

Description

Data enhancement method and device based on a deep convolutional adversarial network and Poisson fusion
Technical Field
The invention relates to the technical field of data processing, and in particular to a data enhancement method and device based on a deep convolutional adversarial network and Poisson fusion.
Background
Roads are the engineering facilities over which trackless vehicles and pedestrians travel; after prolonged use, pavement distresses such as cracks, ruts, potholes and surface damage appear. Potholes are the most harmful form of road damage: they make driving rough and unsafe, degrade vehicle performance and shorten the service life of the road, and in particular rainwater seeping through them erodes the concrete roadbed and weakens the strength of the base layer and subgrade. Pothole detection is therefore an important means of discovering road safety hazards in time. Traditional pit defect identification relies mainly on human vision, and with roads criss-crossing everywhere its efficiency and accuracy are low. With the development of digital image processing, manual inspection is gradually being replaced by neural-network-based detection, and deep convolutional neural networks can now analyze pavement damage accurately.
Neural-network-based road pit defect detection requires a large number of training images to reach sufficient accuracy, yet in practical engineering it is difficult to obtain large volumes of high-quality labeled data: the complexity of the environment causes large variation between samples, and the more severe the damage, the harder it is to collect. Because road crews repair pit defects promptly, the number of real road pit defect images is limited, so additional training data must be obtained through data enhancement. Common data enhancement methods include geometric transformation, color transformation and random erasing, but the images they produce are essentially similar to the originals, so the performance gain is limited and the training-sample requirements of road pit defect image recognition cannot be met.
Disclosure of Invention
In order to overcome the above technical defects in the prior art, the invention provides a data enhancement method and device based on a deep convolutional adversarial network and Poisson fusion, which can effectively solve the problems described in the background section.
In order to solve the technical problems, the technical scheme provided by the invention is as follows:
In one aspect, an embodiment of the invention discloses a data enhancement method based on a deep convolutional adversarial network and Poisson fusion, comprising the following steps:
acquiring urban road images and building a road damage data set;
cropping the pit defect regions in the road damage data set and normalizing them;
feeding the normalized pit defect image into a deep convolutional generative adversarial network (DCGAN), which receives uniformly distributed random noise to generate a synthetic pit defect image, and discriminating the pit defect image;
and smoothly inserting the pit defect image generated by the DCGAN into a road image through Poisson fusion to generate a new road defect image.
In any of the above schemes, preferably, an image acquisition device is mounted on a vehicle, and its acquisition angle is adjusted so that the shooting range covers more than the width of the road.
In any of the above aspects, the road damage data set is preferably acquired as follows:
(1) from the collected road damage footage, screen out 1920 × 1080 pixel images at 10 fps as an initial data set;
(2) manually remove blurred and cluttered road damage images from the initial data set;
(3) label the lower 2/3 of each image remaining after step (2) with the LabelImg image annotation tool to obtain a VOC data set, and define the prepared VOC data set as the road damage data set.
In any of the above aspects, preferably, a rectangular box of road surface surrounding each pit defect is cropped from the road damage data set to generate a pit defect image, and the resolution of the pit defect image is normalized to 96 × 96, where a pit is defined as a depression in the pavement with part of the road surface peeled off.
In any of the above schemes, preferably, new pit defects are generated by a deep convolutional generative adversarial network (DCGAN). The DCGAN comprises a generative model and a discriminative model: the generative model receives uniformly distributed random noise and, by learning the probability distribution of the existing real data samples, generates samples that follow that distribution; the discriminative model is trained as a binary classifier responsible for judging whether a sample comes from the real examples of the road damage data set or is a forgery, and its output represents the probability that the input is a real data sample.
In any of the above schemes, preferably, the generative model receives 100-dimensional uniform random noise and converts it with a fully convolutional network into 1024 feature maps of size 4 × 4; upsampling is then performed by four transposed convolutions with 5 × 5 kernels, stride 2 × 2 and 512, 256, 128 and 3 kernels in turn. The output layer uses a Tanh activation function and the other layers use ReLU; the Tanh activation takes values in [-1, 1], has mean 0 and is symmetric about the origin.
In any of the above aspects, the discriminative model is preferably a convolutional network; after the convolutional layers, fully connected layers, which slow convergence, are avoided, and dropout is used to provide noisy input to the convolutional layers.
In any of the above schemes, preferably, the discriminative model receives a generated sample image of 96 pixel × 96 pixel × 3 and passes it through convolutions with 5 × 5 kernels, stride 2 × 2 and 64, 128, 256 and 512 kernels in turn, using LeakyReLU activations in these layers; finally, a convolution with a 1 × 1 kernel and a sigmoid activation maps the output to the discrimination probability, judging the authenticity of the new pit defect samples generated by the generative model.
In any of the above schemes, preferably, during training the generative model is used to generate samples resembling the original data, while the discriminative model is used to distinguish the generated samples from real ones. The training process is expressed as:

$$\min_G \max_D V(D,G) = \mathbb{E}_{x \sim p_{data}(x)}\big[\log_a D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log_a\big(1 - D(G(z))\big)\big]$$

where D is the discriminative function, G is the generative function, E(·) denotes the expected value, x is a real data sample, p_data(x) is the probability distribution of the real data, a is an arbitrary number greater than 0 and not equal to 1, D(x) is the probability that a real input sample is judged real, z is random noise following a Gaussian or uniform distribution, p_z(z) is the probability distribution of the initial noise, G(z) is a sample produced by the generative model, and D(G(z)) is the probability that a generated sample is judged real.
In another aspect, a data enhancement apparatus based on a deep convolutional adversarial network and Poisson fusion comprises:
an acquisition module for acquiring urban road images and building a road damage data set;
a processing module for cropping the pit defect regions in the road damage data set and normalizing them;
a discrimination module for feeding the normalized pit defect image into a deep convolutional generative adversarial network (DCGAN), which receives uniformly distributed random noise to generate a synthetic pit defect image, and for discriminating the pit defect image;
and a generating module for smoothly inserting the pit defect image generated by the DCGAN into a road image through Poisson fusion to generate a new road defect image.
Compared with the prior art, the invention has the following beneficial effects:
the method acquires urban road images and builds a road damage data set; crops the pit defect regions in the data set and normalizes them; feeds the normalized pit defect images into a deep convolutional generative adversarial network (DCGAN), which receives uniformly distributed random noise to generate synthetic pit defect images, and discriminates them; and smoothly inserts the pit defect images generated by the DCGAN into road images through Poisson fusion to generate new road defect images. The set of road pit defect images can thus be augmented, meeting the training-sample requirements of road pit defect image recognition.
Drawings
The drawings are included to provide a further understanding of the invention, and are incorporated in and constitute a part of this specification.
FIG. 1 is a flow chart of the data enhancement method based on a deep convolutional adversarial network and Poisson fusion of the present invention;
FIG. 2 is a logic diagram of the data enhancement method based on a deep convolutional adversarial network and Poisson fusion of the present invention;
FIG. 3 is a flow chart of the generation module of the data enhancement method based on a deep convolutional adversarial network and Poisson fusion;
FIG. 4 is a flow chart of the discrimination module of the data enhancement method based on a deep convolutional adversarial network and Poisson fusion;
FIG. 5 is a schematic diagram of the Poisson fusion process of the data enhancement method based on a deep convolutional adversarial network and Poisson fusion;
FIG. 6 is a schematic diagram of the position of the image acquisition device of the data enhancement method based on a deep convolutional adversarial network and Poisson fusion;
FIG. 7 is a schematic block diagram of a data enhancement device based on a deep convolutional adversarial network and Poisson fusion according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It will be understood that when an element is referred to as being "secured to" or "disposed on" another element, it can be directly on the other element or be indirectly on the other element. When an element is referred to as being "connected to" another element, it can be directly connected to the other element or be indirectly connected to the other element.
In the description of the present invention, it is to be understood that the terms "length", "width", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on the orientations or positional relationships illustrated in the drawings, and are used merely for convenience in describing the present invention and for simplicity in description, and do not indicate or imply that the devices or elements referred to must have a particular orientation, be constructed in a particular orientation, and be operated, and thus, are not to be construed as limiting the present invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
For better understanding of the above technical solutions, the technical solutions of the present invention will be described in detail below with reference to the drawings and the detailed description of the present invention.
The invention provides a data enhancement method based on a deep convolutional adversarial network and Poisson fusion; as shown in FIGS. 1 and 2, the method comprises the following steps:
step one, acquiring urban road images and acquiring a road damage data set.
Specifically, as shown in FIG. 6, in order to ensure the authenticity of the acquired road damage images, the image acquisition device is mounted on a vehicle, and its acquisition angle is adjusted so that the shooting range covers more than the width of the road.
In a specific embodiment, the image acquisition device is preferably a DJI Osmo camera and the vehicle is preferably an automobile; the camera is fixed to the trunk of the car at a height of 120 cm above the ground, video recording is started, and the car is driven at 60 km/h to 80 km/h to collect road damage footage.
Further, the road damage data set is obtained as follows:
(1) from the collected road damage footage, screen out 1920 × 1080 pixel images at 10 fps as an initial data set;
(2) manually remove blurred and cluttered road damage images from the initial data set;
(3) label the lower 2/3 of each image remaining after step (2) with the LabelImg image annotation tool to obtain a VOC data set, and define the prepared VOC data set as the road damage data set.
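Step (2) above is manual in the patent; as an aside not stated in the patent, a variance-of-the-Laplacian check is one common way such blur screening could be assisted. The following is a hedged sketch (the threshold value is an illustrative assumption and would need tuning on real footage):

```python
import numpy as np

def laplacian_variance(gray: np.ndarray) -> float:
    """Variance of the discrete Laplacian over the image interior.

    Low values suggest a blurry image (few sharp edges)."""
    gray = gray.astype(float)
    lap = (-4.0 * gray[1:-1, 1:-1]
           + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return float(lap.var())

def screen_sharp(images, threshold: float = 100.0):
    """Keep only images whose Laplacian variance exceeds the threshold."""
    return [im for im in images if laplacian_variance(im) > threshold]
```

A perfectly flat image scores 0, while sharp textured content scores high, so ranking candidate frames by this score can shortlist the blurry ones for manual removal.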
Step two: crop the pit defect regions in the road damage data set and normalize them.
Specifically, a rectangular box of road surface surrounding each pit defect is cropped from the road damage data set to generate a pit defect image, and its resolution is normalized to 96 × 96, where a pit is defined as a depression in the pavement with an area of the surface peeled off; in the road damage data set, pits and holes appearing in the road surface are identified as road pit defects regardless of how they formed.
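The crop-and-normalize step can be sketched as follows. The `(x0, y0, x1, y1)` box format and the nearest-neighbour resize are assumptions for illustration (the patent does not specify an interpolation method; any standard resize would serve):

```python
import numpy as np

def crop_and_normalize(image: np.ndarray, box, size: int = 96) -> np.ndarray:
    """Crop the (x0, y0, x1, y1) rectangle around a pit defect and
    resize it to size x size with nearest-neighbour sampling."""
    x0, y0, x1, y1 = box
    patch = image[y0:y1, x0:x1]
    h, w = patch.shape[:2]
    # Map each output row/column back to a source row/column.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return patch[rows][:, cols]
```

Applied to each labeled rectangle in the data set, this yields the 96 × 96 pit defect images fed to the DCGAN in step three.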
Step three: feed the normalized pit defect images into a deep convolutional generative adversarial network (DCGAN), which receives uniformly distributed random noise to generate synthetic pit defect images, and discriminate the pit defect images.
Specifically, because pit defects are both complex and harmful, they are usually repaired promptly, so it is difficult to collect a sufficient number of road pit defect images to crop from the road damage data set.
Further, new pit defects are generated by the deep convolutional generative adversarial network (DCGAN) in order to enlarge the number of pit defect image samples. The DCGAN comprises a generative model and a discriminative model: the generative model receives uniformly distributed random noise and, by learning the probability distribution of the existing real data samples, generates samples that follow that distribution as closely as possible; the discriminative model is trained as a binary classifier responsible for judging whether a sample comes from the real examples of the road damage data set or is a forgery, its output representing the probability that the input is a real data sample.
Further, as shown in FIG. 3, the generative model receives 100-dimensional uniform random noise (1 × 1 × 100) and converts it with a fully convolutional network into 1024 feature maps of size 4 × 4; upsampling is then performed by four transposed convolutions with 5 × 5 kernels, stride 2 × 2 and 512, 256, 128 and 3 kernels in turn. The output layer uses a Tanh activation function and the other layers use ReLU; the Tanh activation takes values in [-1, 1], has mean 0 and is symmetric about the origin.
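A minimal generator sketch in PyTorch (the framework is not named in the patent and is an assumption here). Note one adjustment, flagged as such: four stride-2 upsamplings of 4 × 4 maps would yield 64 × 64, so to reach the stated 96 × 96 output this sketch starts from 6 × 6 feature maps; the padding choices and BatchNorm placement are likewise assumptions:

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """DCGAN-style generator: 100-d uniform noise -> 3 x 96 x 96 image in [-1, 1]."""
    def __init__(self, noise_dim: int = 100):
        super().__init__()
        # Project noise to 1024 feature maps. The patent states 4x4 maps, but
        # four stride-2 upsamplings would then give 64x64; 6x6 is assumed here
        # so the output matches the stated 96x96 resolution.
        self.project = nn.Sequential(
            nn.Linear(noise_dim, 1024 * 6 * 6), nn.ReLU(inplace=True))

        def up(cin, cout, last=False):
            # Each transposed conv (5x5 kernel, stride 2) doubles the spatial size.
            layers = [nn.ConvTranspose2d(cin, cout, kernel_size=5, stride=2,
                                         padding=2, output_padding=1)]
            layers += [nn.Tanh()] if last else [nn.BatchNorm2d(cout),
                                                nn.ReLU(inplace=True)]
            return layers

        self.deconv = nn.Sequential(
            *up(1024, 512), *up(512, 256), *up(256, 128), *up(128, 3, last=True))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        x = self.project(z).view(-1, 1024, 6, 6)
        return self.deconv(x)        # 6 -> 12 -> 24 -> 48 -> 96
```

The Tanh output layer matches the [-1, 1] range described above, which in turn assumes the real pit defect images are scaled to [-1, 1] before training.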
Further, compared with upsampling by nearest-neighbor or bilinear interpolation, the transposed convolution used by the generative model has parameters that are initialized in the same way as those of an ordinary convolution but can be learned during training.
Further, as shown in FIG. 4, the discriminative model is a convolutional network in which strided convolutional layers replace spatial pooling layers, so that pixel values at fixed positions are not discarded during downsampling and the network can learn its own downsampling; fully connected layers after the convolutional layers, which slow convergence, are avoided, and dropout is used to provide noisy input to the convolutional layers and prevent overfitting.
Further, the discriminative model receives a generated sample image of 96 pixel × 96 pixel × 3. After convolutions with 5 × 5 kernels, stride 2 × 2 and 64, 128, 256 and 512 kernels in turn, whose layers use LeakyReLU activations to keep the network trainable, a final convolution with a 1 × 1 kernel and a sigmoid activation maps the output to the discrimination probability, judging the authenticity of the new pit defect samples generated by the generative model. The 5 × 5 kernels give a sufficiently large receptive field with few parameters, which helps suppress noise while extracting the structural information of the defect.
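A matching discriminator sketch, again in PyTorch and under stated assumptions: on a 96 × 96 input the four stride-2 convolutions leave a 6 × 6 map, so the 1 × 1-convolution-plus-sigmoid head described above produces a 6 × 6 score map, which this sketch averages into a single probability (an interpretation, not stated verbatim); the dropout rate and padding are also assumed:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """DCGAN-style discriminator: 3 x 96 x 96 image -> probability of being real."""
    def __init__(self):
        super().__init__()

        def down(cin, cout):
            # Strided conv halves the spatial size; dropout supplies the
            # noisy input to the next conv layer described in the text.
            return [nn.Conv2d(cin, cout, kernel_size=5, stride=2, padding=2),
                    nn.LeakyReLU(0.2, inplace=True),
                    nn.Dropout2d(0.3)]

        self.features = nn.Sequential(
            *down(3, 64), *down(64, 128), *down(128, 256), *down(256, 512))
        # 1x1 conv + sigmoid per the text; the resulting 6x6 score map is
        # averaged into one probability below (an interpretation).
        self.head = nn.Sequential(nn.Conv2d(512, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x)).mean(dim=(2, 3))  # 96->48->24->12->6
```

Paired with the generator above, this completes the two halves of the adversarial game.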
Further, the last layer of the discriminative model outputs a probability through a sigmoid activation function. With ReLU, a negative input produces an activation of 0 and the neuron cannot learn; the LeakyReLU activation is instead small but nonzero for negative inputs, allowing the neuron to continue learning.
Furthermore, both the generative model and the discriminative model in the DCGAN use convolutional networks, which improves their generation and discrimination capabilities; the generative model uses transposed convolutions and the discriminative model uses ordinary convolutions.
Except for the output layer of the generative model and the input layer of the discriminative model, every layer uses batch normalization (BN). BN keeps the input of each node normally distributed with mean 0 and variance 1, which places the activation function in the region sensitive to its input and thereby accelerates training of the model.
Further, during training the generative model tries to generate samples as similar as possible to the original data, while the discriminative model tries to distinguish the generated samples from real ones; the two models are optimized alternately and promote each other in a dynamic game, finally reaching a Nash equilibrium at which the discriminative model can no longer tell the generated samples from real ones.
Further, the training process is expressed as:

$$\min_G \max_D V(D,G) = \mathbb{E}_{x \sim p_{data}(x)}\big[\log_a D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log_a\big(1 - D(G(z))\big)\big]$$

where D is the discriminative function, G is the generative function, E(·) denotes the expected value, x is a real data sample, p_data(x) is the probability distribution of the real data, a is an arbitrary number greater than 0 and not equal to 1, D(x) is the probability that a real input sample is judged real, z is random noise following a Gaussian or uniform distribution, p_z(z) is the probability distribution of the initial noise, G(z) is a sample produced by the generative model, and D(G(z)) is the probability that a generated sample is judged real.
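The alternating optimization of this minimax objective can be illustrated with a toy PyTorch sketch using small linear networks on 1-D data (everything here — network sizes, the Adam optimizer, the non-saturating generator loss — is standard GAN practice assumed for illustration, not specified by the patent; binary cross-entropy corresponds to the value function with the natural logarithm, and a different base a only rescales it):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))  # toy generator
D = nn.Sequential(nn.Linear(1, 16), nn.LeakyReLU(0.2),
                  nn.Linear(16, 1), nn.Sigmoid())                  # toy discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCELoss()

def train_step(real: torch.Tensor):
    n = real.size(0)
    z = torch.rand(n, 8)                  # uniform noise, as in the patent
    fake = G(z)
    # D step: maximize log D(x) + log(1 - D(G(z))).
    opt_d.zero_grad()
    d_loss = (bce(D(real), torch.ones(n, 1))
              + bce(D(fake.detach()), torch.zeros(n, 1)))
    d_loss.backward()
    opt_d.step()
    # G step: fool D (non-saturating form: maximize log D(G(z))).
    opt_g.zero_grad()
    g_loss = bce(D(fake), torch.ones(n, 1))
    g_loss.backward()
    opt_g.step()
    return float(d_loss), float(g_loss)
```

Iterating `train_step` alternates the two updates of the dynamic game described above; at the idealized equilibrium D outputs 1/2 everywhere and can no longer separate real from generated samples.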
Step four: smoothly insert the pit defect images generated by the DCGAN into road images through Poisson fusion to generate new road defect images.
Specifically, as shown in FIG. 5, a new pit defect image generated by the DCGAN is smoothly inserted into a road image by Poisson fusion. During insertion, the boundary of the foreground pit defect image should match the boundary of the insertion region of the background road image, and the gradients of foreground and background should agree as closely as possible, i.e.

$$\min_f \iint_\Omega \lVert \nabla f - \mathbf{v} \rVert^2 \,\mathrm{d}\Omega$$

The pit defect image cannot simply be pasted onto the background road image, or a visible protruding edge would appear; there must be no seam between the road pit defect image and the road background, so the pixel values at the boundary must remain consistent:

$$f\big|_{\partial\Omega} = f^{*}\big|_{\partial\Omega}$$

To perform Poisson fusion well, the pit defect is slid over the lower 1/2 of the road image and a suitable position is selected manually; to further improve the fusion effect, the brightness and contrast of the road image are adjusted manually if necessary. In the formulas above, Ω is the region covered by the foreground in the fused target image, ∂Ω is its boundary, v is the reference gradient field of the foreground fusion region, f is the pixel-value function of the fused image inside Ω, and f* is the pixel-value function of the fused image outside Ω.
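Minimizing the gradient mismatch under the boundary condition leads to the Poisson equation Δf = div v on Ω with f = f* on ∂Ω. A minimal numpy sketch on a grayscale patch follows, using Jacobi iteration; taking the guidance field v to be the foreground gradient (so div v is the foreground Laplacian), the solver and mask handling are illustrative assumptions — in practice a library routine such as OpenCV's seamless cloning implements the same idea:

```python
import numpy as np

def poisson_fuse(background, foreground, mask, iters=2000):
    """Solve  laplacian(f) = laplacian(g)  on the masked region Omega,
    with f equal to the background on and outside the boundary.

    background, foreground: 2-D float arrays of equal shape (grayscale);
    mask: boolean array, True inside Omega."""
    f = background.astype(float).copy()
    g = foreground.astype(float)
    # Divergence of the guidance field v = grad(g) is the Laplacian of g.
    div_v = np.zeros_like(g)
    div_v[1:-1, 1:-1] = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2]
                         + g[1:-1, 2:] - 4.0 * g[1:-1, 1:-1])
    inner = mask.copy()
    inner[0, :] = inner[-1, :] = inner[:, 0] = inner[:, -1] = False
    for _ in range(iters):   # Jacobi iteration on the discrete Poisson equation
        f_new = f.copy()
        f_new[1:-1, 1:-1] = 0.25 * (f[:-2, 1:-1] + f[2:, 1:-1] + f[1:-1, :-2]
                                    + f[1:-1, 2:] - div_v[1:-1, 1:-1])
        f[inner] = f_new[inner]
    return f
```

Because only pixels inside Ω are ever updated, the background (and hence the boundary values f*) is preserved exactly, which is precisely the seam-free condition stated above; a color image would be fused channel by channel.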
The invention also provides a data enhancement apparatus based on a deep convolutional adversarial network and Poisson fusion, as shown in FIG. 7, comprising:
the acquisition module is used for acquiring urban road images and acquiring a road damage data set;
the processing module is used for cutting out pit defect areas in the road damage data set and carrying out normalization;
a discrimination module for feeding the normalized pit defect image into a deep convolutional generative adversarial network (DCGAN), which receives uniformly distributed random noise to generate a synthetic pit defect image, and for discriminating the pit defect image;
and the generating module is used for smoothly inserting the pit defect image generated by the DCGAN into the road image through Poisson fusion to generate a new road defect image.
Compared with the prior art, the invention has the following beneficial effects:
the apparatus acquires urban road images and builds a road damage data set; crops the pit defect regions in the data set and normalizes them; feeds the normalized pit defect images into a deep convolutional generative adversarial network (DCGAN), which receives uniformly distributed random noise to generate synthetic pit defect images, and discriminates them; and smoothly inserts the pit defect images generated by the DCGAN into road images through Poisson fusion to generate new road defect images. The set of road pit defect images can thus be augmented, meeting the training-sample requirements of road pit defect image recognition.
Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that various changes, modifications and substitutions can be made without departing from the spirit and scope of the invention as defined by the appended claims. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A data enhancement method based on a deep convolutional adversarial network and Poisson fusion, characterized by comprising the following steps:
acquiring urban road images and building a road damage data set;
cropping the pit defect regions in the road damage data set and normalizing them;
feeding the normalized pit defect image into a deep convolutional generative adversarial network (DCGAN), which receives uniformly distributed random noise to generate a synthetic pit defect image, and discriminating the pit defect image;
and smoothly inserting the pit defect image generated by the DCGAN into a road image through Poisson fusion to generate a new road defect image.
2. The data enhancement method based on a deep convolutional adversarial network and Poisson fusion according to claim 1, characterized in that: an image acquisition device is mounted on a vehicle, and its image acquisition angle is adjusted so that the shooting range covers more than the width of the road.
3. The data enhancement method based on a deep convolutional adversarial network and Poisson fusion according to claim 2, characterized in that the road damage data set is acquired as follows:
step 1: from the collected road damage footage, screen out 1920 × 1080 pixel images at 10 fps as an initial data set;
step 2: manually remove blurred and cluttered road damage images from the initial data set;
and step 3: label the lower 2/3 of each road damage image remaining after step 2 with the LabelImg image annotation tool to prepare a VOC data set, and define the prepared VOC data set as the road damage data set.
4. The data enhancement method based on deep convolution countermeasure network and Poisson fusion as claimed in claim 3, wherein: a rectangular frame surrounding the road-surface pit defect is cropped from the road damage data set to generate a pit defect image, and the resolution of the pit defect image is normalized to 96 × 96, wherein a pit is defined as a road surface depression or a spalled area of the road surface.
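The cropping-and-normalization step of claim 4 amounts to cutting the annotated rectangle out of the frame and resampling it to a fixed 96 × 96 resolution. A minimal NumPy sketch follows; the patent does not specify the interpolation method, so the nearest-neighbour resampling and the function name are assumptions for illustration.

```python
import numpy as np

def crop_and_normalize(image, box, size=96):
    """Crop the rectangle `box` = (top, left, bottom, right) from `image`
    and resample it to `size` x `size` with nearest-neighbour interpolation."""
    top, left, bottom, right = box
    patch = image[top:bottom, left:right]
    h, w = patch.shape[:2]
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    return patch[rows][:, cols]
```

In a real pipeline a smoother interpolation (bilinear or bicubic, e.g. via `cv2.resize` or Pillow) would usually be preferred; the fixed 96 × 96 output matches the DCGAN input resolution of the later claims.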
5. The data enhancement method based on deep convolution countermeasure network and Poisson fusion as claimed in claim 4, wherein: a new pit defect is generated by the deep convolutional generative adversarial network (DCGAN), the DCGAN comprising a generation model and a discrimination model; the generation model receives uniformly distributed random noise and, by learning the probability distribution of the real data samples, generates samples that obey that distribution; the discrimination model is trained as a binary-classification neural network responsible for discriminating whether a sample comes from a real example of the road damage data set or is a forged sample, and its output represents the probability that the sample is real.
6. The data enhancement method based on deep convolution countermeasure network and Poisson fusion as claimed in claim 5, wherein: the generation model receives uniformly distributed 100-dimensional random noise, which a fully convolutional network projects and reshapes into 1024 feature maps of size 4 × 4; upsampling is then performed through four transposed-convolution operations with 5 × 5 kernels, a stride of 2 × 2, and 512, 256, 128 and 3 kernels in sequence; a Tanh activation function is used in the output layer and ReLU activation functions are used in the other layers, wherein the Tanh function takes values in [-1, 1], has a mean of 0, and is symmetric about the origin.
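The upsampling path of claim 6 can be traced with the standard transposed-convolution size formula, out = (in − 1)·stride − 2·pad + kernel + output_padding. The claim does not state the padding, so the common DCGAN choice of padding 2 and output padding 1 for a 5 × 5, stride-2 kernel is an assumption here; under that choice each layer exactly doubles the spatial size, taking the 4 × 4 maps to 64 × 64 after four layers (reaching the 96 × 96 crop size of claim 4 would instead require, e.g., a 6 × 6 starting map under the same doublings).

```python
def deconv_out(size, kernel=5, stride=2, pad=2, out_pad=1):
    """Output size of a square 2-D transposed convolution."""
    return (size - 1) * stride - 2 * pad + kernel + out_pad

size = 4                  # 1024 feature maps of 4x4, per claim 6
sizes = [size]
for _ in range(4):        # four layers: 512, 256, 128, 3 kernels
    size = deconv_out(size)
    sizes.append(size)
print(sizes)              # [4, 8, 16, 32, 64]
```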
7. The data enhancement method based on deep convolution countermeasure network and Poisson fusion as claimed in claim 6, wherein: the discrimination model is a convolutional network; fully-connected layers, which slow convergence, are avoided after the convolutional layers, and dropout is used to provide noisy inputs to the convolutional layers.
8. The data enhancement method based on deep convolution countermeasure network and Poisson fusion as claimed in claim 7, wherein: the discrimination model receives a generated sample image of 96 × 96 × 3; after convolutions with 5 × 5 kernels, a stride of 2 × 2, and 64, 128, 256 and 512 kernels in sequence, with LeakyReLU activation functions in the intermediate layers, the discrimination probability of the sample is output through a 1 × 1 convolution and a sigmoid activation function, thereby discriminating the authenticity of new pit defect samples generated by the generation model.
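The downsampling path of claim 8 follows the usual convolution size formula, out = ⌊(in + 2·pad − kernel)/stride⌋ + 1. Assuming a padding of 2 for the 5 × 5, stride-2 kernels (the claim does not state the padding), each layer halves the spatial size, taking the 96 × 96 input down to 6 × 6 before the final 1 × 1 convolution and sigmoid:

```python
def conv_out(size, kernel=5, stride=2, pad=2):
    """Output size of a square 2-D strided convolution."""
    return (size + 2 * pad - kernel) // stride + 1

size = 96                 # 96x96x3 input, per claim 8
sizes = [size]
for _ in range(4):        # four layers: 64, 128, 256, 512 kernels
    size = conv_out(size)
    sizes.append(size)
print(sizes)              # [96, 48, 24, 12, 6]
```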
9. The data enhancement method based on deep convolutional countermeasure network and Poisson fusion as claimed in claim 8, wherein: in the training process, the generation model generates samples similar to the original data, and the discrimination model distinguishes the samples generated by the generation model from real samples; the training process is expressed as:
min_G max_D V(D, G) = E_{x~p_data(x)}[log_a D(x)] + E_{z~p_z(z)}[log_a(1 - D(G(z)))]
wherein D is the discrimination function, G is the generation function, E(·) denotes the expectation, x is a real data sample, p_data(x) is the probability distribution of the real data, a is an arbitrary logarithm base greater than 0 and not equal to 1, D(x) is the probability that a real data sample is judged real after being input, z is random noise obeying a Gaussian or uniform distribution, p_z(z) is the probability distribution of the initial noise data, G(z) is a sample produced by the generation model, and D(G(z)) is the probability that a generated sample is judged real after being input.
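The value function above can be evaluated numerically. A small NumPy sketch, using the natural logarithm (i.e. a = e, an assumption — the claim allows any base): an undiscriminating D that outputs 0.5 for both real and generated samples gives V = ln 0.5 + ln 0.5 = −2 ln 2 ≈ −1.386, the well-known equilibrium value of the GAN game.

```python
import numpy as np

def value_fn(d_real, d_fake):
    """V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], natural log."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# An undiscriminating D outputs 0.5 for real and generated samples alike:
v = value_fn(np.full(4, 0.5), np.full(4, 0.5))
print(round(v, 3))   # -1.386
```

During training, D ascends V (pushing D(x) toward 1 and D(G(z)) toward 0, which raises V above this equilibrium), while G descends it.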
10. A data enhancement device based on a deep convolution countermeasure network and Poisson fusion, characterized in that it comprises: an acquisition module for acquiring urban road images and obtaining a road damage data set;
a processing module for cropping pit defect regions from the road damage data set and normalizing them;
a discrimination module for taking the normalized pit defect image as the input of a deep convolutional generative adversarial network (DCGAN), receiving uniformly distributed random noise to generate a fake pit defect image, and discriminating the pit defect image;
and a generation module for smoothly inserting the pit defect image generated by the DCGAN into the road image through Poisson fusion to generate a new road defect image.
CN202111183233.6A 2021-10-11 2021-10-11 Data enhancement method and device based on deep convolution countermeasure network and Poisson fusion Pending CN114119386A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111183233.6A CN114119386A (en) 2021-10-11 2021-10-11 Data enhancement method and device based on deep convolution countermeasure network and Poisson fusion


Publications (1)

Publication Number Publication Date
CN114119386A (en) 2022-03-01

Family

ID=80441737

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111183233.6A Pending CN114119386A (en) 2021-10-11 2021-10-11 Data enhancement method and device based on deep convolution countermeasure network and Poisson fusion

Country Status (1)

Country Link
CN (1) CN114119386A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114998749A (en) * 2022-07-28 2022-09-02 北京卫星信息工程研究所 SAR data amplification method for target detection
CN117952983A (en) * 2024-03-27 2024-04-30 中电科大数据研究院有限公司 Intelligent manufacturing production process monitoring method and system based on artificial intelligence


Similar Documents

Publication Publication Date Title
CN111476756B (en) Method for identifying casting DR image loosening defect based on improved YOLOv network model
CN108776772B (en) Cross-time building change detection modeling method, detection device, method and storage medium
CN110070008B (en) Bridge disease identification method adopting unmanned aerial vehicle image
CN114119386A (en) Data enhancement method and device based on deep convolution countermeasure network and Poisson fusion
CN108009518A (en) A kind of stratification traffic mark recognition methods based on quick two points of convolutional neural networks
CN109886200B (en) Unmanned lane line detection method based on generative confrontation network
CN112183203A (en) Real-time traffic sign detection method based on multi-scale pixel feature fusion
CN109840483B (en) Landslide crack detection and identification method and device
CN115546768B (en) Pavement marking identification method and system based on multi-scale mechanism and attention mechanism
CN108960055A (en) A kind of method for detecting lane lines based on local line's stage mode feature
CN111008647B (en) Sample extraction and image classification method based on void convolution and residual linkage
CN112417931B (en) Method for detecting and classifying water surface objects based on visual saliency
CN107832762A (en) A kind of License Plate based on multi-feature fusion and recognition methods
CN109886086B (en) Pedestrian detection method based on HOG (histogram of oriented gradient) features and linear SVM (support vector machine) cascade classifier
CN106780727B (en) Vehicle head detection model reconstruction method and device
CN111080621B (en) Method for identifying railway wagon floor damage fault image
CN105868734A (en) Power transmission line large-scale construction vehicle recognition method based on BOW image representation model
CN112528934A (en) Improved YOLOv3 traffic sign detection method based on multi-scale feature layer
CN110852358A (en) Vehicle type distinguishing method based on deep learning
CN106845458A (en) A kind of rapid transit label detection method of the learning machine that transfinited based on core
CN113592893A (en) Image foreground segmentation method combining determined main body and refined edge
CN112785610B (en) Lane line semantic segmentation method integrating low-level features
CN113378642A (en) Method for detecting illegal occupation buildings in rural areas
CN111832463A (en) Deep learning-based traffic sign detection method
CN116823756A (en) Pile leg weld defect detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination