CN114118362A - Method for detecting structural surface crack under small sample based on generating type countermeasure network - Google Patents

Method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network

Info

Publication number
CN114118362A
Authority
CN
China
Prior art keywords
image
layer
training
generator
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111298865.7A
Other languages
Chinese (zh)
Inventor
刘超 (Liu Chao)
许博强 (Xu Boqiang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University filed Critical Tongji University
Priority to CN202111298865.7A priority Critical patent/CN114118362A/en
Publication of CN114118362A publication Critical patent/CN114118362A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G06F18/24137 Distances to cluster centroids
    • G06F18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The invention relates to a method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network, which comprises the following steps: 1) acquiring images of the structure surface, preprocessing them and constructing an image data set; 2) constructing an image generation model, training it on the acquired image data set, and thereby expanding the image data set; 3) constructing a detection model based on VggNet for judging whether the structure surface has cracks, and completing crack detection of the structure surface after training. Compared with the prior art, the method has the advantages of being applicable to small samples and of detecting accurately and rapidly.

Description

Method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network
Technical Field
The invention relates to the field of structural surface crack detection, in particular to a method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network.
Background
In recent years, with the large-scale construction and use of roads, attention has turned to road maintenance and management; detecting and repairing damage at an early stage can greatly reduce cost. With the rapid development of deep learning, acquired structural surface images can be classified and detected rapidly and accurately using convolutional neural networks. For structural surface detection based on convolutional neural networks, when samples are scarce, how to obtain a large amount of high-quality training data is a key problem to be solved urgently.
A literature search of the prior art shows that research on structural surface crack detection mostly focuses on improving algorithm performance under the condition of a sufficiently large data set, so as to detect pavement cracks accurately.
In the Chinese patent 'A complex pavement crack recognition method based on R-CNN' (application No. 202110732505.7), Wangmai et al. pre-train Mask R-CNN on the public ImageNet data set, then construct a pavement crack recognition model, train it on labeled pavement picture data with a deep learning algorithm, detect the cracks in the pictures and the positions where they appear, and finally upload the recognized crack image files to a server over the network and store the image paths. The method improves detection precision through a feature pyramid network, further improving the crack recognition effect.
In the Chinese patent 'A concrete pavement crack detection method with an improved PoolNet network structure' (application No. 202110432301.1), Qu et al. disclose a concrete pavement crack detection method based on an improved PoolNet network structure.
However, the existing research above does not consider the situation where the data set is too small to train the model; in fact, if the training data set has few samples, the accuracy of the trained model drops greatly when it is used to detect pavement cracks.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network.
The purpose of the invention can be realized by the following technical scheme:
a method for detecting surface cracks of a small-sample lower structure based on a generative antagonistic network comprises the following steps:
1) acquiring an image of the surface of the structure, preprocessing the image and constructing an image data set;
2) constructing an image generation model to train the acquired image data set, and realizing the expansion of the image data set;
3) and constructing a detection model for judging whether the structure surface has cracks or not based on VggNet, and finishing the crack detection of the structure surface after training.
The step 1) is specifically as follows:
the method comprises the steps of shooting a structure surface by adopting an unmanned aerial vehicle, a camera or a smart phone, finishing acquisition of an original image data set, and dividing an acquired image of the structure surface into a plurality of images with square sizes to form an image data set.
In step 2), a deep convolutional generative adversarial network (DCGAN) is adopted as the image generation model and trained on the image data set, so that arbitrarily many structural surface crack images can be generated, realizing the expansion of the image data set.
The generator of the deep convolutional generative adversarial network DCGAN sequentially comprises:
an input layer: for inputting the random noise array;
a reshaping layer: a fully connected layer that converts the random numbers and rearranges them into an array;
three successive combinations of a micro-step convolution layer, a normalization layer and a Relu layer, followed by a final micro-step convolution layer with a hyperbolic tangent activation function: for converting the array output by the reshaping layer into a 3-channel array, i.e., an RGB three-channel color image.
The input of the generator is a 1 × 1 × 100 noise array, which becomes a 24 × 24 array of 512 channels after the reshaping layer; thereafter, each micro-step convolution layer reduces the number of channels of the array and increases its size. The normalization layer normalizes each group of data to mean 0 and variance 1, and the Relu layer is used to increase the sparsity of the neural network, adopting the Relu function

f(x) = max(0, x)

The normalization layer, the Relu layer and the tanh layer do not change the size of the array; the final output is a 3-channel 224 × 224 array, namely the generated color image.
The discriminator of the deep convolutional generative adversarial network DCGAN is a convolutional neural network; its input is the 224 × 224 color image produced by the generator, and after 5 successive convolution layers it produces a single 1 × 1 number, the output of the discriminator.
Among the 5 convolutional layers, the first 4 have 5 × 5 filters with 32, 64, 128 and 256 channels respectively, stride 2 and 'Same' pixel padding; the last convolutional layer is a single-channel 14 × 14 × 256 filter. To prevent gradient explosion or overfitting, several normalization layers and Relu layers are added to the discriminator, whose Relu layers use the leaky Relu function

f(x) = x (x ≥ 0); f(x) = 0.2x (x < 0)
The loss function of the deep convolutional generative adversarial network DCGAN is:

min_G max_D V(D, G) = E_{x~p_data}[log D(x)] + E_{z~p_z}[log(1 − D(G(z)))]

where D(x) is the probability, computed by the discriminator D, that the input image is a real photographed image (1 means judged real, 0 means judged fake; the higher the probability, the more likely the image is real), z is the input random noise, G(z) is the generated sample image, E_{x~p_data}[·] is the mathematical expectation over the real data set, and E_{z~p_z}[·] is the mathematical expectation over the generated data set;
the training goal of the deep convolution generation of the countermeasure network DCGAN is to make D (x) and 1-D (G (z)) as large as possible with the generator G fixed and as small as possible with the arbiter D fixed, optimizing the arbiter D with the generator G fixed, in the form of a binary cross-entropy function, resulting in a loss
Figure BDA0003337478570000036
Optimize generator G with discriminator D fixed, get loss
Figure BDA0003337478570000037
Optimization taking comprehensive consideration of the discriminator D and the generator G, i.e. with a loss function of
Figure BDA0003337478570000038
During training, the discriminator D is optimized to maximize loss_D, and the generator G is then optimized to minimize loss_G. This process is cycled continuously until the discriminator D and the generator G reach Nash equilibrium, training a generator that can synthesize images very similar to real ones.
The hyper-parameters of the deep convolutional generative adversarial network DCGAN include the dropout layer discarding probability dropoutProb, the gradient leakyReluScale of the leaky ReLu function when the input is negative, the training set number per iteration miniBatchSize, the iteration number epoch, and, in the SGDM optimization, the learning rate learningRate, the gradient decay factor gradientDecayFactor and the squared gradient decay factor squaredGradientDecayFactor.
The step 3) specifically comprises the following steps:
31) Construction of the detection model: the crack types are classified by training a convolutional neural network (CNN); the synthetic image data generated by the generator are used as input, and training is performed by transfer learning, reusing part of the pre-trained Vgg16 network's parameters for the convolutional neural network;
32) evaluation of the detection model:
respectively training the data set before the expansion and the data set after the expansion to obtain two detection models, inputting the pictures in the test set into the detection models of the two convolutional neural networks for prediction, and if the detection accuracy after the expansion is higher than that before the expansion, considering that the generated model is meaningful.
33) Application of the detection model:
The actually captured pictures are processed and input into the detection model trained on the expanded data set to obtain the detection result, which is one of: no crack, linear crack, or reticular crack.
Compared with the prior art, the invention has the following advantages:
the method constructs a generation type countermeasure network based on DCGAN, and enables the generator to generate vivid images when the model is stable by continuously improving the game performance of the generator and the discriminator, then the synthesized images are used for expanding a crack data set, and a convolutional neural network for crack type identification is constructed based on Vgg16 in a transfer learning mode, so that the method can expand data under the condition of few samples, and can adapt to the requirement of the convolutional neural network model on the data quantity.
Drawings
Fig. 1 is a schematic diagram of image segmentation.
Fig. 2 is a schematic diagram of the layers of a generator network.
FIG. 3 is a diagram of the layers of a network of arbiters.
Fig. 4 is a neural network structure of the migration learning based on the Vgg16 network.
FIG. 5 is a flow diagram of the present invention.
FIG. 6 shows the average probabilities calculated by the discriminator during training, where FIG. 6a shows averProb_epoch_true at each learning rate and FIG. 6b shows averProb_epoch_fake at each learning rate.
Fig. 7 shows the model trained to around generation 1500, where fig. 7a is the averProb curve and fig. 7b is the loss curve.
Fig. 8 is an image generated by the generation model.
FIG. 9 is a partial composite image generated with a generator.
FIG. 10 is a loss curve and accuracy curve for training before and after data expansion, where FIG. 10a is the loss curve before expansion, FIG. 10b is the accuracy curve before expansion, FIG. 10c is the loss curve after expansion, and FIG. 10d is the accuracy curve after expansion.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments.
The invention provides a method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network, which comprises the following steps:
1) collecting data and preprocessing;
2) constructing an image generation model (DCGAN), training the acquired image data set, and realizing the expansion of the image data set;
3) and constructing a CNN model for judging whether the surface of the structure has cracks or not based on VggNet, and comparing the effects of the data set before and after expansion.
The details of the above steps are as follows:
step 1) the data set production and pretreatment specifically comprises the following steps:
11) raw data acquisition
And shooting the surface of the structure by using an unmanned aerial vehicle, a camera or a smart phone to finish the acquisition of the original data set.
12) Image pre-processing
Since convolutional neural networks generally process square images, each image is divided into several square images, as shown in fig. 1.
Since conventional convolutional neural networks (Vgg16/19, ResNet101) take input images of resolution 224 × 224, an excessive size would increase the size and depth of the network geometrically and greatly increase the computational load. Therefore, considering both computational efficiency and image quality, the divided square images are compressed to a resolution of 224 × 224, finally forming the image data set. Of these images, 80% are used as the training set and 20% as the test set for training the neural network.
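The two preprocessing operations (splitting a photo into square tiles, then compressing each tile to 224 × 224) can be sketched as follows. This is an illustrative Python/NumPy sketch and not part of the patent: the nearest-neighbour downsampling is only a stand-in for the unspecified compression method, and the 5472 × 3648 photo size and 912 tile size are taken from the embodiment described later.

```python
import numpy as np

def split_into_tiles(img, tile):
    """Split an H x W x C image array into non-overlapping square tiles,
    dropping any border remainder that does not fill a whole tile."""
    h, w = img.shape[:2]
    return [img[r:r + tile, c:c + tile]
            for r in range(0, h - tile + 1, tile)
            for c in range(0, w - tile + 1, tile)]

def downsample(tile_img, out_size):
    """Crude nearest-neighbour resize to out_size x out_size (a stand-in
    for whatever compression the authors actually used)."""
    h, w = tile_img.shape[:2]
    rows = np.arange(out_size) * h // out_size
    cols = np.arange(out_size) * w // out_size
    return tile_img[np.ix_(rows, cols)]

# Numbers from the embodiment: 5472 x 3648 drone photos, 912 x 912 tiles
photo = np.zeros((3648, 5472, 3), dtype=np.uint8)
tiles = split_into_tiles(photo, 912)
print(len(tiles))            # 24 square sub-images (4 rows x 6 columns)
small = downsample(tiles[0], 224)
print(small.shape)           # (224, 224, 3)
```

In practice an anti-aliasing resize would be preferable to nearest-neighbour sampling, but the tiling logic is the same.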
Step 2) the construction and training of the image generation model specifically comprises the following steps:
21) selection of image generation model
The method is based on DCGAN: the crack images obtained in step 1) are used for training, finally yielding a generative model that produces arbitrarily many crack images.
The GAN network used to generate images consists of two parts: a discriminator and a generator. The discriminator, similar to the neural network used in image classification, takes image information as input and outputs a judgment of whether the image belongs to the given class. The generator takes several random numbers as input and outputs a fake image computed by its neural network. During training, a random noise vector z of a given dimension is first generated and input into the generator G, which converts it into an image G(z). G(z) is then input into the discriminator D, which outputs the probability that the image is real: a probability of 1 means the discriminator judges the image real, and a probability of 0 means it judges the image fake.
In each round of training, the generator G aims to generate pictures that look as real as possible in order to deceive the discriminator D, while D aims to distinguish the pictures generated by G from real pictures as well as possible. After each round, the parameters of D and G are corrected by the back-propagation algorithm, improving both G's ability to generate realistic fakes and D's ability to judge images; after enough rounds of training, the generator can produce images similar to the training set.
The original GAN presents some problems in image generation: the generator is not very stable, and its generating ability easily falls out of step with the discriminator's judging ability. The deep convolutional generative adversarial network (DCGAN) effectively alleviates this problem, so this patent constructs a DCGAN-based network for generating structural surface cracks.
22) Construction of generators
The input of the generator is 1 × 1 × 100 random noise. The input layer is followed by a reshaping layer (project-and-reshape layer), a fully connected layer that converts the 100 random numbers into 294912 numbers and rearranges them into a 24 × 24 array of 512 channels.
The generator then converts this 512-channel 24 × 24 array into a 64-channel 112 × 112 array through three successive combinations of a micro-step convolution layer, a normalization layer and a Relu layer, and then, through a final micro-step convolution layer and a hyperbolic tangent activation function

tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))

converts it into a 3-channel 224 × 224 array, namely an RGB three-channel image.
The micro-step convolution expands a smaller array into a larger one; there are four micro-step convolution layers in total, denoted transConv1, transConv2, transConv3 and transConv4. All four have 5 × 5 convolution kernels whose depth equals the number of input channels from the previous layer (512, 256, 128 and 64), with 256, 128, 64 and 3 filters respectively. transConv2, transConv3 and transConv4 all use stride 2 with output cropping mode 'Same' (the stride alone determines the output size), while transConv1 uses stride 1 with the default cropping (no cropping).
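The layer sizes implied by this arrangement can be checked with a short calculation. The sketch below is illustrative and not from the patent; it assumes MATLAB-style transposed-convolution arithmetic, where 'Same' cropping gives an output of stride × input and no cropping gives stride × (input − 1) + kernel.

```python
def transconv_out(in_size, kernel, stride, same):
    """Spatial output size of a transposed ("micro-step") convolution.
    'Same' cropping: stride * in_size; no cropping: stride*(in_size-1)+kernel."""
    return stride * in_size if same else stride * (in_size - 1) + kernel

# Generator trace starting from the 24 x 24 x 512 reshaped array
size = 24
size = transconv_out(size, 5, 1, same=False)  # transConv1: 24 -> 28
for _ in range(3):                            # transConv2..4: stride 2, 'Same'
    size = transconv_out(size, 5, 2, same=True)
print(size)  # 224, the side length of the generated RGB image
```

The intermediate sizes 28, 56 and 112 are consistent with the 64-channel 112 × 112 array the text says is produced by the first three combinations.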
A normalization layer and a Relu layer follow each of transConv1, transConv2 and transConv3. The normalization layer counteracts the shifting distribution of the intermediate-layer data, preventing gradient vanishing or explosion and accelerating training; it uses

x̂ = (x − μ) / σ

to normalize each group of data to mean 0 and variance 1. The Relu layer increases the sparsity of the neural network to some extent, alleviating the overfitting problem and greatly reducing gradient vanishing, while its small computational cost improves training efficiency. In the generator network, all Relu layers use the ordinary Relu function

f(x) = max(0, x)
The specific structure of the generator network is shown in table 1.
Table 1 Detailed structure of the generator network

Layer | Kernel | Stride | Output size
input (noise) | - | - | 1 × 1 × 100
project and reshape | - | - | 24 × 24 × 512
transConv1 + norm + Relu | 5 × 5 × 512, 256 filters | 1 | 28 × 28 × 256
transConv2 + norm + Relu | 5 × 5 × 256, 128 filters | 2 | 56 × 56 × 128
transConv3 + norm + Relu | 5 × 5 × 128, 64 filters | 2 | 112 × 112 × 64
transConv4 + tanh | 5 × 5 × 64, 3 filters | 2 | 224 × 224 × 3
The input of the generator is a 1 × 1 × 100 noise array, which is a 24 × 24 array of 512 channels after passing through the reshaping layer, and then the number of channels of the array is reduced and the size is increased every time passing through a micro-step convolution layer, while the normalization layer, the Relu layer and the tanh layer do not change the size of the array. The final output becomes a 3-channel 224 x 224 array, i.e., the generated color image, with the inputs and outputs of the various layers of the generator network shown in fig. 2.
23) Structure of discriminator
The discriminator is a convolutional neural network. Its input is a 224 × 224 color image, which first passes through a Dropout layer to prevent overfitting. The discriminator then convolves the 224 × 224 × 3 array down to a single 1 × 1 × 1 number through 5 convolution layers, denoted Conv1, Conv2, Conv3, Conv4 and Conv5. The first four convolution layers all have 5 × 5 filters, with 32, 64, 128 and 256 channels respectively, stride 2, and pixel padding mode 'Same' (only the stride changes the size). The last convolution layer, Conv5, is a single-channel 14 × 14 × 256 filter that convolves the 14 × 14 × 256 array from the previous layer into a 1 × 1 × 1 number, the output of the discriminator.
To prevent gradient explosion, overfitting and the like, several normalization layers and Relu layers are also added to the discriminator. Normalization is the same as in the generator, using

x̂ = (x − μ) / σ

to convert the input data to mean 0 and variance 1. The Relu layers in the discriminator use the leaky Relu function, whose gradient is 0.2 when the input is negative:

f(x) = x (x ≥ 0); f(x) = 0.2x (x < 0)
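The two activation functions, the ordinary Relu of the generator and the leaky Relu of the discriminator, can be illustrated directly. This is an illustrative sketch, not code from the patent:

```python
def relu(x):
    # Generator activation: f(x) = x for x > 0, otherwise 0
    return [max(0.0, v) for v in x]

def leaky_relu(x, scale=0.2):
    # Discriminator activation: slope `scale` (0.2 here) for negative inputs
    return [v if v >= 0 else scale * v for v in x]

xs = [-2.0, -0.5, 0.0, 1.5]
print(relu(xs))        # [0.0, 0.0, 0.0, 1.5]
print(leaky_relu(xs))  # [-0.4, -0.1, 0.0, 1.5]
```

The non-zero slope on the negative side is what keeps gradients flowing through the discriminator for inputs an ordinary Relu would zero out.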
The specific structure of the discriminator network is shown in table 2.
TABLE 2 Detailed structure of the discriminator network

Layer | Filter | Stride | Output size
input (RGB image) | - | - | 224 × 224 × 3
dropout | - | - | 224 × 224 × 3
Conv1 + norm + leaky Relu | 5 × 5 × 3, 32 channels | 2 | 112 × 112 × 32
Conv2 + norm + leaky Relu | 5 × 5 × 32, 64 channels | 2 | 56 × 56 × 64
Conv3 + norm + leaky Relu | 5 × 5 × 64, 128 channels | 2 | 28 × 28 × 128
Conv4 + norm + leaky Relu | 5 × 5 × 128, 256 channels | 2 | 14 × 14 × 256
Conv5 | 14 × 14 × 256, 1 channel | 1 | 1 × 1 × 1
The input of the discriminator is a 224 × 224 × 3 RGB three-channel image; after each of the convolution layers Conv1, Conv2, Conv3 and Conv4 the array size halves (112 × 112, 56 × 56, 28 × 28, 14 × 14) while the channel count becomes 32, 64, 128 and 256 in turn, and the normalization and Relu layers do not change the array size. The final output is a single 1 × 1 × 1 number. The input and output of each layer of the discriminator network are shown in fig. 3.
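The spatial sizes quoted above follow from standard convolution arithmetic; the following sketch (illustrative, not from the patent) traces them:

```python
import math

def conv_out(in_size, kernel, stride, same_padding):
    """Spatial output size of a convolution layer.
    'Same' padding: ceil(in/stride); 'valid': floor((in - kernel)/stride) + 1."""
    if same_padding:
        return math.ceil(in_size / stride)
    return (in_size - kernel) // stride + 1

size = 224
for _ in range(4):                    # Conv1..Conv4: 5x5, stride 2, 'Same'
    size = conv_out(size, 5, 2, True)
print(size)                           # 14
size = conv_out(size, 14, 1, False)   # Conv5: a 14x14 'valid' filter
print(size)                           # 1, the discriminator's scalar output
```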
24) Construction of loss function
The loss function adopts the one commonly used in GAN networks:

min_G max_D V(D, G) = E_{x~p_data}[log D(x)] + E_{z~p_z}[log(1 − D(G(z)))]

where D(x) is the probability, computed by the discriminator D, that the input image is a real photographed image; 1 means the image is judged "real", 0 means "fake", and the higher the probability, the more likely the image is real. z is the input random noise and G(z) is the generated sample. The goal of training the discriminator D is therefore to make D(x_true) tend to 1 and D(G(z)) tend to 0 with G fixed, whereas the generator G, conversely, wants to make D(G(z)) approach 1 as closely as possible with D fixed. In summary, the goal of GAN training is to make D(x) and 1 − D(G(z)) as large as possible with G fixed and as small as possible with D fixed. Optimizing D with G fixed, using the form of a binary cross-entropy function, gives

loss_D = E_{x~p_data}[log D(x)] + E_{z~p_z}[log(1 − D(G(z)))]

Optimizing G with D fixed gives

loss_G = E_{z~p_z}[log(1 − D(G(z)))]

Considering the optimization of D and G together, the loss function is

V(D, G) = E_{x~p_data}[log D(x)] + E_{z~p_z}[log(1 − D(G(z)))]
In the training process, D is optimized to maximize loss_D, then G is optimized to minimize loss_G. This process is cycled continuously so that D and G reach Nash equilibrium, and a generator that can synthesize images very similar to real ones is trained, providing data supplement for further analysis.
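The two losses can be evaluated numerically from batches of discriminator outputs. The sketch below is illustrative and uses made-up probabilities, not the patent's data:

```python
import math

def loss_d(d_real, d_fake):
    # loss_D = E[log D(x)] + E[log(1 - D(G(z)))]; D is trained to maximize this
    return (sum(math.log(p) for p in d_real) / len(d_real)
            + sum(math.log(1.0 - p) for p in d_fake) / len(d_fake))

def loss_g(d_fake):
    # loss_G = E[log(1 - D(G(z)))]; G is trained to minimize this
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)

d_real = [0.9, 0.95]   # probabilities D assigns to real crack images
d_fake = [0.1, 0.05]   # probabilities D assigns to generator output
print(loss_d(d_real, d_fake))  # near 0 from below when D is confident
print(loss_g(d_fake))          # strongly negative: G is fooling no one
```

When G improves, D(G(z)) rises, 1 − D(G(z)) falls, and loss_G decreases toward −∞, which is exactly the direction G's optimizer pushes.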
25) Selection of hyper-parameters
The selection of the GAN network's hyper-parameters is based partly on experience in the computer field regarding hyper-parameter selection, and partly on the results of testing sets of different hyper-parameters.
In the neural network constructed by the invention, the hyper-parameters to be selected are: the dropout layer discarding probability dropoutProb, the gradient leakyReluScale of the leaky ReLu function when the input is negative, the training set number per iteration miniBatchSize, the iteration number epoch, and, in the SGDM optimization, the learning rate learningRate, the gradient decay factor gradientDecayFactor and the squared gradient decay factor squaredGradientDecayFactor. The most important hyper-parameters, the learning rate and the number of iterations, strongly influence the training effect; a common learning-rate range (0.0001-0.001) and a large iteration count (2000 generations) are taken, and the optimal learning rate and iteration count are determined by testing in actual use. Specific values of these hyper-parameters are shown in table 3.
TABLE 3 Selection of hyper-parameters

Hyper-parameter | Value
dropoutProb | 0.5
leakyReluScale | 0.2
miniBatchSize | 8
epoch | ≤ 2000
learningRate | 0.0001-0.001
gradientDecayFactor | 0.5
squaredGradientDecayFactor | 0.999
Step 3) the construction and training of the detection model specifically comprises the following steps:
31) construction of detection model
The cracks of the structural surface are divided into three types according to geometric characteristics: no crack, linear cracks and reticular cracks. The convolutional neural network classifies the crack types through training.
With the generator trained in step 2), a large amount of synthetic image data can be generated. If each of the 100 random numbers input to the generator can take 1000 values, then theoretically the generator can produce 1000^100 = 10^300 different images, which satisfies the convolutional neural network's demand for the number of training samples.
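The count claimed here is a simple power computation, checked below (illustrative only):

```python
# 100 noise inputs, each with 1000 possible states, give 1000**100
# distinct generator inputs; since 1000 = 10**3, this equals 10**300.
n_inputs = 1000 ** 100
print(n_inputs == 10 ** 300)     # True
print(len(str(n_inputs)) - 1)    # 300 zeros after the leading 1
```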
Training is carried out in a transfer-learning mode using the pre-trained Vgg16 network for part of the model: the convolutional part of the neural network adopts Vgg16. Since Vgg16 was trained with 1000 image classes, in the structural surface crack classification model the last two layers containing 1000 outputs (the fully connected layer and the softmax layer) are replaced by a 3-output fully connected layer and softmax layer, while the parameters of the other layers keep the pre-trained Vgg16 parameters. The structure of the convolutional neural network is shown in fig. 4.
32) Evaluation of detection models
The data set before expansion and the data set after expansion are trained separately to obtain two detection models. The pictures in the test set are input into the two convolutional neural network models for prediction; if the detection accuracy after expansion is higher than before expansion, the generative model is considered meaningful.
Examples
The method for detecting the cracks on the surface of the structure comprises the following steps:
step 1, collecting a data set on one road surface of an intersection of two roads in a certain place, and shooting by using an unmanned aerial vehicle, wherein the data of the unmanned aerial vehicle is shown in a table 4:
TABLE 4 UAV parameters

Parameter | Value
UAV mass/g | 1250
Maximum flight time/min | about 30
Effective pixels | about 20,000,000
Photo size | 5472 × 3648
Video resolution | 1920 × 1080
Controllable rotation range/(°) | −90 to +30
The flying height of the unmanned aerial vehicle is 1.5 m, with the camera perpendicular to the ground; the UAV moves along the road direction, 80 pavement images are shot, and the images are first preprocessed. The pictures shot by the UAV are 5472 × 3648 pixels; since convolutional neural networks generally process square images, each image is split into 24 images of size 912 × 912.
The divided images of 912 × 912 resolution are compressed into images of 224 × 224 resolution, finally yielding 1920 images. In fact, not all of the segmented, processed 224 × 224 images contain cracks; after removing pictures of crack-free pavement and pictures containing leaves, manhole covers and the like, 1187 images containing cracks are obtained. Of these, 950 (80%) are used as the training set and 237 (20%) as the test set for training the neural network.
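The data-set bookkeeping in this step can be verified arithmetically (an illustrative check, not part of the patent):

```python
photos = 80
tiles_per_photo = (5472 // 912) * (3648 // 912)  # 6 columns x 4 rows = 24
print(photos * tiles_per_photo)                  # 1920 segmented images

kept = 1187                # images remaining after filtering
train, test = 950, 237
print(train + test == kept)        # True
print(round(train / kept, 3))      # ~0.8, the stated 80/20 split
```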
Step 2, training the DCGAN network by taking training set data as input
Training of the model: the discriminator and generator are constructed and trained using MATLAB as the programming platform, with the deep learning toolbox in MATLAB.
To obtain the optimal learning rate, four values are selected for comparison, namely 0.0001, 0.0002, 0.0005 and 0.001.
FIG. 6 shows the average probabilities of the discriminator at learning rates of 0.0001, 0.0002, 0.0005 and 0.001, i.e., the average probabilities averProb_true and averProb_fake that the discriminator D predicts for the real and synthesized images in each cycle:

averProb_true = (1/m) Σ D(I_data)
averProb_fake = (1/m) Σ D(I_fake)

where m is the amount of data used per cycle, miniBatchSize; I_data are the real images and I_fake the synthesized images (i.e., the outputs G(x_rand) of the generator G for random inputs x_rand).

Similar to the loss function, to smooth the averProb curves, the per-generation averages averProb_epoch_true and averProb_epoch_fake over all iterations of each generation are also calculated.
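The batch-level averProb quantities are simple means of discriminator outputs; the sketch below (with hypothetical values, not the patent's measurements) illustrates the computation:

```python
def aver_prob(d_outputs):
    # averProb = (1/m) * sum(D(I_i)), with m = miniBatchSize
    return sum(d_outputs) / len(d_outputs)

# Hypothetical discriminator outputs for one mini-batch (miniBatchSize = 8)
d_true = [0.90, 0.80, 0.85, 0.95, 0.70, 0.90, 0.88, 0.92]  # D(I_data)
d_fake = [0.10, 0.20, 0.15, 0.05, 0.30, 0.10, 0.12, 0.08]  # D(G(x_rand))
print(aver_prob(d_true))   # batch-level averProb_true
print(aver_prob(d_fake))   # batch-level averProb_fake
```

At Nash equilibrium both curves should hover near 0.5, since the discriminator can no longer separate real images from synthesized ones.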
As can be seen from FIG. 6, at learning rates of 0.0002, 0.0005 and 0.001 the loss functions and prediction probabilities converge at almost the same rate early in training. After reaching steady state, the loss and averProb values of the models at learning rates 0.0005 and 0.001 are almost identical, as are those at 0.0002 and 0.0001, with the latter pair outperforming the former. In summary, the convergence rate and computational efficiency of the model favour a learning rate of no less than 0.0002, while the final convergence behaviour favours a learning rate of no more than 0.0002; balancing the two, a learning rate of 0.0002 is optimal.
To determine the optimal number of training epochs maxEpoch, the loss and average-probability curves at the selected learning rate of 0.0002 were analysed; after training for 1500 epochs, averProb_fake and loss_D begin to increase while averProb_true and loss_G begin to decrease, as shown in FIG. 7.
Beyond 1500 epochs, therefore, the model quality degrades and overfitting occurs, so the generator obtained after 1500 epochs of training is taken as the final result. The resulting images are shown in FIG. 8.
Step 3: a large number of images are then generated with the trained generator and sorted into crack-free, linear-crack and reticular-crack images, expanding the training set to 2000 images per class. These 6000 images form the image-classification training dataset. FIG. 9 shows some of the synthesized images.
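The class-balanced expansion of Step 3 can be sketched as follows. This is a hypothetical Python illustration: `fake_generator_and_labeller` is a random stand-in for generating an image with the trained generator and sorting it into one of the three classes, and the quota of 2000 per class follows the text above.

```python
import random

CLASSES = ("no_crack", "linear_crack", "reticular_crack")

def fake_generator_and_labeller(rng):
    """Hypothetical stand-in: generate one image with the trained
    generator and decide which of the three classes it belongs to."""
    return rng.choice(CLASSES)

def fill_quotas(rng, target=2000):
    """Keep generating until every class holds `target` images."""
    buckets = {c: 0 for c in CLASSES}
    while any(n < target for n in buckets.values()):
        label = fake_generator_and_labeller(rng)
        if buckets[label] < target:
            buckets[label] += 1
    return buckets

buckets = fill_quotas(random.Random(0))  # 3 classes x 2000 = 6000 images in total
```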
The loss and accuracy curves of this training are shown in FIG. 10.
The test-set pictures were input into the two convolutional neural network models for prediction; the results show that the model trained on the dataset before expansion achieves a classification accuracy of 87.23%, while the model trained on the expanded dataset achieves 97.50%. The crack classification results are shown in Table 5.
TABLE 5 crack Classification results
(Table 5 is reproduced as an image in the original publication.)
The results show that training with the expanded data greatly improves accuracy. In addition, the transfer-learning approach allows the model to converge rapidly and greatly improves training efficiency, so expanding the dataset with the generative model is worthwhile.

Claims (10)

1. A method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network, characterized by comprising the following steps:
1) acquiring an image of the surface of the structure, preprocessing the image and constructing an image data set;
2) constructing an image generation model to train the acquired image data set, and realizing the expansion of the image data set;
3) constructing a detection model based on VggNet for judging whether cracks exist on the structure surface, and completing structure surface crack detection after training.
2. The method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network as claimed in claim 1, wherein step 1) specifically comprises:
capturing the structure surface with an unmanned aerial vehicle, a camera or a smartphone to complete acquisition of the original image dataset, and dividing each acquired structure surface image into a plurality of square images to form the image dataset.
3. The method as claimed in claim 1, wherein in step 2) the deep convolutional generative adversarial network DCGAN is used as the image generation model and trained on the image dataset to generate structural surface images with arbitrarily many cracks, thereby expanding the image dataset.
4. The method as claimed in claim 3, wherein the generator of the deep convolutional generative adversarial network DCGAN sequentially comprises:
an input layer: for inputting the random noise array;
a reshape layer: a fully connected layer for transforming the random numbers and rearranging them into an array;
three successive combinations of a fractionally-strided (transposed) convolution layer, a normalization layer and a ReLU layer: for converting the array output by the reshape layer, through the transposed convolution layers and a hyperbolic tangent activation function, into a 3-channel array, i.e. an RGB three-channel color image.
5. The method as claimed in claim 4, wherein the input of the generator is a 1 × 100 noise array, which after the reshape layer becomes a 512-channel 24 × 24 array; after each transposed convolution layer the number of channels of the array decreases and its size increases; the normalization layer normalizes each data array to data with a mean of 0 and a variance of 1; the ReLU layer is used to increase the sparsity of the neural network, using the ReLU function
f(x) = x for x ≥ 0, and f(x) = 0 for x < 0;
the normalization, ReLU and tanh layers do not change the size of the array, and the final output is a 3-channel 224 × 224 array, i.e. the generated color image.
6. The method as claimed in claim 5, wherein the discriminator of the deep convolutional generative adversarial network DCGAN is a convolutional neural network whose input is a 224 × 224 color image produced by the generator; after 5 successive convolution layers, the discriminator outputs a single 1 × 1 number.
7. The method as claimed in claim 6, wherein among the 5 convolution layers the filters of the first 4 are 5 × 5 with 32, 64, 128 and 256 channels respectively, the stride is 2 and the pixel-padding mode is 'Same', and the last convolution layer is a single-channel 14 × 14 × 256 filter; to prevent gradient explosion or overfitting, several normalization layers and ReLU layers are added to the discriminator, and the ReLU layers in the discriminator use the leaky ReLU function
f(x) = x for x ≥ 0, and f(x) = scale · x for x < 0, where scale is the leak coefficient.
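The activation functions of claims 5 and 7, and the feature-map sizes implied by claim 7, can be sketched as follows (a NumPy illustration, not the patent's MATLAB code; the leak coefficient 0.2 is an assumed example value for the leakyReLuScale hyper-parameter). With stride 2 and 'Same' padding, a 224 × 224 input halves at each of the first four convolution layers, reaching the 14 × 14 size named in the claim:

```python
import numpy as np

def relu(x):
    """ReLU of claim 5: f(x) = x for x >= 0, 0 otherwise."""
    return np.maximum(x, 0.0)

def leaky_relu(x, scale=0.2):
    """Leaky ReLU of claim 7; `scale` mirrors the leakyReLuScale
    hyper-parameter of claim 9 (0.2 is an assumed example value)."""
    return np.where(x >= 0, x, scale * x)

def same_conv_out(size, stride=2):
    """Spatial output size of a stride-`stride` convolution with 'Same' padding."""
    return -(-size // stride)  # ceiling division

# 224 -> 112 -> 56 -> 28 -> 14 across the first four convolution layers.
sizes = [224]
for _ in range(4):
    sizes.append(same_conv_out(sizes[-1]))
```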
8. The method as claimed in claim 3, wherein the loss function of the deep convolutional generative adversarial network DCGAN is:
min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 − D(G(z)))]
wherein D(x) denotes the probability, computed by the discriminator D, that the input image is a real photographed image, with 1 denoting a judgment of true and 0 a judgment of false, a higher probability meaning the image is more likely real; z is the input random noise, and G(z) denotes the generated sample image; E_{x~p_data(x)}[·] denotes the mathematical expectation over the real dataset, and E_{z~p_z(z)}[·] denotes the mathematical expectation over the generated dataset;
the training goal of the deep convolution generation of the countermeasure network DCGAN is to make D (x) and 1-D (G (z)) as large as possible with the generator G fixed and as small as possible with the arbiter D fixed, optimizing the arbiter D with the generator G fixed, in the form of a binary cross-entropy function, resulting in a loss
Figure FDA0003337478560000026
Optimize generator G with discriminator D fixed, get loss
Figure FDA0003337478560000027
Optimization taking comprehensive consideration of the discriminator D and the generator G, i.e. with a loss function of
Figure FDA0003337478560000028
during the training process, the discriminator D is first optimized to maximize loss_D, and the generator G is then optimized to minimize loss_G; this process is repeated continuously so that the discriminator D and the generator G reach Nash equilibrium, thereby training a generator capable of synthesizing images very similar to real ones.
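The alternating objective above can be illustrated numerically as follows (a NumPy stand-in for the patent's MATLAB code; `d_real` and `d_fake` are assumed arrays of discriminator probabilities for the real and generated images of one mini-batch):

```python
import numpy as np

def loss_d(d_real, d_fake):
    """E[log D(x)] + E[log(1 - D(G(z)))]; the discriminator maximizes this."""
    d_real, d_fake = np.asarray(d_real), np.asarray(d_fake)
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

def loss_g(d_fake):
    """E[log(1 - D(G(z)))]; the generator minimizes this."""
    d_fake = np.asarray(d_fake)
    return float(np.mean(np.log(1.0 - d_fake)))

# A confident discriminator (real ~ 1, fake ~ 0) drives loss_d toward its
# maximum of 0; a generator that fools it (fake -> 1) drives loss_g down.
```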
9. The method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network as claimed in claim 3, wherein the hyper-parameters of the deep convolutional generative adversarial network DCGAN comprise the dropout-layer discard probability dropout, the negative-input slope leakyReLuScale of the leaky ReLU function, the number of training samples per iteration miniBatchSize, the number of iterations epoch, the learning rate learnRate in the SGDM optimization, the gradient decay factor GradientDecayFactor and the squared-gradient decay factor SquaredGradientDecayFactor.
10. The method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network as claimed in claim 1, wherein step 3) comprises the following steps:
31) constructing the detection model: a convolutional neural network CNN is trained to classify the crack types, taking the synthesized image data generated by the generator as input and training by transfer learning, with part of the convolutional neural network initialized from the weights of a Vgg16 pre-trained network;
32) evaluation of the detection model:
training on the dataset before expansion and on the expanded dataset respectively to obtain two detection models, and inputting the test-set pictures into the two convolutional neural network detection models for prediction; if the detection accuracy after expansion is higher than that before expansion, the generative model is considered worthwhile.
33) Application of the detection model:
processing the actually captured picture and inputting it into the detection model trained on the expanded dataset to obtain the detection result, which is one of: no crack, linear crack, or reticular crack.
CN202111298865.7A 2021-11-04 2021-11-04 Method for detecting structural surface crack under small sample based on generating type countermeasure network Pending CN114118362A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111298865.7A CN114118362A (en) 2021-11-04 2021-11-04 Method for detecting structural surface crack under small sample based on generating type countermeasure network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111298865.7A CN114118362A (en) 2021-11-04 2021-11-04 Method for detecting structural surface crack under small sample based on generating type countermeasure network

Publications (1)

Publication Number Publication Date
CN114118362A true CN114118362A (en) 2022-03-01

Family

ID=80380591

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111298865.7A Pending CN114118362A (en) 2021-11-04 2021-11-04 Method for detecting structural surface crack under small sample based on generating type countermeasure network

Country Status (1)

Country Link
CN (1) CN114118362A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115689924A (en) * 2022-10-28 2023-02-03 浙江大学 Data enhancement method and device for concrete structure ultrasonic tomography image
CN117436350A (en) * 2023-12-18 2024-01-23 中国石油大学(华东) Fracturing horizontal well pressure prediction method based on deep convolution generation countermeasure network
CN117436350B (en) * 2023-12-18 2024-03-08 中国石油大学(华东) Fracturing horizontal well pressure prediction method based on deep convolution generation countermeasure network

Similar Documents

Publication Publication Date Title
CN110334765B (en) Remote sensing image classification method based on attention mechanism multi-scale deep learning
CN109614979B (en) Data augmentation method and image classification method based on selection and generation
CN106599797B (en) A kind of infrared face recognition method based on local parallel neural network
CN109063724B (en) Enhanced generation type countermeasure network and target sample identification method
CN110348399B (en) Hyperspectral intelligent classification method based on prototype learning mechanism and multidimensional residual error network
CN111191660B (en) Colon cancer pathology image classification method based on multi-channel collaborative capsule network
CN106228185B (en) A kind of general image classifying and identifying system neural network based and method
CN112990097B (en) Face expression recognition method based on countermeasure elimination
CN113705526B (en) Hyperspectral remote sensing image classification method
WO2018052586A1 (en) Method and system for multi-scale cell image segmentation using multiple parallel convolutional neural networks
CN108009222B (en) Three-dimensional model retrieval method based on better view and deep convolutional neural network
CN111460980B (en) Multi-scale detection method for small-target pedestrian based on multi-semantic feature fusion
CN111860587B (en) Detection method for small targets of pictures
CN114118362A (en) Method for detecting structural surface crack under small sample based on generating type countermeasure network
CN111832608B (en) Iron spectrum image multi-abrasive particle identification method based on single-stage detection model yolov3
Liao et al. Triplet-based deep similarity learning for person re-identification
CN110503000B (en) Teaching head-up rate measuring method based on face recognition technology
Zhan et al. Semi-supervised classification of hyperspectral data based on generative adversarial networks and neighborhood majority voting
CN103226699B (en) A kind of face identification method having supervision locality preserving projections based on separating degree difference
CN111833322B (en) Garbage multi-target detection method based on improved YOLOv3
CN114495010A (en) Cross-modal pedestrian re-identification method and system based on multi-feature learning
CN113095158A (en) Handwriting generation method and device based on countermeasure generation network
CN110503157B (en) Image steganalysis method of multitask convolution neural network based on fine-grained image
CN112132207A (en) Target detection neural network construction method based on multi-branch feature mapping
CN113723482B (en) Hyperspectral target detection method based on multi-example twin network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination