CN114118362A - Method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network - Google Patents
Method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network
- Publication number
- CN114118362A (application number CN202111298865.7A)
- Authority
- CN
- China
- Prior art keywords
- image
- layer
- training
- generator
- data set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention relates to a method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network, comprising the following steps: 1) acquiring images of the structure surface, preprocessing the images and constructing an image data set; 2) constructing an image generation model and training it on the acquired image data set to realize expansion of the data set; 3) constructing a detection model based on VggNet for judging whether the structure surface has cracks, and completing crack detection of the structure surface after training. Compared with the prior art, the method has the advantages of being suitable for small samples and of accurate and rapid detection.
Description
Technical Field
The invention relates to the field of structural surface crack detection, in particular to a method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network.
Background
In recent years, with the large-scale construction and use of roads, road maintenance and management have drawn increasing attention; detecting and repairing damage at an early stage can greatly reduce cost. With the rapid development of deep learning, acquired structural surface images can be classified and detected rapidly and accurately using convolutional neural networks. For structural surface detection based on convolutional neural networks, when samples are scarce, how to obtain a large amount of high-quality training data is a key problem to be solved urgently.
A literature search of the prior art shows that research on structural surface crack detection mostly focuses on improving algorithm performance under the condition of a sufficiently large data set, so as to detect pavement cracks accurately.
In the Chinese patent "A complex pavement crack recognition method based on R-CNN" (application number 202110732505.7), Wangmai et al. pre-train Mask R-CNN on the public ImageNet data set, construct a pavement crack recognition model, train it on labeled pavement picture data with a deep learning algorithm to detect cracks and their locations in pictures, and finally upload the recognized crack image files to a server over the network and store the image paths. The method improves detection precision through a feature pyramid network and further improves the crack recognition effect.
In a Chinese patent 'concrete pavement crack detection method for improving a PoolNet network structure' (application number 202110432301.1), Qu and the like disclose a concrete pavement crack detection method for improving a PoolNet network structure.
However, the above existing researches do not consider the situation in which the data set is too small to train the model; in fact, if the training data set contains few samples, the accuracy of the trained model drops greatly when it is used to detect pavement cracks.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network.
The purpose of the invention can be realized by the following technical scheme:
A method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network comprises the following steps:
1) acquiring an image of the surface of the structure, preprocessing the image and constructing an image data set;
2) constructing an image generation model to train the acquired image data set, and realizing the expansion of the image data set;
3) and constructing a detection model for judging whether the structure surface has cracks or not based on VggNet, and finishing the crack detection of the structure surface after training.
The step 1) is specifically as follows:
the method comprises the steps of shooting a structure surface by adopting an unmanned aerial vehicle, a camera or a smart phone, finishing acquisition of an original image data set, and dividing an acquired image of the structure surface into a plurality of images with square sizes to form an image data set.
In the step 2), a deep convolutional generative adversarial network (DCGAN) is adopted as the image generation model and trained on the image data set so as to generate an arbitrary number of structural surface crack images, realizing the expansion of the image data set.
The generator of the deep convolutional generative adversarial network (DCGAN) sequentially comprises:
an input layer: for inputting the random noise array;
a reshaping layer: a fully connected layer that converts the random numbers and rearranges them into an array;
three successive combinations of a fractionally-strided (transposed) convolution layer, a normalization layer and a ReLU layer: for converting the array output by the reshaping layer, through a final transposed convolution layer and a hyperbolic tangent activation function, into a 3-channel array, i.e., an RGB three-channel color image.
The input of the generator is a 1 × 1 × 100 noise array, which becomes a 512-channel 24 × 24 array after the reshaping layer; each subsequent transposed convolution layer reduces the number of channels of the array and increases its size. The normalization layer normalizes each group of data to a mean of 0 and a variance of 1, and the ReLU layer, which adopts the function ReLU(x) = max(0, x), increases the sparsity of the neural network. The normalization layer, the ReLU layer and the tanh layer do not change the size of the array; the final output is a 3-channel 224 × 224 array, i.e., the generated color image.
The discriminator of the deep convolutional generative adversarial network (DCGAN) is a convolutional neural network whose input is the 224 × 224 color image produced by the generator; after 5 successive convolution layers, a single 1 × 1 number is produced, which is the output of the discriminator.
Among the 5 convolution layers, the first 4 use filters of size 5 × 5 with 32, 64, 128 and 256 channels respectively, a stride of 2 and "Same" pixel padding; the last convolution layer is a single-channel 14 × 14 × 256 filter. To prevent gradient explosion and overfitting, several normalization layers and ReLU layers are added to the discriminator, and the ReLU function in the discriminator uses the leaky ReLU function.
The loss function of the deep convolutional generative adversarial network (DCGAN) is the standard GAN objective:

min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 − D(G(z)))]

where D(x) denotes the probability, computed by the discriminator D, that the input image is a real photographed image (1 means judged real, 0 means judged fake; the higher the probability, the more likely the image is real), z is the input random noise, G(z) denotes the generated sample image, E_{x~p_data} denotes the mathematical expectation over the real data set, and E_{z~p_z} denotes the mathematical expectation over the generated data set;

the training goal of the DCGAN is to make D(x) and 1 − D(G(z)) as large as possible with the generator G fixed, and as small as possible with the discriminator D fixed. Optimizing the discriminator D with the generator G fixed, in the form of a binary cross-entropy function, gives the loss loss_D = −[log D(x) + log(1 − D(G(z)))]; optimizing the generator G with the discriminator D fixed gives the loss loss_G = −log D(G(z)); taking the discriminator D and the generator G into comprehensive consideration, the overall objective is the function V(D, G) above. During training, the discriminator D is first optimized to minimize loss_D, and then the generator G is optimized to minimize loss_G; this process is repeated continuously so that the discriminator D and the generator G reach a Nash equilibrium, training a generator that can synthesize images very similar to real ones.
The hyper-parameters of the deep convolutional generative adversarial network (DCGAN) include the dropout-layer drop probability dropoutProb, the slope leakyReluScale of the leaky ReLU function for negative inputs, the training-set size miniBatchSize per iteration, the number of iterations epoch, and, in SGDM optimization, the learning rate learningRate, the gradient decay factor gradientDecayFactor and the squared gradient decay factor squaredGradientDecayFactor.
The step 3) specifically comprises the following steps:
31) construction of the detection model: a convolutional neural network (CNN) is trained to classify the crack types; with the synthetic image data generated by the generator as input, training is performed by transfer learning, the convolutional part of the network using part of the Vgg16 pre-trained network;
32) evaluation of the detection model:
The data set before expansion and the expanded data set are each used to train a detection model; the pictures in the test set are input into both convolutional-neural-network detection models for prediction, and if the detection accuracy after expansion is higher than before expansion, the generative model is considered meaningful.
33) Application of the detection model:
and processing the actually shot picture, inputting the processed picture into a detection model obtained by training after expansion, and obtaining a detection result, wherein the detection result comprises no crack, a linear crack and a reticular crack.
Compared with the prior art, the invention has the following advantages:
the method constructs a generation type countermeasure network based on DCGAN, and enables the generator to generate vivid images when the model is stable by continuously improving the game performance of the generator and the discriminator, then the synthesized images are used for expanding a crack data set, and a convolutional neural network for crack type identification is constructed based on Vgg16 in a transfer learning mode, so that the method can expand data under the condition of few samples, and can adapt to the requirement of the convolutional neural network model on the data quantity.
Drawings
Fig. 1 is a schematic diagram of image segmentation.
Fig. 2 is a schematic diagram of the layers of a generator network.
FIG. 3 is a diagram of the layers of a network of arbiters.
Fig. 4 is a neural network structure of the migration learning based on the Vgg16 network.
FIG. 5 is a flow diagram of the present invention.
FIG. 6 is the average probability calculated by the discriminator during training, where FIG. 6a shows averProb_epoch_true at each learning rate and FIG. 6b shows averProb_epoch_fake at each learning rate.
Fig. 7 shows the model training around 1500 generations, where Fig. 7a is the averProb curve and Fig. 7b is the loss curve.
Fig. 8 is an image generated by the generation model.
FIG. 9 is a partial composite image generated with a generator.
FIG. 10 is a loss curve and accuracy curve for training before and after data expansion, where FIG. 10a is the loss curve before expansion, FIG. 10b is the accuracy curve before expansion, FIG. 10c is the loss curve after expansion, and FIG. 10d is the accuracy curve after expansion.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments.
The invention provides a method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network, comprising the following steps:
1) collecting data and preprocessing;
2) constructing an image generation model (DCGAN), training the acquired image data set, and realizing the expansion of the image data set;
3) constructing a CNN model based on VggNet for judging whether the structure surface has cracks, and comparing the effects of the data set before and after expansion.
The details of the above steps are as follows:
step 1) the data set production and pretreatment specifically comprises the following steps:
11) raw data acquisition
And shooting the surface of the structure by using an unmanned aerial vehicle, a camera or a smart phone to finish the acquisition of the original data set.
12) Image pre-processing
Since convolutional neural networks generally process square images, each image is divided into a plurality of square images, as shown in Fig. 1.
Since conventional convolutional neural networks (Vgg16/19, ResNet101) take input images of resolution 224 × 224, an excessively large input would increase the size and depth of the network geometrically and greatly increase the computation. Therefore, considering both computational efficiency and image quality, each divided square image is compressed to a resolution of 224 × 224, finally forming the image data set. Of these images, 80% are used as the training set and 20% as the test set for training the neural network.
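The tiling and splitting arithmetic described above can be sketched in a few lines; the concrete figures in the comments (5472 × 3648 photos, 912 × 912 tiles, 1187 crack images) are taken from the embodiment later in the description and serve only as an example:

```python
def square_tile_count(height, width, tile):
    """Number of full square tiles of side `tile` that fit in an image."""
    return (height // tile) * (width // tile)

def train_test_split_sizes(n_images, train_frac=0.8):
    """Split the data set 80/20 into training and test sets, as described above."""
    n_train = round(n_images * train_frac)
    return n_train, n_images - n_train

# Example figures from the embodiment: 5472 x 3648 photos cut into 912 x 912
# tiles, and 1187 crack images split 80/20.
tiles_per_photo = square_tile_count(3648, 5472, 912)   # 4 * 6 = 24 tiles
train_n, test_n = train_test_split_sizes(1187)         # 950 training, 237 test
print(tiles_per_photo, train_n, test_n)
```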
Step 2) the construction and training of the image generation model specifically comprises the following steps:
21) selection of image generation model
The method is based on DCGAN: the crack images obtained in step 1) are used for training, finally yielding a generation model that can generate an arbitrary number of crack images.
The GAN network used to generate images is divided into two parts: a discriminator and a generator. The discriminator is similar to the neural network used in image classification: it takes image information as input and outputs a judgment of whether the image belongs to the given class. The generator takes a set of random numbers as input and outputs a fake image computed by its neural network. During training, a random noise vector z is first generated and fed to the generator G, which converts it into an image G(z). G(z) is then input to the discriminator D, which outputs the probability that the image is real: a probability of 1 means the discriminator judges it a real image, while 0 means a fake image.
In each round of training, the generator G aims to generate a picture which looks real as much as possible to cheat the discrimination network D. And the aim of D is to separate the picture generated by G and the real picture as much as possible. After each round of training, parameters in D and G are corrected through a back propagation algorithm, the capability of G for generating false and true images and the capability of D for judging the images can be improved, and the generator can generate images similar to a training set after enough times of training.
The original GAN presents some problems in image generation: its training is not very stable, and the generator's generating ability easily becomes mismatched with the discriminator's judging ability. The deep convolutional generative adversarial network (DCGAN) can effectively alleviate this problem, so this patent constructs a DCGAN-based network for generating structural surface cracks.
22) Construction of generators
The input of the generator is a 1 × 1 × 100 random noise array. The input layer is followed by a reshaping layer (project-and-reshape layer), a fully connected layer that converts the 100 random numbers into 294912 numbers and rearranges them into a 512-channel 24 × 24 array.
The generator then converts this 512-channel 24 × 24 array into a 64-channel 112 × 112 array through three successive combinations of a transposed convolution layer, a normalization layer and a ReLU layer, after which a final transposed convolution layer and a hyperbolic tangent activation function, tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x)), convert it into a 3-channel 224 × 224 array, i.e., an RGB three-channel image.
Fractionally-strided (transposed) convolution expands a smaller array into a larger one; there are four such layers in total, denoted transConv1, transConv2, transConv3 and transConv4. All four use 5 × 5 convolution kernels whose depth equals the number of input channels output by the previous layer (512, 256, 128 and 64), with 256, 128, 64 and 3 kernels respectively. transConv1 has a stride of 1 and the default output cropping (no cropping), while transConv2, transConv3 and transConv4 all have a stride of 2 with the output cropping mode "Same".
A normalization layer and a ReLU layer are attached after each of transConv1, transConv2 and transConv3. The normalization layer counteracts the shift of the intermediate-layer data distribution, prevents gradient vanishing or explosion and accelerates training; using x̂ = (x − μ)/σ, it normalizes each group of data to a mean of 0 and a variance of 1. The ReLU layer increases the sparsity of the neural network to some extent, alleviating overfitting and greatly reducing gradient vanishing, while requiring little computation and thus improving training efficiency. In the generator network, all ReLU layers use the ordinary ReLU function, ReLU(x) = max(0, x). The specific structure of the generator network is shown in Table 1.
Table 1 details of the generator network
The input of the generator is a 1 × 1 × 100 noise array, which is a 24 × 24 array of 512 channels after passing through the reshaping layer, and then the number of channels of the array is reduced and the size is increased every time passing through a micro-step convolution layer, while the normalization layer, the Relu layer and the tanh layer do not change the size of the array. The final output becomes a 3-channel 224 x 224 array, i.e., the generated color image, with the inputs and outputs of the various layers of the generator network shown in fig. 2.
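The spatial-size chain of the generator (24 → 28 → 56 → 112 → 224) can be checked with a short sketch; the formulas below are the usual transposed-convolution size rules, and the exact cropping behaviour of the framework's layers is an assumption:

```python
def transposed_conv_out(size, kernel, stride, cropping):
    """Output spatial size of a transposed-convolution layer.

    cropping == "same": output = input * stride (size doubles at stride 2).
    cropping == "none": output = (input - 1) * stride + kernel (no cropping).
    """
    if cropping == "same":
        return size * stride
    return (size - 1) * stride + kernel

# Generator chain: the reshaping layer gives 24 x 24, then transConv1..transConv4.
size = 24
size = transposed_conv_out(size, 5, 1, "none")   # transConv1: 24 -> 28
size = transposed_conv_out(size, 5, 2, "same")   # transConv2: 28 -> 56
size = transposed_conv_out(size, 5, 2, "same")   # transConv3: 56 -> 112
size = transposed_conv_out(size, 5, 2, "same")   # transConv4: 112 -> 224
print(size)  # 224
```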
23) Structure of discriminator
The discriminator is a convolutional neural network. Its input is a 224 × 224 color image, which first passes through a dropout layer to prevent overfitting. The discriminator then convolves the 224 × 224 × 3 array into a single 1 × 1 × 1 number through 5 convolution layers, designated Conv1, Conv2, Conv3, Conv4 and Conv5. Of the five, the first four all use 5 × 5 filters with 32, 64, 128 and 256 channels respectively, a stride of 2, and "Same" pixel padding, i.e., the padding preserves size within the stride. The last convolution layer, Conv5, is a single-channel 14 × 14 × 256 filter that convolves the 14 × 14 × 256 array from the previous layer into a 1 × 1 × 1 number, which is the output of the discriminator.
To prevent gradient explosion, overfitting and similar problems, several normalization layers and ReLU layers are also added to the discriminator. Normalization is the same as in the generator: using x̂ = (x − μ)/σ, the input data are converted to a mean of 0 and a variance of 1. The ReLU layers in the discriminator use the leaky ReLU function, whose gradient is 0.2 for negative inputs, i.e., LeakyReLU(x) = x for x ≥ 0 and 0.2x for x < 0. The specific structure of the discriminator network is shown in Table 2.
TABLE 2 detailed construction of discriminator network
The input of the discriminator is a 224 × 224 × 3 RGB three-channel image, the array size becomes half of the original (112 × 112,56 × 56,28 × 28,14 × 14) after each convolution layer of Conv1, Conv2, Conv3 and Conv4, the number of channels is 32,64,128 and 256 in turn, and the normalization layer and the Relu layer do not change the array size. The final output becomes a number of 1 × 1 × 1. The input and output of each layer of the discriminator network are shown in fig. 3.
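A similar sketch reproduces the discriminator's size chain (224 → 112 → 56 → 28 → 14 → 1) together with the leaky ReLU it uses; the "Same"-padding size formula is the usual convolution rule and is an assumption about the exact layer options:

```python
import math

def conv_out_same(size, stride):
    """Output size of a convolution with 'Same' padding: ceil(input / stride)."""
    return math.ceil(size / stride)

def leaky_relu(x, scale=0.2):
    """Leaky ReLU as used in the discriminator: slope `scale` for x < 0."""
    return x if x >= 0.0 else scale * x

size = 224
for layer in ("Conv1", "Conv2", "Conv3", "Conv4"):   # 5x5 filters, stride 2, 'Same'
    size = conv_out_same(size, 2)                    # 224 -> 112 -> 56 -> 28 -> 14
# Conv5's single-channel 14 x 14 filter then reduces 14 x 14 x 256 to 1 x 1 x 1.
print(size, leaky_relu(-1.0))  # 14 -0.2
```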
24) Construction of loss function
The loss function adopts the one commonly used in GAN networks:

min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 − D(G(z)))]

where D(x) represents the probability, calculated by the discriminator D, that the input image is a real captured image; 1 represents an image judged "true", 0 an image judged "fake", and the higher the probability, the more likely the image is real. z is the input random noise, and G(z) represents the generated sample. The goal of training the discriminator D is therefore to make D(x_true) tend to 1 and D(G(z)) tend to 0 with G fixed, whereas the generator G, conversely, wants D(G(z)) to approach 1 as much as possible with D fixed. In summary, the goal of GAN training is to make D(x) and 1 − D(G(z)) as large as possible with G fixed and as small as possible with D fixed. Optimizing D with G fixed, using the binary cross-entropy form, gives loss_D = −[log D(x) + log(1 − D(G(z)))]; optimizing G with D fixed gives loss_G = −log D(G(z)); considering D and G together, the overall objective is the function V(D, G) above. During training, D is first optimized to minimize loss_D, and then G is optimized to minimize loss_G. This process is repeated continuously so that D and G reach a Nash equilibrium, training a generator that can synthesize images very similar to real ones and thus supply additional data for further analysis.
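The two losses can be evaluated numerically with a small sketch; loss_G here uses the non-saturating form −log D(G(z)) commonly used in practice, which is an assumption about the exact form intended by the original text:

```python
import math

def loss_d(d_real, d_fake):
    """Discriminator loss, binary cross-entropy form:
    loss_D = -[log D(x) + log(1 - D(G(z)))]."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def loss_g(d_fake):
    """Generator loss, non-saturating form: loss_G = -log D(G(z))."""
    return -math.log(d_fake)

# At the Nash equilibrium the discriminator outputs 0.5 for real and fake alike:
print(round(loss_d(0.5, 0.5), 4))  # 2*log(2) = 1.3863
print(round(loss_g(0.5), 4))       # log(2)   = 0.6931
# A well-trained discriminator (D(x) near 1, D(G(z)) near 0) has a small loss_D:
print(round(loss_d(0.99, 0.01), 4))
```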
25) Selection of hyper-parameters
The hyper-parameters of the GAN network are selected based partly on experience in the computer field regarding hyper-parameter selection, and partly on the results of testing different sets of hyper-parameters.
In the neural network constructed by the invention, the hyper-parameters to be selected are: the dropout-layer drop probability dropoutProb, the slope leakyReluScale of the leaky ReLU function for negative inputs, the training-set size miniBatchSize per iteration, the number of iterations epoch, and, in SGDM optimization, the learning rate learningRate, the gradient decay factor gradientDecayFactor and the squared gradient decay factor squaredGradientDecayFactor. The most important hyper-parameters, the learning rate and the number of iterations, strongly affect the training result; a common learning-rate range (0.0001–0.001) and a large iteration count (2000 generations) are taken, and the optimal values are determined by testing in actual use. The specific values of these hyper-parameters are shown in Table 3.
TABLE 3 selection of hyper-parameters
Hyper-parameter | Value |
dropoutProb | 0.5 |
leakyReluScale | 0.2 |
miniBatchSize | 8 |
epoch | ≤2000 |
learningRate | 0.0001-0.001 |
gradientDecayFactor | 0.5 |
squaredGradientDecayFactor | 0.999 |
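The choices of Table 3 can be collected in a configuration sketch; the dictionary keys mirror the hyper-parameter names above, and `maxEpoch` and the learning-rate range are stored as the bounds tested (the embodiment later settles on a learning rate of 0.0002 and 1500 generations):

```python
# Hyper-parameter values from Table 3; learningRate is kept as the tested range.
dcgan_hyperparams = {
    "dropoutProb": 0.5,                   # dropout-layer drop probability
    "leakyReluScale": 0.2,                # leaky ReLU slope for negative inputs
    "miniBatchSize": 8,                   # training-set size per iteration
    "maxEpoch": 2000,                     # iteration upper bound
    "learningRateRange": (0.0001, 0.001), # tested range
    "gradientDecayFactor": 0.5,
    "squaredGradientDecayFactor": 0.999,
}
print(dcgan_hyperparams["miniBatchSize"], dcgan_hyperparams["maxEpoch"])
```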
Step 3) the construction and training of the detection model specifically comprises the following steps:
31) construction of detection model
The cracks of the structured surface are divided into three types according to geometrical characteristics: no cracks, linear cracks and reticular cracks. The convolutional neural network is used for classifying the types of the cracks through training.
With the generator trained in step 2, a large amount of synthetic image data can be generated. If each of the 100 random numbers input to the generator can take 1000 values, then theoretically 1000^100 = 10^300 different images can be generated, which satisfies the convolutional neural network's requirement on the number of training samples.
Training is carried out by transfer learning using part of the Vgg16 pre-trained network; the convolutional part of the network adopts Vgg16. Since Vgg16 was trained for 1000 image classes, in the classification model of structural surface cracks the last two layers containing 1000 outputs (the fully connected layer and softmax layer) are replaced by a 3-output fully connected layer and softmax layer, while the parameters of the other layers remain those of the pre-trained Vgg16. The structure of the convolutional neural network is shown in Fig. 4.
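The head replacement described above can be illustrated schematically; the layer names below are illustrative stand-ins, not actual Vgg16 or MATLAB layer identifiers:

```python
# Schematic stand-in for the transfer-learning surgery: the two 1000-output
# layers of Vgg16 are swapped for 3-output ones, while the earlier layers keep
# their pre-trained parameters.
vgg16_layers = ["conv_blocks", "fc6", "fc7", "fc8_1000", "softmax_1000"]

def replace_head(layers, n_classes=3):
    """Drop the last two layers and attach an n_classes-output fc + softmax."""
    return layers[:-2] + [f"fc_{n_classes}", f"softmax_{n_classes}"]

crack_classifier = replace_head(vgg16_layers)
print(crack_classifier)  # ['conv_blocks', 'fc6', 'fc7', 'fc_3', 'softmax_3']
```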
32) Evaluation of detection models
The data set before expansion and the expanded data set are each used to train a detection model. The pictures in the test set are input into both convolutional-neural-network models for prediction; if the detection accuracy after expansion is higher than before expansion, the generative model is considered meaningful.
Examples
The method for detecting the cracks on the surface of the structure comprises the following steps:
step 1, collecting a data set on one road surface of an intersection of two roads in a certain place, and shooting by using an unmanned aerial vehicle, wherein the data of the unmanned aerial vehicle is shown in a table 4:
TABLE 4 UAV parameters
Parameter | Value |
UAV mass /g | 1250 |
Maximum flight time /min | about 30 |
Effective pixels | about 20,000,000 |
Photo size | 5472 × 3648 |
Video resolution | 1920 × 1080 |
Controllable rotation range /(°) | −90° to +30° |
The flying height of the unmanned aerial vehicle is 1.5 m, with the camera perpendicular to the ground; the vehicle moves along the road direction and 80 pavement images are captured, which are first preprocessed. The pixel size of the captured pictures is 5472 × 3648; since convolutional neural networks generally process square images, each image is split into 24 images of size 912 × 912.
The divided images of 912 × 912 resolution are compressed into images of 224 × 224 resolution, yielding 1920 images in total. In fact, not all of the 224 × 224 images obtained after segmentation contain cracks; after removing pictures of crack-free pavement and pictures containing leaves, manhole covers and the like, 1187 images containing cracks remain. Of these, 950 (80%) are used as the training set and 237 (20%) as the test set for training the neural network.
Step 2, training the DCGAN network by taking training set data as input
The construction and training of the discriminator and generator were performed with MATLAB as the programming platform, using the deep learning toolkit in MATLAB.
To obtain the optimal learning rate, four values, namely 0.0001, 0.0002, 0.0005 and 0.001, are selected for comparison.
FIG. 6 shows the average probabilities of the discriminator at learning rates 0.0001, 0.0002, 0.0005 and 0.001, i.e., the average probabilities averProb_true and averProb_fake predicted by the discriminator D for the real and synthetic images in each cycle:

averProb_true = (1/m) Σ_{i=1..m} D(I_data^i), averProb_fake = (1/m) Σ_{i=1..m} D(I_fake^i)

where m is the amount of data used per cycle, miniBatchSize; I_data are the real images and I_fake are the synthetic images (i.e., the generator output G(x_rand) for a random input x_rand).

As with the loss function, to smooth the averProb curves, the per-generation averages averProb_epoch_true and averProb_epoch_fake of these quantities over all iterations of each generation are also calculated.
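The averProb quantities are simple mini-batch means; the discriminator outputs below are hypothetical values chosen only for illustration:

```python
def aver_prob(d_outputs):
    """Mean discriminator probability over one mini-batch (m = miniBatchSize)."""
    return sum(d_outputs) / len(d_outputs)

# Hypothetical discriminator outputs for one mini-batch of real / fake images:
d_on_real = [0.9, 0.8, 0.85, 0.95]   # D(I_data), ideally near 1
d_on_fake = [0.1, 0.2, 0.15, 0.05]   # D(G(x_rand)), ideally near 0
print(aver_prob(d_on_real))  # averProb_true, ~0.875
print(aver_prob(d_on_fake))  # averProb_fake, ~0.125
```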
As can be seen from fig. 6, when the learning rates are 0.0002,0.0005 and 0.001, the convergence rates of the loss function and the prediction probability in the early stage of training are almost the same, the values of loss and averProb in models at the learning rates of 0.0005 and 0.001 after reaching the steady state are almost the same, and the values of loss and averProb in models at the learning rates of 0.0002 and 0.0001 are almost the same, and the latter is superior to the former. In summary, considering the convergence rate and the calculation efficiency of the model, the learning rate of not less than 0.0002 is preferable, and considering the final convergence condition of the model, the learning rate of not more than 0.0002 is preferable. In consideration of both, the learning rate of 0.0002 is optimal.
To obtain the optimal number of training iterations maxEpoch, analysis of the loss curve and the average-probability curve at the selected learning rate of 0.0002 shows that after training to about 1500 generations, averProb_fake and loss_D begin to increase while averProb_true and loss_G begin to decrease, as shown in Fig. 7.
Therefore, after 1500 generations the quality of the model degrades and overfitting appears, so the generator obtained after 1500 generations of training is taken as the final result. The generated images are shown in Fig. 8.
Step 3: a number of images are generated with the trained generator; images with no cracks, linear cracks and reticular cracks are separated out, and the training set is expanded to 2000 images per class. The resulting 6000 images are used as the image-classification data set for training. Fig. 9 shows some of the synthetic images.
The training loss and accuracy curves are shown in fig. 10.
The images in the test set are input into the two convolutional-neural-network models for prediction. The results show that the model trained on the data set before expansion achieves a classification accuracy of 87.23%, while the model trained on the expanded data set achieves 97.50%. The crack-classification results are listed in Table 5.
TABLE 5 Crack classification results
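The accuracy comparison above can be reproduced with a simple helper (a sketch; the function name `classification_accuracy` and the eight-image label vectors are illustrative assumptions, not the patent's actual test set):

```python
def classification_accuracy(predicted, actual):
    """Fraction of test images whose predicted class matches the ground truth."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Illustrative test set with labels 0 = no crack, 1 = linear, 2 = reticular
truth  = [0, 0, 1, 1, 1, 2, 2, 2]
before = [0, 1, 1, 1, 0, 2, 2, 1]   # model trained on the un-expanded set
after  = [0, 0, 1, 1, 1, 2, 2, 1]   # model trained on the expanded set
acc_before = classification_accuracy(before, truth)
acc_after = classification_accuracy(after, truth)
```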
The results show that training on the expanded data greatly improves classification accuracy. In addition, the transfer-learning approach allows the model to converge rapidly, greatly improving training efficiency, confirming that expanding the data set with the generative model is worthwhile.
Claims (10)
1. A method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network, characterized by comprising the following steps:
1) acquiring an image of the surface of the structure, preprocessing the image and constructing an image data set;
2) constructing an image generation model to train the acquired image data set, and realizing the expansion of the image data set;
3) constructing a VggNet-based detection model for judging whether the structure surface is cracked, and completing crack detection of the structure surface after training.
2. The method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network as claimed in claim 1, wherein step 1) is specifically as follows:
the method comprises the steps of shooting a structure surface by adopting an unmanned aerial vehicle, a camera or a smart phone, finishing acquisition of an original image data set, and dividing an acquired image of the structure surface into a plurality of images with square sizes to form an image data set.
3. The method as claimed in claim 1, wherein in step 2) a deep convolutional generative adversarial network DCGAN is used as the image generation model and is trained on the image data set to generate an arbitrary number of structural surface crack images, thereby expanding the image data set.
4. The method as claimed in claim 3, wherein the generator of the deep convolutional generative adversarial network DCGAN sequentially comprises:
an input layer: for inputting the random noise array;
a reshaping layer: a fully connected layer that transforms the random numbers and rearranges them into an array;
three successive combinations of a fractionally-strided convolution layer, a normalization layer and a ReLU layer: for converting the array output by the reshaping layer, via a final fractionally-strided convolution layer and a hyperbolic tangent activation function, into a 3-channel array, i.e. an RGB three-channel color image.
5. The method as claimed in claim 4, wherein the input of the generator is a 1 × 100 noise array, which becomes a 512-channel 24 × 24 array after the reshaping layer; after each fractionally-strided convolution layer the number of channels decreases and the array size increases; the normalization layer normalizes each array to zero mean and unit variance; the ReLU layer, using the ReLU function, is used to increase the sparsity of the neural network; the normalization, ReLU and tanh layers do not change the array size, and the generator finally outputs a 3-channel 224 × 224 array, i.e. the generated color image.
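The "micro-step" (fractionally-strided, or transposed) convolution that performs the upsampling in claims 4 and 5 can be illustrated in isolation. The NumPy sketch below is an assumption for illustration, not the patent's implementation: it upsamples a single-channel array by inserting stride − 1 zeros between input pixels and then applying an ordinary convolution, so an H × H input becomes ((H − 1)·s + k) × ((H − 1)·s + k):

```python
import numpy as np

def transposed_conv2d(x, kernel, stride=2):
    """Fractionally-strided convolution on a single-channel 2-D array:
    zero-insertion upsampling followed by a full 2-D correlation."""
    h, w = x.shape
    k = kernel.shape[0]
    # Insert (stride - 1) zeros between neighbouring input pixels.
    up = np.zeros(((h - 1) * stride + 1, (w - 1) * stride + 1))
    up[::stride, ::stride] = x
    # Pad so every kernel position overlapping the input contributes.
    up = np.pad(up, k - 1)
    out = np.zeros((up.shape[0] - k + 1, up.shape[1] - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(up[i:i + k, j:j + k] * kernel)
    return out

# A 4 x 4 input with stride 2 and a 3 x 3 kernel grows to (4-1)*2 + 3 = 9
y = transposed_conv2d(np.ones((4, 4)), np.ones((3, 3)), stride=2)
```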
6. The method as claimed in claim 5, wherein the discriminator of the deep convolutional generative adversarial network DCGAN is a convolutional neural network whose input is the 224 × 224 color image generated by the generator, and which outputs a single 1 × 1 number after 5 successive convolution layers.
7. The method as claimed in claim 6, wherein among the 5 convolution layers the filter size of the first 4 is 5 × 5, with 32, 64, 128 and 256 channels respectively, a stride of 2 and 'same' pixel padding; the last convolution layer is a single-channel 14 × 14 × 256 filter; to prevent gradient explosion or overfitting, several normalization layers and ReLU layers are added to the discriminator, and the ReLU function in the discriminator is the leaky ReLU function.
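Claim 7's leaky ReLU differs from the plain ReLU of claim 5 only in its negative branch, where a small gradient is retained instead of zeroed. A NumPy sketch; the slope parameter corresponds to the leakReLuScale hyper-parameter of claim 9, and the value 0.2 here is an assumption (the patent does not fix it in this passage):

```python
import numpy as np

def relu(x):
    """Plain ReLU used in the generator: zero for negative inputs."""
    return np.maximum(0.0, x)

def leaky_relu(x, scale=0.2):
    """Leaky ReLU used in the discriminator: negative inputs are scaled
    by 'scale' rather than zeroed, preserving a small gradient."""
    return np.where(x >= 0, x, scale * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
```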
8. The method as claimed in claim 3, wherein the loss function of the deep convolutional generative adversarial network DCGAN is expressed as:
min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]
wherein D(x) denotes the probability, computed by the discriminator D, that the input image is a real photographed image, with 1 denoting a judgment of real and 0 a judgment of fake (the higher the probability, the more likely the image is real); z is the input random noise; G(z) denotes a generated sample image; E_{x~p_data(x)}[·] denotes the mathematical expectation over the real data set, and E_{z~p_z(z)}[·] denotes the mathematical expectation over the generated data set;
the training goal of the deep convolution generation of the countermeasure network DCGAN is to make D (x) and 1-D (G (z)) as large as possible with the generator G fixed and as small as possible with the arbiter D fixed, optimizing the arbiter D with the generator G fixed, in the form of a binary cross-entropy function, resulting in a lossOptimize generator G with discriminator D fixed, get lossOptimization taking comprehensive consideration of the discriminator D and the generator G, i.e. with a loss function ofDuring the training process, the discriminator D is fed firstOptimization of line to lossDMaximizing, and then optimizing the generator G to loseGMinimizing, and continuously circulating the process to make the discriminator D and the generator G reach Nash equilibrium, so as to train the generator which can be synthesized to be very similar to the real image.
9. The method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network as claimed in claim 3, wherein the hyper-parameters of the deep convolutional generative adversarial network DCGAN comprise the discard probability dropout of the dropout layer, the negative-input gradient scale leakReLuScale of the leaky ReLU function, the number of training samples per iteration miniBatchSize, the number of iterations epoch, and, for the SGDM optimizer, the learning rate learningRate, the gradient decay factor gradientDecayFactor and the squared-gradient decay factor squaredGradientDecayFactor.
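Collected as a configuration sketch: the learning rate and epoch count follow the values selected in the description (0.0002 and 1500), while all the other values below are illustrative assumptions only, not values stated by the patent:

```python
# DCGAN hyper-parameters named in claim 9. learningRate and maxEpoch follow
# the description's analysis; the remaining values are placeholders.
dcgan_config = {
    "dropout": 0.5,                       # assumed dropout probability
    "leakReLuScale": 0.2,                 # assumed negative-input gradient scale
    "miniBatchSize": 64,                  # assumed samples per iteration
    "maxEpoch": 1500,                     # from the training-curve analysis
    "learningRate": 0.0002,               # selected in the description
    "gradientDecayFactor": 0.9,           # assumed SGDM gradient decay
    "squaredGradientDecayFactor": 0.999,  # assumed squared-gradient decay
}
```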
10. The method for detecting structural surface cracks under small-sample conditions based on a generative adversarial network as claimed in claim 1, wherein step 3) comprises the following steps:
31) construction of the detection model: a convolutional neural network CNN is trained to classify the crack types; the synthetic image data generated by the generator is used as input, and training is carried out by transfer learning, with part of the convolutional neural network initialized from a Vgg16 pre-trained network;
32) evaluation of the detection model:
the detection models are trained on the data set before expansion and on the expanded data set respectively, yielding two detection models; the images in the test set are input into both convolutional-neural-network detection models for prediction, and if the detection accuracy after expansion is higher than before expansion, the generative model is considered worthwhile.
33) Application of the detection model:
and processing the actually shot picture, inputting the processed picture into a detection model obtained by training after expansion, and obtaining a detection result, wherein the detection result comprises no crack, a linear crack and a reticular crack.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111298865.7A CN114118362A (en) | 2021-11-04 | 2021-11-04 | Method for detecting structural surface crack under small sample based on generating type countermeasure network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114118362A true CN114118362A (en) | 2022-03-01 |
Family
ID=80380591
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111298865.7A Pending CN114118362A (en) | 2021-11-04 | 2021-11-04 | Method for detecting structural surface crack under small sample based on generating type countermeasure network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114118362A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115689924A (en) * | 2022-10-28 | 2023-02-03 | 浙江大学 | Data enhancement method and device for concrete structure ultrasonic tomography image |
CN117436350A (en) * | 2023-12-18 | 2024-01-23 | 中国石油大学(华东) | Fracturing horizontal well pressure prediction method based on deep convolution generation countermeasure network |
CN117436350B (en) * | 2023-12-18 | 2024-03-08 | 中国石油大学(华东) | Fracturing horizontal well pressure prediction method based on deep convolution generation countermeasure network |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||