CN112633354B - Pavement crack detection method, device, computer equipment and storage medium - Google Patents

Pavement crack detection method, device, computer equipment and storage medium

Info

Publication number
CN112633354B
Authority
CN
China
Prior art keywords
training
model
image
application model
neural network
Prior art date
Legal status
Active
Application number
CN202011506513.1A
Other languages
Chinese (zh)
Other versions
CN112633354A (en)
Inventor
刘建
毛妤
王云
Current Assignee
Guangdong Greater Bay Area Institute of Integrated Circuit and System
Original Assignee
Guangdong Greater Bay Area Institute of Integrated Circuit and System
Priority date
Filing date
Publication date
Application filed by Guangdong Greater Bay Area Institute of Integrated Circuit and System
Priority to CN202011506513.1A
Publication of CN112633354A
Application granted
Publication of CN112633354B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

The application relates to a pavement crack detection method, a pavement crack detection device, computer equipment and a storage medium. The method comprises the following steps: acquiring a plurality of training images, a plurality of training models and an application model; inputting each training image into each neural network model respectively to obtain the output of each neural network model; determining a reference category of the training image according to the output of the plurality of training models corresponding to the same training image; adjusting model parameters of the application model until a first training stopping condition is met, and obtaining a pre-trained application model; acquiring a plurality of pavement images marked with the positions of the areas where the cracks are positioned; taking the pavement image as a training sample, taking the position of the area where the crack in the pavement image is positioned as a training label, and retraining the pre-trained application model until a second training stopping condition is met, so as to obtain a final trained application model; the finally trained application model is used for detecting pavement cracks. The method can accurately detect the pavement cracks.

Description

Pavement crack detection method, device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of image analysis technologies, and in particular, to a method and apparatus for detecting a pavement crack, a computer device, and a storage medium.
Background
With the development of the transportation industry, road maintenance has become increasingly important. As a key component of the transportation network, roads not only carry heavy traffic loads but also bear directly on the safety of the people who travel on them. Over time, however, cracks inevitably form in the road surface, weakening the road structure and creating potential safety hazards. Such cracks must be repaired promptly, or the damage will only worsen. Regular road inspection is therefore indispensable so that pavement cracks are detected in time.
In the conventional technology, pavement crack training images are acquired by an image pickup device. The area where the crack is located is then marked manually on each training image to obtain the crack region position corresponding to that image. A neural network model is trained with the pavement crack training images and their corresponding crack positions. Finally, a pavement crack detection image is input into the trained neural network model to obtain the position of the area where the crack is located on that image.
However, pavement cracks are irregularly distributed, and fine cracks are easily obscured by surrounding obstacles, so marking the crack region on each pavement crack training image requires a large amount of manual labor. As a result, few training images labeled with crack region positions are available, and the trained neural network model cannot accurately locate cracks in pavement crack detection images.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a road surface crack detection method, apparatus, computer device, and storage medium capable of accurately detecting a road surface crack.
A pavement crack detection method, the method comprising:
acquiring a plurality of training images, a plurality of training models and an application model; the application model and the training models are neural network models with different structures, and the class range output by each neural network model is the same as the class range of the training images;
inputting each training image into each neural network model respectively to obtain the output of each neural network model;
determining a reference category of the training image according to the output of the plurality of training models corresponding to the same training image;
adjusting model parameters of the application model until a first training stopping condition is met, so as to obtain a pre-trained application model; the difference between the output of the pre-trained application model for a training image and the reference class of that training image is less than a threshold;
acquiring a plurality of pavement images marked with the positions of the areas where the cracks are positioned;
taking the pavement image as a training sample, taking the position of the area where the crack in the pavement image is positioned as a training label, and retraining the pre-trained application model until a second training stopping condition is met, so as to obtain a final trained application model; the final trained application model is used for detecting pavement cracks.
In one embodiment, the acquiring a plurality of training images, a plurality of training models, and an application model includes:
establishing a plurality of training models and an application model;
acquiring a plurality of classified images marked with the belonging categories;
and taking the classified images as training samples, taking the categories to which the classified images belong as training labels, and training each training model until a third training stopping condition is met, so as to obtain a trained training model.
In one embodiment, the inputting each training image into each neural network model to obtain an output of each neural network model includes:
and inputting one training image into one neural network model to obtain the probability that the training image output by the neural network model belongs to each category.
In one embodiment, the determining the reference class of the training image according to the outputs of the training models corresponding to the same training image includes:
taking an average value of probabilities that the same training image output by the training models belongs to the same category as a reference probability that the training image belongs to the same category;
and taking the category corresponding to the maximum value in the reference probabilities of the categories of the training images as the reference category of the training images.
In one embodiment, the adjusting the model parameters of the application model until the first training stopping condition is met, to obtain a pre-trained application model includes:
establishing a discriminator, and forming a generative adversarial network with the discriminator by taking the application model as the generator; the discriminator is a neural network model;
inputting the reference category of the training image into the discriminator as a true sample to obtain a true sample discrimination result, and inputting the output of the application model for the training image into the discriminator as a false sample to obtain a false sample discrimination result;
alternately adjusting model parameters of the discriminator and model parameters of the generator until a fourth training stop condition is met, so as to obtain a pre-trained application model; after the discriminator is adjusted, the difference between the true sample discrimination result and the false sample discrimination result is larger than before the adjustment, and after the generator is adjusted, the difference between the true sample discrimination result and the false sample discrimination result is smaller than before the adjustment.
In one embodiment, the number of neurons in the application model is less than the number of neurons in each of the training models.
In one embodiment, after the obtaining the final trained application model, the method further comprises:
acquiring an image to be detected;
inputting the image to be detected into the final trained application model to obtain the position of the region where the crack in the image to be detected is located.
A pavement crack detection device, the device comprising:
the pre-training acquisition module is used for acquiring a plurality of training images, a plurality of training models and an application model; the application model and the training models are neural network models with different structures, and the class range output by each neural network model is the same as the class range of the training images;
the processing module is used for inputting each training image into each neural network model respectively to obtain the output of each neural network model;
the determining module is used for determining the reference category of the training image according to the output of the plurality of training models corresponding to the same training image;
the adjusting module is used for adjusting the model parameters of the application model until the first training stopping condition is met, so as to obtain a pre-trained application model; the difference between the output of the pre-trained application model for a training image and the reference class of that training image is less than a threshold;
the training acquisition module is used for acquiring a plurality of road surface images marked with the positions of the areas where the cracks are located;
the training module is used for taking the pavement image as a training sample, taking the position of the area where the crack in the pavement image is located as a training label, and retraining the pre-trained application model until a second training stopping condition is met, so that a final trained application model is obtained; the final trained application model is used for detecting pavement cracks.
A computer device comprising a memory storing a computer program and a processor which when executing the computer program performs the steps of:
acquiring a plurality of training images, a plurality of training models and an application model; the application model and the training models are neural network models with different structures, and the class range output by each neural network model is the same as the class range of the training images;
inputting each training image into each neural network model respectively to obtain the output of each neural network model;
determining a reference category of the training image according to the output of the plurality of training models corresponding to the same training image;
adjusting model parameters of the application model until a first training stopping condition is met, so as to obtain a pre-trained application model; the difference between the output of the pre-trained application model for a training image and the reference class of that training image is less than a threshold;
acquiring a plurality of pavement images marked with the positions of the areas where the cracks are positioned;
taking the pavement image as a training sample, taking the position of the area where the crack in the pavement image is positioned as a training label, and retraining the pre-trained application model until a second training stopping condition is met, so as to obtain a final trained application model; the final trained application model is used for detecting pavement cracks.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
acquiring a plurality of training images, a plurality of training models and an application model; the application model and the training models are neural network models with different structures, and the class range output by each neural network model is the same as the class range of the training images;
inputting each training image into each neural network model respectively to obtain the output of each neural network model;
determining a reference category of the training image according to the output of the plurality of training models corresponding to the same training image;
adjusting model parameters of the application model until a first training stopping condition is met, so as to obtain a pre-trained application model; the difference between the output of the pre-trained application model for a training image and the reference class of that training image is less than a threshold;
acquiring a plurality of pavement images marked with the positions of the areas where the cracks are positioned;
taking the pavement image as a training sample, taking the position of the area where the crack in the pavement image is positioned as a training label, and retraining the pre-trained application model until a second training stopping condition is met, so as to obtain a final trained application model; the final trained application model is used for detecting pavement cracks.
According to the pavement crack detection method, device, computer equipment and storage medium described above, a plurality of training images, a plurality of training models and an application model are acquired. The application model and the training models are all neural network models, and the class range output by each neural network model is the same as the class range of the training images, so the training images can be input into the training models and the application model to obtain the class of each training image output by each neural network model. Because the training models differ in structure, determining the reference category of a training image from the outputs of the training models for that image lets the different models extract different features from the image. The features of the training image are thus extracted comprehensively, which improves the accuracy of category determination. Model parameters of the application model are then adjusted, based on the reference class of the training image and the output of the application model for that training image, until a first training stopping condition is met. This yields a pre-trained application model for which the difference between its output for a training image and the reference class of that image is smaller than a threshold. The training models' ability to extract features comprehensively and classify accurately is thereby transferred to the application model, so that a single application model achieves the effect of the combined training models.
After the pre-trained application model is obtained, a plurality of road surface images marked with the positions of the areas where the cracks are located are acquired. With the road surface images as training samples and the crack region positions as training labels, the pre-trained application model is retrained until the second training stopping condition is met, giving a final trained application model that can be used to detect pavement cracks. The application model can extract features comprehensively from a pavement image and accurately determine where the cracks are distributed, achieving accurate pavement crack detection. Moreover, when the combined effect of the training models is transferred to the application model, the training images need not be classified or labeled manually, so a large number of training images can be used, which ensures the training effect of the application model.
Drawings
FIG. 1 is a schematic flow chart of a method for detecting a crack in a road surface according to an embodiment;
FIG. 2 is a block diagram of a pavement crack detection device according to one embodiment;
FIG. 3 is an internal block diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
In one embodiment, as shown in fig. 1, a pavement crack detection method is provided, and this embodiment is illustrated by applying the method to the terminal in fig. 1. It will be appreciated that the method may also be applied to a server, and may also be applied to a system comprising a terminal and a server, and implemented by interaction of the terminal and the server. In this embodiment, the method includes the steps of:
Step S102, a plurality of training images, a plurality of training models and an application model are acquired.
The training images are plane images with category attributes, such as a photo of a cat and a photo of a dog. The training images are input into a classification model, and the classification model can output the category corresponding to the training images. For example, a photograph of a cat is input into a classification model, which outputs a category to which the training image belongs as a cat. For another example, the photographs of the dogs are input into a classification model, and the classification model outputs the class to which the training image belongs as the dogs.
The application model and the training models are neural network models with different structures. Neural network models are complex network systems formed by a large number of simple processing units (called neurons) widely interconnected, reflecting many of the fundamental features of human brain function, and are highly complex nonlinear dynamic learning systems. The different structures of the neural network model include different numbers of neurons in the neural network model, and/or different connection manners of the neurons in the neural network model.
The class range output by each neural network model is the same as the class range of the plurality of training images. For example, if the training images consist of scenery photographs, cat photographs and dog photographs, the class range of the training images comprises three categories (cat, dog, and neither cat nor dog), so the class range output by each neural network model also comprises these three categories.
Specifically, the training image may be acquired by a photographing device, such as a photographed photo, or a frame of image in a photographed video. If images are stored in the local database, the images can also be directly called from the local database as training images. In addition, images may be searched from the internet as training images. In this embodiment, the requirement for the training image is low, and the training image is easy to obtain.
The plurality of training models and the application model may employ a convolutional neural network. Convolutional neural networks are a class of feedforward neural networks that contain convolutional computations and have a deep structure. The convolutional neural network comprises an input layer, at least one hidden layer and an output layer, wherein the at least one hidden layer is sequentially connected between the input layer and the output layer. Common structures of the hidden layer include a convolution layer, a pooling layer and a full connection layer.
The input layer of each convolutional neural network receives a training image, and the output layer of each covers the class range of the plurality of training images, while the hidden layers differ from network to network. Here, differing hidden layers means that the networks have different numbers of hidden layers and/or different numbers of neurons in the same hidden layer. Implementing the training models and the application model as convolutional neural networks keeps the input and output of every neural network model consistent, while the differing hidden layers make it easy to give each network a different structure.
In this embodiment, by acquiring a plurality of training models and one application model, the application model may be trained by using the plurality of training models, so that the application model has the characteristics of the plurality of training models. Because the structures of the training models are different, the feature extraction modes of the training models are different, and different features can be extracted from the same image. After the application model has the characteristics of a plurality of training models, the characteristics in the image can be comprehensively extracted. And a plurality of training images are acquired, so that samples can be provided for training of the application model by a plurality of training models.
Step S104, each training image is respectively input into each neural network model, and output of each neural network model is obtained.
The input of each neural network model is a training image, and the output is the category to which the training image belongs. For example, when a photograph of a cat is input into a neural network model, the model outputs cat as the category of the training image; when a photograph of a dog is input, the model outputs dog.
Specifically, a training image is input into a neural network model, the neural network model extracts feature data from the training image, and a series of processing is performed on the feature data, so that the probability that the training image belongs to each category is finally obtained, and the category with the highest probability is generally used as the category of the training image.
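The last stage of that processing can be sketched as follows. This is only an illustration: the text does not say how the probabilities are computed, so the use of a softmax over raw scores, the function names, the category names and the example scores are all assumptions.

```python
import math

def softmax(scores):
    """Convert raw per-category scores into probabilities that sum to 1.

    Softmax is an assumption; the source only says probabilities are obtained.
    """
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_category(scores, categories):
    """Return the most probable category and the full probability list."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best], probs

# Hypothetical category names and raw scores for one training image.
CATEGORIES = ["cat", "dog", "neither"]
label, probs = predict_category([0.5, 2.0, -1.0], CATEGORIES)
print(label, [round(p, 3) for p in probs])
```

The category with the highest probability is reported as the category of the image, matching the description above.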
In this embodiment, each training image is input to each neural network model to obtain an output of each neural network model, and the output of the application model and the outputs of the plurality of training models may be compared, and model parameters of the application model are adjusted based on a difference between the two, so that the output of the application model is finally consistent with the output of the plurality of training models.
Step S106, determining the reference category of the training image according to the output of the plurality of training models corresponding to the same training image.
The reference class of the training image is the class designated for the training image and represents the target output of model training. For example, suppose the reference class of a training image is dog. When the training image is input into the application model, an output of dog indicates that training of the application model has reached the target, whereas an output of cat indicates that it has not.
Specifically, the output of each training model is comprehensively considered, and the output of all training models is integrated into one result, and the result is used as a reference category of the training image. For example, of the five training models, the output of four training models is a dog, the output of one training model is a cat, and the reference class of the training image is a dog.
In practical application, a weight coefficient can be assigned to the output of each training model; the outputs of the training models are multiplied by their corresponding weight coefficients and then added, and the sum is used to determine the reference class of the training image. For example, for the same training image, the output of training model A is 70% dog, 20% cat and 10% neither cat nor dog; the output of training model B is 80% dog, 10% cat and 10% neither; the output of training model C is 90% dog, 10% cat and 0% neither; the output of training model D is 60% dog, 20% cat and 20% neither; and the output of training model E is 40% dog, 60% cat and 0% neither. If the weight coefficient of each training model's output is 0.2, then for the reference class of the training image the probability of dog is 70% × 0.2 + 80% × 0.2 + 90% × 0.2 + 60% × 0.2 + 40% × 0.2 = 68%, the probability of cat is 20% × 0.2 + 10% × 0.2 + 10% × 0.2 + 20% × 0.2 + 60% × 0.2 = 24%, and the probability of neither cat nor dog is 10% × 0.2 + 10% × 0.2 + 0% × 0.2 + 20% × 0.2 + 0% × 0.2 = 8%. The category with the largest combined probability, here dog, is the reference class.
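The weighted combination in this example can be reproduced with a short sketch. The function names and the dictionary layout are assumptions made for illustration; the probabilities and the weight of 0.2 are the ones given in the example for training models A to E, with "neither" standing for the category that is neither cat nor dog.

```python
def reference_probabilities(model_outputs, weights):
    """Weighted sum of per-category probabilities across the training models."""
    categories = model_outputs[0].keys()
    return {
        c: sum(w * out[c] for w, out in zip(weights, model_outputs))
        for c in categories
    }

def reference_category(ref_probs):
    """Category with the largest combined probability."""
    return max(ref_probs, key=ref_probs.get)

# Outputs of training models A to E for the same training image.
outputs = [
    {"dog": 0.70, "cat": 0.20, "neither": 0.10},  # model A
    {"dog": 0.80, "cat": 0.10, "neither": 0.10},  # model B
    {"dog": 0.90, "cat": 0.10, "neither": 0.00},  # model C
    {"dog": 0.60, "cat": 0.20, "neither": 0.20},  # model D
    {"dog": 0.40, "cat": 0.60, "neither": 0.00},  # model E
]
weights = [0.2] * len(outputs)  # equal weights, as in the example

ref = reference_probabilities(outputs, weights)
print({c: round(p, 2) for c, p in ref.items()})
print(reference_category(ref))
```

With equal weights the weighted sum reduces to the plain average of the model outputs; unequal weights would let more trusted training models contribute more.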
In this embodiment, according to the output of the plurality of training models corresponding to the same training image, the reference class of the training image is determined, and the outputs of the plurality of training models can be combined to train the application model, so that the application model can inherit the characteristics of the plurality of training models, the extracted characteristics have comprehensiveness, and the application model is facilitated to obtain accurate output.
Step S108, model parameters of the application model are adjusted until the first training stopping condition is met, and the pre-trained application model is obtained.
The difference between the output of the pre-trained application model for a training image and the reference class of that training image is less than a threshold. The first training stop condition includes at least one of the following: the difference between the output of the application model for the training image and the reference class of the training image is smaller than the threshold; the number of training iterations reaches a preset number; the training time reaches a preset duration; the classification accuracy of the application model reaches a preset target.
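A stop condition of this kind might be checked as in the sketch below. It takes one possible reading, in which all four listed conditions are configured and training stops as soon as any one holds. The class name, field names and function signature are hypothetical.

```python
import time
from dataclasses import dataclass

@dataclass
class StopCriteria:
    """Hypothetical container for the preset values named in the text."""
    max_difference: float  # threshold on the output-vs-reference difference
    max_steps: int         # preset number of training iterations
    max_seconds: float     # preset training duration
    min_accuracy: float    # preset classification accuracy target

def should_stop(criteria, difference, steps, started_at, accuracy, now=None):
    """Return True as soon as any one of the configured conditions holds."""
    now = time.monotonic() if now is None else now
    return (
        difference < criteria.max_difference
        or steps >= criteria.max_steps
        or (now - started_at) >= criteria.max_seconds
        or accuracy >= criteria.min_accuracy
    )
```

A training loop would call `should_stop` after each iteration; an implementation that uses only a subset of the conditions would simply drop the unused terms.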
In addition, neurons are functions that contain weights and bias terms. After receiving the data, the neuron multiplies the data by the weight, adds the bias term to the data, and outputs the result after calculation. The weights and bias terms are model parameters.
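As a minimal sketch of that computation, a single neuron can be written as a function of its inputs, weights and bias term. The function name and example values are illustrative only, and no activation function is applied because the text does not mention one.

```python
def neuron(inputs, weights, bias):
    """Multiply each input by its weight, sum the products, and add the bias."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights)) + bias

# Illustrative values: two inputs, two weights, one bias term.
print(neuron([1.0, 2.0], [0.5, -1.0], 0.25))
```

Training adjusts `weights` and `bias`, which are the model parameters referred to above.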
Specifically, each training image is sequentially input to each neural network model. Each neural network model can obtain corresponding output when each training image is input. The outputs of multiple training models for the same training image are integrated into the reference class of the training image. Model parameters of the application model are adjusted based on differences between the reference class of the training image and the output of the application model for the training image. And then inputting another training image into each neural network model to perform the same processing, and circulating the steps until the first training stop condition is met.
In this embodiment, by adjusting model parameters of the application model, differences between the output of the application model and reference categories of the training image are reduced, so that the output of the application model continuously approximates to the output of the plurality of training models, and the characteristics of the plurality of training models are transplanted to the application model.
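The adjustment loop can be illustrated with a deliberately tiny stand-in for the application model. Everything here is an assumption made for the sketch: the application model is reduced to a single linear layer followed by softmax, each image is represented by a two-element feature vector, the reference is given as per-category probabilities, and plain gradient descent on cross-entropy is used as the update rule. The embodiment itself uses convolutional neural networks and does not prescribe this update rule.

```python
import math

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def student_forward(weights, biases, features):
    """Output of the toy application model: per-category probabilities."""
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)

def pretrain_student(samples, n_features, n_classes,
                     lr=0.5, threshold=0.05, max_steps=5000):
    """Adjust parameters until the output is close to the reference.

    samples: list of (features, reference_probabilities) pairs.
    Stops when the largest per-category gap seen in a pass is below
    `threshold`, or after `max_steps` passes over the samples.
    """
    weights = [[0.0] * n_features for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for step in range(1, max_steps + 1):
        worst = 0.0
        for features, reference in samples:
            probs = student_forward(weights, biases, features)
            # Gradient of cross-entropy w.r.t. the scores is (probs - reference).
            errors = [p - r for p, r in zip(probs, reference)]
            worst = max(worst, max(abs(e) for e in errors))
            for k, err in enumerate(errors):
                for j, x in enumerate(features):
                    weights[k][j] -= lr * err * x
                biases[k] -= lr * err
        if worst < threshold:
            break
    return weights, biases, step

# Two hypothetical training images with reference probabilities
# (the first reference reuses the 68% / 24% / 8% example).
samples = [
    ([1.0, 0.0], [0.68, 0.24, 0.08]),
    ([0.0, 1.0], [0.10, 0.80, 0.10]),
]
weights, biases, steps_used = pretrain_student(samples, n_features=2, n_classes=3)
for features, reference in samples:
    print([round(p, 2) for p in student_forward(weights, biases, features)], reference)
```

No manual labels appear anywhere in the loop: the only training target is the reference derived from the training models, which is why unlabeled images suffice for this stage.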
Step S110, a plurality of pavement images marked with the positions of the areas where the cracks are located are acquired.
The road surface image is a plane image whose category is road surface. The position of the area where a crack is located, as marked on the road surface image, refers to the distribution position of the crack on that image. When the road surface image is input into a trained neural network model, the model can output the position of the area where the crack is located. For example, the neural network model outputs (x, y, w, h), indicating that the crack is distributed within the rectangular region of the road surface image whose vertices are (x, y), (x+w, y), (x+w, y+h) and (x, y+h). Alternatively, the neural network model outputs (x_min, y_min, x_max, y_max), indicating that the crack is distributed within the rectangular region whose vertices are (x_min, y_min), (x_max, y_min), (x_max, y_max) and (x_min, y_max).
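The two rectangle encodings describe the same region and can be converted into each other, as the following sketch shows. The function names are hypothetical, and (x, y) is taken to be the upper-left corner of the rectangle.

```python
def xywh_to_corners(box):
    """(x, y, w, h) -> (x_min, y_min, x_max, y_max)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)

def corners_to_xywh(box):
    """(x_min, y_min, x_max, y_max) -> (x, y, w, h)."""
    x_min, y_min, x_max, y_max = box
    return (x_min, y_min, x_max - x_min, y_max - y_min)

def vertices(box_xywh):
    """Four vertices of the rectangle, starting at (x, y)."""
    x, y, w, h = box_xywh
    return [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]

# Illustrative crack region: corner at (10, 20), 30 wide, 40 high.
box = (10, 20, 30, 40)
print(xywh_to_corners(box))
print(vertices(box))
```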
In particular, the road surface image may be acquired by a photographing device, such as a photographed photo, or a frame of image in a photographed video. If the local database stores the road surface image, the road surface image can also be directly called from the local database. In addition, road surface images may be searched for from the internet. After the road surface image is obtained, the road surface image can be displayed to a user, and the position of the area where the crack is located is selected on the road surface image by the user, so that the road surface image marked with the position of the area where the crack is located is obtained.
In this embodiment, the difference between the output of the pre-trained application model for a training image and the reference class of that training image is smaller than the threshold. At this point, the characteristics of the plurality of training models have been transplanted to the application model, so the class to which a training image belongs can be accurately determined. A plurality of road surface images marked with the positions of the areas where the cracks are located are then acquired and used to retrain the pre-trained application model, so that the application model changes from determining the category of a training image to determining the distribution position of the area where a crack is located on a road surface image.
In practical applications, the output of the application model is changed to the region distribution position. Specifically, the output of the application model may be changed directly to the output having the region distribution position, or the region distribution position may be added to the output of the application model. For example, the raw outputs are confidence and category, on the basis of which the upper left abscissa, upper left ordinate, width and height are added.
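The second option above, adding the region distribution position to the original output, can be pictured as extending an output record. The sketch below is only an illustration; the class and field names are assumptions and are not defined in the application.

```python
from dataclasses import dataclass


@dataclass
class ClassificationOutput:
    """Original output of the application model (hypothetical field names)."""
    confidence: float
    category: str


@dataclass
class DetectionOutput(ClassificationOutput):
    """Original output plus the added region distribution position."""
    x: float       # upper-left abscissa
    y: float       # upper-left ordinate
    width: float
    height: float


# Hypothetical example values, for illustration only.
example = DetectionOutput(confidence=0.9, category="crack",
                          x=12.0, y=34.0, width=56.0, height=7.0)
```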
Step S112, the pavement image is taken as a training sample, the position of the area where the crack in the pavement image is located is taken as a training label, and the pre-trained application model is retrained until a second training stop condition is met, so as to obtain the final trained application model.
The finally trained application model is used for detecting pavement cracks. The second training stop condition includes at least one of the following conditions: the difference between the distribution position output by the application model for the road surface image and the marked position of the area where the crack is located on the road surface image is smaller than a threshold value, the number of training iterations reaches a preset number, the training time reaches a preset time, and the classification accuracy of the application model reaches a preset index.
Specifically, each road surface image is sequentially input to the application model. And each time a pavement image is input, the application model can output the distribution position of the area where the crack is located. Based on the difference between the output distribution position and the position of the area where the crack marked on the pavement image is located, the model parameters of the application model are adjusted. Then another road surface image is input to the application model for the same processing, and the process is circulated until the second training stop condition is satisfied.
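The loop described above can be sketched as follows. This is a minimal illustration under stated assumptions: `step` is a hypothetical callable that performs one parameter adjustment for one image and returns the difference between the output position and the marked position, and only three of the listed stop conditions (difference threshold, preset number of iterations, preset time) are checked.

```python
import itertools
import time
from typing import Callable, Sequence, Tuple


def retrain_until_stopped(
    samples: Sequence[Tuple[object, object]],
    step: Callable[[object, object], float],
    diff_threshold: float,
    max_steps: int,
    max_seconds: float,
) -> Tuple[int, str]:
    """Feed (image, label) pairs one at a time until a stop condition holds."""
    if not samples:
        raise ValueError("at least one labelled road surface image is required")
    start = time.monotonic()
    # Cycle through the images so that training continues after the last one.
    for steps, (image, label) in enumerate(itertools.cycle(samples), start=1):
        diff = step(image, label)  # one adjustment of the model parameters
        if diff < diff_threshold:
            return steps, "difference below threshold"
        if steps >= max_steps:
            return steps, "preset number of training steps reached"
        if time.monotonic() - start >= max_seconds:
            return steps, "preset training time reached"


# Hypothetical stand-in for a real update: the difference simply shrinks.
diffs = iter([0.9, 0.5, 0.05])
result = retrain_until_stopped(
    [("img_a", (0, 0, 5, 5)), ("img_b", (1, 1, 2, 2))],
    lambda image, label: next(diffs),
    diff_threshold=0.1, max_steps=100, max_seconds=60.0,
)
print(result)  # (3, 'difference below threshold')
```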
In the pavement crack detection method, a plurality of training images, a plurality of training models, and an application model are acquired. The application model and the plurality of training models are all neural network models, and the class range output by each neural network model is the same as the class range of the plurality of training images, so the plurality of training images can be input into the plurality of training models and the application model to obtain the class of each training image output by each neural network model. The training models differ in structure, and the reference category of a training image is determined according to the outputs of the plurality of training models for the same training image. The training models can therefore extract different features from the training image, so that various features are fully extracted, the comprehensiveness of feature extraction is ensured, and the accuracy of category judgment is improved. Based on the reference class of the training image and the output of the application model corresponding to the training image, model parameters of the application model are adjusted until a first training stop condition is met, yielding a pre-trained application model for which the difference between its output for a training image and the reference class of that training image is smaller than a threshold value. The ability of the plurality of training models to comprehensively extract features and accurately classify them can thus be transplanted to the application model, so that a single application model achieves the combined effect of the plurality of training models.
After the application model is obtained, a plurality of road surface images marked with the positions of the areas where the cracks are located are obtained, the road surface images are taken as training samples, the positions of the areas where the cracks are located in the road surface images are taken as training labels, the pre-trained application model is retrained until the second training stopping condition is met, and a final trained application model is obtained, so that the final trained application model can be used for detecting the road surface cracks. The application model can comprehensively extract the characteristics from the pavement image, accurately determine the distribution positions of the cracks, and realize the accurate detection of the pavement cracks. And when the comprehensive effect of a plurality of training models is transplanted to the application model, the training images do not need to be classified or marked manually, a large number of training images can be adopted for training, and the training effect of the application model is ensured.
In one embodiment, acquiring a plurality of training images, a plurality of training models, and an application model, includes: establishing a plurality of training models and an application model; acquiring a plurality of classified images marked with the belonging categories; and taking the classified images as training samples, taking the categories of the classified images as training labels, and training each training model until the third training stopping condition is met, so as to obtain a trained training model.
The classified images are plane images with category attributes, such as a photo of a cat and a photo of a dog. The third training stop condition includes at least one of the following conditions: the difference between the class output by the training model corresponding to the classified image and the class to which the classified image belongs is smaller than a threshold value, the training times reach a preset number of times, the training time reaches a preset time, and the classification accuracy of the application model reaches a preset index.
Illustratively, the training model and the application model may employ the structure of an existing neural network model. For example, the training model is one of ResNet-152, SE-Net154, and MobileNet-V3, and the application model is ResNet-50.
Specifically, the classified image may be acquired by a photographing device, such as a photographed photo, or a frame of image in a photographed video. If images are stored in the local database, the images can also be directly called from the local database as classified images. In addition, an image may be searched from the internet as a classified image. After the classified image is acquired, it may be displayed to the user, and the user selects the class to which it belongs, thereby obtaining a classified image marked with the class to which it belongs.
Each classified image is sequentially input into each training model. Each time a classified image is input, each training model produces a corresponding output. Model parameters of each training model are adjusted based on the difference between the output of that training model and the class marked on the classified image. Another classified image is then input into each training model for the same processing, and this is repeated until the third training stop condition is met.
In this embodiment, by acquiring a plurality of classified images marked with the belonging class and training each training model by using the plurality of classified images, a trained training model can be obtained, so that the application model can be trained by using a plurality of training models, so that the application model can inherit the characteristics of the plurality of training models, and the features can be comprehensively and accurately extracted from the images.
In one embodiment, inputting each training image into each neural network model to obtain an output of each neural network model, includes: and inputting a training image into a neural network model to obtain the probability that the training image output by the neural network model belongs to each category.
The probability that the training image belongs to each category refers to, for each category, the probability that the training image belongs to that category. For example, for a training image, the output of training model A includes a 70% probability of dog, a 20% probability of cat, and a 10% probability of neither cat nor dog, indicating that training model A considers the training image to have a 70% probability of being an image of a dog, a 20% probability of being an image of a cat, and a 10% probability of being neither an image of a cat nor an image of a dog.
Specifically, each training image is sequentially input to each neural network model. For example, the training image A is input to the neural network model A, the neural network model B, the neural network model C, the neural network model D, and the neural network model E, respectively; the training image B is then input to the neural network model A, the neural network model B, the neural network model C, the neural network model D, and the neural network model E, respectively; and finally, the training image C is input to the neural network model A, the neural network model B, the neural network model C, the neural network model D, and the neural network model E, respectively.
In this embodiment, a training image is input to a neural network model, and the neural network model outputs the probabilities that the training image belongs to the various categories. For a single neural network model, the category with the highest probability may be used as the category of the training image determined by that neural network model. For a plurality of neural network models, the probabilities of the same category output by the plurality of neural network models can first be combined to obtain the comprehensive probability of that category, and the category with the highest comprehensive probability is then used as the category of the training image agreed on by the plurality of neural network models. Compared with taking the category chosen by the largest number of neural network models as the agreed category, calculating the comprehensive probability better reflects the differences among the discrimination results that the plurality of neural network models output for the same training image, which helps determine the category of the training image accurately.
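The contrast between the two ways of combining models can be seen in a small sketch. The probabilities below are hypothetical values chosen for illustration and do not come from the application; the comprehensive probability is taken here as a plain average, which is one possible way of combining the outputs.

```python
from collections import Counter
from typing import Dict

# Hypothetical outputs of three models for one training image.
outputs = {
    "A": {"dog": 0.51, "cat": 0.49},
    "B": {"dog": 0.51, "cat": 0.49},
    "C": {"dog": 0.05, "cat": 0.95},
}


def majority_vote(model_outputs: Dict[str, Dict[str, float]]) -> str:
    """Category picked by the largest number of models."""
    votes = Counter(max(p, key=p.get) for p in model_outputs.values())
    return votes.most_common(1)[0][0]


def highest_average(model_outputs: Dict[str, Dict[str, float]]) -> str:
    """Category with the highest probability averaged over all models."""
    categories = next(iter(model_outputs.values())).keys()
    averages = {
        c: sum(p[c] for p in model_outputs.values()) / len(model_outputs)
        for c in categories
    }
    return max(averages, key=averages.get)


print(majority_vote(outputs))    # dog
print(highest_average(outputs))  # cat
```

Two models lean only slightly toward "dog" while the third is strongly confident of "cat"; the vote discards that difference in confidence, whereas the averaged probability keeps it.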
In one embodiment, determining the reference class of the training image based on the outputs of the plurality of training models corresponding to the same training image includes: taking the average value of probabilities that the same training image output by a plurality of training models belongs to the same category as the reference probability that the training image belongs to the same category; and taking the category corresponding to the maximum value in the reference probabilities of the categories to which the training image belongs as the reference category of the training image.
Specifically, after a training image is input into each neural network model and the probabilities that the training image belongs to each category are obtained, the average value of the probabilities, output by the plurality of training models, that the image belongs to the same category is calculated as the reference probability of that category. For example, for the same training image, the output of training model A includes a 70% probability of dog, a 20% probability of cat, and a 10% probability of neither cat nor dog; the output of training model B includes an 80% probability of dog, a 10% probability of cat, and a 10% probability of neither; the output of training model C includes a 90% probability of dog, a 10% probability of cat, and a 0% probability of neither; the output of training model D includes a 60% probability of dog, a 20% probability of cat, and a 20% probability of neither; and the output of training model E includes a 40% probability of dog, a 60% probability of cat, and a 0% probability of neither. The probability that the training image belongs to dog is then 70% × 0.2 + 80% × 0.2 + 90% × 0.2 + 60% × 0.2 + 40% × 0.2 = 68%, the probability that it belongs to cat is 20% × 0.2 + 10% × 0.2 + 10% × 0.2 + 20% × 0.2 + 60% × 0.2 = 24%, and the probability that it belongs to neither cat nor dog is 10% × 0.2 + 10% × 0.2 + 0% × 0.2 + 20% × 0.2 + 0% × 0.2 = 8%.
The maximum value is then selected from the reference probabilities that the training image belongs to the categories (that is, the confidence is greater than 0.5), and the category corresponding to the maximum value is taken as the reference category of the training image. For example, if a training image has a 68% probability of belonging to dog, a 24% probability of belonging to cat, and an 8% probability of belonging to neither cat nor dog, the maximum value is 68%, and the category corresponding to 68%, namely dog, is taken as the reference category of the training image.
In this embodiment, the average value of probabilities that the same training image output by a plurality of training models belongs to the same category is used as the reference probability that the training image belongs to one category, so that the difference of the discrimination results of each category in the output of each training model can be fully considered. On the basis, the category corresponding to the maximum value in the reference probability of each category is used as the reference category of the training image, so that the category to which the training image belongs can be accurately determined.
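The averaging in this embodiment can be checked with a short sketch that uses the five-model example given above. Probabilities are written as whole percentages so the arithmetic is exact; the function and variable names are illustrative.

```python
from typing import Dict

# Outputs of training models A to E for the same training image, in percent.
model_outputs = {
    "A": {"dog": 70, "cat": 20, "neither": 10},
    "B": {"dog": 80, "cat": 10, "neither": 10},
    "C": {"dog": 90, "cat": 10, "neither": 0},
    "D": {"dog": 60, "cat": 20, "neither": 20},
    "E": {"dog": 40, "cat": 60, "neither": 0},
}


def reference_probabilities(outputs: Dict[str, Dict[str, int]]) -> Dict[str, float]:
    """Average, per category, the probabilities output by all training models."""
    categories = next(iter(outputs.values()))
    return {
        c: sum(o[c] for o in outputs.values()) / len(outputs)
        for c in categories
    }


reference = reference_probabilities(model_outputs)
reference_category = max(reference, key=reference.get)
print(reference)           # {'dog': 68.0, 'cat': 24.0, 'neither': 8.0}
print(reference_category)  # dog
```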
In one embodiment, adjusting model parameters of the application model until a first training stop condition is met, resulting in a pre-trained application model, comprises: establishing a discriminator, and taking the application model as a generator to form a generative adversarial network with the discriminator, wherein the discriminator is a neural network model; inputting the reference category of the training image into the discriminator as a true sample to obtain a true sample discrimination result, and inputting the output of the application model corresponding to the training image into the discriminator as a false sample to obtain a false sample discrimination result; and alternately adjusting model parameters of the discriminator and model parameters of the generator until a fourth training stop condition is met, to obtain the pre-trained application model, wherein the difference between the true sample discrimination result and the false sample discrimination result after the discriminator is adjusted is larger than that before the discriminator is adjusted, and the difference between the true sample discrimination result and the false sample discrimination result after the generator is adjusted is smaller than that before the generator is adjusted.
A generative adversarial network includes a generator and a discriminator, which produce relatively good outputs by learning in competition with each other. During training, the goal of the generator is to generate pictures that are as realistic as possible in order to deceive the discriminator. The goal of the discriminator is to separate the pictures generated by the generator from real pictures as far as possible. The generator and the discriminator thus form a dynamic "game process". In the ideal final state of the game, the generator can generate pictures realistic enough to pass for real ones, and it is difficult for the discriminator to determine whether a picture generated by the generator is real.
In this embodiment, the application model is used as the generator, and a discriminator is established so that the generator and the discriminator form a generative adversarial network. The goal of the generator is to make its output as close as possible to the reference class of the training image in order to deceive the discriminator, while the goal of the discriminator is to distinguish the output of the application model from the reference class of the training image as far as possible. The ideal final result is that the discriminator cannot distinguish the output of the application model from the reference category of the training image; at that point, the characteristics of the plurality of training models have been completely transplanted to the application model, and the optimal training effect is achieved.
Specifically, the application model is used as the generator, so the output of the application model is input to the discriminator as a false sample to obtain a false sample discrimination result. Correspondingly, the reference class of the training image is input to the discriminator as a true sample to obtain a true sample discrimination result. Based on the difference between the true sample discrimination result and the false sample discrimination result, model parameters of the generator and the discriminator can be alternately adjusted. The goal of the generator is to make its output approach the reference class of the training image as closely as possible in order to deceive the discriminator, so the difference between the true sample discrimination result and the false sample discrimination result after the generator is adjusted is smaller than that before the generator is adjusted. The goal of the discriminator is to distinguish the output of the application model from the reference category of the training image as far as possible, so the difference between the true sample discrimination result and the false sample discrimination result after the discriminator is adjusted is larger than that before the discriminator is adjusted.
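The application does not state which loss functions drive the alternating adjustment. As one possible illustration, the sketch below assumes the commonly used binary cross-entropy losses of a generative adversarial network, where the discriminator outputs the probability that its input is a true sample. The scores are hypothetical values, not results from the application.

```python
import math


def discriminator_loss(d_real: float, d_fake: float) -> float:
    """Assumed loss for the discriminator: low when the true sample scores
    near 1 and the false sample scores near 0 (a large gap)."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake: float) -> float:
    """Assumed loss for the generator: low when the discriminator scores the
    application model's output near 1 (a small gap to the true sample)."""
    return -math.log(d_fake)


# Hypothetical discriminator scores (true sample, false sample).
before = (0.6, 0.4)
after_discriminator_update = (0.8, 0.2)  # gap widened from 0.2 to 0.6
after_generator_update = (0.8, 0.5)      # gap narrowed from 0.6 to 0.3

# Widening the gap lowers the discriminator's loss.
print(discriminator_loss(*after_discriminator_update) < discriminator_loss(*before))
# Narrowing the gap lowers the generator's loss.
print(generator_loss(after_generator_update[1]) < generator_loss(after_discriminator_update[1]))
```

Under these assumed losses, each adjustment of the discriminator moves the two discrimination results apart and each adjustment of the generator moves them together, which matches the alternating behaviour described above.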
Illustratively, the discriminator is a binary classifier and may include three convolution layers. In practical applications, the relative entropy between the reference class of the training image and the output of the application model may be used with the discriminator, so as to eliminate the influence of orders of magnitude, units, and the like.
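Relative entropy (Kullback-Leibler divergence) between two probability distributions can be computed as in the sketch below. It assumes that the reference is expressed as per-category reference probabilities; the application does not specify this detail, and the application model output used here is a hypothetical value.

```python
import math
from typing import Sequence


def relative_entropy(p: Sequence[float], q: Sequence[float],
                     eps: float = 1e-12) -> float:
    """KL(p || q) = sum of p_i * log(p_i / q_i), skipping terms with p_i == 0.

    q_i is clamped to eps so that a zero in q does not cause division by zero.
    """
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)


reference = [0.68, 0.24, 0.08]           # dog, cat, neither (reference probabilities)
application_output = [0.60, 0.30, 0.10]  # hypothetical application model output

print(relative_entropy(reference, application_output))
```

The value is zero when the two distributions coincide and grows as the application model output moves away from the reference. Because it depends only on ratios of probabilities, it is dimensionless.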
In one embodiment, the number of neurons in the application model is less than the number of neurons in each training model.
In this embodiment, the number of neurons in the application model is smaller than that of neurons in each training model, which indicates that the structure of the application model is simpler than that of the training model, so that the calculation amount in the application process is less, the efficiency of image processing can be improved, and the time of image processing can be shortened. And the application model is trained by a plurality of training models, so that the accuracy of image processing can be ensured.
In one embodiment, the number of road surface images is less than the number of training images.
Illustratively, the number of training images may be more than 100 times the number of road surface images.
In this embodiment, the training images do not need manual marking and are easy to obtain. By training with a large number of training images, the characteristics of the plurality of training models can be completely transplanted to the application model, and the training effect is good. Once the characteristics of the plurality of training models have been completely transplanted to the application model, a small number of road surface images are used to retrain the application model; this retraining mainly adjusts the output of the application model and does not affect its accuracy.
In one embodiment, the number of road surface images is less than the number of classification images.
Illustratively, the number of classified images may be 100 times or more the number of road surface images.
In this embodiment, a classified image only needs to be assigned a category, which is much easier than marking the crack distribution area. Training with a large number of classified images ensures the accuracy of the plurality of training models, and in turn the accuracy of the application model trained by the plurality of training models. Once the characteristics of the plurality of training models have been completely transplanted to the application model, a small number of road surface images are used to retrain the application model; this retraining mainly adjusts the output of the application model and does not affect its accuracy.
In one embodiment, after obtaining the final trained application model, the method further comprises: acquiring an image to be detected; inputting the image to be detected into a final trained application model to obtain the position of the region where the crack in the image to be detected is located.
The image to be detected is a pavement image whose crack distribution area needs to be determined by the application model.
Specifically, the image to be detected may be acquired by a photographing device, such as a photographed photo, or a frame of image in a photographed video. After the image to be detected is acquired, the image to be detected is input into a final trained application model, and the application model outputs the position of the area where the crack in the image to be detected is located.
In the embodiment, the position of the area where the crack in the image to be detected is obtained by acquiring the image to be detected and inputting the image to be detected into the finally trained application model, so that the accurate detection of the crack of the pavement is realized, and the accuracy can be improved by three percentage points.
It should be understood that, although the steps in the flowchart of fig. 1 are shown in sequence as indicated by the arrows, the steps are not necessarily performed in sequence as indicated by the arrows. The steps are not strictly limited to the order of execution unless explicitly recited herein, and the steps may be executed in other orders. Moreover, at least a portion of the steps in fig. 1 may include a plurality of steps or stages, which are not necessarily performed at the same time, but may be performed at different times, and the order of the steps or stages is not necessarily sequential, but may be performed in rotation or alternatively with at least a portion of the steps or stages in other steps or other steps.
In one embodiment, as shown in fig. 2, there is provided a pavement crack detection device, including: a pre-training acquisition module 201, a processing module 202, a determination module 203, an adjustment module 204, a retraining acquisition module 205, and a training module 206, wherein:
A pre-training obtaining module 201, configured to obtain a plurality of training images, a plurality of training models, and an application model; the application model and the training models are neural network models with different structures, and the class range output by each neural network model is the same as the class range of the training images.
The processing module 202 is configured to input each training image to each neural network model, and obtain an output of each neural network model.
The determining module 203 is configured to determine a reference class of the training image according to outputs of the plurality of training models corresponding to the same training image.
An adjustment module 204, configured to adjust model parameters of the application model until a first training stop condition is satisfied, to obtain a pre-trained application model; the difference between the output of the training image corresponding to the pre-trained application model and the reference class of the training image is less than a threshold.
The retraining and acquiring module 205 is configured to acquire a plurality of road surface images marked with the positions of the areas where the cracks are located.
The training module 206 is configured to retrain the pre-trained application model with the pavement image as a training sample and the position of the area where the crack in the pavement image is located as a training label until a second training stop condition is met, so as to obtain a final trained application model; the finally trained application model is used for detecting pavement cracks.
In the pavement crack detection device, a plurality of training images, a plurality of training models, and an application model are acquired. The application model and the plurality of training models are all neural network models, and the class range output by each neural network model is the same as the class range of the plurality of training images, so the plurality of training images can be input into the plurality of training models and the application model to obtain the class of each training image output by each neural network model. The training models differ in structure, and the reference category of a training image is determined according to the outputs of the plurality of training models for the same training image. The training models can therefore extract different features from the training image, so that various features are fully extracted, the comprehensiveness of feature extraction is ensured, and the accuracy of category judgment is improved. Based on the reference class of the training image and the output of the application model corresponding to the training image, model parameters of the application model are adjusted until a first training stop condition is met, yielding a pre-trained application model for which the difference between its output for a training image and the reference class of that training image is smaller than a threshold value. The ability of the plurality of training models to comprehensively extract features and accurately classify them can thus be transplanted to the application model, so that a single application model achieves the combined effect of the plurality of training models.
After the application model is obtained, a plurality of road surface images marked with the positions of the areas where the cracks are located are obtained, the road surface images are taken as training samples, the positions of the areas where the cracks are located in the road surface images are taken as training labels, the pre-trained application model is retrained until the second training stopping condition is met, and a final trained application model is obtained, so that the final trained application model can be used for detecting the road surface cracks. The application model can comprehensively extract the characteristics from the pavement image, accurately determine the distribution positions of the cracks, and realize the accurate detection of the pavement cracks. And when the comprehensive effect of a plurality of training models is transplanted to the application model, the training images do not need to be classified or marked manually, a large number of training images can be adopted for training, and the training effect of the application model is ensured.
In one embodiment, the pre-training acquisition module 201 includes a setup unit, an acquisition unit, and a training unit, wherein: the building unit is used for building a plurality of training models and an application model; an acquisition unit configured to acquire a plurality of classified images marked with belonging categories; the training unit is used for taking the classified images as training samples and the categories to which the classified images belong as training labels, training each training model until the third training stopping condition is met, and obtaining a trained training model.
In one embodiment, the processing module 202 is configured to input a training image into a neural network model, so as to obtain probabilities that the training image output by the neural network model belongs to each category.
In one embodiment, the determining module 203 comprises a computing unit and a selecting unit, wherein: the computing unit is used for taking the average value of probabilities that the same training image output by the training models belongs to the same category as the reference probability that the training image belongs to the same category; and the selection unit is used for taking the category corresponding to the maximum value in the reference probabilities of the training images belonging to each category as the reference category of the training images.
In one embodiment, the adjustment module 204 includes a networking unit, a discriminating unit, and an adjustment unit, where: the networking unit is used for establishing a discriminator and taking the application model as a generator to form a generative adversarial network with the discriminator; the discriminator is a neural network model; the discriminating unit is used for inputting the reference category of the training image into the discriminator as a true sample to obtain a true sample discrimination result, and inputting the output of the application model corresponding to the training image into the discriminator as a false sample to obtain a false sample discrimination result; the adjustment unit is used for alternately adjusting the model parameters of the discriminator and the model parameters of the generator until the fourth training stop condition is met, so as to obtain a pre-trained application model; the difference between the true sample discrimination result and the false sample discrimination result after the discriminator is adjusted is larger than that before the discriminator is adjusted, and the difference between the true sample discrimination result and the false sample discrimination result after the generator is adjusted is smaller than that before the generator is adjusted.
In one embodiment, the number of neurons in the application model is less than the number of neurons in each training model.
In one embodiment, the apparatus further comprises: the device comprises a detection acquisition module and a position detection module, wherein: the detection acquisition module is used for acquiring an image to be detected after the finally trained application model is obtained; the position detection module is used for inputting the image to be detected into the finally trained application model to obtain the position of the region where the crack in the image to be detected is located.
For specific limitations of the pavement crack detection device, reference may be made to the limitations of the pavement crack detection method above, which are not repeated here. Each module in the pavement crack detection device may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 3. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The nonvolatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a pavement crack detection method.
It will be appreciated by those skilled in the art that the structure shown in fig. 3 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided comprising a memory and a processor, the memory having stored therein a computer program, the processor when executing the computer program performing the steps of: acquiring a plurality of training images, a plurality of training models and an application model; the application model and the training models are neural network models with different structures, and the class range output by each neural network model is the same as the class range of the training images; inputting each training image into each neural network model respectively to obtain the output of each neural network model; determining a reference category of the training image according to the output of the plurality of training models corresponding to the same training image; adjusting model parameters of the application model until a first training stopping condition is met, and obtaining a pre-trained application model; the difference between the output of the pre-trained application model corresponding to the training image and the reference class of the training image is less than a threshold value; acquiring a plurality of pavement images marked with the positions of the areas where the cracks are positioned; taking the pavement image as a training sample, taking the position of the area where the crack in the pavement image is positioned as a training label, and retraining the pre-trained application model until a second training stopping condition is met, so as to obtain a final trained application model; the finally trained application model is used for detecting pavement cracks.
In one embodiment, the processor when executing the computer program further performs the steps of: establishing a plurality of training models and an application model; acquiring a plurality of classified images marked with the belonging categories; and taking the classified images as training samples, taking the categories of the classified images as training labels, and training each training model until the third training stopping condition is met, so as to obtain a trained training model.
In one embodiment, the processor when executing the computer program further performs the steps of: inputting a training image into a neural network model to obtain the probabilities, output by the neural network model, that the training image belongs to each category.
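The embodiment only says that each neural network model outputs, for a training image, the probability of every category. A common way to obtain such probabilities from raw network scores is a softmax; the sketch below assumes that choice, and the score values are hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Convert raw per-category scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores of one model for one training image, three categories.
probabilities = softmax([2.0, 1.0, 0.1])
print(probabilities)
```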
In one embodiment, the processor when executing the computer program further performs the steps of: taking the average value of probabilities that the same training image output by a plurality of training models belongs to the same category as the reference probability that the training image belongs to the same category; and taking the category corresponding to the maximum value in the reference probabilities of the categories to which the training image belongs as the reference category of the training image.
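The averaging and maximum selection described above can be sketched as follows. The probability values and the function name `reference_category` are hypothetical and serve only to make the two steps concrete.

```python
from typing import List, Tuple


def reference_category(outputs: List[List[float]]) -> Tuple[int, List[float]]:
    """outputs[m][k] is the probability from training model m that the image is in category k.

    Returns the index of the reference category and the per-category reference probabilities.
    """
    num_models = len(outputs)
    num_categories = len(outputs[0])
    # Step 1: average each category's probability over all training models.
    reference_probs = [
        sum(model_output[k] for model_output in outputs) / num_models
        for k in range(num_categories)
    ]
    # Step 2: the category with the largest reference probability is the reference category.
    best = max(range(num_categories), key=lambda k: reference_probs[k])
    return best, reference_probs


# Hypothetical outputs of three training models for the same training image.
outputs = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.6, 0.3, 0.1],
]
category, reference_probs = reference_category(outputs)
print(category)  # 0
```

Note that the second model on its own would pick category 1; averaging lets the other two models outvote it.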
In one embodiment, the processor when executing the computer program further performs the steps of: establishing a discriminator, taking the application model as a generator, and forming a generative adversarial network with the discriminator, the discriminator being a neural network model; inputting the reference category of the training image into the discriminator as a true sample to obtain a true sample discrimination result, and inputting the output of the application model corresponding to the training image into the discriminator as a false sample to obtain a false sample discrimination result; alternately adjusting the model parameters of the discriminator and the model parameters of the generator until a fourth training stop condition is met, so as to obtain a pre-trained application model; after the discriminator is adjusted, the difference between the true sample discrimination result and the false sample discrimination result is larger than before the adjustment, and after the generator is adjusted, that difference is smaller than before the adjustment.
In one embodiment, when the processor executes the computer program, the number of neurons in the application model is less than the number of neurons in each training model.
In one embodiment, the processor when executing the computer program further performs the steps of: after the final trained application model is obtained, acquiring an image to be detected; and inputting the image to be detected into the final trained application model to obtain the position of the region where the crack in the image to be detected is located.
In the above computer device, a plurality of training images, a plurality of training models and an application model are acquired. The application model and the plurality of training models are all neural network models, and the category range output by each neural network model is the same as the category range of the plurality of training images, so that the training images can be input into the training models and the application model to obtain the category of each training image output by each neural network model. Because the training models differ in structure and the reference category of a training image is determined from the outputs of the plurality of training models for that same training image, the training models extract different features from the training image; the features are thus extracted comprehensively, which improves the accuracy of category determination. Based on the reference category of the training image and the output of the application model corresponding to the training image, the model parameters of the application model are adjusted until a first training stop condition is met, yielding a pre-trained application model for which the difference between its output for a training image and the reference category of that training image is smaller than a threshold. In this way, the comprehensive feature extraction and accurate classification of the plurality of training models are transferred to the application model, so that a single application model achieves the combined effect of the plurality of training models.
After the pre-trained application model is obtained, a plurality of pavement images marked with the positions of the regions where the cracks are located are acquired; the pavement images are taken as training samples, the positions of the regions where the cracks are located in the pavement images are taken as training labels, and the pre-trained application model is retrained until a second training stop condition is met, yielding a final trained application model that can be used for detecting pavement cracks. The application model can comprehensively extract features from a pavement image and accurately determine the positions of the cracks, achieving accurate detection of pavement cracks. Moreover, when the combined effect of the plurality of training models is transferred to the application model, the training images do not need to be classified or labeled manually, so a large number of training images can be used, which ensures the training effect of the application model.
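The first training stop condition can be sketched as a simple check. The claims list a difference below a threshold and a preset number of training iterations among the alternatives; how the "difference" is measured is not specified, so the negative log-probability used here is an assumption, and all names and numbers are hypothetical.

```python
import math
from typing import List


def difference_to_reference(app_probs: List[float], reference_category: int) -> float:
    """Assumed measure: negative log of the probability given to the reference category."""
    return -math.log(max(app_probs[reference_category], 1e-12))


def first_stop_condition_met(mean_difference: float, threshold: float,
                             step: int, max_steps: int) -> bool:
    """Stop when the difference is below the threshold or the step budget is used up."""
    return mean_difference < threshold or step >= max_steps


# Hypothetical application-model outputs paired with their reference categories.
batch = [
    ([0.90, 0.05, 0.05], 0),
    ([0.20, 0.70, 0.10], 1),
]
mean_diff = sum(difference_to_reference(p, c) for p, c in batch) / len(batch)
print(first_stop_condition_met(mean_diff, threshold=0.5, step=10, max_steps=1000))
```

No manual labels appear anywhere in this check: the reference categories come from the training models, which is what allows a large unlabeled image set to be used at this stage.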
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of: acquiring a plurality of training images, a plurality of training models and an application model; the application model and the training models are neural network models with different structures, and the class range output by each neural network model is the same as the class range of the training images; inputting each training image into each neural network model respectively to obtain the output of each neural network model; determining a reference category of the training image according to the output of the plurality of training models corresponding to the same training image; adjusting model parameters of the application model until a first training stopping condition is met, and obtaining a pre-trained application model; the difference between the output of the pre-trained application model corresponding to the training image and the reference class of the training image is less than a threshold value; acquiring a plurality of pavement images marked with the positions of the areas where the cracks are positioned; taking the pavement image as a training sample, taking the position of the area where the crack in the pavement image is positioned as a training label, and retraining the pre-trained application model until a second training stopping condition is met, so as to obtain a final trained application model; the finally trained application model is used for detecting pavement cracks.
In one embodiment, the computer program when executed by the processor further performs the steps of: establishing a plurality of training models and an application model; acquiring a plurality of classified images marked with the belonging categories; and taking the classified images as training samples, taking the categories of the classified images as training labels, and training each training model until the third training stopping condition is met, so as to obtain a trained training model.
In one embodiment, the computer program when executed by the processor further performs the steps of: inputting a training image into a neural network model to obtain the probabilities, output by the neural network model, that the training image belongs to each category.
In one embodiment, the computer program when executed by the processor further performs the steps of: taking the average value of probabilities that the same training image output by a plurality of training models belongs to the same category as the reference probability that the training image belongs to the same category; and taking the category corresponding to the maximum value in the reference probabilities of the categories to which the training image belongs as the reference category of the training image.
In one embodiment, the computer program when executed by the processor further performs the steps of: establishing a discriminator, taking the application model as a generator, and forming a generative adversarial network with the discriminator, the discriminator being a neural network model; inputting the reference category of the training image into the discriminator as a true sample to obtain a true sample discrimination result, and inputting the output of the application model corresponding to the training image into the discriminator as a false sample to obtain a false sample discrimination result; alternately adjusting the model parameters of the discriminator and the model parameters of the generator until a fourth training stop condition is met, so as to obtain a pre-trained application model; after the discriminator is adjusted, the difference between the true sample discrimination result and the false sample discrimination result is larger than before the adjustment, and after the generator is adjusted, that difference is smaller than before the adjustment.
In one embodiment, when the computer program is executed by the processor, the number of neurons in the application model is less than the number of neurons in each training model.
In one embodiment, the computer program when executed by the processor further performs the steps of: after the final trained application model is obtained, acquiring an image to be detected; and inputting the image to be detected into the final trained application model to obtain the position of the region where the crack in the image to be detected is located.
In the above storage medium, a plurality of training images, a plurality of training models and an application model are acquired. The application model and the plurality of training models are all neural network models, and the category range output by each neural network model is the same as the category range of the plurality of training images, so that the training images can be input into the training models and the application model to obtain the category of each training image output by each neural network model. Because the training models differ in structure and the reference category of a training image is determined from the outputs of the plurality of training models for that same training image, the training models extract different features from the training image; the features are thus extracted comprehensively, which improves the accuracy of category determination. Based on the reference category of the training image and the output of the application model corresponding to the training image, the model parameters of the application model are adjusted until a first training stop condition is met, yielding a pre-trained application model for which the difference between its output for a training image and the reference category of that training image is smaller than a threshold. In this way, the comprehensive feature extraction and accurate classification of the plurality of training models are transferred to the application model, so that a single application model achieves the combined effect of the plurality of training models.
After the pre-trained application model is obtained, a plurality of pavement images marked with the positions of the regions where the cracks are located are acquired; the pavement images are taken as training samples, the positions of the regions where the cracks are located in the pavement images are taken as training labels, and the pre-trained application model is retrained until a second training stop condition is met, yielding a final trained application model that can be used for detecting pavement cracks. The application model can comprehensively extract features from a pavement image and accurately determine the positions of the cracks, achieving accurate detection of pavement cracks. Moreover, when the combined effect of the plurality of training models is transferred to the application model, the training images do not need to be classified or labeled manually, so a large number of training images can be used, which ensures the training effect of the application model.
Those skilled in the art will appreciate that all or part of the methods of the above embodiments may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the program, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, or the like. Volatile memory may include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to be within the scope of this description.
The foregoing examples represent only a few embodiments of the present application. They are described in specific detail, but are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art may make various modifications and improvements without departing from the spirit of the present application, all of which fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (10)

1. A pavement crack detection method, characterized in that the method comprises:
acquiring a plurality of training images, a plurality of training models and an application model; the application model and the training models are neural network models with different structures, and the class range output by each neural network model is the same as the class range of the training images;
inputting each training image into each neural network model respectively to obtain the output of each neural network model;
determining a reference category of the training image according to the output of the plurality of training models corresponding to the same training image;
establishing a discriminator, and forming a generative adversarial network with the discriminator by taking the application model as a generator; the discriminator is a neural network model;
inputting the reference category of the training image into the discriminator as a true sample to obtain a true sample discrimination result, and inputting the output of the application model corresponding to the training image into the discriminator as a false sample to obtain a false sample discrimination result;
alternately adjusting model parameters of the discriminator and model parameters of the generator until a fourth training stop condition is met, so as to obtain a pre-trained application model; the difference between the true sample discrimination result and the false sample discrimination result in the adjusted discriminator is larger than that in the discriminator before adjustment, and the difference between the true sample discrimination result and the false sample discrimination result in the adjusted generator is smaller than that in the generator before adjustment;
adjusting model parameters of the application model until a first training stopping condition is met, so as to obtain a pre-trained application model; the difference between the output of the pre-trained application model corresponding to the training image and the reference category of the training image is less than a threshold;
acquiring a plurality of pavement images marked with the positions of the areas where the cracks are positioned;
taking the pavement image as a training sample, taking the position of the area where the crack in the pavement image is positioned as a training label, and retraining the pre-trained application model until a second training stopping condition is met, so as to obtain a final trained application model; the final trained application model is used for detecting pavement cracks.
2. The method of claim 1, wherein the acquiring a plurality of training images, a plurality of training models, and an application model comprises:
establishing a plurality of training models and an application model;
acquiring a plurality of classified images marked with the belonging categories;
and taking the classified images as training samples, taking the categories to which the classified images belong as training labels, and training each training model until a third training stopping condition is met, so as to obtain a trained training model.
3. The method of claim 1, wherein said inputting each of said training images into a respective one of said neural network models to obtain an output of each of said neural network models comprises:
and inputting one training image into one neural network model to obtain the probability that the training image output by the neural network model belongs to each category.
4. The method of claim 2, wherein determining the reference class of the training image from the outputs of the plurality of training models corresponding to the same training image comprises:
taking an average value of probabilities that the same training image output by the training models belongs to the same category as a reference probability that the training image belongs to the same category;
and taking the category corresponding to the maximum value in the reference probabilities of the categories of the training images as the reference category of the training images.
5. The method of any one of claims 1 to 4, wherein the number of neurons in the application model is less than the number of neurons in each of the training models.
6. The method according to any of claims 1 to 4, wherein after said deriving a final trained application model, the method further comprises:
acquiring an image to be detected;
inputting the image to be detected into the final trained application model to obtain the position of the region where the crack in the image to be detected is located.
7. The method according to any one of claims 1 to 4, wherein the first training stop condition comprises at least one of: the difference between the output of the application model corresponding to the training image and the reference category of the training image is smaller than a threshold value, the training times reach the preset times, the training time reaches the preset time and the classification accuracy of the application model reaches the preset index.
8. A pavement crack detection device, the device comprising:
the pre-training acquisition module is used for acquiring a plurality of training images, a plurality of training models and an application model; the application model and the training models are neural network models with different structures, and the class range output by each neural network model is the same as the class range of the training images;
the processing module is used for inputting each training image into each neural network model respectively to obtain the output of each neural network model;
the determining module is used for determining the reference category of the training image according to the outputs of the plurality of training models corresponding to the same training image;
the adjusting module is used for adjusting the model parameters of the application model until the first training stopping condition is met, so as to obtain a pre-trained application model; the difference between the output of the pre-trained application model corresponding to the training image and the reference category of the training image is less than a threshold;
the training acquisition module is used for acquiring a plurality of road surface images marked with the positions of the areas where the cracks are located;
the training module is used for taking the pavement image as a training sample, taking the position of the area where the crack in the pavement image is located as a training label, and retraining the pre-trained application model until a second training stopping condition is met, so that a final trained application model is obtained; the final trained application model is used for detecting pavement cracks;
the adjusting module comprises a networking unit, a discriminating unit and an adjusting unit, wherein:
the networking unit is used for establishing a discriminator, taking the application model as a generator and forming a generative adversarial network with the discriminator; the discriminator is a neural network model;
the discriminating unit is used for inputting the reference category of the training image into the discriminator as a true sample to obtain a true sample discrimination result, and inputting the output of the application model corresponding to the training image into the discriminator as a false sample to obtain a false sample discrimination result;
the adjusting unit is used for alternately adjusting the model parameters of the discriminator and the model parameters of the generator until a fourth training stopping condition is met, so as to obtain a pre-trained application model; the difference between the true sample discrimination result and the false sample discrimination result in the adjusted discriminator is larger than that in the discriminator before adjustment, and the difference between the true sample discrimination result and the false sample discrimination result in the adjusted generator is smaller than that in the generator before adjustment.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
CN202011506513.1A 2020-12-18 2020-12-18 Pavement crack detection method, device, computer equipment and storage medium Active CN112633354B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011506513.1A CN112633354B (en) 2020-12-18 2020-12-18 Pavement crack detection method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011506513.1A CN112633354B (en) 2020-12-18 2020-12-18 Pavement crack detection method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112633354A CN112633354A (en) 2021-04-09
CN112633354B true CN112633354B (en) 2024-03-01

Family

ID=75317225

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011506513.1A Active CN112633354B (en) 2020-12-18 2020-12-18 Pavement crack detection method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112633354B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117355845A (en) * 2021-06-18 2024-01-05 华为云计算技术有限公司 System and method for neural network model combining
JP7217570B1 (en) * 2022-08-04 2023-02-03 株式会社センシンロボティクス Information processing system and program, information processing method, server

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107945153A (en) * 2017-11-07 2018-04-20 广东广业开元科技有限公司 A kind of road surface crack detection method based on deep learning
CN110889463A (en) * 2019-12-10 2020-03-17 北京奇艺世纪科技有限公司 Sample labeling method and device, server and machine-readable storage medium
KR102090770B1 (en) * 2018-10-12 2020-03-18 성균관대학교산학협력단 automated image recognizer model generation and image recognizer unit and management method using thereof
CN111898507A (en) * 2020-07-22 2020-11-06 武汉大学 Deep learning method for predicting earth surface coverage category of label-free remote sensing image
CN112001411A (en) * 2020-07-10 2020-11-27 河海大学 Dam crack detection algorithm based on FPN structure

Also Published As

Publication number Publication date
CN112633354A (en) 2021-04-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant